# Knowledge Retrieval Configuration Guide This guide documents the configuration options for knowledge retrieval in `KnowledgeAwareAgent`. ## Table of Contents 1. [Configuration Overview](#configuration-overview) 2. [Retrieval Strategy Options](#retrieval-strategy-options) 3. [Caching Configuration](#caching-configuration) 4. [Entity Extraction Configuration](#entity-extraction-configuration) 5. [Performance Tuning](#performance-tuning) 6. [Configuration Examples](#configuration-examples) --- ## Configuration Overview The `KnowledgeAwareAgent` supports comprehensive configuration for knowledge retrieval through the `AgentConfiguration` model. All configuration options can be set when creating the agent. ### Basic Configuration ```python from aiecs.domain.agent import KnowledgeAwareAgent, AgentConfiguration from aiecs.infrastructure.graph_storage import InMemoryGraphStore config = AgentConfiguration( retrieval_strategy="hybrid", enable_knowledge_caching=True, cache_ttl=300, max_context_size=50, ) agent = KnowledgeAwareAgent( agent_id="agent_1", name="Knowledge Agent", llm_client=llm_client, tools=[], config=config, graph_store=graph_store ) ``` --- ## Retrieval Strategy Options The `retrieval_strategy` parameter controls how knowledge is retrieved from the graph. ### Available Strategies #### 1. `"vector"` - Vector Similarity Search **Description**: Uses semantic similarity via embeddings to find relevant entities. **How It Works**: - Converts query to embedding vector - Computes cosine similarity with entity embeddings - Returns top-k most similar entities **Use Cases**: - Semantic similarity queries - Content discovery - Finding conceptually related entities **Example**: ```python config = AgentConfiguration( retrieval_strategy="vector", max_context_size=10 ) ``` **Performance**: O(n) where n = number of entities with embeddings **Requirements**: - LLM client must support `get_embeddings()` method - Entities must have embeddings stored --- #### 2. `"graph"` - Graph Traversal Search **Description**: Explores graph structure starting from seed entities. **How It Works**: - Extracts seed entities from query (via entity extraction) - Traverses graph edges using BFS - Scores entities by depth (closer = higher score) **Use Cases**: - Relationship exploration - Network analysis - Finding connected entities - Multi-hop queries **Example**: ```python config = AgentConfiguration( retrieval_strategy="graph", max_context_size=20 ) ``` **Performance**: O(b^d) where b = branching factor, d = depth **Requirements**: - Seed entities must be extractable from query - Graph must have relationships between entities **Note**: If no seed entities are found, the system will attempt to use vector search to find seeds. --- #### 3. `"hybrid"` - Combined Vector + Graph Search **Description**: Combines vector similarity with graph structure traversal. **How It Works**: 1. Performs vector search to find initial candidates 2. Expands results with graph neighbors 3. Combines scores using weighted averaging (default: 60% vector, 40% graph) **Use Cases**: - Comprehensive search requiring both semantic and structural signals - Context-aware retrieval - Balanced results **Example**: ```python config = AgentConfiguration( retrieval_strategy="hybrid", max_context_size=15 ) ``` **Performance**: O(n + b^d) - combines vector and graph search **Requirements**: - LLM client must support embeddings - Entities must have embeddings - Graph must have relationships --- #### 4. `"auto"` - Automatic Strategy Selection **Description**: Automatically selects the best strategy based on query characteristics. **How It Works**: - Analyzes query using `QueryIntentClassifier` - Selects strategy based on query type: - Semantic queries → `vector"` - Relationship queries → `"graph"` - General queries → `"hybrid"` **Use Cases**: - Dynamic query handling - Optimal strategy selection without manual configuration **Example**: ```python config = AgentConfiguration( retrieval_strategy="auto", max_context_size=20 ) ``` **Performance**: Varies based on selected strategy **Requirements**: - QueryIntentClassifier must be available - Falls back to hybrid if classification fails --- ### Strategy Selection Guidelines | Query Type | Recommended Strategy | Reason | |------------|---------------------|--------| | "Find articles about machine learning" | `vector` | Semantic similarity | | "Who does Alice know?" | `graph` | Relationship traversal | | "Find people working at TechCorp" | `hybrid` | Needs both semantic and structural | | "What is machine learning?" | `auto` | Let system decide | --- ## Caching Configuration Knowledge retrieval results can be cached to improve performance for repeated queries. ### Configuration Options #### `enable_knowledge_caching` **Type**: `bool` **Default**: `True` **Description**: Enable/disable caching for knowledge retrieval results. ```python config = AgentConfiguration( enable_knowledge_caching=True # Enable caching ) ``` **Benefits**: - Faster response times for repeated queries - Reduced load on graph store - Lower API costs (fewer embedding generations) --- #### `cache_ttl` **Type**: `int` **Default**: `300` (5 minutes) **Description**: Cache time-to-live in seconds. ```python config = AgentConfiguration( enable_knowledge_caching=True, cache_ttl=600 # Cache for 10 minutes ) ``` **Guidelines**: - **Short TTL (60-300s)**: Frequently changing knowledge, real-time data - **Medium TTL (300-1800s)**: Stable knowledge, moderate update frequency - **Long TTL (1800-3600s)**: Static knowledge, rarely changes **Cache Backend**: - Uses Redis if available (via `REDIS_HOST` environment variable) - Falls back to in-memory cache if Redis not available --- ### Cache Invalidation Caches are automatically invalidated when: - TTL expires - Graph store data changes (if supported) - Manual invalidation via agent methods --- ## Entity Extraction Configuration Entity extraction identifies entities from queries to use as seed entities for graph traversal. ### Configuration Option #### `entity_extraction_provider` **Type**: `str` **Default**: `"llm"` **Description**: Entity extraction provider to use. **Available Providers**: 1. **`"llm"`** - LLM-based extraction (default) - Uses LLM to extract entities from text - Supports custom entity types - More accurate but slower 2. **`"ner"`** - Named Entity Recognition - Uses NER models for extraction - Faster but less flexible - Limited to standard entity types 3. **Custom provider name** - Use custom provider registered via `LLMClientFactory` - Allows integration with external extraction services **Example**: ```python config = AgentConfiguration( entity_extraction_provider="llm" # Use LLM-based extraction ) ``` --- ## Performance Tuning ### `max_context_size` **Type**: `int` **Default**: `50` **Description**: Maximum number of knowledge entities to include in context. ```python config = AgentConfiguration( max_context_size=20 # Limit to 20 entities ) ``` **Guidelines**: - **Small (10-20)**: Fast retrieval, focused context - **Medium (20-50)**: Balanced performance and context - **Large (50-100)**: Comprehensive context, slower retrieval **Impact**: - Larger values → more context but slower retrieval - Smaller values → faster retrieval but less context --- ### Context Prioritization Entities are prioritized using: 1. **Relevance Score**: From search strategy (vector similarity or graph distance) 2. **Recency**: More recent entities prioritized 3. **Relevance Threshold**: Entities below threshold are filtered out **Configuration** (internal, not directly configurable): - Relevance weight: 60% - Recency weight: 40% - Default relevance threshold: 0.3 --- ## Configuration Examples ### Example 1: Fast Semantic Search ```python config = AgentConfiguration( retrieval_strategy="vector", enable_knowledge_caching=True, cache_ttl=600, # 10 minutes max_context_size=10 # Small context for speed ) agent = KnowledgeAwareAgent( agent_id="fast_agent", name="Fast Semantic Agent", llm_client=llm_client, tools=[], config=config, graph_store=graph_store ) ``` **Use Case**: Fast semantic similarity queries with caching --- ### Example 2: Comprehensive Graph Exploration ```python config = AgentConfiguration( retrieval_strategy="graph", enable_knowledge_caching=False, # Disable cache for fresh results max_context_size=50, # Larger context entity_extraction_provider="llm" ) agent = KnowledgeAwareAgent( agent_id="explorer_agent", name="Graph Explorer Agent", llm_client=llm_client, tools=[], config=config, graph_store=graph_store ) ``` **Use Case**: Deep graph exploration with fresh results --- ### Example 3: Balanced Hybrid Search ```python config = AgentConfiguration( retrieval_strategy="hybrid", enable_knowledge_caching=True, cache_ttl=300, # 5 minutes max_context_size=30, # Balanced context size entity_extraction_provider="llm" ) agent = KnowledgeAwareAgent( agent_id="hybrid_agent", name="Hybrid Search Agent", llm_client=llm_client, tools=[], config=config, graph_store=graph_store ) ``` **Use Case**: General-purpose knowledge retrieval --- ### Example 4: Auto Strategy with Custom Cache ```python config = AgentConfiguration( retrieval_strategy="auto", # Automatic strategy selection enable_knowledge_caching=True, cache_ttl=1800, # 30 minutes (for stable knowledge) max_context_size=25 ) agent = KnowledgeAwareAgent( agent_id="auto_agent", name="Auto Strategy Agent", llm_client=llm_client, tools=[], config=config, graph_store=graph_store ) ``` **Use Case**: Flexible queries with optimal strategy selection --- ## Environment Variables Some configuration can also be set via environment variables: ```bash # Retrieval strategy export KG_RETRIEVAL_STRATEGY="hybrid" # Cache configuration export KG_ENABLE_CACHE="true" export KG_CACHE_TTL="300" # Entity extraction export KG_ENTITY_EXTRACTION_PROVIDER="llm" ``` **Note**: Programmatic configuration (via `AgentConfiguration`) takes precedence over environment variables. --- ## Best Practices ### 1. Choose Strategy Based on Query Type - **Semantic queries**: Use `"vector"` - **Relationship queries**: Use `"graph"` - **General queries**: Use `"hybrid"` or `"auto"` ### 2. Tune Cache TTL Based on Data Stability - **Stable knowledge**: Longer TTL (10-30 minutes) - **Dynamic knowledge**: Shorter TTL (1-5 minutes) ### 3. Balance Context Size - Start with default (50) - Reduce if retrieval is too slow - Increase if context is insufficient ### 4. Monitor Performance - Track cache hit rates - Monitor retrieval latency - Adjust configuration based on metrics --- ## Troubleshooting ### Issue: Retrieval Strategy Not Working **Symptom**: Selected strategy doesn't seem to be used **Solutions**: - Verify LLM client supports `get_embeddings()` for vector/hybrid strategies - Check that entities have embeddings stored - Ensure graph has relationships for graph strategy - Verify seed entities can be extracted for graph strategy ### Issue: Cache Not Working **Symptom**: Cache hit rate is 0% **Solutions**: - Verify `enable_knowledge_caching=True` - Check Redis connection (if using Redis backend) - Ensure queries are identical (cache key includes query text and strategy) ### Issue: Too Many/Few Entities Retrieved **Symptom**: Context size doesn't match expectations **Solutions**: - Adjust `max_context_size` parameter - Check relevance threshold settings - Verify graph has sufficient entities --- ## Related Documentation - [Retrieval Strategies Guide](../search/SEARCH_STRATEGIES.md) - [Agent Integration Guide](./AGENT_INTEGRATION.md) - [Performance Guide](../PERFORMANCE_GUIDE.md) - [Configuration Guide](../CONFIGURATION_GUIDE.md)