# Graph Search Tool Documentation

## Overview

The **Graph Search Tool** is an AIECS tool that provides powerful knowledge graph search capabilities through multiple search modes. It enables AI agents to query knowledge graphs using vector similarity, graph structure, hybrid approaches, and advanced retrieval strategies.

## Tool Registration

- **Tool Name**: `graph_search`
- **Tool Class**: `GraphSearchTool`
- **Auto-registered**: Yes (via `@register_tool` decorator)

## Features

The tool supports **7 search modes**:

1. **Vector Search** - Semantic similarity search using embeddings
2. **Graph Search** - Structure-based exploration from seed entities
3. **Hybrid Search** - Combined vector + graph search
4. **PageRank** - Importance ranking using Personalized PageRank
5. **Multi-Hop** - N-hop neighbor discovery
6. **Filtered** - Property-based entity filtering
7. **Traverse** - Pattern-based graph traversal

## Input Schema

### Required Parameters

- `mode` (string): Search mode - one of:
  - `"vector"` - Vector similarity search
  - `"graph"` - Graph structure search
  - `"hybrid"` - Combined search
  - `"pagerank"` - PageRank importance
  - `"multihop"` - Multi-hop neighbors
  - `"filtered"` - Filtered retrieval
  - `"traverse"` - Pattern traversal

### Optional Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | string | None | Natural language query (auto-converted to embedding) |
| `query_embedding` | List[float] | None | Pre-computed query embedding vector |
| `seed_entity_ids` | List[string] | None | Starting entity IDs (for graph/pagerank/multihop/traverse) |
| `entity_type` | string | None | Filter by entity type (e.g., "Person", "Company") |
| `property_filters` | object | None | Filter by properties (e.g., `{"role": "Engineer"}`) |
| `relation_types` | List[string] | None | Filter by relation types |
| `max_results` | integer | 10 | Maximum results to return (1-100) |
| `max_depth` | integer | 2 | Maximum traversal depth (1-5) |
| `vector_threshold` | float | 0.0 | Minimum similarity threshold (0.0-1.0) |
| `vector_weight` | float | 0.6 | Vector weight in hybrid mode (0.0-1.0) |
| `graph_weight` | float | 0.4 | Graph weight in hybrid mode (0.0-1.0) |
| `expand_results` | boolean | true | Expand results with neighbors (hybrid) |
| `use_cache` | boolean | true | Enable result caching |
| `enable_reranking` | boolean | false | Enable result reranking for improved relevance |
| `rerank_strategy` | string | "text" | Reranking strategy: "text", "semantic", "structural", "hybrid" |

### Reranking Parameters

The tool supports **result reranking** to improve search relevance by re-scoring initial results using additional signals:

- **`enable_reranking`** (boolean, default: false): Enable/disable reranking
- **`rerank_strategy`** (string, default: "text"): Reranking strategy:
  - `"text"`: Text similarity reranking (BM25-based)
  - `"semantic"`: Semantic similarity using embeddings
  - `"structural"`: Graph structure importance (centrality, PageRank)
  - `"hybrid"`: Combines all signals for best results

**When to use reranking:**
- When initial search results need refinement
- For complex queries requiring multiple relevance signals
- When combining vector and graph search (hybrid mode)
- To boost precision at the cost of slight latency increase

## Output Format

### Success Response

```json
{
  "success": true,
  "mode": "hybrid",
  "num_results": 5,
  "results": [
    {
      "entity_id": "person_1",
      "entity_type": "Person",
      "properties": {
        "name": "Alice",
        "role": "Engineer"
      },
      "score": 0.95,
      "score_type": "combined"  // Optional, depends on mode
    }
  ]
}
```

### Error Response

```json
{
  "success": false,
  "error": "Error message here"
}
```

## Usage Examples

### Example 1: Vector Search

Find entities semantically similar to a query.

```python
result = tool.execute({
    "mode": "vector",
    "query": "machine learning researchers",
    "max_results": 10,
    "vector_threshold": 0.7
})
```

**Use Cases:**
- Content discovery
- Semantic similarity matching
- Finding related entities

### Example 2: Graph Search

Explore graph structure from seed entities.

```python
result = tool.execute({
    "mode": "graph",
    "seed_entity_ids": ["company_1"],
    "max_depth": 2,
    "max_results": 20
})
```

**Use Cases:**
- Relationship exploration
- Network analysis
- Connected entity discovery

### Example 3: Hybrid Search

Combine vector similarity with graph structure.

```python
result = tool.execute({
    "mode": "hybrid",
    "query": "AI research papers",
    "seed_entity_ids": ["author_1"],
    "vector_weight": 0.6,
    "graph_weight": 0.4,
    "max_results": 15
})
```

**Use Cases:**
- Comprehensive search
- Balanced semantic + structural results
- Context-aware retrieval

### Example 4: PageRank Search

Find influential entities in the graph.

```python
result = tool.execute({
    "mode": "pagerank",
    "seed_entity_ids": ["key_person"],
    "max_results": 10
})
```

**Use Cases:**
- Influence analysis
- Authority ranking
- Central entity identification

### Example 5: Multi-Hop Search

Discover entities within N hops.

```python
result = tool.execute({
    "mode": "multihop",
    "seed_entity_ids": ["person_1"],
    "max_depth": 3,
    "max_results": 25
})
```

**Use Cases:**
- Friend-of-friend discovery
- Local network exploration
- Proximity-based search

### Example 6: Filtered Search

Precise filtering by entity properties.

```python
result = tool.execute({
    "mode": "filtered",
    "entity_type": "Person",
    "property_filters": {
        "role": "Engineer",
        "level": "Senior",
        "location": "SF"
    },
    "max_results": 50
})
```

**Use Cases:**
- Attribute-based selection
- Data validation queries
- Precise entity lookup

### Example 7: Pattern-Based Traversal

Follow specific relationship patterns.

```python
result = tool.execute({
    "mode": "traverse",
    "seed_entity_ids": ["start_node"],
    "relation_types": ["WORKS_FOR", "LOCATED_IN"],
    "max_depth": 2,
    "max_results": 15
})
```

**Use Cases:**
- Pattern matching
- Path discovery
- Relationship chain exploration

### Example 8: Search with Reranking

Improve search relevance using reranking.

```python
# Hybrid search with semantic reranking
result = tool.execute({
    "mode": "hybrid",
    "query": "machine learning experts in computer vision",
    "max_results": 20,
    "enable_reranking": True,
    "rerank_strategy": "semantic"
})
```

**Reranking Strategies:**

1. **Text Reranking** - BM25-based text similarity
```python
result = tool.execute({
    "mode": "vector",
    "query": "database optimization",
    "enable_reranking": True,
    "rerank_strategy": "text"
})
```

2. **Semantic Reranking** - Deep semantic similarity
```python
result = tool.execute({
    "mode": "hybrid",
    "query": "natural language processing",
    "enable_reranking": True,
    "rerank_strategy": "semantic"
})
```

3. **Structural Reranking** - Graph importance signals
```python
result = tool.execute({
    "mode": "graph",
    "seed_entity_ids": ["person_1"],
    "enable_reranking": True,
    "rerank_strategy": "structural"
})
```

4. **Hybrid Reranking** - All signals combined (best results)
```python
result = tool.execute({
    "mode": "hybrid",
    "query": "AI researchers",
    "enable_reranking": True,
    "rerank_strategy": "hybrid",
    "max_results": 10
})
```

**Use Cases:**
- Improving precision for complex queries
- Combining multiple relevance signals
- Refining large result sets
- Production search systems

## Advanced Usage

### Combining with Other Tools

```python
# First, search for relevant entities
search_result = graph_search_tool.execute({
    "mode": "hybrid",
    "query": "AI research",
    "max_results": 5
})

# Then, build more knowledge from found entities
for entity in search_result["results"]:
    entity_id = entity["entity_id"]
    # Use entity_id for further operations
```

### Caching for Performance

```python
# Enable caching (default)
result1 = tool.execute({
    "mode": "vector",
    "query": "frequent query",
    "use_cache": true
})

# Second call uses cache
result2 = tool.execute({
    "mode": "vector",
    "query": "frequent query",
    "use_cache": true
})  # Much faster!
```

### Entity Type Filtering

```python
# Only search Person entities
result = tool.execute({
    "mode": "hybrid",
    "query": "software engineer",
    "entity_type": "Person",
    "max_results": 20
})
```

## Performance Considerations

### Search Mode Performance

| Mode | Complexity | Best For | Typical Time |
|------|-----------|----------|--------------|
| Vector | O(n) | Small-medium graphs (< 10K entities) | Fast |
| Graph | O(b^d) | Local exploration (depth ≤ 3) | Fast |
| Hybrid | O(n + b^d) | Balanced search | Medium |
| PageRank | O(iterations × edges) | Graphs < 10K nodes | Medium |
| Multi-Hop | O(b^d) | Shallow depth (≤ 3) | Fast |
| Filtered | O(n) | Property-based queries | Fast |
| Traverse | O(paths) | Pattern matching | Medium |

### Optimization Tips

1. **Use appropriate max_results**: Lower values are faster
2. **Limit max_depth**: Keep ≤ 3 for graph/multihop modes
3. **Enable caching**: Significantly improves repeated queries
4. **Use vector_threshold**: Reduces vector search space
5. **Apply entity_type filter**: Narrows search scope

## Error Handling

### Common Errors

**Invalid Mode**
```json
{
  "success": false,
  "error": "Unknown search mode: invalid_mode"
}
```

**Missing Required Parameters**
- Vector mode requires `query` or `query_embedding`
- Graph/PageRank/Multi-Hop modes require `seed_entity_ids`

### Error Recovery

```python
result = tool.execute({
    "mode": "vector",
    "query": "search query"
})

if not result["success"]:
    # Handle error
    print(f"Search failed: {result['error']}")
    # Fallback to different mode or parameters
else:
    # Process results
    for entity in result["results"]:
        print(f"Found: {entity['entity_id']}")
```

## Integration with Agent Workflows

### Agentic Pattern 1: Exploratory Search

```python
# Agent explores graph iteratively
current_entities = ["start_node"]
discovered = []

for depth in range(3):
    result = tool.execute({
        "mode": "multihop",
        "seed_entity_ids": current_entities,
        "max_depth": 1
    })
    
    discovered.extend(result["results"])
    current_entities = [e["entity_id"] for e in result["results"]]
```

### Agentic Pattern 2: Ranked Discovery

```python
# Agent finds and ranks important entities
pagerank_result = tool.execute({
    "mode": "pagerank",
    "seed_entity_ids": ["topic_entity"],
    "max_results": 20
})

# Filter by properties
filtered = [
    e for e in pagerank_result["results"]
    if e["properties"].get("verified") == True
]
```

### Agentic Pattern 3: Multi-Modal Search

```python
# Combine different search modes
vector_results = tool.execute({
    "mode": "vector",
    "query": "AI research"
})

# Use vector results as seeds for graph exploration
graph_results = tool.execute({
    "mode": "graph",
    "seed_entity_ids": [e["entity_id"] for e in vector_results["results"][:3]]
})
```

## Testing

The tool includes comprehensive unit tests covering:
- All 7 search modes
- Entity type filtering
- Property-based filtering
- Error handling
- Parameter validation

Run tests:
```bash
poetry run pytest test/unit_tests/tools/test_graph_search_tool.py -v
```

## See Also

- [Graph Builder Tool Documentation](./GRAPH_BUILDER_TOOL.md)
- [Search Strategies Documentation](../search/SEARCH_STRATEGIES.md)
- [Graph Reasoning Tool Documentation](./GRAPH_REASONING_TOOL.md)

## Support

For issues or questions:
- Check test examples in `test/unit_tests/tools/test_graph_search_tool.py`
- Review implementation in `aiecs/tools/knowledge_graph/graph_search_tool.py`
- See usage examples in `docs/knowledge_graph/examples/`