# Tool Result Caching Best Practices

This guide covers how to use tool result caching to avoid redundant tool executions, which reduces API calls, lowers costs, and speeds up responses.

## Table of Contents

1. [Overview](#overview)
2. [Basic Caching](#basic-caching)
3. [Cache Configuration](#cache-configuration)
4. [Per-Tool TTL Configuration](#per-tool-ttl-configuration)
5. [Cache Management](#cache-management)
6. [Cache Invalidation](#cache-invalidation)
7. [Performance Monitoring](#performance-monitoring)
8. [Best Practices](#best-practices)
9. [Summary](#summary)

## Overview

Tool result caching provides:

- **Cost Reduction**: Roughly 30-50% lower API costs from avoiding redundant calls (actual savings depend on how often the same queries repeat)
- **Performance Improvement**: Faster responses for cached results
- **Configurable TTL**: Different cache durations for different tools
- **Automatic Cleanup**: The cache is cleaned up automatically when the capacity threshold is reached
- **Memory Management**: Size limits to prevent memory exhaustion
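
To make the TTL idea concrete, here is a minimal sketch of an expiring in-memory cache. It is not the aiecs implementation; `TTLCache` and its injectable `clock` parameter are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after a TTL (illustrative only)."""

    def __init__(self, default_ttl: float, clock: Callable[[], float] = time.monotonic):
        self.default_ttl = default_ttl
        self._clock = clock
        # key -> (expires_at, value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        lifetime = self.default_ttl if ttl is None else ttl
        self._entries[key] = (self._clock() + lifetime, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None  # never cached
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily drop the expired entry
            return None
        return value


# A fake clock makes the expiry visible without sleeping.
now = [0.0]
cache = TTLCache(default_ttl=300, clock=lambda: now[0])
cache.set("search:Python", ["result"])
fresh = cache.get("search:Python")   # still within the 5 minute TTL
now[0] = 301.0
stale = cache.get("search:Python")   # TTL has elapsed
print(fresh, stale)
```

Passing the clock in as a parameter keeps the expiry logic testable; a real cache would simply use the default monotonic clock.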

### When to Use Caching

- ✅ Expensive API calls (search, weather, translation)
- ✅ Results don't change frequently
- ✅ Same queries repeated often
- ✅ Cost reduction is important

### When NOT to Use Caching

- ❌ Time-sensitive data (real-time prices, live data)
- ❌ Results change frequently
- ❌ Unique queries each time
- ❌ Memory constraints

## Basic Caching

### Pattern 1: Enable Caching

Enable caching with default configuration.

```python
from aiecs.domain.agent import HybridAgent, AgentConfiguration, CacheConfig
from aiecs.llm import OpenAIClient

# Configure caching
cache_config = CacheConfig(
    enabled=True,
    default_ttl=300  # 5 minutes default
)

agent = HybridAgent(
    agent_id="agent-1",
    name="My Agent",
    llm_client=OpenAIClient(),
    tools=["search", "calculator", "weather"],
    config=AgentConfiguration(),
    cache_config=cache_config
)

await agent.initialize()

# First call - executes tool and caches result
result1 = await agent.execute_tool_with_cache("search", {"query": "Python"})

# Second call with same parameters - uses cache!
result2 = await agent.execute_tool_with_cache("search", {"query": "Python"})
# No API call made - result from cache
```
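
A cache hit requires that two calls map to the same key. This guide does not say how aiecs derives its keys, so the following is only one plausible scheme: hash the tool name together with a canonical JSON encoding of the parameters. `make_cache_key` is a hypothetical helper, not part of the library.

```python
import hashlib
import json


def make_cache_key(tool_name: str, params: dict) -> str:
    """Build a deterministic key from a tool name and its parameters."""
    # sort_keys makes the key independent of dict insertion order;
    # default=str is a fallback for values JSON cannot encode natively.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"), default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{tool_name}:{digest}"


key_a = make_cache_key("search", {"query": "Python", "limit": 10})
key_b = make_cache_key("search", {"limit": 10, "query": "Python"})
key_c = make_cache_key("search", {"query": "python", "limit": 10})
print(key_a == key_b)  # True: same parameters, different order
print(key_a == key_c)  # False: the query string differs in case
```

Note that under this scheme `"Python"` and `"python"` are different keys. If your tools treat queries case-insensitively, normalizing parameters before the call may raise the hit rate.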

### Pattern 2: Disable Caching

Disable caching for specific use cases.

```python
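# Disable caching
# (llm_client and config are assumed to exist, e.g. OpenAIClient() and
# AgentConfiguration() as in Pattern 1)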
cache_config = CacheConfig(enabled=False)

agent = HybridAgent(
    agent_id="agent-1",
    llm_client=llm_client,
    tools=["search"],
    config=config,
    cache_config=cache_config
)

# All tool calls execute directly (no caching)
result = await agent.execute_tool_with_cache("search", {"query": "Python"})
```

### Pattern 3: Automatic Caching

The agent automatically caches tool results when caching is enabled.

```python
# Agent automatically caches results
result = await agent.execute_task(
    {"description": "Search for Python"},
    {}
)
# Tool result cached automatically
```
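
Conceptually, automatic caching is a check-then-execute wrapper around each tool call. The sketch below shows that pattern with a plain dictionary; `get_or_execute` and `fake_search` are hypothetical stand-ins, and the agent's internal logic may differ.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple

call_count = 0  # counts how often the "tool" really runs


async def fake_search() -> str:
    """Stand-in for an expensive tool call."""
    global call_count
    call_count += 1
    await asyncio.sleep(0)
    return "results for Python"


async def get_or_execute(
    cache: Dict[str, Any], key: str, execute: Callable[[], Awaitable[Any]]
) -> Tuple[Any, bool]:
    """Return (result, was_cached); run the tool only on a cache miss."""
    if key in cache:
        return cache[key], True
    result = await execute()
    cache[key] = result
    return result, False


async def main():
    cache: Dict[str, Any] = {}
    first = await get_or_execute(cache, "search:Python", fake_search)
    second = await get_or_execute(cache, "search:Python", fake_search)
    return first, second


first, second = asyncio.run(main())
print(first, second, call_count)
```

The second call returns the stored result, so the tool runs only once.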

## Cache Configuration

### Pattern 1: Basic Configuration

Configure basic caching settings.

```python
cache_config = CacheConfig(
    enabled=True,
    default_ttl=300,  # 5 minutes
    max_cache_size=1000,  # Maximum 1000 cached entries
    max_memory_mb=100  # Maximum 100MB cache memory
)

agent = HybridAgent(
    agent_id="agent-1",
    llm_client=llm_client,
    tools=["search"],
    config=config,
    cache_config=cache_config
)
```

### Pattern 2: Aggressive Caching

Configure aggressive caching for expensive operations.

```python
cache_config = CacheConfig(
    enabled=True,
    default_ttl=3600,  # 1 hour default
    max_cache_size=5000,  # Larger cache
    max_memory_mb=500,  # More memory
    cleanup_threshold=0.95  # Cleanup at 95% capacity
)

agent = HybridAgent(
    agent_id="agent-1",
    llm_client=llm_client,
    tools=["search"],
    config=config,
    cache_config=cache_config
)
```
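
`cleanup_threshold` is expressed as a fraction of `max_cache_size`. How aiecs chooses which entries to remove is not described in this guide; the sketch below assumes the simplest policy, oldest-first eviction, and uses the hypothetical helpers `should_cleanup` and `evict_oldest`.

```python
from collections import OrderedDict


def should_cleanup(current_size: int, max_cache_size: int, cleanup_threshold: float) -> bool:
    """Return True once the cache has filled to the threshold fraction."""
    return current_size >= max_cache_size * cleanup_threshold


def evict_oldest(entries: OrderedDict, target_size: int) -> int:
    """Remove the oldest-inserted entries until at most target_size remain."""
    removed = 0
    while len(entries) > target_size:
        entries.popitem(last=False)  # last=False pops the oldest entry
        removed += 1
    return removed


# Nine entries in a cache of capacity 10 with an 80% threshold.
entries = OrderedDict((f"key-{i}", i) for i in range(9))
removed = 0
if should_cleanup(len(entries), max_cache_size=10, cleanup_threshold=0.8):
    removed = evict_oldest(entries, target_size=5)
print(removed, list(entries))
```

A higher threshold (such as 0.95) lets the cache run fuller before cleanup starts, which suits aggressive caching; a lower one keeps more headroom.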

### Pattern 3: Conservative Caching

Configure conservative caching for frequently changing data.

```python
cache_config = CacheConfig(
    enabled=True,
    default_ttl=60,  # 1 minute (short TTL)
    max_cache_size=100,  # Small cache
    cleanup_threshold=0.8  # Cleanup at 80% capacity
)

agent = HybridAgent(
    agent_id="agent-1",
    llm_client=llm_client,
    tools=["weather"],  # Weather changes frequently
    config=config,
    cache_config=cache_config
)
```

## Per-Tool TTL Configuration

### Pattern 1: Tool-Specific TTL

Configure a different TTL for each tool. Tools without an entry fall back to `default_ttl`.

```python
cache_config = CacheConfig(
    enabled=True,
    default_ttl=300,  # 5 minutes default
    tool_specific_ttl={
        "search": 600,  # Search cached for 10 minutes
        "calculator": 3600,  # Calculator cached for 1 hour
        "weather": 1800,  # Weather cached for 30 minutes
        "translation": 7200  # Translation cached for 2 hours
    }
)

agent = HybridAgent(
    agent_id="agent-1",
    llm_client=llm_client,
    tools=["search", "calculator", "weather", "translation"],
    config=config,
    cache_config=cache_config
)
```

### Pattern 2: Disable Caching for Specific Tools

Set a tool's TTL to 0 to disable caching for that tool while the remaining tools keep the default TTL.

```python
cache_config = CacheConfig(
    enabled=True,
    default_ttl=300,
    tool_specific_ttl={
        "live_data": 0,  # Disable caching (0 TTL)
        "real_time": 0
    }
)

agent = HybridAgent(
    agent_id="agent-1",
    llm_client=llm_client,
    tools=["search", "live_data", "real_time"],
    config=config,
    cache_config=cache_config
)
```
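
The per-tool settings in this section can be read as a lookup with a fallback, where a TTL of 0 means "do not cache". The following sketch spells that reading out; `resolve_ttl` and `is_cacheable` are hypothetical helpers, not aiecs functions.

```python
from typing import Dict, Optional


def resolve_ttl(
    tool_name: str, default_ttl: int, tool_specific_ttl: Optional[Dict[str, int]] = None
) -> int:
    """Return the tool's own TTL if configured, otherwise the default."""
    overrides = tool_specific_ttl or {}
    return overrides.get(tool_name, default_ttl)


def is_cacheable(
    tool_name: str, default_ttl: int, tool_specific_ttl: Optional[Dict[str, int]] = None
) -> bool:
    """A TTL of 0 is treated as 'caching disabled for this tool'."""
    return resolve_ttl(tool_name, default_ttl, tool_specific_ttl) > 0


overrides = {"search": 600, "live_data": 0}
print(resolve_ttl("search", 300, overrides))      # 600: tool-specific value
print(resolve_ttl("calculator", 300, overrides))  # 300: falls back to default
print(is_cacheable("live_data", 300, overrides))  # False: TTL of 0
```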

### Pattern 3: Long-Term Caching

Configure long-term caching for stable data.

```python
cache_config = CacheConfig(
    enabled=True,
    default_ttl=300,
    tool_specific_ttl={
        "dictionary": 86400,  # 24 hours
        "encyclopedia": 86400,  # 24 hours
        "historical_data": 604800  # 7 days
    }
)

agent = HybridAgent(
    agent_id="agent-1",
    llm_client=llm_client,
    tools=["dictionary", "encyclopedia", "historical_data"],
    config=config,
    cache_config=cache_config
)
```

## Cache Management

### Pattern 1: Cache Statistics

Get cache statistics to monitor performance.

```python
# Get cache statistics
stats = agent.get_cache_stats()

print(f"Cache size: {stats['size']}")
print(f"Cache hits: {stats['hits']}")
print(f"Cache misses: {stats['misses']}")
print(f"Hit rate: {stats['hit_rate']:.1%}")
print(f"Memory usage: {stats['memory_mb']:.1f}MB")
```
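
The hit rate is conventionally the share of lookups served from the cache, that is, hits divided by hits plus misses. Assuming `stats['hit_rate']` follows that convention, it can be reproduced from the raw counters as in this sketch (`CacheCounters` is a hypothetical class).

```python
from dataclasses import dataclass


@dataclass
class CacheCounters:
    """Raw hit and miss counters, as a cache might track them."""

    hits: int = 0
    misses: int = 0

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        # Avoid dividing by zero before any lookup has happened.
        return self.hits / total if total else 0.0


counters = CacheCounters(hits=42, misses=18)
print(f"Hit rate: {counters.hit_rate:.1%}")  # Hit rate: 70.0%
```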

### Pattern 2: Cache Cleanup

Manually trigger cache cleanup.

```python
# Clean up expired entries
cleaned_count = agent.cleanup_cache()

print(f"Cleaned up {cleaned_count} expired entries")
```

### Pattern 3: Cache Size Monitoring

Monitor cache size and trigger cleanup when needed.

```python
stats = agent.get_cache_stats()

if stats['size'] > cache_config.max_cache_size * 0.9:
    # Cache approaching limit - cleanup
    cleaned_count = agent.cleanup_cache()
    print(f"Cleaned up {cleaned_count} entries")
```

## Cache Invalidation

### Pattern 1: Invalidate Specific Tool

Invalidate cache for a specific tool.

```python
# Invalidate all cache entries for "search" tool
invalidated_count = agent.invalidate_cache(tool_name="search")

print(f"Invalidated {invalidated_count} cache entries")
```

### Pattern 2: Invalidate by Pattern

Invalidate cache entries matching a pattern.

```python
# Invalidate cache entries matching pattern
invalidated_count = agent.invalidate_cache(pattern="query:Python*")

print(f"Invalidated {invalidated_count} cache entries")
```
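
The pattern syntax accepted by `invalidate_cache` is not specified in this guide. If it is glob-style, as the trailing `*` suggests, the matching step could look like the sketch below, which uses the standard library's `fnmatch` on a plain dictionary. `invalidate_matching` and the key format are assumptions for illustration.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict


def invalidate_matching(cache: Dict[str, Any], pattern: str) -> int:
    """Delete every entry whose key matches the glob pattern; return the count."""
    # Collect keys first so the dict is not mutated while iterating over it.
    doomed = [key for key in cache if fnmatchcase(key, pattern)]
    for key in doomed:
        del cache[key]
    return len(doomed)


cache = {"query:Python": 1, "query:Python 3.12": 2, "query:Rust": 3}
removed = invalidate_matching(cache, "query:Python*")
print(removed, sorted(cache))  # 2 ['query:Rust']
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive on every platform.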

### Pattern 3: Clear All Cache

Clear the entire cache by calling `invalidate_cache()` with no arguments.

```python
# Clear all cache
invalidated_count = agent.invalidate_cache()

print(f"Cleared {invalidated_count} cache entries")
```

### Pattern 4: Time-Based Invalidation

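Invalidate cache entries based on their age. The snippet is an outline only: this guide does not show an agent method for age-based invalidation, so the final step depends on how the cache stores entry timestamps.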

```python
import time

# Invalidate entries older than 1 hour
stats = agent.get_cache_stats()
current_time = time.time()

# Get cache timestamps and invalidate old entries
# (Implementation depends on agent's cache structure)
```
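
As a standalone illustration of the age check, the sketch below works on a plain dictionary that stores a timestamp with each value. It does not touch the agent's cache; `invalidate_older_than` and the `(stored_at, value)` layout are assumptions.

```python
import time
from typing import Any, Dict, Optional, Tuple


def invalidate_older_than(
    cache: Dict[str, Tuple[float, Any]], max_age_seconds: float, now: Optional[float] = None
) -> int:
    """Remove entries stored more than max_age_seconds ago; return the count."""
    current = time.time() if now is None else now
    stale = [
        key
        for key, (stored_at, _value) in cache.items()
        if current - stored_at > max_age_seconds
    ]
    for key in stale:
        del cache[key]
    return len(stale)


# Fixed "now" so the example is deterministic.
now = 10_000.0
cache = {
    "search:old": (now - 7200, "two hours old"),
    "search:new": (now - 60, "one minute old"),
}
removed = invalidate_older_than(cache, max_age_seconds=3600, now=now)
print(removed, list(cache))  # 1 ['search:new']
```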

## Performance Monitoring

### Pattern 1: Cache Hit Rate Monitoring

Monitor cache hit rate to optimize configuration.

```python
import logging

logger = logging.getLogger(__name__)

# Get cache statistics
stats = agent.get_cache_stats()

if stats['hit_rate'] < 0.5:  # Less than 50% hit rate
    logger.warning("Low cache hit rate - consider adjusting TTL")
elif stats['hit_rate'] > 0.9:  # More than 90% hit rate
    logger.info("High cache hit rate - caching is effective")
```

### Pattern 2: Cost Savings Calculation

Calculate cost savings from caching.

```python
stats = agent.get_cache_stats()

# Estimate cost savings
api_call_cost = 0.01  # $0.01 per API call
cache_hits = stats['hits']
cost_saved = cache_hits * api_call_cost

print(f"Cache hits: {cache_hits}")
print(f"Estimated cost saved: ${cost_saved:.2f}")
```
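
Under the simplifying assumptions that every call costs the same and a cache hit costs nothing, the fraction of spend saved equals the hit rate. The sketch below makes that relationship explicit; `estimate_savings` is a hypothetical helper and the figures are illustrative, not measurements.

```python
from typing import Tuple


def estimate_savings(hits: int, misses: int, api_call_cost: float) -> Tuple[float, float]:
    """Return (amount saved, fraction of the uncached cost saved)."""
    total_calls = hits + misses
    cost_without_cache = total_calls * api_call_cost  # every request hits the API
    cost_with_cache = misses * api_call_cost          # only misses hit the API
    saved = cost_without_cache - cost_with_cache
    fraction = saved / cost_without_cache if cost_without_cache else 0.0
    return saved, fraction


saved, fraction = estimate_savings(hits=400, misses=600, api_call_cost=0.01)
print(f"Saved ${saved:.2f} ({fraction:.0%} of uncached cost)")
```

With 400 hits out of 1000 requests, the saved fraction is 40%, which is how a hit rate in the 30-50% range translates into a 30-50% cost reduction.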

### Pattern 3: Performance Impact

Measure performance impact of caching.

```python
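import time

# Uncached baseline: call the tool directly
start = time.perf_counter()
result1 = await agent.execute_tool("search", {"query": "Python"})
time_without_cache = time.perf_counter() - start

# Prime the cache, in case the direct call above did not populate it
await agent.execute_tool_with_cache("search", {"query": "Python"})

# Cached call: same parameters, served from the cache
start = time.perf_counter()
result2 = await agent.execute_tool_with_cache("search", {"query": "Python"})
time_with_cache = time.perf_counter() - start

# Guard against a zero-length measurement on very fast cache hits
speedup = time_without_cache / max(time_with_cache, 1e-9)
print(f"Speedup: {speedup:.1f}x faster with cache")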

## Best Practices

### 1. Configure Appropriate TTL

Set TTL based on data freshness requirements:

```python
# Stable data: Long TTL
cache_config = CacheConfig(
    tool_specific_ttl={
        "dictionary": 86400,  # 24 hours
        "encyclopedia": 86400
    }
)

# Frequently changing data: Short TTL
cache_config = CacheConfig(
    tool_specific_ttl={
        "weather": 1800,  # 30 minutes
        "stock_prices": 60  # 1 minute
    }
)
```

### 2. Monitor Cache Performance

Regularly monitor cache performance:

```python
import logging

logger = logging.getLogger(__name__)

stats = agent.get_cache_stats()

if stats['hit_rate'] < 0.3:
    # Low hit rate - consider disabling caching or adjusting TTL
    logger.warning("Low cache hit rate")
```

### 3. Set Appropriate Cache Size

Set cache size based on available memory:

```python
cache_config = CacheConfig(
    max_cache_size=1000,  # Adjust based on memory
    max_memory_mb=100  # Monitor memory usage
)
```
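
One way to pick these numbers is to measure a few representative tool results and extrapolate. The sketch below uses the JSON-encoded size as a rough proxy; in-memory Python objects are usually larger, so treat the figure as a lower bound. `estimate_cache_memory_mb` and the sample data are hypothetical.

```python
import json
from typing import Any, Sequence


def estimate_cache_memory_mb(sample_results: Sequence[Any], max_cache_size: int) -> float:
    """Extrapolate cache memory from the average encoded size of sample results."""
    if not sample_results:
        raise ValueError("need at least one sample result")
    sizes = [len(json.dumps(r, default=str).encode("utf-8")) for r in sample_results]
    average_bytes = sum(sizes) / len(sizes)
    return average_bytes * max_cache_size / (1024 * 1024)


samples = [
    {"title": "Python", "snippet": "x" * 2000},
    {"title": "Rust", "snippet": "y" * 4000},
]
estimate = estimate_cache_memory_mb(samples, max_cache_size=1000)
print(f"Roughly {estimate:.1f} MB for 1000 entries")
```

If the estimate for `max_cache_size` entries is well above `max_memory_mb`, the memory limit will be reached first and the entry limit has little effect.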

### 4. Invalidate Stale Data

Invalidate cache when data changes:

```python
# After updating data
agent.invalidate_cache(tool_name="search")
```

### 5. Use for Expensive Operations

Use caching for expensive operations:

```python
# Good: Expensive API calls
cache_config = CacheConfig(
    tool_specific_ttl={
        "search": 600,  # Cache expensive search
        "translation": 3600  # Cache expensive translation
    }
)

# Less useful: Cheap operations
cache_config = CacheConfig(
    tool_specific_ttl={
        "calculator": 0  # Don't cache cheap calculations
    }
)
```

### 6. Handle Cache Errors Gracefully

Handle cache errors without breaking functionality:

```python
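# Assumes `logger` is a configured logging.Logger and that CacheError has been
# imported from aiecs (its import path is not shown in this guide)
try: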
    result = await agent.execute_tool_with_cache("search", {"query": "Python"})
except CacheError as e:
    logger.error(f"Cache error: {e}")
    # Fall back to direct execution
    result = await agent.execute_tool("search", {"query": "Python"})
```

## Summary

Tool result caching provides:
- ✅ 30-50% cost reduction
- ✅ Faster responses for cached results
- ✅ Configurable TTL per tool
- ✅ Automatic cleanup
- ✅ Memory management

**Key Takeaways**:
- Use for expensive operations
- Configure appropriate TTL
- Monitor cache performance
- Invalidate stale data
- Set appropriate cache size

For more details, see:
- [Agent Integration Guide](./AGENT_INTEGRATION.md)
- [Parallel Tool Execution](./PARALLEL_TOOL_EXECUTION.md)