Custom LLM Client Integration
This document describes how to integrate custom LLM providers into the AIECS framework using the LLMClientProtocol interface.
Overview
AIECS supports custom LLM providers through a protocol-based interface that allows you to:
Integrate local LLM models (Ollama, LM Studio, etc.)
Use proprietary or specialized LLM services
Implement custom embedding providers
Configure different LLMs for different operations (entity extraction, RAG strategy selection, embeddings)
LLMClientProtocol Interface
Custom LLM clients must implement the LLMClientProtocol interface defined in aiecs/llm/protocols.py:
from typing import Protocol, List, Optional, AsyncGenerator, Dict, Any, runtime_checkable


@runtime_checkable
class LLMClientProtocol(Protocol):
    """Protocol that all LLM clients must implement"""

    @property
    def provider_name(self) -> str:
        """Name of the LLM provider"""
        ...

    @property
    def model_name(self) -> str:
        """Name of the model being used"""
        ...

    async def generate_text(
        self,
        prompt: str,
        max_tokens: Optional[int] = None,
        temperature: Optional[float] = None,
        **kwargs
    ) -> str:
        """Generate text from a prompt"""
        ...

    async def stream_text(
        self,
        prompt: str,
        max_tokens: Optional[int] = None,
        temperature: Optional[float] = None,
        **kwargs
    ) -> AsyncGenerator[str, None]:
        """Stream text generation from a prompt"""
        ...

    async def get_embeddings(
        self,
        texts: List[str],
        **kwargs
    ) -> List[List[float]]:
        """Generate embeddings for a list of texts"""
        ...
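Because the protocol is decorated with @runtime_checkable, structural compliance can be checked with isinstance before a client is registered. The sketch below uses a trimmed, hypothetical stand-in (TextClientProtocol) rather than the real LLMClientProtocol so that it runs without AIECS installed; EchoClient and BrokenClient are likewise invented for illustration. Note that isinstance only verifies that the members exist, not their signatures or whether they are coroutines.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class TextClientProtocol(Protocol):
    """Hypothetical stand-in for LLMClientProtocol, trimmed to two members."""

    @property
    def provider_name(self) -> str: ...

    async def generate_text(self, prompt: str, **kwargs) -> str: ...


class EchoClient:
    """Toy client that satisfies the stand-in protocol structurally."""

    @property
    def provider_name(self) -> str:
        return "echo"

    async def generate_text(self, prompt: str, **kwargs) -> str:
        return prompt


class BrokenClient:
    """Lacks generate_text, so the structural check fails."""

    provider_name = "broken"


# No inheritance is involved: the check looks only at which members exist.
echo_ok = isinstance(EchoClient(), TextClientProtocol)
broken_ok = isinstance(BrokenClient(), TextClientProtocol)
```

A check like this catches a missing method at registration time instead of at the first call, but it is no substitute for tests that actually await each method.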
Implementing a Custom LLM Client
Example 1: Local Model Client (Ollama)
import json
from typing import Any, AsyncGenerator, Dict, List, Optional

import httpx


class OllamaClient:
    """Custom client for Ollama local models"""

    def __init__(
        self,
        model: str = "llama2",
        base_url: str = "http://localhost:11434",
        timeout: float = 60.0,
    ):
        self._model = model
        self._base_url = base_url
        # httpx defaults to a 5-second timeout, which local generation can exceed
        self._client = httpx.AsyncClient(timeout=timeout)

    @property
    def provider_name(self) -> str:
        return "ollama"

    @property
    def model_name(self) -> str:
        return self._model

    def _build_payload(
        self,
        prompt: str,
        stream: bool,
        max_tokens: Optional[int],
        temperature: Optional[float],
    ) -> Dict[str, Any]:
        """Build the request body; Ollama reads sampling parameters from "options"."""
        payload: Dict[str, Any] = {"model": self._model, "prompt": prompt, "stream": stream}
        options: Dict[str, Any] = {}
        if temperature is not None:
            options["temperature"] = temperature
        if max_tokens is not None:
            options["num_predict"] = max_tokens  # Ollama's name for the token limit
        if options:
            payload["options"] = options
        return payload

    async def generate_text(
        self,
        prompt: str,
        max_tokens: Optional[int] = None,
        temperature: Optional[float] = None,
        **kwargs
    ) -> str:
        """Generate text using Ollama API"""
        response = await self._client.post(
            f"{self._base_url}/api/generate",
            json=self._build_payload(prompt, False, max_tokens, temperature),
        )
        response.raise_for_status()
        return response.json()["response"]

    async def stream_text(
        self,
        prompt: str,
        max_tokens: Optional[int] = None,
        temperature: Optional[float] = None,
        **kwargs
    ) -> AsyncGenerator[str, None]:
        """Stream text generation using Ollama API"""
        async with self._client.stream(
            "POST",
            f"{self._base_url}/api/generate",
            json=self._build_payload(prompt, True, max_tokens, temperature),
        ) as response:
            response.raise_for_status()
            # Each non-empty line is one JSON object carrying a text fragment
            async for line in response.aiter_lines():
                if line:
                    data = json.loads(line)
                    if "response" in data:
                        yield data["response"]

    async def get_embeddings(
        self,
        texts: List[str],
        **kwargs
    ) -> List[List[float]]:
        """Generate embeddings using Ollama API"""
        embeddings = []
        # The embeddings endpoint takes one prompt per request
        for text in texts:
            response = await self._client.post(
                f"{self._base_url}/api/embeddings",
                json={"model": self._model, "prompt": text},
            )
            response.raise_for_status()
            embeddings.append(response.json()["embedding"])
        return embeddings

    async def close(self):
        """Clean up resources"""
        await self._client.aclose()
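The streaming method above relies on Ollama returning newline-delimited JSON, one object per line. The parsing step can be exercised on its own, without a running server, as in the following sketch. The helper name iter_response_chunks and the sample lines are illustrative assumptions modeled on that format, not captured server output.

```python
import json
from typing import Iterable, Iterator


def iter_response_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragment from each newline-delimited JSON line."""
    for line in lines:
        if not line.strip():
            continue  # Skip blank keep-alive lines
        data = json.loads(line)
        if "response" in data:
            yield data["response"]
        if data.get("done"):
            break  # The final object marks the end of the stream


# Illustrative sample, hand-written for this sketch
sample_lines = [
    '{"response": "Par", "done": false}',
    "",
    '{"response": "is", "done": false}',
    '{"response": "", "done": true}',
]

text = "".join(iter_response_chunks(sample_lines))
```

Keeping the parser separate from the HTTP call makes it straightforward to unit test the client's streaming logic against recorded or hand-written lines.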
Example 2: Custom Embedding Client
import asyncio
from typing import List


class CustomEmbeddingClient:
    """Custom embedding client for specialized embeddings"""

    def __init__(self, model: str = "all-MiniLM-L6-v2"):
        self._model = model
        self._encoder = None  # Loaded lazily on first use, then reused

    @property
    def provider_name(self) -> str:
        return "custom-embeddings"

    @property
    def model_name(self) -> str:
        return self._model

    async def generate_text(self, prompt: str, **kwargs) -> str:
        """Not used for embedding-only clients"""
        raise NotImplementedError("This client only supports embeddings")

    async def stream_text(self, prompt: str, **kwargs):
        """Not used for embedding-only clients"""
        raise NotImplementedError("This client only supports embeddings")
        yield  # Unreachable; keeps this an async generator so `async for` raises the error above

    async def get_embeddings(self, texts: List[str], **kwargs) -> List[List[float]]:
        """Generate custom embeddings"""
        # Example: Using a local sentence transformer model
        if self._encoder is None:
            from sentence_transformers import SentenceTransformer
            self._encoder = SentenceTransformer(self._model)
        # encode() is blocking, so run it in a worker thread (Python 3.9+)
        embeddings = await asyncio.to_thread(self._encoder.encode, texts)
        return embeddings.tolist()
Registering Custom Providers
Once you’ve implemented a custom client, register it with the LLMClientFactory:
from aiecs.llm import LLMClientFactory
# Create your custom client instance
ollama_client = OllamaClient(model="llama2")
# Register it with a unique provider name
LLMClientFactory.register_custom_provider("ollama", ollama_client)
# Now you can use it anywhere in AIECS
from aiecs.llm import resolve_llm_client
client = resolve_llm_client("ollama")
# generate_text is a coroutine: call it from inside an async function
response = await client.generate_text("What is the capital of France?")
Configuration-Driven Usage
You can configure AIECS to use custom providers through environment variables:
Entity Extraction Configuration
# Use custom LLM for entity extraction
export KG_ENTITY_EXTRACTION_PROVIDER="ollama"
export KG_ENTITY_EXTRACTION_MODEL="llama2"
from aiecs.llm import LLMClientFactory
from aiecs.application.knowledge_graph.extraction import LLMEntityExtractor
# Register your custom provider first
LLMClientFactory.register_custom_provider("ollama", OllamaClient(model="llama2"))
# Create extractor using configuration
extractor = LLMEntityExtractor.from_config()
# Will automatically use the ollama provider from environment variables
RAG Strategy Selection Configuration
# Use custom LLM for query intent classification
export KG_STRATEGY_SELECTION_PROVIDER="ollama"
export KG_STRATEGY_SELECTION_MODEL="llama2"
from aiecs.llm import LLMClientFactory
from aiecs.application.knowledge_graph.retrieval import QueryIntentClassifier
# Register your custom provider first
LLMClientFactory.register_custom_provider("ollama", OllamaClient(model="llama2"))
# Create classifier using configuration
classifier = QueryIntentClassifier.from_config()
# Will automatically use the ollama provider from environment variables
Embedding Configuration
# Use custom embedding provider
export KG_EMBEDDING_PROVIDER="custom-embeddings"
export KG_EMBEDDING_MODEL="all-MiniLM-L6-v2"
export KG_EMBEDDING_DIMENSION=384
from aiecs.llm import LLMClientFactory
from aiecs.application.knowledge_graph.builder import GraphBuilder
# Register your custom embedding provider first
embedding_client = CustomEmbeddingClient(model="all-MiniLM-L6-v2")
LLMClientFactory.register_custom_provider("custom-embeddings", embedding_client)
# Create graph builder using configuration
# (my_graph_store and the two extractors are components you have already built)
builder = GraphBuilder.from_config(
    graph_store=my_graph_store,
    entity_extractor=my_entity_extractor,
    relation_extractor=my_relation_extractor
)
# Will automatically use the custom embedding provider from environment variables
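Since KG_EMBEDDING_DIMENSION is configured separately from the model, a mismatch between the configured value and what the client actually returns is easy to introduce. A small guard along the following lines may help catch it early. The helper check_embedding_dimension is hypothetical and not part of AIECS, and the zero vectors stand in for real get_embeddings output.

```python
import os
from typing import List


def check_embedding_dimension(vectors: List[List[float]], expected: int) -> None:
    """Raise ValueError if any vector's length differs from the expected dimension."""
    for index, vector in enumerate(vectors):
        if len(vector) != expected:
            raise ValueError(
                f"Embedding {index} has dimension {len(vector)}, expected {expected}"
            )


# Read the configured dimension, falling back to 384 as in the example above
expected_dim = int(os.environ.get("KG_EMBEDDING_DIMENSION", "384"))

# Placeholder vectors; in practice these would come from get_embeddings()
vectors = [[0.0] * expected_dim, [0.0] * expected_dim]
check_embedding_dimension(vectors, expected_dim)
```

Running such a check once at startup, on a single short text, is usually enough to detect a wrong model or a stale environment variable.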
Cost Optimization Example
Use different LLMs for different operations to optimize costs:
from aiecs.llm import LLMClientFactory
from aiecs.application.knowledge_graph.extraction import LLMEntityExtractor
from aiecs.application.knowledge_graph.retrieval import QueryIntentClassifier
from aiecs.application.knowledge_graph.builder import GraphBuilder
import os
# Register custom providers
ollama_client = OllamaClient(model="llama2") # Free local model
LLMClientFactory.register_custom_provider("ollama", ollama_client)
# Use free local model for classification (lightweight task)
os.environ["KG_STRATEGY_SELECTION_PROVIDER"] = "ollama"
os.environ["KG_STRATEGY_SELECTION_MODEL"] = "llama2"
# Use powerful cloud model for entity extraction (complex task)
os.environ["KG_ENTITY_EXTRACTION_PROVIDER"] = "OpenAI"
os.environ["KG_ENTITY_EXTRACTION_MODEL"] = "gpt-4"
# Use local embeddings to avoid API costs
embedding_client = CustomEmbeddingClient(model="all-MiniLM-L6-v2")
LLMClientFactory.register_custom_provider("local-embeddings", embedding_client)
os.environ["KG_EMBEDDING_PROVIDER"] = "local-embeddings"
os.environ["KG_EMBEDDING_MODEL"] = "all-MiniLM-L6-v2"
os.environ["KG_EMBEDDING_DIMENSION"] = "384"
# Create components - they'll use the configured providers
classifier = QueryIntentClassifier.from_config() # Uses ollama
extractor = LLMEntityExtractor.from_config() # Uses OpenAI GPT-4
builder = GraphBuilder.from_config(...) # Uses local embeddings
Best Practices
Protocol Compliance: Ensure your client implements all required methods of LLMClientProtocol
Error Handling: Implement proper error handling and retries in your client
Resource Cleanup: Implement a close() method to clean up resources (connections, etc.)
Async Support: Use async/await for all I/O operations
Type Hints: Use proper type hints for better IDE support and type checking
Testing: Test your custom client thoroughly before using it in production
Configuration: Use environment variables for configuration to avoid hardcoding values
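For the error-handling practice, one common approach is to wrap each network call in a retry loop with exponential backoff. The sketch below is one possible shape, not an AIECS facility: the helper with_retries, its parameters, and the flaky_call demo are all assumptions made for illustration. It retries only the exception types listed in retry_on and re-raises once the attempts are exhausted.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Await call(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


call_count = 0


async def flaky_call() -> str:
    """Demo coroutine that fails twice, then succeeds."""
    global call_count
    call_count += 1
    if call_count < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# base_delay=0 keeps the demo instant
result = asyncio.run(with_retries(flaky_call, base_delay=0))
```

Inside a client, a call such as self._client.post(...) could be passed in as a lambda, with retry_on set to the transport errors of the HTTP library in use.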
Troubleshooting
Client Not Found
If you get “Unknown provider” errors:
Ensure you’ve registered the provider before using it
Check that the provider name matches exactly (case-sensitive)
Register providers at application startup, before creating components
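To see why ordering and exact spelling matter, it can help to picture the factory as a plain name-to-client dictionary. The class below is a hypothetical illustration of that idea, not the actual LLMClientFactory implementation; ProviderRegistry, register, and resolve are names invented for this sketch.

```python
from typing import Dict


class ProviderRegistry:
    """Hypothetical name-to-client registry with exact, case-sensitive lookup."""

    def __init__(self) -> None:
        self._providers: Dict[str, object] = {}

    def register(self, name: str, client: object) -> None:
        self._providers[name] = client

    def resolve(self, name: str) -> object:
        try:
            return self._providers[name]
        except KeyError:
            known = sorted(self._providers)
            raise ValueError(f"Unknown provider: {name!r}. Registered: {known}") from None


registry = ProviderRegistry()
client = object()  # Stand-in for a real client instance
registry.register("ollama", client)

resolved = registry.resolve("ollama")  # Exact match succeeds

try:
    registry.resolve("Ollama")  # Different case: not found
    case_mismatch_error = ""
except ValueError as exc:
    case_mismatch_error = str(exc)
```

Under this model, resolving a name before the register call has run, or with different capitalization, fails in the same way, which matches the symptoms listed above.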
Settings Cache Issues
If environment variable changes aren’t picked up:
from aiecs.config.config import get_settings
get_settings.cache_clear() # Clear settings cache
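The cache_clear() call suggests that get_settings is memoized, most likely with functools.lru_cache, although that is an assumption about the AIECS internals. The self-contained sketch below reproduces the symptom with a hypothetical get_demo_settings function and a made-up DEMO_PROVIDER variable: the cached result ignores later environment changes until the cache is cleared.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def get_demo_settings() -> dict:
    """Hypothetical settings loader that reads the environment once, then caches."""
    return {"provider": os.environ.get("DEMO_PROVIDER", "unset")}


os.environ["DEMO_PROVIDER"] = "ollama"
first = get_demo_settings()["provider"]   # Reads the environment

os.environ["DEMO_PROVIDER"] = "OpenAI"
stale = get_demo_settings()["provider"]   # Served from the cache, change not seen

get_demo_settings.cache_clear()
fresh = get_demo_settings()["provider"]   # Re-reads the environment
```

If the real settings object behaves this way, environment variables should be set before the first call to get_settings, or the cache cleared after changing them.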
Client Cache Issues
If you need to force re-resolution of clients:
from aiecs.llm import clear_client_cache
clear_client_cache() # Clear all cached clients
clear_client_cache("ollama") # Clear specific provider
See Also
LLM Configuration - General LLM configuration
Base LLM Client - Built-in LLM clients
LLM AI Clients - Standard provider clients