LLM API
This section documents the LLM integration components.
Client Factory
- class aiecs.llm.client_factory.AIProvider[source]
- OPENAI = 'OpenAI'
- VERTEX = 'Vertex'
- GOOGLEAI = 'GoogleAI'
- XAI = 'xAI'
- OPENROUTER = 'OpenRouter'
- ANTHROPIC_VERTEX = 'AnthropicVertex'
- VERTEX_MAAS = 'VertexMaaS'
- __new__(value)
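To illustrate how the string values above map to enum members, here is a self-contained stand-in that mirrors the documented members. It is not the aiecs class; `parse_provider` is a hypothetical helper, and subclassing `str` is an assumption made for this sketch.

```python
from enum import Enum
from typing import Optional


class AIProvider(str, Enum):
    """Stand-in mirroring the documented members (not the aiecs class)."""

    OPENAI = "OpenAI"
    VERTEX = "Vertex"
    GOOGLEAI = "GoogleAI"
    XAI = "xAI"
    OPENROUTER = "OpenRouter"
    ANTHROPIC_VERTEX = "AnthropicVertex"
    VERTEX_MAAS = "VertexMaaS"


def parse_provider(name: str) -> Optional[AIProvider]:
    """Hypothetical helper: return the matching member, or None if unknown."""
    try:
        return AIProvider(name)  # lookup by value is case-sensitive
    except ValueError:
        return None
```

Because lookup is by value, `"xAI"` resolves to a member while `"xai"` does not.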
- class aiecs.llm.client_factory.LLMClientFactory[source]
Bases: object
Factory for creating and managing LLM provider clients
- classmethod register_custom_provider(name, client)[source]
Register a custom LLM client provider.
This allows registration of custom LLM clients that implement the LLMClientProtocol without inheriting from BaseLLMClient. Custom providers can be retrieved by name using get_client().
- Parameters:
name (str) – Custom provider name (e.g., “my-llm”, “llama-local”, “custom-gpt”)
client (LLMClientProtocol) – Client implementing LLMClientProtocol
- Raises:
ValueError – If client doesn’t implement LLMClientProtocol
ValueError – If name conflicts with standard AIProvider enum values
- Return type:
None
- classmethod get_client(provider)[source]
Get or create a client for the specified provider.
Supports both standard AIProvider enum values and custom provider names registered via register_custom_provider().
- Parameters:
provider (str | AIProvider) – AIProvider enum or custom provider name string
- Returns:
LLM client (BaseLLMClient for standard providers, LLMClientProtocol for custom)
- Raises:
ValueError – If provider is unknown (not standard and not registered)
- Return type:
BaseLLMClient | LLMClientProtocol
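The registration and lookup rules described above can be sketched with a small stand-in registry. This is a simplified model, not the LLMClientFactory implementation. `ClientRegistry` and `EchoClient` are hypothetical names, the protocol's method set is assumed (it is not listed in this section), and the enum is trimmed to two members.

```python
from enum import Enum
from typing import Any, Dict, Protocol, Union, runtime_checkable


class AIProvider(Enum):
    # Trimmed stand-in for the documented enum.
    OPENAI = "OpenAI"
    VERTEX = "Vertex"


@runtime_checkable
class LLMClientProtocol(Protocol):
    # Assumed method set; the real protocol is not shown in this section.
    async def generate_text(self, messages: Any, **kwargs: Any) -> Any: ...


class ClientRegistry:
    """Hypothetical registry modelling the documented rules."""

    def __init__(self) -> None:
        self._custom: Dict[str, Any] = {}

    def register_custom_provider(self, name: str, client: Any) -> None:
        # Structural check: the client only needs the protocol's methods.
        if not isinstance(client, LLMClientProtocol):
            raise ValueError(f"{name!r} does not implement LLMClientProtocol")
        if name in {p.value for p in AIProvider}:
            raise ValueError(f"{name!r} conflicts with a standard provider")
        self._custom[name] = client

    def get_client(self, provider: Union[str, AIProvider]) -> Any:
        key = provider.value if isinstance(provider, AIProvider) else provider
        if key in self._custom:
            return self._custom[key]
        if key in {p.value for p in AIProvider}:
            return f"<standard client for {key}>"  # placeholder, not a real client
        raise ValueError(f"Unknown provider: {key!r}")


class EchoClient:
    """Custom client that does not inherit from any base class."""

    async def generate_text(self, messages: Any, **kwargs: Any) -> Any:
        return messages
```

The point of the structural check is that a custom client only needs to provide the expected methods; it does not have to inherit from BaseLLMClient.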
- async classmethod close_client(provider)[source]
Close a specific client (standard or custom)
- Parameters:
provider (str | AIProvider)
- class aiecs.llm.client_factory.LLMClientManager[source]
Bases: object
High-level manager for LLM operations with context-aware provider selection
- async generate_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]
Generate text using context-aware provider selection
- Parameters:
messages (str | list[LLMMessage]) – Either a string prompt or list of LLMMessage objects
provider (str | AIProvider | None) – AI provider to use (can be overridden by context)
model (str | None) – Specific model to use (can be overridden by context)
context (Dict[str, Any] | None) – TaskContext or dict containing aiPreference
temperature (float) – Sampling temperature (0.0 to 2.0)
max_tokens (int | None) – Maximum tokens to generate
callbacks (List[CustomAsyncCallbackHandler] | None) – List of callback handlers to execute during LLM calls
**kwargs – Additional provider-specific parameters
- Returns:
LLMResponse object with generated text and metadata
- Return type:
LLMResponse
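The two behaviours described in the parameters above (accepting a plain string prompt, and letting the context override the provider and model) might look roughly like this. The sketch rests on assumptions: the `role`/`content` fields of LLMMessage, the `provider` and `model` keys inside `aiPreference`, and both helper names are hypothetical and are not confirmed by this section.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple, Union


@dataclass
class LLMMessage:
    # Stand-in; the role/content fields are assumed.
    role: str
    content: str


def normalize_messages(messages: Union[str, List[LLMMessage]]) -> List[LLMMessage]:
    """Wrap a bare string prompt in a single user message."""
    if isinstance(messages, str):
        return [LLMMessage(role="user", content=messages)]
    return list(messages)


def resolve_selection(
    provider: Optional[str],
    model: Optional[str],
    context: Optional[Dict[str, Any]],
) -> Tuple[Optional[str], Optional[str]]:
    """Let context['aiPreference'] (assumed shape) override explicit arguments."""
    preference = (context or {}).get("aiPreference") or {}
    return preference.get("provider", provider), preference.get("model", model)
```

With this shape, an explicit `provider="OpenAI"` is replaced when the context carries a different preference, and kept otherwise.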
- async stream_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]
Stream text generation using context-aware provider selection
- Parameters:
messages (str | list[LLMMessage]) – Either a string prompt or list of LLMMessage objects
provider (str | AIProvider | None) – AI provider to use (can be overridden by context)
model (str | None) – Specific model to use (can be overridden by context)
context (Dict[str, Any] | None) – TaskContext or dict containing aiPreference
temperature (float) – Sampling temperature (0.0 to 2.0)
max_tokens (int | None) – Maximum tokens to generate
callbacks (List[CustomAsyncCallbackHandler] | None) – List of callback handlers to execute during LLM calls
**kwargs – Additional provider-specific parameters
- Yields:
str – Incremental text chunks
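Since stream_text yields incremental chunks, callers consume it with `async for`. The sketch below shows that consumption pattern against a fake generator; `fake_stream_text` and `collect` are hypothetical stand-ins, not aiecs functions.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream_text(prompt: str) -> AsyncIterator[str]:
    """Stand-in for a streaming call; yields the prompt back word by word."""
    for word in prompt.split():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word + " "


async def collect(stream: AsyncIterator[str]) -> str:
    """Accumulate chunks as they arrive and return the full text."""
    parts: List[str] = []
    async for chunk in stream:
        parts.append(chunk)
    return "".join(parts).strip()


text = asyncio.run(collect(fake_stream_text("incremental text chunks")))
```

In an interactive application each chunk would typically be forwarded to the user as it arrives rather than buffered.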
- async aiecs.llm.client_factory.get_llm_manager()[source]
Get the global LLM manager instance
- Return type:
- async aiecs.llm.client_factory.generate_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]
Generate text using the global LLM manager
LLM Clients
OpenAI Client
- class aiecs.llm.clients.openai_client.OpenAIClient[source]
Bases: BaseLLMClient, OpenAICompatibleFunctionCallingMixin
OpenAI provider client
- async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, input_price=None, output_price=None, **kwargs)[source]
Generate text using OpenAI API with optional function calling support.
- Parameters:
messages (List[LLMMessage]) – List of LLM messages
model (str | None) – Model name (optional)
temperature (float) – Temperature for generation
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
  - user_id: User identifier for tracking/billing
  - tenant_id: Tenant identifier for multi-tenant setups
  - request_id: Request identifier for tracing
  - session_id: Session identifier
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy (“auto”, “none”, or specific tool)
**kwargs – Additional arguments passed to OpenAI API
input_price (float | None)
output_price (float | None)
- Returns:
LLMResponse with content and optional function_call information
- Return type:
LLMResponse
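To make the difference between the functions and tools parameters concrete: in OpenAI's Chat Completions API, the legacy format is a bare function schema, while the tools format wraps that schema in a `{"type": "function", "function": ...}` envelope. The helper below is a hypothetical illustration of that wrapping; whether OpenAIClient performs such a conversion internally is not stated here.

```python
from typing import Any, Dict, List


def functions_to_tools(functions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Wrap legacy function schemas in the tools envelope."""
    return [{"type": "function", "function": dict(fn)} for fn in functions]


# Legacy-format schema for an illustrative function.
get_weather = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

tools = functions_to_tools([get_weather])
```

The resulting list is what would be passed as `tools`, with `tool_choice` set to "auto", "none", or a specific tool.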
- async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, input_price=None, output_price=None, **kwargs)[source]
Stream text using OpenAI API with optional function calling support.
- Parameters:
messages (List[LLMMessage]) – List of LLM messages
model (str | None) – Model name (optional)
temperature (float) – Temperature for generation
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
  - user_id: User identifier for tracking/billing
  - tenant_id: Tenant identifier for multi-tenant setups
  - request_id: Request identifier for tracing
  - session_id: Session identifier
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy (“auto”, “none”, or specific tool)
return_chunks (bool) – If True, returns StreamChunk objects with tool_calls info; if False, returns str tokens only
**kwargs – Additional arguments passed to OpenAI API
input_price (float | None)
output_price (float | None)
- Yields:
str or StreamChunk – Text tokens as they are generated, or StreamChunk objects if return_chunks=True
- Return type:
AsyncGenerator[Any, None]
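Because the generator yields either plain strings or chunk objects depending on return_chunks, a consumer may want to handle both. The following sketch assumes a chunk shape with `content` and `tool_calls` attributes; the real StreamChunk fields are not documented in this section, and `fake_stream` and `consume` are hypothetical.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator, Dict, List, Optional, Tuple


@dataclass
class StreamChunk:
    # Hypothetical shape; the real class may differ.
    content: Optional[str] = None
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)


async def fake_stream(return_chunks: bool = False) -> AsyncIterator[Any]:
    """Stand-in stream: str tokens, or chunk objects when return_chunks=True."""
    for token in ("Hel", "lo"):
        yield StreamChunk(content=token) if return_chunks else token
    if return_chunks:
        yield StreamChunk(tool_calls=[{"name": "get_weather"}])


async def consume(stream: AsyncIterator[Any]) -> Tuple[str, List[Dict[str, Any]]]:
    """Collect text and tool calls from either kind of stream."""
    text: List[str] = []
    calls: List[Dict[str, Any]] = []
    async for item in stream:
        if isinstance(item, str):
            text.append(item)
        else:
            if item.content:
                text.append(item.content)
            calls.extend(item.tool_calls)
    return "".join(text), calls
```

With return_chunks=False the tool-call list stays empty, which is the practical reason to enable chunks when function calling is in use.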
Vertex AI Client
- class aiecs.llm.clients.vertex_client.VertexAIClient[source]
Bases: BaseLLMClient, GoogleFunctionCallingMixin
Vertex AI provider client
- async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, system_instruction=None, input_price=None, output_price=None, **kwargs)[source]
Generate text using Vertex AI.
- Parameters:
messages (List[LLMMessage]) – List of conversation messages
model (str | None) – Model name (optional, uses default if not provided)
temperature (float) – Sampling temperature (0.0 to 1.0)
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
  - user_id: User identifier for tracking/billing
  - tenant_id: Tenant identifier for multi-tenant setups
  - request_id: Request identifier for tracing
  - session_id: Session identifier
  - Any other custom metadata for observability or middleware
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy
system_instruction (str | None) – System instruction for the model
**kwargs – Additional provider-specific parameters
input_price (float | None)
output_price (float | None)
- Returns:
LLMResponse with generated text and metadata
- Return type:
LLMResponse
- async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, system_instruction=None, input_price=None, output_price=None, **kwargs)[source]
Stream text using Vertex AI real streaming API with Function Calling support.
- Parameters:
messages (List[LLMMessage]) – List of LLM messages
model (str | None) – Model name (optional)
temperature (float) – Temperature for generation
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
  - user_id: User identifier for tracking/billing
  - tenant_id: Tenant identifier for multi-tenant setups
  - request_id: Request identifier for tracing
  - session_id: Session identifier
  - Any other custom metadata for observability or middleware
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format)
tool_choice (Any | None) – Tool choice strategy (not used for Google Vertex AI)
return_chunks (bool) – If True, returns GoogleStreamChunk objects; if False, returns str tokens only
system_instruction (str | None) – System instruction for prompt caching support
**kwargs – Additional arguments
input_price (float | None)
output_price (float | None)
- Yields:
str or GoogleStreamChunk – Text tokens as they are generated, or GoogleStreamChunk objects if return_chunks=True
- Return type:
AsyncGenerator[Any, None]
xAI Client
- class aiecs.llm.clients.xai_client.XAIClient[source]
Bases: BaseLLMClient, OpenAICompatibleFunctionCallingMixin
xAI (Grok) provider client
- async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, input_price=None, output_price=None, **kwargs)[source]
Generate text using xAI API via OpenAI library (supports all Grok models).
xAI API is OpenAI-compatible, so it supports Function Calling.
- Parameters:
messages (List[LLMMessage]) – List of conversation messages
model (str | None) – Model name (optional, uses default if not provided)
temperature (float) – Sampling temperature (0.0 to 1.0)
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
  - user_id: User identifier for tracking/billing
  - tenant_id: Tenant identifier for multi-tenant setups
  - request_id: Request identifier for tracing
  - session_id: Session identifier
  - Any other custom metadata for observability or middleware
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy
**kwargs – Additional provider-specific parameters
input_price (float | None)
output_price (float | None)
- Returns:
LLMResponse with generated text and metadata
- Return type:
LLMResponse
- async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, input_price=None, output_price=None, **kwargs)[source]
Stream text using xAI API via OpenAI library (supports all Grok models).
xAI API is OpenAI-compatible, so it supports Function Calling.
- Parameters:
messages (List[LLMMessage]) – List of conversation messages
model (str | None) – Model name (optional, uses default if not provided)
temperature (float) – Sampling temperature (0.0 to 1.0)
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
  - user_id: User identifier for tracking/billing
  - tenant_id: Tenant identifier for multi-tenant setups
  - request_id: Request identifier for tracing
  - session_id: Session identifier
  - Any other custom metadata for observability or middleware
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy
return_chunks (bool) – If True, returns StreamChunk objects with tool_calls info; if False, returns str tokens only
**kwargs – Additional provider-specific parameters
input_price (float | None)
output_price (float | None)
- Yields:
str or StreamChunk – Text tokens as they are generated, or StreamChunk objects if return_chunks=True
- Return type:
AsyncGenerator[Any, None]
Configuration
Model Config
Pydantic models for LLM configuration management.
This module defines the configuration schema for all LLM providers and models, enabling centralized, type-safe configuration management.
- class aiecs.llm.config.model_config.ModelCostConfig[source]
Bases: BaseModel
Token cost configuration for a model
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class aiecs.llm.config.model_config.ModelCapabilities[source]
Bases: BaseModel
Capabilities and limits for a model
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class aiecs.llm.config.model_config.ModelDefaultParams[source]
Bases: BaseModel
Default parameters for model inference
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class aiecs.llm.config.model_config.ModelConfig[source]
Bases: BaseModel
Complete configuration for a single model
- costs: ModelCostConfig
- capabilities: ModelCapabilities
- default_params: ModelDefaultParams
- __init__(**data)[source]
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class aiecs.llm.config.model_config.ProviderConfig[source]
Bases: BaseModel
Configuration for a single LLM provider
- models: List[ModelConfig]
- classmethod validate_models_not_empty(v)[source]
Ensure at least one model is configured
- Parameters:
v (List[ModelConfig])
- Return type:
- get_model_config(model_name)[source]
Get configuration for a specific model
- Parameters:
model_name (str)
- Return type:
ModelConfig | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class aiecs.llm.config.model_config.LLMModelsConfig[source]
Bases: BaseModel
Root configuration containing all providers
- providers: Dict[str, ProviderConfig]
- classmethod validate_providers_not_empty(v)[source]
Ensure at least one provider is configured
- Parameters:
v (Dict[str, ProviderConfig])
- Return type:
- get_provider_config(provider_name)[source]
Get configuration for a specific provider
- Parameters:
provider_name (str)
- Return type:
ProviderConfig | None
- get_model_config(provider_name, model_name)[source]
Get configuration for a specific model from a provider
- Parameters:
- Return type:
ModelConfig | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
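The provider/model hierarchy and its non-empty validators can be modelled in a few lines. To stay dependency-free, this sketch uses standard-library dataclasses in place of Pydantic models, so it is a simplified stand-in rather than the actual schema. The `name` field on ModelConfig is an assumption, and the cost, capability, and default-parameter fields are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ModelConfig:
    name: str  # assumed lookup key; other documented fields omitted


@dataclass
class ProviderConfig:
    models: List[ModelConfig]

    def __post_init__(self) -> None:
        # Mirrors validate_models_not_empty.
        if not self.models:
            raise ValueError("At least one model must be configured")

    def get_model_config(self, model_name: str) -> Optional[ModelConfig]:
        return next((m for m in self.models if m.name == model_name), None)


@dataclass
class LLMModelsConfig:
    providers: Dict[str, ProviderConfig]

    def __post_init__(self) -> None:
        # Mirrors validate_providers_not_empty.
        if not self.providers:
            raise ValueError("At least one provider must be configured")

    def get_provider_config(self, provider_name: str) -> Optional[ProviderConfig]:
        return self.providers.get(provider_name)

    def get_model_config(self, provider_name: str, model_name: str) -> Optional[ModelConfig]:
        provider = self.get_provider_config(provider_name)
        return provider.get_model_config(model_name) if provider else None
```

Both lookups return None rather than raising when a name is unknown, matching the `| None` return types documented above.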
Config Loader
Configuration loader for LLM models.
This module provides a singleton configuration loader that loads and manages LLM model configurations from YAML files with support for hot-reloading.
- class aiecs.llm.config.config_loader.LLMConfigLoader[source]
Bases: object
Singleton configuration loader for LLM models.
Supports:
- Loading configuration from YAML files
- Hot-reloading (manual refresh)
- Thread-safe access
- Caching for performance
- load_config(config_path=None)[source]
Load configuration from YAML file.
- Parameters:
config_path (Path | None) – Optional path to configuration file. If not provided, will search in standard locations.
- Returns:
Loaded and validated configuration
- Return type:
- Raises:
FileNotFoundError – If config file doesn’t exist
ValueError – If config file is invalid
- reload_config()[source]
Reload configuration from the current config file.
This supports the hybrid loading mode - configuration is loaded at startup but can be manually refreshed without restarting the application.
- Returns:
Reloaded configuration
- Return type:
- get_config()[source]
Get the current configuration.
Loads configuration on first access if not already loaded.
- Returns:
Current configuration
- Return type:
- get_provider_config(provider_name)[source]
Get configuration for a specific provider.
- Parameters:
provider_name (str) – Name of the provider (e.g., “Vertex”, “OpenAI”)
- Returns:
ProviderConfig if found, None otherwise
- Return type:
ProviderConfig | None
- get_model_config(provider_name, model_name)[source]
Get configuration for a specific model.
- Parameters:
- Returns:
ModelConfig if found, None otherwise
- Return type:
ModelConfig | None
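The combination of singleton access, caching, manual reload, and thread safety described for the loader can be sketched as follows. This is not the LLMConfigLoader implementation. The class name `ConfigLoader` is hypothetical, and JSON is swapped in for YAML because the standard library has no YAML parser. The real `get_config()` is documented to load lazily from standard search locations, which this sketch does not attempt.

```python
import json
import threading
from pathlib import Path
from typing import Any, Dict, Optional, Union


class ConfigLoader:
    """Hypothetical singleton loader with a cached, reloadable config."""

    _instance: Optional["ConfigLoader"] = None
    _instance_lock = threading.Lock()

    def __new__(cls) -> "ConfigLoader":
        with cls._instance_lock:  # only one thread may create the instance
            if cls._instance is None:
                inst = super().__new__(cls)
                inst._lock = threading.RLock()
                inst._config = None
                inst._path = None
                cls._instance = inst
        return cls._instance

    def load_config(self, config_path: Union[str, Path]) -> Dict[str, Any]:
        path = Path(config_path)
        if not path.exists():
            raise FileNotFoundError(path)
        try:
            data = json.loads(path.read_text())  # JSON stands in for YAML
        except json.JSONDecodeError as exc:
            raise ValueError(f"Invalid config file: {path}") from exc
        with self._lock:
            self._config, self._path = data, path  # swap cache atomically
        return data

    def reload_config(self) -> Dict[str, Any]:
        with self._lock:
            path = self._path
        if path is None:
            raise ValueError("No configuration has been loaded yet")
        return self.load_config(path)

    def get_config(self) -> Dict[str, Any]:
        with self._lock:
            if self._config is None:
                raise ValueError("No configuration has been loaded yet")
            return self._config
```

Parsing happens outside the lock and only the cache swap is guarded, so readers are never blocked on file I/O.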
- aiecs.llm.config.config_loader.get_llm_config_loader()[source]
Get the global LLM configuration loader instance.
- Returns:
Global singleton instance
- Return type:
- aiecs.llm.config.config_loader.get_llm_config()[source]
Get the current LLM configuration.
Convenience function that returns the configuration from the global loader.
- Returns:
Current configuration
- Return type: