LLM API

This section documents the LLM integration components.

Client Factory

class aiecs.llm.client_factory.AIProvider[source]

Bases: str, Enum

OPENAI = 'OpenAI'

VERTEX = 'Vertex'

GOOGLEAI = 'GoogleAI'

XAI = 'xAI'

OPENROUTER = 'OpenRouter'

ANTHROPIC_VERTEX = 'AnthropicVertex'

VERTEX_MAAS = 'VertexMaaS'

__new__(value)
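Because AIProvider derives from both str and Enum, its members can be looked up by value and compare equal to plain strings. The sketch below re-declares the enum locally so that it runs without aiecs installed; that local copy is an assumption made for illustration, and real code would import AIProvider from aiecs.llm.client_factory instead.

```python
from enum import Enum


class AIProvider(str, Enum):
    """Local mirror of the documented enum, for illustration only."""

    OPENAI = "OpenAI"
    VERTEX = "Vertex"
    GOOGLEAI = "GoogleAI"
    XAI = "xAI"
    OPENROUTER = "OpenRouter"
    ANTHROPIC_VERTEX = "AnthropicVertex"
    VERTEX_MAAS = "VertexMaaS"


# Lookup by value returns the matching member.
provider = AIProvider("xAI")
is_same_member = provider is AIProvider.XAI

# The str mixin makes members compare equal to their plain-string values.
equals_string = AIProvider.OPENAI == "OpenAI"

# Values that are not defined on the enum raise ValueError.
try:
    AIProvider("NotAProvider")
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```

Note that the values are case-sensitive: "xAI" resolves to a member, while "XAI" would not.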

class aiecs.llm.client_factory.LLMClientFactory[source]

Bases: object

Factory for creating and managing LLM provider clients

classmethod register_custom_provider(name, client)[source]

Register a custom LLM client provider.

This allows registration of custom LLM clients that implement the LLMClientProtocol without inheriting from BaseLLMClient. Custom providers can be retrieved by name using get_client().

Parameters:

name (str) – Custom provider name (e.g., “my-llm”, “llama-local”, “custom-gpt”)
client (LLMClientProtocol) – Client implementing LLMClientProtocol

Raises:

ValueError – If client doesn’t implement LLMClientProtocol
ValueError – If name conflicts with standard AIProvider enum values

Return type:

None

Example

```python
# Register a custom LLM client
custom_client = MyCustomLLMClient()
LLMClientFactory.register_custom_provider("my-llm", custom_client)

# Use the custom client
client = LLMClientFactory.get_client("my-llm")
response = await client.generate_text(messages)
```

classmethod get_client(provider)[source]

Get or create a client for the specified provider.

Supports both standard AIProvider enum values and custom provider names registered via register_custom_provider().

Parameters: provider (str | AIProvider) – AIProvider enum member or custom provider name string
Returns: LLM client (BaseLLMClient for standard providers, LLMClientProtocol for custom providers)
Raises: ValueError – If the provider is unknown (neither standard nor registered)
Return type: BaseLLMClient | LLMClientProtocol
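The error behavior documented for register_custom_provider() and get_client() can be pictured with a toy registry. This is a sketch under stated assumptions, not the aiecs implementation: ToyClientFactory, EchoClient, and the single-method protocol are hypothetical stand-ins, the real LLMClientProtocol may require more members, and standard providers are left out entirely.

```python
from enum import Enum
from typing import Any, Dict, Protocol, runtime_checkable


class AIProvider(str, Enum):
    """Local subset of the documented enum, for illustration only."""

    OPENAI = "OpenAI"
    XAI = "xAI"


@runtime_checkable
class LLMClientProtocol(Protocol):
    """Hypothetical minimal protocol; the real one may require more members."""

    async def generate_text(self, messages: Any, **kwargs: Any) -> Any: ...


class ToyClientFactory:
    """Toy registry for custom providers only; not the aiecs implementation."""

    _custom_clients: Dict[str, LLMClientProtocol] = {}

    @classmethod
    def register_custom_provider(cls, name: str, client: LLMClientProtocol) -> None:
        # A runtime_checkable isinstance() check only verifies that the methods exist.
        if not isinstance(client, LLMClientProtocol):
            raise ValueError("client does not implement LLMClientProtocol")
        if name in {member.value for member in AIProvider}:
            raise ValueError(f"{name!r} conflicts with a standard provider name")
        cls._custom_clients[name] = client

    @classmethod
    def get_client(cls, provider: str) -> LLMClientProtocol:
        try:
            return cls._custom_clients[provider]
        except KeyError:
            raise ValueError(f"unknown provider: {provider!r}") from None


class EchoClient:
    """Hypothetical custom client that satisfies the toy protocol."""

    async def generate_text(self, messages: Any, **kwargs: Any) -> Any:
        return messages


def raises_value_error(func, *args) -> bool:
    """Return True if calling func(*args) raises ValueError."""
    try:
        func(*args)
    except ValueError:
        return True
    return False


echo = EchoClient()
ToyClientFactory.register_custom_provider("my-llm", echo)
found = ToyClientFactory.get_client("my-llm")

conflict_rejected = raises_value_error(
    ToyClientFactory.register_custom_provider, "OpenAI", echo
)
bad_client_rejected = raises_value_error(
    ToyClientFactory.register_custom_provider, "other", object()
)
unknown_rejected = raises_value_error(ToyClientFactory.get_client, "missing")
```

All three failure modes surface as ValueError, so callers that want to tell them apart would have to inspect the message.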

async classmethod close_all()[source]

Close all active clients (both standard and custom)

async classmethod close_client(provider)[source]

Close a specific client (standard or custom)

Parameters: provider (str | AIProvider)

classmethod reload_config()[source]

Reload LLM models configuration.

This reloads the configuration from the YAML file, allowing for hot-reloading of model settings without restarting the application.

class aiecs.llm.client_factory.LLMClientManager[source]

Bases: object

High-level manager for LLM operations with context-aware provider selection

__init__()[source]

async generate_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]

Generate text using context-aware provider selection

Parameters:

messages (str | list[LLMMessage]) – Either a string prompt or list of LLMMessage objects
provider (str | AIProvider | None) – AI provider to use (can be overridden by context)
model (str | None) – Specific model to use (can be overridden by context)
context (Dict[str, Any] | None) – TaskContext or dict containing aiPreference
temperature (float) – Sampling temperature (0.0 to 2.0)
max_tokens (int | None) – Maximum tokens to generate
callbacks (List[CustomAsyncCallbackHandler] | None) – List of callback handlers to execute during LLM calls
**kwargs – Additional provider-specific parameters

Returns:

LLMResponse object with generated text and metadata

Return type:

LLMResponse

async stream_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]

Stream text generation using context-aware provider selection

Parameters:

messages (str | list[LLMMessage]) – Either a string prompt or list of LLMMessage objects
provider (str | AIProvider | None) – AI provider to use (can be overridden by context)
model (str | None) – Specific model to use (can be overridden by context)
context (Dict[str, Any] | None) – TaskContext or dict containing aiPreference
temperature (float) – Sampling temperature (0.0 to 2.0)
max_tokens (int | None) – Maximum tokens to generate
callbacks (List[CustomAsyncCallbackHandler] | None) – List of callback handlers to execute during LLM calls
**kwargs – Additional provider-specific parameters

Yields:

str – Incremental text chunks
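Since stream_text() yields incremental str chunks, callers usually consume it with async for and join the pieces. The sketch below shows that pattern against a hypothetical stand-in generator (fake_stream_text) so that it runs without aiecs or network access; with the real manager, the stand-in would be replaced by the stream_text() call.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream_text(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for stream_text(); yields str chunks."""
    for chunk in ("Hello", ", ", "world"):
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield chunk


async def collect(stream: AsyncIterator[str]) -> str:
    """Join incremental text chunks into the full response."""
    parts: List[str] = []
    async for chunk in stream:
        parts.append(chunk)
    return "".join(parts)


full_text = asyncio.run(collect(fake_stream_text("Say hello")))
```

Appending to a list and joining once avoids repeated string concatenation on long responses.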

async close()[source]: Close all clients

async aiecs.llm.client_factory.get_llm_manager()[source]

Get the global LLM manager instance

Return type: LLMClientManager

async aiecs.llm.client_factory.generate_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]

Generate text using the global LLM manager

Parameters:

messages (str | list[LLMMessage])
provider (str | AIProvider | None)
model (str | None)
context (Dict[str, Any] | None)
temperature (float)
max_tokens (int | None)
callbacks (List[CustomAsyncCallbackHandler] | None)

Return type:

LLMResponse

async aiecs.llm.client_factory.stream_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]

Stream text using the global LLM manager

Parameters:

messages (str | list[LLMMessage])
provider (str | AIProvider | None)
model (str | None)
context (Dict[str, Any] | None)
temperature (float)
max_tokens (int | None)
callbacks (List[CustomAsyncCallbackHandler] | None)

LLM Clients

OpenAI Client

class aiecs.llm.clients.openai_client.OpenAIClient[source]

Bases: BaseLLMClient, OpenAICompatibleFunctionCallingMixin

OpenAI provider client

__init__()[source]

Return type: None

async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, input_price=None, output_price=None, **kwargs)[source]

Generate text using OpenAI API with optional function calling support.

Parameters:

messages (List[LLMMessage]) – List of LLM messages
model (str | None) – Model name (optional)
temperature (float) – Temperature for generation
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
- user_id: User identifier for tracking/billing
- tenant_id: Tenant identifier for multi-tenant setups
- request_id: Request identifier for tracing
- session_id: Session identifier
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy (“auto”, “none”, or specific tool)
**kwargs – Additional arguments passed to OpenAI API
input_price (float | None)
output_price (float | None)

Returns:

LLMResponse with content and optional function_call information

Return type:

LLMResponse
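The functions and tools parameters differ only in how each schema is wrapped. In the OpenAI Chat Completions API, a tool entry is the legacy function schema placed under a "function" key next to "type": "function". The sketch below builds one of each; get_weather and its parameters are hypothetical, and it is an assumption (not stated in this reference) that the client forwards these dictionaries to the API unchanged.

```python
from typing import Any, Dict, List

# Legacy-style function schema: name, description, JSON Schema parameters.
get_weather_function: Dict[str, Any] = {
    "name": "get_weather",  # hypothetical function name
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def functions_to_tools(functions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Wrap legacy function schemas in the OpenAI tools envelope."""
    return [{"type": "function", "function": fn} for fn in functions]


# Equivalent schema in the newer tools format.
tools = functions_to_tools([get_weather_function])
```

Either list could then be passed as functions=[get_weather_function] or tools=tools; the tools form is the one this reference marks as recommended.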

async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, input_price=None, output_price=None, **kwargs)[source]

Stream text using OpenAI API with optional function calling support.

Parameters:

messages (List[LLMMessage]) – List of LLM messages
model (str | None) – Model name (optional)
temperature (float) – Temperature for generation
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
- user_id: User identifier for tracking/billing
- tenant_id: Tenant identifier for multi-tenant setups
- request_id: Request identifier for tracing
- session_id: Session identifier
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy (“auto”, “none”, or specific tool)
return_chunks (bool) – If True, returns StreamChunk objects with tool_calls info; if False, returns str tokens only
**kwargs – Additional arguments passed to OpenAI API
input_price (float | None)
output_price (float | None)

Yields:

str or StreamChunk – Text tokens as they are generated, or StreamChunk objects if return_chunks=True

Return type:

AsyncGenerator[Any, None]

async close()[source]: Clean up resources

Vertex AI Client

class aiecs.llm.clients.vertex_client.VertexAIClient[source]

Bases: BaseLLMClient, GoogleFunctionCallingMixin

Vertex AI provider client

__init__()[source]

async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, system_instruction=None, input_price=None, output_price=None, **kwargs)[source]

Generate text using Vertex AI.

Parameters:

messages (List[LLMMessage]) – List of conversation messages
model (str | None) – Model name (optional, uses default if not provided)
temperature (float) – Sampling temperature (0.0 to 1.0)
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
- user_id: User identifier for tracking/billing
- tenant_id: Tenant identifier for multi-tenant setups
- request_id: Request identifier for tracing
- session_id: Session identifier
- Any other custom metadata for observability or middleware
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy
system_instruction (str | None) – System instruction for the model
**kwargs – Additional provider-specific parameters
input_price (float | None)
output_price (float | None)

Returns:

LLMResponse with generated text and metadata

Return type:

LLMResponse

async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, system_instruction=None, input_price=None, output_price=None, **kwargs)[source]

Stream text using Vertex AI real streaming API with Function Calling support.

Parameters:

messages (List[LLMMessage]) – List of LLM messages
model (str | None) – Model name (optional)
temperature (float) – Temperature for generation
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
- user_id: User identifier for tracking/billing
- tenant_id: Tenant identifier for multi-tenant setups
- request_id: Request identifier for tracing
- session_id: Session identifier
- Any other custom metadata for observability or middleware
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format)
tool_choice (Any | None) – Tool choice strategy (not used for Google Vertex AI)
return_chunks (bool) – If True, returns GoogleStreamChunk objects; if False, returns str tokens only
system_instruction (str | None) – System instruction for prompt caching support
**kwargs – Additional arguments
input_price (float | None)
output_price (float | None)

Yields:

str or GoogleStreamChunk – Text tokens as they are generated, or GoogleStreamChunk objects if return_chunks=True

Return type:

AsyncGenerator[Any, None]

get_part_count_stats()[source]

Get statistics about part count variations in Vertex AI responses.

Returns: Dictionary containing part count statistics and analysis
Return type: Dict[str, Any]

log_part_count_summary()[source]

Log a summary of part count statistics

async get_embeddings(texts, model=None)[source]

Generate embeddings using Vertex AI embedding model.

Passes all texts in a single batched call to self._client.aio.models.embed_content, which maps to the Vertex AI {model}:predict endpoint internally.

Parameters:

texts (List[str]) – List of texts to embed
model (str | None) – Embedding model name (default: gemini-embedding-001)

Returns:

List of embedding vectors (each is a list of floats)

Return type:

List[List[float]]
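The returned List[List[float]] holds one vector per input text, in plain Python lists, so it can be compared without extra dependencies. As a minimal sketch of one common follow-up step, the code below computes cosine similarity between vectors; the two-dimensional vectors are toy values chosen for illustration, not output from any embedding model.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-in for the List[List[float]] returned by get_embeddings().
embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]

same_direction = cosine_similarity(embeddings[0], embeddings[2])
orthogonal = cosine_similarity(embeddings[0], embeddings[1])
```

Vectors pointing the same way score 1.0 regardless of length, while orthogonal vectors score 0.0.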

async close()[source]: Clean up resources

xAI Client

class aiecs.llm.clients.xai_client.XAIClient[source]

Bases: BaseLLMClient, OpenAICompatibleFunctionCallingMixin

xAI (Grok) provider client

__init__()[source]

Return type: None

async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, input_price=None, output_price=None, **kwargs)[source]

Generate text using xAI API via OpenAI library (supports all Grok models).

xAI API is OpenAI-compatible, so it supports Function Calling.

Parameters:

messages (List[LLMMessage]) – List of conversation messages
model (str | None) – Model name (optional, uses default if not provided)
temperature (float) – Sampling temperature (0.0 to 1.0)
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
- user_id: User identifier for tracking/billing
- tenant_id: Tenant identifier for multi-tenant setups
- request_id: Request identifier for tracing
- session_id: Session identifier
- Any other custom metadata for observability or middleware
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy
**kwargs – Additional provider-specific parameters
input_price (float | None)
output_price (float | None)

Returns:

LLMResponse with generated text and metadata

Return type:

LLMResponse

async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, input_price=None, output_price=None, **kwargs)[source]

Stream text using xAI API via OpenAI library (supports all Grok models).

xAI API is OpenAI-compatible, so it supports Function Calling.

Parameters:

messages (List[LLMMessage]) – List of conversation messages
model (str | None) – Model name (optional, uses default if not provided)
temperature (float) – Sampling temperature (0.0 to 1.0)
max_tokens (int | None) – Maximum tokens to generate
context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
- user_id: User identifier for tracking/billing
- tenant_id: Tenant identifier for multi-tenant setups
- request_id: Request identifier for tracing
- session_id: Session identifier
- Any other custom metadata for observability or middleware
functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)
tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)
tool_choice (Any | None) – Tool choice strategy
return_chunks (bool) – If True, returns StreamChunk objects with tool_calls info; if False, returns str tokens only
**kwargs – Additional provider-specific parameters
input_price (float | None)
output_price (float | None)

Yields:

str or StreamChunk – Text tokens or StreamChunk objects

Return type:

AsyncGenerator[Any, None]

async close()[source]: Clean up resources

Configuration

Model Config

Pydantic models for LLM configuration management.

This module defines the configuration schema for all LLM providers and models, enabling centralized, type-safe configuration management.

class aiecs.llm.config.model_config.ModelCostConfig[source]

Bases: BaseModel

Token cost configuration for a model

input: float

output: float

classmethod validate_positive(v)[source]

Ensure costs are non-negative

Parameters: v (float)
Return type: float

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class aiecs.llm.config.model_config.ModelCapabilities[source]

Bases: BaseModel

Capabilities and limits for a model

streaming: bool

vision: bool

function_calling: bool

max_tokens: int

context_window: int

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class aiecs.llm.config.model_config.ModelDefaultParams[source]

Bases: BaseModel

Default parameters for model inference

temperature: float

max_tokens: int

top_p: float

top_k: int

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class aiecs.llm.config.model_config.ModelConfig[source]

Bases: BaseModel

Complete configuration for a single model

name: str

costs: ModelCostConfig

capabilities: ModelCapabilities

default_params: ModelDefaultParams

description: str | None

__init__(**data)[source]

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

display_name: str | None

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class aiecs.llm.config.model_config.ProviderConfig[source]

Bases: BaseModel

Configuration for a single LLM provider

provider_name: str

default_model: str

models: List[ModelConfig]

model_mappings: Dict[str, str] | None

classmethod validate_models_not_empty(v)[source]

Ensure at least one model is configured

Parameters: v (List[ModelConfig])
Return type: List[ModelConfig]

get_model_config(model_name)[source]

Get configuration for a specific model

Parameters: model_name (str)
Return type: ModelConfig | None

get_model_names()[source]

Get list of all model names

Return type: List[str]

get_all_model_names_with_aliases()[source]

Get list of all model names including aliases

Return type: List[str]

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
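The model_mappings field and the alias-aware lookups suggest that a model can be requested either by its configured name or by an alias. The sketch below shows one way such resolution could work with plain dictionaries. It assumes that model_mappings maps alias to model name, which this reference does not state, and the helper functions and model names are hypothetical.

```python
from typing import Dict, List, Optional


def resolve_model_name(
    requested: str,
    model_names: List[str],
    model_mappings: Optional[Dict[str, str]],
) -> Optional[str]:
    """Return the configured model name for a name or alias, else None."""
    if requested in model_names:
        return requested
    # Assumed direction: alias -> configured model name.
    target = (model_mappings or {}).get(requested)
    return target if target in model_names else None


def all_names_with_aliases(
    model_names: List[str],
    model_mappings: Optional[Dict[str, str]],
) -> List[str]:
    """Configured names followed by any aliases."""
    return list(model_names) + list((model_mappings or {}).keys())


# Hypothetical provider data.
names = ["model-large", "model-small"]
mappings = {"fast": "model-small", "broken": "model-removed"}

direct = resolve_model_name("model-large", names, mappings)
via_alias = resolve_model_name("fast", names, mappings)
dangling = resolve_model_name("broken", names, mappings)
everything = all_names_with_aliases(names, mappings)
```

Returning None for an alias that points at an unconfigured model mirrors the ModelConfig | None return type of get_model_config().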

class aiecs.llm.config.model_config.LLMModelsConfig[source]

Bases: BaseModel

Root configuration containing all providers

providers: Dict[str, ProviderConfig]

classmethod validate_providers_not_empty(v)[source]

Ensure at least one provider is configured

Parameters: v (Dict[str, ProviderConfig])
Return type: Dict[str, ProviderConfig]

get_provider_config(provider_name)[source]

Get configuration for a specific provider

Parameters: provider_name (str)
Return type: ProviderConfig | None

get_model_config(provider_name, model_name)[source]

Get configuration for a specific model from a provider

Parameters:

provider_name (str)
model_name (str)

Return type:

ModelConfig | None

get_provider_names()[source]

Get list of all provider names

Return type: List[str]

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
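Taken together, the classes above describe a nested structure: providers contain models, and each model carries costs, capabilities, and default parameters. The sketch below writes that shape as a plain Python dict and re-implements the three documented validation rules with the standard library only. All values are made up, the cost units are not specified in this reference, and whether any field is optional or whether the provider key must equal provider_name is an assumption.

```python
from typing import Any, Dict, List

# Hypothetical configuration mirroring the documented field names.
config: Dict[str, Any] = {
    "providers": {
        "OpenAI": {
            "provider_name": "OpenAI",
            "default_model": "example-model",
            "models": [
                {
                    "name": "example-model",
                    "costs": {"input": 0.001, "output": 0.002},
                    "capabilities": {
                        "streaming": True,
                        "vision": False,
                        "function_calling": True,
                        "max_tokens": 4096,
                        "context_window": 16384,
                    },
                    "default_params": {
                        "temperature": 0.7,
                        "max_tokens": 1024,
                        "top_p": 1.0,
                        "top_k": 40,
                    },
                }
            ],
            "model_mappings": {"default": "example-model"},
        }
    }
}


def check_config(cfg: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []
    providers = cfg.get("providers") or {}
    if not providers:
        problems.append("no providers configured")
    for key, provider in providers.items():
        models = provider.get("models") or []
        if not models:
            problems.append(f"{key}: no models configured")
        for model in models:
            for field, cost in model.get("costs", {}).items():
                if cost < 0:
                    problems.append(f"{key}/{model.get('name')}: negative {field} cost")
    return problems


problems = check_config(config)
```

In the library itself these rules are enforced by the Pydantic validators (validate_positive, validate_models_not_empty, validate_providers_not_empty) when the configuration is parsed.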

Config Loader

Configuration loader for LLM models.

This module provides a singleton configuration loader that loads and manages LLM model configurations from YAML files with support for hot-reloading.

class aiecs.llm.config.config_loader.LLMConfigLoader[source]

Bases: object

Singleton configuration loader for LLM models.

Supports:
- Loading configuration from YAML files
- Hot-reloading (manual refresh)
- Thread-safe access
- Caching for performance

static __new__(cls)[source]

Ensure singleton instance

__init__()[source]

Initialize the configuration loader

Return type: None
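A singleton enforced in __new__ needs some care, because Python still calls __init__ on every construction. The sketch below shows one common thread-safe form of the pattern, using double-checked locking and an initialization guard. SingletonLoader is a hypothetical class written for illustration; the actual LLMConfigLoader may be implemented differently.

```python
import threading
from typing import Any, Dict, Optional


class SingletonLoader:
    """Hypothetical loader illustrating a thread-safe singleton."""

    _instance: Optional["SingletonLoader"] = None
    _lock = threading.Lock()

    def __new__(cls) -> "SingletonLoader":
        # Double-checked locking: take the lock only while no instance exists.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance._initialized = False
                    cls._instance = instance
        return cls._instance

    def __init__(self) -> None:
        # __init__ runs on every SingletonLoader() call, so guard against
        # wiping state that an earlier call already set up.
        if self._initialized:
            return
        self._config: Optional[Dict[str, Any]] = None
        self._initialized = True


first = SingletonLoader()
first._config = {"loaded": True}

# A second construction returns the same object with its state intact.
second = SingletonLoader()
```

Without the _initialized guard, the second SingletonLoader() call would reset _config to None.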

load_config(config_path=None)[source]

Load configuration from YAML file.

Parameters:

config_path (Path | None) – Optional path to configuration file. If not provided, will search in standard locations.

Returns:

Loaded and validated configuration

Return type:

LLMModelsConfig

Raises:

FileNotFoundError – If config file doesn’t exist
ValueError – If config file is invalid

reload_config()[source]

Reload configuration from the current config file.

This supports the hybrid loading mode - configuration is loaded at startup but can be manually refreshed without restarting the application.

Returns: Reloaded configuration
Return type: LLMModelsConfig

get_config()[source]

Get the current configuration.

Loads configuration on first access if not already loaded.

Returns: Current configuration
Return type: LLMModelsConfig
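The interplay between get_config() (load on first access, then serve from cache) and reload_config() (re-read on demand) can be sketched as follows. CachedConfig and read_source are hypothetical stand-ins, and an in-memory dict replaces the YAML file so the example runs on its own; this is not the library's code.

```python
import threading
from typing import Any, Callable, Dict, List, Optional


class CachedConfig:
    """Hypothetical cache with lazy loading and manual refresh."""

    def __init__(self, read: Callable[[], Dict[str, Any]]) -> None:
        self._read = read
        self._lock = threading.RLock()
        self._config: Optional[Dict[str, Any]] = None

    def get_config(self) -> Dict[str, Any]:
        with self._lock:
            if self._config is None:  # first access triggers the load
                self._config = self._read()
            return self._config

    def reload_config(self) -> Dict[str, Any]:
        with self._lock:  # always re-read, replacing the cached copy
            self._config = self._read()
            return self._config

    def is_loaded(self) -> bool:
        return self._config is not None


# In-memory stand-in for the YAML file, plus a log of read calls.
source = {"default_model": "model-a"}
reads: List[int] = []


def read_source() -> Dict[str, Any]:
    reads.append(1)
    return dict(source)


cache = CachedConfig(read_source)
loaded_before = cache.is_loaded()
initial = cache.get_config()["default_model"]

source["default_model"] = "model-b"  # simulate an edit to the file
stale = cache.get_config()["default_model"]  # still served from cache
fresh = cache.reload_config()["default_model"]  # picks up the edit
```

Edits to the source stay invisible until reload_config() is called, which is the manual-refresh behavior described above.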

get_provider_config(provider_name)[source]

Get configuration for a specific provider.

Parameters: provider_name (str) – Name of the provider (e.g., “Vertex”, “OpenAI”)
Returns: ProviderConfig if found, None otherwise
Return type: ProviderConfig | None

get_model_config(provider_name, model_name)[source]

Get configuration for a specific model.

Parameters:

provider_name (str) – Name of the provider
model_name (str) – Name of the model (or alias)

Returns:

ModelConfig if found, None otherwise

Return type:

ModelConfig | None

get_default_model(provider_name)[source]

Get the default model name for a provider.

Parameters: provider_name (str) – Name of the provider
Returns: Default model name if found, None otherwise
Return type: str | None

is_loaded()[source]

Check if configuration has been loaded

Return type: bool

get_config_path()[source]

Get the path to the current configuration file

Return type: Path | None

aiecs.llm.config.config_loader.get_llm_config_loader()[source]

Get the global LLM configuration loader instance.

Returns: Global singleton instance
Return type: LLMConfigLoader

aiecs.llm.config.config_loader.get_llm_config()[source]

Get the current LLM configuration.

Convenience function that returns the configuration from the global loader.

Returns: Current configuration
Return type: LLMModelsConfig

aiecs.llm.config.config_loader.reload_llm_config()[source]

Reload the LLM configuration.

Convenience function that reloads the configuration in the global loader.

Returns: Reloaded configuration
Return type: LLMModelsConfig