LLM API

This section documents the LLM integration components.

Client Factory

class aiecs.llm.client_factory.AIProvider[source]

Bases: str, Enum

OPENAI = 'OpenAI'
VERTEX = 'Vertex'
GOOGLEAI = 'GoogleAI'
XAI = 'xAI'
OPENROUTER = 'OpenRouter'
ANTHROPIC_VERTEX = 'AnthropicVertex'
VERTEX_MAAS = 'VertexMaaS'
__new__(value)
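
Because AIProvider derives from both str and Enum, its members can be looked up by value and compared directly with plain strings. The sketch below redeclares the enum locally, using only the values listed above, so that it runs without aiecs installed; real code would import it from aiecs.llm.client_factory instead.

```python
from enum import Enum


class AIProvider(str, Enum):
    """Local copy of the documented enum, for illustration only."""

    OPENAI = "OpenAI"
    VERTEX = "Vertex"
    GOOGLEAI = "GoogleAI"
    XAI = "xAI"
    OPENROUTER = "OpenRouter"
    ANTHROPIC_VERTEX = "AnthropicVertex"
    VERTEX_MAAS = "VertexMaaS"


# Lookup by value returns the member itself.
provider = AIProvider("xAI")

# The str mixin makes members compare equal to plain strings.
is_same = provider == "xAI"

# Values are case-sensitive: "openai" is not a valid value.
try:
    AIProvider("openai")
    lookup_failed = False
except ValueError:
    lookup_failed = True
```

Note that the values are mixed-case ("OpenAI", "xAI"), so a lowercased provider string will not match.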
class aiecs.llm.client_factory.LLMClientFactory[source]

Bases: object

Factory for creating and managing LLM provider clients

classmethod register_custom_provider(name, client)[source]

Register a custom LLM client provider.

This allows registration of custom LLM clients that implement the LLMClientProtocol without inheriting from BaseLLMClient. Custom providers can be retrieved by name using get_client().

Parameters:
  • name (str) – Custom provider name (e.g., “my-llm”, “llama-local”, “custom-gpt”)

  • client (LLMClientProtocol) – Client implementing LLMClientProtocol

Raises:
  • ValueError – If client doesn’t implement LLMClientProtocol

  • ValueError – If name conflicts with standard AIProvider enum values

Return type:

None

Example

```python
# Register a custom LLM client
custom_client = MyCustomLLMClient()
LLMClientFactory.register_custom_provider("my-llm", custom_client)

# Use the custom client (inside an async function)
client = LLMClientFactory.get_client("my-llm")
response = await client.generate_text(messages)
```

classmethod get_client(provider)[source]

Get or create a client for the specified provider.

Supports both standard AIProvider enum values and custom provider names registered via register_custom_provider().

Parameters:

provider (str | AIProvider) – AIProvider enum or custom provider name string

Returns:

LLM client (BaseLLMClient for standard providers, LLMClientProtocol for custom)

Raises:

ValueError – If provider is unknown (not standard and not registered)

Return type:

BaseLLMClient | LLMClientProtocol
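
The registration and lookup rules described above (structural protocol check, name-conflict check, unknown-provider error) can be sketched with a small stand-in factory. This is not the aiecs implementation: MiniFactory, Provider, ClientProtocol, and EchoClient are hypothetical names, and the method set required by the real LLMClientProtocol is not listed in this section, so a single generate_text method is assumed.

```python
import asyncio
from enum import Enum
from typing import Dict, Protocol, Union, runtime_checkable


class Provider(str, Enum):
    """Shortened stand-in for AIProvider."""

    OPENAI = "OpenAI"
    XAI = "xAI"


@runtime_checkable
class ClientProtocol(Protocol):
    """Assumed protocol shape; the real LLMClientProtocol may require more."""

    async def generate_text(self, messages): ...


class EchoClient:
    """Toy client that satisfies ClientProtocol structurally (no inheritance)."""

    async def generate_text(self, messages):
        return f"echo: {messages}"


class MiniFactory:
    _standard: Dict[Provider, ClientProtocol] = {}
    _custom: Dict[str, ClientProtocol] = {}

    @classmethod
    def register_custom_provider(cls, name: str, client: ClientProtocol) -> None:
        if not isinstance(client, ClientProtocol):
            raise ValueError("client does not implement ClientProtocol")
        if name in {p.value for p in Provider}:
            raise ValueError(f"{name!r} conflicts with a standard provider")
        cls._custom[name] = client

    @classmethod
    def get_client(cls, provider: Union[str, Provider]) -> ClientProtocol:
        if isinstance(provider, Provider):
            # Get-or-create: one cached client per standard provider.
            if provider not in cls._standard:
                cls._standard[provider] = EchoClient()
            return cls._standard[provider]
        if provider in cls._custom:
            return cls._custom[provider]
        raise ValueError(f"unknown provider: {provider!r}")


MiniFactory.register_custom_provider("my-llm", EchoClient())
reply = asyncio.run(MiniFactory.get_client("my-llm").generate_text("hi"))
```

The isinstance check on a runtime_checkable protocol only verifies that the named methods exist, not their signatures, which is one plausible way to accept clients that do not inherit from a base class.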

async classmethod close_all()[source]

Close all active clients (both standard and custom)

async classmethod close_client(provider)[source]

Close a specific client (standard or custom)

Parameters:

provider (str | AIProvider)

classmethod reload_config()[source]

Reload LLM models configuration.

This reloads the configuration from the YAML file, allowing for hot-reloading of model settings without restarting the application.

class aiecs.llm.client_factory.LLMClientManager[source]

Bases: object

High-level manager for LLM operations with context-aware provider selection

__init__()[source]
async generate_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]

Generate text using context-aware provider selection

Parameters:
  • messages (str | list[LLMMessage]) – Either a string prompt or list of LLMMessage objects

  • provider (str | AIProvider | None) – AI provider to use (can be overridden by context)

  • model (str | None) – Specific model to use (can be overridden by context)

  • context (Dict[str, Any] | None) – TaskContext or dict containing aiPreference

  • temperature (float) – Sampling temperature (0.0 to 2.0)

  • max_tokens (int | None) – Maximum tokens to generate

  • callbacks (List[CustomAsyncCallbackHandler] | None) – List of callback handlers to execute during LLM calls

  • **kwargs – Additional provider-specific parameters

Returns:

LLMResponse object with generated text and metadata

Return type:

LLMResponse
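
The parameter notes above say that provider and model "can be overridden by context". One way such a precedence rule could work is sketched below. The key names inside aiPreference ("provider" and "model") are assumptions made for illustration, as is the helper name resolve_provider_and_model; this section does not document the actual structure.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_provider_and_model(
    provider: Optional[str],
    model: Optional[str],
    context: Optional[Dict[str, Any]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (provider, model), letting context preferences win when present."""
    preference = (context or {}).get("aiPreference") or {}
    # "provider" and "model" are assumed key names inside aiPreference.
    return (
        preference.get("provider", provider),
        preference.get("model", model),
    )


# No context: the explicit arguments are used unchanged.
explicit = resolve_provider_and_model("OpenAI", "model-a", None)

# Context with a preference: the preferred provider replaces the argument.
overridden = resolve_provider_and_model(
    "OpenAI", "model-a", {"aiPreference": {"provider": "xAI"}}
)
```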

async stream_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]

Stream text generation using context-aware provider selection

Parameters:
  • messages (str | list[LLMMessage]) – Either a string prompt or list of LLMMessage objects

  • provider (str | AIProvider | None) – AI provider to use (can be overridden by context)

  • model (str | None) – Specific model to use (can be overridden by context)

  • context (Dict[str, Any] | None) – TaskContext or dict containing aiPreference

  • temperature (float) – Sampling temperature (0.0 to 2.0)

  • max_tokens (int | None) – Maximum tokens to generate

  • callbacks (List[CustomAsyncCallbackHandler] | None) – List of callback handlers to execute during LLM calls

  • **kwargs – Additional provider-specific parameters

Yields:

str – Incremental text chunks
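
Since stream_text yields incremental str chunks, callers would typically consume it with async for. The sketch below shows that consumption pattern against a hypothetical stand-in generator (fake_stream_text), so it runs without credentials or aiecs installed.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream_text(prompt: str) -> AsyncIterator[str]:
    """Stand-in for stream_text(): yields incremental text chunks."""
    for word in prompt.split():
        await asyncio.sleep(0)  # give control back to the event loop
        yield word + " "


async def collect(prompt: str) -> str:
    chunks: List[str] = []
    async for chunk in fake_stream_text(prompt):
        chunks.append(chunk)  # a real app might print or forward each chunk
    return "".join(chunks)


full_text = asyncio.run(collect("streamed one chunk at a time"))
```

Collecting chunks in a list and joining once at the end avoids repeated string concatenation on long responses.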

async close()[source]

Close all clients

async aiecs.llm.client_factory.get_llm_manager()[source]

Get the global LLM manager instance

Return type:

LLMClientManager

async aiecs.llm.client_factory.generate_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]

Generate text using the global LLM manager

Parameters:

Same as LLMClientManager.generate_text().

Return type:

LLMResponse

async aiecs.llm.client_factory.stream_text(messages, provider=None, model=None, context=None, temperature=0.7, max_tokens=None, callbacks=None, **kwargs)[source]

Stream text using the global LLM manager

Parameters:

Same as LLMClientManager.stream_text().

LLM Clients

OpenAI Client

class aiecs.llm.clients.openai_client.OpenAIClient[source]

Bases: BaseLLMClient, OpenAICompatibleFunctionCallingMixin

OpenAI provider client

__init__()[source]
Return type:

None

async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, input_price=None, output_price=None, **kwargs)[source]

Generate text using OpenAI API with optional function calling support.

Parameters:
  • messages (List[LLMMessage]) – List of LLM messages

  • model (str | None) – Model name (optional)

  • temperature (float) – Temperature for generation

  • max_tokens (int | None) – Maximum tokens to generate

  • context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
      - user_id: User identifier for tracking/billing
      - tenant_id: Tenant identifier for multi-tenant setups
      - request_id: Request identifier for tracing
      - session_id: Session identifier

  • functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)

  • tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)

  • tool_choice (Any | None) – Tool choice strategy (“auto”, “none”, or specific tool)

  • **kwargs – Additional arguments passed to OpenAI API

  • input_price (float | None)

  • output_price (float | None)

Returns:

LLMResponse with content and optional function_call information

Return type:

LLMResponse
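
The functions and tools parameters differ only in their envelope. In the OpenAI Chat Completions API, a legacy function schema is a bare object with name, description, and parameters, while a tool wraps that same object under a "function" key alongside "type": "function". The sketch below converts one to the other; get_weather is a made-up schema, and whether OpenAIClient forwards these lists unchanged is an assumption.

```python
from typing import Any, Dict, List

# Legacy "functions" entry: a bare function schema.
legacy_functions: List[Dict[str, Any]] = [
    {
        "name": "get_weather",  # hypothetical function
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]


def functions_to_tools(functions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Wrap each legacy function schema in the newer tool envelope."""
    return [{"type": "function", "function": fn} for fn in functions]


tools = functions_to_tools(legacy_functions)
```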

async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, input_price=None, output_price=None, **kwargs)[source]

Stream text using OpenAI API with optional function calling support.

Parameters:
  • messages (List[LLMMessage]) – List of LLM messages

  • model (str | None) – Model name (optional)

  • temperature (float) – Temperature for generation

  • max_tokens (int | None) – Maximum tokens to generate

  • context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
      - user_id: User identifier for tracking/billing
      - tenant_id: Tenant identifier for multi-tenant setups
      - request_id: Request identifier for tracing
      - session_id: Session identifier

  • functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)

  • tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)

  • tool_choice (Any | None) – Tool choice strategy (“auto”, “none”, or specific tool)

  • return_chunks (bool) – If True, returns StreamChunk objects with tool_calls info; if False, returns str tokens only

  • **kwargs – Additional arguments passed to OpenAI API

  • input_price (float | None)

  • output_price (float | None)

Yields:

str or StreamChunk – Text tokens as they are generated, or StreamChunk objects if return_chunks=True

Return type:

AsyncGenerator[Any, None]
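
Because the stream yields either str tokens or chunk objects depending on return_chunks, a consumer that must handle both can branch on the item type. The sketch below uses a hypothetical FakeChunk dataclass in place of StreamChunk; its field names (content, tool_calls) are assumptions, since this section only states that chunks carry tool_calls info.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator, Dict, List, Optional, Tuple, Union


@dataclass
class FakeChunk:
    """Hypothetical stand-in for StreamChunk; field names are assumptions."""

    content: Optional[str] = None
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)


async def fake_stream(return_chunks: bool = False) -> AsyncIterator[Union[str, FakeChunk]]:
    """Yield plain tokens, or chunk objects when return_chunks is True."""
    for token in ["Hel", "lo"]:
        yield FakeChunk(content=token) if return_chunks else token
    if return_chunks:
        yield FakeChunk(tool_calls=[{"name": "get_weather"}])


async def consume(return_chunks: bool) -> Tuple[str, List[Dict[str, Any]]]:
    text: List[str] = []
    calls: List[Dict[str, Any]] = []
    async for item in fake_stream(return_chunks):
        if isinstance(item, str):
            text.append(item)
        else:
            if item.content:
                text.append(item.content)
            calls.extend(item.tool_calls)
    return "".join(text), calls


plain = asyncio.run(consume(False))
rich = asyncio.run(consume(True))
```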

async close()[source]

Clean up resources

Vertex AI Client

class aiecs.llm.clients.vertex_client.VertexAIClient[source]

Bases: BaseLLMClient, GoogleFunctionCallingMixin

Vertex AI provider client

__init__()[source]
async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, system_instruction=None, input_price=None, output_price=None, **kwargs)[source]

Generate text using Vertex AI.

Parameters:
  • messages (List[LLMMessage]) – List of conversation messages

  • model (str | None) – Model name (optional, uses default if not provided)

  • temperature (float) – Sampling temperature (0.0 to 1.0)

  • max_tokens (int | None) – Maximum tokens to generate

  • context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
      - user_id: User identifier for tracking/billing
      - tenant_id: Tenant identifier for multi-tenant setups
      - request_id: Request identifier for tracing
      - session_id: Session identifier
      - Any other custom metadata for observability or middleware

  • functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)

  • tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)

  • tool_choice (Any | None) – Tool choice strategy

  • system_instruction (str | None) – System instruction for the model

  • **kwargs – Additional provider-specific parameters

  • input_price (float | None)

  • output_price (float | None)

Returns:

LLMResponse with generated text and metadata

Return type:

LLMResponse

async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, system_instruction=None, input_price=None, output_price=None, **kwargs)[source]

Stream text using Vertex AI real streaming API with Function Calling support.

Parameters:
  • messages (List[LLMMessage]) – List of LLM messages

  • model (str | None) – Model name (optional)

  • temperature (float) – Temperature for generation

  • max_tokens (int | None) – Maximum tokens to generate

  • context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
      - user_id: User identifier for tracking/billing
      - tenant_id: Tenant identifier for multi-tenant setups
      - request_id: Request identifier for tracing
      - session_id: Session identifier
      - Any other custom metadata for observability or middleware

  • functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)

  • tools (List[Dict[str, Any]] | None) – List of tool schemas (new format)

  • tool_choice (Any | None) – Tool choice strategy (not used for Google Vertex AI)

  • return_chunks (bool) – If True, returns GoogleStreamChunk objects; if False, returns str tokens only

  • system_instruction (str | None) – System instruction for prompt caching support

  • **kwargs – Additional arguments

  • input_price (float | None)

  • output_price (float | None)

Yields:

str or GoogleStreamChunk – Text tokens as they are generated, or GoogleStreamChunk objects if return_chunks=True

Return type:

AsyncGenerator[Any, None]

get_part_count_stats()[source]

Get statistics about part count variations in Vertex AI responses.

Returns:

Dictionary containing part count statistics and analysis

Return type:

Dict[str, Any]

log_part_count_summary()[source]

Log a summary of part count statistics

async get_embeddings(texts, model=None)[source]

Generate embeddings using Vertex AI embedding model.

Passes all texts in a single batched call to self._client.aio.models.embed_content, which maps to the Vertex AI {model}:predict endpoint internally.

Parameters:
  • texts (List[str]) – List of texts to embed

  • model (str | None) – Embedding model name (default: gemini-embedding-001)

Returns:

List of embedding vectors (each is a list of floats)

Return type:

List[List[float]]
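
The return value is a plain list of float vectors, one per input text, so it can be fed directly into similarity calculations. As an illustration of working with that shape, the sketch below computes cosine similarity over hand-written vectors; no Vertex AI call is made, and the tiny two-dimensional vectors are placeholders for real embeddings.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


# Placeholder for the List[List[float]] that get_embeddings() returns.
embeddings: List[List[float]] = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]

same_direction = cosine_similarity(embeddings[0], embeddings[2])
orthogonal = cosine_similarity(embeddings[0], embeddings[1])
```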

async close()[source]

Clean up resources

xAI Client

class aiecs.llm.clients.xai_client.XAIClient[source]

Bases: BaseLLMClient, OpenAICompatibleFunctionCallingMixin

xAI (Grok) provider client

__init__()[source]
Return type:

None

async generate_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, input_price=None, output_price=None, **kwargs)[source]

Generate text using xAI API via OpenAI library (supports all Grok models).

xAI API is OpenAI-compatible, so it supports Function Calling.

Parameters:
  • messages (List[LLMMessage]) – List of conversation messages

  • model (str | None) – Model name (optional, uses default if not provided)

  • temperature (float) – Sampling temperature (0.0 to 1.0)

  • max_tokens (int | None) – Maximum tokens to generate

  • context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
      - user_id: User identifier for tracking/billing
      - tenant_id: Tenant identifier for multi-tenant setups
      - request_id: Request identifier for tracing
      - session_id: Session identifier
      - Any other custom metadata for observability or middleware

  • functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)

  • tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)

  • tool_choice (Any | None) – Tool choice strategy

  • **kwargs – Additional provider-specific parameters

  • input_price (float | None)

  • output_price (float | None)

Returns:

LLMResponse with generated text and metadata

Return type:

LLMResponse

async stream_text(messages, model=None, temperature=0.7, max_tokens=None, context=None, functions=None, tools=None, tool_choice=None, return_chunks=False, input_price=None, output_price=None, **kwargs)[source]

Stream text using xAI API via OpenAI library (supports all Grok models).

xAI API is OpenAI-compatible, so it supports Function Calling.

Parameters:
  • messages (List[LLMMessage]) – List of conversation messages

  • model (str | None) – Model name (optional, uses default if not provided)

  • temperature (float) – Sampling temperature (0.0 to 1.0)

  • max_tokens (int | None) – Maximum tokens to generate

  • context (Dict[str, Any] | None) – Optional context dictionary containing metadata such as:
      - user_id: User identifier for tracking/billing
      - tenant_id: Tenant identifier for multi-tenant setups
      - request_id: Request identifier for tracing
      - session_id: Session identifier
      - Any other custom metadata for observability or middleware

  • functions (List[Dict[str, Any]] | None) – List of function schemas (legacy format)

  • tools (List[Dict[str, Any]] | None) – List of tool schemas (new format, recommended)

  • tool_choice (Any | None) – Tool choice strategy

  • return_chunks (bool) – If True, returns StreamChunk objects with tool_calls info; if False, returns str tokens only

  • **kwargs – Additional provider-specific parameters

  • input_price (float | None)

  • output_price (float | None)

Yields:

str or StreamChunk – Text tokens as they are generated, or StreamChunk objects if return_chunks=True

Return type:

AsyncGenerator[Any, None]

async close()[source]

Clean up resources

Configuration

Model Config

Pydantic models for LLM configuration management.

This module defines the configuration schema for all LLM providers and models, enabling centralized, type-safe configuration management.

class aiecs.llm.config.model_config.ModelCostConfig[source]

Bases: BaseModel

Token cost configuration for a model

input: float
output: float
classmethod validate_positive(v)[source]

Ensure costs are non-negative

Parameters:

v (float)

Return type:

float

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
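
To illustrate how an input/output cost pair with a non-negativity check might be used, the sketch below models ModelCostConfig with a standard-library dataclass rather than Pydantic. CostSketch and estimate_cost are hypothetical names, and the pricing unit is not stated in this section, so the caller must pass the per_tokens denominator explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostSketch:
    """Dataclass stand-in for ModelCostConfig (not the Pydantic model)."""

    input: float
    output: float

    def __post_init__(self) -> None:
        # Mirrors the documented rule: costs must be non-negative.
        for name in ("input", "output"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} cost must be non-negative")


def estimate_cost(
    costs: CostSketch, input_tokens: int, output_tokens: int, per_tokens: int
) -> float:
    """Price a call, assuming costs are quoted per `per_tokens` tokens."""
    total = costs.input * input_tokens + costs.output * output_tokens
    return total / per_tokens


costs = CostSketch(input=0.5, output=1.5)
# 2,000 input and 1,000 output tokens, with prices assumed per 1,000 tokens.
estimate = estimate_cost(costs, input_tokens=2000, output_tokens=1000, per_tokens=1000)
```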

class aiecs.llm.config.model_config.ModelCapabilities[source]

Bases: BaseModel

Capabilities and limits for a model

streaming: bool
vision: bool
function_calling: bool
max_tokens: int
context_window: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class aiecs.llm.config.model_config.ModelDefaultParams[source]

Bases: BaseModel

Default parameters for model inference

temperature: float
max_tokens: int
top_p: float
top_k: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class aiecs.llm.config.model_config.ModelConfig[source]

Bases: BaseModel

Complete configuration for a single model

name: str
costs: ModelCostConfig
capabilities: ModelCapabilities
default_params: ModelDefaultParams
description: str | None
__init__(**data)[source]

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

display_name: str | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class aiecs.llm.config.model_config.ProviderConfig[source]

Bases: BaseModel

Configuration for a single LLM provider

provider_name: str
default_model: str
models: List[ModelConfig]
model_mappings: Dict[str, str] | None
classmethod validate_models_not_empty(v)[source]

Ensure at least one model is configured

Parameters:

v (List[ModelConfig])

Return type:

List[ModelConfig]

get_model_config(model_name)[source]

Get configuration for a specific model

Parameters:

model_name (str)

Return type:

ModelConfig | None

get_model_names()[source]

Get list of all model names

Return type:

List[str]

get_all_model_names_with_aliases()[source]

Get list of all model names including aliases

Return type:

List[str]
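
The interaction between models, model_mappings, and the alias-aware name listing can be pictured with a small stand-in. The sketch below assumes model_mappings maps an alias to a configured model name; that direction is not stated in this section. ProviderSketch and the model names are hypothetical, and models is simplified to a list of names rather than ModelConfig objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProviderSketch:
    """Simplified stand-in for ProviderConfig."""

    default_model: str
    models: List[str]
    # Assumed direction: alias -> configured model name.
    model_mappings: Dict[str, str] = field(default_factory=dict)

    def resolve(self, name: str) -> Optional[str]:
        """Map an alias to its model name; return None if nothing matches."""
        target = self.model_mappings.get(name, name)
        return target if target in self.models else None

    def all_names_with_aliases(self) -> List[str]:
        """Model names first, then any aliases not already listed."""
        aliases = [a for a in self.model_mappings if a not in self.models]
        return self.models + aliases


provider = ProviderSketch(
    default_model="model-a",
    models=["model-a", "model-b"],
    model_mappings={"fast": "model-a"},
)
resolved_alias = provider.resolve("fast")
resolved_unknown = provider.resolve("model-z")
all_names = provider.all_names_with_aliases()
```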

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class aiecs.llm.config.model_config.LLMModelsConfig[source]

Bases: BaseModel

Root configuration containing all providers

providers: Dict[str, ProviderConfig]
classmethod validate_providers_not_empty(v)[source]

Ensure at least one provider is configured

Parameters:

v (Dict[str, ProviderConfig])

Return type:

Dict[str, ProviderConfig]

get_provider_config(provider_name)[source]

Get configuration for a specific provider

Parameters:

provider_name (str)

Return type:

ProviderConfig | None

get_model_config(provider_name, model_name)[source]

Get configuration for a specific model from a provider

Parameters:
  • provider_name (str)

  • model_name (str)

Return type:

ModelConfig | None

get_provider_names()[source]

Get list of all provider names

Return type:

List[str]

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

Config Loader

Configuration loader for LLM models.

This module provides a singleton configuration loader that loads and manages LLM model configurations from YAML files with support for hot-reloading.

class aiecs.llm.config.config_loader.LLMConfigLoader[source]

Bases: object

Singleton configuration loader for LLM models.

Supports:
  • Loading configuration from YAML files
  • Hot-reloading (manual refresh)
  • Thread-safe access
  • Caching for performance

static __new__(cls)[source]

Ensure singleton instance
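
A common way to enforce a singleton in __new__ while staying thread-safe is double-checked locking, sketched below. This is a generic pattern, not necessarily how LLMConfigLoader is implemented; SingletonLoader is a hypothetical name.

```python
import threading


class SingletonLoader:
    """Generic thread-safe singleton; not the aiecs implementation."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking: skip the lock once the instance exists.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance


first = SingletonLoader()
second = SingletonLoader()
is_shared = first is second
```

One caveat with this pattern is that __init__ still runs on every SingletonLoader() call, so any state set there should be guarded against re-initialization.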

__init__()[source]

Initialize the configuration loader

Return type:

None

load_config(config_path=None)[source]

Load configuration from YAML file.

Parameters:

config_path (Path | None) – Optional path to configuration file. If not provided, will search in standard locations.

Returns:

Loaded and validated configuration

Return type:

LLMModelsConfig

reload_config()[source]

Reload configuration from the current config file.

This supports the hybrid loading mode - configuration is loaded at startup but can be manually refreshed without restarting the application.

Returns:

Reloaded configuration

Return type:

LLMModelsConfig

get_config()[source]

Get the current configuration.

Loads configuration on first access if not already loaded.

Returns:

Current configuration

Return type:

LLMModelsConfig
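
The load-on-first-access and manual-reload behavior described above can be sketched as a small caching loader. To stay within the standard library, this example reads JSON in place of YAML; LazyConfig, the file name, and the config contents are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


class LazyConfig:
    """Caches a parsed config file; JSON is used here instead of YAML."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._config: Optional[Dict[str, Any]] = None

    def is_loaded(self) -> bool:
        return self._config is not None

    def reload_config(self) -> Dict[str, Any]:
        # Always re-read the file and replace the cached copy.
        self._config = json.loads(self._path.read_text(encoding="utf-8"))
        return self._config

    def get_config(self) -> Dict[str, Any]:
        # Load on first access; afterwards serve the cached copy.
        if self._config is None:
            return self.reload_config()
        return self._config


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "llm_models.json"
    path.write_text(json.dumps({"default_model": "model-a"}), encoding="utf-8")

    loader = LazyConfig(path)
    loaded_before = loader.is_loaded()
    first_value = loader.get_config()["default_model"]

    # Edit the file on disk: the cached value is unchanged until reload.
    path.write_text(json.dumps({"default_model": "model-b"}), encoding="utf-8")
    cached_value = loader.get_config()["default_model"]
    reloaded_value = loader.reload_config()["default_model"]
```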

get_provider_config(provider_name)[source]

Get configuration for a specific provider.

Parameters:

provider_name (str) – Name of the provider (e.g., “Vertex”, “OpenAI”)

Returns:

ProviderConfig if found, None otherwise

Return type:

ProviderConfig | None

get_model_config(provider_name, model_name)[source]

Get configuration for a specific model.

Parameters:
  • provider_name (str) – Name of the provider

  • model_name (str) – Name of the model (or alias)

Returns:

ModelConfig if found, None otherwise

Return type:

ModelConfig | None

get_default_model(provider_name)[source]

Get the default model name for a provider.

Parameters:

provider_name (str) – Name of the provider

Returns:

Default model name if found, None otherwise

Return type:

str | None

is_loaded()[source]

Check if configuration has been loaded

Return type:

bool

get_config_path()[source]

Get the path to the current configuration file

Return type:

Path | None

aiecs.llm.config.config_loader.get_llm_config_loader()[source]

Get the global LLM configuration loader instance.

Returns:

Global singleton instance

Return type:

LLMConfigLoader

aiecs.llm.config.config_loader.get_llm_config()[source]

Get the current LLM configuration.

Convenience function that returns the configuration from the global loader.

Returns:

Current configuration

Return type:

LLMModelsConfig

aiecs.llm.config.config_loader.reload_llm_config()[source]

Reload the LLM configuration.

Convenience function that reloads the configuration in the global loader.

Returns:

Reloaded configuration

Return type:

LLMModelsConfig