Backend Configuration Guide
Complete guide for selecting and configuring graph storage backends in AIECS Knowledge Graph.
Backend Selection Matrix
Backend |
Use Case |
Max Entities |
Concurrency |
Persistence |
Production Ready |
|---|---|---|---|---|---|
InMemory |
Development, Testing |
< 100K |
Low |
❌ No |
❌ No |
SQLite |
Small apps, Embedded |
< 1M |
Low-Medium |
✅ Yes |
⚠️ Limited |
PostgreSQL |
Production, Large scale |
> 100M |
High |
✅ Yes |
✅ Yes |
InMemoryGraphStore
When to Use
✅ Use for:
Development and testing
Prototyping
Small datasets (< 100K entities)
Temporary data
Unit tests
❌ Don’t use for:
Production deployments
Persistent data
Large graphs
Multi-user applications
Configuration
from aiecs.infrastructure.graph_storage import InMemoryGraphStore
# Simple initialization
store = InMemoryGraphStore()
await store.initialize()
# No configuration needed - it's in-memory!
Characteristics
Speed: Fastest for small graphs
Memory: Limited by RAM
Persistence: None (data lost on restart)
Concurrency: Single-threaded (Python GIL)
Features: Full GraphStore interface
SQLiteGraphStore
When to Use
✅ Use for:
Small to medium applications (< 1M entities)
Single-user or low-concurrency
Embedded systems
Desktop applications
Simple deployment (single file)
❌ Don’t use for:
High-concurrency production
Large graphs (> 1M entities)
Multi-user web applications
Write-heavy workloads
Configuration
from aiecs.infrastructure.graph_storage import SQLiteGraphStore
# File-based storage
store = SQLiteGraphStore("knowledge_graph.db")
await store.initialize()
# In-memory (for testing)
store = SQLiteGraphStore(":memory:")
await store.initialize()
# With options
store = SQLiteGraphStore(
"graph.db",
timeout=30.0, # Connection timeout
check_same_thread=False # For async
)
await store.initialize()
Performance Tuning
SQLite Configuration:
# After initialization, optimize SQLite
async with store.conn.execute("PRAGMA journal_mode = WAL"):
pass
async with store.conn.execute("PRAGMA synchronous = NORMAL"):
pass
async with store.conn.execute("PRAGMA cache_size = -64000"): # 64MB
pass
Characteristics
Speed: Fast for small-medium graphs
Storage: Single file database
Persistence: ✅ Yes
Concurrency: Limited (file locking)
Features: Full GraphStore interface
Scalability: Up to ~1M entities
PostgresGraphStore
When to Use
✅ Use for:
Production deployments
Large graphs (1M+ entities)
High concurrency
Multi-user applications
ACID transactions required
Advanced features (JSONB, pgvector)
❌ Don’t use for:
Simple single-user apps
Embedded systems
Minimal deployment requirements
Configuration
Basic Configuration:
from aiecs.infrastructure.graph_storage import PostgresGraphStore
store = PostgresGraphStore(
host="localhost",
port=5432,
user="graph_user",
password="secure_password",
database="knowledge_graph"
)
await store.initialize()
With Connection Pooling:
store = PostgresGraphStore(
host="localhost",
port=5432,
user="graph_user",
password="secure_password",
database="knowledge_graph",
min_pool_size=5, # Minimum connections
max_pool_size=20 # Maximum connections
)
await store.initialize()
With SSL:
store = PostgresGraphStore(
host="postgres.example.com",
port=5432,
user="graph_user",
password="secure_password",
database="knowledge_graph",
ssl=True,
sslmode="require"
)
await store.initialize()
Reusing Existing Connection Pool:
from aiecs.infrastructure.persistence.database_manager import DatabaseManager
# Reuse AIECS DatabaseManager pool
db_manager = DatabaseManager(...)
await db_manager.initialize()
store = PostgresGraphStore(
pool=db_manager.pool, # Reuse existing pool
database_manager=db_manager
)
await store.initialize()
With pgvector (for vector search):
store = PostgresGraphStore(
...,
enable_pgvector=True # Enable vector extension
)
await store.initialize()
Environment Variables
# .env file
DB_HOST=localhost
DB_PORT=5432
DB_USER=graph_user
DB_PASSWORD=secure_password
DB_NAME=knowledge_graph
ENABLE_PGVECTOR=true
import os
from dotenv import load_dotenv
load_dotenv()
store = PostgresGraphStore(
host=os.getenv('DB_HOST'),
port=int(os.getenv('DB_PORT', '5432')),
user=os.getenv('DB_USER'),
password=os.getenv('DB_PASSWORD'),
database=os.getenv('DB_NAME'),
enable_pgvector=os.getenv('ENABLE_PGVECTOR', 'false').lower() == 'true'
)
await store.initialize()
Characteristics
Speed: Fast, scales well
Storage: Enterprise-grade database
Persistence: ✅ Yes, ACID transactions
Concurrency: High (connection pooling)
Features: Full GraphStore + advanced features
Scalability: 100M+ entities
Backend Selection Decision Tree
Start
↓
Need persistence?
├─ No → InMemoryGraphStore
└─ Yes
↓
Production?
├─ No → SQLiteGraphStore
└─ Yes
↓
Graph size?
├─ < 1M entities → SQLiteGraphStore (simple) or PostgresGraphStore (future-proof)
└─ > 1M entities → PostgresGraphStore
Migration Between Backends
Using GraphStorageMigrator
from aiecs.infrastructure.graph_storage.migration import GraphStorageMigrator
# Migrate from SQLite to PostgreSQL
source = SQLiteGraphStore("old_graph.db")
target = PostgresGraphStore(...)
migrator = GraphStorageMigrator()
await migrator.migrate(source, target, batch_size=1000)
Manual Migration
# Export from source
from aiecs.infrastructure.graph_storage.streaming import GraphStreamExporter
exporter = GraphStreamExporter(source_store)
await exporter.export_to_file("migration.jsonl.gz", compress=True)
# Import to target
from aiecs.infrastructure.graph_storage.streaming import GraphStreamImporter
importer = GraphStreamImporter(target_store)
await importer.import_from_file("migration.jsonl.gz", batch_size=1000)
Configuration Examples
Development Environment
# Use InMemory for fast iteration
store = InMemoryGraphStore()
await store.initialize()
Testing Environment
# Use SQLite for test isolation
store = SQLiteGraphStore(":memory:") # In-memory for tests
await store.initialize()
Staging Environment
# Use PostgreSQL matching production
store = PostgresGraphStore(
host=os.getenv('STAGING_DB_HOST'),
...
)
await store.initialize()
Production Environment
# Full production configuration
store = PostgresGraphStore(
host=os.getenv('DB_HOST'),
port=int(os.getenv('DB_PORT', '5432')),
user=os.getenv('DB_USER'),
password=os.getenv('DB_PASSWORD'),
database=os.getenv('DB_NAME'),
min_pool_size=10,
max_pool_size=50,
enable_pgvector=True
)
await store.initialize()
Backend-Specific Features
InMemoryGraphStore
✅ Fastest for small graphs
✅ No setup required
✅ Full interface support
❌ No persistence
❌ Limited by RAM
SQLiteGraphStore
✅ Single file database
✅ Easy deployment
✅ Full interface support
✅ ACID transactions
⚠️ Limited concurrency
⚠️ File locking
PostgresGraphStore
✅ Production-grade
✅ High concurrency
✅ Advanced features (JSONB, pgvector)
✅ Connection pooling
✅ Read replicas support
⚠️ Requires PostgreSQL server
Performance Comparison
Operation |
InMemory |
SQLite |
PostgreSQL |
|---|---|---|---|
add_entity (1K) |
10ms |
500ms |
300ms |
get_entity |
0.01ms |
3ms |
2ms |
find_paths (depth=5) |
10ms |
200ms |
50ms |
Batch insert (10K) |
150ms |
5s |
300ms |
Note: Actual performance depends on data size, hardware, and configuration
Best Practices
1. Choose the Right Backend
Development: InMemoryGraphStore
Testing: SQLiteGraphStore (in-memory)
Small Production: SQLiteGraphStore
Large Production: PostgresGraphStore
2. Connection Pooling (PostgreSQL)
# Size pool based on expected load
# Rule of thumb: max_pool_size = (expected_concurrent_requests / 2)
store = PostgresGraphStore(
...,
min_pool_size=5, # Keep minimum connections warm
max_pool_size=20 # Scale up for load
)
3. Environment-Specific Configuration
import os
def get_graph_store():
env = os.getenv('ENVIRONMENT', 'development')
if env == 'production':
return PostgresGraphStore(...)
elif env == 'staging':
return PostgresGraphStore(...) # Staging DB
elif env == 'testing':
return SQLiteGraphStore(":memory:")
else:
return InMemoryGraphStore()
4. Graceful Degradation
from aiecs.infrastructure.graph_storage.graceful_degradation import GracefulDegradationStore
primary = PostgresGraphStore(...)
store = GracefulDegradationStore(primary, enable_fallback=True)
await store.initialize()
# Automatically falls back to in-memory if PostgreSQL fails
Troubleshooting
SQLite: Database Locked
Problem: Multiple processes accessing SQLite Solution: Use PostgreSQL for multi-process applications
PostgreSQL: Connection Pool Exhausted
Problem: Too many concurrent connections
Solution: Increase max_pool_size or reduce concurrent requests
InMemory: Out of Memory
Problem: Graph too large for RAM Solution: Switch to SQLite or PostgreSQL
Summary
InMemoryGraphStore: Development and testing only
SQLiteGraphStore: Small-medium production (< 1M entities)
PostgresGraphStore: Large-scale production (1M+ entities)
Choose based on your use case, scale, and requirements!