Production Deployment Guide for Knowledge Graph
Complete guide for deploying AIECS Knowledge Graph in production environments.
Table of Contents
Prerequisites
System Requirements
Minimum:
CPU: 2 cores
RAM: 4GB
Disk: 20GB SSD
Network: 100Mbps
Recommended (for 1M+ entities):
CPU: 4+ cores
RAM: 16GB+
Disk: 100GB+ SSD
Network: 1Gbps
Software Requirements
Python 3.9+
PostgreSQL 12+ (for PostgreSQL backend)
Redis 6+ (optional, for caching)
Docker (optional, for containerized deployment)
Backend Selection
When to Use Each Backend
InMemoryGraphStore:
❌ NOT for production
✅ Development and testing only
✅ Small datasets (< 10K entities)
✅ Temporary data
SQLiteGraphStore:
✅ Small to medium applications (< 1M entities)
✅ Single-user or low-concurrency
✅ Embedded systems
✅ Simple deployment
❌ Not for high-concurrency production
PostgresGraphStore:
✅ Recommended for production
✅ Large graphs (1M+ entities)
✅ High concurrency
✅ ACID transactions
✅ Advanced features (JSONB, pgvector)
PostgreSQL Setup
1. Install PostgreSQL
Ubuntu/Debian:
sudo apt update
sudo apt install postgresql-14 postgresql-contrib-14
CentOS/RHEL:
sudo yum install postgresql14-server postgresql14
macOS:
brew install postgresql@14
2. Configure PostgreSQL
Create database:
sudo -u postgres psql
CREATE DATABASE aiecs_knowledge_graph;
CREATE USER graph_user WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE aiecs_knowledge_graph TO graph_user;
\q
Enable pgvector extension (optional, for vector search):
sudo -u postgres psql -d aiecs_knowledge_graph
CREATE EXTENSION IF NOT EXISTS vector;
\q
3. Configure Connection Pooling
PostgreSQL configuration (postgresql.conf):
# Connection settings
max_connections = 200
shared_buffers = 4GB
effective_cache_size = 12GB
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 20MB
min_wal_size = 1GB
max_wal_size = 4GB
Restart PostgreSQL:
sudo systemctl restart postgresql
4. Configure Firewall
# Allow PostgreSQL connections
sudo ufw allow 5432/tcp
# Or restrict to specific IPs
sudo ufw allow from 10.0.0.0/8 to any port 5432
Configuration
Environment Variables
Create .env file:
For local PostgreSQL:
# PostgreSQL Configuration (Local Mode)
DB_CONNECTION_MODE=local
DB_HOST=localhost
DB_PORT=5432
DB_USER=graph_user
DB_PASSWORD=secure_password
DB_NAME=aiecs_knowledge_graph
For cloud PostgreSQL:
# PostgreSQL Configuration (Cloud Mode)
DB_CONNECTION_MODE=cloud
POSTGRES_URL=postgresql://user:password@host:port/database?sslmode=require
Common settings:
# Redis Configuration (optional)
REDIS_URL=redis://localhost:6379/0
REDIS_TTL=300
# Graph Store Configuration
GRAPH_STORE_BACKEND=postgres
GRAPH_STORE_POOL_MIN=5
GRAPH_STORE_POOL_MAX=20
GRAPH_STORE_ENABLE_CACHE=true
GRAPH_STORE_ENABLE_MONITORING=true
Application Configuration
from aiecs.infrastructure.graph_storage import PostgresGraphStore
from aiecs.infrastructure.graph_storage.cache import GraphStoreCache, GraphStoreCacheConfig
from aiecs.infrastructure.graph_storage.performance_monitoring import PerformanceMonitor
# Initialize store
store = PostgresGraphStore(
host=os.getenv('DB_HOST'),
port=int(os.getenv('DB_PORT', '5432')),
user=os.getenv('DB_USER'),
password=os.getenv('DB_PASSWORD'),
database=os.getenv('DB_NAME'),
min_pool_size=5,
max_pool_size=20
)
await store.initialize()
# Enable caching
cache = GraphStoreCache(GraphStoreCacheConfig(
redis_url=os.getenv('REDIS_URL'),
ttl=int(os.getenv('REDIS_TTL', '300'))
))
await cache.initialize()
# Enable monitoring
monitor = PerformanceMonitor(
enabled=True,
slow_query_threshold_ms=100.0
)
await monitor.initialize()
Performance Optimization
1. Enable Caching
from aiecs.infrastructure.graph_storage.cache import GraphStoreCache, GraphStoreCacheConfig
cache = GraphStoreCache(GraphStoreCacheConfig(
enabled=True,
redis_url="redis://localhost:6379/0",
ttl=300, # 5 minutes
max_cache_size_mb=1000 # 1GB in-memory fallback
))
await cache.initialize()
2. Optimize Indexes
from aiecs.infrastructure.graph_storage.index_optimization import IndexOptimizer
optimizer = IndexOptimizer(store.pool)
# Get recommendations
recommendations = await optimizer.get_missing_index_recommendations()
# Apply high-priority recommendations
high_priority = [r for r in recommendations if r.estimated_benefit == "high"]
results = await optimizer.apply_recommendations(high_priority)
3. Use Batch Operations
from aiecs.infrastructure.graph_storage.batch_operations import BatchOperationsMixin
class OptimizedStore(PostgresGraphStore, BatchOperationsMixin):
pass
store = OptimizedStore(...)
await store.batch_add_entities(entities, batch_size=1000, use_copy=True)
4. Configure Connection Pooling
store = PostgresGraphStore(
...,
min_pool_size=10, # Minimum connections
max_pool_size=50 # Maximum connections
)
Monitoring and Health Checks
1. Health Checks
from aiecs.infrastructure.graph_storage.health_checks import HealthChecker, HealthMonitor
checker = HealthChecker(store, timeout_seconds=5.0)
result = await checker.check_health()
if result.is_healthy():
print("Store is healthy")
else:
print(f"Store is {result.status}: {result.message}")
# Continuous monitoring
monitor = HealthMonitor(checker, interval_seconds=30)
await monitor.start()
2. Metrics Collection
from aiecs.infrastructure.graph_storage.metrics import MetricsCollector, MetricsExporter
collector = MetricsCollector()
collector.record_latency("get_entity", 12.5)
collector.record_cache_hit()
# Export to Prometheus
exporter = MetricsExporter(collector)
prometheus_metrics = exporter.to_prometheus()
3. Performance Monitoring
from aiecs.infrastructure.graph_storage.performance_monitoring import PerformanceMonitor
monitor = PerformanceMonitor(
enabled=True,
slow_query_threshold_ms=100.0,
log_slow_queries=True
)
# Track queries
async with monitor.track_query("get_entity", query):
result = await conn.fetch(query)
# Get report
report = monitor.get_performance_report()
Security
1. Database Security
Use strong passwords:
# Generate secure password
openssl rand -base64 32
Limit database access:
-- Revoke public access
REVOKE ALL ON DATABASE aiecs_knowledge_graph FROM PUBLIC;
-- Grant only to application user
GRANT CONNECT ON DATABASE aiecs_knowledge_graph TO graph_user;
Enable SSL:
store = PostgresGraphStore(
...,
ssl=True,
sslmode='require'
)
2. Connection Security
Use connection pooling:
Prevents connection exhaustion
Reduces authentication overhead
Improves security
Rotate credentials:
Change passwords regularly
Use secrets management (Vault, AWS Secrets Manager)
3. Input Validation
Validate entity IDs:
import re
def validate_entity_id(entity_id: str) -> bool:
# Only allow alphanumeric, underscore, hyphen
return bool(re.match(r'^[a-zA-Z0-9_-]+$', entity_id))
Sanitize queries:
Use parameterized queries (already implemented)
Backup and Recovery
1. PostgreSQL Backup
Automated backup script:
#!/bin/bash
# backup_graph.sh
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/graph"
DB_NAME="aiecs_knowledge_graph"
# Create backup
pg_dump -U graph_user -F c -f "$BACKUP_DIR/graph_$DATE.dump" $DB_NAME
# Compress
gzip "$BACKUP_DIR/graph_$DATE.dump"
# Keep only last 7 days
find $BACKUP_DIR -name "*.dump.gz" -mtime +7 -delete
Schedule with cron:
# Daily backup at 2 AM
0 2 * * * /path/to/backup_graph.sh
2. Restore from Backup
# Restore
gunzip graph_20231109_020000.dump.gz
pg_restore -U graph_user -d aiecs_knowledge_graph graph_20231109_020000.dump
3. Streaming Backup
from aiecs.infrastructure.graph_storage.streaming import GraphStreamExporter
exporter = GraphStreamExporter(store)
await exporter.export_to_file(
"backup.jsonl.gz",
format=StreamFormat.JSONL,
compress=True
)
Scaling
Vertical Scaling
Increase resources:
More RAM: Better caching
More CPU: Faster queries
SSD storage: Faster I/O
Horizontal Scaling
Read Replicas:
# Primary for writes
primary_store = PostgresGraphStore(host="primary.db", ...)
# Replicas for reads
replica_stores = [
PostgresGraphStore(host="replica1.db", ...),
PostgresGraphStore(host="replica2.db", ...)
]
# Route reads to replicas
def get_read_store():
return random.choice(replica_stores)
Connection Pooling:
Increase
max_pool_sizefor more concurrent connectionsMonitor pool usage
Troubleshooting
Common Issues
Connection Errors:
# Check connection
from aiecs.infrastructure.graph_storage.health_checks import HealthChecker
checker = HealthChecker(store)
result = await checker.check_health()
print(result.to_dict())
Slow Queries:
# Analyze query plans
from aiecs.infrastructure.graph_storage.performance_monitoring import PerformanceMonitor
monitor = PerformanceMonitor()
plan = await monitor.analyze_query_plan(conn, query)
print(plan.get_warnings())
Memory Issues:
Use streaming for large exports
Enable pagination
Use lazy loading
Production Checklist
PostgreSQL installed and configured
Database and user created
Indexes optimized
Caching enabled (Redis)
Health checks configured
Monitoring metrics exported
Backups scheduled
SSL enabled for connections
Firewall configured
Connection pooling tuned
Error handling enabled
Logging configured
Graceful degradation tested
Next Steps
See
PERFORMANCE_GUIDE.mdfor performance tuningSee
CUSTOM_BACKEND_GUIDE.mdfor custom backendsSee
PHASE6_TASK6.5_COMPLETE.mdfor complete implementation details