Knowledge Graph Reasoning Engine

Status: ✅ Complete
Version: 1.0.0
Phase: 4 - Reasoning Engine

Overview

The Knowledge Graph Reasoning Engine answers complex questions over a knowledge graph and discovers implicit knowledge. It combines four components: query planning, multi-hop reasoning, logical inference, and evidence synthesis.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Reasoning Engine                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │    Query     │  │   Multi-Hop  │  │  Inference   │     │
│  │   Planner    │→ │  Reasoning   │→ │   Engine     │     │
│  └──────────────┘  └──────────────┘  └──────────────┘     │
│         │                  │                  │             │
│         └──────────────────┴──────────────────┘             │
│                            ↓                                │
│                  ┌──────────────────┐                       │
│                  │    Evidence      │                       │
│                  │  Synthesizer     │                       │
│                  └──────────────────┘                       │
└─────────────────────────────────────────────────────────────┘

Components

1. Query Planner

Purpose: Translates natural language queries into optimized execution plans.

Key Features:

Query decomposition into steps
Dependency resolution
Cost and latency estimation
Three optimization strategies (cost, latency, balanced)

Example:

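from aiecs.application.knowledge_graph.reasoning import QueryPlanner
# Assumed module path for OptimizationStrategy; adjust if your installation defines it elsewhere
from aiecs.domain.knowledge_graph.models.query_plan import OptimizationStrategy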

planner = QueryPlanner(graph_store)
plan = planner.plan_query("Who works at companies that Alice knows people at?")

# Optimize for minimal cost
optimized_plan = planner.optimize_plan(plan, OptimizationStrategy.MINIMIZE_COST)

2. Multi-Hop Reasoning Engine

Purpose: Finds and reasons over multi-hop paths in the knowledge graph.

Key Features:

Path finding with depth limits
Evidence collection from paths
Path ranking by relevance
Answer generation from evidence
Query execution with trace

Example:

from aiecs.application.knowledge_graph.reasoning import ReasoningEngine

engine = ReasoningEngine(graph_store)
result = await engine.reason(
    query="How is Alice connected to Company X?",
    context={"start_entity_id": "alice", "target_entity_id": "company_x"},
    max_hops=3
)

print(f"Answer: {result.answer}")
print(f"Confidence: {result.confidence}")
print(f"Evidence: {len(result.evidence)} pieces")
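
Path finding with a depth limit can be pictured as a breadth-first search that stops extending a path once it has used max_hops relations. The following is a minimal sketch over a plain adjacency dictionary, not the ReasoningEngine implementation; the toy graph and the find_paths helper are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical toy graph: entity -> list of (relation_type, neighbor)
Graph = Dict[str, List[Tuple[str, str]]]


def find_paths(graph: Graph, start: str, target: str, max_hops: int) -> List[List[str]]:
    """Return every cycle-free path from start to target using at most max_hops relations."""
    paths: List[List[str]] = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue  # do not extend a path beyond the target
        if len(path) - 1 >= max_hops:
            continue  # depth limit reached
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in path:  # skip cycles
                queue.append(path + [neighbor])
    return paths


graph = {
    "alice": [("KNOWS", "bob"), ("KNOWS", "carol")],
    "bob": [("WORKS_FOR", "company_x")],
    "carol": [("KNOWS", "bob")],
}
paths = find_paths(graph, "alice", "company_x", max_hops=3)
for p in paths:
    print(" -> ".join(p))
```

Because the search is breadth-first, shorter paths come out first, which gives a natural starting order for ranking.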

3. Inference Engine

Purpose: Applies logical inference rules to discover implicit knowledge.

Key Features:

Transitive inference (A→B, B→C ⇒ A→C)
Symmetric inference (A→B ⇒ B→A)
Rule-based inference
Inference result caching
Full explainability (trace inference steps)

Example:

from aiecs.application.knowledge_graph.reasoning import InferenceEngine
from aiecs.domain.knowledge_graph.models.inference_rule import InferenceRule, RuleType

engine = InferenceEngine(graph_store)

# Add transitive rule for KNOWS relations
engine.add_rule(InferenceRule(
    rule_id="transitive_knows",
    rule_type=RuleType.TRANSITIVE,
    relation_type="KNOWS",
    description="Transitive closure for KNOWS"
))

# Apply inference
result = await engine.infer_relations(
    relation_type="KNOWS",
    max_steps=5,
    use_cache=True
)

print(f"Inferred {len(result.inferred_relations)} new relations")
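
Transitive inference amounts to a fixed-point iteration: keep composing known pairs until no new pair appears or the step budget runs out. Below is a minimal sketch on plain tuples, independent of InferenceEngine. The infer_transitive helper is hypothetical, and treating max_steps as a number of passes is an assumption.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def infer_transitive(relations: Set[Pair], max_steps: int) -> Set[Pair]:
    """Return pairs implied by transitivity that are not already in relations."""
    known = set(relations)
    for _ in range(max_steps):
        # Compose (a, b) with (b, d) to propose (a, d)
        new_pairs = {
            (a, d)
            for (a, b) in known
            for (c, d) in known
            if b == c and a != d and (a, d) not in known
        }
        if not new_pairs:
            break  # fixed point reached, nothing left to infer
        known |= new_pairs
    return known - relations


knows = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}
inferred = infer_transitive(knows, max_steps=5)
print(sorted(inferred))
```

A small max_steps caps how far the closure can grow in one call, which is why it matters on large graphs.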

4. Evidence Synthesizer

Purpose: Combines evidence from multiple sources for robust conclusions.

Key Features:

Evidence grouping by overlap
Multiple synthesis methods (weighted average, max, voting)
Confidence boosting from agreement
Contradiction detection
Reliability ranking

Example:

from aiecs.application.knowledge_graph.reasoning import EvidenceSynthesizer

synthesizer = EvidenceSynthesizer(
    confidence_threshold=0.7,
    contradiction_threshold=0.3
)

# Synthesize overlapping evidence
synthesized = synthesizer.synthesize_evidence(
    evidence_list,
    method="weighted_average"
)

# Estimate overall confidence
overall_confidence = synthesizer.estimate_overall_confidence(synthesized)

# Rank by reliability
ranked = synthesizer.rank_by_reliability(synthesized)
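
The three synthesis methods differ in how they turn several confidence scores for the same claim into one. The sketch below illustrates the idea on bare numbers. It is not the EvidenceSynthesizer code: the combine helper, the use of relevance scores as weights, and the voting rule (share of items at or above a threshold) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Each item is (confidence, relevance_score) for evidence about the same claim.
Scored = Tuple[float, float]


def combine(items: List[Scored], method: str = "weighted_average", threshold: float = 0.7) -> float:
    """Combine confidences for one group of overlapping evidence (illustrative only)."""
    if not items:
        raise ValueError("no evidence to combine")
    if method == "weighted_average":
        total_weight = sum(rel for _, rel in items)
        if total_weight == 0:
            # Fall back to a plain mean when every weight is zero
            return sum(conf for conf, _ in items) / len(items)
        return sum(conf * rel for conf, rel in items) / total_weight
    if method == "max":
        return max(conf for conf, _ in items)
    if method == "voting":
        votes = sum(1 for conf, _ in items if conf >= threshold)
        return votes / len(items)
    raise ValueError(f"unknown method: {method}")


group = [(0.9, 1.0), (0.6, 0.5), (0.8, 0.5)]
for name in ("weighted_average", "max", "voting"):
    print(name, round(combine(group, name), 3))
```

Weighted averaging dampens a single outlier, max trusts the strongest item, and voting ignores magnitudes beyond the threshold.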

Reasoning Workflow

Complete Reasoning Pipeline

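from aiecs.tools.knowledge_graph import GraphReasoningTool
# Assumed module path for GraphReasoningInput; adjust if your installation defines it elsewhere
from aiecs.tools.knowledge_graph.graph_reasoning_tool import GraphReasoningInput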

tool = GraphReasoningTool(graph_store)

# Full reasoning with all components
result = await tool._execute(GraphReasoningInput(
    mode="full_reasoning",
    query="How is Alice connected to Company X?",
    start_entity_id="alice",
    target_entity_id="company_x",
    max_hops=3,
    apply_inference=True,
    inference_relation_type="KNOWS",
    synthesize_evidence=True,
    confidence_threshold=0.7
))

# Results include:
# - Query plan steps
# - Multi-hop reasoning results
# - Inferred relations (if enabled)
# - Synthesized evidence
# - Final answer with confidence
# - Complete reasoning trace

Step-by-Step Workflow

1. Query Planning
   ├─ Parse natural language query
   ├─ Identify query type (vector, traversal, path finding, etc.)
   ├─ Decompose into executable steps
   └─ Optimize for cost/latency

2. Multi-Hop Reasoning
   ├─ Execute query plan
   ├─ Find paths in knowledge graph
   ├─ Collect evidence from paths
   ├─ Rank evidence by relevance
   └─ Generate answer

3. Logical Inference (Optional)
   ├─ Apply inference rules
   ├─ Discover implicit relations
   ├─ Track inference steps
   └─ Cache results

4. Evidence Synthesis
   ├─ Group overlapping evidence
   ├─ Combine using synthesis method
   ├─ Boost confidence from agreement
   ├─ Detect contradictions
   └─ Rank by reliability

5. Answer Generation
   ├─ Combine evidence and inferences
   ├─ Calculate overall confidence
   ├─ Generate natural language answer
   └─ Provide reasoning trace

Domain Models

QueryPlan

class QueryPlan(BaseModel):
    plan_id: str
    original_query: str
    steps: List[QueryStep]
    total_estimated_cost: float
    optimized: bool
    explanation: str

QueryStep

class QueryStep(BaseModel):
    step_id: str
    operation: QueryOperation
    query: GraphQuery
    depends_on: List[str]
    description: str
    estimated_cost: float
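
The depends_on field is what makes dependency resolution possible: a step can run once every step it depends on has finished, and steps that become ready together can run in parallel. A minimal sketch of that grouping on plain step IDs follows; the execution_levels helper and the step names are hypothetical, not part of QueryPlanner.

```python
from typing import Dict, List, Set


def execution_levels(depends_on: Dict[str, List[str]]) -> List[List[str]]:
    """Group step IDs into levels; all steps in a level have their dependencies met."""
    remaining = {step: set(deps) for step, deps in depends_on.items()}
    done: Set[str] = set()
    levels: List[List[str]] = []
    while remaining:
        ready = sorted(step for step, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cyclic or unknown dependency among steps")
        levels.append(ready)
        done.update(ready)
        for step in ready:
            del remaining[step]
    return levels


steps = {
    "find_alice": [],
    "find_company": [],
    "find_paths": ["find_alice", "find_company"],
    "rank": ["find_paths"],
}
levels = execution_levels(steps)
print(levels)
```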

ReasoningResult

class ReasoningResult(BaseModel):
    query: str
    evidence: List[Evidence]
    answer: str
    confidence: float
    reasoning_trace: List[str]
    execution_time_ms: float

Evidence

class Evidence(BaseModel):
    evidence_id: str
    evidence_type: EvidenceType
    entities: List[Entity]
    relations: List[Relation]
    paths: List[Path]
    confidence: float
    relevance_score: float
    explanation: str
    source: str

InferenceResult

class InferenceResult(BaseModel):
    inferred_relations: List[Relation]
    inference_steps: List[InferenceStep]
    confidence: float
    total_steps: int

Use Cases

1. Complex Question Answering

# Multi-hop question with inference
result = await engine.reason(
    query="Who are the most influential people connected to Alice?",
    context={"start_entity_id": "alice"},
    max_hops=4
)

2. Relationship Discovery

# Find all transitive connections
result = await inference_engine.infer_relations(
    relation_type="KNOWS",
    max_steps=10,
    use_cache=True
)

3. Evidence-Based Decision Making

# Collect and synthesize evidence
# (collect_evidence and make_decision are application-defined placeholders)
evidence = await collect_evidence(query)
synthesized = synthesizer.synthesize_evidence(evidence)
ranked = synthesizer.rank_by_reliability(synthesized)

# Make decision based on top evidence
decision = make_decision(ranked[0])

4. Knowledge Graph Completion

# Infer missing relations
symmetric_rule = InferenceRule(
    rule_id="symmetric_friend",
    rule_type=RuleType.SYMMETRIC,
    relation_type="FRIEND_OF"
)
inference_engine.add_rule(symmetric_rule)

result = await inference_engine.infer_relations("FRIEND_OF")
# Adds reverse friendship relations
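
For a symmetric relation, completion reduces to adding each reversed pair that is not already stored. A minimal sketch on plain tuples, with a hypothetical infer_symmetric helper rather than the InferenceEngine API:

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def infer_symmetric(relations: Set[Pair]) -> Set[Pair]:
    """Return the reversed pairs that are missing from relations."""
    return {(b, a) for (a, b) in relations if (b, a) not in relations}


friend_of = {("alice", "bob"), ("bob", "alice"), ("carol", "dave")}
missing = infer_symmetric(friend_of)
print(missing)
```

Unlike transitive inference, this needs only one pass, since reversing an inferred pair yields a pair that already exists.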

Performance

Benchmarks

Operation               Graph Size     Time (ms)  Throughput
----------------------  -------------  ---------  ----------------
Query Planning          Any            <10        >100 queries/sec
Multi-Hop (3 hops)      1K entities    20-50      ~20 queries/sec
Multi-Hop (3 hops)      10K entities   50-150     ~7 queries/sec
Inference (Transitive)  100 relations  10-30      ~30 ops/sec
Inference (Transitive)  1K relations   50-200     ~5 ops/sec
Evidence Synthesis      10 pieces      <5         >200 ops/sec
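
The throughput figures appear consistent with one worker running operations back to back at the upper end of each latency range, that is, throughput ≈ 1000 / latency in ms. That reading is an inference from the numbers, not a documented measurement method. A quick check:

```python
def throughput_per_sec(latency_ms: float) -> float:
    """Sequential throughput for a single worker, given per-operation latency in ms."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# Upper latency bounds taken from the table above
print(round(throughput_per_sec(50)))   # multi-hop, 1K entities
print(round(throughput_per_sec(150)))  # multi-hop, 10K entities
print(round(throughput_per_sec(200)))  # transitive inference, 1K relations
```

Concurrent execution would raise these numbers, so treat them as a conservative floor under this assumption.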

Optimization Tips

Use Caching:
- Enable inference result caching
- Cache query plans for repeated queries
- Use retrieval cache for frequent lookups

Limit Depth:
- Set max_hops appropriately (3-4 is usually sufficient)
- Use max_evidence to limit evidence collection

Optimize Inference:
- Set max_steps based on graph size
- Enable only needed inference rules
- Use cache for repeated relation types

Parallel Execution:
- Query planner identifies parallel steps
- Use execution_order for optimal parallelization

Best Practices

Query Writing

# Good: Specific and focused
"How is Alice connected to Company X?"

# Better: With constraints
"How is Alice connected to Company X through WORKS_FOR relations?"

# Best: With context
context = {
    "start_entity_id": "alice",
    "target_entity_id": "company_x",
    "relation_types": ["WORKS_FOR", "KNOWS"]
}

Inference Rules

# Enable only the rules a given run needs; leave the others disabled
for rule in inference_engine.get_rules("KNOWS"):
    rule.enabled = True  # turn on KNOWS rules for this run

# Set appropriate confidence decay
InferenceRule(
    rule_id="transitive_knows",
    rule_type=RuleType.TRANSITIVE,
    relation_type="KNOWS",
    confidence_decay=0.1  # 10% decay per hop
)
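
How confidence_decay compounds across hops is not spelled out here. One plausible reading, shown purely as an assumption, is multiplicative decay: each inferred hop keeps (1 - decay) of the previous confidence. The decayed_confidence helper below is hypothetical.

```python
def decayed_confidence(base: float, decay: float, hops: int) -> float:
    """Confidence after `hops` inference steps under multiplicative decay (assumed model)."""
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must be between 0 and 1")
    return base * (1.0 - decay) ** hops


for hops in range(4):
    print(hops, round(decayed_confidence(1.0, 0.1, hops), 3))
```

Under this model a decay of 0.1 leaves about 73% of the base confidence after three hops, which is one way to choose a decay that matches your confidence_threshold.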

Evidence Synthesis

# Filter before synthesis
high_confidence = synthesizer.filter_by_confidence(
    evidence_list,
    threshold=0.7
)

# Use appropriate method
synthesized = synthesizer.synthesize_evidence(
    high_confidence,
    method="weighted_average"  # Balanced approach
)

# Check for contradictions
contradictions = synthesizer.detect_contradictions(synthesized)
if contradictions:
    # Handle contradictions
    pass

Error Handling

from aiecs.application.knowledge_graph.reasoning import (
    QueryPlanner,
    ReasoningEngine,
    InferenceEngine
)

# planner, engine, and inference_engine are instances created as in the Components section
try:
    # Query planning
    plan = planner.plan_query(query)
    
    # Multi-hop reasoning
    result = await engine.reason(query, context, max_hops=3)
    
    # Inference
    inferred = await inference_engine.infer_relations(
        relation_type="KNOWS",
        max_steps=5
    )
    
except ValueError as e:
    print(f"Invalid parameter: {e}")
except Exception as e:
    print(f"Reasoning error: {e}")
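
The snippets in this document use await directly, which only works inside a coroutine or an async-aware REPL. A minimal sketch of the wrapping pattern with asyncio.run follows. StubEngine is a hypothetical stand-in so the example runs without a graph store; it is not the real ReasoningEngine.

```python
import asyncio
from typing import Any, Dict


class StubEngine:
    """Hypothetical stand-in for an engine exposing an async reason() method."""

    async def reason(self, query: str, context: Dict[str, Any], max_hops: int = 3) -> str:
        if max_hops < 1:
            raise ValueError("max_hops must be at least 1")
        await asyncio.sleep(0)  # yield control, as a real I/O call would
        return f"answered {query!r} within {max_hops} hops"


async def main(max_hops: int) -> str:
    engine = StubEngine()
    try:
        return await engine.reason("How is Alice connected to Company X?", {}, max_hops=max_hops)
    except ValueError as exc:
        return f"Invalid parameter: {exc}"


print(asyncio.run(main(3)))
print(asyncio.run(main(0)))
```

Keeping the try/except inside the coroutine means the error is handled where the awaited call fails, and asyncio.run stays a single entry point.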

Testing

Each reasoning component has its own unit test suite:

Query Planning: 22 unit tests
Multi-Hop Reasoning: 20 unit tests
Logical Inference: 21 unit tests
Evidence Synthesis: 14 unit tests
Reasoning Tools: 11 unit tests

Total: 88 tests passing

API Reference

QueryPlanner

plan_query(query, context) -> QueryPlan
optimize_plan(plan, strategy) -> QueryPlan
translate_to_graph_query(query) -> GraphQuery

ReasoningEngine

reason(query, context, max_hops, max_evidence) -> ReasoningResult
find_multi_hop_paths(start_id, target_id, max_hops) -> List[Path]
collect_evidence_from_paths(paths) -> List[Evidence]
rank_evidence(evidence) -> List[Evidence]

InferenceEngine

infer_relations(relation_type, max_steps, use_cache) -> InferenceResult
add_rule(rule) -> None
remove_rule(rule_id) -> None
get_rules(relation_type) -> List[InferenceRule]

EvidenceSynthesizer

synthesize_evidence(evidence_list, method) -> List[Evidence]
filter_by_confidence(evidence_list, threshold) -> List[Evidence]
detect_contradictions(evidence_list) -> List[Dict]
estimate_overall_confidence(evidence_list) -> float
rank_by_reliability(evidence_list) -> List[Evidence]

Examples

See docs/knowledge_graph/examples/ for complete examples:

09_multi_hop_qa.py - Multi-hop question answering
10_logical_inference.py - Logical inference over knowledge
11_evidence_reasoning.py - Evidence-based reasoning
