AIECS Knowledge Graph User Guide

Introduction

Welcome to the AIECS Knowledge Graph User Guide! This guide will help you get started with building, querying, and reasoning over knowledge graphs in your AI applications.

What is a Knowledge Graph?

A knowledge graph is a structured representation of knowledge that captures:

Entities: Things in your domain (people, companies, products, concepts)
Relations: Connections between entities (works_for, located_in, knows)
Properties: Attributes of entities and relations (name, age, start_date)
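
To make these three building blocks concrete, here is a minimal sketch in plain Python. It does not use the AIECS classes; `Node` and `Edge` are hypothetical stand-ins chosen only to show how entities, relations, and properties fit together.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """An entity: a typed thing with attributes (hypothetical stand-in)."""
    id: str
    type: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A relation: a typed, directed connection between two entities."""
    type: str
    source_id: str
    target_id: str
    properties: dict[str, Any] = field(default_factory=dict)


nodes = {
    "alice": Node("alice", "Person", {"name": "Alice", "age": 30}),
    "tech_corp": Node("tech_corp", "Company", {"name": "Tech Corp"}),
}
edges = [Edge("works_for", "alice", "tech_corp", {"start_date": "2020-01-01"})]

# "Who does Alice work for?" becomes a lookup over edges, not a text search.
employers = [
    nodes[e.target_id].properties["name"]
    for e in edges
    if e.source_id == "alice" and e.type == "works_for"
]
print(employers)  # ['Tech Corp']
```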

Knowledge graphs enable:

Structured Knowledge Storage: Organize information in a queryable format
Multi-Hop Reasoning: Answer complex questions by traversing relationships
Knowledge Fusion: Merge information from multiple sources
Semantic Search: Find relevant information using meaning, not just keywords

Why Use AIECS Knowledge Graph?

Self-Contained: No external graph database required
Multiple Backends: InMemory, SQLite, and PostgreSQL; choose what fits your needs
Easy to Use: Simple API for common operations
Powerful: Advanced features like reasoning, fusion, and optimization
Extensible: Add custom storage backends easily

Quick Start

Installation

AIECS Knowledge Graph is included with AIECS. Install the optional dependencies for specific backends:

# For SQLite support (included by default)
pip install aiecs

# For PostgreSQL support (quotes keep shells such as zsh from expanding the brackets)
pip install "aiecs[postgres]"

# For all features
pip install "aiecs[all]"

Your First Knowledge Graph

Let’s create a simple knowledge graph about people and companies:

import asyncio
from aiecs.domain.knowledge_graph.models.entity import Entity
from aiecs.domain.knowledge_graph.models.relation import Relation
from aiecs.infrastructure.graph_storage.in_memory import InMemoryGraphStore

async def main():
    # 1. Initialize storage
    store = InMemoryGraphStore()
    await store.initialize()

    # 2. Create entities
    alice = Entity(
        id="alice",
        entity_type="Person",
        properties={"name": "Alice Smith", "age": 30, "role": "Engineer"}
    )

    bob = Entity(
        id="bob",
        entity_type="Person",
        properties={"name": "Bob Jones", "age": 25, "role": "Designer"}
    )

    tech_corp = Entity(
        id="tech_corp",
        entity_type="Company",
        properties={"name": "Tech Corp", "industry": "Technology"}
    )

    # 3. Add entities to graph
    await store.add_entity(alice)
    await store.add_entity(bob)
    await store.add_entity(tech_corp)

    # 4. Create relations
    alice_works = Relation(
        id="rel_1",
        relation_type="WORKS_FOR",
        source_id="alice",
        target_id="tech_corp",
        properties={"start_date": "2020-01-01"}
    )

    bob_works = Relation(
        id="rel_2",
        relation_type="WORKS_FOR",
        source_id="bob",
        target_id="tech_corp",
        properties={"start_date": "2021-06-01"}
    )

    alice_knows_bob = Relation(
        id="rel_3",
        relation_type="KNOWS",
        source_id="alice",
        target_id="bob"
    )

    # 5. Add relations to graph
    await store.add_relation(alice_works)
    await store.add_relation(bob_works)
    await store.add_relation(alice_knows_bob)

    # 6. Query the graph
    # Get Alice's neighbors
    neighbors = await store.get_neighbors("alice", direction="outgoing")
    print(f"Alice is connected to: {[n.properties['name'] for n in neighbors]}")

    # Find paths from Alice
    paths = await store.traverse("alice", max_depth=2)
    print(f"Found {len(paths)} paths from Alice")

    # 7. Cleanup
    await store.close()

# Run
asyncio.run(main())

Output:

Alice is connected to: ['Bob Jones', 'Tech Corp']
Found 3 paths from Alice

Congratulations! You’ve created your first knowledge graph.
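
If you are wondering where the numbers in the output come from, the following sketch reproduces the two queries in plain Python, without AIECS. It assumes that a traversal follows edge direction and counts simple paths of one to max_depth hops; the real `traverse` may define paths differently. Under that assumption, the three relations above yield exactly three paths from Alice, which agrees with the sample output.

```python
from collections import defaultdict

# Same three relations as the example: (source, relation_type, target).
edges = [
    ("alice", "WORKS_FOR", "tech_corp"),
    ("bob", "WORKS_FOR", "tech_corp"),
    ("alice", "KNOWS", "bob"),
]

outgoing = defaultdict(list)
for source, _rel, target in edges:
    outgoing[source].append(target)


def outgoing_paths(start, max_depth):
    """Enumerate simple paths of 1..max_depth hops that follow edge direction."""
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) > 1:
            paths.append(path)
        if len(path) - 1 < max_depth:
            for nxt in outgoing[path[-1]]:
                if nxt not in path:  # skip cycles
                    stack.append(path + [nxt])
    return paths


neighbors = sorted(outgoing["alice"])
paths = outgoing_paths("alice", max_depth=2)
print(neighbors)   # ['bob', 'tech_corp']
print(len(paths))  # 3
```

The three paths are alice to tech_corp, alice to bob, and alice to bob to tech_corp.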

Core Concepts

Entities

Entities represent nodes in your knowledge graph. Each entity has:

ID: Unique identifier
Type: Category (Person, Company, Product, etc.)
Properties: Key-value attributes
Metadata: Optional metadata (source, confidence, timestamps)

entity = Entity(
    id="unique_id",
    entity_type="Person",
    properties={
        "name": "Alice",
        "age": 30,
        "email": "alice@example.com"
    },
    metadata={
        "source": "document_1",
        "confidence": 0.95
    }
)

Relations

Relations represent edges connecting entities. Each relation has:

ID: Unique identifier
Type: Relationship type (WORKS_FOR, KNOWS, LOCATED_IN, etc.)
Source: Starting entity ID
Target: Ending entity ID
Properties: Relationship attributes
Metadata: Optional metadata

relation = Relation(
    id="rel_id",
    relation_type="WORKS_FOR",
    source_id="person_1",
    target_id="company_1",
    properties={
        "role": "Engineer",
        "start_date": "2020-01-01"
    }
)

Paths

Paths represent sequences of entities connected by relations:

from aiecs.domain.knowledge_graph.models.path import Path

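# Assumes alice, tech_corp, project, works_for_relation, and
# manages_relation have already been created
path = Path(
    entities=[alice, tech_corp, project],
    relations=[works_for_relation, manages_relation],
    score=0.85
)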

print(f"Path length: {path.length()} hops")
print(f"Entities: {path.get_entity_ids()}")
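
A path only makes sense when each relation links one entity to the next. The sketch below checks that invariant in plain Python. `Hop` and `is_connected_path` are hypothetical helpers for illustration and are not part of AIECS, and the MANAGES relation from the company to the project is assumed from the example above.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    """One relation in a path (hypothetical stand-in for a Relation)."""
    relation_type: str
    source_id: str
    target_id: str


def is_connected_path(entity_ids, hops):
    """True if hop i links entity i to entity i+1, in that direction."""
    if len(hops) != len(entity_ids) - 1:
        return False
    return all(
        hop.source_id == a and hop.target_id == b
        for hop, a, b in zip(hops, entity_ids, entity_ids[1:])
    )


entity_ids = ["alice", "tech_corp", "project"]
hops = [
    Hop("WORKS_FOR", "alice", "tech_corp"),
    Hop("MANAGES", "tech_corp", "project"),
]
print(is_connected_path(entity_ids, hops))  # True
print(len(hops))                            # 2 hops for 3 entities
```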

Storage Backends

AIECS provides three built-in storage backends:

InMemoryGraphStore

Best for: Development, testing, small graphs (< 100K nodes)

from aiecs.infrastructure.graph_storage.in_memory import InMemoryGraphStore

store = InMemoryGraphStore()
await store.initialize()

Pros: Very fast, no setup required
Cons: Data lost when process ends, limited by RAM

SQLiteGraphStore

Best for: Production apps, persistent storage, medium graphs (< 1M nodes)

from aiecs.infrastructure.graph_storage.sqlite import SQLiteGraphStore

store = SQLiteGraphStore(db_path="knowledge.db")
await store.initialize()

Pros: Persistent, no server required, optimized queries
Cons: Single-process only, slower than in-memory

PostgreSQLGraphStore

Best for: Large-scale production, multi-user apps, huge graphs (10M+ nodes)

from aiecs.infrastructure.graph_storage.postgresql import PostgreSQLGraphStore

store = PostgreSQLGraphStore(
    connection_string="postgresql://user:pass@localhost/db"
)
await store.initialize()

Pros: Scalable, concurrent access, pgvector support
Cons: Requires PostgreSQL server

Common Tasks

Task 1: Building a Graph from Text

Extract entities and relations from unstructured text:

from aiecs.tools.knowledge_graph import KnowledgeGraphBuilderTool

# Initialize tool
builder = KnowledgeGraphBuilderTool()
await builder._initialize()

# Extract from text
text = """
Alice Smith is a software engineer at Tech Corp in San Francisco.
She has been working there since 2020 and leads the AI team.
Bob Jones, a designer, also works at Tech Corp.
"""

result = await builder.run(
    op="kg_builder",
    action="build_from_text",
    text=text,
    entity_types=["Person", "Company", "Location"]
)

print(f"Extracted {result['entities_added']} entities")
print(f"Extracted {result['relations_added']} relations")

Task 2: Importing CSV Data

Import structured data from CSV files:

from aiecs.application.knowledge_graph.builder.schema_mapping import (
    SchemaMapping,
    EntityMapping,
    RelationMapping
)
from aiecs.application.knowledge_graph.builder.structured_pipeline import (
    StructuredDataPipeline
)

# Define schema mapping
mapping = SchemaMapping(
    entity_mappings=[
        EntityMapping(
            source_columns=["person_id", "name", "age"],
            entity_type="Person",
            property_mapping={
                "id": "person_id",
                "name": "name",
                "age": "age"
            },
            id_column="person_id"
        ),
        EntityMapping(
            source_columns=["company_id", "company_name"],
            entity_type="Company",
            property_mapping={
                "id": "company_id",
                "name": "company_name"
            },
            id_column="company_id"
        )
    ],
    relation_mappings=[
        RelationMapping(
            source_id_column="person_id",
            target_id_column="company_id",
            relation_type="WORKS_FOR",
            property_mapping={
                "role": "role",
                "start_date": "start_date"
            }
        )
    ]
)

# Import CSV
pipeline = StructuredDataPipeline(mapping=mapping, graph_store=store)
result = await pipeline.import_from_csv("employees.csv")

print(f"Imported {result.entities_added} entities")
print(f"Imported {result.relations_added} relations")
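
It helps to see what kind of file this mapping expects. The sketch below uses only the standard library and an invented two-row `employees.csv` whose column names match the mapping above. It shows the row-to-graph logic in miniature and is not the pipeline's actual implementation.

```python
import csv
import io

# Hypothetical contents of employees.csv; column names follow the mapping above.
CSV_TEXT = """person_id,name,age,company_id,company_name,role,start_date
p1,Alice Smith,30,c1,Tech Corp,Engineer,2020-01-01
p2,Bob Jones,25,c1,Tech Corp,Designer,2021-06-01
"""

entities = {}   # keyed by ID so a company repeated across rows is stored once
relations = []

for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    entities[row["person_id"]] = {
        "type": "Person",
        "name": row["name"],
        "age": int(row["age"]),
    }
    entities[row["company_id"]] = {"type": "Company", "name": row["company_name"]}
    relations.append({
        "type": "WORKS_FOR",
        "source_id": row["person_id"],
        "target_id": row["company_id"],
        "role": row["role"],
        "start_date": row["start_date"],
    })

print(len(entities), len(relations))  # 3 2
```

Each row produces one Person, one Company, and one WORKS_FOR relation; keying entities by ID means Tech Corp is stored once even though it appears in both rows.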

Task 3: Searching the Graph

Perform different types of searches:

from aiecs.tools.knowledge_graph import GraphSearchTool

search_tool = GraphSearchTool()
await search_tool._initialize()

# Vector search (semantic similarity)
result = await search_tool.run(
    op="graph_search",
    mode="vector",
    query="machine learning experts",
    top_k=10
)

# Graph traversal search
result = await search_tool.run(
    op="graph_search",
    mode="graph",
    start_entity_id="alice",
    max_depth=3,
    relation_types=["WORKS_FOR", "KNOWS"]
)

# Hybrid search (combines vector + graph)
result = await search_tool.run(
    op="graph_search",
    mode="hybrid",
    query="senior engineers in San Francisco",
    top_k=10,
    enable_reranking=True,
    rerank_strategy="hybrid"
)
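
For intuition about the vector mode, the sketch below ranks entities by cosine similarity between a query embedding and entity embeddings. It is a generic illustration in plain Python; the embeddings are made-up three-dimensional toys, and nothing here is claimed about how GraphSearchTool computes or stores its vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, entity_vecs, k):
    """Return the IDs of the k entities most similar to the query."""
    scored = [(cosine(query_vec, vec), eid) for eid, vec in entity_vecs.items()]
    scored.sort(reverse=True)
    return [eid for _score, eid in scored[:k]]


# Toy embeddings, invented for illustration.
entity_vecs = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.1, 0.9, 0.0],
    "tech_corp": [0.5, 0.5, 0.7],
}
query = [1.0, 0.0, 0.0]
print(top_k(query, entity_vecs, k=2))  # ['alice', 'tech_corp']
```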

Task 4: Multi-Hop Reasoning

Answer complex questions by traversing the graph:

from aiecs.tools.knowledge_graph import GraphReasoningTool

reasoning_tool = GraphReasoningTool()
await reasoning_tool._initialize()

# Multi-hop question answering
result = await reasoning_tool.run(
    op="graph_reasoning",
    mode="multi_hop",
    query="How is Alice connected to Project X?",
    start_entity_id="alice",
    end_entity_id="project_x",
    max_hops=5
)

print(f"Answer: {result['answer']}")
print(f"Reasoning steps: {result['reasoning_steps']}")
print(f"Evidence paths: {len(result['paths'])}")
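
The core of a "how is A connected to B" question is a bounded path search. Here is a minimal breadth-first sketch in plain Python. The edges, the OWNS relation, and `shortest_path` are invented for illustration, and the reasoning tool may use a different search strategy.

```python
from collections import deque

# Hypothetical directed edges; "project_x" and "OWNS" are made up for the sketch.
edges = [
    ("alice", "WORKS_FOR", "tech_corp"),
    ("alice", "KNOWS", "bob"),
    ("tech_corp", "OWNS", "project_x"),
]


def shortest_path(start, goal, max_hops):
    """Breadth-first search; returns [(source, relation, target), ...] or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        if len(steps) == max_hops:
            continue  # hop budget spent on this branch
        for source, rel, target in edges:
            if source == node and target not in seen:
                seen.add(target)
                queue.append((target, steps + [(source, rel, target)]))
    return None


steps = shortest_path("alice", "project_x", max_hops=5)
print(" ; ".join(f"{s} -{r}-> {t}" for s, r, t in steps))
# alice -WORKS_FOR-> tech_corp ; tech_corp -OWNS-> project_x
```

Lowering max_hops to 1 makes the same call return None, which is why the hop limit matters for both recall and cost.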

Task 5: Knowledge Fusion

Merge duplicate entities from multiple sources:

from aiecs.application.knowledge_graph.fusion import KnowledgeFusion

# After importing data from multiple sources
fusion = KnowledgeFusion(
    graph_store=store,
    similarity_threshold=0.85,
    conflict_resolution_strategy="most_complete"
)

# Fuse entities
stats = await fusion.fuse_cross_document_entities(
    entity_types=["Person", "Company"]
)

print(f"Analyzed {stats['entities_analyzed']} entities")
print(f"Merged {stats['entities_merged']} duplicates")
print(f"Resolved {stats['conflicts_resolved']} conflicts")
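
To illustrate what a similarity threshold and a "most complete" strategy could mean, the sketch below merges records whose names are at least 85% similar, keeping the record with more properties and filling its gaps from the other. It uses `difflib` from the standard library as the similarity measure; that choice and the merge rule are assumptions for illustration, not a description of KnowledgeFusion's internals.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_most_complete(a, b):
    """Keep the record with more properties; fill its gaps from the other."""
    primary, secondary = (a, b) if len(a) >= len(b) else (b, a)
    return {**secondary, **primary}


records = [
    {"name": "Alice Smith", "email": "alice@example.com"},
    {"name": "Alice Smyth", "age": 30, "role": "Engineer"},
    {"name": "Bob Jones"},
]

THRESHOLD = 0.85
merged = []
for record in records:
    for i, existing in enumerate(merged):
        if similarity(record["name"], existing["name"]) >= THRESHOLD:
            merged[i] = merge_most_complete(existing, record)
            break
    else:
        merged.append(record)  # no close match found: keep as a new entity

print(len(merged))  # 2
```

The two Alice records merge into one entity with four properties, and Bob stays separate. A threshold that is too low would start merging distinct people, so it is worth checking the merge results on a sample before fusing a large graph.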

Schema Management

Define and validate your knowledge graph schema:

from aiecs.domain.knowledge_graph.schema import (
    SchemaManager,
    EntityType,
    RelationType,
    PropertySchema,
    PropertyType
)

# Create schema manager
manager = SchemaManager()

# Define entity type
person_type = EntityType(
    name="Person",
    description="A person entity",
    properties={
        "name": PropertySchema(
            name="name",
            property_type=PropertyType.STRING,
            required=True
        ),
        "age": PropertySchema(
            name="age",
            property_type=PropertyType.INTEGER,
            min_value=0,
            max_value=150
        ),
        "email": PropertySchema(
            name="email",
            property_type=PropertyType.STRING,
            required=False
        )
    }
)
manager.create_entity_type(person_type)

# Define relation type
works_for_type = RelationType(
    name="WORKS_FOR",
    description="Employment relationship",
    source_entity_types=["Person"],
    target_entity_types=["Company"],
    properties={
        "role": PropertySchema(
            name="role",
            property_type=PropertyType.STRING
        ),
        "start_date": PropertySchema(
            name="start_date",
            property_type=PropertyType.DATE
        )
    }
)
manager.create_relation_type(works_for_type)

# Validate entities
is_valid = manager.validate_entity("Person", {
    "name": "Alice",
    "age": 30,
    "email": "alice@example.com"
})
print(f"Entity valid: {is_valid}")
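
The kinds of checks a schema like this implies (required properties, types, numeric ranges) can be sketched in a few lines of plain Python. `Rule` and `validate` are hypothetical names and do not mirror SchemaManager's implementation or return type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Constraints for one property (hypothetical stand-in for PropertySchema)."""
    py_type: type
    required: bool = False
    min_value: Optional[float] = None
    max_value: Optional[float] = None


PERSON_RULES = {
    "name": Rule(str, required=True),
    "age": Rule(int, min_value=0, max_value=150),
    "email": Rule(str),
}


def validate(properties, rules):
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    for key, rule in rules.items():
        if key not in properties:
            if rule.required:
                errors.append(f"missing required property: {key}")
            continue
        value = properties[key]
        if not isinstance(value, rule.py_type):
            errors.append(f"{key}: expected {rule.py_type.__name__}")
            continue
        if rule.min_value is not None and value < rule.min_value:
            errors.append(f"{key}: below minimum {rule.min_value}")
        if rule.max_value is not None and value > rule.max_value:
            errors.append(f"{key}: above maximum {rule.max_value}")
    return errors


print(validate({"name": "Alice", "age": 30}, PERSON_RULES))  # []
print(validate({"age": 200}, PERSON_RULES))
# ['missing required property: name', 'age: above maximum 150']
```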

Best Practices

1. Choose the Right Storage Backend

Development/Testing: Use InMemoryGraphStore for fast iteration
Small to Medium Production Apps: Use SQLiteGraphStore for persistence without a server
Large Production Apps: Use PostgreSQLGraphStore for scale and concurrency
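
As a rough decision aid, the guidance above can be written as a small function. This is only a sketch: `suggest_backend` is a hypothetical helper, the node-count cut-offs come from the "Best for" notes in the Storage Backends section, and sending graphs between 1M and 10M nodes to PostgreSQL is an assumption, since this guide does not cover that range explicitly.

```python
def suggest_backend(expected_nodes, needs_persistence=False, concurrent_writers=False):
    """Map this guide's rough size guidance onto a backend class name."""
    if concurrent_writers or expected_nodes >= 1_000_000:
        return "PostgreSQLGraphStore"
    if needs_persistence or expected_nodes >= 100_000:
        return "SQLiteGraphStore"
    return "InMemoryGraphStore"


print(suggest_backend(5_000))                            # InMemoryGraphStore
print(suggest_backend(5_000, needs_persistence=True))    # SQLiteGraphStore
print(suggest_backend(20_000_000))                       # PostgreSQLGraphStore
```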

2. Define Your Schema

Always define entity and relation types before building your graph:

# Define schema first
manager.create_entity_type(person_type)
manager.create_relation_type(works_for_type)

# Then build graph
await store.add_entity(entity)

3. Use Meaningful IDs

Use descriptive, stable IDs for entities:

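# Good: Stable, meaningful IDs
entity = Entity(id="person_alice_smith", entity_type="Person", properties={"name": "Alice Smith"})
entity = Entity(id="company_tech_corp", entity_type="Company", properties={"name": "Tech Corp"})

# Avoid: Random UUIDs unless necessary
entity = Entity(id="a1b2c3d4-...", entity_type="Person", properties={"name": "Alice Smith"})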
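
One simple way to get IDs of this shape is to derive them from the entity type and name. The helper below is a hypothetical sketch using only the standard library; it is not an AIECS function.

```python
import re


def make_entity_id(entity_type, name):
    """Build a stable, readable ID such as 'person_alice_smith'."""
    # Lowercase, then collapse every run of non-alphanumerics into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", f"{entity_type} {name}".lower()).strip("_")


print(make_entity_id("Person", "Alice Smith"))  # person_alice_smith
print(make_entity_id("Company", "Tech Corp."))  # company_tech_corp
```

Name-derived IDs collide when two different entities share a name, so in practice you may want to append a disambiguating field such as an employee number.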

4. Add Metadata

Include source and confidence information:

entity = Entity(
    id="person_1",
    entity_type="Person",
    properties={"name": "Alice"},
    metadata={
        "source": "document_1",
        "confidence": 0.95,
        "extracted_at": "2025-11-15T10:00:00Z"
    }
)
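
If you attach metadata in many places, a small helper keeps the fields consistent. The sketch below is hypothetical (`make_metadata` is not part of AIECS); it validates the confidence range, which is assumed here to be 0 to 1, and stamps the current UTC time in the same format as the example above.

```python
from datetime import datetime, timezone


def make_metadata(source, confidence):
    """Return a metadata dict with a UTC timestamp like 2025-11-15T10:00:00Z."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return {
        "source": source,
        "confidence": confidence,
        "extracted_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


metadata = make_metadata("document_1", 0.95)
print(metadata["source"], metadata["confidence"])  # document_1 0.95
```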

5. Use Knowledge Fusion

When importing from multiple sources, use fusion to merge duplicates:

# Import from multiple sources
await pipeline.import_from_csv("source1.csv")
await pipeline.import_from_csv("source2.csv")

# Fuse duplicates
fusion = KnowledgeFusion(graph_store=store, similarity_threshold=0.85)
await fusion.fuse_cross_document_entities()

6. Enable Reranking for Better Results

Use reranking to improve search quality:

result = await search_tool.run(
    op="graph_search",
    mode="hybrid",
    query="machine learning experts",
    enable_reranking=True,
    rerank_strategy="hybrid"  # Combines multiple signals
)
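
"Combines multiple signals" can be pictured as a weighted sum. The sketch below reranks candidates by blending a text-similarity score with a graph-proximity score derived from hop distance. The weights, field names, and proximity formula are all assumptions for illustration; the actual "hybrid" strategy may combine different signals in a different way.

```python
def rerank(candidates, vector_weight=0.7):
    """Sort by a weighted sum of a similarity score and a graph-proximity score."""
    def combined(candidate):
        proximity = 1.0 / (1 + candidate["hops"])  # closer entities score higher
        return (vector_weight * candidate["similarity"]
                + (1 - vector_weight) * proximity)

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical candidates: carol matches the text slightly better,
# but alice is much closer in the graph.
candidates = [
    {"id": "carol", "similarity": 0.80, "hops": 3},
    {"id": "alice", "similarity": 0.75, "hops": 0},
]
print([c["id"] for c in rerank(candidates)])  # ['alice', 'carol']
```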

7. Optimize Performance

Use schema caching for repeated queries
Batch operations when possible
Choose appropriate max_depth for traversals
Use filters to reduce result sets

Next Steps

Tutorials: See the End-to-End Tutorial and the Multi-Hop Reasoning Tutorial for step-by-step guides
Examples: Check the CSV-to-Graph Tutorial and the JSON-to-Graph Tutorial for working code
API Reference: Read API_REFERENCE.md for detailed API docs
Performance: See PERFORMANCE_GUIDE.md for optimization tips
Troubleshooting: Check TROUBLESHOOTING.md for common issues

Getting Help

Documentation: Browse the docs/knowledge_graph/ directory
Examples: Work through the CSV-to-Graph Tutorial and the JSON-to-Graph Tutorial
Issues: Report bugs or request features on GitHub

Happy knowledge graphing! 🚀