Unified Intelligence Demo¶
This demo showcases HyperBinder's ability to traverse semantic (fuzzy) and symbolic (exact) data seamlessly in a single query chain, something that traditionally requires orchestrating multiple systems.
The Knowledge Design Layer¶
Modern AI stacks have a gap. LLMs need structured knowledge, but data lives in fragmented systems. HyperBinder fills this gap with a Knowledge Design Layer:
```mermaid
flowchart TB
    A["LLMs · Agents · RAG"]
    K["<b>KNOWLEDGE DESIGN LAYER</b><br/><i>Compounds · Intersections · Queries</i>"]
    D["Your Data Sources"]
    A <-->|"Declarative queries"| K
    K <-->|"Ingestion"| D
    style K fill:#3182ce,stroke:#63b3ed,stroke-width:2px,color:#fff
```
Without HyperBinder: 200+ lines of orchestration across 3+ systems per query type.
With HyperBinder: ~10 lines, one system, declarative.
The Scenario¶
A product manager asks:
"Who are our ML experts in Engineering, what projects are they on, and what's the budget allocation?"
Answering this simple question requires five chained operations:
- Semantic search on employee bios
- Exact filtering on department
- Cross-encoding join from employee IDs to expertise topics
- Semantic join from topics to projects
- Exact join from projects to budgets
Traditional Approach¶
Without HyperBinder, you'd need to orchestrate multiple systems:
```mermaid
flowchart TB
    ES["Elasticsearch"]
    APP1["App Code"]
    PG1["PostgreSQL"]
    APP2["App Code"]
    NEO["Neo4j"]
    APP3["App Code"]
    PG2["PostgreSQL"]
    APP4["App Code"]
    ES --> APP1 --> PG1 --> APP2 --> NEO --> APP3 --> PG2 --> APP4
    style ES fill:#c05621,stroke:#ed8936,color:#fff
    style PG1 fill:#276749,stroke:#48bb78,color:#fff
    style PG2 fill:#276749,stroke:#48bb78,color:#fff
    style NEO fill:#553c9a,stroke:#9f7aea,color:#fff
    style APP1 fill:#9b2c2c,stroke:#fc8181,color:#fff
    style APP2 fill:#9b2c2c,stroke:#fc8181,color:#fff
    style APP3 fill:#9b2c2c,stroke:#fc8181,color:#fff
    style APP4 fill:#9b2c2c,stroke:#fc8181,color:#fff
```
Problems with this approach:
- Multiple systems to maintain (Elasticsearch, PostgreSQL, Neo4j)
- Complex orchestration code for each query type
- No type safety across system boundaries
- Difficult to modify query logic
- Performance overhead from multiple round-trips
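To make the glue-code burden concrete, here is a hedged sketch of just the first two hops in plain Python. The Elasticsearch and PostgreSQL clients are stubbed out as hardcoded stand-ins; all function names here are hypothetical, not part of any real client API:

```python
# Sketch of the orchestration code being replaced (first 2 of 5 hops).
# Each function stands in for a round-trip to a separate system.

def search_bios(query):
    # Stand-in for an Elasticsearch round-trip: pretend the vector
    # search ranked these employee IDs for "machine learning".
    return ["EMP001", "EMP004", "EMP003"]

def filter_department(ids, dept):
    # Stand-in for a PostgreSQL round-trip: IDs are marshalled by hand
    # between systems, with no shared schema or type checking.
    departments = {"EMP001": "Engineering", "EMP003": "Research",
                   "EMP004": "Engineering"}
    return [i for i in ids if departments.get(i) == dept]

hits = search_bios("machine learning")         # system 1
hits = filter_department(hits, "Engineering")  # system 2
print(hits)
# ['EMP001', 'EMP004']; three more hops (links, projects, budgets)
# still to go, each with its own client, retries, and error handling.
```

Multiply this by every query shape your application needs, and the 200+ line estimate is conservative.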
HyperBinder Approach¶
```python
from hybi import HyperBinder
import pandas as pd

hb = HyperBinder()

# === DATA SETUP ===

# Employees (semantic search on bios)
employees = pd.DataFrame([
    {"employee_id": "EMP001", "name": "Alice Chen",
     "bio": "Senior ML engineer specializing in NLP", "department": "Engineering"},
    {"employee_id": "EMP002", "name": "Bob Smith",
     "bio": "Backend developer with distributed systems expertise", "department": "Engineering"},
    {"employee_id": "EMP003", "name": "Carol White",
     "bio": "ML researcher focusing on computer vision", "department": "Research"},
    {"employee_id": "EMP004", "name": "David Lee",
     "bio": "Data scientist with machine learning background", "department": "Engineering"},
])

# Expertise topics (semantic encoding)
expertise = pd.DataFrame([
    {"topic": "natural language processing", "domain": "AI", "maturity": "Production"},
    {"topic": "computer vision models", "domain": "AI", "maturity": "Research"},
    {"topic": "distributed systems design", "domain": "Infrastructure", "maturity": "Production"},
    {"topic": "deep learning frameworks", "domain": "AI", "maturity": "Production"},
])

# Projects (joined via expertise)
projects = pd.DataFrame([
    {"project_id": "PROJ001", "name": "ChatBot v2", "focus_area": "natural language processing"},
    {"project_id": "PROJ002", "name": "Visual Search", "focus_area": "computer vision models"},
    {"project_id": "PROJ003", "name": "Model Training Platform", "focus_area": "deep learning frameworks"},
])

# Budgets (exact lookups)
budgets = pd.DataFrame([
    {"project_id": "PROJ001", "allocated": 500000, "spent": 320000},
    {"project_id": "PROJ002", "allocated": 750000, "spent": 180000},
    {"project_id": "PROJ003", "allocated": 1000000, "spent": 50000},
])

# Link mappings (employee IDs → expertise topics)
employee_expertise = pd.DataFrame([
    {"employee_id": "EMP001", "topic": "natural language processing"},
    {"employee_id": "EMP001", "topic": "deep learning frameworks"},
    {"employee_id": "EMP003", "topic": "computer vision models"},
    {"employee_id": "EMP004", "topic": "deep learning frameworks"},
    {"employee_id": "EMP004", "topic": "natural language processing"},
])

# Ingest data
hb.ingest(employees, "employees")
hb.ingest(expertise, "expertise")
hb.ingest(projects, "projects")
hb.ingest(budgets, "budgets")

# === DECLARE INTERSECTIONS ===

# Cross-encoding: EXACT employee_id → SEMANTIC topic
ix = hb.intersect_flexible("employees.employee_id", "expertise.topic")
hb.populate_links(ix, employee_expertise, "employee_id", "topic")

# Same-encoding joins
hb.intersect("expertise.topic", "projects.focus_area")
hb.intersect("projects.project_id", "budgets.project_id")

# === THE POWER QUERY ===
results = (
    hb.query("employees")
    .search("machine learning")        # SEMANTIC: fuzzy bio search
    .filter(department="Engineering")  # EXACT: department filter
    .join("expertise")                 # CROSS-ENCODING: via links!
    .join("projects")                  # SEMANTIC: topic → focus area
    .join("budgets")                   # EXACT: project_id match
)

# That's it. ~10 lines instead of 200+.
for r in results:
    emp = r["employees"]
    exp = r["expertise"]
    proj = r["projects"]
    budget = r["budgets"]
    print(f"{emp['name']} → {exp['topic']} → {proj['name']} (${budget['allocated']:,})")
```
Output:
```
Alice Chen → natural language processing → ChatBot v2 ($500,000)
Alice Chen → deep learning frameworks → Model Training Platform ($1,000,000)
David Lee → natural language processing → ChatBot v2 ($500,000)
David Lee → deep learning frameworks → Model Training Platform ($1,000,000)
```
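As a mental model (not HyperBinder's actual execution plan), the chain amounts to a search, a filter, and three joins over the link and lookup tables. This illustrative plain-Python re-derivation reproduces the same four rows from the demo data, with the semantic step stubbed as a keyword match on the bio:

```python
# Illustrative re-derivation of the query chain in plain Python.
employees = [
    {"employee_id": "EMP001", "name": "Alice Chen",
     "bio": "Senior ML engineer specializing in NLP", "department": "Engineering"},
    {"employee_id": "EMP002", "name": "Bob Smith",
     "bio": "Backend developer with distributed systems expertise", "department": "Engineering"},
    {"employee_id": "EMP003", "name": "Carol White",
     "bio": "ML researcher focusing on computer vision", "department": "Research"},
    {"employee_id": "EMP004", "name": "David Lee",
     "bio": "Data scientist with machine learning background", "department": "Engineering"},
]
links = [("EMP001", "natural language processing"),
         ("EMP001", "deep learning frameworks"),
         ("EMP003", "computer vision models"),
         ("EMP004", "deep learning frameworks"),
         ("EMP004", "natural language processing")]
# topic -> (project name, allocated budget), folding the projects and budgets hops
topic_to_project = {"natural language processing": ("ChatBot v2", 500000),
                    "deep learning frameworks": ("Model Training Platform", 1000000),
                    "computer vision models": ("Visual Search", 750000)}

# .search("machine learning"): stubbed as a keyword match on the bio
hits = [e for e in employees
        if "ml" in e["bio"].lower().split() or "machine learning" in e["bio"].lower()]
# .filter(department="Engineering"): exact
hits = [e for e in hits if e["department"] == "Engineering"]
# .join("expertise") via the link table, then .join("projects").join("budgets")
rows = [(e["name"], topic, *topic_to_project[topic])
        for e in hits
        for emp_id, topic in links if emp_id == e["employee_id"]]
for name, topic, project, allocated in rows:
    print(f"{name} → {topic} → {project} (${allocated:,})")
```

The structure is identical to the declarative chain; HyperBinder's contribution is that the search uses embeddings rather than keywords and the joins are declared once, not hand-coded per query.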
What Makes This Unique¶
1. Unified Query Language¶
One syntax for semantic, exact, and graph operations:
```python
.search("query")        # Semantic similarity
.filter(field="value")  # Exact equality
.join("collection")     # Graph traversal
```
2. Cross-Encoding Joins¶
Connect fields with different encoding types via explicit links:
```python
# EXACT employee IDs ↔ SEMANTIC topic descriptions
ix = hb.intersect_flexible("employees.employee_id", "expertise.topic")
hb.populate_links(ix, links_df, "employee_id", "topic")
```
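One way to picture what the link table provides (a hypothetical rendering, not HyperBinder internals): it is a many-to-many bridge that can be indexed from either side, so traversal works in both directions:

```python
from collections import defaultdict

# A few link rows from the demo: employee ID on the exact side,
# topic description on the semantic side.
links = [("EMP001", "natural language processing"),
         ("EMP001", "deep learning frameworks"),
         ("EMP003", "computer vision models")]

topics_of = defaultdict(list)     # index from the exact side
employees_of = defaultdict(list)  # index from the semantic side
for emp_id, topic in links:
    topics_of[emp_id].append(topic)
    employees_of[topic].append(emp_id)

print(topics_of["EMP001"])
# ['natural language processing', 'deep learning frameworks']
```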
3. Declarative, Not Procedural¶
Describe what you want, not how to orchestrate it.
4. Type-Safe¶
Schema validation catches errors at definition time, not runtime.
5. Composable¶
Each operation builds on the previous, creating a fluent chain.
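The fluent style can be pictured as a builder that records steps and applies them in order when the chain is consumed. A toy sketch of the pattern (illustrative only, not HyperBinder's implementation):

```python
class Chain:
    """Toy fluent query: each method records a step and returns self."""
    def __init__(self, rows):
        self.rows, self.steps = rows, []

    def filter(self, **kwargs):
        self.steps.append(lambda rs: [r for r in rs
                                      if all(r.get(k) == v for k, v in kwargs.items())])
        return self  # returning self is what makes the calls chainable

    def run(self):
        rs = self.rows
        for step in self.steps:  # apply steps in the order they were declared
            rs = step(rs)
        return rs

rows = [{"name": "Alice", "department": "Engineering"},
        {"name": "Carol", "department": "Research"}]
print(Chain(rows).filter(department="Engineering").run())
# [{'name': 'Alice', 'department': 'Engineering'}]
```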
Additional Capabilities¶
Analogical Reasoning¶
```python
# "Alice is our NLP expert. Who plays a similar role for computer vision?"
results = hb.analogy("Alice Chen", "NLP", "computer vision",
                     field="name", collection="employees")
```
Bidirectional Traversal¶
```python
# Start from budgets, trace back to employees
results = (
    hb.query("budgets")
    .filter(allocated__gt=500000)
    .join("projects")
    .join("expertise")
    .join("employees")
)
```
Prototype Search¶
```python
# Find employees similar to ANY of these examples
results = hb.search_prototype(
    examples=["Alice Chen", "David Lee"],
    field="name",
    collection="employees"
)
```
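Conceptually, prototype search scores each candidate against every example and keeps the best score, so matching ANY one example is enough. A toy version using difflib string similarity as a stand-in for embedding similarity (illustrative only; the function and threshold here are hypothetical):

```python
from difflib import SequenceMatcher

def prototype_search(examples, candidates, threshold=0.5):
    """Rank candidates by their best similarity to ANY example."""
    def score(candidate):
        return max(SequenceMatcher(None, candidate.lower(), ex.lower()).ratio()
                   for ex in examples)
    return sorted((c for c in candidates if score(c) >= threshold),
                  key=score, reverse=True)

names = ["Alice Chen", "Alicia Chang", "Bob Smith", "David Lee"]
print(prototype_search(["Alice Chen"], names))
```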
Compound Combinations: Model Any Domain¶
The real power is combining compounds to model any domain. Each compound is optimized for a specific pattern, and intersections wire them together.
| Compound | Optimized For |
|---|---|
| Catalog | Searchable records with weighted fields |
| RelationalTable | CRUD operations with primary keys |
| KnowledgeGraph | Entity-relation-entity facts |
| Hierarchy | Parent-child trees (org charts, taxonomies) |
| Document | Text chunks with metadata |
| Network | Graph relationships (social, citations) |
| TimeSeries | Time-ordered sequences |
Example: Healthcare System¶
```mermaid
flowchart LR
    subgraph row1 [" "]
        direction LR
        P["Patients<br/><i>Catalog</i>"]
        D["Diagnoses<br/><i>KnowledgeGraph</i>"]
        T["Treatments<br/><i>TimeSeries</i>"]
    end
    subgraph row2 [" "]
        direction LR
        R["Records<br/><i>RelationalTable</i>"]
        M["Medical Ontology<br/><i>Hierarchy</i>"]
        DR["Drug Database<br/><i>RelationalTable</i>"]
    end
    P --> D --> T
    P --> R
    D --> M
    T --> DR
```
```python
results = (
    hb.query("patients").search("diabetes symptoms")
    .join("diagnoses")    # Semantic: symptoms → condition
    .join("treatments")   # Identity: condition → treatment plan
    .join("drugs")        # Identity: drug_id → drug record
    .filter(has_contraindications=True)
)
```
Example: Legal Research Platform¶
```mermaid
flowchart LR
    subgraph row1 [" "]
        direction LR
        C["Cases<br/><i>Document</i>"]
        CT["Citations<br/><i>Network</i>"]
        S["Statutes<br/><i>Hierarchy</i>"]
    end
    subgraph row2 [" "]
        direction LR
        J["Judges<br/><i>Catalog</i>"]
        L["Legal Concepts<br/><i>KnowledgeGraph</i>"]
        JR["Jurisdictions<br/><i>Hierarchy</i>"]
    end
    C --> CT --> S
    C --> J
    CT --> L
    S --> JR
```
```python
results = (
    hb.query("cases").search("digital privacy")
    .join("citations")    # Network: case → cited cases
    .join("statutes")     # Identity: statute_id → statute
    .filter(amendment="4th")
    .join("concepts")     # Semantic: case text → legal concept
)
```
Example: Research Knowledge Base¶
```mermaid
flowchart LR
    subgraph row1 [" "]
        direction LR
        PA["Papers<br/><i>Document</i>"]
        AU["Authors<br/><i>Network</i>"]
        IN["Institutions<br/><i>Hierarchy</i>"]
    end
    subgraph row2 [" "]
        direction LR
        CO["Concepts<br/><i>KnowledgeGraph</i>"]
        GR["Grants<br/><i>RelationalTable</i>"]
        VE["Venues<br/><i>Catalog</i>"]
    end
    PA --> AU --> IN
    PA --> CO
    AU --> GR
    IN --> VE
```
```python
results = (
    hb.query("papers").search("transformer attention mechanism")
    .join("concepts")      # Semantic: abstract → research concept
    .join("authors")       # Network: paper → author collaborations
    .join("institutions")  # Identity: affiliation_id → institution
    .filter(name="Stanford")
    .join("grants")        # Flexible: author_id → grant_id
    .filter(agency="NSF")
)
```
The Pattern¶
- Choose compounds that match your data shapes
- Connect with intersections (strict for same-type, flexible for cross-type)
- Query declaratively across the entire graph
Each compound handles its specialty. Mix them freely—the intersections handle the glue.
Key Takeaways¶
- One System: Replace Elasticsearch + PostgreSQL + Neo4j with one unified interface
- Cross-Encoding: Connect any field types via explicit link mappings
- Declarative: Focus on the question, not the plumbing
- Composable: Build complex queries from simple, chainable operations
- Type-Safe: Catch errors early with schema validation
- Domain Flexible: Combine compounds to model any knowledge architecture
This is the power of neurosymbolic computing: the best of semantic and symbolic approaches, seamlessly integrated.