Unified Intelligence Demo

This demo showcases HyperBinder's unique ability to seamlessly traverse between semantic (fuzzy) and symbolic (exact) data in a single query chain—something that traditionally requires complex multi-system orchestration.

The Knowledge Design Layer

Modern AI stacks have a gap. LLMs need structured knowledge, but data lives in fragmented systems. HyperBinder fills this gap with a Knowledge Design Layer:

flowchart TB
    A["LLMs · Agents · RAG"]
    K["<b>KNOWLEDGE DESIGN LAYER</b><br/><i>Compounds · Intersections · Queries</i>"]
    D["Your Data Sources"]

    A <-->|"Declarative queries"| K
    K <-->|"Ingestion"| D

    style K fill:#3182ce,stroke:#63b3ed,stroke-width:2px,color:#fff

Without HyperBinder: 200+ lines of orchestration across 3+ systems per query type.

With HyperBinder: ~10 lines, one system, declarative.

The Scenario

A product manager asks:

"Who are our ML experts in Engineering, what projects are they on, and what's the budget allocation?"

This simple question requires traversing:

  1. Semantic search on employee bios
  2. Exact filtering on department
  3. Cross-encoding join from employee IDs to expertise topics
  4. Semantic join from topics to projects
  5. Exact join from projects to budgets

Traditional Approach

Without HyperBinder, you'd need to orchestrate multiple systems:

flowchart TB
    ES["Elasticsearch"]
    APP1["App Code"]
    PG1["PostgreSQL"]
    APP2["App Code"]
    NEO["Neo4j"]
    APP3["App Code"]
    PG2["PostgreSQL"]
    APP4["App Code"]

    ES --> APP1 --> PG1 --> APP2 --> NEO --> APP3 --> PG2 --> APP4

    style ES fill:#c05621,stroke:#ed8936,color:#fff
    style PG1 fill:#276749,stroke:#48bb78,color:#fff
    style PG2 fill:#276749,stroke:#48bb78,color:#fff
    style NEO fill:#553c9a,stroke:#9f7aea,color:#fff
    style APP1 fill:#9b2c2c,stroke:#fc8181,color:#fff
    style APP2 fill:#9b2c2c,stroke:#fc8181,color:#fff
    style APP3 fill:#9b2c2c,stroke:#fc8181,color:#fff
    style APP4 fill:#9b2c2c,stroke:#fc8181,color:#fff

Problems with this approach:

  • Multiple systems to maintain (Elasticsearch, PostgreSQL, Neo4j)
  • Complex orchestration code for each query type
  • No type safety across system boundaries
  • Difficult to modify query logic
  • Performance overhead from multiple round-trips
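To make the overhead concrete, here is a heavily compressed sketch of the glue code this pipeline implies. In-memory stubs stand in for the Elasticsearch, PostgreSQL, and Neo4j responses (all names and data shapes here are illustrative, not real client calls):

```python
# Stub payloads in the shape each system might return (illustrative only).
es_hits = [{"employee_id": "EMP001", "name": "Alice Chen"}]            # Elasticsearch: fuzzy bio search
pg_departments = {"EMP001": "Engineering"}                             # PostgreSQL: exact lookup
neo4j_topics = {"EMP001": ["natural language processing"]}             # Neo4j: ID -> expertise edges
pg_projects = {"natural language processing": ("ChatBot v2", 500000)}  # PostgreSQL: topic -> project/budget

# App-code hop 1: filter search hits via a separate department lookup.
engineers = [h for h in es_hits if pg_departments.get(h["employee_id"]) == "Engineering"]

# App-code hops 2-3: one graph traversal plus one SQL round-trip per employee,
# with manual result stitching at every system boundary.
results = []
for emp in engineers:
    for topic in neo4j_topics.get(emp["employee_id"], []):
        project, allocated = pg_projects[topic]
        results.append((emp["name"], topic, project, allocated))

print(results)
```

Even with one employee and stubbed responses, every boundary needs its own marshalling step; the real version adds connection handling, retries, and pagination per hop.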

HyperBinder Approach

from hybi import HyperBinder
import pandas as pd

hb = HyperBinder()

# === DATA SETUP ===

# Employees (semantic search on bios)
employees = pd.DataFrame([
    {"employee_id": "EMP001", "name": "Alice Chen",
     "bio": "Senior ML engineer specializing in NLP", "department": "Engineering"},
    {"employee_id": "EMP002", "name": "Bob Smith",
     "bio": "Backend developer with distributed systems expertise", "department": "Engineering"},
    {"employee_id": "EMP003", "name": "Carol White",
     "bio": "ML researcher focusing on computer vision", "department": "Research"},
    {"employee_id": "EMP004", "name": "David Lee",
     "bio": "Data scientist with machine learning background", "department": "Engineering"},
])

# Expertise topics (semantic encoding)
expertise = pd.DataFrame([
    {"topic": "natural language processing", "domain": "AI", "maturity": "Production"},
    {"topic": "computer vision models", "domain": "AI", "maturity": "Research"},
    {"topic": "distributed systems design", "domain": "Infrastructure", "maturity": "Production"},
    {"topic": "deep learning frameworks", "domain": "AI", "maturity": "Production"},
])

# Projects (joined via expertise)
projects = pd.DataFrame([
    {"project_id": "PROJ001", "name": "ChatBot v2", "focus_area": "natural language processing"},
    {"project_id": "PROJ002", "name": "Visual Search", "focus_area": "computer vision models"},
    {"project_id": "PROJ003", "name": "Model Training Platform", "focus_area": "deep learning frameworks"},
])

# Budgets (exact lookups)
budgets = pd.DataFrame([
    {"project_id": "PROJ001", "allocated": 500000, "spent": 320000},
    {"project_id": "PROJ002", "allocated": 750000, "spent": 180000},
    {"project_id": "PROJ003", "allocated": 1000000, "spent": 50000},
])

# Link mappings (employee IDs → expertise topics)
employee_expertise = pd.DataFrame([
    {"employee_id": "EMP001", "topic": "natural language processing"},
    {"employee_id": "EMP001", "topic": "deep learning frameworks"},
    {"employee_id": "EMP003", "topic": "computer vision models"},
    {"employee_id": "EMP004", "topic": "deep learning frameworks"},
    {"employee_id": "EMP004", "topic": "natural language processing"},
])

# Ingest data
hb.ingest(employees, "employees")
hb.ingest(expertise, "expertise")
hb.ingest(projects, "projects")
hb.ingest(budgets, "budgets")

# === DECLARE INTERSECTIONS ===

# Cross-encoding: EXACT employee_id → SEMANTIC topic
ix = hb.intersect_flexible("employees.employee_id", "expertise.topic")
hb.populate_links(ix, employee_expertise, "employee_id", "topic")

# Same-encoding joins
hb.intersect("expertise.topic", "projects.focus_area")
hb.intersect("projects.project_id", "budgets.project_id")

# === THE POWER QUERY ===

results = (
    hb.query("employees")
    .search("machine learning")           # SEMANTIC: fuzzy bio search
    .filter(department="Engineering")     # EXACT: department filter
    .join("expertise")                    # CROSS-ENCODING: via links!
    .join("projects")                     # SEMANTIC: topic → focus area
    .join("budgets")                      # EXACT: project_id match
)

# That's it. ~10 lines instead of 200+.

for r in results:
    emp = r["employees"]
    exp = r["expertise"]
    proj = r["projects"]
    budget = r["budgets"]

    print(f"{emp['name']} → {exp['topic']} → {proj['name']} (${budget['allocated']:,})")

Output:

Alice Chen → natural language processing → ChatBot v2 ($500,000)
Alice Chen → deep learning frameworks → Model Training Platform ($1,000,000)
David Lee → natural language processing → ChatBot v2 ($500,000)
David Lee → deep learning frameworks → Model Training Platform ($1,000,000)
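For readers who want the joins spelled out, the same rows can be reproduced in plain pandas. The semantic bio search is approximated here by a keyword match, which is only a stand-in for embedding-based search; the explicit merges are exactly the plumbing the intersections replace:

```python
import pandas as pd

# Same tables as in the setup above.
employees = pd.DataFrame([
    {"employee_id": "EMP001", "name": "Alice Chen",
     "bio": "Senior ML engineer specializing in NLP", "department": "Engineering"},
    {"employee_id": "EMP002", "name": "Bob Smith",
     "bio": "Backend developer with distributed systems expertise", "department": "Engineering"},
    {"employee_id": "EMP003", "name": "Carol White",
     "bio": "ML researcher focusing on computer vision", "department": "Research"},
    {"employee_id": "EMP004", "name": "David Lee",
     "bio": "Data scientist with machine learning background", "department": "Engineering"},
])
employee_expertise = pd.DataFrame([
    {"employee_id": "EMP001", "topic": "natural language processing"},
    {"employee_id": "EMP001", "topic": "deep learning frameworks"},
    {"employee_id": "EMP003", "topic": "computer vision models"},
    {"employee_id": "EMP004", "topic": "deep learning frameworks"},
    {"employee_id": "EMP004", "topic": "natural language processing"},
])
projects = pd.DataFrame([
    {"project_id": "PROJ001", "name": "ChatBot v2", "focus_area": "natural language processing"},
    {"project_id": "PROJ002", "name": "Visual Search", "focus_area": "computer vision models"},
    {"project_id": "PROJ003", "name": "Model Training Platform", "focus_area": "deep learning frameworks"},
])
budgets = pd.DataFrame([
    {"project_id": "PROJ001", "allocated": 500000, "spent": 320000},
    {"project_id": "PROJ002", "allocated": 750000, "spent": 180000},
    {"project_id": "PROJ003", "allocated": 1000000, "spent": 50000},
])

# Steps 1-2: keyword match standing in for semantic search, then exact filter.
hits = employees[employees["bio"].str.contains("machine learning|ML", case=False)]
hits = hits[hits["department"] == "Engineering"]

# Steps 3-5: the explicit joins that HyperBinder's intersections declare once.
rows = (
    hits.merge(employee_expertise, on="employee_id")
        .merge(projects, left_on="topic", right_on="focus_area", suffixes=("_emp", "_proj"))
        .merge(budgets, on="project_id")
)
for _, r in rows.iterrows():
    print(f"{r['name_emp']} → {r['topic']} → {r['name_proj']} (${r['allocated']:,})")
```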

What Makes This Unique

1. Unified Query Language

One syntax for semantic, exact, and graph operations:

.search("query")           # Semantic similarity
.filter(field="value")     # Exact equality
.join("collection")        # Graph traversal

2. Cross-Encoding Joins

Connect fields with different encoding types via explicit links:

# EXACT employee IDs ↔ SEMANTIC topic descriptions
ix = hb.intersect_flexible("employees.employee_id", "expertise.topic")
hb.populate_links(ix, links_df, "employee_id", "topic")
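Conceptually, a link-backed cross-encoding join is a two-hop lookup through the mapping table: exact IDs on one side, semantic topic strings on the other. A dictionary-based sketch of the idea (illustrative only, not HyperBinder internals):

```python
# Explicit link rows bridging an exact key space and a semantic one.
links = [
    ("EMP001", "natural language processing"),
    ("EMP001", "deep learning frameworks"),
    ("EMP004", "deep learning frameworks"),
]

def join_via_links(employee_ids, links):
    # Index the link table by the exact-side key once...
    by_emp = {}
    for emp, topic in links:
        by_emp.setdefault(emp, []).append(topic)
    # ...then resolve each incoming ID to its semantic-side values.
    return {e: by_emp.get(e, []) for e in employee_ids}

print(join_via_links(["EMP001", "EMP004"], links))
```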

3. Declarative, Not Procedural

Describe what you want, not how to orchestrate it:

# What: ML experts in Engineering with their projects and budgets
# How: HyperBinder figures it out

4. Type-Safe

Schema validation catches errors at definition time, not runtime.
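The demo doesn't show HyperBinder's validation API, but definition-time checking can be pictured as rejecting an intersection over a field the schema doesn't know about, before any query runs. A hypothetical sketch:

```python
def validate_intersection(schemas, left, right):
    # Fail when the intersection is declared, not when a query later runs.
    for ref in (left, right):
        collection, field = ref.split(".")
        if field not in schemas.get(collection, set()):
            raise ValueError(f"unknown field: {ref}")

schemas = {"employees": {"employee_id", "bio"}, "expertise": {"topic"}}
validate_intersection(schemas, "employees.employee_id", "expertise.topic")  # passes
# validate_intersection(schemas, "employees.emp_id", "expertise.topic")    # would raise ValueError
```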

5. Composable

Each operation builds on the previous, creating a fluent chain.
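The chain style rests on a standard fluent-builder pattern: each step returns a new query object carrying the accumulated plan, so steps compose without mutating earlier ones. A toy version (illustrative, not HyperBinder's implementation):

```python
class Query:
    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def _with(self, step):
        # Return a fresh Query so chains stay immutable and composable.
        return Query(self.steps + (step,))

    def search(self, text):
        return self._with(("search", text))

    def filter(self, **kw):
        return self._with(("filter", kw))

    def join(self, collection):
        return self._with(("join", collection))

q = Query().search("machine learning").filter(department="Engineering").join("expertise")
print(q.steps)
```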

Additional Capabilities

Analogical Reasoning

# "Alice is our NLP expert. Who plays a similar role for computer vision?"
results = hb.analogy("Alice Chen", "NLP", "computer vision",
                     field="name", collection="employees")

Bidirectional Traversal

# Start from budgets, trace back to employees
results = (
    hb.query("budgets")
    .filter(allocated__gt=500000)
    .join("projects")
    .join("expertise")
    .join("employees")
)

Prototype Search

# Find employees similar to ANY of these examples
results = hb.search_prototype(
    examples=["Alice Chen", "David Lee"],
    field="name",
    collection="employees"
)
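`search_prototype` can be pictured as averaging the example embeddings into a single prototype vector and ranking everything else by similarity to it. A toy sketch with hand-made 2-D vectors (the embeddings and clustering here are made up for illustration):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy embeddings: ML-flavoured bios cluster near (1, 0).
vectors = {
    "Alice Chen": (0.9, 0.1),
    "Bob Smith": (0.1, 0.9),
    "Carol White": (0.8, 0.3),
    "David Lee": (0.95, 0.05),
}

def search_prototype(examples, vectors):
    # Average the example vectors into one prototype, then rank the rest.
    protos = [vectors[e] for e in examples]
    proto = tuple(sum(v) / len(v) for v in zip(*protos))
    rest = [n for n in vectors if n not in examples]
    return sorted(rest, key=lambda n: cosine(vectors[n], proto), reverse=True)

print(search_prototype(["Alice Chen", "David Lee"], vectors))
# → ['Carol White', 'Bob Smith']
```

Carol White ranks first because her toy vector sits near the Alice/David prototype, even though she never appeared in the example list.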

Running the Demo

python examples/compose/unified_intelligence_demo.py

Compound Combinations: Model Any Domain

The real power is combining compounds to model any domain. Each compound is optimized for a specific pattern, and intersections wire them together.

Compound         Optimized For
Catalog          Searchable records with weighted fields
RelationalTable  CRUD operations with primary keys
KnowledgeGraph   Entity-relation-entity facts
Hierarchy        Parent-child trees (org charts, taxonomies)
Document         Text chunks with metadata
Network          Graph relationships (social, citations)
TimeSeries       Time-ordered sequences

Example: Healthcare System

flowchart LR
    subgraph row1 [" "]
        direction LR
        P["Patients<br/><i>Catalog</i>"]
        D["Diagnoses<br/><i>KnowledgeGraph</i>"]
        T["Treatments<br/><i>TimeSeries</i>"]
    end
    subgraph row2 [" "]
        direction LR
        R["Records<br/><i>RelationalTable</i>"]
        M["Medical Ontology<br/><i>Hierarchy</i>"]
        DR["Drug Database<br/><i>RelationalTable</i>"]
    end
    P --> D --> T
    P --> R
    D --> M
    T --> DR

results = (
    hb.query("patients").search("diabetes symptoms")
    .join("diagnoses")        # Semantic: symptoms → condition
    .join("treatments")       # Identity: condition → treatment plan
    .join("drugs")            # Identity: drug_id → drug record
    .filter(has_contraindications=True)
)

Example: Legal Research System

flowchart LR
    subgraph row1 [" "]
        direction LR
        C["Cases<br/><i>Document</i>"]
        CT["Citations<br/><i>Network</i>"]
        S["Statutes<br/><i>Hierarchy</i>"]
    end
    subgraph row2 [" "]
        direction LR
        J["Judges<br/><i>Catalog</i>"]
        L["Legal Concepts<br/><i>KnowledgeGraph</i>"]
        JR["Jurisdictions<br/><i>Hierarchy</i>"]
    end
    C --> CT --> S
    C --> J
    CT --> L
    S --> JR

results = (
    hb.query("cases").search("digital privacy")
    .join("citations")        # Network: case → cited cases
    .join("statutes")         # Identity: statute_id → statute
    .filter(amendment="4th")
    .join("concepts")         # Semantic: case text → legal concept
)

Example: Research Knowledge Base

flowchart LR
    subgraph row1 [" "]
        direction LR
        PA["Papers<br/><i>Document</i>"]
        AU["Authors<br/><i>Network</i>"]
        IN["Institutions<br/><i>Hierarchy</i>"]
    end
    subgraph row2 [" "]
        direction LR
        CO["Concepts<br/><i>KnowledgeGraph</i>"]
        GR["Grants<br/><i>RelationalTable</i>"]
        VE["Venues<br/><i>Catalog</i>"]
    end
    PA --> AU --> IN
    PA --> CO
    AU --> GR
    IN --> VE

results = (
    hb.query("papers").search("transformer attention mechanism")
    .join("concepts")         # Semantic: abstract → research concept
    .join("authors")          # Network: paper → author collaborations
    .join("institutions")     # Identity: affiliation_id → institution
    .filter(name="Stanford")
    .join("grants")           # Flexible: author_id → grant_id
    .filter(agency="NSF")
)

The Pattern

  1. Choose compounds that match your data shapes
  2. Connect with intersections (strict for same-type, flexible for cross-type)
  3. Query declaratively across the entire graph

Each compound handles its specialty. Mix them freely—the intersections handle the glue.

Key Takeaways

  1. One System: Replace Elasticsearch + PostgreSQL + Neo4j with one unified interface
  2. Cross-Encoding: Connect any field types via explicit link mappings
  3. Declarative: Focus on the question, not the plumbing
  4. Composable: Build complex queries from simple, chainable operations
  5. Type-Safe: Catch errors early with schema validation
  6. Domain Flexible: Combine compounds to model any knowledge architecture

This is the power of neurosymbolic computing: the best of semantic and symbolic approaches, seamlessly integrated.