Compounds

Compounds are pre-configured molecules for common domain patterns.

Overview

| Compound | Based On | Use Case |
|---|---|---|
| KnowledgeGraph | Triple | Entity-relation-entity facts |
| Catalog | Bundle | Generic tabular data (read-heavy) |
| RelationalTable | Row | Mutable tables with CRUD |
| TimeSeries | Pair or Sequence | Time-ordered data |
| Hierarchy | Tree | Org charts, taxonomies |
| Document | Bundle | Document chunks with metadata |
| Network | Graph | Social graphs, citations |

Compounds expand to molecules at definition time, so they have the same capabilities once created.


KnowledgeGraph

Pre-configured Triple for knowledge graph data.

from hybi.compose import KnowledgeGraph

schema = KnowledgeGraph(
    entity_field="person",
    relation_field="relationship",
    # Defaults: SEMANTIC for entities, EXACT for relations
)

Equivalent to:

Triple(
    subject=Field("person", encoding=Encoding.SEMANTIC),
    predicate=Field("relationship", encoding=Encoding.EXACT),
    object=Field("person", encoding=Encoding.SEMANTIC),
)
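These defaults can be motivated with a small, library-independent sketch: relation labels form a closed vocabulary and are filtered by equality, while entity mentions vary in wording and are matched by similarity. Everything below (the `similarity` helper, the sample triples, the 0.3 threshold) is hypothetical illustration, not hybi API:

```python
# Hypothetical sketch: exact matching on relations, fuzzy matching on entities.
# Token overlap stands in for the semantic similarity a real encoder provides.
triples = [
    ("Ada Lovelace", "wrote", "Notes on the Analytical Engine"),
    ("Alan Turing", "wrote", "On Computable Numbers"),
    ("Ada Lovelace", "collaborated_with", "Charles Babbage"),
]

def similarity(a: str, b: str) -> float:
    """Crude stand-in for semantic similarity: Jaccard overlap of tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def find(relation: str, entity_query: str, threshold: float = 0.3):
    # EXACT on the relation (equality), SEMANTIC on the subject (similarity).
    return [t for t in triples
            if t[1] == relation and similarity(t[0], entity_query) >= threshold]

# Word order and casing differ, yet the entity still matches.
print(find("wrote", "lovelace ada"))
```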

hybi.compose.KnowledgeGraph dataclass

Bases: BaseMolecule

Knowledge graph compound: entity-relation-entity triples.

A convenience wrapper around Triple with sensible defaults for knowledge graph use cases (semantic entities, exact relations).

Expands to

Triple(
    subject=Field(entity_field, encoding=SEMANTIC),
    predicate=Field(relation_field, encoding=EXACT),
    object=Field(entity_field, encoding=SEMANTIC),
)

Example

Simple usage - defaults to entity/relation columns:

schema = KnowledgeGraph()
hb.ingest(facts_df, collection="kg", schema=schema)

Custom field names:

schema = KnowledgeGraph(
    entity_field="person",
    relation_field="relationship",
)

With custom encoding:

schema = KnowledgeGraph(
    entity_field="entity",
    relation_field="predicate",
    entity_encoding=Encoding.EXACT,  # For IDs instead of text
)

__init__(entity_field='entity', relation_field='relation', subject_field=None, object_field=None, entity_encoding=Encoding.SEMANTIC, relation_encoding=Encoding.EXACT, entity_weight=1.0, relation_weight=1.0)


Catalog

Pre-configured Bundle for tabular data.

from hybi.compose import Catalog, Field, Encoding

schema = Catalog(
    columns={
        "name": Field(encoding=Encoding.SEMANTIC, weight=1.5),
        "category": Field(encoding=Encoding.EXACT),
        "price": Field(encoding=Encoding.NUMERIC, similar_within=50),
    }
)

hybi.compose.Catalog dataclass

Bases: BaseMolecule

Catalog compound: searchable collection with SQL-like operations.

A convenience wrapper around Bundle optimized for tabular data with a familiar SQL-like query interface. Catalog provides a bridge between traditional relational thinking and hyperdimensional computing.

Expands to

Bundle(fields={
    column_name: Field(encoding=..., weight=...),
    ...
})

Unlike pure relational tables, Catalog supports:

  • Semantic search: Find rows by meaning, not just exact values
  • Fuzzy matching: Similarity-based lookups with configurable thresholds
  • Vector joins: Join collections by semantic similarity, not just key equality
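The "vector join" idea can be sketched without any hybi machinery. In the toy below, the 3-d vectors and the 0.9 threshold are made up for illustration; a real system would use learned embeddings in high dimensions:

```python
# Library-independent sketch of a vector join: rows from two collections are
# paired by cosine similarity rather than key equality. Data is mocked.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

products = {"laptop": [0.9, 0.1, 0.0], "keyboard": [0.1, 0.9, 0.0]}
queries = {"portable computer": [0.8, 0.2, 0.1]}

def semantic_join(left, right, threshold=0.9):
    # Emit a pair whenever two rows' vectors are similar enough.
    return [(lk, rk) for lk, lv in left.items()
            for rk, rv in right.items() if cosine(lv, rv) >= threshold]

print(semantic_join(queries, products))
```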

Operations Map

| Catalog Method | HDC Operation |
|---|---|
| select() | Field projection (SelectQuery) |
| where() | Exact filter + similarity search |
| join() | JoinQuery (exact or semantic) |
| aggregate() | AggregateQuery (GROUP BY) |
| search() | Vector similarity search |
Example

Define a products catalog:

schema = Catalog(
    columns={
        "name": Field(encoding=Encoding.SEMANTIC, weight=2.0),
        "description": Field(encoding=Encoding.SEMANTIC),
        "category": Field(encoding=Encoding.EXACT),
        "price": Field(encoding=Encoding.NUMERIC, similar_within=50),
    },
    primary_key="id",
)
hb.ingest(products_df, collection="products", schema=schema)

Traditional-style query:

results = hb.query("products").where(category="electronics")

Semantic query (HDC advantage):

results = hb.query("products").search("lightweight laptop for travel")

Join with another catalog:

order_schema = Catalog(
    columns={"product_id": Field(encoding=Encoding.EXACT), ...}
)
joined = hb.query("orders").join("products", on="product_id")

Aggregation:

stats = hb.query("products").aggregate(
    group_by=["category"],
    aggregations={"avg_price": ("price", "avg")},
)

Notes
  • primary_key is metadata only; HDC doesn't require explicit keys
  • For semantic joins, use Encoding.SEMANTIC on join columns
  • The underlying Bundle uses bundle encoding (lossy but searchable)

__init__(columns=dict(), primary_key=None, catalog_name=None)


RelationalTable

SQL-like table with full CRUD support.

RelationalTable provides familiar relational database semantics with atomic row-level operations. Unlike Catalog (which is optimized for search), RelationalTable uses structured encoding, which enables true field-level updates.

Catalog vs RelationalTable

| Aspect | Catalog | RelationalTable |
|---|---|---|
| Encoding | Search-optimized | Structured |
| Search | Fast | Moderate |
| UPDATE/DELETE | Not supported | Full support |
| Use case | Search catalog | Mutable tables |

Use Catalog when you primarily search and append data. Use RelationalTable when you need UPDATE/DELETE operations.

from hybi.compose import RelationalTable, Field, Encoding

schema = RelationalTable(
    columns={
        "user_id": Field(encoding=Encoding.EXACT),
        "email": Field(encoding=Encoding.EXACT),
        "name": Field(encoding=Encoding.SEMANTIC),
        "salary": Field(encoding=Encoding.NUMERIC, similar_within=10000),
    },
    primary_key="user_id",
)

CRUD Operations:

# Ingest data
hb.ingest(users_df, collection="users", schema=schema)

# Read by primary key
user = hb.query("users", schema).get(user_id="U001")

# Update fields atomically
hb.update(
    "users",
    where={"user_id": "U001"},
    set={"email": "new@example.com", "salary": 120000},
    schema=schema,
)

# Delete row
hb.delete("users", where={"user_id": "U001"}, schema=schema)

# Upsert (insert or update)
hb.upsert("users", row={"user_id": "U001", ...}, schema=schema)

Equivalent to:

Row(
    primary_key=Field("user_id", encoding=Encoding.EXACT),
    fields={
        "email": Field(encoding=Encoding.EXACT),
        "name": Field(encoding=Encoding.SEMANTIC),
        "salary": Field(encoding=Encoding.NUMERIC, similar_within=10000),
    },
)

Search & CRUD Architecture

For optimal performance, use Catalog for search and RelationalTable for CRUD:

flowchart TB
    subgraph Catalog["CATALOG (Search)"]
        C1[Search-optimized<br/>fast]
        C2[Semantic Discovery]
        C1 --> C2
    end

    subgraph RelationalTable["RELATIONAL TABLE (CRUD)"]
        R1[Structured<br/>exact]
        R2[PK Lookups]
        R1 --> R2
    end

    C2 --> Bridge
    R2 --> Bridge
    Bridge[BRIDGE<br/>Primary Keys] --> Mutations[Deterministic Mutations]

Recommended pattern:

  • Use Catalog for semantic search (optimized for similarity matching)
  • Use RelationalTable for CRUD (optimized for exact field updates)
  • Bridge between them using shared primary keys

Single-schema alternative: RelationalTable can handle both search and CRUD, but its search performance is slower than a dedicated Catalog's.
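A minimal, data-only sketch of the bridge (mock rows, no hybi calls): the search side returns hits that carry primary keys, and those keys select exactly which rows to mutate on the table side:

```python
# Mock data: "catalog_hits" plays the role of fuzzy search results, and
# "table" plays the role of a PK-indexed RelationalTable.
catalog_hits = [{"user_id": "U001", "score": 0.92},
                {"user_id": "U007", "score": 0.88}]
table = {
    "U001": {"user_id": "U001", "status": "active"},
    "U007": {"user_id": "U007", "status": "active"},
    "U042": {"user_id": "U042", "status": "active"},
}

# Bridge: primary keys from the fuzzy side drive exact mutations.
for hit in catalog_hits:
    table[hit["user_id"]]["status"] = "reviewed"

reviewed = sorted(pk for pk, row in table.items() if row["status"] == "reviewed")
print(reviewed)  # rows never surfaced by search (U042) stay untouched
```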

Fuzzy-to-Exact Bridge Pattern

When you need to combine semantic discovery with exact mutations, use the bridge pattern:

  1. Fuzzy search casts a wide net using semantic similarity
  2. Exact filters narrow to deterministic boundaries
  3. CRUD via PKs operates on the refined set
# 1. Semantic search finds candidates
candidates = hb.query("users", schema).search("machine learning expert", top_k=50)

# 2. Exact filtering narrows to deterministic set
refined = [r for r in candidates
           if r.data["department"] == "Engineering"
           and r.data["status"] == "active"]

# 3. CRUD via primary keys (safe - deterministic)
for r in refined:
    hb.update("users", where={"user_id": r.data["user_id"]}, set={...}, schema=schema)

This pattern leverages fuzzy search for discovery ("I don't know the exact term") while ensuring mutations operate on deterministic, exactly-identified rows.

See Fuzzy-to-Exact Pattern for a complete implementation.

hybi.compose.RelationalTable dataclass

Bases: BaseMolecule

SQL-like table with full CRUD support.

RelationalTable provides familiar relational database semantics:

  • Row-level UPDATE: Modify individual fields
  • Row-level DELETE: Remove rows by primary key
  • Field extraction: Read individual field values cleanly
  • ACID guarantees: Single-row atomicity

Unlike Catalog (which uses lossy Bundle encoding optimized for search), RelationalTable uses Row encoding with chain binding, which is lossless. This enables true field-level updates without re-encoding entire rows.

Trade-offs vs Catalog
| Aspect | Catalog | RelationalTable |
|---|---|---|
| Encoding | Bundle (lossy) | Row (lossless) |
| Search | Fast | Moderate |
| UPDATE/DELETE | Not supported | Full support |
| Use case | Search catalog | Mutable tables |

Use RelationalTable when you need UPDATE/DELETE operations. Use Catalog when you primarily search and append data.

Expands to

Row(
    primary_key=Field(pk_column, encoding=EXACT),
    fields={...other columns...},
)

Example

Define a users table:

schema = RelationalTable(
    columns={
        "user_id": Field(encoding=Encoding.EXACT),
        "email": Field(encoding=Encoding.EXACT),
        "name": Field(encoding=Encoding.SEMANTIC),
        "salary": Field(encoding=Encoding.NUMERIC),
    },
    primary_key="user_id",
)
hb.ingest(users_df, collection="users", schema=schema)

Read by primary key:

user = hb.query("users", schema).get(user_id="U001")

Update fields:

hb.update(
    "users",
    where={"user_id": "U001"},
    set={"email": "new@example.com"},
    schema=schema,
)

Delete row:

hb.delete("users", where={"user_id": "U001"}, schema=schema)

Notes
  • Primary key is required and must use EXACT encoding
  • Primary key cannot be updated (immutable row identity)
  • Updates are atomic at the row level

columns = dataclass_field(default_factory=dict) class-attribute instance-attribute

Column definitions mapping column names to Field configurations.

Must include the primary key column.

Example

columns={
    "id": Field(encoding=Encoding.EXACT),
    "name": Field(encoding=Encoding.SEMANTIC),
    "email": Field(encoding=Encoding.EXACT),
}

primary_key = None class-attribute instance-attribute

Name of the primary key column.

Required. The referenced column must:

  • Exist in columns
  • Use EXACT encoding

The primary key provides:

  • O(1) row lookup via PK index
  • Row identity for UPDATE/DELETE operations
  • Uniqueness constraint on ingest
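Conceptually (and independent of hybi's internals), a PK index is a dictionary keyed by the primary key, which is where the O(1) lookup and the uniqueness check come from. The `build_pk_index` helper below is a hypothetical illustration, not part of the library:

```python
# Sketch of what a PK index provides: constant-time lookup plus a
# uniqueness check at ingest time. Rows are mocked.
def build_pk_index(rows, pk):
    index = {}
    for row in rows:
        if row[pk] in index:
            # Mirrors the "uniqueness constraint on ingest" guarantee.
            raise ValueError(f"duplicate primary key: {row[pk]!r}")
        index[row[pk]] = row
    return index

rows = [{"user_id": "U001", "name": "Ada"},
        {"user_id": "U002", "name": "Alan"}]
index = build_pk_index(rows, "user_id")
print(index["U002"]["name"])  # O(1) lookup by key
```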

__init__(columns=dict(), primary_key=None, table_name=None)


TimeSeries

Pre-configured molecule for time-ordered data. Supports two modes:

Temporal Mode (with timestamp_field)

When timestamp_field is provided, expands to a Pair enabling temporal queries:

from hybi.compose import TimeSeries

schema = TimeSeries(
    value_field="measurement",
    timestamp_field="recorded_at",  # Enables at_time(), time_range(), when()
)

Supported queries: search, find, at_time, time_range, when

Positional Mode (without timestamp_field)

When timestamp_field is None, expands to a Sequence enabling position-based queries:

schema = TimeSeries(
    value_field="message",
    timestamp_field=None,  # Position-based mode
    position_encoding="random",
    max_length=512,
)

Supported queries: search, at, contains, prefix
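As a mental model, positional queries behave like list operations over the ingested order. The sketch below is plain Python over mock data, not the hybi query API:

```python
# Mental model only: positional mode addresses the sequence by position
# rather than by timestamp.
messages = ["boot", "connect", "auth ok", "transfer", "disconnect"]

def at(i):
    """at(i): the value at position i."""
    return messages[i]

def contains(x):
    """contains(x): does the value occur anywhere in the sequence?"""
    return x in messages

def prefix(items):
    """prefix(items): does the sequence start with these items?"""
    return messages[:len(items)] == list(items)

print(at(2), contains("transfer"), prefix(["boot", "connect"]))
```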

See timeseries_demo.py for a complete example using positional mode.

hybi.compose.TimeSeries dataclass

Bases: BaseMolecule

Time series compound: temporal data with timestamp-value binding.

TimeSeries encodes time-indexed data using hyperdimensional temporal binding. When a timestamp_field is provided, each row is encoded as:

timestamp ⊛ value

This enables powerful temporal queries:

  • at_time(ts): Find values at/near a specific timestamp
  • time_range(start, end): Find values within a time window
  • when(value): Find timestamps when a value occurred
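The binding itself can be demonstrated with a generic hyperdimensional-computing toy, independent of hybi's actual encoder: with random ±1 hypervectors, binding is elementwise multiplication and is its own inverse, so multiplying a stored `timestamp ⊛ value` pair by the timestamp vector recovers the value vector, while unrelated vectors stay near-orthogonal. The dimension and data here are made up:

```python
import random

random.seed(0)
DIM = 2048

def hv():
    """Random ±1 hypervector."""
    return [random.choice((-1, 1)) for _ in range(DIM)]

def bind(u, v):
    """Elementwise multiply: the ⊛ operator for ±1 vectors."""
    return [a * b for a, b in zip(u, v)]

def sim(u, v):
    """Normalized dot product (cosine similarity for ±1 vectors)."""
    return sum(a * b for a, b in zip(u, v)) / DIM

t_2pm, temp_21, temp_30 = hv(), hv(), hv()
record = bind(t_2pm, temp_21)   # store: timestamp ⊛ value

probe = bind(record, t_2pm)     # unbind with the timestamp: t ⊛ (t ⊛ v) = v
print(sim(probe, temp_21))      # 1.0: exact recovery for a single bound pair
print(sim(probe, temp_30))      # near 0: unrelated value
```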

Expands to (when timestamp_field is provided):

Pair(
    left=Field(timestamp_field, encoding=TEMPORAL),
    right=Field(value_field, encoding=value_encoding),
)

Expands to (when timestamp_field is None - legacy mode):

Sequence(
    item=Field(value_field, encoding=value_encoding),
    position_encoding="sinusoidal",
    max_length=max_length,
)

Example

schema = TimeSeries(
    value_field="temperature",
    timestamp_field="recorded_at",
    value_encoding=Encoding.NUMERIC,
)
hb.ingest(sensor_df, collection="readings", schema=schema)

Query: What was the temperature at 2pm?

results = hb.query("readings").at_time("2024-01-15 14:00:00")

Query: Temperatures between 1pm and 3pm:

results = hb.query("readings").time_range(
    start="2024-01-15 13:00:00",
    end="2024-01-15 15:00:00",
)

Legacy mode: without timestamp (uses row position):

schema = TimeSeries(value_field="price")  # timestamp_field=None

Note: This mode only supports positional queries, not temporal ones.

__init__(value_field='value', timestamp_field=None, value_encoding=Encoding.SEMANTIC, value_weight=1.0, timestamp_weight=1.0, position_encoding='sinusoidal', max_length=512)


Hierarchy

Pre-configured Tree for parent-child relationships.

from hybi.compose import Hierarchy

schema = Hierarchy(
    node_field="employee",
    parent_field="manager",
)
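What the parent-child rows encode can be seen with plain Python over mock org data: each row maps a node to its parent, and walking parent pointers yields the chain up to the root. This sketch is not a hybi query:

```python
# Mock org chart: node -> parent, mirroring node_field/parent_field columns.
reports_to = {"carol": "bob", "bob": "alice", "dave": "alice"}

def chain(employee):
    """Walk parent pointers from an employee up to the root."""
    out = [employee]
    while out[-1] in reports_to:
        out.append(reports_to[out[-1]])
    return out

print(chain("carol"))  # employee, manager, ... up to the root
```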

hybi.compose.Hierarchy dataclass

Bases: BaseMolecule

Hierarchy compound: parent-child organizational structures.

A convenience wrapper around Tree optimized for hierarchical data like org charts, file systems, taxonomies, or nested categories.

Expands to

Tree(
    child=Field(node_field, encoding=node_encoding),
    parent=Field(parent_field, encoding=node_encoding),
    level=Field(level_field) if level_field else None,
)

Example

Org chart:

schema = Hierarchy(
    node_field="employee",
    parent_field="manager",
)
hb.ingest(org_df, collection="org", schema=schema)

File system with depth tracking:

schema = Hierarchy(
    node_field="path",
    parent_field="parent_path",
    level_field="depth",
)

Taxonomy with exact matching:

schema = Hierarchy(
    node_field="category",
    parent_field="parent_category",
    node_encoding=Encoding.EXACT,
)

__init__(node_field='node', parent_field='parent', level_field=None, node_encoding=Encoding.SEMANTIC, node_weight=1.0)


Document

Pre-configured Bundle for document chunks.

from hybi.compose import Document, Field, Encoding

schema = Document(
    content_field="text",
    metadata_fields={
        "source": Field(),
        "page": Field(encoding=Encoding.NUMERIC),
        "section": Field(encoding=Encoding.EXACT),
    },
)
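How field weights shape ranking can be sketched without the library: if a row's score is the weighted sum of per-field similarities, a heavier content field dominates. The field names, weights, and similarity numbers below are invented for illustration, not hybi's actual scoring function:

```python
# Hypothetical per-field weights (mirroring Field(weight=...)) and mocked
# per-field similarity scores for two candidate documents.
weights = {"text": 2.0, "headline": 1.5, "category": 1.0}

def score(field_sims):
    """Weighted sum of per-field similarities."""
    return sum(weights[f] * s for f, s in field_sims.items())

doc_a = score({"text": 0.9, "headline": 0.1, "category": 0.0})  # content match
doc_b = score({"text": 0.1, "headline": 0.9, "category": 0.0})  # headline match

print(doc_a > doc_b)  # the weighted content match outranks the headline match
```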

hybi.compose.Document dataclass

Bases: BaseMolecule

Document compound: structured content with metadata.

A convenience wrapper around Bundle optimized for document storage with a primary content field and associated metadata fields.

Expands to

Bundle(fields={
    content_field: Field(encoding=SEMANTIC, weight=content_weight),
    **{name: field for name, field in metadata_fields.items()},
})

Example

Simple document with title and content:

schema = Document(
    content_field="body",
    metadata_fields={"title": Field(), "author": Field()},
)
hb.ingest(docs_df, collection="docs", schema=schema)

Article with categories:

schema = Document(
    content_field="text",
    content_weight=2.0,  # Boost content in search
    metadata_fields={
        "headline": Field(weight=1.5),
        "category": Field(encoding=Encoding.EXACT),
        "published_date": Field(encoding=Encoding.TEMPORAL),
    },
)

__init__(content_field='content', content_encoding=Encoding.SEMANTIC, content_weight=1.0, metadata_fields=None)


Network

Pre-configured Graph for network data.

from hybi.compose import Network

schema = Network(
    node_field="user",
    edge_field="interaction",
    directed=True,
    # Optional: use separate columns for source/target nodes
    # source_field="from_user",
    # target_field="to_user",
)

hybi.compose.Network dataclass

Bases: BaseMolecule

Network compound: node-edge-node graph structures.

A convenience wrapper around Graph optimized for social networks, citation graphs, dependency graphs, and other network structures.

Expands to

Graph(
    node=Field(node_field, encoding=node_encoding),
    edge=Field(edge_field, encoding=edge_encoding),
    directed=directed,
)

Example

Social network:

schema = Network(
    node_field="user",
    edge_field="connection_type",
)
hb.ingest(social_df, collection="social", schema=schema)

Citation network (undirected):

schema = Network(
    node_field="paper_id",
    edge_field="citation_type",
    node_encoding=Encoding.EXACT,
    directed=False,
)

Dependency graph:

schema = Network(
    node_field="package",
    edge_field="dependency_type",
    source_field="dependent",
    target_field="dependency",
)

__init__(node_field='node', edge_field='edge', source_field=None, target_field=None, node_encoding=Encoding.SEMANTIC, edge_encoding=Encoding.EXACT, node_weight=1.0, edge_weight=1.0, directed=True)


When to Use Compounds vs Molecules

Use compounds when:

  • Your data fits a common pattern
  • You want sensible encoding defaults
  • You're prototyping quickly

Use molecules when:

  • You need custom encodings per field
  • You're nesting structures
  • You need fine-grained control over weights

See Molecules vs Compounds for detailed guidance.


Example Code

Complete runnable examples for each compound type:

| Compound | Example File | Description |
|---|---|---|
| KnowledgeGraph | knowledge_graph_demo.py | Entity-relation-entity facts with traversal |
| Document | document_demo.py | Document chunks with metadata |
| Hierarchy | hierarchy_demo.py | Org charts and taxonomies |
| TimeSeries | timeseries_demo.py | Time-ordered data |
| Network | network_demo.py | Social graphs and citations |
| Catalog | product_catalog_demo.py | Product catalogs with search |
| RelationalTable | fuzzy_to_exact_demo.py | CRUD with fuzzy-to-exact pattern |

Run any example from the SDK directory:

cd sdk
python examples/compose/knowledge_graph_demo.py

See Examples README for the full example index.