Compounds¶
Compounds are pre-configured molecules for common domain patterns.
Overview¶
| Compound | Based On | Use Case |
|---|---|---|
| KnowledgeGraph | Triple | Entity-relation-entity facts |
| Catalog | Bundle | Generic tabular data (read-heavy) |
| RelationalTable | Row | Mutable tables with CRUD |
| TimeSeries | Sequence | Time-ordered data |
| Hierarchy | Tree | Org charts, taxonomies |
| Document | Row | Chunked documents with tree-shaped navigation |
| Network | Graph | Social graphs, citations |
Compounds expand to molecules at definition time, so they have the same capabilities once created.
KnowledgeGraph¶
Pre-configured Triple for knowledge graph data.
from hybi.compose import KnowledgeGraph
schema = KnowledgeGraph(
entity_field="person",
relation_field="relationship",
# Defaults: SEMANTIC for entities, EXACT for relations
)
Equivalent to:
Triple(
subject=Field("person", encoding=Encoding.SEMANTIC),
predicate=Field("relationship", encoding=Encoding.EXACT),
object=Field("target", encoding=Encoding.SEMANTIC),
)
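To make the expansion concrete, here is a minimal sketch built from stand-in dataclasses. `FieldSpec`, `TripleSpec`, `KnowledgeGraphSketch`, and its `expand()` method are hypothetical names invented for this illustration and are not part of hybi; the `"target"` default simply mirrors the object field in the expansion above.

```python
from dataclasses import dataclass

# Stand-in types for illustration only; these are NOT hybi classes.
@dataclass(frozen=True)
class FieldSpec:
    name: str
    encoding: str  # "SEMANTIC" or "EXACT" in this sketch


@dataclass(frozen=True)
class TripleSpec:
    subject: FieldSpec
    predicate: FieldSpec
    object: FieldSpec


@dataclass(frozen=True)
class KnowledgeGraphSketch:
    entity_field: str = "entity"
    relation_field: str = "relation"
    object_field: str = "target"  # assumption: mirrors the expansion shown above
    entity_encoding: str = "SEMANTIC"
    relation_encoding: str = "EXACT"

    def expand(self) -> TripleSpec:
        """Translate the compound's defaults into an explicit triple spec."""
        return TripleSpec(
            subject=FieldSpec(self.entity_field, self.entity_encoding),
            predicate=FieldSpec(self.relation_field, self.relation_encoding),
            object=FieldSpec(self.object_field, self.entity_encoding),
        )


triple = KnowledgeGraphSketch(
    entity_field="person", relation_field="relationship"
).expand()
```

The point of the sketch is only that a compound carries defaults and produces an ordinary molecule description, which is why both end up with the same capabilities.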
hybi.compose.KnowledgeGraph dataclass¶
Bases: BaseMolecule
Knowledge graph compound: entity-relation-entity triples.
A convenience wrapper around Triple with sensible defaults for knowledge graph use cases (semantic entities, exact relations).
Example
Simple usage - defaults to entity/relation columns¶
schema = KnowledgeGraph()
hb.ingest(facts_df, collection="kg", schema=schema)
Custom field names¶
schema = KnowledgeGraph(
    entity_field="person",
    relation_field="relationship",
)
With custom encoding¶
schema = KnowledgeGraph(
    entity_field="entity",
    relation_field="predicate",
    entity_encoding=Encoding.EXACT,  # For IDs instead of text
)
__init__(entity_field='entity', relation_field='relation', subject_field=None, object_field=None, entity_encoding=Encoding.SEMANTIC, relation_encoding=Encoding.EXACT, entity_weight=1.0, relation_weight=1.0)¶
Catalog¶
Pre-configured Bundle for tabular data.
from hybi.compose import Catalog, Field, Encoding
schema = Catalog(
columns={
"name": Field(encoding=Encoding.SEMANTIC, weight=1.5),
"category": Field(encoding=Encoding.EXACT),
"price": Field(encoding=Encoding.NUMERIC, similar_within=50),
}
)
hybi.compose.Catalog dataclass¶
Bases: BaseMolecule
Catalog compound: searchable collection with SQL-like operations.
A convenience wrapper around Bundle optimized for tabular data with a familiar SQL-like query interface. Catalog provides a bridge between traditional relational thinking and hyperdimensional computing.
Unlike pure relational tables, Catalog supports:
- Semantic search: Find rows by meaning, not just exact values
- Fuzzy matching: Similarity-based lookups with configurable thresholds
- Vector joins: Join collections by semantic similarity, not just key equality
Operations Map
| Catalog Method | HDC Operation |
|---|---|
| select() | Field projection (SelectQuery) |
| where() | Exact filter + similarity search |
| join() | JoinQuery (exact or semantic) |
| aggregate() | AggregateQuery (GROUP BY) |
| search() | Vector similarity search |
Example
Define a products catalog¶
schema = Catalog(
    columns={
        "name": Field(encoding=Encoding.SEMANTIC, weight=2.0),
        "description": Field(encoding=Encoding.SEMANTIC),
        "category": Field(encoding=Encoding.EXACT),
        "price": Field(encoding=Encoding.NUMERIC, similar_within=50),
    },
    primary_key="id",
)
hb.ingest(products_df, collection="products", schema=schema)
Traditional-style query¶
results = hb.query("products").where(category="electronics")
Semantic query (HDC advantage)¶
results = hb.query("products").search("lightweight laptop for travel")
Join with another catalog¶
order_schema = Catalog(
    columns={"product_id": Field(encoding=Encoding.EXACT), ...}
)
joined = hb.query("orders").join("products", on="product_id")
Aggregation¶
stats = hb.query("products").aggregate(
    group_by=["category"],
    aggregations={"avg_price": ("price", "avg")}
)
Notes
- primary_key is metadata only; HDC doesn't require explicit keys
- For semantic joins, use Encoding.SEMANTIC on join columns
- The underlying Bundle uses bundle encoding (lossy but searchable)
__init__(columns=dict(), primary_key=None, catalog_name=None)¶
RelationalTable¶
SQL-like table with full CRUD support.
RelationalTable provides familiar relational database semantics with atomic row-level operations. Unlike Catalog (which is optimized for search), RelationalTable uses structured encoding which enables true field-level updates.
Catalog vs RelationalTable¶
| Aspect | Catalog | RelationalTable |
|---|---|---|
| Encoding | Search-optimized | Structured |
| Search | Fast | Moderate |
| UPDATE/DELETE | Not supported | Full support |
| Use case | Search catalog | Mutable tables |
Use Catalog when you primarily search and append data. Use RelationalTable when you need UPDATE/DELETE operations.
from hybi.compose import RelationalTable, Field, Encoding
schema = RelationalTable(
columns={
"user_id": Field(encoding=Encoding.EXACT),
"email": Field(encoding=Encoding.EXACT),
"name": Field(encoding=Encoding.SEMANTIC),
"salary": Field(encoding=Encoding.NUMERIC, similar_within=10000),
},
primary_key="user_id",
)
CRUD Operations:
# Ingest data
hb.ingest(users_df, collection="users", schema=schema)
# Read by primary key
user = hb.query("users", schema).get(user_id="U001")
# Update fields atomically
hb.update(
"users",
where={"user_id": "U001"},
set={"email": "new@example.com", "salary": 120000},
schema=schema,
)
# Delete row
hb.delete("users", where={"user_id": "U001"}, schema=schema)
# Upsert (insert or update)
hb.upsert("users", row={"user_id": "U001", ...}, schema=schema)
Equivalent to:
Row(
primary_key=Field("user_id", encoding=Encoding.EXACT),
fields={
"email": Field(encoding=Encoding.EXACT),
"name": Field(encoding=Encoding.SEMANTIC),
"salary": Field(encoding=Encoding.NUMERIC, similar_within=10000),
},
)
Search & CRUD Architecture¶
For optimal performance, use Catalog for search and RelationalTable for CRUD:
flowchart TB
subgraph Catalog["CATALOG (Search)"]
C1[Search-optimized<br/>fast]
C2[Semantic Discovery]
C1 --> C2
end
subgraph RelationalTable["RELATIONAL TABLE (CRUD)"]
R1[Structured<br/>exact]
R2[PK Lookups]
R1 --> R2
end
C2 --> Bridge
R2 --> Bridge
Bridge[BRIDGE<br/>Primary Keys] --> Mutations[Deterministic Mutations]
Recommended pattern:
- Use Catalog for semantic search (optimized for similarity matching)
- Use RelationalTable for CRUD (optimized for exact field updates)
- Bridge between them using shared primary keys
Single-schema alternative: RelationalTable can handle both search and CRUD, but its search is slower than a dedicated Catalog's.
Fuzzy-to-Exact Bridge Pattern¶
When you need to combine semantic discovery with exact mutations, use the bridge pattern:
- Fuzzy search casts a wide net using semantic similarity
- Exact filters narrow to deterministic boundaries
- CRUD via PKs operates on the refined set
# 1. Semantic search finds candidates
candidates = hb.query("users", schema).search("machine learning expert", top_k=50)
# 2. Exact filtering narrows to deterministic set
refined = [r for r in candidates
if r.data["department"] == "Engineering"
and r.data["status"] == "active"]
# 3. CRUD via primary keys (safe - deterministic)
for r in refined:
hb.update("users", where={"user_id": r.data["user_id"]}, set={...}, schema=schema)
This pattern leverages fuzzy search for discovery ("I don't know the exact term") while ensuring mutations operate on deterministic, exactly-identified rows.
See Fuzzy-to-Exact Pattern for a complete implementation.
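The three steps can also be traced end to end without a server. The sketch below is a plain-Python stand-in, not hybi code: the `users` dict plays the role of the collection, `fuzzy_score` is a toy word-overlap measure standing in for semantic similarity, and the `bio` and `tier` columns are hypothetical.

```python
def make_user(user_id, bio, department, status):
    """Build one row; 'bio' and 'tier' are hypothetical columns for this sketch."""
    return {"user_id": user_id, "bio": bio, "department": department,
            "status": status, "tier": "standard"}


# In-memory stand-in for the collection, keyed by primary key.
users = {
    u["user_id"]: u
    for u in [
        make_user("U001", "machine learning researcher", "Engineering", "active"),
        make_user("U002", "deep learning and machine vision", "Engineering", "inactive"),
        make_user("U003", "machine learning for marketing", "Marketing", "active"),
        make_user("U004", "payroll accountant", "Finance", "active"),
    ]
}


def fuzzy_score(query, text):
    """Toy similarity: Jaccard overlap of lowercase word sets."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t)


# 1. Fuzzy search: keep every row with nonzero similarity, best match first.
query = "machine learning expert"
candidates = sorted(
    (row for row in users.values() if fuzzy_score(query, row["bio"]) > 0),
    key=lambda row: fuzzy_score(query, row["bio"]),
    reverse=True,
)

# 2. Exact filters: narrow the candidates to a deterministic set.
refined = [r for r in candidates
           if r["department"] == "Engineering" and r["status"] == "active"]

# 3. Mutation by primary key: only exactly identified rows are touched.
for r in refined:
    users[r["user_id"]]["tier"] = "expert"
```

Three rows survive the fuzzy step, but only one passes the exact filters, so the update touches a single, unambiguous primary key.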
hybi.compose.RelationalTable dataclass¶
Bases: BaseMolecule
SQL-like table with full CRUD support.
RelationalTable provides familiar relational database semantics:
- Row-level UPDATE: Modify individual fields
- Row-level DELETE: Remove rows by primary key
- Field extraction: Read individual field values cleanly
- ACID guarantees: Single-row atomicity
Unlike Catalog (which uses lossy Bundle encoding optimized for search), RelationalTable uses Row encoding with chain binding, which is lossless. This enables true field-level updates without re-encoding entire rows.
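The lossy versus lossless distinction can be illustrated with a generic hyperdimensional-computing sketch. This page does not specify hybi's actual Bundle or chain-binding operators, so the bipolar vectors, elementwise-multiply binding, and majority-vote bundling below are assumptions chosen because they are the simplest textbook forms.

```python
import random

DIM = 2048
rng = random.Random(0)  # seeded so the sketch is reproducible


def random_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Elementwise multiply; self-inverse for bipolar vectors."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Elementwise majority vote (an odd number of inputs avoids ties)."""
    return [1 if sum(column) > 0 else -1 for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; every bipolar vector has norm sqrt(DIM)."""
    return sum(x * y for x, y in zip(a, b)) / DIM


name, category, price = random_hv(), random_hv(), random_hv()
key = random_hv()

# Binding is invertible: binding again with the same key recovers the value exactly.
recovered = bind(bind(key, name), key)

# Bundling is lossy: the result resembles every input but equals none of them.
row = bundle([name, category, price])
similarities = [cosine(row, v) for v in (name, category, price)]
```

In this toy setting a bundled row stays searchable because it remains similar to each field, while a bound value can be read back exactly, which is the property field-level updates rely on.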
Trade-offs vs Catalog
| Aspect | Catalog | RelationalTable |
|---|---|---|
| Encoding | Bundle (lossy) | Row (lossless) |
| Search | Fast | Moderate |
| UPDATE/DELETE | Not supported | Full support |
| Use case | Search catalog | Mutable tables |
Use RelationalTable when you need UPDATE/DELETE operations. Use Catalog when you primarily search and append data.
Example
Define a users table¶
schema = RelationalTable(
    columns={
        "user_id": Field(encoding=Encoding.EXACT),
        "email": Field(encoding=Encoding.EXACT),
        "name": Field(encoding=Encoding.SEMANTIC),
        "salary": Field(encoding=Encoding.NUMERIC),
    },
    primary_key="user_id",
)
hb.ingest(users_df, collection="users", schema=schema)
Read by primary key¶
user = hb.query("users", schema).get(user_id="U001")
Update fields¶
hb.update(
    "users",
    where={"user_id": "U001"},
    set={"email": "new@example.com"},
    schema=schema,
)
Delete row¶
hb.delete("users", where={"user_id": "U001"}, schema=schema)
Notes
- Primary key is required and must use EXACT encoding
- Primary key cannot be updated (immutable row identity)
- Updates are atomic at the row level
columns = dataclass_field(default_factory=dict) class-attribute instance-attribute¶
Column definitions mapping column names to Field configurations.
Must include the primary key column.
Example
columns={
    "id": Field(encoding=Encoding.EXACT),
    "name": Field(encoding=Encoding.SEMANTIC),
    "email": Field(encoding=Encoding.EXACT),
}
primary_key = None class-attribute instance-attribute¶
Name of the primary key column.
Required. The referenced column must:
- Exist in columns
- Use EXACT encoding
The primary key provides:
- O(1) row lookup via PK index
- Row identity for UPDATE/DELETE operations
- Uniqueness constraint on ingest
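These rules can be pictured with a small dictionary-backed table. `PrimaryKeyTable` is a hypothetical stand-in written for this page, not a hybi class, and raising `ValueError` is an assumption about how violations would be reported.

```python
class PrimaryKeyTable:
    """Toy table illustrating the primary-key rules; not hybi code."""

    def __init__(self, primary_key):
        self.primary_key = primary_key
        self._rows = {}  # PK value -> row dict; dict lookup is O(1) on average

    def ingest(self, rows):
        """Insert a list of rows, rejecting the whole batch on a duplicate key."""
        seen = set(self._rows)
        for row in rows:
            pk = row[self.primary_key]
            if pk in seen:
                raise ValueError(f"duplicate primary key: {pk!r}")
            seen.add(pk)
        for row in rows:
            self._rows[row[self.primary_key]] = dict(row)

    def get(self, pk):
        """Return the row for pk, or None if it does not exist."""
        return self._rows.get(pk)

    def update(self, pk, changes):
        """Change non-key fields of one row; the key itself is immutable."""
        if self.primary_key in changes:
            raise ValueError("primary key cannot be updated")
        self._rows[pk].update(changes)

    def delete(self, pk):
        """Remove the row for pk if present."""
        self._rows.pop(pk, None)


table = PrimaryKeyTable("user_id")
table.ingest([{"user_id": "U001", "email": "old@example.com"}])
table.update("U001", {"email": "new@example.com"})
```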
__init__(columns=dict(), primary_key=None, table_name=None)¶
TimeSeries¶
Pre-configured molecule for time-ordered data. Supports two modes:
Temporal Mode (with timestamp_field)¶
When timestamp_field is provided, expands to a Pair enabling temporal queries:
from hybi.compose import TimeSeries
schema = TimeSeries(
value_field="measurement",
timestamp_field="recorded_at", # Enables at_time(), time_range(), when()
)
Supported queries: search, find, at_time, time_range, when
Positional Mode (without timestamp_field)¶
When timestamp_field is None, expands to a Sequence enabling position-based queries:
schema = TimeSeries(
value_field="message",
timestamp_field=None, # Position-based mode
position_encoding="random",
max_length=512,
)
Supported queries: search, at, contains, prefix
See timeseries_demo.py for a complete example using positional mode.
hybi.compose.TimeSeries dataclass¶
Bases: BaseMolecule
Time series compound: temporal data with timestamp-value binding.
TimeSeries encodes time-indexed data using hyperdimensional temporal binding. When a timestamp_field is provided, each row is encoded as:
timestamp ⊛ value
This enables powerful temporal queries:
- at_time(ts): Find values at/near a specific timestamp
- time_range(start, end): Find values within a time window
- when(value): Find timestamps when a value occurred
Expands to (when timestamp_field provided):
Pair(
    left=Field(timestamp_field, encoding=TEMPORAL),
    right=Field(value_field, encoding=value_encoding),
)
Expands to (when timestamp_field is None - legacy mode):
Sequence(
    item=Field(value_field, encoding=value_encoding),
    position_encoding="sinusoidal",
    max_length=max_length,
)
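The intended meaning of the three temporal queries can be shown with ordinary Python over a sorted list. This is a semantic stand-in only: hybi answers these queries through similarity over bound vectors, so its matches may be approximate, and the inclusive range bounds below are an assumption.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

# (timestamp, value) pairs kept sorted by timestamp.
readings = sorted([
    (datetime(2024, 1, 15, 13, 0), 20.5),
    (datetime(2024, 1, 15, 14, 0), 21.0),
    (datetime(2024, 1, 15, 15, 0), 20.5),
    (datetime(2024, 1, 15, 16, 0), 19.0),
])
timestamps = [ts for ts, _ in readings]


def at_time(ts):
    """Reading whose timestamp is closest to ts."""
    i = bisect_left(timestamps, ts)
    neighbours = readings[max(i - 1, 0):i + 1]
    return min(neighbours, key=lambda reading: abs(reading[0] - ts))


def time_range(start, end):
    """Readings with start <= timestamp <= end (inclusive bounds assumed)."""
    return readings[bisect_left(timestamps, start):bisect_right(timestamps, end)]


def when(value):
    """Timestamps at which the given value was recorded."""
    return [ts for ts, v in readings if v == value]
```

For example, `at_time(datetime(2024, 1, 15, 14, 10))` picks the 14:00 reading, which is the "at/near" behaviour described above.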
Example
Recommended: with timestamp field (enables temporal queries)¶
schema = TimeSeries(
    value_field="temperature",
    timestamp_field="recorded_at",
    value_encoding=Encoding.NUMERIC,
)
hb.ingest(sensor_df, collection="readings", schema=schema)
Query: What was the temperature at 2pm?¶
results = hb.query("readings").at_time("2024-01-15 14:00:00")
Query: Temperatures between 1pm and 3pm¶
results = hb.query("readings").time_range(
    start="2024-01-15 13:00:00",
    end="2024-01-15 15:00:00",
)
Legacy mode: without timestamp (uses row position)¶
schema = TimeSeries(value_field="price") # timestamp_field=None
Note: This mode only supports positional queries, not temporal¶
__init__(value_field='value', timestamp_field=None, value_encoding=Encoding.SEMANTIC, value_weight=1.0, timestamp_weight=1.0, position_encoding='sinusoidal', max_length=512)¶
Hierarchy¶
Pre-configured Tree for parent-child relationships.
from hybi.compose import Hierarchy
schema = Hierarchy(
node_field="employee",
parent_field="manager",
)
hybi.compose.Hierarchy dataclass¶
Bases: BaseMolecule
Hierarchy compound: parent-child organizational structures.
A convenience wrapper around Tree optimized for hierarchical data like org charts, file systems, taxonomies, or nested categories.
Example
Org chart¶
schema = Hierarchy(
    node_field="employee",
    parent_field="manager",
)
hb.ingest(org_df, collection="org", schema=schema)
File system with depth tracking¶
schema = Hierarchy(
    node_field="path",
    parent_field="parent_path",
    level_field="depth",
)
Taxonomy with exact matching¶
schema = Hierarchy(
    node_field="category",
    parent_field="parent_category",
    node_encoding=Encoding.EXACT,
)
__init__(node_field='node', parent_field='parent', level_field=None, node_encoding=Encoding.SEMANTIC, node_weight=1.0)¶
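The data shape Hierarchy expects is one row per node with a pointer to its parent. The sketch below shows that shape and the traversals it permits in plain Python; `ancestors` and `children` are hypothetical helper names for this illustration, since this page does not list Hierarchy's query methods, and the data is assumed to be acyclic.

```python
# One row per node, matching node_field="employee" and parent_field="manager".
org_rows = [
    {"employee": "Ada", "manager": None},
    {"employee": "Grace", "manager": "Ada"},
    {"employee": "Linus", "manager": "Grace"},
    {"employee": "Ken", "manager": "Ada"},
]
parent_of = {row["employee"]: row["manager"] for row in org_rows}


def ancestors(node):
    """Walk parent links upward, nearest ancestor first (assumes no cycles)."""
    chain = []
    current = parent_of.get(node)
    while current is not None:
        chain.append(current)
        current = parent_of.get(current)
    return chain


def children(node):
    """Direct reports of a node, in row order."""
    return [child for child, parent in parent_of.items() if parent == node]
```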
Document¶
Use this compound when you have long-form content and need to search inside it (not just retrieve whole documents), navigate its structure (e.g. "every paragraph under chapter 3"), and pinpoint exact sections (e.g. /ch3/sec2/p4) — all against the same collection.
Alternatives fall short in different ways: collapsing each document to one vector loses chunk identity; a flat chunk store loses the structure the chunker gave you; a plain tree loses sibling order. Document keeps all of it — every chunk is a first-class row with a path, a parent, and a position.
A pluggable Chunker decides how the source splits — paragraphs, Markdown headings, or your own strategy. Swapping chunkers changes only row contents, so the same queries work across formats.
Three ways to address a chunk¶
- Semantic — find chunks whose content matches a query.
- Structural — walk the tree (ancestors / descendants / siblings) without a separate index.
- Path — O(1) lookup of the exact chunk at /ch1/sec2/p3.
Defining a Document schema¶
from hybi.compose import Document, Field, Encoding
from hybi.compose.chunkers import MarkdownChunker
schema = Document(
content_field="body",
chunker=MarkdownChunker(),
metadata_fields={
"author": Field(encoding=Encoding.EXACT),
"published": Field(encoding=Encoding.TEMPORAL),
},
)
Ingest shape¶
The ingest DataFrame is document-level (one row per source document) and must include a document_id column plus the configured content_field. The chunker expands each row into per-chunk Rows at ingest time.
import pandas as pd
df = pd.DataFrame([
{"document_id": "doc-1", "body": "# Intro\n\nFirst paragraph.\n\n## Details\n\nMore.",
"author": "Alice", "published": "2024-01-15"},
# ...
])
hb.ingest(df, collection="articles", schema=schema)
Reserved structural fields written into every chunk Row: chunk_id (primary key), document_id, path, parent_id, sibling_index, depth. User-supplied metadata_fields cannot collide with these.
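To show how those structural fields relate to one another, here is a toy heading-and-paragraph chunker. It is a sketch, not hybi's MarkdownChunker: the function name `chunk_markdown`, the slugified heading names, the `pN` paragraph names, and the `document_id:path` chunk_id format are all assumptions made for this example.

```python
import re
from collections import defaultdict


def chunk_markdown(document_id, text):
    """Split Markdown into heading and paragraph chunks with tree metadata."""
    root = {
        "chunk_id": f"{document_id}:/", "document_id": document_id, "path": "/",
        "parent_id": None, "sibling_index": 0, "depth": 0, "content": "",
    }
    rows = [root]
    stack = [(0, root)]             # (heading level, row) for each open heading
    child_count = defaultdict(int)  # parent chunk_id -> children seen so far
    para_count = defaultdict(int)   # parent chunk_id -> paragraphs seen so far

    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        heading = re.match(r"(#+)\s+(.+)", block)
        if heading:
            level = len(heading.group(1))
            while stack[-1][0] >= level:  # close headings at the same or deeper level
                stack.pop()
        parent = stack[-1][1]
        if heading:
            content = heading.group(2)
            name = re.sub(r"[^a-z0-9]+", "-", content.lower()).strip("-")
        else:
            content = block
            name = f"p{para_count[parent['chunk_id']]}"
            para_count[parent["chunk_id"]] += 1
        path = parent["path"].rstrip("/") + "/" + name
        row = {
            "chunk_id": f"{document_id}:{path}", "document_id": document_id,
            "path": path, "parent_id": parent["chunk_id"],
            "sibling_index": child_count[parent["chunk_id"]],
            "depth": parent["depth"] + 1, "content": content,
        }
        child_count[parent["chunk_id"]] += 1
        rows.append(row)
        if heading:
            stack.append((level, row))
    return rows


rows = chunk_markdown("doc-1", "# Intro\n\nFirst paragraph.\n\n## Details\n\nMore.")
```

Under these assumptions the sample document yields a root chunk plus `/intro`, `/intro/p0`, `/intro/details`, and `/intro/details/p0`. Repeated heading names under one parent would collide in this sketch.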
Querying¶
Use Document.attach to get a client/collection-bound view:
pdoc = schema.attach(hb, "articles")
# Rank whole documents
docs = pdoc.search_documents("attention mechanisms", top_k=5)
# Rank individual chunks (composable ChunkHandles)
handles = pdoc.find_and_bind("async Python", top_k=3).materialize()
# Structural navigation
sec = pdoc.descendants("/intro") # every chunk under /intro
siblings = pdoc.siblings(parent_id) # peers under the same parent
chunk = pdoc.at("/intro/p0", document_id="doc-1") # direct lookup
# Subtree as a first-class algebraic citizen
subtree = pdoc.subtree("/intro", document_id="doc-1")
related = subtree.intersect(hb.query("concepts")) # cross-compound join
Use search_documents when the answer is "which document?" and find_and_bind when the answer is "which passage?". find_and_bind excludes root chunks by default; pass include_root=True for a mixed candidate set.
When multiple documents share a path (e.g. every doc has an /intro), at() and subtree() require document_id= to disambiguate. Unscoped calls on ambiguous paths raise ValueError.
structural_index="tree" opts into the Rust-accelerated descendant walk (BFS over the chunk namespace's parent_id index); the default "path" backend emits path LIKE /root/% filters and works out of the box.
Document's tree-navigation methods (descendants / ancestors / siblings / subtree) delegate to an internal OrderedTree. The structural_index kwarg forwards to it; you can also construct an OrderedTree directly for the same navigation surface without the chunker / Row-schema apparatus.
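The two structural_index strategies should return the same chunks and differ only in how they find them. The sketch below contrasts a path-prefix filter with a breadth-first walk over parent_id, using in-memory rows for a single document; the function names and row values are hypothetical and this is not the library's implementation.

```python
from collections import deque

rows = [
    {"chunk_id": "c0", "path": "/", "parent_id": None},
    {"chunk_id": "c1", "path": "/intro", "parent_id": "c0"},
    {"chunk_id": "c2", "path": "/intro/p0", "parent_id": "c1"},
    {"chunk_id": "c3", "path": "/intro/details", "parent_id": "c1"},
    {"chunk_id": "c4", "path": "/intro/details/p0", "parent_id": "c3"},
    {"chunk_id": "c5", "path": "/introduction", "parent_id": "c0"},
]


def descendants_by_path(root_path):
    """'path' style: keep rows whose path starts with root_path plus a slash."""
    prefix = root_path.rstrip("/") + "/"
    return {r["chunk_id"] for r in rows
            if r["path"].startswith(prefix) and r["path"] != root_path}


def descendants_by_tree(root_path):
    """'tree' style: breadth-first walk over a parent_id -> children index."""
    children = {}
    for r in rows:
        children.setdefault(r["parent_id"], []).append(r["chunk_id"])
    root_id = next(r["chunk_id"] for r in rows if r["path"] == root_path)
    found, queue = set(), deque([root_id])
    while queue:
        for child in children.get(queue.popleft(), []):
            found.add(child)
            queue.append(child)
    return found
```

The trailing slash in the prefix matters: without it, `/introduction` would be mistaken for a descendant of `/intro`.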
hybi.compose.Document dataclass¶
Bases: BaseMolecule
Document compound: a queryable subtree of first-class chunks.
Each source document is decomposed by a pluggable Chunker into N chunks; every chunk becomes a lossless Row with three addressing modes:
- Semantic: search hits the SEMANTIC content field, returning individual chunks (not whole documents).
- Structural: per-row parent_id, sibling_index, and depth carry the tree shape, so ancestors / descendants / siblings navigate without a separate Tree index in v1.
- Path: the EXACT path field (e.g. /ch1/sec2/p3) gives O(1) direct lookup.
The compound is fixed — swapping Chunker strategies changes only row contents, not the wire schema, so a FlatChunker and a MarkdownChunker ingest into the same shape.
Example
Phase 1+ will wire a Chunker in; Phase 0 exposes the schema shape.¶
schema = Document(
    content_field="body",
    metadata_fields={"section_title": Field(encoding=Encoding.EXACT)},
)
__init__(content_field='content', content_encoding=Encoding.SEMANTIC, content_weight=1.0, metadata_fields=None, chunker=None, structural_index='path')¶
attach(client, collection)¶
Return a client/collection-bound view of this Document.
The bound view mirrors every Document method that needs a client
and collection (subtree, find_and_bind, descendants,
ancestors, siblings, at) without the _client=/
_collection= keyword noise. Prefer this for application code;
the raw Document methods remain available as an escape hatch.
Example
pdoc = doc.attach(hb, "papers")
pdoc.subtree("/abstract", document_id="1706.03762").materialize()
pdoc.rollup("/abstract", document_id="1706.03762")  # handle-free shortcut
pdoc.find_and_bind("transformers", top_k=3)
pdoc.descendants("/")
find_and_bind(query, *, top_k, include_root=False, _client=None, _collection=None)¶
Semantic search within the document, returning a handle-shaped spec that can be composed with other collections or atoms.
top_k is a required keyword-only argument with no default:
the choice between "best match" and "candidate set" is consequential
and should be explicit at the call site.
By default roots (path == "/") are excluded from the search
because their content_vec carries the per-document subtree rollup
— appropriate for search_documents, not chunk-level retrieval.
Pass include_root=True to opt back in when you want a mixed
chunk + document-level candidate set.
The returned object is lazy — it does no server I/O until the caller invokes .materialize() or a method that requires concrete values (e.g. .intersect). For top_k == 1 the spec resolves to a single ChunkHandle on materialization; for top_k > 1 it resolves to a list of ChunkHandles.
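The lazy behaviour described here can be sketched with a small deferred-execution object. `LazySearchSpec` and `fake_search` are hypothetical names for this illustration; the real spec type and its search backend are not shown on this page, and returning None for an empty single-result search is an assumption.

```python
class LazySearchSpec:
    """Describes a search and runs it only when materialize() is called."""

    def __init__(self, run_search, query, top_k):
        self._run_search = run_search
        self.query = query
        self.top_k = top_k

    def materialize(self):
        hits = self._run_search(self.query, self.top_k)
        if self.top_k == 1:
            return hits[0] if hits else None  # assumption: None when nothing matches
        return hits


calls = []  # records every simulated round trip


def fake_search(query, top_k):
    """Stand-in for a server call: rank a tiny corpus by shared words."""
    calls.append(query)
    corpus = ["async python basics", "python packaging", "rust ownership"]
    words = query.lower().split()
    ranked = sorted(corpus, key=lambda text: -sum(w in text.split() for w in words))
    return ranked[:top_k]


spec = LazySearchSpec(fake_search, "async Python", top_k=1)
calls_before = len(calls)   # constructing the spec performed no search
best = spec.materialize()   # a single result because top_k == 1
several = LazySearchSpec(fake_search, "async Python", top_k=2).materialize()
```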
subtree(path, *, document_id=None, union=False, _client=None, _collection=None)¶
Return a virtual ChunkHandle whose content_vec bundles the
chunk at path plus every descendant.
Per-document by default: an ambiguous path (multiple documents)
raises. Pass document_id=<id> to scope, or union=True to
opt into bundling across matching documents.
Delegates to the internal OrderedTree.
descendants(query, root_path, top_k=100)¶
Return every chunk whose path sits strictly under root_path.
ancestors(query, path, top_k=100)¶
Return chunks at every ancestor path of path, parent-first.
siblings(query, parent_id, top_k=100)¶
Return every chunk sharing parent_id. Caller sorts by
sibling_index for ordered traversal.
at(path, *, document_id=None, _client=None, _collection=None)¶
O(1) direct lookup by path.
Returns the chunk Row at path (or None if no chunk matches).
EXACT encoding on the path field makes this a primary-key-style
hit.
Per-document by default: if multiple documents in the collection
share the same path (e.g. every Markdown doc has an /intro
section), pass document_id to disambiguate. Omitting it when
the path is ambiguous raises ValueError rather than returning
a nondeterministic row — silent cross-document leakage is the
kind of bug that only surfaces once the corpus grows past toy
size.
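The disambiguation rule can be sketched with a nested dictionary index. This is an in-memory stand-in rather than the library's lookup; the `chunks` rows and the `by_path` index are hypothetical.

```python
chunks = [
    {"document_id": "doc-1", "path": "/intro", "content": "Intro of doc 1"},
    {"document_id": "doc-2", "path": "/intro", "content": "Intro of doc 2"},
    {"document_id": "doc-2", "path": "/appendix", "content": "Appendix of doc 2"},
]

# Index: path -> {document_id: row}.
by_path = {}
for row in chunks:
    by_path.setdefault(row["path"], {})[row["document_id"]] = row


def at(path, *, document_id=None):
    """Return the chunk at path, refusing to guess between documents."""
    matches = by_path.get(path, {})
    if document_id is not None:
        return matches.get(document_id)
    if len(matches) > 1:
        raise ValueError(
            f"path {path!r} exists in {len(matches)} documents; pass document_id="
        )
    return next(iter(matches.values()), None)
```

An unscoped `at("/appendix")` succeeds because only one document has that path, while an unscoped `at("/intro")` raises instead of silently picking one of the two candidates.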
reingest(client, collection, source_df)¶
Replace every chunk belonging to the given document_id(s).
For each unique document_id in source_df, looks up that
document's existing chunks by document_id and deletes them by
primary key, then ingests the new rows. Leaves other documents in
the collection untouched.
Fail-before-delete. The new DataFrame is run through
expand_dataframe (chunker + orphan check + required-column
validation) BEFORE any deletion happens. If the new data is
malformed — missing columns, chunker bugs, duplicate paths —
reingest raises without touching storage. This closes the
"deleted old, new ingest failed, collection in half-state"
class of failure for the common malformed-input case.
Ingest-phase failures that surface AFTER the pre-validation (e.g. transient storage errors) can still leave the collection in a half-state; handling of those conditions remains best-effort. Per-row delete errors are swallowed so that a row already removed by a concurrent caller doesn't block the rest.
Raises:
| Type | Description |
|---|---|
| SchemaError | if |
PK-based deletion avoids row-id-counter issues that a bulk delete-by-filter can induce when mixed with re-ingest.
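The fail-before-delete ordering can be sketched against a plain dictionary store. The `validate` and `reingest` functions below are hypothetical stand-ins, and `ValueError` is used in place of the library's SchemaError.

```python
def validate(rows):
    """Raise before any mutation if the new rows are malformed."""
    required = {"document_id", "path", "content"}
    seen = set()
    for row in rows:
        missing = required - set(row)
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        key = (row["document_id"], row["path"])
        if key in seen:
            raise ValueError(f"duplicate path: {key}")
        seen.add(key)


def reingest(store, new_rows):
    """Replace every chunk of the documents present in new_rows."""
    validate(new_rows)                       # 1. fail before deleting anything
    doc_ids = {row["document_id"] for row in new_rows}
    for key in [k for k in store if k[0] in doc_ids]:
        store.pop(key, None)                 # 2. delete by key; ignore rows already gone
    for row in new_rows:                     # 3. ingest the replacements
        store[(row["document_id"], row["path"])] = row


store = {
    ("doc-1", "/"): {"document_id": "doc-1", "path": "/", "content": "old root"},
    ("doc-1", "/old"): {"document_id": "doc-1", "path": "/old", "content": "stale chunk"},
    ("doc-2", "/"): {"document_id": "doc-2", "path": "/", "content": "other doc"},
}
reingest(store, [
    {"document_id": "doc-1", "path": "/", "content": "new root"},
    {"document_id": "doc-1", "path": "/p0", "content": "new paragraph"},
])
```

Because validation runs first, a malformed batch leaves the store exactly as it was; only failures during step 3 could leave a document partially replaced.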
BoundDocument¶
A Document bound to a specific (client, collection) pair, produced by Document.attach. Mirrors every Document method that needs a client and collection (subtree, find_and_bind, descendants, etc.) without the _client= / _collection= keyword noise.
hybi.compose.BoundDocument¶
A Document bound to a specific (client, collection) pair.
Produced by Document.attach. Exposes every structural /
rollup / retrieval operation on Document with client/collection
implicit, so call sites read as pdoc.subtree("/abstract", ...)
instead of schema.subtree("/abstract", ..., _client=hb, _collection="papers").
The Document schema itself stays stateless; BoundDocument is a
thin adapter. Safe to construct, discard, and re-bind the same
Document to different collections.
subtree(path, *, document_id=None, union=False)¶
Return a virtual ChunkHandle for the subtree rooted at path.
See Document.subtree for scoping semantics.
rollup(path, *, document_id=None, union=False)¶
Shortcut: return the subtree's rollup vector directly (np.ndarray).
Equivalent to self.subtree(path, ...).materialize(); provided
because the handle is immediately materialized in most call sites.
find_and_bind(query_text, *, top_k, include_root=False)¶
Semantic search that returns composable ChunkHandle objects.
Excludes root chunks by default — those carry document-level
rollups and belong to search_documents. Pass
include_root=True to search across chunks and root rollups
together. See Document.find_and_bind.
search_documents(query_text, *, top_k)¶
Document-level semantic search: rank whole documents by the
similarity of their persisted subtree rollup to query_text.
Implementation: a path-scoped semantic search over root chunks
(path == "/"), whose content_vec is the per-document
subtree rollup written by Document._finalize_rollups
at ingest time. Works out of the box for Documents ingested
through HyperBinder.ingest on a client that exposes
set_row_atom; falls back to empty results when rollups
haven't been persisted.
top_k is required (no default): document-scale search is
expensive to paginate and the choice between "one best match"
and "candidate set" should be explicit.
descendants(root_path, top_k=100)¶
Every chunk whose path sits strictly under root_path.
ancestors(path, top_k=100)¶
Every chunk on the path from root to path, parent-first.
siblings(parent_id, top_k=100)¶
Every chunk sharing parent_id.
at(path, *, document_id=None)¶
O(1) direct lookup of the chunk at path. Pass document_id
when the path exists in more than one document. See
Document.at.
Network¶
Pre-configured Graph for network data.
from hybi.compose import Network
schema = Network(
node_field="user",
edge_field="interaction",
directed=True,
# Optional: use separate columns for source/target nodes
# source_field="from_user",
# target_field="to_user",
)
hybi.compose.Network dataclass¶
Bases: BaseMolecule
Network compound: node-edge-node graph structures.
A convenience wrapper around Graph optimized for social networks, citation graphs, dependency graphs, and other network structures.
Example
Social network¶
schema = Network(
    node_field="user",
    edge_field="connection_type",
)
hb.ingest(social_df, collection="social", schema=schema)
Citation network (undirected)¶
schema = Network(
    node_field="paper_id",
    edge_field="citation_type",
    node_encoding=Encoding.EXACT,
    directed=False,
)
Dependency graph¶
schema = Network(
    node_field="package",
    edge_field="dependency_type",
    source_field="dependent",
    target_field="dependency",
)
__init__(node_field='node', edge_field='edge', source_field=None, target_field=None, node_encoding=Encoding.SEMANTIC, edge_encoding=Encoding.EXACT, node_weight=1.0, edge_weight=1.0, directed=True)¶
When to Use Compounds vs Molecules¶
Use compounds when:
- Your data fits a common pattern
- You want sensible encoding defaults
- You're prototyping quickly
Use molecules when:
- You need custom encodings per field
- You're nesting structures
- You need fine-grained control over weights
See Molecules vs Compounds for detailed guidance.
Example Code¶
Complete runnable examples for each compound type:
| Compound | Example File | Description |
|---|---|---|
| KnowledgeGraph | knowledge_graph_demo.py | Entity-relation-entity facts with traversal |
| Document | document_demo.py | Chunked documents with tree-shaped navigation |
| Document (arXiv) | document_arxiv_demo.py | Real arXiv papers + semantic search + cross-compound hyperedges |
| Hierarchy | hierarchy_demo.py | Org charts and taxonomies |
| TimeSeries | timeseries_demo.py | Time-ordered data |
| Network | network_demo.py | Social graphs and citations |
| Catalog | product_catalog_demo.py | Product catalogs with search |
| RelationalTable | fuzzy_to_exact_demo.py | CRUD with fuzzy-to-exact pattern |
Run any example from the SDK directory:
See Examples README for the full example index.