
Client API

The main entry points for interacting with HyperBinder.

HyperBinder

Factory function that returns either a RemoteHyperBinder (client mode) or LocalHyperBinder (local mode).

from hybi import HyperBinder

# Client mode: Connect to HyperBinder server
hb = HyperBinder(
    url="http://localhost:8000",
    api_key="your-api-key",  # Or set HYPERBINDER_API_KEY
    timeout=30.0,
)

# Local mode: Embedded client (no Docker required)
hb = HyperBinder(
    local=True,
    db_path="./my_db",  # Optional: custom database path
)

# Use as context manager
with HyperBinder() as hb:
    results = hb.search("query", collection="data")

Parameters:

  • url (str, optional): URL of the server (e.g., "http://localhost:8000"). Defaults to http://localhost:8000 if not local.
  • api_key (str, optional): API key for server authentication.
  • local (bool): If True, uses embedded LocalHyperBinder (requires 'hyperbinder' pip package).
  • **kwargs: Additional arguments passed to the specific client (e.g., db_path for local mode, timeout for client mode).

Returns: Either a LocalHyperBinder or RemoteHyperBinder instance.


RemoteHyperBinder

Direct access to the HTTP client (same as HyperBinder(url=...)).

from hybi import RemoteHyperBinder

hb = RemoteHyperBinder(
    url="http://localhost:8000",
    api_key="your-api-key",
)

LocalHyperBinder

Direct access to the embedded client (same as HyperBinder(local=True)).

from hybi import LocalHyperBinder

hb = LocalHyperBinder(db_path="./my_db")

Requirements: Requires the hyperbinder pip package to be installed.

hybi.HyperBinder(url=None, api_key=None, local=False, **kwargs)

Initialize a HyperBinder client.

Parameters:

  • url (Optional[str]): URL of the server (e.g., "http://localhost:8000"). Defaults to http://localhost:8000 if not local. Default: None.
  • api_key (Optional[str]): Optional API key for server authentication. Default: None.
  • local (bool): If True, uses the embedded LocalHyperBinder (no Docker required). Requires the hyperbinder pip package. Default: False.
  • **kwargs (Any): Additional arguments passed to the specific client (e.g., db_path for local mode). Default: {}.

Returns:

  • Union[LocalHyperBinder, RemoteHyperBinder]: Either a LocalHyperBinder or a RemoteHyperBinder instance.


AsyncHyperBinder

Asynchronous client for high-throughput applications.

from hybi import AsyncHyperBinder
import asyncio

async def main():
    async with AsyncHyperBinder() as hb:
        results = await hb.search("query", collection="data")
        for r in results:
            print(r['name'])

asyncio.run(main())

hybi.AsyncHyperBinder

Bases: AsyncObserveMixin, AsyncComposeMixin, BaseHyperBinder

Async HyperBinder client for high-throughput applications.

Compose operations (unbind, extract, bundle_search, etc.) are provided by AsyncComposeMixin.

Observability operations (traced versions of compose methods, recovery points) are provided by AsyncObserveMixin.

Example

async with AsyncHyperBinder("http://localhost:8000") as hb:
    await hb.ingest("data.csv", collection="customers")
    results = await hb.search("enterprise AI", collection="customers")

__init__(url='http://localhost:8000', api_key=None, timeout=30.0, default_collection=None, default_top_k=10, verify_ssl=True, warn_insecure=True, max_retries=3, retry_delay=0.5, join_config=None)

Initialize async HyperBinder client.

Parameters:

  • url (str): Server URL (use https:// in production). Default: 'http://localhost:8000'.
  • api_key (Optional[str]): API key for authentication. Can also be set via the HYPERBINDER_API_KEY environment variable. Default: None.
  • timeout (float): Request timeout in seconds. Default: 30.0.
  • default_collection (Optional[str]): Default collection for operations. Default: None.
  • default_top_k (int): Default number of results to return. Default: 10.
  • verify_ssl (bool): Whether to verify SSL certificates. Default: True.
  • warn_insecure (bool): Warn if using HTTP instead of HTTPS. Default: True.
  • max_retries (int): Maximum retry attempts for transient errors. Default: 3.
  • retry_delay (float): Initial delay between retries in seconds. Default: 0.5.
  • join_config (Optional[JoinConfig]): Configuration for join operations (cycle limits, dedup). Defaults to JoinConfig() with sensible production defaults.

close() async

Close the async HTTP client.

ping() async

Check server health.

is_ready() async

Check if server is ready to handle requests.

Returns:

  • bool: True if the server responds with ready status, False otherwise.

Note

Connection errors return False (server unreachable). Other errors (auth, server errors) are re-raised.

collection(name)

Get an async collection object for fluent API access.

list_collections() async

List all collections.

get_collection_info(collection=None) async

Get detailed information about a collection.

Parameters:

  • collection (Optional[str]): Collection name (uses default if not specified). Default: None.

Returns:

  • CollectionInfo: CollectionInfo with type, columns, and capabilities.

get_collection_stats(collection=None, *, use_cache=True) async

Get detailed statistics about a collection.

Returns more detailed information than get_collection_info(), including vector configuration (dimension, seed) and source file metadata.

Parameters:

  • collection (Optional[str]): Collection name (uses default if not specified). Default: None.
  • use_cache (bool): Whether to use cached stats if available. Default: True.

Returns:

  • CollectionStats: CollectionStats with full collection details.

Example

stats = await hb.get_collection_stats("customers")
print(stats)  # "customers: 1,000 rows, 5 columns (structured)"
print(f"Vector dimension: {stats.dimension}")

delete_collection(collection=None) async

Delete a collection.

Parameters:

  • collection (Optional[str]): Collection name (uses default if not specified). Default: None.

Raises:

  • CollectionNotFoundError: If the collection doesn't exist.
  • HyperBinderError: If deletion fails.

ingest(source, *, collection=None, dim=1024, seed=42, depth=3, schema=None, vector_col=None, warn_schema_evolution=True) async

Ingest data into a collection.

Parameters:

  • source (Union[str, Path, DataFrame, List[str]]): File path, list of paths, or pandas DataFrame. Required.
  • collection (Optional[str]): Target collection name. Default: None.
  • dim (int): Vector dimension for embeddings. Default: 1024.
  • seed (int): Random seed for reproducibility. Default: 42.
  • depth (int): Hierarchy depth. Default: 3.
  • schema (Optional[BaseMolecule]): Optional Compose schema (Pair, Triple, Record) defining how data should be encoded. If provided, validates that the data matches the schema and stores the schema with the collection for schema-aware queries. Default: None.
  • warn_schema_evolution (bool): Whether to emit SchemaEvolutionWarning warnings during ingest. Set False to suppress adaptive-mode schema evolution warning noise. Default: True.

Returns:

  • IngestResult: IngestResult with ingestion details.

Example

Ingest with an explicit Triple schema:

from hyperbinder.compose import Triple, Field, Encoding

schema = Triple(
    subject=Field("entity"),
    predicate=Field("relation", encoding=Encoding.EXACT),
    object=Field("target"),
)
await hb.ingest(df, collection="knowledge", schema=schema)

search(query, *, collection=None, top_k=None, mode=None, filters=None, role=None, slot_filters=None) async

Universal async search across any collection type.

Automatically detects collection type and uses the best search method, or use mode to explicitly choose.

Parameters:

  • query (Union[str, Dict[str, Any]]): Search query (string for text search, dict for field matching). Required.
  • collection (Optional[str]): Collection to search. Default: None.
  • top_k (Optional[int]): Number of results to return. Default: None.
  • mode (Optional[str]): Search mode: "auto" (default), "structured", "semantic", or "hybrid". Default: None.
  • filters (Optional[List[tuple]]): Hard filters as a list of (field, op, value) tuples (structured mode). Default: None.
  • role (Optional[str]): Filter by document role, e.g. "paragraph" (semantic mode). Default: None.
  • slot_filters (Optional[Dict[str, Any]]): Slot value filters, e.g. {"category": "Electronics"} (hybrid mode). Default: None.

Returns:

  • List[SearchResult]: List of SearchResult objects.

select(collection=None, columns=None, where=None, order_by=None, limit=None, offset=0, distinct=False) async

Async SQL-like SELECT query.

Parameters:

  • collection (Optional[str]): Collection to query. Default: None.
  • columns (Optional[List[str]]): Columns to select (None = all). Default: None.
  • where (Optional[List[tuple]]): Filter conditions as a list of (field, operator, value) tuples. Default: None.
  • order_by (Optional[List[tuple]]): Sort order as a list of (field, descending) tuples. Default: None.
  • limit (Optional[int]): Maximum rows to return. Default: None.
  • offset (int): Number of rows to skip. Default: 0.
  • distinct (bool): Return distinct rows only. Default: False.

Returns:

  • SelectResult: SelectResult with rows.

aggregate(collection=None, group_by=None, aggregations=None, where=None, having=None, order_by=None, limit=None) async

Async SQL-like AGGREGATE query with GROUP BY.

Parameters:

  • collection (Optional[str]): Collection to query. Default: None.
  • group_by (Optional[List[str]]): Fields to group by. Default: None.
  • aggregations (Optional[List[tuple]]): List of (field, operation, alias) tuples. Operations: "sum", "avg", "count", "min", "max". Default: None.
  • where (Optional[List[tuple]]): Filter conditions applied before grouping. Default: None.
  • having (Optional[List[tuple]]): Filter conditions applied after grouping. Default: None.
  • order_by (Optional[List[tuple]]): Sort order as a list of (field, descending) tuples. Default: None.
  • limit (Optional[int]): Maximum groups to return. Default: None.

Returns:

  • AggregateResult: AggregateResult with groups.

Example

results = await hb.aggregate(
    collection="orders",
    group_by=["category"],
    aggregations=[("amount", "sum", "total")],
    order_by=[("total", True)],  # Sort by total descending
)

join(left, right, on, join_type='inner', columns=None, limit=None) async

Async SQL-like JOIN across collections.

Parameters:

  • left (str): Left collection name. Required.
  • right (str): Right collection name. Required.
  • on (List[tuple]): Join conditions as a list of (left_field, operator, right_field) tuples. Required.
  • join_type (str): Type of join ("inner", "left", "right", "outer"). Default: 'inner'.
  • columns (Optional[List[str]]): Columns to select from the result. Default: None.
  • limit (Optional[int]): Maximum rows to return. Default: None.

Returns:

  • JoinResult: JoinResult with joined rows.

multihop(collection=None, start=None, hops=None, top_k=None) async

Async multi-hop reasoning query.

Parameters:

  • collection (Optional[str]): Collection to query. Default: None.
  • start (Optional[Dict[str, Any]]): Starting query as a field:value dict. Default: None.
  • hops (Optional[List[tuple]]): List of (field, value) tuples defining the path. Default: None.
  • top_k (Optional[int]): Number of results to return. Default: None.

Returns:

  • List[MultihopResult]: List of MultihopResult with reasoning paths.

get_context(query, collection=None, max_chunks=5, max_tokens=2000, auto_detect=True, expand=None, *, expand_reasoning=False, reasoning_hops=2, include_proofs=False) async

Async get relevant context for LLM consumption.

Automatically routes to the appropriate search method:

  • String query + document collection → semantic search
  • String query + structured collection → structured search
  • Dict query → structured field search

When expand is provided, the context is enriched with related data from other collections using declared intersections (via hb.intersect()).

When expand_reasoning is provided, the context is enriched with inferred relationships discovered via MHV compositional reasoning (requires paid tier).

Parameters:

  • query (Union[str, Dict[str, Any]]): The question or query (string for text search, dict for structured queries). Required.
  • collection (Optional[str]): Collection to search. Default: None.
  • max_chunks (int): Maximum number of chunks to retrieve. Default: 5.
  • max_tokens (int): Approximate maximum tokens in the context. Default: 2000.
  • auto_detect (bool): If True, detect the collection type and use the appropriate search. Default: True.
  • expand (Optional[List[Union[str, Dict[str, Any]]]]): Optional list of collections to expand into. Each element can be a string collection name (e.g., "expertise") or a dict with options (e.g., {"collection": "expertise", "fields": ["skill"]}). Default: None.
  • expand_reasoning (bool): If True, expand the context with MHV reasoning inferences. Default: False.
  • reasoning_hops (int): Maximum reasoning chain length. Default: 2.
  • include_proofs (bool): Include proof traces for inferred relationships. Default: False.

Returns:

  • Context: Context object with formatted text and source chunks. If expand_reasoning=True, the context may include inferred relationships.

Example

Basic context retrieval:

context = await hb.get_context("Alice's team", collection="org")

With reasoning expansion:

context = await hb.get_context(
    "Alice's team",
    collection="org",
    expand_reasoning=True,
    reasoning_hops=3,
)
# The context now includes inferred relationships like:
# "Alice reports_to→reports_to Bob" (2-hop inference)

ask(question, collection=None, top_k=5, role_filter=None) async

Async end-to-end RAG: retrieve context and generate answer.

Note: The LLM model is configured server-side.

Parameters:

  • question (str): The question to answer. Required.
  • collection (Optional[str]): Collection to search. Default: None.
  • top_k (int): Number of chunks to retrieve. Default: 5.
  • role_filter (Optional[str]): Only search in a specific role (e.g., "paragraph"). Default: None.

Returns:

  • Answer: Answer object with generated text and sources.

query(collection=None, schema=None)

Get a schema-aware async query builder for a collection.

AsyncComposeQuery provides a fluent interface for building queries that leverage the collection's schema (if available) for type-safe, slot-based operations.

Parameters:

  • collection (Optional[str]): Collection name (uses default if not specified). Default: None.
  • schema (Optional[BaseMolecule]): Optional molecule schema for validation. If not provided, queries still work but without schema validation. Default: None.

Returns:

  • AsyncComposeQuery: AsyncComposeQuery builder for the collection.

Example

Basic query:

q = hb.query("facts")
results = await q.search("enterprise software")

With a schema for type-safe slot access:

from hyperbinder.compose import Triple, Field

schema = Triple(subject=Field("entity"), ...)
q = hb.query("facts", schema=schema)
results = await q.find(subject="Alice")

populate_links(intersection, df, source_column, target_column, weight_column=None) async

Populate a flexible intersection with link data.

Links map source field values to target field values, enabling cross-encoding joins. This method replaces any existing links for the intersection.

Parameters:

  • intersection (Intersection): The flexible intersection to populate (from intersect_flexible()). Required.
  • df (DataFrame): DataFrame containing link pairs. Required.
  • source_column (str): Column name for source values. Required.
  • target_column (str): Column name for target values. Required.
  • weight_column (Optional[str]): Optional column for link weights (each link defaults to weight 1.0). Default: None.

Returns:

  • Dict[str, Any]: Dict with ingestion stats: {"links_created": int, "link_collection": str}.

Raises:

  • ValueError: If the intersection is not in FLEXIBLE mode.

Example

links_df = pd.DataFrame({
    "emp_id": ["EMP001", "EMP002", "EMP003"],
    "topic": ["machine learning", "databases", "cloud computing"],
})

result = await hb.populate_links(ix, links_df, "emp_id", "topic")
print(f"Created {result['links_created']} links")

Retrieve link mappings for join operations.

Internal method used by join() to look up target values for source values.

Parameters:

  • link_collection (str): Name of the link collection. Required.
  • source_values (List[str]): List of source values to look up. Required.

Returns:

  • Dict[str, List[str]]: Dict mapping source_value -> [target_values].

insert_row(collection, *, row, schema) async

Insert a single row using Row molecule encoding (chain binding).

This method uses the dedicated /row/insert/ endpoint, which provides:

  • Chain binding encoding for lossless field extraction
  • Primary key duplicate detection (raises DuplicateKeyError if the row exists)
  • Proper indexing for O(1) PK lookups

Parameters:

  • collection (str): Collection name. Required.
  • row (Dict[str, Any]): Row data including the primary key. Required.
  • schema (BaseMolecule): RelationalTable schema. Required.

Returns:

  • Dict[str, Any]: Dict with insert status and pk info.

Raises:

  • ValueError: If the schema is not a RelationalTable or the row is missing its PK.
  • DuplicateKeyError: If a row with the same PK already exists.

Example

await hb.insert_row(
    "users",
    row={"user_id": "U001", "email": "alice@test.com", "name": "Alice"},
    schema=users_schema,
)

get_row(collection, *, pk_field, pk_value) async

Get a row by primary key (O(1) lookup).

Used by RelationalTable for deterministic PK lookups.

Parameters:

  • collection (str): Collection name. Required.
  • pk_field (str): Name of the primary key field. Required.
  • pk_value (Any): Value to look up. Required.

Returns:

  • Optional[Dict[str, Any]]: Row data dict if found, None if not found.

Raises:

  • ValueError: If multiple rows are found (the PK must be unique).
  • CollectionNotFoundError: If the collection doesn't exist.

Example

row = await hb.get_row("users", pk_field="user_id", pk_value="U001")
if row:
    print(row["email"])

update(collection, *, where, set, schema=None) async

Atomically update a row matching the where clause.

This method performs an atomic update - all fields are updated together in a single operation. For RelationalTable schemas, the row is re-encoded with chain binding to preserve encoding integrity.

Parameters:

  • collection (str): Collection name. Required.
  • where (Dict[str, Any]): Primary key condition, e.g. {"user_id": "U001"}. Required.
  • set (Dict[str, Any]): Fields to update, e.g. {"email": "new@test.com"}. Required.
  • schema (Optional[BaseMolecule]): Optional RelationalTable schema for validation and re-encoding. Default: None.

Returns:

  • Dict[str, Any]: Dict with update status and info, including old/new values.

Raises:

  • ValueError: If trying to update the primary key, or if the PK is missing from where.
  • CollectionNotFoundError: If the row is not found.

Example

await hb.update(
    "users",
    where={"user_id": "U001"},
    set={"email": "new@example.com", "name": "Alice Smith"},
    schema=users_schema,
)

delete(collection, *, where, schema=None) async

Delete a row matching the where clause.

Parameters:

  • collection (str): Collection name. Required.
  • where (Dict[str, Any]): Primary key condition, e.g. {"user_id": "U001"}. Required.
  • schema (Optional[BaseMolecule]): Optional RelationalTable schema for validation. Default: None.

Returns:

  • Dict[str, Any]: Dict with delete status and info.

Raises:

  • ValueError: If the PK is missing from where.
  • CollectionNotFoundError: If the row is not found.

Example

await hb.delete("users", where={"user_id": "U001"})

upsert(collection, *, row, schema) async

Insert or update a row.

If a row with matching primary key exists, update it. Otherwise, insert as new row.

Parameters:

  • collection (str): Collection name. Required.
  • row (Dict[str, Any]): Row data including the primary key. Required.
  • schema (BaseMolecule): RelationalTable schema (required for PK info). Required.

Returns:

  • UpsertResult: UpsertResult indicating whether the row was inserted or updated.

Example

await hb.upsert(
    "users",
    row={"user_id": "U001", "email": "new@example.com", "name": "Alice"},
    schema=users_schema,
)


Method Categories

Health & Status

  • ping(): Check server connectivity
  • is_ready(): Check if the server is ready
  • auth_status(): Get authentication status

Collection Management

  • collection(name): Get the Collection fluent API
  • list_collections(): List all collections with metadata
  • list_collection_names(): List collection names only (convenience)
  • get_collection_info(name): Get collection metadata
  • get_collection_stats(name): Get detailed statistics
  • delete_collection(name): Delete a collection

Data Ingestion

  • ingest(source, collection, schema): Ingest CSV, DataFrame, or documents

Search Operations

  • search(query, collection): Semantic similarity search
  • bundle_search(values, field, collection): Search similar to any example
  • search_prototype(examples, collection): Search using examples as a prototype
  • analogy(a, b, c, field, collection): Analogical reasoning (A:B :: C:?)

SQL-like Operations

  • select(conditions, collection): Filter with conditions
  • aggregate(group_by, field, function): GROUP BY aggregation
  • join(collection, target, on): Cross-collection JOIN

Graph Operations

  • multihop(start, path, collection): Multi-hop traversal

RAG Operations

  • get_context(query, collection): Retrieve context for an LLM
  • ask(query, collection): End-to-end RAG query

Compose Operations

  • query(collection, schema): Get a ComposeQuery builder
  • intersect(source, target): Define a cross-collection relationship (strict mode)
  • intersect_flexible(source, target): Define a cross-encoding relationship (flexible mode)
  • populate_links(intersection, df, ...): Populate links for a flexible intersection

CRUD Operations (RelationalTable)

These methods provide SQL-like row operations for RelationalTable schemas.

  • insert_row(collection, row, schema): Insert a single row
  • get_row(collection, pk_field, pk_value): Get a row by primary key
  • update(collection, where, set, schema): Update a row atomically
  • delete(collection, where, schema): Delete a row by primary key
  • upsert(collection, row, schema): Insert or update a row

Example:

from hybi.compose import RelationalTable, Field, Encoding

schema = RelationalTable(
    columns={
        "user_id": Field(encoding=Encoding.EXACT),
        "email": Field(encoding=Encoding.EXACT),
        "name": Field(encoding=Encoding.SEMANTIC),
    },
    primary_key="user_id",
)

# Insert
hb.insert_row("users", row={"user_id": "U001", "email": "a@test.com", "name": "Alice"}, schema=schema)

# Get
row = hb.get_row("users", pk_field="user_id", pk_value="U001")

# Update
hb.update("users", where={"user_id": "U001"}, set={"email": "new@test.com"}, schema=schema)

# Delete
hb.delete("users", where={"user_id": "U001"}, schema=schema)

# Upsert (insert or update)
hb.upsert("users", row={"user_id": "U001", "email": "upsert@test.com", "name": "Alice"}, schema=schema)

See RelationalTable for more details on schema definition.