Built-in Tools

All 10 built-in tools are registered automatically when you create an AgentSession.

graph LR
    subgraph Data Lifecycle
        ingest["ingest"]
        mutate["mutate"]
        manage["manage"]
    end
    subgraph Query
        search["search"]
        select["select"]
        aggregate["aggregate"]
    end
    subgraph Structure
        navigate["navigate"]
        join["join"]
        connect["connect"]
    end
    subgraph Introspect
        explore["explore"]
    end
    connect -->|enables| join
    explore -.->|discover ops| navigate
    explore -.->|discover schema| select

from hybi.agent import AgentSession

session = AgentSession.create(hb)  # all 10 tools registered
session.tool_registry.tool_names   # ['ingest', 'mutate', 'search', 'select', ...]

ingest

Create collections, add data, or extend schemas.

| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Collection name |
| schema_type | string | no | Compound type for new collections: knowledge_graph, catalog, relational_table, hierarchy, timeline, document, network |
| fields | array | no | Field definitions [{name, encoding}] for new collections |
| rows | array | no | Data rows [{field: value, ...}] to add |
| add_fields | array | no | New fields [{name, encoding}] to add to an existing collection |

Modes (determined by which parameters are provided):

  1. Create + populate: name + schema_type + fields (+ optional rows)
  2. Add rows: name + rows (collection must exist)
  3. Add fields: name + add_fields (extends schema, existing rows get NULL)
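The three modes can be illustrated with a small classifier over the call parameters. This is a minimal sketch, not the library's dispatch logic: `ingest_mode` is a hypothetical helper, the `trades` fields are made up for illustration, and the precedence used when both `add_fields` and `rows` are present is an assumption the source does not specify.

```python
from typing import Any, Dict


def ingest_mode(params: Dict[str, Any]) -> str:
    """Classify an ingest call by which parameters it carries (hypothetical helper)."""
    if not params.get("name"):
        raise ValueError("ingest always requires 'name'")
    if params.get("schema_type") and params.get("fields"):
        return "create"      # rows are optional in this mode
    if params.get("add_fields"):
        return "add_fields"  # assumption: checked before rows
    if params.get("rows"):
        return "add_rows"
    raise ValueError("no ingest mode matches the given parameters")


# Mode 1: create a collection and populate it in one call.
create_call = {
    "name": "trades",
    "schema_type": "relational_table",
    "fields": [
        {"name": "trade_id", "encoding": "exact"},
        {"name": "notes", "encoding": "semantic"},
        {"name": "quantity", "encoding": "numeric"},
        {"name": "trade_date", "encoding": "temporal"},
    ],
    "rows": [
        {"trade_id": "T001", "notes": "opening position",
         "quantity": 100, "trade_date": "2024-03-01"},
    ],
}

# Mode 2: add rows to a collection that already exists.
add_rows_call = {
    "name": "trades",
    "rows": [{"trade_id": "T002", "notes": "hedge",
              "quantity": 40, "trade_date": "2024-03-02"}],
}

# Mode 3: extend the schema; existing rows get NULL for the new field.
add_fields_call = {
    "name": "trades",
    "add_fields": [{"name": "status", "encoding": "exact"}],
}

modes = [ingest_mode(c) for c in (create_call, add_rows_call, add_fields_call)]
```

Each dictionary uses only the parameter names from the table above, so it can serve as a template for the corresponding tool call.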

Schema type guidance

Use relational_table for structured tabular data (spreadsheets, CSVs, SQL-like tables) — it uses lossless Row encoding and supports UPDATE/DELETE. Use catalog for read-heavy searchable data. Use document for unstructured text chunks with metadata.

Field encodings

semantic (text for similarity search), exact (names, IDs, categories), numeric (quantities), temporal (dates in YYYY-MM-DD).

Field conventions

knowledge_graph uses subject, predicate/relation, object. hierarchy uses node, parent. timeline expects timestamps in YYYY-MM-DD format. Field names are heuristically matched — e.g., "date" maps to the timestamp field.


mutate

Update or delete individual rows in a relational_table collection. Only works with collections created with schema_type="relational_table", which uses lossless Row encoding with a primary key.

| Parameter | Type | Required | Description |
|---|---|---|---|
| collection | string | yes | Collection name |
| op | string | yes | "update" or "delete_row" |
| pk | string | yes | Primary key value identifying the row |
| set | object | no | Fields to update: {field_name: new_value} (required for update) |

# Update a row
mutate(collection="trades", op="update", pk="T001", set={"status": "closed"})

# Delete a row
mutate(collection="trades", op="delete_row", pk="T001")

search

Multi-modal HDC search. Provide exactly one of four search modes.

| Parameter | Type | Required | Description |
|---|---|---|---|
| collection | string | no | Collection to search (omit to search all) |
| query | string | no | Text query for semantic search |
| slots | object | no | Per-slot queries with optional weights: {field: value} or {field: [value, weight]} |
| examples | array | no | Example values for prototype search |
| field | string | no | Field name for prototype/analogy (default: "value") |
| analogy | object | no | A:B::C:? analogy: {"a": "king", "b": "queen", "c": "man"} |
| top_k | number | no | Max results (default: 10) |
| as_of | string | no | Point-in-time date (YYYY-MM-DD) for temporal filtering |

Modes:

  • Text (query): Hybrid semantic+symbolic search. Omit collection to search all collections concurrently.
  • Slot-weighted (slots): Per-field encoded queries with independent weights. Requires collection.
  • Prototype (examples + field): Bundles N examples into one prototype vector and finds similar items. Requires collection.
  • Analogy (analogy + field): A:B::C:? vector algebra via encode(B) * encode(A)^H * encode(C). Requires collection.
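The "exactly one mode" rule and the per-mode `collection` requirement can be sketched as a validator. This is an illustrative sketch under stated assumptions: `search_mode` is a hypothetical helper rather than part of the library, and the `drugs` collection and `indication` field are invented for the example.

```python
from typing import Any, Dict

# The parameter that selects each mode, as listed above.
MODE_BY_PARAM = {
    "query": "text",
    "slots": "slot-weighted",
    "examples": "prototype",
    "analogy": "analogy",
}


def search_mode(params: Dict[str, Any]) -> str:
    """Return the search mode implied by the parameters (hypothetical helper)."""
    given = [p for p in MODE_BY_PARAM if params.get(p) is not None]
    if len(given) != 1:
        raise ValueError(f"provide exactly one of {list(MODE_BY_PARAM)}, got {given}")
    selector = given[0]
    # Only text search may omit the collection.
    if selector != "query" and not params.get("collection"):
        raise ValueError(f"{MODE_BY_PARAM[selector]} search requires 'collection'")
    return MODE_BY_PARAM[selector]


text_call = {"query": "beta blockers for hypertension", "top_k": 5}
slot_call = {
    "collection": "drugs",
    "slots": {"name": "Vexoril", "indication": ["hypertension", 2.0]},
}
prototype_call = {
    "collection": "drugs",
    "examples": ["Vexoril", "Cardizyn"],
    "field": "name",
}
analogy_call = {
    "collection": "drugs",
    "analogy": {"a": "king", "b": "queen", "c": "man"},
    "field": "value",
}

modes = [search_mode(c) for c in (text_call, slot_call, prototype_call, analogy_call)]
```

Checking the parameters before issuing the call makes it easier to catch a request that mixes two modes or omits a required `collection`.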

select

SQL-like select with column projection, filtering, ordering, and limit. Supports single or batch mode.

| Parameter | Type | Required | Description |
|---|---|---|---|
| collection | string | no | Single collection to query |
| collections | array | no | Multiple collections for batch query |
| columns | array | no | Columns to return (default: all) |
| where | array | no | Filter conditions as [column, operator, value] triples |
| order_by | array | no | Sort spec: [["col", "DESC"]] |
| limit | number | no | Max rows (default: 100) |

Provide either collection (single) or collections (batch). Batch mode returns results keyed by collection name.

Where operators: =, !=, <, >, <=, >=, like, in

# Example where clauses
[["patent_expiry", "<", "2025-01"]]
[["name", "in", ["Vexoril", "Cardizyn"]]]
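To make the triple format concrete, here is a minimal in-memory evaluator for where clauses. It is a sketch and not the tool's implementation. Three things are assumptions not stated in the source: multiple conditions are combined with AND, `like` uses SQL-style `%` and `_` wildcards, and dates compare as ISO strings. The sample rows are invented.

```python
import operator
import re
from typing import Any, Callable, Dict, List


def _like(value: Any, pattern: str) -> bool:
    # Assumption: SQL-style wildcards, % = any run of characters, _ = one character.
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch) for ch in pattern
    )
    return re.fullmatch(regex, str(value)) is not None


OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
    "like": _like,
    "in": lambda value, options: value in options,
}


def matches(row: Dict[str, Any], where: List[list]) -> bool:
    """True if the row satisfies every [column, operator, value] triple (assumed AND)."""
    return all(OPERATORS[op](row[column], value) for column, op, value in where)


# Invented sample rows, for illustration only.
rows = [
    {"name": "Vexoril", "patent_expiry": "2024-06-30"},
    {"name": "Cardizyn", "patent_expiry": "2027-01-15"},
]

# ISO-formatted dates sort correctly as plain strings.
expiring = [r["name"] for r in rows if matches(r, [["patent_expiry", "<", "2025-01"]])]
listed = [r["name"] for r in rows if matches(r, [["name", "in", ["Vexoril", "Cardizyn"]]])]
```

With these rows, `expiring` contains only Vexoril and `listed` contains both names.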

aggregate

Group-by aggregation with computed metrics.

| Parameter | Type | Required | Description |
|---|---|---|---|
| collection | string | yes | Collection to aggregate |
| group_by | array | yes | Columns to group by |
| metrics | array | yes | Aggregation specs as [column, operation, alias] triples |
| where | array | no | Filter conditions before aggregation (same format as select) |

Operations: sum, avg, min, max, count, stddev

# Example: average cost by therapeutic area
metrics = [["daily_cost_usd", "avg", "avg_cost"]]
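The following sketch shows what a `group_by` plus `metrics` request computes, using plain Python over in-memory rows. It is not the tool's implementation: the `aggregate` function here is a local stand-in, the `therapeutic_area` column name and the sample rows are assumptions, and `stddev` is left out because the source does not say whether it is the sample or population form.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List

# stddev is omitted on purpose; see the note above.
REDUCERS = {"sum": sum, "avg": mean, "min": min, "max": max, "count": len}


def aggregate(rows: List[Dict[str, Any]], group_by: List[str],
              metrics: List[list]) -> List[Dict[str, Any]]:
    """Group rows by the given columns and apply each [column, operation, alias] triple."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in group_by)].append(row)

    results = []
    for key, members in groups.items():
        result = dict(zip(group_by, key))
        for column, operation, alias in metrics:
            result[alias] = REDUCERS[operation]([m[column] for m in members])
        results.append(result)
    return results


# Invented sample rows, for illustration only.
rows = [
    {"therapeutic_area": "cardiology", "daily_cost_usd": 4.0},
    {"therapeutic_area": "cardiology", "daily_cost_usd": 6.0},
    {"therapeutic_area": "oncology", "daily_cost_usd": 30.0},
]

summary = aggregate(
    rows,
    group_by=["therapeutic_area"],
    metrics=[["daily_cost_usd", "avg", "avg_cost"], ["daily_cost_usd", "count", "n"]],
)
```

Each output row carries the grouping columns plus one key per alias, which is why every metric triple needs a distinct alias.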

join

Cross-collection join via a declared intersection.

| Parameter | Type | Required | Description |
|---|---|---|---|
| source_collection | string | yes | Source collection |
| target_collection | string | yes | Target collection |
| query | string | no | Optional search query to filter source rows before joining |

Warning

Requires a prior connect() between the two collections. The optional query parameter requires embedding configuration.


navigate

Schema-native structural operations. The available operations depend on the compound type.

| Parameter | Type | Required | Description |
|---|---|---|---|
| collection | string | yes | Collection to navigate |
| op | string | yes | Operation to perform (see below) |
| entity | string | no | Entity/node value (graph ops) |
| edge | string | no | Edge/relation type filter (neighbors) |
| direction | string | no | "outgoing", "incoming", "both" (default: "outgoing") |
| from_entity | string | no | Start node (path) |
| to_entity | string | no | End node (path) |
| path | array | no | Relation types to follow (traverse) |
| mode | string | no | Traverse mode: "exact" (default) or "fuzzy" (semantic similarity with beam search) |
| threshold | number | no | Similarity threshold for fuzzy traverse (default: 0.5) |
| node | string | no | Node value (tree ops) |
| depth | number | no | Max depth (ancestors/descendants) |
| timestamp | string | no | Timestamp for at_time (YYYY-MM-DD) |
| start | string | no | Start of time range |
| end | string | no | End of time range |
| value | string | no | Value for temporal when query |
| top_k | number | no | Max results (default: 10) |
| max_hops | number | no | Max hops for path search (default: 5) |

Operations by schema type:

| Schema Type | Operations |
|---|---|
| knowledge_graph | neighbors, path, traverse |
| network | neighbors, path, traverse |
| hierarchy | children, parent, ancestors, descendants |
| timeline | at_time, time_range, when |

Use explore(collection=...) to see which operations are available for a specific collection.
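The table above translates directly into a lookup that can reject an operation before it is sent. This is a sketch: `check_navigate` is a hypothetical helper, the collection names and entity values are invented, and treating schema types absent from the table (catalog, relational_table, document) as having no navigate operations is an assumption.

```python
from typing import Dict, Set

NAVIGATE_OPS: Dict[str, Set[str]] = {
    "knowledge_graph": {"neighbors", "path", "traverse"},
    "network": {"neighbors", "path", "traverse"},
    "hierarchy": {"children", "parent", "ancestors", "descendants"},
    "timeline": {"at_time", "time_range", "when"},
}


def check_navigate(schema_type: str, op: str) -> None:
    """Raise if the operation is not listed for the schema type (hypothetical helper)."""
    allowed = NAVIGATE_OPS.get(schema_type, set())  # assumption: unlisted types allow nothing
    if op not in allowed:
        raise ValueError(
            f"{op!r} is not available for {schema_type!r}; allowed: {sorted(allowed)}"
        )


# One example payload per family, using only parameters from the table above.
graph_call = {"collection": "drug_graph", "op": "neighbors",
              "entity": "Vexoril", "direction": "both"}
tree_call = {"collection": "org_chart", "op": "ancestors",
             "node": "Team A", "depth": 3}
time_call = {"collection": "approvals", "op": "time_range",
             "start": "2024-01-01", "end": "2024-12-31"}

check_navigate("knowledge_graph", graph_call["op"])
check_navigate("hierarchy", tree_call["op"])
check_navigate("timeline", time_call["op"])
```

The payloads pair each operation with the parameters the table associates with it: `entity` for graph operations, `node` and `depth` for tree operations, and `start` and `end` for a time range.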


explore

Inspect the data landscape. Three modes based on which parameters are provided.

| Parameter | Type | Required | Description |
|---|---|---|---|
| collection | string | no | Collection to inspect (omit for session overview) |
| fields | array | no | Field names to get distinct values for |

Modes:

  1. Session overview (no args): All collections with schema types, row counts, intersections, and available operations.
  2. Collection detail (collection): Schema, fields with encodings, sample rows, key field value distributions, related intersections, and available operations.
  3. Field values (collection + fields): Distinct values per field, sorted by frequency.

connect

Declare a cross-collection intersection between two fields. Enables join queries.

| Parameter | Type | Required | Description |
|---|---|---|---|
| source | string | yes | Source field in "collection.field" format |
| target | string | yes | Target field in "collection.field" format |
| relation | string | no | "auto" (default), "identity", or "semantic". Use "identity" for exact-match fields (names, IDs) |

Note

Declare intersections after both collections have data.
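The ordering constraint between connect and join can be sketched with a small bookkeeping model. This is illustrative only: `parse_field_ref`, `record_connect`, and `can_join` are hypothetical helpers, the `drugs` and `companies` collections are invented, and treating an intersection as an unordered pair of collections is an assumption the source does not confirm.

```python
from typing import Dict, FrozenSet, Set, Tuple


def parse_field_ref(ref: str) -> Tuple[str, str]:
    """Split a "collection.field" reference into its two parts."""
    collection, sep, field = ref.partition(".")
    if not (sep and collection and field):
        raise ValueError(f"expected 'collection.field', got {ref!r}")
    return collection, field


# Collection pairs that have a declared intersection (assumed unordered).
declared: Set[FrozenSet[str]] = set()


def record_connect(call: Dict[str, str]) -> None:
    source_collection, _ = parse_field_ref(call["source"])
    target_collection, _ = parse_field_ref(call["target"])
    declared.add(frozenset((source_collection, target_collection)))


def can_join(call: Dict[str, str]) -> bool:
    pair = frozenset((call["source_collection"], call["target_collection"]))
    return pair in declared


connect_call = {
    "source": "drugs.manufacturer",
    "target": "companies.name",
    "relation": "identity",  # exact-match field, so "identity" rather than "auto"
}
join_call = {"source_collection": "drugs", "target_collection": "companies"}

joinable_before = can_join(join_call)  # no intersection declared yet
record_connect(connect_call)
joinable_after = can_join(join_call)   # intersection now declared
```

In a real session the sequence would be: ingest both collections, connect the two fields, then join.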


manage

Collection lifecycle operations.

| Parameter | Type | Required | Description |
|---|---|---|---|
| collection | string | yes | Collection name |
| op | string | yes | "delete" or "rename" |
| new_name | string | no | New name (required for rename) |

  • delete: Permanently deletes the collection and all its data. Also removes any intersections referencing it.
  • rename: Migrates all data to the new name and updates intersection references.
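The effect of both operations on intersections can be modeled with a dictionary of collections and a list of field pairs. This is a toy model of the documented behavior, not the library's storage layer. The function names, the data layout, and the sample collections are all assumptions made for the illustration.

```python
from typing import Any, Dict, List, Tuple

FieldRef = Tuple[str, str]               # (collection, field)
Intersection = Tuple[FieldRef, FieldRef]


def delete_collection(collections: Dict[str, Any],
                      intersections: List[Intersection], name: str) -> None:
    """Drop the collection and every intersection that references it."""
    del collections[name]
    intersections[:] = [
        pair for pair in intersections
        if name not in (pair[0][0], pair[1][0])
    ]


def rename_collection(collections: Dict[str, Any],
                      intersections: List[Intersection], old: str, new: str) -> None:
    """Move the data under the new name and rewrite intersection references."""
    collections[new] = collections.pop(old)
    intersections[:] = [
        tuple((new if coll == old else coll, field) for coll, field in pair)
        for pair in intersections
    ]


# Invented sample state, for illustration only.
collections = {"drugs": ["row"], "companies": ["row"], "trials": ["row"]}
intersections = [
    (("drugs", "manufacturer"), ("companies", "name")),
    (("trials", "drug"), ("drugs", "name")),
]

rename_collection(collections, intersections, "drugs", "medicines")
after_rename = list(intersections)

delete_collection(collections, intersections, "companies")
after_delete = list(intersections)
```

After the rename, both intersections point at `medicines`. After the delete, only the `trials` to `medicines` intersection remains.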