Sandbox

The sandbox lets agents write Python code that calls HyperBinder tools as async functions. Only what the script prints enters the LLM's context, so a multi-step query costs one round trip instead of one per tool call.

flowchart LR
    LLM["LLM writes<br/>Python script"] --> Validate["AST<br/>Validation"]
    Validate --> Exec["Execute in<br/>sandbox namespace"]
    Exec --> Tools["search(), select(),<br/>navigate(), analogy(), ..."]
    Tools --> Print["print() output<br/>captured"]
    Print --> Return["Only printed text<br/>returns to LLM"]

Security

Two tiers of isolation:

| Tier | Mode | Protection |
|------|------|------------|
| Tier 1 | Always active | AST validation blocks dunder access, imports, and dangerous builtins; a loop guard caps each loop at 10,000 iterations |
| Tier 2 | Opt-in | Subprocess isolation: code runs in a child process with no access to parent memory, session, or imports |

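# Enable subprocess isolation.
# Set the flag on the module itself: "from ... import SUBPROCESS_ISOLATION"
# followed by an assignment would only rebind a local name.
import hybi.agent.sandbox as sandbox
sandbox.SUBPROCESS_ISOLATION = True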
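
To make the Tier 2 idea concrete, here is a minimal standard-library sketch of running a snippet in a separate interpreter. The helper name `run_isolated`, the timeout, and the use of Python's `-I` flag are assumptions chosen for illustration; the page does not describe how HyperBinder launches its child process.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 5.0) -> str:
    """Run `code` in a fresh child interpreter and return its stdout.

    Hypothetical helper: the child shares no objects with the parent,
    so parent variables and sessions are unreachable from `code`.
    """
    proc = subprocess.run(
        # -I starts Python in isolated mode (ignores env vars and user site).
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip())
    return proc.stdout


secret = "parent-only value"
# The child process has its own globals, so the parent's name is absent.
print(run_isolated("print('secret' in globals())"))
```

The trade-off is startup cost: each call pays for a new interpreter, which is presumably why this tier is opt-in.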

AST Validation

The _SandboxValidator blocks:

  • Dunder access: __class__, __subclasses__, __globals__, __import__, etc.
  • Dangerous builtins: getattr, eval, exec, compile, open, input
  • All imports: import and from ... import statements
  • Oversized literals: string and bytes literals larger than 10 KB
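
The checks listed above can be approximated with an `ast.NodeVisitor`. The following is a sketch only: `SketchValidator`, `validate`, and the error messages are hypothetical names, and the real _SandboxValidator may apply different or additional rules.

```python
import ast
from typing import List

BLOCKED_BUILTINS = {"getattr", "eval", "exec", "compile", "open", "input"}
MAX_LITERAL_SIZE = 10 * 1024  # assumed to mean 10 KB; measured with len() here


def _is_dunder(name: str) -> bool:
    return name.startswith("__") and name.endswith("__")


class SketchValidator(ast.NodeVisitor):
    """Illustrative stand-in for _SandboxValidator; collects violations."""

    def __init__(self) -> None:
        self.errors: List[str] = []

    def visit_Import(self, node: ast.Import) -> None:
        self.errors.append(f"line {node.lineno}: import statements are not allowed")

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        self.errors.append(f"line {node.lineno}: import statements are not allowed")

    def visit_Attribute(self, node: ast.Attribute) -> None:
        if _is_dunder(node.attr):
            self.errors.append(f"line {node.lineno}: dunder access '{node.attr}'")
        self.generic_visit(node)  # keep checking the object being accessed

    def visit_Name(self, node: ast.Name) -> None:
        if node.id in BLOCKED_BUILTINS or _is_dunder(node.id):
            self.errors.append(f"line {node.lineno}: blocked name '{node.id}'")

    def visit_Constant(self, node: ast.Constant) -> None:
        if isinstance(node.value, (str, bytes)) and len(node.value) > MAX_LITERAL_SIZE:
            self.errors.append(f"line {node.lineno}: literal is too large")


def validate(source: str) -> List[str]:
    """Return a list of violations; an empty list means the code passed."""
    validator = SketchValidator()
    validator.visit(ast.parse(source))
    return validator.errors


print(validate("import os"))
print(validate("x = (1).__class__"))
print(validate("total = sum([1, 2, 3])"))  # passes: prints []
```

Working on the syntax tree rather than on the source text means the checks cannot be bypassed with whitespace or comment tricks, although a name-based blocklist still depends on the execution namespace not exposing the blocked objects under other names.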

The _LoopGuardTransformer injects iteration counters into for and while loops, raising RuntimeError after 10,000 iterations.
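
One way to inject such counters is an `ast.NodeTransformer` that rewrites each loop before compilation. This sketch assumes a per-loop counter that resets each time the loop is entered; `SketchLoopGuard`, `run_guarded`, and the counter variable names are hypothetical, and the real _LoopGuardTransformer may differ.

```python
import ast

MAX_ITERATIONS = 10_000


class SketchLoopGuard(ast.NodeTransformer):
    """Illustrative loop guard: adds a counter check to every for/while body."""

    def __init__(self) -> None:
        self._loops_seen = 0

    def _guard(self, node):
        self.generic_visit(node)  # rewrite nested loops first
        counter = f"_loop_counter_{self._loops_seen}"
        self._loops_seen += 1
        # Statement that resets the counter just before the loop starts.
        init = ast.parse(f"{counter} = 0").body[0]
        # Statements prepended to the loop body: count, then check the cap.
        check = ast.parse(
            f"{counter} += 1\n"
            f"if {counter} > {MAX_ITERATIONS}:\n"
            f"    raise RuntimeError('loop exceeded {MAX_ITERATIONS} iterations')"
        ).body
        node.body = check + node.body
        return [init, node]  # one statement is replaced by two

    visit_For = _guard
    visit_While = _guard


def run_guarded(source: str) -> dict:
    """Compile `source` with loop guards and return the resulting namespace."""
    tree = SketchLoopGuard().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace: dict = {}
    exec(compile(tree, "<sandbox>", "exec"), namespace)
    return namespace


print(run_guarded("total = 0\nfor i in range(100):\n    total += i")["total"])  # 4950

try:
    run_guarded("while True:\n    pass")
except RuntimeError as exc:
    print(exc)  # loop exceeded 10000 iterations
```

Because the counter for a nested loop is reset inside the enclosing loop's body, the cap in this sketch applies to each loop run separately rather than to the script as a whole.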

Code Executor

The executor builds a namespace with async wrappers for each builtin tool, then executes the agent's code in that namespace.

from hybi.agent.sandbox import get_execute_code_tool

# Get the execute_code tool definition (opt-in, not auto-registered)
tool = get_execute_code_tool()
session.tool_registry.register(tool)

Execution Namespace

build_execution_namespace(session) creates a dict with:

  • Async function wrappers for all builtin tools (matching their handler signatures)
  • Search aliases: search_slots(), search_prototype(), analogy(), traverse()
  • HDC operations: decompose() — unbind stored rows to reveal field associations
  • Pre-loaded utilities: json, defaultdict, Counter, OrderedDict, math
  • Active CollectionView objects injected by name
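
The general shape of such a namespace can be sketched without HyperBinder itself. In the example below, `build_namespace_sketch`, `run_in_namespace`, and `fake_search` are hypothetical stand-ins, and wrapping the agent's code in an `async def` is just one assumed way to support top-level `await`.

```python
import asyncio
import json
import math
import textwrap
from collections import Counter, OrderedDict, defaultdict


def make_async_wrapper(handler):
    """Wrap a synchronous handler so sandbox code can `await` it."""
    async def wrapper(*args, **kwargs):
        return handler(*args, **kwargs)
    return wrapper


def build_namespace_sketch(handlers: dict) -> dict:
    """Build a dict of async tool wrappers plus pre-loaded utilities."""
    namespace = {name: make_async_wrapper(fn) for name, fn in handlers.items()}
    namespace.update(
        json=json, math=math,
        defaultdict=defaultdict, Counter=Counter, OrderedDict=OrderedDict,
    )
    return namespace


def run_in_namespace(source: str, namespace: dict) -> None:
    """Run agent code that uses `await` by wrapping it in a coroutine."""
    wrapped = "async def _sandbox_main():\n" + textwrap.indent(source, "    ")
    exec(wrapped, namespace)
    asyncio.run(namespace["_sandbox_main"]())


# Hypothetical stand-in for a real tool handler.
def fake_search(collection, query, top_k=10):
    return [{"collection": collection, "query": query, "rank": i} for i in range(top_k)]


ns = build_namespace_sketch({"search": fake_search})
run_in_namespace(
    'hits = await search("drugs", "kinase", top_k=3)\n'
    'print(json.dumps([h["rank"] for h in hits]))',
    ns,
)  # prints [0, 1, 2]
```

Since the script sees only the names placed in the dict, the namespace doubles as an allowlist: anything not injected is simply undefined.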

HDC Operations

These functions expose HyperBinder's distinctive algebraic capabilities at a task-oriented level:

| Function | What it does |
|----------|--------------|
| analogy(collection, a, b, c, field) | A:B::C:? Finds what relates to C the way B relates to A |
| search_prototype(collection, examples, field) | Bundles N examples and finds similar items; no traditional equivalent |
| decompose(collection, row_ids, fields) | Unbinds rows to reveal what each field contributes to the encoding |

# Analogy: Apple:Red :: Banana:?
result = await analogy("foods", "Apple", "Red", "Banana", field="name")

# Prototype: find drugs similar to these three
result = await search_prototype("drugs", ["Aspirin", "Ibuprofen", "Naproxen"], field="name")

# Decompose: what do rows 0 and 1 encode in the 'relation' and 'object' fields?
result = await decompose("mechanisms", [0, 1], ["relation", "object"])
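
For readers new to hyperdimensional computing, the analogy operation can be illustrated with generic bipolar hypervectors. This is a textbook-style sketch under stated assumptions (random ±1 vectors, elementwise multiplication as binding, addition as bundling); it is not HyperBinder's encoding, and every name in it is illustrative.

```python
import random

DIM = 2048
rng = random.Random(0)  # fixed seed so the run is repeatable


def hv():
    """Random bipolar hypervector; two such vectors are nearly orthogonal."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Elementwise product; binding with the same vector again undoes it."""
    return [x * y for x, y in zip(a, b)]


def bundle(*vectors):
    """Elementwise sum; the result stays similar to each input."""
    return [sum(xs) for xs in zip(*vectors)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


items = {name: hv() for name in ("Apple", "Banana", "Red", "Yellow", "Green")}
NAME, COLOR = hv(), hv()  # role vectors for the two fields

# Each record bundles its role-filler bindings.
apple = bundle(bind(NAME, items["Apple"]), bind(COLOR, items["Red"]))
banana = bundle(bind(NAME, items["Banana"]), bind(COLOR, items["Yellow"]))

# Apple:Red :: Banana:?  Unbinding Red from the Apple record leaves roughly
# the COLOR role; applying that to the Banana record exposes its color.
relation = bind(apple, items["Red"])
query = bind(banana, relation)
answer = max(items, key=lambda name: dot(query, items[name]))
print(answer)  # Yellow
```

The answer emerges from similarity search over noisy vectors rather than from a lookup, which is why results from operations like these are ranked by score instead of being exact matches.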

Output Capture

All print() output is captured to a StringIO buffer, capped at 16KB. Only this output is returned to the agent — intermediate state stays in the sandbox.
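
The capture step can be sketched with `contextlib.redirect_stdout`. The helper name `capture_output` and the truncation marker are assumptions; the page does not say how output beyond the cap is handled.

```python
import contextlib
import io

MAX_OUTPUT_SIZE = 16 * 1024  # 16 KB cap; measured in characters in this sketch


def capture_output(source: str, namespace: dict) -> str:
    """Execute `source` and return only what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    text = buffer.getvalue()
    if len(text) > MAX_OUTPUT_SIZE:
        # Assumed behaviour: keep the head and flag the cut.
        text = text[:MAX_OUTPUT_SIZE] + "\n[output truncated]"
    return text


out = capture_output("rows = list(range(1000))\nprint(sum(rows))", {})
print(repr(out))  # '499500\n'
```

Here the 1,000-element list never leaves the namespace; only the seven characters that were printed would be sent back to the agent.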

CollectionView

An object-oriented proxy for a single collection, injected into the sandbox namespace under the collection's name.

# Inside sandbox code, the agent writes:
results = await drugs.search("kinase inhibitor", top_k=5)
values = await drugs.distinct("mechanism")
data = await drugs.filter(status="approved")

Properties

| Property | Returns | Description |
|----------|---------|-------------|
| name | str | Collection name |
| schema | dict | Compound schema details |
| fields | List[str] | Field names |
| rows | int | Row count |
| key_values | Dict[str, List[str]] | Distinct values for key fields |
| intersects_with | List[dict] | Related intersections |
| available_operations | dict | Valid search/navigate/tabular ops |

Methods

All methods are async and return ResultSet objects, except distinct, which returns a list:

  • search(query, top_k=10) — semantic text search
  • search_slot(query, slot, top_k=10) — search within a specific field
  • search_slots(top_k=10, **slot_queries) — multi-slot keyword search: await coll.search_slots(subject="oil")
  • bundle_search(values, field="value", top_k=10) — prototype search from example values
  • filter(limit=100, **predicates) — exact-match filter: await coll.filter(exchange="NYMEX")
  • distinct(field) — distinct values for a field (returns List[str])
  • extract(row_ids, fields=None) — extract field values for specific rows by ID
  • analogy(a, b, c, field="value") — A:B::C:? vector algebra
  • navigate(op, **kwargs) — schema-native traversal
  • traverse(start, start_slot, path, top_k=10) — fuzzy traversal shortcut

ResultSet

Typed container for query results. Holds tabular data and an optional HDC vector for geometric composition.

rs = await drugs.search("kinase", top_k=5)
print(rs)           # formatted table
print(rs.rows)      # list of dicts
print(len(rs))      # row count
print(rs.fields)    # field names
for row in rs:       # iteration
    print(row)

| Attribute | Type | Description |
|-----------|------|-------------|
| rows | List[dict] | Result rows |
| row_ids | List[int] | Row IDs from the backend |
| scores | List[float] | Similarity scores (if from search) |
| collection_name | str | Source collection name |
| fields | List[str] | Field names |
| vector | optional | Bundled HDC vector (local mode only) |
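
To show how a container with these attributes can support `len()`, iteration, and table-style printing, here is a minimal dataclass sketch. `ResultSetSketch` is a hypothetical stand-in, the table format is invented, and the sample row is made-up data; the real ResultSet may be implemented quite differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional


@dataclass
class ResultSetSketch:
    """Illustrative container mirroring the documented attributes."""

    rows: List[Dict[str, Any]]
    row_ids: List[int]
    collection_name: str
    scores: List[float] = field(default_factory=list)
    vector: Optional[Any] = None  # placeholder for a bundled HDC vector

    @property
    def fields(self) -> List[str]:
        # Derived from the first row; assumes all rows share the same keys.
        return list(self.rows[0].keys()) if self.rows else []

    def __len__(self) -> int:
        return len(self.rows)

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        return iter(self.rows)

    def __str__(self) -> str:
        header = " | ".join(self.fields)
        lines = [" | ".join(str(row.get(f, "")) for f in self.fields) for row in self.rows]
        return "\n".join([header, *lines])


rs = ResultSetSketch(
    rows=[{"name": "Imatinib", "mechanism": "kinase inhibitor"}],  # made-up row
    row_ids=[7],
    collection_name="drugs",
    scores=[0.91],
)
print(rs)        # formatted table
print(len(rs))   # 1
for row in rs:   # iteration yields row dicts
    print(row["name"])
```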