Sandbox¶
The sandbox lets agents write Python code that calls HyperBinder tools as async functions. Only what the script prints with print() is returned to the model's context, so a multi-step task does not need a separate round trip for each tool call.
flowchart LR
LLM["LLM writes<br/>Python script"] --> Validate["AST<br/>Validation"]
Validate --> Exec["Execute in<br/>sandbox namespace"]
Exec --> Tools["search(), select(),<br/>navigate(), analogy(), ..."]
Tools --> Print["print() output<br/>captured"]
Print --> Return["Only printed text<br/>returns to LLM"]
Security¶
Two tiers of isolation:
| Tier | Mode | Protection |
|---|---|---|
| Tier 1 | Always active | AST validation blocks dunder access, imports, dangerous builtins; loop guard caps iterations at 10k per loop |
| Tier 2 | Opt-in | Subprocess isolation — code runs in a child process with no access to parent memory, session, or imports |
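# Enable subprocess isolation.
# Set the flag on the module itself: rebinding a name imported with
# "from ... import" would only change the local copy.
import hybi.agent.sandbox as sandbox
sandbox.SUBPROCESS_ISOLATION = True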
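To make the Tier 2 idea concrete, the following is a minimal sketch of running a script in a child interpreter and returning only its stdout. It is not HyperBinder's implementation: `run_isolated` and its `timeout` parameter are hypothetical names, and how the real sandbox forwards tool calls to the child process is not covered here.

```python
import subprocess
import sys


def run_isolated(source: str, timeout: float = 10.0) -> str:
    """Run `source` in a fresh interpreter and return what it printed.

    Hypothetical helper: the child process shares no Python objects with
    the parent, so parent variables and imported modules are not visible.
    """
    proc = subprocess.run(
        # -I starts Python in isolated mode (ignores environment variables
        # such as PYTHONPATH and the user site-packages directory).
        [sys.executable, "-I", "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip())
    return proc.stdout
```

A call such as `run_isolated("print(2 + 3)")` returns the printed text only; any state created by the script disappears with the child process.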
AST Validation¶
The _SandboxValidator blocks:
- Dunder access: `__class__`, `__subclasses__`, `__globals__`, `__import__`, etc.
- Dangerous builtins: `getattr`, `eval`, `exec`, `compile`, `open`, `input`
- All imports: `import` and `from ... import` statements
- Oversized literals: string and byte literals above 10KB
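The checks listed above can be approximated with a standard `ast.NodeVisitor`. The sketch below is an illustration only: `SketchValidator` and `validate` are hypothetical names, the blocklist is copied from the list above, and the real `_SandboxValidator` may check more node types or report errors differently.

```python
import ast

# Blocklist and size cap taken from the list above (assumed values).
BLOCKED_BUILTINS = {"getattr", "eval", "exec", "compile", "open", "input"}
MAX_LITERAL_LENGTH = 10 * 1024  # characters for str, bytes for bytes


def _is_dunder(name: str) -> bool:
    return name.startswith("__") and name.endswith("__")


class SketchValidator(ast.NodeVisitor):
    """Hypothetical stand-in for _SandboxValidator: collects violations."""

    def __init__(self):
        self.errors = []

    def visit_Import(self, node):
        self.errors.append(f"line {node.lineno}: import is blocked")

    def visit_ImportFrom(self, node):
        self.errors.append(f"line {node.lineno}: from-import is blocked")

    def visit_Attribute(self, node):
        if _is_dunder(node.attr):
            self.errors.append(f"line {node.lineno}: dunder access {node.attr}")
        self.generic_visit(node)

    def visit_Name(self, node):
        if node.id in BLOCKED_BUILTINS or _is_dunder(node.id):
            self.errors.append(f"line {node.lineno}: blocked name {node.id}")

    def visit_Constant(self, node):
        value = node.value
        if isinstance(value, (str, bytes)) and len(value) > MAX_LITERAL_LENGTH:
            self.errors.append(f"line {node.lineno}: oversized literal")


def validate(source: str) -> list:
    """Return a list of violation messages; an empty list means the code passed."""
    validator = SketchValidator()
    validator.visit(ast.parse(source))
    return validator.errors
```

Collecting all violations rather than stopping at the first one is a design choice of this sketch; it lets the agent fix several problems in one retry.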
The _LoopGuardTransformer injects iteration counters into for and while loops, raising RuntimeError after 10,000 iterations.
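One way to inject such counters is an `ast.NodeTransformer` that rewrites each loop before compilation. The sketch below is an assumption about the general technique, not the actual `_LoopGuardTransformer`: `SketchLoopGuard`, `run_guarded`, and the counter variable names are hypothetical, and the counter here resets each time a loop statement is entered.

```python
import ast

MAX_ITERATIONS = 10_000  # cap stated above


class SketchLoopGuard(ast.NodeTransformer):
    """Hypothetical loop guard: adds a counter check to every for/while body."""

    def __init__(self):
        self._count = 0

    def _guard(self, node):
        self.generic_visit(node)  # rewrite nested loops first
        name = f"_loop_counter_{self._count}"
        self._count += 1
        # Statement that resets the counter just before the loop starts.
        init = ast.parse(f"{name} = 0").body
        # Statements placed at the top of the loop body.
        check = ast.parse(
            f"{name} += 1\n"
            f"if {name} > {MAX_ITERATIONS}:\n"
            f"    raise RuntimeError('loop exceeded {MAX_ITERATIONS} iterations')"
        ).body
        node.body = check + node.body
        return init + [node]  # replace the loop with [counter reset, loop]

    visit_For = _guard
    visit_While = _guard


def run_guarded(source: str) -> dict:
    """Rewrite loops in `source`, execute it, and return the resulting namespace."""
    tree = SketchLoopGuard().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<sandbox>", "exec"), namespace)
    return namespace
```

Because the check sits at the top of the loop body, a `continue` statement cannot skip it. This sketch does not handle `async for` loops.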
Code Executor¶
The executor builds a namespace with async wrappers for each builtin tool, then executes the agent's code in that namespace.
from hybi.agent.sandbox import get_execute_code_tool
# Get the execute_code tool definition (opt-in, not auto-registered)
tool = get_execute_code_tool()
session.tool_registry.register(tool)
Execution Namespace¶
build_execution_namespace(session) creates a dict with:
- Async function wrappers for all builtin tools (matching their handler signatures)
- Search aliases: `search_slots()`, `search_prototype()`, `analogy()`, `traverse()`
- HDC operations: `decompose()`, which unbinds stored rows to reveal field associations
- Pre-loaded utilities: `json`, `defaultdict`, `Counter`, `OrderedDict`, `math`
- Active `CollectionView` objects injected by name
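The namespace described above can be pictured as a plain dict that is passed to the interpreter as the script's globals. The following sketch assumes synchronous handlers wrapped as coroutines and uses Python's `ast.PyCF_ALLOW_TOP_LEVEL_AWAIT` compile flag so agent code can use `await` at the top level. `build_namespace_sketch`, `run_agent_code`, and the `handlers` mapping are hypothetical; the real `build_execution_namespace(session)` takes a session and may work differently.

```python
import ast
import asyncio
import inspect
import json
import math
from collections import Counter, OrderedDict, defaultdict


def _make_async(handler):
    # Wrap a plain function so agent code can call it with `await`.
    async def wrapper(*args, **kwargs):
        return handler(*args, **kwargs)
    return wrapper


def build_namespace_sketch(handlers: dict) -> dict:
    """Hypothetical: pre-loaded utilities plus one async wrapper per tool."""
    namespace = {
        "json": json,
        "math": math,
        "defaultdict": defaultdict,
        "Counter": Counter,
        "OrderedDict": OrderedDict,
    }
    for name, handler in handlers.items():
        namespace[name] = _make_async(handler)
    return namespace


def run_agent_code(source: str, namespace: dict) -> dict:
    """Execute `source` with top-level await enabled; return the namespace."""
    code = compile(source, "<sandbox>", "exec",
                   flags=ast.PyCF_ALLOW_TOP_LEVEL_AWAIT)
    result = eval(code, namespace)
    # With the flag set, code that awaits evaluates to a coroutine.
    if inspect.iscoroutine(result):
        asyncio.run(result)
    return namespace
```

For example, with a stub `search` handler registered, the agent script `hits = await search('oil', top_k=2)` leaves `hits` in the namespace after `run_agent_code` returns.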
HDC Operations¶
These functions expose HyperBinder's distinctive algebraic capabilities at a task-oriented level:
| Function | What it does |
|---|---|
| `analogy(collection, a, b, c, field)` | A:B::C:? Find what relates to C the way B relates to A |
| `search_prototype(collection, examples, field)` | Bundle N examples and find similar items; no traditional equivalent |
| `decompose(collection, row_ids, fields)` | Unbind rows to reveal what each field contributes to the encoding |
# Analogy: Apple:Red :: Banana:?
result = await analogy("foods", "Apple", "Red", "Banana", field="name")
# Prototype: find drugs similar to these three
result = await search_prototype("drugs", ["Aspirin", "Ibuprofen", "Naproxen"], field="name")
# Decompose: what are rows 0 and 1 most associated with in the 'relation' and 'object' fields?
result = await decompose("mechanisms", [0, 1], ["relation", "object"])
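For readers unfamiliar with the bundling and unbinding that these functions refer to, here is a generic hyperdimensional computing sketch in pure Python. It uses random bipolar vectors, elementwise multiplication for binding, and majority vote for bundling. This is a textbook construction under assumed parameters, not HyperBinder's encoding; every name (`encode_row`, `decompose_sketch`, `prototype_sketch`, the `color` and `shape` fields) is hypothetical.

```python
import random

DIM = 2048  # hypervector dimensionality (assumed for this sketch)
rng = random.Random(0)  # fixed seed so the example is reproducible


def random_hv():
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    # Elementwise product; self-inverse, so bind(bind(a, b), b) == a.
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    # Elementwise majority vote (ties, possible with an even count, go to +1).
    return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]


def similarity(a, b):
    # Normalized dot product in [-1, 1]; near 0 for unrelated vectors.
    return sum(x * y for x, y in zip(a, b)) / DIM


fields = {name: random_hv() for name in ("name", "color", "shape")}
values = {name: random_hv()
          for name in ("Apple", "Banana", "Cherry", "Red", "Yellow", "Round")}


def encode_row(row):
    # A row is the bundle of its field-value bindings.
    return bundle([bind(fields[f], values[v]) for f, v in row.items()])


def decompose_sketch(row_hv, field):
    # Unbind the field, then look up the closest known value.
    unbound = bind(row_hv, fields[field])
    return max(values, key=lambda v: similarity(unbound, values[v]))


def prototype_sketch(examples, top_k):
    # Bundle the examples into a prototype and rank all values against it.
    proto = bundle([values[e] for e in examples])
    ranked = sorted(values, key=lambda v: similarity(proto, values[v]),
                    reverse=True)
    return ranked[:top_k]
```

Unbinding a row with a field vector yields a noisy copy of that field's value, which is why the lookup against known values recovers it. With purely random value vectors the prototype only matches its own members; a real system would rely on encodings where related items already have similar vectors.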
Output Capture¶
All print() output is captured to a StringIO buffer, capped at 16KB. Only this output is returned to the agent — intermediate state stays in the sandbox.
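The capture step can be sketched with the standard library's `contextlib.redirect_stdout`. The snippet below assumes that output beyond the cap is truncated and marked; the source does not say whether the real sandbox truncates or raises, and `run_and_capture` is a hypothetical name.

```python
import contextlib
import io

MAX_OUTPUT_BYTES = 16 * 1024  # 16KB cap stated above


def run_and_capture(source: str, namespace: dict = None) -> str:
    """Execute `source` and return only what it printed, capped at 16KB."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace if namespace is not None else {})
    output = buffer.getvalue()
    encoded = output.encode("utf-8")
    if len(encoded) > MAX_OUTPUT_BYTES:
        # Assumed behavior: cut at the cap and append a marker.
        output = encoded[:MAX_OUTPUT_BYTES].decode("utf-8", errors="ignore")
        output += "\n[output truncated]"
    return output
```

Variables such as intermediate result lists live only in `namespace`; the caller receives the printed text and nothing else.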
CollectionView¶
An object-oriented proxy for a single collection, injected into the sandbox namespace under the collection's name.
# Inside sandbox code, the agent writes:
results = await drugs.search("kinase inhibitor", top_k=5)
values = await drugs.distinct("mechanism")
data = await drugs.filter(status="approved")
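The proxy pattern behind calls like `drugs.search(...)` can be sketched as a small class that remembers the collection name and forwards each call to a shared tool function. This is an assumption about the general shape, not the actual `CollectionView`: `CollectionViewSketch`, `inject_views`, and the stub tools are hypothetical.

```python
import asyncio


class CollectionViewSketch:
    """Hypothetical proxy: binds a collection name to shared tool functions."""

    def __init__(self, name, tools):
        self.name = name
        self._tools = tools

    async def search(self, query, top_k=10):
        # Forward to the shared tool, filling in the collection name.
        return await self._tools["search"](self.name, query, top_k=top_k)

    async def distinct(self, field):
        return await self._tools["distinct"](self.name, field)


def inject_views(namespace, collection_names, tools):
    # Each view is stored under its collection's name, e.g. namespace["drugs"].
    for name in collection_names:
        namespace[name] = CollectionViewSketch(name, tools)
    return namespace


# Stub tools standing in for real backend calls.
async def fake_search(collection, query, top_k=10):
    return [f"{collection}:{query}:{i}" for i in range(top_k)]


async def fake_distinct(collection, field):
    return [f"{collection}.{field}"]
```

With `inject_views({}, ["drugs"], {...})`, agent code can refer to `drugs` directly instead of passing the collection name to every tool call.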
Properties¶
| Property | Returns | Description |
|---|---|---|
| `name` | `str` | Collection name |
| `schema` | `dict` | Compound schema details |
| `fields` | `List[str]` | Field names |
| `rows` | `int` | Row count |
| `key_values` | `Dict[str, List[str]]` | Distinct values for key fields |
| `intersects_with` | `List[dict]` | Related intersections |
| `available_operations` | `dict` | Valid search/navigate/tabular ops |
Methods¶
All methods are async and return ResultSet objects (except distinct which returns a list):
- `search(query, top_k=10)`: semantic text search
- `search_slot(query, slot, top_k=10)`: search within a specific field
- `search_slots(top_k=10, **slot_queries)`: multi-slot keyword search, e.g. `await coll.search_slots(subject="oil")`
- `bundle_search(values, field="value", top_k=10)`: prototype search from example values
- `filter(limit=100, **predicates)`: exact-match filter, e.g. `await coll.filter(exchange="NYMEX")`
- `distinct(field)`: distinct values for a field (returns `List[str]`)
- `extract(row_ids, fields=None)`: extract field values for specific rows by ID
- `analogy(a, b, c, field="value")`: A:B::C:? vector algebra
- `navigate(op, **kwargs)`: schema-native traversal
- `traverse(start, start_slot, path, top_k=10)`: fuzzy traversal shortcut
ResultSet¶
Typed container for query results. Holds tabular data and an optional HDC vector for geometric composition.
rs = await drugs.search("kinase", top_k=5)
print(rs) # formatted table
print(rs.rows) # list of dicts
print(len(rs)) # row count
print(rs.fields) # field names
for row in rs:        # iteration
    print(row)
| Attribute | Type | Description |
|---|---|---|
| `rows` | `List[dict]` | Result rows |
| `row_ids` | `List[int]` | Row IDs from the backend |
| `scores` | `List[float]` | Similarity scores (if from search) |
| `collection_name` | `str` | Source collection name |
| `fields` | `List[str]` | Field names |
| `vector` | optional | Bundled HDC vector (local mode only) |
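As a rough mental model, a container with these attributes and the behaviors shown earlier (`len`, iteration, printing as a table) could look like the dataclass below. `ResultSetSketch` is a hypothetical stand-in; the real `ResultSet` formatting and vector handling are not specified here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional


@dataclass
class ResultSetSketch:
    """Hypothetical container mirroring the attribute table above."""

    rows: List[Dict[str, Any]]
    row_ids: List[int] = field(default_factory=list)
    scores: List[float] = field(default_factory=list)
    collection_name: str = ""
    fields: List[str] = field(default_factory=list)
    vector: Optional[Any] = None  # bundled HDC vector, if available

    def __len__(self) -> int:
        return len(self.rows)

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        return iter(self.rows)

    def __str__(self) -> str:
        # Simple pipe-separated table: header line, then one line per row.
        header = " | ".join(self.fields)
        lines = [" | ".join(str(row.get(f, "")) for f in self.fields)
                 for row in self.rows]
        return "\n".join([header, *lines])
```

Defining `__str__` is what makes `print(rs)` produce a table rather than a raw object representation, which keeps the text returned to the agent compact.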