POST /unstructured/multihop_query/

Executes a multi-hop semantic retrieval over an ingested document namespace. Each hop progressively refines the candidate pool by following semantic and symbolic edges, enabling retrieval that traverses document structure rather than relying on a single-pass search.


Request

Content-Type: application/json

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | | Natural language query string |
| db_name | string | Yes | | Name of the database to query |
| namespace | string | Yes | | Namespace to query within the database |
| role | string | No | null | Target role for symbolic encoding (e.g. "paragraph"). Defaults to "paragraph" internally if use_symbolic is true and no role is provided. |
| use_symbolic | bool | No | true | Whether to include symbolic encoding in retrieval |
| num_hops | int | No | 3 | Number of hops to traverse |
| top_k_per_hop | int | No | 15 | Candidate pool size at each hop |
| final_top_k | int | No | 10 | Number of final results to return |
| hop_decay | float | No | 0.85 | Score decay factor applied at each hop |
| context_expansion_ratio | float | No | 0.5 | Fraction of each hop's candidates used to seed the next hop |
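
The parameters above can be collected into a small client-side helper so that the documented defaults live in one place. This is only a sketch: the `MultihopQuery` class and its `to_payload` method are hypothetical names introduced here, not part of the API, and dropping `role` when it is unset is an assumption about what the server accepts.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class MultihopQuery:
    """Hypothetical client-side container for the request body."""

    # Fields with no documented default.
    query: str
    db_name: str
    namespace: str
    # Optional fields, initialised to the defaults listed in the table.
    role: Optional[str] = None
    use_symbolic: bool = True
    num_hops: int = 3
    top_k_per_hop: int = 15
    final_top_k: int = 10
    hop_decay: float = 0.85
    context_expansion_ratio: float = 0.5

    def to_payload(self) -> dict:
        """Return the JSON body, omitting role when it is unset (assumption)."""
        body = asdict(self)
        if body["role"] is None:
            del body["role"]
        return body


payload = MultihopQuery(
    query="explain the appeals process",
    db_name="fractal_db",
    namespace="document_upload_3f9a1c2e",
).to_payload()
```

The resulting `payload` dict can be passed as the `json=` argument of the request.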

Behavior

Multi-hop traversal — Rather than a single retrieval pass, the query is executed across num_hops iterations. Each hop uses the top candidates from the previous hop as context seeds for the next, following semantic and symbolic edges through the document.
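
The hop loop described above might be pictured roughly as follows. This is a toy sketch, not the server's implementation: `select_seeds`, `run_hops`, and the `retrieve` callback are hypothetical names, and the rounding rule for the seed count (round down, keep at least one) is an assumption.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[int, float]  # (row_id, score)


def select_seeds(candidates: List[Candidate], ratio: float = 0.5) -> List[int]:
    """Keep the top-scoring fraction of a hop's candidates as seeds."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    # Assumption: round down, but always keep at least one seed.
    n_seeds = max(1, int(len(ranked) * ratio))
    return [row_id for row_id, _ in ranked[:n_seeds]]


def run_hops(
    retrieve: Callable[[List[int], int], List[Candidate]],
    num_hops: int = 3,
    top_k_per_hop: int = 15,
    context_expansion_ratio: float = 0.5,
) -> List[List[Candidate]]:
    """Run num_hops retrieval passes, seeding each from the previous one."""
    seeds: List[int] = []  # hop 0 starts from the query alone
    per_hop = []
    for _ in range(num_hops):
        candidates = retrieve(seeds, top_k_per_hop)
        per_hop.append(candidates)
        seeds = select_seeds(candidates, context_expansion_ratio)
    return per_hop


def toy_retrieve(seeds: List[int], k: int) -> List[Candidate]:
    """Stand-in retriever: returns the k rows after the highest seed."""
    start = max(seeds, default=0)
    return [(start + i, 1.0 - 0.1 * i) for i in range(1, k + 1)]


hops = run_hops(toy_retrieve, num_hops=3, top_k_per_hop=4)
```

In the toy run, each hop's candidate pool shifts forward because it is seeded from the best half of the previous pool, which is the progressive refinement the parameters control.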

Symbolic encoding — When use_symbolic is true, the target role is encoded symbolically using "exact" encoding and used to guide retrieval toward structurally relevant chunks. If symbolic encoding fails it is skipped silently and retrieval continues without it.

Score decay — Scores are multiplied by hop_decay at each successive hop, so earlier hops carry more weight in the final ranking. Set to 1.0 to disable decay.
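
As a worked example of the decay arithmetic, the sketch below computes the multiplier each hop would receive. It assumes hop 0 is left undecayed and hop h is scaled by hop_decay raised to the power h; the source does not state whether decay starts at hop 0 or hop 1, and `hop_weights` is a hypothetical helper.

```python
def hop_weights(num_hops: int = 3, hop_decay: float = 0.85) -> list:
    """Per-hop score multipliers under the assumed rule hop_decay ** hop."""
    # Assumption: hop 0 is undecayed (multiplier 1.0).
    return [hop_decay ** h for h in range(num_hops)]


weights = hop_weights()
print([round(w, 4) for w in weights])  # [1.0, 0.85, 0.7225]

# With hop_decay=1.0 every hop keeps its full score.
no_decay = hop_weights(num_hops=3, hop_decay=1.0)
```

Under this reading, a candidate first found at hop 2 with a raw score of 0.9 would contribute about 0.65 to the final ranking with the default settings.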

Data loading — All text content is fetched lazily from RocksDB using the final set of result row IDs. Each entry in hops includes up to 3 representative docs for tracing only; final_results contains the full ranked output.


Responses

200 OK

{
  "query": "explain the appeals process",
  "namespace": "document_upload_3f9a1c2e",
  "hops": [
    {
      "hop": 0,
      "docs": ["Appeals must be submitted in writing...", "The review board convenes...", "..."]
    },
    {
      "hop": 1,
      "docs": ["Decisions are issued within 30 days...", "..."]
    }
  ],
  "final_results": [
    {
      "row_id": 34,
      "value": "Appeals must be submitted in writing within 14 days...",
      "role": "paragraph",
      "parent": "chunk_6",
      "text": "Appeals must be submitted in writing within 14 days...",
      "score": 0.923
    }
  ],
  "symbolic_edges": [
    { "hop": 0, "doc_idx": 34 }
  ]
}

final_results array — each item contains:

| Field | Description |
|---|---|
| row_id | Internal row identifier |
| value | Text content of the chunk |
| role | Cell role label (e.g. "paragraph", "sentence") |
| parent | Parent chunk identifier |
| text | Alias of value |
| score | Final relevance score. Non-finite values are clamped to 0.0 |
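
The clamping rule for score can be illustrated with a short sketch. The server-side code is not shown in this reference, so `clamp_score` is a hypothetical helper that only demonstrates what "non-finite values are clamped to 0.0" means in practice.

```python
import math


def clamp_score(score: float) -> float:
    """Return the score unchanged if finite, otherwise 0.0."""
    # NaN, +inf and -inf are all non-finite.
    return score if math.isfinite(score) else 0.0


examples = [0.923, float("nan"), float("inf"), float("-inf")]
clamped = [clamp_score(s) for s in examples]
```

A client therefore cannot tell a true 0.0 from a clamped value by looking at score alone.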

hops array:

| Field | Description |
|---|---|
| hop | Hop index (0-based) |
| docs | Up to 3 representative document snippets from that hop |

symbolic_edges array:

| Field | Description |
|---|---|
| hop | Hop where the edge was activated |
| doc_idx | Row ID of the document linked via symbolic edge |
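
Because doc_idx is a row ID, symbolic edges can be matched against final_results on the client. The sketch below groups edges by hop and attaches the result text where available. `edges_by_hop` is a hypothetical helper, and it assumes an edge may point at a row that did not make the final cut, in which case the text is left as None.

```python
def edges_by_hop(response: dict) -> dict:
    """Group symbolic edges by hop, pairing each with its result text."""
    results = {item["row_id"]: item for item in response.get("final_results", [])}
    grouped: dict = {}
    for edge in response.get("symbolic_edges", []):
        # Assumption: the edge's row may be absent from final_results.
        hit = results.get(edge["doc_idx"])
        text = hit["text"] if hit else None
        grouped.setdefault(edge["hop"], []).append((edge["doc_idx"], text))
    return grouped


# Trimmed copy of the 200 OK example above.
sample = {
    "final_results": [
        {"row_id": 34, "text": "Appeals must be submitted in writing within 14 days..."}
    ],
    "symbolic_edges": [{"hop": 0, "doc_idx": 34}],
}
grouped = edges_by_hop(sample)
```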

Error Responses

| Status | Condition |
|---|---|
| 500 | Namespace not found, retrieval failure, or unexpected internal error |

Notes

  • The namespace must already be ingested via /unstructured/upload_document/ before querying.
  • Increase num_hops for longer documents where relevant content may be spread across distant chunks.
  • If final_results is empty, try lowering hop_decay, increasing top_k_per_hop, or setting use_symbolic to false to isolate the issue.
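
The troubleshooting advice in the last note could be automated along these lines. This is a sketch under stated assumptions: `troubleshooting_variants` and `first_non_empty` are hypothetical helpers, `send` is any caller-supplied function that posts a payload and returns the parsed response, and the step sizes (doubling top_k_per_hop, lowering hop_decay by 0.15) are arbitrary illustrative choices.

```python
from typing import Callable, Optional


def troubleshooting_variants(payload: dict) -> list:
    """Build one modified payload per suggestion in the note."""
    base = dict(payload)
    return [
        # Disable symbolic encoding to isolate the issue.
        {**base, "use_symbolic": False},
        # Widen the candidate pool (doubling is an arbitrary choice).
        {**base, "top_k_per_hop": base.get("top_k_per_hop", 15) * 2},
        # Lower the decay factor (the 0.15 step is an arbitrary choice).
        {**base, "hop_decay": round(base.get("hop_decay", 0.85) - 0.15, 2)},
    ]


def first_non_empty(send: Callable[[dict], dict], payload: dict) -> Optional[dict]:
    """Try the original payload, then each variant, until results appear."""
    for candidate in [payload, *troubleshooting_variants(payload)]:
        response = send(candidate)
        if response.get("final_results"):
            return response
    return None
```

Trying one change at a time keeps it clear which setting was responsible when results finally appear.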

Example

import requests

SERVER_URL = "http://hbserver:8000"
API_KEY    = "yourapitoken"

def multihop_query(query: str, namespace: str) -> dict:
    response = requests.post(
        f"{SERVER_URL}/unstructured/multihop_query/",
        headers={"X-API-Key": API_KEY},
        json={
            "query":                   query,
            "db_name":                 "fractal_db",
            "namespace":               namespace,
            "role":                    "paragraph",  # optional
            "use_symbolic":            True,          # optional, defaults to True
            "num_hops":                3,             # optional, defaults to 3
            "top_k_per_hop":           15,            # optional, defaults to 15
            "final_top_k":             10,            # optional, defaults to 10
            "hop_decay":               0.85,          # optional, defaults to 0.85
            "context_expansion_ratio": 0.5,           # optional, defaults to 0.5
        },
    )
    response.raise_for_status()
    return response.json()


result = multihop_query(
    query="explain the appeals process",
    namespace="document_upload_3f9a1c2e",
)
print(result)

Expected output (shown here as formatted JSON; print(result) displays the equivalent Python dict):

{
  "query": "explain the appeals process",
  "namespace": "document_upload_3f9a1c2e",
  "hops": [
    { "hop": 0, "docs": ["Appeals must be submitted in writing...", "The review board...", "..."] },
    { "hop": 1, "docs": ["Decisions are issued within 30 days...", "..."] },
    { "hop": 2, "docs": ["Final rulings are binding unless...", "..."] }
  ],
  "final_results": [
    {
      "row_id": 34,
      "value": "Appeals must be submitted in writing within 14 days...",
      "role": "paragraph",
      "parent": "chunk_6",
      "text": "Appeals must be submitted in writing within 14 days...",
      "score": 0.923
    }
  ],
  "symbolic_edges": [
    { "hop": 0, "doc_idx": 34 }
  ]
}