Embedding Models¶
HyperBinder uses embedding models to encode SEMANTIC fields: fields where similar text should produce similar search results.
Any embedding model works, and HyperBinder adapts to its output dimension automatically. The default is all-MiniLM-L6-v2 (384 dimensions).
HyperBinder does NOT store raw embeddings
Embeddings are an input to HyperBinder's encoding pipeline. At ingest time, they are transformed into an internal representation, and only that internal representation is stored. At query time, the same transformation runs on the query embedding. Your choice of embedding model affects the quality of this encoding.
Same model for ingest and query
You must use the same embedding model for both ingest and query. Mixing models will produce meaningless results because the internal representation is calibrated to the model used at ingest time.
How Embedding Dimensions Work¶
HyperBinder's internal representation has a fixed capacity (256 dimensions by default for semantic fields). When your embedding model produces vectors of a different size, HyperBinder adapts automatically:
- Smaller embeddings (embed_dim ≤ internal capacity): No fidelity loss. HyperBinder interpolates to fill the representation.
- Larger embeddings (embed_dim > internal capacity): Some compression occurs. HyperBinder auto-detects your model's characteristics and uses the best strategy, but information is necessarily lost.
The compression ratio (embed_dim / internal_dim) determines how much information is preserved:
| Compression Ratio | Quality | Effect |
|---|---|---|
| ≤ 1.0× | Lossless | No information lost |
| 1–2× | Excellent | Semantic ranking well preserved |
| 2–4× | Good | Slight degradation in fine-grained ranking |
| 4×+ | Moderate | Consider using a lower-dimensional model or pre-truncating (see MRL models) |
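The quality bands above follow directly from the ratio embed_dim / internal_dim. As an illustration only (the helper name and thresholds are taken from the table, not from HyperBinder's API), you can classify a candidate model before committing to it:

```python
# Classify expected encoding quality from the compression ratio.
# The 256-dim default internal capacity and the bands mirror the table above.
def compression_quality(embed_dim: int, internal_dim: int = 256) -> str:
    ratio = embed_dim / internal_dim
    if ratio <= 1.0:
        return "lossless"
    if ratio <= 2.0:
        return "excellent"
    if ratio <= 4.0:
        return "good"
    return "moderate"

print(compression_quality(384))   # all-MiniLM-L6-v2: 384/256 = 1.5x
print(compression_quality(1536))  # text-embedding-3-small: 1536/256 = 6x
```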
Recommended Models¶
| Model | Dimensions | Compression @256 | Quality | Notes |
|---|---|---|---|---|
| all-MiniLM-L6-v2 | 384 | 1.5× | Excellent | Default, fast, good quality |
| nomic-embed-text-v1.5 | 768 | 3× | Good | MRL-trained, compresses well |
| text-embedding-3-small | 1536 | 6× | Moderate | Use MRL truncation to 256 for best results |
| text-embedding-3-large | 3072 | 12× | Lower | Use MRL truncation to 256–768 first |
Lower-dimensional models often give better end-to-end results because less compression means higher fidelity in HyperBinder's internal representation.
Bringing Your Own Embedding Model¶
There are two ways to provide embeddings to HyperBinder.
encode_fn callback (recommended)¶
Register a function that maps text to embeddings. HyperBinder calls this automatically for both ingest encoding and query encoding.
```python
from sentence_transformers import SentenceTransformer
from hybi import HyperBinder

model = SentenceTransformer("all-MiniLM-L6-v2")
hb = HyperBinder(local=True, encode_fn=model.encode)
hb.ingest(df, collection="docs")

# Queries are automatically encoded using your encode_fn
results = hb.search("machine learning", collection="docs")
```
The encode_fn should accept a list of strings and return a numpy array (or list of lists) of embeddings:
```python
import numpy as np

def my_encode_fn(texts: list[str]) -> np.ndarray:
    # Your embedding logic here
    return embeddings  # shape: (len(texts), embed_dim)
```
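For wiring tests or local experiments without a real model, a deterministic stand-in that satisfies the same contract can be useful. This is a sketch (the hash-based embedder is invented for illustration and has no real semantics, so don't use it for actual search quality):

```python
import hashlib
import numpy as np

def toy_encode_fn(texts: list[str], embed_dim: int = 384) -> np.ndarray:
    """Deterministic stand-in embedder: hashes each text into a seed,
    then draws a unit-norm random vector. Same text -> same vector."""
    out = np.empty((len(texts), embed_dim), dtype=np.float32)
    for i, text in enumerate(texts):
        seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
        rng = np.random.default_rng(seed)
        vec = rng.standard_normal(embed_dim)
        out[i] = vec / np.linalg.norm(vec)
    return out

embs = toy_encode_fn(["doc one", "doc two"])
print(embs.shape)  # (2, 384)
```

Because it is deterministic, ingest-time and query-time encodings stay consistent, which is the one property HyperBinder requires of any encode_fn.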
vector_col parameter¶
For pipelines that pre-compute embeddings externally, pass a column name containing embedding vectors at ingest time:
```python
import pandas as pd

# DataFrame with pre-computed embeddings
df = pd.DataFrame({
    "text": ["doc one", "doc two"],
    "my_embeddings": [embedding_1, embedding_2],  # list[float] per row
})

hb.ingest(df, collection="docs", vector_col="my_embeddings")
```
Note
With vector_col, you are responsible for encoding query vectors yourself. The encode_fn approach handles this automatically and is preferred for most use cases.
MRL (Matryoshka) Models¶
Some embedding models are trained with Matryoshka Representation Learning (MRL), which means you can safely truncate their output to shorter dimensions without retraining. The first N dimensions of an MRL-trained model contain a valid, lower-dimensional embedding.
HyperBinder auto-detects MRL structure — no configuration is needed.
For high-dimensional MRL models, you can pre-truncate embeddings before passing them to HyperBinder. This reduces the compression ratio and preserves more fidelity:
```python
import numpy as np
from openai import OpenAI

client = OpenAI()

# text-embedding-3-small produces 1536-dim vectors but supports MRL truncation.
# (It is an OpenAI API model, so it is called via the OpenAI client rather
# than loaded with sentence-transformers. The API also accepts a `dimensions`
# parameter that performs this truncation server-side.)
def truncated_encode(texts, target_dim=256):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    embeddings = np.array([item.embedding for item in resp.data])
    truncated = embeddings[:, :target_dim]
    # Re-normalize after truncation
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms

hb = HyperBinder(local=True, encode_fn=truncated_encode)
# Now the compression ratio is 256/256 = 1.0× (lossless)
```
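The truncate-then-renormalize step can be checked in isolation. A minimal sketch, using synthetic vectors in place of real model output:

```python
import numpy as np

def truncate_and_renormalize(embeddings: np.ndarray, target_dim: int) -> np.ndarray:
    # Keep the first target_dim dimensions (valid for MRL-trained models),
    # then restore unit length so cosine similarity stays well defined.
    truncated = embeddings[:, :target_dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms

rng = np.random.default_rng(0)
full = rng.standard_normal((4, 1536))   # stand-in for 1536-dim model output
short = truncate_and_renormalize(full, 256)

print(short.shape)                                       # (4, 256)
print(np.allclose(np.linalg.norm(short, axis=1), 1.0))   # True
```

Re-normalizing matters: after dropping dimensions the rows no longer have unit length, and many downstream similarity computations assume they do.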
Not all models support MRL truncation. Check your model's documentation before truncating. Models that support it include:
- OpenAI text-embedding-3-small and text-embedding-3-large
- nomic-embed-text-v1.5
- Any model explicitly documented as MRL-trained
Encoding quality over time
HyperBinder's encoding quality improves as it sees more data. The first few rows use a general-purpose encoding strategy. After accumulating statistics about your data distribution, the encoding automatically adapts for better fidelity. No action is needed — this is fully automatic.
Best Practices¶
- Always use the same embedding model for ingest and query. Switching models invalidates the internal representation.
- Lower-dimensional models often give better end-to-end results because less compression means higher fidelity.
- For high-dimensional models, pre-truncate if the model supports MRL. This is the single most effective optimization.
- Don't mix embedding models within a single collection. Each collection should use one consistent model.