Multivector (ColBERT / ColPali)

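Multivector collections store one vector per token (or, for ColPali, per image patch) instead of a single pooled vector per document. Late interaction models like ColBERT and ColPali compute similarity at the token level using max_sim: each query token is matched to its most similar document token, and those maximum similarities are summed. This enables fine-grained reranking.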
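
As a rough illustration of what max_sim scoring computes, here is a minimal Python sketch. It assumes cosine similarity between tokens (matching the COSINE distance used in the examples on this page). The function names `cosine` and `max_sim` are hypothetical helpers for this illustration, not part of any client API, and the engine's actual implementation may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def max_sim(query_tokens, doc_tokens):
    """For each query token take the best match among document tokens, then sum."""
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)

# Toy 2-dimensional token embeddings (real ColBERT tokens are typically 128-dimensional).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [1.0, 1.0]]
score = max_sim(query, doc)
```

The first query token finds an exact match in the document, while the second only finds a partial match, so the score lands between 1 and 2. The cost grows with the number of query tokens times the number of document tokens, which is why this kind of scoring is usually reserved for reranking a short candidate list.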

Create a Multivector Collection

CREATE COLLECTION pdf_retrieval (
  original VECTOR(128, COSINE) WITH MULTIVECTOR (comparator = 'max_sim') WITH HNSW (m = 0),
  mean_pooling_columns VECTOR(128, COSINE) WITH MULTIVECTOR (comparator = 'max_sim'),
  mean_pooling_rows VECTOR(128, COSINE) WITH MULTIVECTOR (comparator = 'max_sim')
)

Option	Description
`WITH MULTIVECTOR (comparator = 'max_sim')`	Enables token-level max-sim scoring
`WITH HNSW (m = 0)`	Disables HNSW indexing on this vector — reduces RAM and speeds up inserts. Use on reranking vectors that are never searched directly.

Dense + Multivector Collection

CREATE COLLECTION docs (
  dense VECTOR(384, COSINE),
  colbert VECTOR(128, COSINE) WITH MULTIVECTOR (comparator = 'max_sim') WITH HNSW (m = 0)
)

dense — used for fast first-stage ANN search
colbert — used for accurate late-interaction reranking (HNSW (m = 0) = no index, pure reranking)

Insert with Multivectors

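Provide pre-computed vectors using the vector key. The values below are shortened for readability; real vectors must match the dimensions declared for each named vector (for example, 384 for dense and 128 per token for colbert):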

-- Dense vector plus a single multivector
INSERT INTO docs VALUES {
  'id': 1,
  'text': 'Qdrant vector database',
  'vector': {'dense': [0.1, 0.2, 0.3], 'colbert': [[0.1, 0.2], [0.3, 0.4]]}
}

-- PDF retrieval — three named vectors
INSERT INTO pdf_retrieval VALUES {
  'id': 1,
  'vector': {
    'original': [[0.1, 0.2], [0.3, 0.4]],
    'mean_pooling_columns': [[0.1, 0.2]],
    'mean_pooling_rows': [[0.3, 0.4]]
  }
}

The vector value is a map of named_vector → value:

1D array [...] — dense vector
2D array [[...], [...]] — multivector (one sub-array per token)
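
The 1D versus 2D distinction above can be checked on the client side before sending an insert. The sketch below is one possible validation helper in Python; `vector_kind` is a hypothetical name introduced here for illustration, and nothing in it reflects how the server itself parses values.

```python
def vector_kind(value):
    """Return 'dense' for a flat list of numbers, 'multivector' for a 2D list with equal-length rows."""
    if not isinstance(value, list) or not value:
        raise ValueError("vector must be a non-empty list")
    # 1D array of numbers: a dense vector.
    if all(isinstance(x, (int, float)) for x in value):
        return "dense"
    # 2D array: every row must be a non-empty list of numbers with the same length.
    if all(isinstance(row, list) and row for row in value):
        width = len(value[0])
        if all(
            len(row) == width and all(isinstance(x, (int, float)) for x in row)
            for row in value
        ):
            return "multivector"
    raise ValueError("expected a 1D list of numbers or a 2D list with equal-length rows")

# Same shape as the 'vector' map in the insert example.
payload = {"dense": [0.1, 0.2, 0.3], "colbert": [[0.1, 0.2], [0.3, 0.4]]}
kinds = {name: vector_kind(vec) for name, vec in payload.items()}
```

Rejecting ragged rows early is a design choice of this sketch: every token vector in a multivector shares the dimension declared for that named vector, so a row of a different length is almost certainly a bug in the embedding pipeline.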

Two-Stage Search (Mean-Pooled → Rerank)

The recommended ColBERT / ColPali retrieval pattern is a fast first stage over the mean-pooled vectors, followed by accurate reranking with the original multivectors.

WITH
  _pf0 AS (QUERY [0.1, 0.2, 0.3] USING 'mean_pooling_columns' LIMIT 100),
  _pf1 AS (QUERY [0.1, 0.2, 0.3] USING 'mean_pooling_rows' LIMIT 100)
QUERY [0.1, 0.2, 0.3] FROM pdf_retrieval USING 'original' LIMIT 10
  PREFETCH (_pf0, _pf1)
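
The page does not show how the mean-pooled vectors are produced. One plausible reading, sketched below in Python, is that the patch embeddings form a rows-by-columns grid and are averaged along each axis. The grid layout and the helper names `mean`, `pool_rows`, and `pool_columns` are assumptions made for this illustration.

```python
def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def pool_rows(grid):
    """One pooled vector per grid row (average across the patches in that row)."""
    return [mean(row) for row in grid]

def pool_columns(grid):
    """One pooled vector per grid column (average down the patches in that column)."""
    return [mean(list(column)) for column in zip(*grid)]

# Hypothetical 2 x 3 patch grid with 2-dimensional embeddings.
grid = [
    [[1.0, 0.0], [3.0, 0.0], [5.0, 0.0]],
    [[0.0, 2.0], [0.0, 4.0], [0.0, 6.0]],
]
rows = pool_rows(grid)        # 2 vectors, one per row
columns = pool_columns(grid)  # 3 vectors, one per column
```

Pooling shrinks an R x C grid of patch vectors to R or C vectors, so the first-stage multivectors are much smaller than the original and cheaper to index and search. The original multivector is then only needed to rerank the prefetched candidates.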

Full ColBERT Showcase

-- 1. Create collection
CREATE COLLECTION colbert_docs (
  dense VECTOR(384, COSINE),
  colbert VECTOR(128, COSINE) WITH MULTIVECTOR (comparator = 'max_sim') WITH HNSW (m = 0)
)

-- 2. Insert with pre-computed vectors
INSERT INTO colbert_docs VALUES {
  'id': 1,
  'text': 'vector search with late interaction',
  'vector': {
    'dense': [0.1, 0.2, 0.3],
    'colbert': [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
  }
}

-- 3. Two-stage retrieval
WITH
  first_stage AS (QUERY [0.1, 0.2, 0.3] USING 'dense' LIMIT 100)
QUERY [0.1, 0.2, 0.3] FROM colbert_docs USING 'colbert' LIMIT 10
  PREFETCH (first_stage)
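
To make the two-stage flow concrete, here is a small in-memory Python sketch of the same idea: prefetch candidates by dense cosine similarity, then rerank only those candidates with max_sim. It is a brute-force illustration under stated assumptions, not how the engine executes the query. The names `two_stage`, `prefetch_limit`, and the document layout are hypothetical, and the sketch uses a separate token-level query for the rerank stage.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def max_sim(query_tokens, doc_tokens):
    """Sum over query tokens of the best cosine match among document tokens."""
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)

def two_stage(docs, dense_query, token_query, prefetch_limit, limit):
    """Prefetch by dense similarity, then rerank the candidates with max_sim."""
    candidates = sorted(
        docs, key=lambda d: cosine(dense_query, d["dense"]), reverse=True
    )[:prefetch_limit]
    reranked = sorted(
        candidates, key=lambda d: max_sim(token_query, d["colbert"]), reverse=True
    )
    return [d["id"] for d in reranked[:limit]]

# Toy documents with 2-dimensional vectors.
docs = [
    {"id": 1, "dense": [1.0, 0.0], "colbert": [[1.0, 0.0]]},
    {"id": 2, "dense": [0.9, 0.1], "colbert": [[1.0, 0.0], [0.0, 1.0]]},
    {"id": 3, "dense": [0.0, 1.0], "colbert": [[0.0, 1.0]]},
]
result = two_stage(
    docs,
    dense_query=[1.0, 0.0],
    token_query=[[1.0, 0.0], [0.0, 1.0]],
    prefetch_limit=2,
    limit=1,
)
```

In this toy data, document 1 is the best dense match, but document 2 covers both query tokens and wins the rerank. Document 3 is never reranked because it did not survive the prefetch, which is the trade-off of the pattern: the prefetch limit bounds both the cost and the recall of the second stage.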

EXPLAIN Output

Use EXPLAIN to verify your multivector setup, for example on a two-stage multivector query:

qql-go explain "WITH _pf0 AS (QUERY [0.1, 0.2] USING 'dense' LIMIT 100) QUERY [0.1, 0.2] FROM docs USING 'colbert' LIMIT 10 PREFETCH (_pf0)"

The explain output will show USING: colbert, CTEs: [_pf0], and PREFETCH REFS: [_pf0].

Update a Vector

UPDATE docs SET VECTOR 'colbert' = [[0.1, 0.2], [0.3, 0.4]] WHERE id = 1