Multi-Stage Retrieval

QQL supports multi-stage retrieval pipelines using Common Table Expressions (CTEs) and PREFETCH DAGs. Each stage can have its own vector, filter, limit, and score threshold.

WITH
dense AS (QUERY 'emergency care' USING dense LIMIT 200),
sparse AS (QUERY 'emergency care' USING sparse LIMIT 300)
QUERY 'emergency care' FROM docs LIMIT 10
PREFETCH (dense, sparse)
FUSION RRF

Each CTE is a named retrieval stage. The top-level QUERY merges the prefetch results via FUSION RRF or FUSION DBSF.
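
To make the fusion step concrete, here is a minimal Python sketch of reciprocal rank fusion as it is commonly defined. It is an illustration, not QQL's implementation: the function name `rrf_fuse`, the 1-based ranks, and the default `k=60` are assumptions, and the engine may compute ranks or break ties differently.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60, limit=10):
    """Fuse ranked lists of document IDs with reciprocal rank fusion.

    Assumption: ranks are 1-based and each list contributes 1 / (k + rank).
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:limit]

# Toy rankings standing in for the dense and sparse prefetch results.
dense = ["d3", "d1", "d7"]
sparse = ["d1", "d9", "d3"]
top = rrf_fuse([dense, sparse], k=60, limit=3)
print([doc_id for doc_id, _ in top])
```

Documents that appear in both lists (`d1`, `d3`) accumulate two contributions, which is why they outrank documents found by only one branch.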

Apply independent filters and score thresholds to each CTE reference at the prefetch level:

WITH
dense AS (QUERY 'search' USING dense LIMIT 200 WHERE category = 'tech'),
sparse AS (QUERY 'search' USING sparse LIMIT 300)
QUERY 'search' FROM docs LIMIT 10
PREFETCH (
dense WHERE priority = 'high' SCORE THRESHOLD 0.6,
sparse SCORE THRESHOLD 0.3
)
FUSION RRF WITH (rrf_k = 20, rrf_weights = [0.6, 0.4])
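
One way to picture the per-reference WHERE and SCORE THRESHOLD clauses is as a trimming step applied to each branch before fusion. The Python sketch below is hypothetical: the `Hit` type and `trim_branch` helper are not part of QQL, and it assumes the threshold is inclusive (a hit scoring exactly the threshold is kept), which the engine may or may not do.

```python
from dataclasses import dataclass, field

@dataclass
class Hit:
    doc_id: str
    score: float
    payload: dict = field(default_factory=dict)

def trim_branch(hits, predicate=None, score_threshold=None):
    """Keep hits that pass the optional payload predicate and score floor."""
    kept = []
    for hit in hits:
        if predicate is not None and not predicate(hit.payload):
            continue  # fails the branch-level WHERE
        if score_threshold is not None and hit.score < score_threshold:
            continue  # below the branch-level SCORE THRESHOLD
        kept.append(hit)
    return kept

# Toy results standing in for the dense and sparse CTEs.
dense_hits = [
    Hit("a", 0.82, {"priority": "high"}),
    Hit("b", 0.74, {"priority": "low"}),
    Hit("c", 0.55, {"priority": "high"}),
]
sparse_hits = [Hit("b", 0.41), Hit("d", 0.22)]

# dense WHERE priority = 'high' SCORE THRESHOLD 0.6
dense_kept = trim_branch(dense_hits, lambda p: p.get("priority") == "high", 0.6)
# sparse SCORE THRESHOLD 0.3
sparse_kept = trim_branch(sparse_hits, score_threshold=0.3)
print([h.doc_id for h in dense_kept], [h.doc_id for h in sparse_kept])
```

Only the hits that survive trimming in each branch would then be passed to the fusion step.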

Each CTE can target a different named vector:

WITH
_pf0 AS (QUERY [0.1, 0.2, 0.3] USING 'mean_pooling_columns' LIMIT 100),
_pf1 AS (QUERY [0.1, 0.2, 0.3] USING 'mean_pooling_rows' LIMIT 100)
QUERY [0.1, 0.2, 0.3] FROM pdf_retrieval USING 'original' LIMIT 10
PREFETCH (_pf0, _pf1)

CTEs can reference other CTEs to build hierarchical pipelines:

WITH
broad AS (
QUERY 'emergency neurological assessment' USING dense LIMIT 500
WHERE department = 'emergency'
),
narrow AS (
QUERY 'emergency neurological assessment' USING sparse LIMIT 100
PREFETCH (broad)
)
QUERY 'emergency neurological assessment' FROM clinical_docs LIMIT 5
PREFETCH (narrow)
FUSION RRF
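
The nested pipeline above can be thought of as a coarse-to-fine filter: the second stage only scores candidates that the first stage returned. The Python sketch below illustrates that idea with toy, hand-written scores. The `top_n` helper and the score dictionaries are hypothetical stand-ins for real dense and sparse searches, not QQL internals.

```python
def top_n(scores, n, candidates=None):
    """Return the n best doc IDs by score, optionally restricted to candidates."""
    if candidates is None:
        pool = scores
    else:
        pool = {d: s for d, s in scores.items() if d in candidates}
    return sorted(pool, key=lambda d: (-pool[d], d))[:n]

# Toy per-document scores standing in for real dense and sparse searches.
dense_scores = {"d1": 0.91, "d2": 0.88, "d3": 0.40, "d4": 0.35}
sparse_scores = {"d1": 2.0, "d2": 5.5, "d3": 9.0, "d4": 1.0}

# Stage 1: wide dense pass (plays the role of `broad`).
broad = top_n(dense_scores, n=2)
# Stage 2: sparse ranking scoped to stage-1 candidates (plays the role of `narrow`).
narrow = top_n(sparse_scores, n=1, candidates=set(broad))
print(broad, narrow)
```

Note that `d3` has the best sparse score overall but never reaches the second stage, because it was not among the dense candidates. That is the trade-off of scoping a stage to its prefetch.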

Fuse CTE results without a new search:

WITH
dense AS (QUERY 'search' USING 'dense' LIMIT 100),
sparse AS (QUERY 'search' USING 'sparse' LIMIT 100)
FUSION RRF LIMIT 10 PREFETCH (dense, sparse)

Or without explicit CTEs (inline):

FUSION RRF LIMIT 10 PREFETCH (dense, sparse)
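
Weight each prefetch branch in the fusion with rrf_weights, and give each branch its own score threshold:

WITH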
high_priority AS (QUERY 'kubernetes deployment' USING dense LIMIT 50
WHERE priority = 'critical' AND status = 'open'),
general AS (QUERY 'kubernetes deployment' USING dense LIMIT 200),
keyword AS (QUERY 'kubernetes deployment' USING sparse LIMIT 200)
QUERY 'kubernetes deployment' FROM incidents LIMIT 10
PREFETCH (
high_priority SCORE THRESHOLD 0.7,
general SCORE THRESHOLD 0.4,
keyword SCORE THRESHOLD 0.3
)
FUSION RRF
WITH (rrf_k = 30, rrf_weights = [0.5, 0.3, 0.2])

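Effect: The high_priority branch (critical, open incidents) gets 50% of the RRF weight, the general dense branch 30%, and the keyword branch 20%. Weights are listed in the same order as the PREFETCH references.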
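
As a rough worked example of what those weights do, the Python sketch below assumes each branch's reciprocal-rank contribution is simply multiplied by its weight, using rrf_k = 30 and the weights from the query. This weighting formula is an assumption for illustration and the document IDs are made up, so the engine's exact arithmetic may differ.

```python
def weighted_rrf(ranked_lists, weights, k):
    """Weighted reciprocal rank fusion: each branch adds weight / (k + rank)."""
    scores = {}
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical ranked results for the three branches, best hit first.
high_priority = ["inc-7"]
general = ["inc-2", "inc-7"]
keyword = ["inc-2", "inc-5"]

fused = weighted_rrf(
    [high_priority, general, keyword],
    weights=[0.5, 0.3, 0.2],
    k=30,
)
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Here `inc-7` ranks first even though `inc-2` tops two branches: a first-place hit in the 0.5-weighted branch contributes as much as first-place hits in the 0.3 and 0.2 branches combined.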

Nested CTE prefetch gives coarse-to-fine retrieval, with a broad dense first pass followed by a narrow sparse second pass:

WITH
broad AS (
QUERY 'emergency neurological assessment' USING dense LIMIT 500
WHERE department = 'emergency'
),
narrow AS (
QUERY 'emergency neurological assessment' USING sparse LIMIT 100
PREFETCH (broad)
)
QUERY 'emergency neurological assessment' FROM clinical_docs LIMIT 5
PREFETCH (narrow)
FUSION RRF

Pipeline

broad (dense, 500 candidates, emergency dept)
→ narrow (sparse, 100 results scoped to broad)
→ FUSION RRF (final 5)

Decision | Why
broad retrieves 500 dense candidates | Wide semantic coverage
narrow sparse search inside broad | Keyword precision within semantic neighborhood
Top-level fuses only narrow | 2-stage: dense broad → sparse narrow → RRF

Syntax summary:

WITH
<name> AS (QUERY <target> [USING <vector>] [LIMIT <n>] [WHERE <filter>] [PREFETCH (<name>)]),
<name> AS (...)
QUERY '<text>' FROM <collection> [USING '<vector>'] LIMIT <n>
PREFETCH (
<name> [WHERE <filter>] [SCORE THRESHOLD <float>],
<name> [SCORE THRESHOLD <float>]
)
FUSION <RRF | DBSF>
[WITH (rrf_k = <n>, rrf_weights = [<f>, <f>])]
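
The FUSION clause accepts DBSF as well as RRF. As a rough sketch of how distribution-based score fusion is usually described, the Python below normalizes each branch's scores using mean ± 3 standard deviations as bounds and then sums the normalized scores per document. The function names, the population standard deviation, and the zero-variance fallback are assumptions for illustration and may not match QQL's behavior.

```python
from statistics import mean, pstdev

def dbsf_normalize(scores):
    """Scale scores to [0, 1] using mean ± 3 standard deviations as bounds."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All scores equal: place every document at the midpoint (assumption).
        return {doc_id: 0.5 for doc_id in scores}
    low, high = mu - 3 * sigma, mu + 3 * sigma
    return {
        doc_id: min(1.0, max(0.0, (score - low) / (high - low)))
        for doc_id, score in scores.items()
    }

def dbsf_fuse(branches, limit=10):
    """Sum normalized scores per document across branches."""
    fused = {}
    for scores in branches:
        for doc_id, norm in dbsf_normalize(scores).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + norm
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))[:limit]

# Toy scores on very different scales, as dense and sparse scores often are.
dense = {"a": 0.9, "b": 0.8, "c": 0.1}
sparse = {"a": 12.0, "c": 11.0, "d": 2.0}
result = dbsf_fuse([dense, sparse], limit=2)
print([doc_id for doc_id, _ in result])
```

Unlike RRF, which looks only at rank positions, this approach uses the score values themselves, so normalizing per branch is what keeps the larger sparse scores from dominating the sum.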