Query rewriting, decomposition, and retrieval routing

Not every retrieval failure is an embedding failure.

Sometimes the real problem is earlier: the query was vague, contained multiple asks, or should have gone to a completely different source. That is why strong RAG systems often have a pre-retrieval layer that rewrites, decomposes, or routes the query before any search happens.

Three different problems, three different fixes

Problem	Best first move	Why
Query is vague or uses mismatched language	Rewrite	Improve retrievability without changing intent
Query contains multiple asks	Decompose	Lets each sub-question retrieve cleaner evidence
Query belongs to different corpora or tools	Route	Different sources need different retrieval methods

The practical rule is to fix the query path before you spend more tokens on bigger prompts.

A concrete pre-retrieval pipeline

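Here is a simple TypeScript sketch. The three helper functions are placeholders for whatever classifier or model call fits your stack:

// Placeholder signatures: implement these with your own classifier or model call.
declare function classifyQueryShape(query: string): Promise<string>;
declare function decomposeQuery(query: string): Promise<string[]>;
declare function rewriteQuery(query: string): Promise<string>;

export async function prepareQuery(userQuery: string) {
  // One classification call decides which branch the query takes.
  const queryShape = await classifyQueryShape(userQuery);

  // Multiple asks: split into sub-questions that are retrieved separately.
  if (queryShape === "multi_part") {
    return {
      route: "decompose",
      queries: await decomposeQuery(userQuery),
    };
  }

  // Answer lives in structured data: hand the original query to a tool.
  if (queryShape === "structured_data_question") {
    return {
      route: "tool",
      queries: [userQuery],
    };
  }

  // Wording is the problem: normalize it, then search.
  if (queryShape === "vague_or_slangy") {
    return {
      route: "rewrite_then_search",
      queries: [await rewriteQuery(userQuery)],
    };
  }

  // Default: the query is already searchable, so leave it untouched.
  return {
    route: "direct_search",
    queries: [userQuery],
  };
}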

This is useful because it separates three decisions clearly:

should the query be normalized?
should it be split?
where should it go?

Once those decisions are explicit, you can evaluate them independently.

Rewrite only when the wording is the problem

Rewriting helps when the query is:

informal
shorthand-heavy
missing domain terms
phrased differently from the indexed corpus

Good rewrite behavior preserves intent while improving searchability.

Bad rewrite behavior guesses what the answer should be or adds specificity the user did not provide.

If filters, metadata, or better chunking would solve the issue, rewriting is not the first fix.
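
One way to keep a rewrite honest is to expand only known shorthand and to reject any candidate that introduces specifics the user never typed. The sketch below is a minimal illustration, not a complete rewriter: the glossary entries and the function names are hypothetical, and the guard checks only one narrow kind of added specificity (new numbers such as years or amounts).

```typescript
// Hypothetical glossary mapping informal shorthand to terms used in the indexed corpus.
const GLOSSARY = new Map<string, string>([
  ["sso", "single sign-on"],
  ["2fa", "two-factor authentication"],
  ["creds", "credentials"],
]);

// Expand known shorthand only; every other token passes through unchanged,
// so the rewrite cannot drift far from the user's own wording.
export function expandShorthand(query: string): string {
  return query
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map((token) => {
      // Separate trailing punctuation so "creds?" still matches "creds".
      const match = token.match(/^([A-Za-z0-9]+)([.,?!]*)$/);
      if (!match) return token;
      const word = match[1] ?? "";
      const punctuation = match[2] ?? "";
      const expansion = GLOSSARY.get(word.toLowerCase());
      return expansion ? expansion + punctuation : token;
    })
    .join(" ");
}

// Narrow guard (an assumption of this sketch): flag a rewrite that contains
// a number that was absent from the original query.
export function addsUnstatedNumbers(original: string, rewritten: string): boolean {
  const originalNumbers = new Set<string>(original.match(/\d+(\.\d+)?/g) ?? []);
  const rewrittenNumbers: string[] = rewritten.match(/\d+(\.\d+)?/g) ?? [];
  return rewrittenNumbers.some((n) => !originalNumbers.has(n));
}

// Fall back to the original query whenever the candidate fails the guard.
export function safeRewrite(original: string, candidate: string): string {
  return addsUnstatedNumbers(original, candidate) ? original : candidate;
}
```

The same guard can wrap a model-generated rewrite: pass the model output as the candidate and keep the original query when the check fails.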

Decomposition helps when the user really asked two questions

A single query such as:

"What changed in pricing and what does it mean for enterprise customers?"

actually contains at least two retrieval jobs:

what changed
what the impact is for a specific segment

If you keep those combined, retrieval often mixes weak evidence from both. Decomposition lets the system search and answer in smaller, cleaner parts.
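
After each sub-question has been searched on its own, the results still need to be combined before generation. The sketch below shows one possible merge step, assuming each retriever call returns document IDs with a score where higher is better. The type names, the function name, and the keep-the-best-score policy are illustrative choices rather than a prescribed method.

```typescript
interface Hit {
  docId: string;
  score: number; // higher is better; the scale depends on your retriever
}

interface SubQueryResult {
  subQuery: string;
  hits: Hit[];
}

interface MergedHit extends Hit {
  supports: string[]; // sub-questions this document was retrieved for
}

// Merge per-sub-question results into one evidence list.
// A document retrieved for several sub-questions appears once, keeps its
// best score, and records every sub-question it supports.
export function mergeEvidence(results: SubQueryResult[]): MergedHit[] {
  const merged = new Map<string, MergedHit>();

  for (const { subQuery, hits } of results) {
    for (const hit of hits) {
      const existing = merged.get(hit.docId);
      if (existing) {
        existing.score = Math.max(existing.score, hit.score);
        if (existing.supports.indexOf(subQuery) === -1) {
          existing.supports.push(subQuery);
        }
      } else {
        merged.set(hit.docId, { docId: hit.docId, score: hit.score, supports: [subQuery] });
      }
    }
  }

  // Strongest evidence first.
  return Array.from(merged.values()).sort((a, b) => b.score - a.score);
}
```

Keeping the `supports` list makes it possible to check, before answering, that every sub-question has at least one piece of evidence behind it.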

Routing matters when your sources differ

A mature assistant usually has multiple destinations:

product docs
policy docs
support tickets
metrics dashboards
SQL-backed systems

These should not all share the same retrieval path. Some need hybrid search. Some need a tool call. Some need SQL, not RAG.

Routing is often the most valuable part of the pre-retrieval layer because it stops the wrong engine from answering the question.
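
Making the routing table explicit keeps the decision reviewable. The sketch below maps the destinations listed above to a retrieval path. Which source gets which path is an assumption made for illustration; the right assignment depends on how each source is actually stored and queried.

```typescript
type Source =
  | "product_docs"
  | "policy_docs"
  | "support_tickets"
  | "metrics_dashboards"
  | "sql_systems";

type RetrievalPath = "hybrid_search" | "tool_call" | "sql";

// Illustrative assignment only. Because the table is typed as
// Record<Source, RetrievalPath>, adding a new Source without a route
// becomes a compile-time error.
const ROUTES: Record<Source, RetrievalPath> = {
  product_docs: "hybrid_search",
  policy_docs: "hybrid_search",
  support_tickets: "hybrid_search",
  metrics_dashboards: "tool_call",
  sql_systems: "sql",
};

export function routeFor(source: Source): RetrievalPath {
  return ROUTES[source];
}

// True when the source is served by document retrieval rather than a live system.
export function usesDocumentRetrieval(source: Source): boolean {
  return routeFor(source) === "hybrid_search";
}
```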

What to measure

Track:

rewrite lift on retrieval metrics (for example, recall with and without the rewrite)
decomposition lift on recall or answer correctness
routing accuracy
answer correctness after routing

If you only look at final answer quality, you will not know whether the improvement came from better query prep or better generation.
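
Two of these metrics can be computed directly from a labeled evaluation set. The sketch below assumes each record stores the expected and predicted route plus a recall value measured with and without the rewrite step. The record shape and field names are hypothetical.

```typescript
interface EvalRecord {
  expectedRoute: string;
  predictedRoute: string;
  recallWithoutRewrite: number; // recall on the raw query, between 0 and 1
  recallWithRewrite: number; // recall on the rewritten query, between 0 and 1
}

// Share of queries that were sent to the route a human labeled as correct.
export function routingAccuracy(records: EvalRecord[]): number {
  if (records.length === 0) return 0;
  const correct = records.filter((r) => r.expectedRoute === r.predictedRoute).length;
  return correct / records.length;
}

// Mean per-query change in recall caused by rewriting.
// A negative value means the rewrite step is hurting retrieval on average.
export function meanRewriteLift(records: EvalRecord[]): number {
  if (records.length === 0) return 0;
  const total = records.reduce(
    (sum, r) => sum + (r.recallWithRewrite - r.recallWithoutRewrite),
    0,
  );
  return total / records.length;
}
```

Reporting these next to final answer correctness shows whether a change came from query preparation or from generation.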

A practical worksheet

Use this before implementing:

Workflow:
Common query shapes:
Which shapes need rewriting:
Which shapes need decomposition:
Which shapes should route to tools or SQL:
Which shapes should use document retrieval:
Primary metric:
Guardrail metric:

That gives you a real pre-retrieval design instead of an intuitive guess.

The common anti-pattern

The common anti-pattern is rewriting every query because a query-rewrite step feels smart.

That often creates two new problems:

extra cost on easy queries
accidental drift away from the user's real intent

The better pattern is selective rewriting, selective decomposition, and explicit routing based on query shape.

How StackSpend helps

Pre-retrieval layers change spend by adding or removing search steps, tool calls, and generation retries. In StackSpend, you can compare cost per answered query before and after a routing layer launch, see whether rewrite-heavy traffic is actually improving outcomes, and spot when a "smarter" pre-retrieval design is adding cost without improving retrieval quality enough to justify it.

FAQ

Should I rewrite every query automatically?

Usually no. Rewrite only when the wording is vague, mismatched to the corpus, or clearly likely to retrieve poorly.

When should I decompose a query?

When one user message contains multiple retrieval intents that are likely to require separate evidence sets.

How do I know a query should route to a tool instead of retrieval?

If the answer lives in a system of record, such as a SQL database, an API, or current account state, a tool path is usually safer than document retrieval.

What is the easiest mistake to make with query rewriting?

Accidentally changing user intent while trying to make the query more searchable.

Is pre-retrieval work worth the extra complexity?

Often yes, because it can improve answer quality more cheaply than increasing prompt size or switching to a larger generation model.