If the answer lives in a database, API, or validated report, the model should not "know" it from memory.
This sounds obvious, but many teams still build structured-data QA as if every question-answering task were generic RAG. That is how you get fluent answers to the wrong number.
Grounding means the answer is constrained by evidence
For structured-data QA, the safest pattern is:
- fetch first, deterministically
- explain second
The model can help map intent to a query or tool call, but the system of record should still provide the facts.
Choose the grounding path based on the source
| Source of truth | Best pattern | Why |
|---|---|---|
| Warehouse or database | SQL or query template path | The data is structured and queryable |
| Business system or API | Tool or function calling | The answer should come from live system data |
| Reports or policies | Retrieval with citations | The evidence is document-based |
| Mixed sources | Router plus grounded path per source | One answer path does not fit all source types |
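The routing table above can be sketched as a small dispatcher. This is a minimal illustration, not a real API: the `SourceKind` values and the path registry are assumptions about how you might wire it.

```typescript
// Hypothetical router: classify the question's source of truth, then
// dispatch to the matching grounded path. The classifier that produces
// `kind` is out of scope here.
type SourceKind = "warehouse" | "api" | "documents" | "mixed";

interface GroundedPath {
  kind: SourceKind;
  handler: (question: string) => Promise<string>;
}

function routeQuestion(kind: SourceKind, paths: GroundedPath[]): GroundedPath {
  const path = paths.find((p) => p.kind === kind);
  // Fail loudly rather than falling back to a generic RAG path.
  if (!path) throw new Error(`No grounded path registered for ${kind}`);
  return path;
}
```

The deliberate choice is that an unregistered source type is an error, not a silent fallback: falling back to generic retrieval is exactly the failure mode this section warns against.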
A concrete text-to-SQL safety pattern
Suppose a user asks, "What was AWS spend last week?"
The model can help convert that to a query, but you should still validate the query before execution.
```typescript
export async function answerSpendQuestion(question: string) {
  // Let the model draft a query from the user's intent.
  const draftSql = await generateSql(question);

  // Validate before execution: allowed tables only, no writes.
  validateSql(draftSql, {
    allowedTables: ["rollup_daily"],
    allowWriteStatements: false,
  });

  // Execute against the warehouse on a read-only path.
  const rows = await runReadOnlyQuery(draftSql);

  // The model explains the returned rows; it does not supply the facts.
  return summarizeRows(question, rows);
}
```
The point is not that text-to-SQL is impossible. The point is that it needs guardrails:
- allowed tables
- read-only enforcement
- schema-aware validation
- answer synthesis after execution
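Here is one way the `validateSql` helper from the example could look. This is a sketch under stated assumptions: a production validator should use a real SQL parser, while this regex version only illustrates the first two guardrails (read-only enforcement and an allowed-table list).

```typescript
// Illustrative guardrail validator. The interface mirrors the options
// object used in the example above; the regex-based checks are a
// simplification, not parser-grade validation.
interface SqlGuardrails {
  allowedTables: string[];
  allowWriteStatements: boolean;
}

function validateSql(sql: string, rules: SqlGuardrails): void {
  // Read-only enforcement: reject any statement containing write keywords.
  const writeKeywords = /\b(insert|update|delete|drop|alter|create|truncate)\b/i;
  if (!rules.allowWriteStatements && writeKeywords.test(sql)) {
    throw new Error("write statements are not allowed");
  }

  // Naive table extraction: identifiers following FROM or JOIN.
  const tables = [...sql.matchAll(/\b(?:from|join)\s+([a-z_][\w.]*)/gi)].map(
    (m) => m[1].toLowerCase(),
  );
  for (const table of tables) {
    if (!rules.allowedTables.includes(table)) {
      throw new Error(`table not allowed: ${table}`);
    }
  }
}
```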
If the answer should come from the warehouse, fetch it from the warehouse.
Tool-based QA is often better than generic RAG
Many common questions are really tool questions:
- what is the order status?
- which jobs failed?
- when was this invoice paid?
- what limit applies to this account?
Those should call systems of record directly. Retrieval over docs is helpful for policies and explanations, not for live account state.
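A minimal shape for this is a tool registry where each tool question resolves to a direct system-of-record lookup. The tool names and stubbed handlers below are illustrative assumptions, not a real integration.

```typescript
// Sketch of a tool registry. In production each handler would call the
// live system of record; here the lookups are stubbed to show the shape.
type ToolHandler = (args: Record<string, string>) => string;

const tools = new Map<string, ToolHandler>();

// Hypothetical tool names; the returned values are placeholder stubs.
tools.set("get_order_status", ({ orderId }) => `order ${orderId}: shipped`);
tools.set("get_failed_jobs", () => "2 jobs failed in the last run");

function callTool(name: string, args: Record<string, string>): string {
  const handler = tools.get(name);
  if (!handler) throw new Error(`unknown tool: ${name}`);
  return handler(args);
}
```

The model's job is limited to picking the tool and its arguments; the answer itself comes from the handler's system of record.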
Mixed answers need split responsibilities
Some questions need both raw values and explanation:
- "Why did spend rise last week?"
- "How many failed jobs did we have, and what changed?"
In those cases:
- fetch the values deterministically
- optionally fetch document or policy context
- let the model explain based on those grounded inputs
That is much safer than asking the model to infer the facts and the explanation in one step.
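The three steps above can be sketched as one pipeline. The function names passed in (`fetchMetrics`, `retrieveContext`, `explain`) are placeholders for whatever your system provides; the point is the ordering and the separation of responsibilities.

```typescript
// Split-responsibility sketch: facts come from a deterministic fetch,
// optional context from retrieval, and the model only explains over
// those grounded inputs.
interface GroundedInputs {
  facts: Record<string, number>;
  context: string[];
}

async function answerWhyQuestion(
  question: string,
  fetchMetrics: () => Promise<Record<string, number>>,
  retrieveContext: (q: string) => Promise<string[]>,
  explain: (q: string, inputs: GroundedInputs) => Promise<string>,
): Promise<string> {
  // Step 1: deterministic fetch. The facts never come from the model.
  const facts = await fetchMetrics();

  // Step 2: optional document or policy context.
  const context = await retrieveContext(question);

  // Step 3: the model explains, constrained to the grounded inputs.
  return explain(question, { facts, context });
}
```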
What to evaluate
For grounded QA, track:
- answer correctness
- evidence correctness
- unsupported-claim rate
- query or tool success rate
If the system sounds helpful but the answer is wrong, it is not production-ready.
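One of these metrics, unsupported-claim rate, is easy to compute once each answer is logged with per-claim evidence labels. The record shape below is an assumption about how you might log answers, not a standard schema.

```typescript
// Illustrative scoring for the unsupported-claim rate metric. Each
// answer is logged with the claims it made and whether each claim was
// backed by fetched evidence.
interface AnswerRecord {
  answerCorrect: boolean;
  claims: { supportedByEvidence: boolean }[];
}

function unsupportedClaimRate(records: AnswerRecord[]): number {
  const claims = records.flatMap((r) => r.claims);
  if (claims.length === 0) return 0;
  const unsupported = claims.filter((c) => !c.supportedByEvidence).length;
  return unsupported / claims.length;
}
```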
A fill-in grounding spec
Use this before shipping:
- Workflow:
- System of record:
- Is the answer live, historical, or policy-based?
- What should be fetched deterministically?
- What should the model only explain?
- What evidence should be shown back to the user?
- Primary metric:
- Guardrail metric:
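If you want the spec to be machine-checkable rather than a doc, it can live as a typed config per workflow. The field names below mirror the checklist; the example values for a spend-QA workflow are purely illustrative.

```typescript
// Hypothetical typed version of the fill-in grounding spec.
interface GroundingSpec {
  workflow: string;
  systemOfRecord: string;
  answerKind: "live" | "historical" | "policy";
  deterministicFetches: string[];
  modelExplainsOnly: string[];
  evidenceShownToUser: string[];
  primaryMetric: string;
  guardrailMetric: string;
}

// Example filled-in spec (illustrative values only).
const spendQaSpec: GroundingSpec = {
  workflow: "spend-qa",
  systemOfRecord: "warehouse",
  answerKind: "historical",
  deterministicFetches: ["weekly spend by provider"],
  modelExplainsOnly: ["trend narrative"],
  evidenceShownToUser: ["query result rows"],
  primaryMetric: "answer correctness",
  guardrailMetric: "unsupported-claim rate",
};
```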
This prevents the most common failure: using one answer path for data that clearly belongs to different sources.
The common failure mode
It is using RAG for everything because RAG feels like a general solution.
RAG is useful for:
- policies
- documentation
- narrative reports
It is not the best default for:
- live metrics
- account records
- operational statuses
If the answer is sitting in a table or API, route through the table or API.
How StackSpend helps
Grounded QA has distinct cost layers across model usage, query execution, and retrieval. In StackSpend, you can separate those layers by workflow, see whether one grounded assistant is overusing model explanation relative to cheap deterministic fetches, and watch cost per successful grounded answer as usage grows.
What to do next
- Hybrid search and reranking patterns for RAG
- Agentic tool-use patterns: planner, executor, and recovery
FAQ
When should I use text-to-SQL instead of RAG?
Use text-to-SQL when the answer belongs to structured tables and the user needs current or queryable facts, not just document explanations.
Should the model execute arbitrary SQL?
No. Use read-only enforcement, allowed-table validation, and preferably templates or validators that constrain what can run.
Can I combine SQL and retrieval in one answer?
Yes. Fetch the facts deterministically first, then add retrieved policy or documentation context if the user needs explanation.
What is the biggest risk in grounded QA?
Letting the model synthesize an answer that mixes grounded facts with unsupported claims. That is why evidence correctness matters alongside answer correctness.
Is retrieval still useful for structured-data QA?
Yes, but usually for surrounding narrative context such as policy, definitions, and documentation rather than for the live values themselves.