Production systems · Build production LLM applications · Module 4 of 10
Guides
March 12, 2026
By Andrew Day

QA over structured data and grounding patterns

Many LLM question-answering systems should be grounded in SQL, tools, or explicit evidence rather than treated like generic RAG.


If the answer lives in a database, API, or validated report, the model should not "know" it from memory.

This sounds obvious, but many teams still build structured-data QA as if every question-answering task were generic RAG. That is how you get fluent answers to the wrong number.

Grounding means the answer is constrained by evidence

For structured-data QA, the safest pattern is:

  • fetch deterministically first
  • explain second

The model can help map intent to a query or tool call, but the system of record should still provide the facts.

Choose the grounding path based on the source

Source of truth          Best pattern                          Why
Warehouse or database    SQL or query template path            The data is structured and queryable
Business system or API   Tool or function calling              The answer should come from live system data
Reports or policies      Retrieval with citations              The evidence is document-based
Mixed sources            Router plus grounded path per source  One answer path does not fit all source types
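The routing logic in the table above can be sketched as a small source-based router. The source and path names here are illustrative, not part of any specific framework:

```typescript
// Illustrative router: maps a question's source of truth to a grounded answer path.
type SourceOfTruth = "warehouse" | "api" | "documents" | "mixed";
type GroundedPath = "sql" | "tool-call" | "retrieval-with-citations" | "per-source-router";

function routeGroundingPath(source: SourceOfTruth): GroundedPath {
  switch (source) {
    case "warehouse":
      return "sql"; // structured, queryable data
    case "api":
      return "tool-call"; // live system-of-record state
    case "documents":
      return "retrieval-with-citations"; // document-based evidence
    case "mixed":
      return "per-source-router"; // split the question, ground each part
  }
}
```

Because the switch is exhaustive over the union type, the compiler flags any new source type that lacks a grounded path.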

A concrete text-to-SQL safety pattern

Suppose a user asks, "What was AWS spend last week?"

The model can help convert that to a query, but you should still validate the query before execution.

export async function answerSpendQuestion(question: string) {
  // The model maps intent to a draft query; it does not provide the facts.
  const draftSql = await generateSql(question);

  // Guardrails: allowlisted tables only, no write statements.
  validateSql(draftSql, {
    allowedTables: ["rollup_daily"],
    allowWriteStatements: false,
  });

  // The warehouse, not the model, supplies the numbers.
  const rows = await runReadOnlyQuery(draftSql);

  // The model only explains the returned rows.
  return summarizeRows(question, rows);
}

The point is not that text-to-SQL is impossible. The point is that it needs guardrails:

  • allowed tables
  • read-only enforcement
  • schema-aware validation
  • answer synthesis after execution
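A minimal sketch of the `validateSql` guardrail used in the example above. The string-based checks are illustrative only; a production validator should use a real SQL parser rather than pattern matching:

```typescript
// Hypothetical validator matching the call in answerSpendQuestion.
// String checks are a sketch; use a proper SQL parser in production.
interface SqlGuardrails {
  allowedTables: string[];
  allowWriteStatements: boolean;
}

function validateSql(sql: string, guard: SqlGuardrails): void {
  const normalized = sql.trim().toLowerCase();

  // Read-only enforcement: only SELECT statements may run.
  if (!guard.allowWriteStatements && !normalized.startsWith("select")) {
    throw new Error("Only read-only SELECT statements are allowed");
  }

  // Allowed-table validation: every FROM/JOIN target must be on the allowlist.
  const tableRefs = [...normalized.matchAll(/\b(?:from|join)\s+([a-z_][a-z0-9_]*)/g)]
    .map((m) => m[1]);
  for (const table of tableRefs) {
    if (!guard.allowedTables.includes(table)) {
      throw new Error(`Table not on allowlist: ${table}`);
    }
  }
}
```
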

If the answer should come from the warehouse, fetch it from the warehouse.

Tool-based QA is often better than generic RAG

Many common questions are really tool questions:

  • what is the order status?
  • which jobs failed?
  • when was this invoice paid?
  • what limit applies to this account?

Those should call systems of record directly. Retrieval over docs is helpful for policies and explanations, not for live account state.
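One way to structure this is a registry that maps each tool question to a system-of-record call. The tool names, arguments, and return shapes below are assumptions for illustration; the stub bodies stand in for real API calls:

```typescript
// Illustrative tool registry: each common question maps to a system-of-record
// call, not to document retrieval. Names and return shapes are assumptions.
type ToolResult = { tool: string; data: Record<string, unknown> };

const tools: Record<string, (arg: string) => ToolResult> = {
  orderStatus: (orderId) => ({
    tool: "orderStatus",
    data: { orderId, status: "shipped" }, // in practice: fetched from the order system
  }),
  invoicePaidDate: (invoiceId) => ({
    tool: "invoicePaidDate",
    data: { invoiceId, paidAt: "2026-03-01" }, // in practice: fetched from billing
  }),
};

function answerToolQuestion(toolName: string, arg: string): ToolResult {
  const tool = tools[toolName];
  if (!tool) throw new Error(`No system-of-record tool registered: ${toolName}`);
  return tool(arg); // the live system provides the facts; the model only phrases them
}
```
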

Mixed answers need split responsibilities

Some questions need both raw values and explanation:

  • "Why did spend rise last week?"
  • "How many failed jobs did we have, and what changed?"

In those cases:

  1. fetch the values deterministically
  2. optionally fetch document or policy context
  3. let the model explain based on those grounded inputs

That is much safer than asking the model to infer the facts and the explanation in one step.
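The three steps can be sketched as one pipeline. All three helpers here are hypothetical stand-ins for the warehouse, retrieval, and model layers:

```typescript
// Hypothetical helpers standing in for the warehouse, retrieval, and model layers.
async function fetchSpendByWeek(weeks: string[]): Promise<Record<string, number>> {
  // In practice: a validated, read-only warehouse query.
  return Object.fromEntries(weeks.map((w, i) => [w, 1000 + i * 250]));
}

async function retrievePolicyContext(topic: string): Promise<string[]> {
  // In practice: retrieval with citations over policy documents.
  return [`policy note about ${topic}`];
}

async function explainGroundedAnswer(
  facts: Record<string, number>,
  context: string[],
): Promise<string> {
  // In practice: a model call constrained to the grounded inputs above.
  const values = Object.entries(facts).map(([k, v]) => `${k}: $${v}`).join(", ");
  return `Facts: ${values}. Context: ${context.join("; ")}`;
}

async function answerWhySpendRose(): Promise<string> {
  const facts = await fetchSpendByWeek(["2026-W09", "2026-W10"]); // 1. deterministic fetch
  const context = await retrievePolicyContext("spend anomalies"); // 2. optional doc context
  return explainGroundedAnswer(facts, context); // 3. explain, grounded in both
}
```
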

What to evaluate

For grounded QA, track:

  • answer correctness
  • evidence correctness
  • unsupported-claim rate
  • query or tool success rate

If the system sounds helpful but the answer is wrong, it is not production-ready.
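The four metrics above can be tracked per answer with a simple eval record and aggregated across a test set. The field names are illustrative:

```typescript
// Illustrative per-answer eval record for grounded QA.
interface GroundedEval {
  answerCorrect: boolean;
  evidenceCorrect: boolean;
  unsupportedClaims: number; // claims in the answer not backed by fetched evidence
  queryOrToolSucceeded: boolean;
}

function summarizeEvals(evals: GroundedEval[]) {
  const n = evals.length;
  return {
    answerCorrectness: evals.filter((e) => e.answerCorrect).length / n,
    evidenceCorrectness: evals.filter((e) => e.evidenceCorrect).length / n,
    unsupportedClaimRate: evals.filter((e) => e.unsupportedClaims > 0).length / n,
    queryOrToolSuccessRate: evals.filter((e) => e.queryOrToolSucceeded).length / n,
  };
}
```

Tracking unsupported-claim rate separately matters because an answer can be numerically correct while still mixing in claims the evidence does not support.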

A fill-in grounding spec

Use this before shipping:

Workflow:
System of record:
Is the answer live, historical, or policy-based?
What should be fetched deterministically?
What should the model only explain?
What evidence should be shown back to the user?
Primary metric:
Guardrail metric:
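The fill-in spec can also live in code as a typed record, one per workflow. The fields mirror the checklist above; the example values for the spend workflow are illustrative:

```typescript
// The fill-in grounding spec as a typed record; one per workflow before shipping.
interface GroundingSpec {
  workflow: string;
  systemOfRecord: string;
  answerKind: "live" | "historical" | "policy";
  fetchDeterministically: string[];
  modelOnlyExplains: string[];
  evidenceShownToUser: string[];
  primaryMetric: string;
  guardrailMetric: string;
}

// Example, filled in for the spend workflow from earlier (values illustrative):
const spendSpec: GroundingSpec = {
  workflow: "AWS spend QA",
  systemOfRecord: "warehouse (rollup_daily)",
  answerKind: "historical",
  fetchDeterministically: ["weekly spend totals"],
  modelOnlyExplains: ["week-over-week change"],
  evidenceShownToUser: ["the executed query", "the returned rows"],
  primaryMetric: "answer correctness",
  guardrailMetric: "unsupported-claim rate",
};
```
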

This prevents the most common failure: using one answer path for data that clearly belongs to different sources.

The common failure mode

Teams default to RAG for everything because it feels like a general solution.

RAG is useful for:

  • policies
  • documentation
  • narrative reports

It is not the best default for:

  • live metrics
  • account records
  • operational statuses

If the answer is sitting in a table or API, route through the table or API.

How StackSpend helps

Grounded QA has distinct cost layers across model usage, query execution, and retrieval. In StackSpend, you can separate those layers by workflow, see whether one grounded assistant is overusing model explanation relative to cheap deterministic fetches, and watch cost per successful grounded answer as usage grows.


FAQ

When should I use text-to-SQL instead of RAG?

Use text-to-SQL when the answer belongs to structured tables and the user needs current or queryable facts, not just document explanations.

Should the model execute arbitrary SQL?

No. Use read-only enforcement, allowed-table validation, and preferably templates or validators that constrain what can run.

Can I combine SQL and retrieval in one answer?

Yes. Fetch the facts deterministically first, then add retrieved policy or documentation context if the user needs explanation.

What is the biggest risk in grounded QA?

Letting the model synthesize an answer that mixes grounded facts with unsupported claims. That is why evidence correctness matters alongside answer correctness.

Is retrieval still useful for structured-data QA?

Yes, but usually for surrounding narrative context such as policy, definitions, and documentation rather than for the live values themselves.

