Production systems · LLM reliability and governance · Module 4 of 4
Guides
March 12, 2026
By Andrew Day

When not to use an LLM: decision guide

The highest-leverage AI architecture choice is often not using an LLM at all. Use this guide to reject bad LLM candidates early.


One of the highest-leverage AI decisions is deciding not to use an LLM at all.

This is harder than it sounds because "add AI" often enters the roadmap before anyone has mapped the actual workflow. The result is a feature that feels modern but adds cost, review load, and operational fragility to a problem that code, search, or classic ML already solved well enough.

A practical decision table

| If the job is mostly... | Best default | Why |
| --- | --- | --- |
| Exact calculation | Code | Deterministic, auditable, cheaper |
| Threshold checking | Rules | Logic is explicit and stable |
| Simple lookup | Search or SQL | The answer already exists in a system of record |
| Stable high-volume labeling | Traditional ML | Usually faster and cheaper at scale |
| Messy semantic interpretation | Possibly an LLM | This is where LLMs earn their complexity |

The real decision flow

Use this flow before green-lighting an LLM feature:

| Question | If yes | If no |
| --- | --- | --- |
| Is the answer derivable by explicit rules or formulas? | Use code or rules | Keep going |
| Does the answer already live in a system of record? | Use search, SQL, or tools | Keep going |
| Is the output space tiny and fixed? | Consider rules or classic ML first | Keep going |
| Is the cost of a wrong answer high and hard to detect? | Require stricter controls or avoid LLM authority | Keep going |
| Is the remaining problem mostly messy language or semantic interpretation? | LLM may be justified | You probably do not need an LLM |

If the first three rows mostly land on non-LLM tools, adding an LLM is usually overengineering.
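The flow above can be sketched as a short function. The question order comes straight from the table; the `Workflow` fields and return strings are illustrative assumptions, not a real API:

```python
from dataclasses import dataclass

@dataclass
class Workflow:
    # Illustrative flags; in practice they come from mapping the real workflow.
    rule_derivable: bool
    in_system_of_record: bool
    tiny_output_space: bool
    high_undetectable_risk: bool
    mostly_semantic: bool

def default_tool(w: Workflow) -> str:
    """Walk the decision flow top to bottom, stopping at the first match."""
    if w.rule_derivable:
        return "code or rules"
    if w.in_system_of_record:
        return "search, SQL, or tools"
    if w.tiny_output_space:
        return "rules or classic ML"
    if w.mostly_semantic:
        # High, hard-to-detect risk means the LLM needs stricter controls.
        if w.high_undetectable_risk:
            return "LLM with strict controls"
        return "LLM may be justified"
    return "no LLM needed"

print(default_tool(Workflow(True, False, False, False, False)))  # code or rules
```

Note that the first three checks all resolve to non-LLM tools, which is the point: the LLM branch is only reachable after the cheaper options have been ruled out.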

Five strong reasons not to use an LLM

1. The task is deterministic

Examples:

  • pricing calculations
  • threshold checks
  • explicit policy rules

If the right answer should always be derived the same way, derive it in code.
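A minimal sketch of what "derive it in code" means in practice. The function names, prices, and the $500 limit are made-up examples, not a real pricing policy:

```python
def order_total(unit_price: float, qty: int, discount_rate: float = 0.0) -> float:
    """Deterministic pricing: same inputs always give the same, auditable answer."""
    if not 0.0 <= discount_rate <= 1.0:
        raise ValueError("discount_rate must be between 0 and 1")
    return round(unit_price * qty * (1.0 - discount_rate), 2)

def exceeds_threshold(amount: float, limit: float = 500.0) -> bool:
    """Explicit threshold check -- no model call required."""
    return amount > limit

print(order_total(19.99, 3, 0.10))  # 53.97
```

Every answer here is reproducible and testable, which is exactly what an LLM cannot guarantee for the same task.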

2. The answer already exists

Examples:

  • account balance
  • shipment status
  • current feature flag state

If a system of record can answer directly, route to that system. Do not ask the model to imitate a database.
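A minimal sketch of routing to the system of record, using an in-memory SQLite database as a stand-in. The table schema and order IDs are assumptions for illustration:

```python
import sqlite3

# In-memory stand-in for a real system of record; schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (order_id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO shipments VALUES ('A-1001', 'in_transit')")

def shipment_status(order_id: str):
    """Answer the question directly from the database, not from a model."""
    row = conn.execute(
        "SELECT status FROM shipments WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0] if row else None

print(shipment_status("A-1001"))  # in_transit
```

If an LLM is in the loop at all, its job is to call this lookup as a tool, never to guess the balance or status itself.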

3. The acceptable answer space is tiny

Examples:

  • map a known field to a known label
  • choose one menu item from exact triggers

Rules or classic ML are often simpler and more stable here.
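For a tiny, fixed answer space, a rule table is often all that is needed. The trigger phrases and labels below are invented examples:

```python
# Exact trigger phrases mapped to a small, fixed label set -- illustrative rules.
ROUTES = {
    "reset password": "auth_support",
    "cancel order": "order_support",
    "update billing": "billing_support",
}

def route_request(text: str) -> str:
    """Rule-based routing: stable, cheap, and easy to audit."""
    normalized = text.strip().lower()
    for trigger, label in ROUTES.items():
        if trigger in normalized:
            return label
    return "needs_review"  # fall through to a human or a richer classifier

print(route_request("Please CANCEL ORDER #552"))  # order_support
```

The fallback label is the escape hatch: only the inputs the rules cannot handle ever need anything smarter.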

4. The risk is too high

Examples:

  • legal interpretation
  • destructive admin actions
  • financial approvals

An LLM can assist these workflows, but it should not be the final authority by default.

5. The business value does not justify the operating cost

A technically possible LLM feature can still be a bad product choice if it adds:

  • prompt maintenance
  • review labor
  • latency
  • user confusion
  • recurring inference cost

The right comparison is not "LLM versus nothing." It is "LLM versus the cheapest reliable alternative."

To make this concrete: running 1 million classification calls per month through gpt-4.1-mini at $0.40/M input tokens and $1.60/M output tokens costs roughly $40–$80/month for the model call alone, before retries, review labor, or prompt upkeep. A trained text classifier deployed on a modest cloud instance can serve the same volume for a few dollars in compute with sub-millisecond latency. If the LLM is doing something the classifier cannot — handling edge cases, interpreting novel phrasing, returning structured reasoning — that cost difference may be justified. If it is doing straightforward labeling on a stable label set, the economics rarely hold up.

Prices as of March 2026. Always check current provider pricing before building a cost model.
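The estimate above can be reproduced with a few lines of arithmetic. The per-call token counts are assumptions (the article does not state them); the per-million-token prices are the ones quoted above:

```python
def monthly_llm_cost(calls: int, in_tokens: int, out_tokens: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Estimate monthly model-call cost from per-million-token prices."""
    input_cost = calls * in_tokens / 1_000_000 * in_price_per_m
    output_cost = calls * out_tokens / 1_000_000 * out_price_per_m
    return input_cost + output_cost

# 1M calls/month at the quoted gpt-4.1-mini prices; token counts assumed.
low = monthly_llm_cost(1_000_000, 80, 5, 0.40, 1.60)     # short prompt, tiny label
high = monthly_llm_cost(1_000_000, 150, 10, 0.40, 1.60)  # longer prompt
print(f"${low:.0f}-${high:.0f} per month")  # $40-$76 per month
```

Swapping in current prices and your real token counts turns this into an honest baseline to compare against a classic classifier's compute bill.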

Where teams get trapped

Teams often reach for an LLM because:

  • competitors said "AI"
  • the prototype looked impressive
  • the real workflow was never broken down
  • manual review is hiding low-quality automation

Those are weak reasons to keep a model in the architecture.

A short rejection rubric

If most answers are yes, reject or rethink the LLM idea:

  • Can the logic be written explicitly?
  • Is the input already structured?
  • Is the answer space small and fixed?
  • Is there already a cheaper reliable system?
  • Would a wrong answer be expensive or risky?

If four or five answers are yes, the default should be "no LLM."
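The rubric is mechanical enough to write down directly. The question list mirrors the rubric above; the threshold of four "yes" answers is the rule stated in the text:

```python
RUBRIC = [
    "Can the logic be written explicitly?",
    "Is the input already structured?",
    "Is the answer space small and fixed?",
    "Is there already a cheaper reliable system?",
    "Would a wrong answer be expensive or risky?",
]

def should_reject_llm(answers: list) -> bool:
    """Four or five 'yes' answers make 'no LLM' the default."""
    if len(answers) != len(RUBRIC):
        raise ValueError("provide one yes/no answer per rubric question")
    return sum(bool(a) for a in answers) >= 4

print(should_reject_llm([True, True, True, True, False]))  # True
```

Running this in a design review forces the team to answer each question explicitly instead of debating "AI" in the abstract.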

What to use instead

Prefer:

  • search for lookup problems
  • SQL or tools for system-of-record questions
  • rules engines for explicit policy logic
  • traditional ML for stable high-volume predictions
  • OCR, ASR, or parsing tools for basic media extraction

You do not get extra product value from using an LLM where a simpler system already solves the job.

The better question to ask

Instead of asking:

  • can an LLM do this?

Ask:

  • what is the cheapest reliable system that solves this workflow?

That framing produces better architecture decisions and much less AI theater.

How StackSpend helps

Avoided AI usage is an economic win too. In the Data Explorer, you can segment spend by provider and service category to see which AI features are growing fastest — and then ask, for each one, whether that spend is producing proportional value. Workflows that have high token cost but low resolution rates often turn out to be exactly the kind of deterministic problem that should not have used an LLM at all. The Monitoring view makes those outliers visible before they become large line items, so the architecture conversation happens while the problem is still small.

FAQ

Does "do not use an LLM" mean AI is the wrong strategy?

No. It means one workflow is better served by another tool. Good AI architecture is about choosing the right mechanism, not forcing an LLM into every problem.

When is classic ML better than an LLM?

When the labels are stable, training data is good, and low latency or low unit cost matters at scale.

Can an LLM still help a deterministic workflow?

Sometimes as an assistant around the edges, such as summarizing context before a deterministic step. It usually should not replace the deterministic core.

What is the biggest warning sign of AI overengineering?

Manual review keeps catching mistakes that explicit rules or direct system lookups would have prevented from the start.

What should I do when the team wants AI for strategic reasons?

Break the workflow down concretely and compare the LLM approach with the cheapest reliable alternative. That usually makes the trade-off easier to discuss honestly.

