Production systems · LLM reliability and governance · Module 4 of 4
Guides
March 12, 2026
By Andrew Day

When not to use an LLM: decision guide

The highest-leverage AI architecture choice is often not using an LLM at all. Use this guide to reject bad LLM candidates early.


One of the highest-leverage AI decisions is deciding not to use an LLM at all.

This is harder than it sounds because "add AI" often enters the roadmap before anyone has mapped the actual workflow. The result is a feature that feels modern but adds cost, review load, and operational fragility to a problem that code, search, or classic ML already solved well enough.

A practical decision table

| If the job is mostly... | Best default | Why |
| --- | --- | --- |
| Exact calculation | Code | Deterministic, auditable, cheaper |
| Threshold checking | Rules | Logic is explicit and stable |
| Simple lookup | Search or SQL | The answer already exists in a system of record |
| Stable high-volume labeling | Traditional ML | Usually faster and cheaper at scale |
| Messy semantic interpretation | Possibly an LLM | This is where LLMs earn their complexity |

The real decision flow

Use this flow before green-lighting an LLM feature:

| Question | If yes | If no |
| --- | --- | --- |
| Is the answer derivable by explicit rules or formulas? | Use code or rules | Keep going |
| Does the answer already live in a system of record? | Use search, SQL, or tools | Keep going |
| Is the output space tiny and fixed? | Consider rules or classic ML first | Keep going |
| Is the cost of a wrong answer high and hard to detect? | Require stricter controls or avoid LLM authority | Keep going |
| Is the remaining problem mostly messy language or semantic interpretation? | LLM may be justified | You probably do not need an LLM |

If the first three rows mostly land on non-LLM tools, adding an LLM is usually overengineering.
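The flow above can be sketched as a short function. The question order comes straight from the table; the `Workflow` fields and return strings are illustrative assumptions, not a real API:

```python
from dataclasses import dataclass

@dataclass
class Workflow:
    # Illustrative flags; in practice they come from mapping the real workflow.
    rule_derivable: bool
    in_system_of_record: bool
    tiny_output_space: bool
    high_undetectable_risk: bool
    mostly_semantic: bool

def default_tool(w: Workflow) -> str:
    """Walk the decision flow top to bottom, stopping at the first match."""
    if w.rule_derivable:
        return "code or rules"
    if w.in_system_of_record:
        return "search, SQL, or tools"
    if w.tiny_output_space:
        return "rules or classic ML"
    if w.mostly_semantic:
        # High, hard-to-detect risk means the LLM needs stricter controls.
        if w.high_undetectable_risk:
            return "LLM with strict controls"
        return "LLM may be justified"
    return "no LLM needed"

print(default_tool(Workflow(True, False, False, False, False)))  # code or rules
```

Note that the first three checks all resolve to non-LLM tools, which is the point: the LLM branch is only reachable after the cheaper options have been ruled out.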

Five strong reasons not to use an LLM

1. The task is deterministic

Examples:

  • pricing calculations
  • threshold checks
  • explicit policy rules

If the right answer should always be derived the same way, derive it in code.
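A minimal sketch of what "derive it in code" means in practice. The function names, prices, and the $500 limit are made-up examples, not a real pricing policy:

```python
def order_total(unit_price: float, qty: int, discount_rate: float = 0.0) -> float:
    """Deterministic pricing: same inputs always give the same, auditable answer."""
    if not 0.0 <= discount_rate <= 1.0:
        raise ValueError("discount_rate must be between 0 and 1")
    return round(unit_price * qty * (1.0 - discount_rate), 2)

def exceeds_threshold(amount: float, limit: float = 500.0) -> bool:
    """Explicit threshold check -- no model call required."""
    return amount > limit

print(order_total(19.99, 3, 0.10))  # 53.97
```

Every answer here is reproducible and testable, which is exactly what an LLM cannot guarantee for the same task.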

2. The answer already exists

Examples:

  • account balance
  • shipment status
  • current feature flag state

If a system of record can answer directly, route to that system. Do not ask the model to imitate a database.
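A minimal sketch of routing to the system of record, using an in-memory SQLite database as a stand-in. The table schema and order IDs are assumptions for illustration:

```python
import sqlite3

# In-memory stand-in for a real system of record; schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (order_id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO shipments VALUES ('A-1001', 'in_transit')")

def shipment_status(order_id: str):
    """Answer the question directly from the database, not from a model."""
    row = conn.execute(
        "SELECT status FROM shipments WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0] if row else None

print(shipment_status("A-1001"))  # in_transit
```

If an LLM is in the loop at all, its job is to call this lookup as a tool, never to guess the balance or status itself.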

3. The acceptable answer space is tiny

Examples:

  • map a known field to a known label
  • choose one menu item from exact triggers

Rules or classic ML are often simpler and more stable here.
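For a tiny, fixed answer space, a rule table is often all that is needed. The trigger phrases and labels below are invented examples:

```python
# Exact trigger phrases mapped to a small, fixed label set -- illustrative rules.
ROUTES = {
    "reset password": "auth_support",
    "cancel order": "order_support",
    "update billing": "billing_support",
}

def route_request(text: str) -> str:
    """Rule-based routing: stable, cheap, and easy to audit."""
    normalized = text.strip().lower()
    for trigger, label in ROUTES.items():
        if trigger in normalized:
            return label
    return "needs_review"  # fall through to a human or a richer classifier

print(route_request("Please CANCEL ORDER #552"))  # order_support
```

The fallback label is the escape hatch: only the inputs the rules cannot handle ever need anything smarter.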

4. The risk is too high

Examples:

  • legal interpretation
  • destructive admin actions
  • financial approvals

An LLM can assist these workflows, but it should not be the final authority by default.

5. The business value does not justify the operating cost

A technically possible LLM feature can still be a bad product choice if it adds:

  • prompt maintenance
  • review labor
  • latency
  • user confusion
  • recurring inference cost

The right comparison is not "LLM versus nothing." It is "LLM versus the cheapest reliable alternative."

To make this concrete: running 1 million classification calls per month through gpt-4.1-mini at $0.40/M input tokens and $1.60/M output tokens costs roughly $40–$80/month for the model call alone, before retries, review labor, or prompt upkeep. A trained text classifier deployed on a modest cloud instance can serve the same volume for a few dollars in compute with sub-millisecond latency. If the LLM is doing something the classifier cannot — handling edge cases, interpreting novel phrasing, returning structured reasoning — that cost difference may be justified. If it is doing straightforward labeling on a stable label set, the economics rarely hold up.

Prices as of March 2026. Always check current provider pricing before building a cost model.
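The estimate above can be reproduced with a few lines of arithmetic. The per-call token counts are assumptions (the article does not state them); the per-million-token prices are the ones quoted above:

```python
def monthly_llm_cost(calls: int, in_tokens: int, out_tokens: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Estimate monthly model-call cost from per-million-token prices."""
    input_cost = calls * in_tokens / 1_000_000 * in_price_per_m
    output_cost = calls * out_tokens / 1_000_000 * out_price_per_m
    return input_cost + output_cost

# 1M calls/month at the quoted gpt-4.1-mini prices; token counts assumed.
low = monthly_llm_cost(1_000_000, 80, 5, 0.40, 1.60)     # short prompt, tiny label
high = monthly_llm_cost(1_000_000, 150, 10, 0.40, 1.60)  # longer prompt
print(f"${low:.0f}-${high:.0f} per month")  # $40-$76 per month
```

Swapping in current prices and your real token counts turns this into an honest baseline to compare against a classic classifier's compute bill.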

Where teams get trapped

Teams often reach for an LLM because:

  • competitors said "AI"
  • the prototype looked impressive
  • the real workflow was never broken down
  • manual review is hiding low-quality automation

Those are weak reasons to keep a model in the architecture.

A short rejection rubric

If most answers are yes, reject or rethink the LLM idea:

  • Can the logic be written explicitly?
  • Is the input already structured?
  • Is the answer space small and fixed?
  • Is there already a cheaper reliable system?
  • Would a wrong answer be expensive or risky?

If four or five answers are yes, the default should be "no LLM."
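The rubric is mechanical enough to write down directly. The question list mirrors the rubric above; the threshold of four "yes" answers is the rule stated in the text:

```python
RUBRIC = [
    "Can the logic be written explicitly?",
    "Is the input already structured?",
    "Is the answer space small and fixed?",
    "Is there already a cheaper reliable system?",
    "Would a wrong answer be expensive or risky?",
]

def should_reject_llm(answers: list) -> bool:
    """Four or five 'yes' answers make 'no LLM' the default."""
    if len(answers) != len(RUBRIC):
        raise ValueError("provide one yes/no answer per rubric question")
    return sum(bool(a) for a in answers) >= 4

print(should_reject_llm([True, True, True, True, False]))  # True
```

Running this in a design review forces the team to answer each question explicitly instead of debating "AI" in the abstract.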

What to use instead

Prefer:

  • search for lookup problems
  • SQL or tools for system-of-record questions
  • rules engines for explicit policy logic
  • traditional ML for stable high-volume predictions
  • OCR, ASR, or parsing tools for basic media extraction

You do not get extra product value from using an LLM where a simpler system already solves the job.

The better question to ask

Instead of asking:

  • can an LLM do this?

Ask:

  • what is the cheapest reliable system that solves this workflow?

That framing produces better architecture decisions and much less AI theater.

How StackSpend helps

Avoided AI usage is an economic win too. In the Data Explorer, you can segment spend by provider and service category to see which AI features are growing fastest — and then ask, for each one, whether that spend is producing proportional value. Workflows that have high token cost but low resolution rates often turn out to be exactly the kind of deterministic problem that should not have used an LLM at all. The Monitoring view makes those outliers visible before they become large line items, so the architecture conversation happens while the problem is still small.

FAQ

Does "do not use an LLM" mean AI is the wrong strategy?

No. It means one workflow is better served by another tool. Good AI architecture is about choosing the right mechanism, not forcing an LLM into every problem.

When is classic ML better than an LLM?

When the labels are stable, training data is good, and low latency or low unit cost matters at scale.

Can an LLM still help a deterministic workflow?

Sometimes as an assistant around the edges, such as summarizing context before a deterministic step. It usually should not replace the deterministic core.

What is the biggest warning sign of AI overengineering?

Manual review keeps catching mistakes that explicit rules or direct system lookups would have prevented from the start.

What should I do when the team wants AI for strategic reasons?

Break the workflow down concretely and compare the LLM approach with the cheapest reliable alternative. That usually makes the trade-off easier to discuss honestly.

