One of the highest-leverage AI decisions is deciding not to use an LLM at all.
This is harder than it sounds because "add AI" often enters the roadmap before anyone has mapped the actual workflow. The result is a feature that feels modern but adds cost, review load, and operational fragility to a problem that code, search, or classic ML already solved well enough.
## A practical decision table
| If the job is mostly... | Best default | Why |
|---|---|---|
| Exact calculation | Code | Deterministic, auditable, cheaper |
| Threshold checking | Rules | Logic is explicit and stable |
| Simple lookup | Search or SQL | The answer already exists in a system of record |
| Stable high-volume labeling | Traditional ML | Usually faster and cheaper at scale |
| Messy semantic interpretation | Possibly an LLM | This is where LLMs earn their complexity |
## The real decision flow
Use this flow before green-lighting an LLM feature:
| Question | If yes | If no |
|---|---|---|
| Is the answer derivable by explicit rules or formulas? | Use code or rules | Keep going |
| Does the answer already live in a system of record? | Use search, SQL, or tools | Keep going |
| Is the output space tiny and fixed? | Consider rules or classic ML first | Keep going |
| Is the cost of a wrong answer high and hard to detect? | Require stricter controls or avoid LLM authority | Keep going |
| Is the remaining problem mostly messy language or semantic interpretation? | LLM may be justified | You probably do not need an LLM |
If the first three rows mostly land on non-LLM tools, adding an LLM is usually overengineering.
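The flow above can be sketched as a plain function that walks the questions top to bottom, first match wins. The field and function names are illustrative, not a real API:

```python
from dataclasses import dataclass

@dataclass
class Workflow:
    """Hypothetical workflow attributes; field names are illustrative."""
    rule_derivable: bool           # answer follows from explicit rules or formulas
    in_system_of_record: bool      # answer already lives in a database or API
    tiny_output_space: bool        # small, fixed set of acceptable answers
    high_undetected_risk: bool     # wrong answers are costly and hard to catch
    semantic_interpretation: bool  # remaining work is messy language understanding

def recommend(w: Workflow) -> str:
    """Walk the decision flow in order; the first matching row wins."""
    if w.rule_derivable:
        return "code or rules"
    if w.in_system_of_record:
        return "search, SQL, or tools"
    if w.tiny_output_space:
        return "rules or classic ML"
    if w.high_undetected_risk:
        return "stricter controls; avoid LLM authority"
    if w.semantic_interpretation:
        return "LLM may be justified"
    return "you probably do not need an LLM"
```

Writing the flow down this way forces the ordering into the open: the LLM branch is the last resort, not the first question.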
## Five strong reasons not to use an LLM
### 1. The task is deterministic
Examples:
- pricing calculations
- threshold checks
- explicit policy rules
If the right answer should always be derived the same way, derive it in code.
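A minimal sketch of what "derive it in code" looks like for tiered pricing. The tier structure and names are hypothetical; the point is that the same inputs always produce the same quote:

```python
from decimal import Decimal

def quote_price(unit_price: Decimal, quantity: int,
                discount_tiers: dict[int, Decimal]) -> Decimal:
    """Deterministic pricing: same inputs always yield the same quote.

    discount_tiers maps a minimum quantity to a fractional discount,
    e.g. {100: Decimal("0.10")} means 10% off at 100 units or more.
    """
    subtotal = unit_price * quantity
    discount = Decimal("0")
    # Apply the highest tier the quantity qualifies for.
    for min_qty, rate in sorted(discount_tiers.items()):
        if quantity >= min_qty:
            discount = rate
    return (subtotal * (Decimal("1") - discount)).quantize(Decimal("0.01"))
```

This version is auditable line by line, costs nothing per call, and cannot hallucinate a discount.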
### 2. The answer already exists
Examples:
- account balance
- shipment status
- current feature flag state
If a system of record can answer directly, route to that system. Do not ask the model to imitate a database.
### 3. The acceptable answer space is tiny
Examples:
- map a known field to a known label
- choose one menu item from exact triggers
Rules or classic ML are often simpler and more stable here.
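When the answer space is a handful of fixed labels, a keyword map is often enough. The triggers and queue names below are invented for illustration:

```python
# Exact triggers map to a fixed label set; anything unmatched is routed
# to an explicit fallback instead of being guessed by a model.
ROUTE_BY_KEYWORD = {
    "refund": "billing",
    "invoice": "billing",
    "password": "account",
    "login": "account",
}

def route_ticket(subject: str) -> str:
    """Return the queue for a ticket subject, or 'triage' if nothing matches."""
    subject_lower = subject.lower()
    for keyword, queue in ROUTE_BY_KEYWORD.items():
        if keyword in subject_lower:
            return queue
    return "triage"
```

The fallback branch is the important design choice: uncertainty is surfaced explicitly rather than papered over with a confident-sounding wrong answer.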
### 4. The risk is too high
Examples:
- legal interpretation
- destructive admin actions
- financial approvals
An LLM can assist these workflows, but it should not be the final authority by default.
### 5. The business value does not justify the operating cost
A technically possible LLM feature can still be a bad product choice if it adds:
- prompt maintenance
- review labor
- latency
- user confusion
- recurring inference cost
The right comparison is not "LLM versus nothing." It is "LLM versus the cheapest reliable alternative."
To make this concrete: running 1 million classification calls per month through gpt-4.1-mini at $0.40/M input tokens and $1.60/M output tokens (assuming roughly 100 to 200 input tokens and a handful of output tokens per call) costs roughly $40–$80/month for the model call alone, before retries, review labor, or prompt upkeep. A trained text classifier deployed on a modest cloud instance can serve the same volume for a few dollars in compute with sub-millisecond latency. If the LLM is doing something the classifier cannot — handling edge cases, interpreting novel phrasing, returning structured reasoning — that cost difference may be justified. If it is doing straightforward labeling on a stable label set, the economics rarely hold up.
Prices as of March 2026. Always check current provider pricing before building a cost model.
## Where teams get trapped
Teams often reach for an LLM because:
- competitors said "AI"
- the prototype looked impressive
- the real workflow was never broken down
- manual review is hiding low-quality automation
Those are weak reasons to keep a model in the architecture.
## A short rejection rubric
If most answers are yes, reject or rethink the LLM idea:
- Can the logic be written explicitly?
- Is the input already structured?
- Is the answer space small and fixed?
- Is there already a cheaper reliable system?
- Would a wrong answer be expensive or risky?
If four or five answers are yes, the default should be "no LLM."
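The rubric is simple enough to encode directly, which also pins down the "four or five" threshold instead of leaving it to vibes. The question list and function name are illustrative:

```python
REJECTION_QUESTIONS = [
    "Can the logic be written explicitly?",
    "Is the input already structured?",
    "Is the answer space small and fixed?",
    "Is there already a cheaper reliable system?",
    "Would a wrong answer be expensive or risky?",
]

def default_to_no_llm(answers: list[bool]) -> bool:
    """Four or five 'yes' answers means the default should be 'no LLM'."""
    if len(answers) != len(REJECTION_QUESTIONS):
        raise ValueError("answer every rubric question")
    return sum(answers) >= 4
```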
## What to use instead
Prefer:
- search for lookup problems
- SQL or tools for system-of-record questions
- rules engines for explicit policy logic
- traditional ML for stable high-volume predictions
- OCR, ASR, or parsing tools for basic media extraction
You do not get extra product value from using an LLM where a simpler system already solves the job.
## The better question to ask
Instead of asking:
- can an LLM do this?
Ask:
- what is the cheapest reliable system that solves this workflow?
That framing produces better architecture decisions and much less AI theater.
## How StackSpend helps
Avoided AI usage is an economic win too. In the Data Explorer, you can segment spend by provider and service category to see which AI features are growing fastest — and then ask, for each one, whether that spend is producing proportional value. Workflows that have high token cost but low resolution rates often turn out to be exactly the kind of deterministic problem that should not have used an LLM at all. The Monitoring view makes those outliers visible before they become large line items, so the architecture conversation happens while the problem is still small.
## FAQ
### Does "do not use an LLM" mean AI is the wrong strategy?
No. It means one workflow is better served by another tool. Good AI architecture is about choosing the right mechanism, not forcing an LLM into every problem.
### When is classic ML better than an LLM?
When the labels are stable, training data is good, and low latency or low unit cost matters at scale.
### Can an LLM still help a deterministic workflow?
Sometimes as an assistant around the edges, such as summarizing context before a deterministic step. It usually should not replace the deterministic core.
### What is the biggest warning sign of AI overengineering?
Manual review keeps catching mistakes that explicit rules or direct system lookups would have prevented from the start.
### What should I do when the team wants AI for strategic reasons?
Break the workflow down concretely and compare the LLM approach with the cheapest reliable alternative. That usually makes the trade-off easier to discuss honestly.