Use this when you need to give leadership a forecast for cloud and AI spend — and you do not have a FinOps team, a finance analyst, or a week to spare.
The fast answer: build a baseline from the last three months of actual spend, adjust for any known trend, and then add a separate AI-specific estimate based on feature launches and usage growth. Review pace daily and update the forecast when assumptions change. A living forecast you update weekly is far more useful than a precise one you built once in a planning cycle.
## What you will get in 10 minutes
- A simple way to set a baseline from historical data
- A trend adjustment that prevents the baseline from going stale
- A separate AI spend model with the right inputs
- A short story of why last-month extrapolation breaks for AI
- What to do when the forecast is wrong mid-month
## Why most forecasts fail teams like yours
Most cloud cost forecasting guides assume you have a FinOps team. They talk about variance analysis, rolling forecasts, and budget reconciliation. They are written for finance departments, not engineering teams.
But most teams do not have a FinOps person. They have a CTO who needs to answer the board question: "Will we stay within budget this month?" The answer needs to take about 10 minutes to prepare, and it needs to be defensible — not a shrug.
The other reason forecasts fail is that teams treat cloud and AI spend as the same problem. They are not.
Cloud infrastructure spend — servers, storage, databases, networking — behaves like a capacity-based cost. It moves slowly and predictably. Last month is usually a reasonable proxy for this month, with a trend adjustment.
AI spend is different. It is usage-based and feature-coupled. A single feature launch, a prompt change, or a background job going rogue can triple an AI API bill in days. Extrapolating from last month works fine until it suddenly does not.
## Start with baselines
A baseline is what normal looks like. It is your anchor. When spend diverges from the baseline, you have something to investigate. When it tracks the baseline, you have evidence that things are on pace.
Here is how to establish one from three months of data:
| Month | Total spend |
|---|---|
| January | $7,500 |
| February | $8,000 |
| March | $8,500 |
Average: ($7,500 + $8,000 + $8,500) ÷ 3 = $8,000
That average is your baseline. Not a precise prediction — an anchor. If spend next month comes in at $8,100, that is baseline noise. If it comes in at $11,400, that is a signal.
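As a sketch, the baseline is nothing more than the mean of recent monthly actuals (the function name is illustrative, numbers from the table above):

```python
def baseline(monthly_actuals):
    """Average of recent monthly actuals: an anchor, not a target."""
    return sum(monthly_actuals) / len(monthly_actuals)

print(baseline([7500, 8000, 8500]))  # 8000.0
```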
Why the baseline is the anchor, not the target: Teams sometimes treat the baseline as the budget. It is not. The budget is a decision about what you are willing to spend. The baseline is a description of what you have been spending. They inform each other, but they are different things. If the business is growing, you should expect spend to grow — the right question is whether it is growing at the right rate.
## Account for trend
If spend has been rising consistently, the baseline average understates what next month is likely to cost.
A simple trend adjustment: take the most recent month and add half of the most recent month-over-month growth.
From the example above:
- February to March growth: +$500
- Half of that: +$250
- Trend-adjusted forecast: $8,500 + $250 = $8,750
That takes about 30 seconds and is meaningfully better than a flat average. For most teams whose spend is drifting up slowly as the product grows, this one adjustment prevents the "we knew it was going up, we just did not account for it" conversation at month-end.
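The 30-second adjustment above can be sketched as (function name illustrative):

```python
def trend_adjusted_forecast(monthly_actuals):
    """Most recent month plus half of the latest month-over-month growth."""
    latest = monthly_actuals[-1]
    growth = monthly_actuals[-1] - monthly_actuals[-2]
    return latest + growth / 2

print(trend_adjusted_forecast([7500, 8000, 8500]))  # 8750.0
```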
When the trend adjustment is not enough: If something material changed — a team doubled, a major feature launched, a new provider was added — the trend from prior months does not capture the new baseline. In those cases, set the baseline fresh from the last two to four weeks of data rather than three months. More recent is more representative when behavior has changed.
## Add a confidence range
Forecasts are estimates, not certainties. Presenting a single number implies false precision. A range is more honest and more useful.
For stable workloads: "We expect to spend between $8,500 and $9,200 this month."
For variable workloads: "We expect to spend between $8,000 and $10,500 this month."
The range tells leadership when to start paying attention. If you are at $9,800 on day 20 and your range was $8,500 to $9,200, you are tracking above the top of the forecast and the conversation needs to happen now, not at month-end.
## AI spend is different — forecast it separately
Cloud infrastructure and AI API spend behave differently enough that they need separate forecasting inputs. Blending them into one number hides the most important signal.
Here is a story that illustrates why:
A team is forecasting their monthly spend. Last month was $9,200 total: $3,800 in cloud infrastructure and $5,400 in AI APIs. They apply a trend-adjusted forecast and land on $9,800 for the current month.
On day 8 of the month, a product engineer ships a new summarization feature. It is well received and usage picks up quickly. By day 14, total spend is already at $8,600, which puts them on pace for roughly $18,400 ($8,600 ÷ 14 × 30). The team panics. The finance lead asks what happened. The engineer says the feature is doing well. Everybody stares at each other.
What happened is that the forecast treated AI spend like infrastructure spend. It extrapolated from last month's usage pattern, which did not include the new feature. The feature launched, AI API calls doubled, and the forecast had no way of capturing that because it was built entirely from historical data.
The right way to forecast AI spend uses three forward-looking inputs, not backwards-looking averages:
- Cost per request — the average cost of a complete inference call in your main workflows. Pull this from the last 7 to 14 days of actual data.
- Requests per day — current daily call volume. This changes when traffic grows or features launch.
- Expected daily growth rate — informed by what you know is coming: feature launches, user growth targets, background jobs being enabled.
Simple formula:
Monthly AI spend =
cost per request × requests per day × 30
Example:
| Input | Value |
|---|---|
| Cost per request | $0.007 |
| Requests per day | 12,000 |
| Days in month | 30 |
Projected AI spend: $0.007 × 12,000 × 30 = $2,520
Now add what you know is changing: if a new feature is launching mid-month that you expect to add 3,000 requests per day, model that separately and add it to the second half of the month. Do not just average it in.
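A minimal sketch of this calculation, using the numbers from the table above plus a hypothetical mid-month launch adding 3,000 requests per day for the last 15 days:

```python
def ai_monthly_spend(cost_per_request, requests_per_day, days=30):
    """Forward-looking AI spend estimate: unit cost x volume x days."""
    return cost_per_request * requests_per_day * days

base = ai_monthly_spend(0.007, 12_000)  # existing workload, full month
# Hypothetical mid-month launch: +3,000 requests/day for the last 15 days,
# modeled separately rather than averaged into the whole month.
feature = ai_monthly_spend(0.007, 3_000, days=15)
total = base + feature
print(round(total, 2))  # 2835.0
```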
Add a stress case for AI: Because AI spend can move so fast, a stress case matters more than it does for infrastructure. Model "what if cost per request rises 20% and volume grows 30% above plan?" That becomes your escalation threshold. If the mid-month pace suggests you are heading toward the stress case, you have a week or two to act rather than discovering it at invoice time.
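One way to sketch the stress case, with the uplift percentages from above as default assumptions:

```python
def stress_case(cost_per_request, requests_per_day, days=30,
                cost_uplift=0.20, volume_uplift=0.30):
    """Escalation threshold: cost/request +20% and volume +30% above plan."""
    return (cost_per_request * (1 + cost_uplift)
            * requests_per_day * (1 + volume_uplift) * days)

print(round(stress_case(0.007, 12_000), 2))  # 3931.2
```

If mid-month pace is heading toward that number rather than the $2,520 plan, that is the signal to act.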
## Review daily pace throughout the month
The forecast is not a document you file and forget. Its value comes from comparing actual daily pace against the projected pace every few days.
Track three numbers daily:
- Month-to-date spend
- Projected month-end at current pace
- Variance vs forecast at this point in the month
If you are 12 days into a 30-day month, you should be roughly at 40% of your forecast. Being at 60% on day 12 is very different from being at 60% on day 25.
Set a simple check: if projected month-end is more than 15% above the forecast, investigate before you keep going.
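The three daily numbers and the 15% check can be sketched as (inputs are illustrative; the forecast value echoes the trend-adjusted example above):

```python
def pace_check(mtd_spend, day_of_month, forecast,
               days_in_month=30, threshold=0.15):
    """Project month-end at current pace and flag if it exceeds
    the forecast by more than the threshold (default 15%)."""
    projected = mtd_spend / day_of_month * days_in_month
    variance = (projected - forecast) / forecast
    return projected, variance, variance > threshold

projected, variance, investigate = pace_check(3_900, 12, 8_750)
print(round(projected), f"{variance:+.0%}", investigate)  # 9750 +11% False
```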
## When the forecast is wrong mid-month
This happens. The right response is not panic — it is a short structured check.
- Quantify the gap. How far above forecast is the current pace? Is it 10% or 50%?
- Is it cloud or AI? Check which category moved. Cloud moving unexpectedly means an infrastructure change — scale-up, a new service, data egress. AI moving unexpectedly means a feature, prompt change, model routing change, or background job.
- Is it volume or cost per request? Volume growing means the product is being used more. Cost per request growing means something about each request got more expensive. These have different fixes.
- Update the forecast. Do not leave a stale number in the spreadsheet. Update it to reflect the new expected month-end and communicate it to whoever needs to know.
- Decide on a response. Accept the overrun if it is justified by product growth. Investigate and fix if it is waste. Escalate if it is large enough to require a leadership decision.
The teams that handle mid-month surprises well are the ones who have practiced explaining the gap: "We are on pace for $12,000 against a $9,500 forecast because the new search feature launched on the 8th and is running at 2.5x the expected volume. Cost per request is flat, so this is growth, not waste. We are updating the forecast to $11,800 and will adjust next month's budget assumption."
That is a three-sentence explanation that took 20 minutes to prepare. It is much easier to give than "we're not sure why it's high."
## Keep it simple enough to actually update
You do not need a complex forecasting model. You need something that the team will actually look at and update.
A shared spreadsheet with:
- last three months of actuals
- trend-adjusted baseline
- AI spend estimate from current cost-per-request and volume
- a stress case
- daily pace tracker
Updated weekly, that covers most of what a board or leadership team needs. The key is the weekly update habit. A forecast that was built in January and never touched is nearly useless in March.
## How StackSpend helps
StackSpend gives you the inputs that make forecasting easier:
- cross-provider daily spend to establish baselines
- category breakdowns to separate AI inference from infrastructure
- daily forecast vs budget tracking
- cost-per-request visibility across your main workflows
That makes it easier to build a living forecast rather than a one-time estimate.