Bottom-Up vs Top-Down Forecasting: Which to Use

By Jason Osajima, former VP of AI at a $250M manufacturer · Updated June 2026

Quick answer

Bottom-up vs top-down forecasting compared on accuracy, speed, and where each breaks. Plus the middle-out method that actually wins. From a $250M operator.

Use bottom-up forecasting when you need a number operations can build to (a specific SKU at a specific plant), and top-down when you need a fast, stable aggregate for the financial plan or a board view. For most $100M–$1B manufacturers the right answer is neither alone — you forecast at the level where the signal is strongest, then reconcile down to SKU and up to total so production and finance plan against one number. The divergence between the two methods isn't noise to average away; it's an early warning that one of your numbers is wrong.

I learned this the hard way at a $250M manufacturer. Finance forecasted top-down by revenue. Demand planning forecasted bottom-up by SKU. The two numbers ran 9% apart every quarter, and nobody owned closing the gap. Production planned to one, finance reported the other, and the obsolescence reserve quietly ate the difference.

What each method actually does

Both methods cut the same cake. They just start from opposite ends of the hierarchy, and that starting point determines where each one is sharp and where it goes blind.

Bottom-up forecasting builds the total from the floor up. You forecast every SKU at every location, then sum. It's granular, it's what operations needs to plan production and replenishment, and it's how demand planners naturally think.

Top-down forecasting starts with the aggregate — total company revenue, or a product family's annual target — then allocates down to SKUs using historical mix. The Institute of Business Forecasting (2024) defines it as forecasting first at a higher level of aggregation, then disaggregating into categories and SKUs. It's how finance and the board think, and it's fast.
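To make the two mechanics concrete, here is a minimal sketch. The SKU names, volumes, and the 2,400-unit family target are made-up numbers for illustration, and the helper names are my own, not from any library.

```python
# Hypothetical SKU-level forecasts for one product family (units).
sku_forecasts = {"SKU-A": 1200.0, "SKU-B": 800.0, "SKU-C": 500.0}

# Hypothetical last-year shipments, used as the top-down allocation key.
last_year_units = {"SKU-A": 1000.0, "SKU-B": 1000.0, "SKU-C": 500.0}


def bottom_up_total(forecasts):
    """Bottom-up: the family number is the sum of the SKU forecasts."""
    return sum(forecasts.values())


def top_down_allocate(family_total, history):
    """Top-down: split a family total across SKUs by historical share."""
    total_history = sum(history.values())
    return {sku: family_total * units / total_history
            for sku, units in history.items()}


family_from_skus = bottom_up_total(sku_forecasts)               # 2500.0
skus_from_family = top_down_allocate(2400.0, last_year_units)   # SKU-A gets 960.0
```

Same family, two different answers at both levels: the bottom-up total is 2,500 while the top-down target is 2,400, and SKU-A is 1,200 one way and 960 the other.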

The core trade-off comes down to where errors live. Bottom-up is accurate where it matters (the SKU you actually ship) but noisy in aggregate, because thousands of small errors accumulate and the long tail is mostly guesswork. Top-down is stable in aggregate but blind at the SKU level — it'll tell you the family will do $4M and have no idea which of the 60 SKUs inside it customers actually want.

Why aggregate forecasts are more accurate

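This isn't an opinion. It's a statistical property of aggregation: random errors at the SKU level partially cancel when you sum them, so the aggregate series is smoother and a forecast made at a higher level is more accurate in percentage terms than the forecasts of its individual parts. What doesn't cancel is shared bias, which is why a summed bottom-up total can still run worse than a direct top-down number.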
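A quick simulation shows the cancellation effect. This is a sketch under strong assumptions that real portfolios don't meet: every SKU has the same mean demand, the noise is independent across SKUs, and the forecast is unbiased. Under those assumptions the relative noise of the total shrinks roughly with one over the square root of the number of SKUs.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

N_SKUS, N_PERIODS = 500, 24
MEAN_DEMAND, NOISE_SD = 100.0, 20.0  # assumed identical for every SKU

sku_ape = []    # absolute percentage error for each SKU-period
total_ape = []  # absolute percentage error for each period at the total

for _ in range(N_PERIODS):
    # Independent noise per SKU; floor at 1 unit to keep the percentage defined.
    actuals = [max(1.0, random.gauss(MEAN_DEMAND, NOISE_SD)) for _ in range(N_SKUS)]
    forecasts = [MEAN_DEMAND] * N_SKUS  # unbiased forecast: the true mean
    sku_ape.extend(abs(f - a) / a for f, a in zip(forecasts, actuals))
    total_ape.append(abs(sum(forecasts) - sum(actuals)) / sum(actuals))

sku_mape = sum(sku_ape) / len(sku_ape)
total_mape = sum(total_ape) / len(total_ape)
print(f"average SKU-level MAPE: {sku_mape:.1%}")
print(f"total-level MAPE:       {total_mape:.1%}")
```

The total-level error comes out far smaller than the average SKU-level error. Add the same optimistic bias to every SKU forecast and that advantage disappears, because a shared bias adds up instead of cancelling.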

The textbook reference is Hyndman and Athanasopoulos (2021) in Forecasting: Principles and Practice, which lays out the hierarchical structure formally — bottom-up, top-down, and the reconciliation methods that beat both. Their forecast reconciliation review (2024) shows that reconciling forecasts across the hierarchy brings information from the granular series back up and from the aggregate series back down, improving accuracy at multiple levels at once.
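For intuition, here is a sketch of OLS reconciliation, one of the simplest approaches in that literature, on a two-level hierarchy with one total and n SKUs. For that special case the matrix algebra collapses to a closed form: spread the gap between the total forecast and the SKU sum evenly across all n + 1 series. The function name and the numbers are illustrative assumptions, not anything from the book or from my old process.

```python
def ols_reconcile(total_forecast, sku_forecasts):
    """OLS reconciliation for a two-level hierarchy (one total, n SKUs).

    Each SKU moves by gap / (n + 1), where gap is the total forecast minus
    the sum of SKU forecasts. The reconciled total is the sum of reconciled SKUs.
    """
    n = len(sku_forecasts)
    gap = total_forecast - sum(sku_forecasts.values())
    adjustment = gap / (n + 1)
    reconciled_skus = {sku: f + adjustment for sku, f in sku_forecasts.items()}
    reconciled_total = sum(reconciled_skus.values())
    return reconciled_total, reconciled_skus


# Hypothetical base forecasts: the top says 1,000, the SKUs sum to 1,090.
total, skus = ols_reconcile(1000.0, {"A": 500.0, "B": 300.0, "C": 290.0})
# total == 1022.5, and every SKU moved down by 22.5
```

Both levels give ground: the total moves up from 1,000 and the SKUs move down from 1,090 until they agree. The equal-sized adjustment can push a very small SKU negative, which is one reason practical methods weight the adjustment by each series' forecast error variance instead.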

The largest public test of this is the M5 competition, which forecasted 42,840 hierarchical Walmart series. Makridakis et al. (2022) found that machine-learning methods dominated, but plain exponential smoothing stayed competitive specifically at the product and product-store level — a reminder that the right method depends on where in the hierarchy you're standing.

Where bottom-up wins

Bottom-up earns its keep wherever a human or a machine has to act on a single item.

Replenishment and production. You can't build "the family." You build a specific SKU at a specific plant. Bottom-up is the only forecast operations can execute against.
High-volume, low-variability A items. When a SKU has clean, stable history, its individual forecast is reliable and worth doing precisely. This is the natural output of ABC-XYZ inventory analysis — the A/X quadrant is where bottom-up shines.
Capturing real SKU-level signal. A promo on one SKU, a new customer for another — bottom-up captures it where top-down smears it across the whole family.
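If you haven't segmented your SKUs yet, an ABC-XYZ pass can be sketched in a few lines. The cut-offs below (80% and 95% of cumulative volume for ABC, coefficient of variation of 0.5 and 1.0 for XYZ) are common starting points that I'm assuming here, not a standard; tune them to your own portfolio. The demand history is made up.

```python
from statistics import mean, pstdev


def abc_xyz(history, abc_cuts=(0.80, 0.95), xyz_cuts=(0.5, 1.0)):
    """Label each SKU with an ABC class (volume) and an XYZ class (variability).

    history maps SKU -> list of per-period demand. Cut-offs are illustrative.
    """
    totals = {sku: sum(demand) for sku, demand in history.items()}
    grand_total = sum(totals.values())
    labels, cumulative = {}, 0.0
    for sku in sorted(totals, key=totals.get, reverse=True):
        # Use the cumulative share before this SKU, so the top seller is always A.
        share_before = cumulative / grand_total
        abc = "A" if share_before < abc_cuts[0] else "B" if share_before < abc_cuts[1] else "C"
        cumulative += totals[sku]
        avg = mean(history[sku])
        cv = pstdev(history[sku]) / avg if avg > 0 else float("inf")
        xyz = "X" if cv < xyz_cuts[0] else "Y" if cv < xyz_cuts[1] else "Z"
        labels[sku] = abc + xyz
    return labels


labels = abc_xyz({
    "SKU-1": [100, 105, 95, 100],  # high volume, stable
    "SKU-2": [10, 40, 5, 25],      # mid volume, choppy
    "SKU-3": [0, 0, 20, 0],        # low volume, intermittent
})
# labels == {"SKU-1": "AX", "SKU-2": "BY", "SKU-3": "CZ"}
```

SKU-1 lands in the A/X quadrant, which is where a dedicated SKU-level forecast is worth the effort.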

Where bottom-up breaks

The long tail kills it. If 70% of your SKUs are low-volume, high-variability C items, their individual forecasts are barely better than noise, and summing 3,000 noisy guesses gives you a shaky total. Intermittent items make this worse, which is why forecasting intermittent demand for spare parts needs its own methods entirely. Bottom-up also drifts: each planner's small optimism compounds into a company number that's systematically high — a measurable forecast bias.

Where top-down wins

Top-down earns its keep on speed and stability.

Aggregate accuracy. As the reconciliation literature shows, forecasting at a higher level of aggregation is more accurate in percentage terms because errors cancel. Your total-company forecast from the top is usually tighter than the sum of SKU forecasts.
Speed and the financial plan. Finance can set a defensible revenue number in an afternoon. The board cares about the family and the quarter, not SKU 40231.
Imposing a reality check. When sales' bottom-up forecast sums to 30% growth and the market is growing 4%, top-down is the gut check that says "prove it."

Where top-down breaks

Mix. Top-down allocates by historical share, so it assumes next quarter's mix looks like last year's. The moment a SKU is launching, dying, or shifting, the allocation is wrong at exactly the SKUs where being wrong is expensive. Top-down will keep replenishing a dying SKU because its historical share says to — and it has no native way to forecast a brand-new one, which is its own discipline covered in new product demand forecasting.
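A small made-up example shows how this plays out. Assume the family forecast is exactly right at 1,000 units, one SKU is being phased out, and a new one launched this quarter. All names and numbers are hypothetical.

```python
# Last year's mix, used as the allocation key. SKU-NEW did not exist yet.
last_year_share = {"SKU-OLD": 0.50, "SKU-CORE": 0.50, "SKU-NEW": 0.00}

# What customers actually bought this quarter.
actual_units = {"SKU-OLD": 100.0, "SKU-CORE": 500.0, "SKU-NEW": 400.0}

family_forecast = 1000.0  # assume the family-level number is exactly right

allocated = {sku: family_forecast * share for sku, share in last_year_share.items()}
sku_error = {sku: allocated[sku] - actual_units[sku] for sku in actual_units}
family_error = sum(allocated.values()) - sum(actual_units.values())

# family_error == 0.0
# sku_error == {"SKU-OLD": 400.0, "SKU-CORE": 0.0, "SKU-NEW": -400.0}
```

The family number is perfect and the plan is still badly wrong: 400 units too many of the dying SKU and 400 too few of the launch.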

Side-by-side comparison

Dimension	Bottom-up	Top-down
Granularity	SKU-location	Family / total
Aggregate accuracy	Lower (errors compound)	Higher (errors cancel)
SKU-level accuracy	Higher for A items	Low (mix-blind)
Speed	Slow	Fast
Best for	Production, replenishment	Financial plan, board view
Main failure	Long-tail noise, upward drift	Wrong mix, misses NPI/EOL
Natural owner	Demand planning, ops	Finance, FP&A

Read the table as a map, not a scoreboard. Each method is strong exactly where the other is weak, which is the whole reason the winning approach uses both.

The answer: middle-out reconciliation

The method that actually wins isn't picking one. It's middle-out: forecast at the level where the signal is strongest (usually product family or product line), then disaggregate down to SKU for execution and aggregate up to total for the financial plan. You reconcile in both directions, and the reconciliation is the part that pays.

Here's the workflow that fixed our 9% gap:

1. Forecast at the family level statistically. This is where you get the best accuracy-to-effort ratio.
2. Disaggregate to SKU using a mix model, not flat historical share. Bias the mix toward what's growing and away from what's dying.
3. Let planners override at the SKU level for known events (promos, launches, lost accounts), then re-aggregate.
4. Reconcile the bottom-up sum against the top-down total. When they diverge more than a tolerance (we used 5%), that's a meeting, not a rounding error. The gap is information.
5. Lock one number that feeds both production and the financial plan.
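The disaggregate, override, and reconcile steps can be sketched as three small functions. This is a simplified illustration, not the model we ran: the trend-adjusted mix rule, the trend_weight parameter, and all the numbers are assumptions made for the example. Only the 5% tolerance comes from the workflow above.

```python
def disaggregate(family_forecast, recent_units, prior_units, trend_weight=0.5):
    """Split a family forecast across SKUs with a trend-adjusted mix.

    Each SKU's weight is its recent volume scaled by its growth versus the
    prior period, damped by trend_weight (0 gives flat historical share).
    """
    weights = {}
    for sku, recent in recent_units.items():
        prior = prior_units.get(sku, 0.0)
        growth = recent / prior if prior > 0 else 1.0
        weights[sku] = recent * growth ** trend_weight
    total_weight = sum(weights.values())
    return {sku: family_forecast * w / total_weight for sku, w in weights.items()}


def apply_overrides(sku_forecast, overrides):
    """Planner overrides replace the statistical number for known events."""
    return {**sku_forecast, **overrides}


def reconcile(sku_forecast, top_down_total, tolerance=0.05):
    """Re-aggregate and flag the gap when it exceeds the tolerance."""
    bottom_up_total = sum(sku_forecast.values())
    gap = (bottom_up_total - top_down_total) / top_down_total
    return bottom_up_total, gap, abs(gap) > tolerance


recent = {"SKU-A": 900.0, "SKU-B": 400.0}  # SKU-A is growing fast
prior = {"SKU-A": 400.0, "SKU-B": 400.0}

split = disaggregate(1750.0, recent, prior)         # trend-adjusted mix
plan = apply_overrides(split, {"SKU-B": 550.0})     # planner knows about a promo
total, gap, needs_review = reconcile(plan, 1750.0)  # compare against finance
```

Here the promo override lifts the bottom-up sum to 1,900 against a top-down 1,750, a gap of about 8.6%, so needs_review comes back True and the gap goes to the meeting instead of getting averaged away.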

Why the reconciliation step is the whole point

A divergence between bottom-up and top-down isn't a problem to average away. It's the early-warning system. When demand planning's bottom-up was 9% above finance's top-down, one of them was wrong, and finding out which prevented either a stockout or a warehouse full of dead stock.

This is also where forecasting stops being a math problem and becomes a process problem. Reconciling the two numbers is the core of consensus demand planning and the demand-review step of S&OP — getting finance, sales, and operations to commit to one number instead of defending three. The statistics give you the candidate forecast; the meeting decides whose override wins.

How to measure whether it's working

You can't manage the gap if you don't measure it. Track forecast error at every level of the hierarchy, not just the total, because a clean aggregate can hide ugly SKU-level swings that still cause stockouts.

Use a weighted metric so big SKUs carry their weight. APQC's open-standards benchmarking (2024) tracks average monthly demand-forecast error as MAPE, and weighted MAPE keeps a handful of high-value items from being drowned out by noise on the long tail — the trade-off covered in MAPE vs WMAPE. Watch bias separately from error; persistent over-forecast is the silent driver of obsolete stock.
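The three measures are simple enough to sketch directly. The definitions below are the common ones (bias here is signed error as a share of actuals, positive meaning over-forecast), and the three-SKU data set is made up to show how MAPE and WMAPE can tell very different stories.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error; skips items with zero actuals."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    return sum(abs(f - a) / abs(a) for a, f in pairs) / len(pairs)


def wmape(actuals, forecasts):
    """Volume-weighted MAPE: total absolute error over total actual demand."""
    total_abs_error = sum(abs(f - a) for a, f in zip(actuals, forecasts))
    return total_abs_error / sum(abs(a) for a in actuals)


def bias(actuals, forecasts):
    """Signed error as a share of actuals; positive means over-forecast."""
    return (sum(forecasts) - sum(actuals)) / sum(actuals)


# One big SKU forecast well, two tiny SKUs forecast badly.
actuals = [1000.0, 10.0, 5.0]
forecasts = [1050.0, 5.0, 10.0]

m = mape(actuals, forecasts)   # about 0.52: dominated by the two tiny SKUs
w = wmape(actuals, forecasts)  # about 0.06: dominated by the big SKU
b = bias(actuals, forecasts)   # about +0.05: net over-forecast
```

Plain MAPE says the forecast is off by roughly half, WMAPE says about 6%, and the units that drive inventory dollars sit with the second number. Bias stays a separate signal because errors in opposite directions net out in it.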

The payoff for getting this right is concrete. McKinsey (2023) reports that embedding AI in operations can cut inventory by 20 to 30 percent, largely by tightening demand forecasts — and a reconciled middle-out process is the planning discipline that makes those gains stick instead of evaporating in the next cycle.

Tooling makes or breaks this

Middle-out is painful in spreadsheets. You're disaggregating, overriding, re-aggregating, and reconciling across thousands of SKUs every cycle, and the moment finance and demand planning work in separate files you're back to two truths.

This is exactly what an integrated planning platform is for — one model where the family-level forecast, the SKU disaggregation, and the financial roll-up are the same object, so reconciliation is automatic instead of a quarterly fight. When demand and finance plan against one live model, the 9% gap doesn't get a chance to form. If you're still living in Excel, the switching decision is its own question worth working through in Excel vs demand planning software.

Stop planning against two numbers

If your bottom-up SKU forecast and your top-down financial plan don't tie out, you already have stranded inventory or stockouts hiding in the gap — you just haven't measured it yet.

We'll run a free planning-maturity and stranded-inventory teardown to find where the two numbers diverge and what it's costing you in trapped cash. Book a call and we'll reconcile your forecast together.

Frequently asked questions

Is bottom-up or top-down forecasting more accurate?

It depends on the level you measure. Top-down is more accurate at the aggregate because SKU-level errors cancel when summed, while bottom-up is more accurate for individual high-volume A items where you have clean history. For the full hierarchy, a reconciled middle-out forecast usually beats either method used alone.

What is middle-out forecasting?

Middle-out forecasting generates the base forecast at an intermediate level — typically product family or product line — then disaggregates down to SKUs for execution and aggregates up to the total for the financial plan. It captures the aggregate stability of top-down and the granularity of bottom-up. Reconciling the two directions is what makes it work.

Which teams should own each forecast?

Demand planning and operations naturally own the bottom-up SKU forecast because they execute against it, while finance and FP&A own the top-down aggregate that feeds the financial plan. The failure mode is letting each team run its own number in isolation. A consensus or S&OP process is what forces them to reconcile to one committed forecast.

How big a gap between bottom-up and top-down is normal?

There's no universal threshold, but many mid-market manufacturers set a tolerance band of around 5% and treat anything beyond it as a signal to investigate rather than average away. A persistent gap in the same direction usually points to forecast bias — often optimism compounding through bottom-up SKU forecasts. The gap itself is useful information about which number to trust.

Does AI replace the choice between bottom-up and top-down?

No. Machine-learning models still forecast at specific levels of the hierarchy and still need reconciliation to stay coherent across SKU, family, and total. As the M5 competition showed, AI methods win on accuracy but don't eliminate the hierarchical structure — they make middle-out reconciliation easier to run at scale, not unnecessary.
