Commissioned, Curated and Published by Russ. Researched and written with AI.

This is the living version of this post. View versioned snapshots in the changelog below.


What’s New This Week

5 March 2026 (evening): The 2025 numbers are in, and they are stark. DRAM prices rose 172% throughout 2025 – not a rounding error. Samsung halted new orders for DDR5 modules entirely to reassess pricing structures. Micron went further and exited its Crucial consumer brand altogether. Contract DDR5 pricing nearly tripled over the course of 2025, moving from around $7 per unit to $19.50. Samsung’s 32GB consumer modules went from $149 to $239 – a 60% increase – before Samsung paused the market.

Micron’s Q2 2026 gross margin is expected to reach 68%, compared to losses at the 2022-2023 DRAM trough. The Crucial brand exit is clarifying: that business competed on commodity pricing in a market that no longer behaves like one. Why maintain low-margin consumer retail infrastructure when the HBM and enterprise server channels are the entire growth story?

The downstream impact is arriving on schedule. PC prices are projected to rise 15-20% in Q1 2026 as system builders absorb higher memory costs. This is the cascade the post’s main body describes – AI accelerator demand pulling wafer capacity toward HBM, compressing conventional DDR5 supply, and the cost landing on the consumer at the point of purchase. It is no longer speculative.


Changelog

Date       | Summary
5 Mar 2026 | 2025 DRAM prices up 172%, Samsung halts DDR5 orders, Micron exits Crucial brand – the cascade arrives.
5 Mar 2026 | AMD Ryzen AI 400 for AM5 desktop – 50 TOPS NPU on consumer platform, standard DDR5.
5 Mar 2026 | Inaugural publication

What’s Actually Happening to Memory Prices

The memory industry is not in uniform crisis. It is bifurcating: two types of memory are diverging rapidly in price, manufacturing priority, and strategic importance.

HBM (High Bandwidth Memory) is in a seller’s market with demand that outstrips supply by a significant margin. HBM3E – the variant required by Nvidia’s H200, B200, and GB200 accelerators – commands steep contract prices; figures of roughly $20,000 to $25,000 per stack have been reported, though per-stack pricing estimates vary widely by source. Each H200 GPU carries six such stacks for its 141 GB memory complement. The maths on what that does to accelerator build costs is left as an exercise. SK Hynix reported in 2024 that its HBM business was sold out through 2025; Micron made similar noises when it entered HBM3E volume production in early 2024. Samsung, by contrast, spent much of 2024 working through yield and qualification issues with Nvidia’s HBM3E programme, putting it behind schedule. Note: market share figures for early 2026 are extrapolated from late 2024 reporting – treat specific percentages with caution.

DDR5 is a more nuanced story. After the catastrophic oversupply crash of 2022-2023 – where a 32 GB DDR5-5600 kit fell from launch prices north of $200 to under $90 – prices have been recovering. The recovery is not driven by strong consumer demand. It is driven by the same manufacturers pulling wafer capacity away from conventional DRAM to feed HBM. By late 2024, decent 32 GB DDR5 kits were back in the $110-130 range. The trajectory through 2025 has been upward – sharply so, with contract prices nearly tripling over the year (see What’s New above). Specific current pricing is moving fast enough that checking a retailer like Newegg or a price tracker like Camelcamelcamel directly is more useful than any figure printed here.

Server DRAM (DDR5 RDIMM) is under additional pressure from a different angle. AI infrastructure deployments do not just need HBM on the GPU. They need substantial DDR5 registered ECC memory on the host side too – a DGX H100 system runs eight GPUs alongside dual-socket Intel Xeon CPUs backed by 2 TB of system DDR5. Scale that across hyperscaler deployments and you have a separate demand signal compressing server DRAM supply. Enterprise procurement teams reported lead times stretching through 2024 into 2025.

LPDDR5 – the mobile memory type used in phones, Arm laptops, and Apple Silicon systems – has been somewhat insulated but not immune. The Apple Silicon story is intertwined with this: M-series chips use integrated LPDDR5 packages that Apple secures through long-term supply agreements, which partly decouples them from spot-market volatility. For everyone else building Arm laptops on Snapdragon X or MediaTek, LPDDR5 sits in a supply chain also stressed by the overall fab capacity crunch.

Who is making money: SK Hynix is the clear winner. Its 2024 results reflected HBM-driven revenue that transformed its financial position – HBM reportedly accounted for over 30% of total DRAM revenue by mid-2024, at dramatically higher margins than commodity DRAM. Samsung’s semiconductor division recovered in 2024 as DRAM prices generally rebounded, but HBM qualification delays meant it captured less of the premium market than SK Hynix. Micron came late to HBM overall but moved early on HBM3E, and is growing share, helped by US policy tailwinds and significant CHIPS Act funding. All three are profitable. The question is who captures the highest-margin AI accelerator supply contracts.


Why It’s Happening: The HBM Supply Chain Explained

HBM is not just expensive DRAM. It is structurally different, and understanding why matters for understanding the squeeze.

Conventional DDR5 is manufactured on standard DRAM process nodes, mounted flat on a module PCB, and connected via a relatively simple interface. HBM is manufactured as a stack – multiple DRAM dies bonded together vertically using Through-Silicon Vias (TSVs), mounted on a silicon interposer alongside the compute die (GPU, ASIC, or AI accelerator). The interposer itself requires advanced packaging capacity, typically from TSMC or Samsung’s advanced packaging lines. The whole assembly is a system-in-package rather than components on a board.
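
What the stacking and the interposer buy is interface width. A minimal sketch of peak per-device bandwidth, using headline interface specs – HBM3E runs a 1024-bit interface per stack at up to roughly 9.6 Gb/s per pin, while a DDR5-5600 DIMM presents 64 data bits at 5600 MT/s. These are theoretical peaks, not sustained figures:

```python
# Peak theoretical bandwidth per device, from interface width and
# transfer rate. Headline spec numbers; sustained bandwidth is lower.

def peak_gb_per_s(bus_bits: int, gigatransfers: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) x rate (GT/s) / 8."""
    return bus_bits * gigatransfers / 8

hbm3e_stack = peak_gb_per_s(1024, 9.6)  # one HBM3E stack
ddr5_dimm = peak_gb_per_s(64, 5.6)      # one DDR5-5600 DIMM

print(f"HBM3E stack:    {hbm3e_stack:7.1f} GB/s")          # 1228.8 GB/s
print(f"DDR5-5600 DIMM: {ddr5_dimm:7.1f} GB/s")            # 44.8 GB/s
print(f"Ratio:          ~{hbm3e_stack / ddr5_dimm:.0f}x")  # ~27x
```

Roughly a 27x gap per device is why accelerator vendors tolerate TSVs, interposers, and the packaging bottleneck that comes with them.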

This matters for two reasons:

First, HBM DRAM dies cannot be repurposed as DDR5. They share DRAM process technology but use fundamentally different die designs – TSV arrays and wide I/O interfaces – and an HBM die consumes considerably more wafer area per bit than its DDR5 counterpart. A Samsung or SK Hynix fab running HBM wafers is not running DDR5 wafers. So HBM demand does not just compete with DDR5 for final revenue – it competes for wafer starts at the fabrication stage.

Second, HBM requires advanced packaging capacity that is separately constrained. TSMC’s Chip-on-Wafer-on-Substrate (CoWoS) packaging line – the technology used to mount HBM stacks alongside Nvidia GPUs – was reportedly booked out through 2024 and into 2025. TSMC has been expanding CoWoS capacity aggressively, but it takes time to qualify new lines.

The result is a cascade. Nvidia, AMD (for its Instinct MI300 line), and Google (for its TPU pods) are all competing for the same HBM supply from the same three manufacturers. AI chip revenue justifies premiums that conventional computing cannot match. Memory fabs follow the money. DDR5 capacity tightens. Prices rise.

There is a further wrinkle: new HBM generations require new fab process nodes. HBM4, standardised by JEDEC in April 2025, will use even more advanced process nodes and more complex packaging. The ramp-up period for each new generation pulls engineering and capacity resources away from mature production – which includes the conventional DRAM that everyone else buys.


The AI Hardware Paradox

Here is the uncomfortable loop: the chips designed to make AI workloads fast are the direct cause of hardware becoming more expensive for everyone doing conventional computing.

An Nvidia B200 GPU contains 192 GB of HBM3E across eight stacks. Building enough B200s to satisfy current hyperscaler demand requires a staggering amount of HBM fabrication capacity – capacity that, a few years ago, was producing the DDR5 modules that ended up in your workstation or server rack.
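
How staggering? A rough die count makes the point. The per-stack figures follow from the B200 numbers above (192 GB across eight stacks, i.e. 24 GB per stack, built as an 8-high stack of 24 Gb dies); the fleet size is a purely hypothetical input:

```python
# Rough HBM die-count arithmetic for a hypothetical accelerator fleet.
# Grounded inputs: 192 GB across eight stacks per B200, and 8-high
# HBM3E stacks of 24 Gb dies. The fleet size is an invented example.

STACKS_PER_GPU = 8
DIES_PER_STACK = 8  # 8-high HBM3E stack of 24 Gb (3 GB) DRAM dies

def hbm_dies(gpus: int) -> int:
    """Total DRAM dies consumed by a fleet of B200-class GPUs."""
    return gpus * STACKS_PER_GPU * DIES_PER_STACK

fleet = 1_000_000  # hypothetical fleet size
print(f"{fleet:,} GPUs -> {hbm_dies(fleet):,} HBM DRAM dies")
# 1,000,000 GPUs -> 64,000,000 HBM DRAM dies
```

Every one of those dies starts life as wafer area that could otherwise have been conventional DRAM – and each buys fewer usable bits of capacity than a commodity DDR5 die on the same wafer.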

The acceleration loop works like this: AI investment drives GPU orders, GPU orders drive HBM demand, HBM demand pulls wafer capacity from conventional DRAM, conventional DRAM tightens in supply and rises in price, on-prem hardware refresh becomes more expensive, which makes cloud (with its amortised infrastructure costs) comparatively more attractive, which increases cloud usage, which increases hyperscaler GPU demand, which loops back to the start.

This is not a conspiracy – it is just capital allocation behaving rationally. HBM reportedly commands 5-10x the per-bit margin of commodity DDR5. Of course fab capacity follows that signal.

The paradox is most visible in the desktop market. AMD’s inability to release a flagship RDNA 4 or CDNA-based desktop card with serious HBM backing is at least partly a supply story – the HBM a flagship consumer card would need is more lucratively deployed in data centre accelerators. Consumer graphics has been somewhat stuck at the upper end as a result. Nvidia’s RTX 50-series uses GDDR7 rather than HBM (which helps consumer availability), but the mid-to-high tier remains pressured.

For engineering organisations, the paradox manifests as a cost problem in both directions: cloud AI compute is expensive and getting more so, while on-prem refresh to support conventional workloads is also getting more expensive. There is no cheap lane.


Real-World Impact

Desktop and workstation refresh cycles are stretching. If you were planning to refresh a developer workstation pool in 2024 or 2025, you will have noticed that the cost trajectory reversed compared to 2022-2023, when DRAM was cheap. Systems that were reasonably priced 18 months ago now cost meaningfully more to replicate. Most organisations are holding hardware longer.

On-prem server upgrades are painful. DDR5 RDIMM prices for server-grade memory (ECC registered, often with specific speed grades) have been tracking upward alongside the broader DRAM recovery. A dense memory configuration for a modern server – say 512 GB or 1 TB for a database or analytics workload – costs more today than it would have at the market trough in 2023. The pain is not catastrophic, but it is real, and it is compounding with CPU platform transition costs (LGA1700 to LGA1851 for Intel, SP3 to SP5 for AMD EPYC).

Cloud instance pricing is a lagging indicator. Hyperscalers hedge memory costs through large-scale contracts and their own supply chain leverage, but their infrastructure costs are rising. Expect cloud compute pricing – particularly for memory-intensive instance types – to follow with a 12-24 month lag. That timeline is, to be transparent, an estimate. Cloud providers do not announce price increases loudly.

Budget AI compute is accessible; pro compute is constrained. This is perhaps the most interesting split. Apple Silicon – M4 Pro, M4 Max – delivers strong per-watt AI inference performance using LPDDR5 in a unified memory architecture that sidesteps the HBM supply chain entirely. An M4 Max MacBook Pro with 128 GB unified memory is, right now, one of the more cost-effective ways to run mid-sized language model inference locally. Snapdragon X Elite laptops occupy a similar niche for those in the Windows ecosystem. These are not data centre replacements – they cannot do large-scale training – but for individual engineers running inference or fine-tuning smaller models, they are genuinely useful and priced at commodity laptop levels.
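
To make “mid-sized inference” concrete, here is a rough footprint estimate – weights only, with a flat overhead factor standing in for KV cache and runtime. Treat the overhead factor as an assumption; real usage depends on context length, batch size, and runtime:

```python
# Rough memory footprint for local LLM inference: parameter count
# times bytes per weight, plus a flat overhead factor (assumed) for
# KV cache and runtime. Real usage varies with context and batching.

def inference_gb(params_b: float, bits: int, overhead: float = 1.2) -> float:
    """Approximate GB needed for weights at the given quantisation."""
    return params_b * bits / 8 * overhead

for params_b, bits in [(8, 4), (70, 4), (70, 8)]:
    print(f"{params_b}B @ {bits}-bit: ~{inference_gb(params_b, bits):.0f} GB")
# 8B @ 4-bit:  ~5 GB
# 70B @ 4-bit: ~42 GB  <- fits in a 128 GB unified-memory machine
# 70B @ 8-bit: ~84 GB  <- tight once the OS and tooling want their share
```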

Pro-tier AI compute – anything involving HBM-based accelerators – remains scarce and expensive. An H100 80GB still commands prices well above its original MSRP on secondary markets. Blackwell-generation hardware is allocated to hyperscalers before it reaches the broader market. If you need GPU cluster capacity in 2026, you are almost certainly renting it.

AMD’s desktop chip situation is worth a specific note. RDNA 4 (RX 9070 series) launched in early 2025 without the flagship HBM-backed card that would compete directly with Nvidia’s top tier. This is partly a design choice (RDNA 4 was optimised differently) and partly a supply reality. AMD’s Instinct MI300X – its data centre AI accelerator – uses HBM3 and is competing directly with Nvidia for the same fab and packaging capacity. AMD cannot simultaneously flood the consumer market with HBM-backed cards and maintain its AI accelerator ramp. Something gives, and it was the consumer flagship.


When Does It Ease?

The honest answer is: not soon, and it depends on which segment you care about.

New fab capacity is coming, but slowly. Semiconductor fab construction timelines run 3-5 years from groundbreaking to meaningful production. Announced expansions include:

  • Micron’s expansion in Boise, Idaho and Clay, New York – supported by approximately $6.1 billion in CHIPS Act direct funding. The New York fab is focused on advanced DRAM and HBM. Production ramps are expected in the 2025-2027 window, with meaningful HBM output probably not before 2026-2027.
  • Samsung’s advanced DRAM expansion in Pyeongtaek, South Korea – continuing investment in HBM and advanced DRAM, with a multi-year ramp. Samsung’s Taylor, Texas fab is a logic/foundry facility (advanced nodes for fabless customers), not directly a memory expansion, despite the headlines it generated.
  • SK Hynix’s expansion in Cheongju and planned US fab in Indiana – SK Hynix announced a $3.87 billion US expansion in Indiana in April 2024, with HBM and advanced packaging as focus areas. Again, a multi-year build.
  • TSMC’s Arizona fabs – producing logic chips (N4P and N3 nodes) rather than memory, but relevant because TSMC’s advanced packaging (CoWoS) expansion in Arizona and Taiwan directly affects how many HBM-bearing AI accelerators can be assembled. CoWoS capacity has been the gating factor at various points through 2024-2025.

The 2-3 year lag problem is real and structural. Even if AI demand softened tomorrow, the decisions made in 2023-2024 about where to invest fab capacity determine supply through 2026-2027 at minimum. There is no tap to turn on quickly.

What would change the trajectory faster:

  • AI demand softening. If hyperscaler GPU orders declined – due to economic conditions, a shift in AI investment sentiment, or the emergence of more compute-efficient model architectures – HBM demand would ease and wafer capacity would flow back to conventional DRAM. This would help DDR5 and server DRAM pricing relatively quickly (6-12 months). There is no strong current signal of this happening, but AI capital expenditure at this scale is not historically stable.
  • HBM alternatives. CXL-attached memory (Compute Express Link) is a longer-term prospect that could allow non-HBM memory to serve some roles currently filled by HBM. Samsung’s CMM-D (CXL Memory Module) is a real product. But CXL memory is not a drop-in replacement for the tightly coupled HBM in a GPU package – it addresses a different problem (memory capacity expansion for host CPUs). Near-memory compute and processing-in-memory research could eventually reduce HBM dependence, but this is a 5+ year horizon.
  • Chiplet packaging innovations. If HBM can be integrated into chiplet designs that require less CoWoS interposer area, or if new packaging techniques reduce the per-accelerator HBM cost, the economics shift. Intel’s Foveros and TSMC’s SoIC are relevant here. Again, multi-year horizon.
  • HBM4 ramp. HBM4 (standardised April 2025) doubles the per-stack interface width, offering higher bandwidth per stack and potentially more efficient use of fab capacity per bit of bandwidth delivered – as the sketch after this list illustrates. As HBM4 production scales, it could allow the same fab capacity to serve more accelerators, easing the intensity of the supply squeeze. This assumes HBM4 qualification and ramp proceed without the yield issues that slowed Samsung’s HBM3E programme.
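
A rough sense of what that width doubling means for stack counts, using headline pin rates (HBM3E at ~9.6 Gb/s, HBM4 at the 8 Gb/s base rate in the JEDEC spec – shipping parts will differ) and an invented bandwidth budget:

```python
# Stacks needed to hit a target accelerator bandwidth, HBM3E vs HBM4.
# Interface widths follow the JEDEC standards (1024 vs 2048 bits);
# pin rates are headline figures and will vary across shipping parts.

import math

def stacks_needed(target_tb_s: float, bus_bits: int, gb_s_pin: float) -> int:
    per_stack_tb_s = bus_bits * gb_s_pin / 8 / 1000  # TB/s per stack
    return math.ceil(target_tb_s / per_stack_tb_s)

TARGET = 8.0  # TB/s, an illustrative accelerator bandwidth budget
print(f"HBM3E stacks: {stacks_needed(TARGET, 1024, 9.6)}")  # 7
print(f"HBM4 stacks:  {stacks_needed(TARGET, 2048, 8.0)}")  # 4
```

Fewer stacks per accelerator for the same bandwidth budget is the mechanism by which HBM4 could stretch fab capacity – assuming yields cooperate.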

Realistic expectation: meaningful relief in conventional DRAM pricing is probably a 2027 story if AI investment continues at current rates. If AI demand softens, 2026 is possible. HBM supply will remain tight as long as accelerator demand stays where it is.


What Engineers and Engineering Leaders Should Actually Do

On cloud versus on-prem refresh decisions:

The standard advice to “move to cloud when hardware is expensive” deserves scrutiny here. Cloud providers are not immune to the same cost pressures – they are just better at absorbing them over multi-year contract cycles. If you have workloads with predictable, steady-state compute requirements and are currently on-prem, the calculus has not fundamentally changed: on-prem is still cheaper at scale for baseline load, cloud is better for burst and for avoiding capital expenditure on depreciating assets.

What has changed: the cost of being wrong about on-prem hardware choices is higher. If you buy server memory now and DRAM prices correct sharply in 2027, you overpaid. If you buy now and prices keep rising, you locked in at a better point. No one knows which of these happens. In an environment of genuine uncertainty, phasing purchases – rather than doing a large refresh all at once – reduces exposure.
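
A toy illustration of the phasing argument. Both price paths below are invented scenarios, not forecasts; the point is the spread of outcomes, not the specific numbers:

```python
# Why phasing purchases reduces exposure under price uncertainty.
# Price paths are invented scenarios, not forecasts.

BUDGET = 100.0  # cost of the full refresh at today's prices (index)

def total_cost(price_path: list[float], tranches: int) -> float:
    """Buy in equal tranches at the price prevailing each period."""
    per_tranche = BUDGET / tranches
    return sum(per_tranche * p for p in price_path[:tranches])

rising = [1.00, 1.15, 1.30, 1.45]      # prices keep climbing
correcting = [1.00, 0.90, 0.75, 0.65]  # a 2027-style correction

for label, path in [("rising", rising), ("correcting", correcting)]:
    print(f"{label:10s} all-upfront: {total_cost(path, 1):6.1f}"
          f"  phased x4: {total_cost(path, 4):6.1f}")
# rising     all-upfront:  100.0  phased x4:  122.5
# correcting all-upfront:  100.0  phased x4:   82.5
```

Phasing gives up some upside in the rising scenario in exchange for protection in the correcting one – a reasonable trade when you genuinely cannot call the direction.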

On timing hardware purchases:

  • For memory specifically: there is no clear signal that DDR5 prices will correct sharply downward in the near term. If you need the hardware, buy it. If you can defer 12-18 months, the expanded fab capacity coming online might improve the market – but this is speculative.
  • For AI accelerators: if you need H100/H200 class hardware, the secondary market remains the most accessible route. Prices are elevated but stable-ish. Waiting for Blackwell hardware at accessible prices is likely a 2026-2027 story at best for non-hyperscale buyers.
  • For laptops and developer workstations: Apple Silicon remains genuinely compelling if your team can work on macOS. The M4 generation offers strong performance-per-pound and sidesteps the conventional DRAM supply chain. If you need Windows, Snapdragon X Elite machines are a reasonable alternative for inference-heavy developer workflows.

On AI workload planning given the pricing environment:

Be honest about what you actually need. Running a 70B parameter model locally on a workstation cluster is expensive both in hardware and in operational complexity. For most engineering teams, API-based inference from a hyperscaler is cheaper than owned infrastructure until you reach very high query volumes. The break-even point has shifted upward as accelerator costs have increased.
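
A toy break-even sketch makes the shifted economics tangible. Every input here is a placeholder assumption (hardware cost, amortisation period, token price); substitute real quotes before drawing conclusions:

```python
# API inference vs owned hardware: where the monthly costs cross.
# All figures are placeholder assumptions, not market quotes.

HW_COST = 250_000           # owned inference cluster, purchase price (assumed)
AMORT_MONTHS = 36           # straight-line amortisation (assumed)
MONTHLY_OPEX = 4_000        # power, hosting, ops (assumed)
API_USD_PER_M_TOKENS = 3.0  # blended API price per million tokens (assumed)

owned_monthly = HW_COST / AMORT_MONTHS + MONTHLY_OPEX
breakeven_m_tokens = owned_monthly / API_USD_PER_M_TOKENS

print(f"Owned hardware: ${owned_monthly:,.0f}/month")
print(f"Break-even: ~{breakeven_m_tokens:,.0f}M tokens/month")
# Owned hardware: $10,944/month
# Break-even: ~3,648M tokens/month (roughly 3.6B tokens)
```

Note the mechanism described above: raise HW_COST and the break-even volume rises with it, which is exactly what elevated accelerator pricing has done.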

For fine-tuning and training: cloud burst for the heavy lifting, small-scale local experimentation for iteration. An M4 Max or equivalent can do meaningful small-model fine-tuning and prototype work. Reserve cluster time for production runs.

For latency-sensitive inference: this is the strongest case for owned or reserved infrastructure, because the economics of API calls at high frequency do not scale. But even here, reserved cloud instances (not on-demand) often win unless your load is highly predictable and 24/7.

For engineering leaders specifically: treat memory and accelerator hardware costs as a genuine procurement risk, not just a cost-of-doing-business. The volatility of the last 3 years (crash in 2022-2023, recovery and tightening in 2024-2025) is unlikely to resolve into stable, predictable pricing quickly. Build hardware refresh cycles with more margin than you historically might have, and keep an eye on fab capacity announcements as a leading indicator of where prices are heading.


Sources

The following informed this post. Where a specific URL could not be verified at time of writing, the publication and general reference are given.

  1. JEDEC Solid State Technology Association, 2022, High Bandwidth Memory (HBM) DRAM Standard (JESD235C), JEDEC, https://www.jedec.org/standards-documents/docs/jesd235c

  2. JEDEC Solid State Technology Association, 2025, HBM4 Standard Announcement, JEDEC, https://www.jedec.org/

  3. Wikipedia contributors, 2026, 2024-2026 global memory supply shortage, Wikipedia, https://en.wikipedia.org/wiki/2024%E2%80%932026_global_memory_supply_shortage

  4. SK Hynix Investor Relations, 2024, Q4 2024 Earnings Release, SK Hynix, https://www.skhynix.com/investor/ (HBM revenue and market share data)

  5. Micron Technology, 2024, Micron Receives Up to $6.1 Billion in CHIPS Act Funding, Micron Press Release, https://investors.micron.com/

  6. US Department of Commerce, 2024, CHIPS and Science Act: Micron Award Announcement, US Government, https://www.commerce.gov/

  7. Network World, 2026, Samsung warns of memory shortages driving industry-wide price surge in 2026, https://www.networkworld.com/article/4113772/samsung-warns-of-memory-shortages-driving-industry-wide-price-surge-in-2026.html

  8. EE Times, 2026, The Great Memory Stockpile, https://www.eetimes.com/the-great-memory-stockpile/ (Micron Q2 2026 gross margin data)

  9. SemiAnalysis (various authors), 2024, HBM Market Dynamics and Fab Capacity Allocation, SemiAnalysis, https://www.semianalysis.com/ (subscription research; publicly discussed in the semiconductor community)

  10. Samsung Semiconductor, 2024, Samsung Electronics Q3 2024 Business Results, Samsung, https://www.samsung.com/semiconductor/

  11. AMD, 2024, AMD Instinct MI300X Product Brief, AMD, https://www.amd.com/en/products/accelerators/instinct/mi300/mi300x.html

  12. Nvidia, 2024, NVIDIA H200 Tensor Core GPU Datasheet, Nvidia, https://www.nvidia.com/en-us/data-center/h200/

  13. Nvidia, 2024, NVIDIA Blackwell Architecture Technical Overview, Nvidia, https://www.nvidia.com/en-us/data-center/technologies/blackwell-architecture/

  14. Apple, 2024, Apple M4 Max Chip, Apple Newsroom, https://www.apple.com/newsroom/

