Commissioned, Curated and Published by Russ. Researched and written with AI.


What’s New This Week

The Trivy supply chain attack (March 19-21, 2026) is a reminder that the security tooling in your CI pipeline is itself an attack surface – but that’s covered in the Security signal. On the hardware side: Nvidia’s RTX 50-series (Blackwell consumer) is rolling out, with the RTX 5090 confirmed at 32GB GDDR7. Pricing has not stabilised enough to recommend it, and 4090-class cards haven’t dropped in price in response yet. Hold off on Blackwell unless you need the bleeding edge. The RX 7900 XTX remains the VRAM-per-pound leader at the high end.


Changelog

Date        | Summary
23 Mar 2026 | Initial publication.

The Principle: VRAM Is the Constraint

For local LLM inference, one number matters more than any other: VRAM. The rule of thumb at standard quantisation (Q4/Q5) is roughly 0.6-0.8GB of VRAM per billion parameters plus runtime overhead – the often-quoted 2GB per billion applies to unquantised FP16 weights. A 13B model needs around 8-10GB. A 70B model needs around 40GB. These are floors, not ceilings – context window and batch size push the number up.
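To make that arithmetic concrete, here is a minimal sketch in Python. The bytes-per-parameter figures and the flat overhead constant are assumptions of mine (Q4 ≈ 0.6 bytes per parameter, ~1.5GB for KV cache and runtime buffers), not measured values, so treat the output as a floor estimate.

```python
# Rough VRAM floor estimate for local LLM inference.
# Assumed constants, not measured: bytes per parameter at each quantisation,
# plus a flat ~1.5GB for KV cache and runtime buffers.
BYTES_PER_PARAM = {"q4": 0.6, "q5": 0.69, "q8": 1.0, "fp16": 2.0}

def estimate_vram_gb(params_billion: float, quant: str = "q4",
                     overhead_gb: float = 1.5) -> float:
    """Weights plus fixed overhead. Long contexts and big batches
    push the real figure higher."""
    return params_billion * BYTES_PER_PARAM[quant] + overhead_gb

for size in (7, 13, 34, 70):
    print(f"{size}B @ Q4: ~{estimate_vram_gb(size):.1f} GB")
# 7B ~5.7GB, 13B ~9.3GB, 34B ~21.9GB, 70B ~43.5GB – consistent
# with the tiers below.
```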

The practical thresholds in 2026:

  • 8GB VRAM: 3-7B models comfortably. Entry point, fine for experimentation, not recommended if you’re running agents.
  • 12GB VRAM: 7-8B well, 13B in a squeeze. Workable but you’ll feel the limit quickly.
  • 16GB VRAM: 13-14B models smoothly, 30B with aggressive quantisation. The practical agent floor. Most people running local coding assistants or agent loops sit here.
  • 24GB VRAM: 30-34B comfortably, quantised 70B possible. Where serious inference work lives.
  • 48GB+: 70B cleanly, 120B with quantisation. Research territory or production pipelines.

The implication: any GPU you buy in 2026 should have at least 12GB VRAM. Ideally 16GB. The 8GB cards are already a compromise and models are not getting smaller.
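If you are unsure what your current card actually offers, the check is one call. A minimal sketch assuming a PyTorch install – the ROCm build of PyTorch exposes the same torch.cuda API on AMD cards:

```python
# Report the name and total VRAM of the first visible GPU.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No GPU visible to PyTorch – CPU inference territory.")
```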


£500 “The Enabler” – CPU Inference Focus

For a first local AI machine or a tight budget, you don’t need a discrete GPU. The AMD Ryzen 5 8600G has an integrated Radeon 760M that can allocate up to 8GB of shared VRAM from system RAM, running 7B models at reasonable speed.

Component   | Choice            | Approx. Cost
CPU         | AMD Ryzen 5 8600G | £190
Motherboard | B650 budget       | £80
RAM         | 64GB DDR5         | £90
Storage     | 1TB NVMe SSD      | £55
Case + PSU  | 550W 80+ Gold     | £70
Total       |                   | ~£500

The 64GB RAM matters here: it feeds the iGPU with enough headroom and also enables CPU inference on larger models via llama.cpp. With CPU inference, a 13B Q4 model is usable – just slow (a few tokens per second). The iGPU path gets you 7B at something approaching interactive speed.
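As a concrete example of the CPU path, a minimal sketch using the llama-cpp-python bindings (pip install llama-cpp-python). The model path is a placeholder and the thread count should match your physical cores; for the iGPU route you would build llama.cpp with a GPU backend (e.g. Vulkan) and raise n_gpu_layers above zero.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/13b-chat.Q4_K_M.gguf",  # placeholder: any 13B Q4 GGUF
    n_gpu_layers=0,   # 0 = pure CPU inference; >0 offloads layers on a GPU/iGPU build
    n_ctx=4096,       # context window; larger contexts need more RAM
    n_threads=6,      # match the 8600G's physical core count
)

out = llm("Q: Why does VRAM matter for local inference? A:", max_tokens=64)
print(out["choices"][0]["text"])
```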

What it runs: 7B-13B models via CPU inference, 7B reasonably fast via iGPU. What it doesn’t: Fast inference on anything over 13B. Who it’s for: First local AI machine, experimenting, tight budget.

£800 “The Sweet Spot” – 16GB VRAM on CUDA

This is the sweet spot for 2026. The RTX 4060 Ti 16GB is the specific recommendation: 16GB VRAM at the lowest price point of any 16GB card, full CUDA support, and 13-14B models at 20-30 tokens/sec. That’s fast enough for real work.

Component   | Choice            | Approx. Cost
CPU         | AMD Ryzen 7 7700X | £185
GPU         | RTX 4060 Ti 16GB  | £300
Motherboard | B650 mid-range    | £100
RAM         | 32GB DDR5         | £60
Storage     | 2TB NVMe Gen4     | £80
Case + PSU  | 750W 80+ Gold     | £95
Total       |                   | ~£800

The Intel i7-14700F (~£200) is a reasonable CPU alternative if you find a better deal. On the GPU, make sure you buy specifically the 16GB variant of the RTX 4060 Ti – the 8GB version sells at a similar price point and is not recommended (see What to Avoid below).

What it runs: 13-14B models smoothly, 30B quantised, full agent loops, ComfyUI for image work. What it doesn’t: 70B at useful speed. Who it’s for: Engineers running daily agent work, coding assistants, local LLM power users.
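To verify the 20-30 tokens/sec figure on your own hardware, a rough timing loop is enough. A sketch assuming a 13-14B Q4 GGUF on disk (the filename is a placeholder) and llama-cpp-python built with CUDA support:

```python
import time
from llama_cpp import Llama

llm = Llama(
    model_path="./models/14b.Q4_K_M.gguf",  # placeholder filename
    n_gpu_layers=-1,   # -1 = offload every layer to the GPU
    n_ctx=4096,
    verbose=False,
)

start = time.perf_counter()
out = llm("Write three sentences about GPU memory.", max_tokens=200)
elapsed = time.perf_counter() - start

generated = out["usage"]["completion_tokens"]
print(f"{generated} tokens in {elapsed:.1f}s -> {generated / elapsed:.1f} tok/s")
# A 4060 Ti 16GB with a 13-14B Q4 model should land near the
# 20-30 tok/s quoted above.
```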

£1500 “The Serious Setup” – Maximum VRAM Per Pound

At this tier, the GPU choice is the interesting one. The RX 7900 XTX at £600-650 gives you 24GB GDDR6 at roughly half the price of an RTX 4090. ROCm support with llama.cpp is now production-ready – something you couldn’t say with confidence a year ago.

Component   | Choice                   | Approx. Cost
CPU         | AMD Ryzen 9 7950X        | £380
GPU         | RX 7900 XTX 24GB         | £625
Motherboard | X670E                    | £180
RAM         | 64GB DDR5                | £110
Storage     | 2TB NVMe Gen4 + 2TB data | £140
Case + PSU  | 1000W 80+ Gold           | £140
Total       |                          | ~£1,500

The Ryzen 9 7950X (16 cores) earns its place here: parallel inference requests, heavy compilation, and fine-tuning runs all benefit from real core counts. The 16-core chip handles cases where the GPU is waiting on CPU-side processing.
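To see where those cores earn their keep, serve several requests at once. A sketch assuming llama.cpp’s bundled server is running locally with parallel slots enabled (e.g. llama-server -m model.gguf --parallel 4); the URL, port, and prompts are placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://localhost:8080/v1/chat/completions"  # llama-server's OpenAI-compatible endpoint
PROMPTS = [
    "Summarise RAID levels.",
    "Explain PCIe lanes.",
    "What is GDDR6?",
    "Define quantisation.",
]

def ask(prompt: str) -> str:
    r = requests.post(URL, json={
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }, timeout=120)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

# Four requests in flight at once: tokenisation, HTTP handling, and
# CPU-side pre/post-processing all scale with real core count.
with ThreadPoolExecutor(max_workers=4) as pool:
    for answer in pool.map(ask, PROMPTS):
        print(answer[:80], "...")
```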

For the GPU, if you prefer the CUDA ecosystem: the RTX 4080 Super at ~£750 is the alternative. You get 16GB instead of 24GB, but better Tensor cores, deeper framework support, and no ROCm dependency. The right choice depends on your tooling requirements.

What it runs: 34B models cleanly, 70B quantised at ~15-20 tokens/sec, small model fine-tuning. Who it’s for: Running multiple models, production agent pipelines, serious local inference.


GPU Quick Reference

Budget     | Pick                   | VRAM | Notes
Under £250 | Intel Arc B580         | 12GB | Surprise pick. llama.cpp’s Vulkan/SYCL backends work. Best VRAM:price at this tier.
£300-400   | RTX 4060 Ti 16GB       | 16GB | Recommended. RTX 4070 12GB if CUDA perf matters more than VRAM.
£450-600   | RX 7900 GRE 16GB       | 16GB | Solid AMD option. ROCm production-ready.
£550-650   | RTX 4070 Ti Super 16GB | 16GB | Good CUDA card, competitive at this range.
£650-750   | RX 7900 XTX 24GB       | 24GB | Best VRAM:price at the high end for AI workloads specifically.
£1,200+    | RTX 4090 24GB          | 24GB | Fastest consumer card. RX 7900 XTX closes the gap on LLM inference specifically.

AMD vs Nvidia in 2026: The CUDA ecosystem remains deeper for AI tooling – fine-tuning, Triton kernels, some research code, and many commercial tools are CUDA-first. For pure inference with llama.cpp or Ollama, AMD ROCm is now production-ready and the gap is small. Choose AMD for VRAM budget; choose Nvidia for ecosystem breadth and fine-tuning use cases.
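One practical consequence: with Ollama, the application code is identical whichever vendor sits underneath, because the same local REST API fronts both the ROCm and CUDA builds. A minimal sketch assuming Ollama is running and a model has been pulled (the model name is just an example):

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's default local endpoint
    json={"model": "llama3.1:8b", "prompt": "Why does VRAM matter?", "stream": False},
    timeout=300,
)
body = resp.json()
print(body["response"])

# eval_count / eval_duration (nanoseconds) give a quick tokens/sec
# figure on either vendor.
print(f'{body["eval_count"] / (body["eval_duration"] / 1e9):.1f} tok/s')
```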


What to Avoid

Any GPU with less than 12GB VRAM for new purchases in 2026. Models are not getting smaller. An 8GB card bought today will feel the ceiling within months.

The RTX 4060 Ti 8GB specifically. The 16GB version exists at a similar price. There is no reason to buy the 8GB variant.

Pre-built “AI PCs” from OEMs. They typically ship mediocre GPUs at premium prices with inadequate cooling. The “AI PC” label is marketing. Build it yourself.

Mining GPUs on the secondary market. VRAM runs hot under sustained load, and mining rigs run sustained load continuously. Degraded VRAM, no warranty, and sellers have every incentive not to disclose the card’s history.


Gaming & General Purpose Self-Build

The same three price points, optimised for games and everyday use rather than AI inference. Gaming builds prioritise GPU clock speed, single-core CPU performance, and fast memory – different priorities from the AI tiers above.

£500 “The 1080p Machine”

Budget gaming that doesn’t compromise where it counts. The GPU gets the lion’s share of the budget.

Component   | Choice           | Approx. Cost
CPU         | AMD Ryzen 5 5600 | £90
GPU         | RX 7600 XT 16GB  | £210
Motherboard | B550 mid-range   | £65
RAM         | 32GB DDR4-3600   | £40
Storage     | 1TB NVMe SSD     | £50
Case + PSU  | 550W 80+ Gold    | £65
Total       |                  | ~£520

The RX 7600 XT’s 16GB is unusual at this price – most cards at this budget ship with 8GB, which is getting tight for modern titles. At 1080p max settings this build handles anything currently released. The Ryzen 5 5600 remains excellent value for gaming: fast single-core clocks and no need for DDR5 keep costs down.

Upgrade path: drop in an RX 7700 XT or RTX 4060 Ti later without touching anything else.

£800 “The 1440p Sweet Spot”

The real gaming sweet spot. The RTX 4070 handles 1440p at max settings with DLSS – and at this price nothing else touches it for the combination of performance and efficiency.

Component   | Choice           | Approx. Cost
CPU         | AMD Ryzen 5 7600 | £150
GPU         | RTX 4070 12GB    | £350
Motherboard | B650             | £90
RAM         | 32GB DDR5-6000   | £65
Storage     | 2TB NVMe Gen4    | £80
Case + PSU  | 750W 80+ Gold    | £90
Total       |                  | ~£825

DDR5-6000 matters here – AMD Ryzen 7000 series benefits measurably from fast memory in CPU-limited scenarios. The RTX 4070 is the sweet spot for DLSS 3 (frame generation) which effectively doubles perceived frame rate in supported titles. At 1440p this rig doesn’t need to compromise.

Alternative GPU: RX 7800 XT 16GB (~£280) saves £70 and gives you 4GB more VRAM – trade DLSS for VRAM headroom and FSR 3.

£1500 “The 4K Powerhouse”

The Ryzen 7 7800X3D is the standout choice at this tier – its 3D V-Cache delivers a bigger lead in CPU-limited games than any competing chip. Pair it with the RTX 4080 Super for 4K high/ultra framerates with headroom to spare.

Component   | Choice              | Approx. Cost
CPU         | AMD Ryzen 7 7800X3D | £290
GPU         | RTX 4080 Super 16GB | £750
Motherboard | X670                | £160
RAM         | 32GB DDR5-6000      | £70
Storage     | 2TB NVMe Gen4       | £90
Case + PSU  | 850W 80+ Gold       | £120
Total       |                     | ~£1,480

The 7800X3D’s 3D V-Cache is the reason for this specific CPU choice – in CPU-bound gaming scenarios it can outperform chips costing twice as much. The RTX 4080 Super handles 4K with DLSS Quality mode (effectively native quality) at high framerates across modern titles.

Note: if you also want to run local AI inference on this build, swap the RTX 4080 Super for an RX 7900 XTX (£640, saves ~£110) and accept the DLSS trade-off – you gain 8GB VRAM (24GB total) at lower cost. See the AI inference builds above for context on when that matters.


The Hardware Landscape: What to Track

This section will update as the market moves.

Nvidia Blackwell (RTX 50-series): Consumer rollout is underway. The RTX 5090 at 32GB GDDR7 is confirmed, but pricing has not stabilised. Not yet recommended over the established 40-series unless you specifically need the latest architecture for research work. An RTX 5080 at 16GB is expected – watch how it positions against the 4060 Ti 16GB.

AMD RDNA 4: RX 9000-series details emerging. RDNA 4 is expected to improve ROCm performance further. Watch for any 20GB+ VRAM options in the mid-range.

Memory pricing trends: DDR5 pricing continues to fall. Budget more aggressively on RAM capacity – 64GB is achievable cheaply now and matters for CPU-side inference and iGPU configurations.

Notable prebuilts: The Tinybox and DGX Station GB300 are interesting at the far end of the budget range but outside self-build territory. Apple Silicon (M3/M4 Ultra) remains a competitive option for unified memory configurations – 192GB of unified memory for 70B inference is compelling if you’re already in the Apple ecosystem, though at significantly higher cost per GB of effective VRAM.