Self-Hosted
- The AI Model Landscape: A Practical Guide for Engineering Teams
The model landscape has shifted again: Qwen 3 replaces Qwen 2.5 as the self-hosting recommendation, Llama 4 Scout and Maverick are now options for local inference, and the Mac Studio cluster story has changed the team-scale economics calculation.
- NVIDIA Nemotron 3: What the Architecture Tells Us About Agentic AI Infrastructure
NVIDIA's Nemotron 3 family (31.6B total parameters, 3.6B active, hybrid Mamba-Transformer MoE) is engineered specifically for multi-agent systems. Here's what the architectural choices tell engineers about where agentic AI infrastructure is heading.