Infrastructure
- Self-Hosting Your AI Stack: A Practical Guide
Updated 3 April 2026: Google releases Gemma 4 under Apache 2.0 -- the 26B MoE activates only 3.8B parameters at inference, the 31B Dense hits #3 on Arena, and E2B/E4B run on Raspberry Pi at 6GB RAM; Gemma is now a credible primary alternative to Qwen for self-hosted inference.
- The Agentic Turn: Personal AI Agents Are Becoming Infrastructure
Anthropic's leaked Claude Code source reveals an 'Undercover Mode' for stealth AI contributions to public open-source repos without attribution, documenting the accountability gap inside a major lab's own tooling; NVIDIA OpenShell gains broad enterprise security partnerships, confirming agent governance is now a baseline category expectation.
- CISA Says CVE-2026-22719 Is Being Exploited. Broadcom Says It Can't Confirm That.
CISA added a high-severity unauthenticated command injection flaw in VMware Aria Operations to its Known Exploited Vulnerabilities catalog on March 3. The federal patching deadline has passed. Broadcom acknowledges reports of exploitation but says it cannot independently confirm them.
- The Agentic Turn: Personal AI Agents Are Becoming Infrastructure
Anthropic has shipped Dispatch inside claude.com: scheduled tasks, proactive updates, and persistent memory as a native consumer feature. The reactive-to-proactive shift that defines a Claw is now available without self-hosting.
- Self-Hosting Your AI Stack: A Practical Guide
Updated 26 March 2026: Intel Arc Pro B70 launches with 32GB VRAM at $949, the first single card to hit that tier under $1,000; a first-person account of the LiteLLM malware incident adds depth to the supply-chain risk section.
- One Compromised Key: How the Resolv Hack Printed $23M
An attacker compromised an AWS KMS private key to bypass oracle controls and mint ~$80M in unbacked stablecoin, crashing the Resolv protocol and cascading into 15 Morpho vaults. The engineering lesson is about key management and oracle architecture, not crypto.
- Infrastructure in the Line of Fire: What the AWS Drone Strikes Actually Mean for SREs
Drone activity has disrupted AWS Bahrain twice in March 2026. Two strikes in one month is a pattern, not a one-off. Here's what the confirmed recurrence means for SREs thinking about region risk, DR planning, and cloud vendor exposure in active conflict zones.
- When the Cloud Goes Down, 150,000 Drivers Can't Start Their Cars
A cyberattack on Intoxalock, a maker of court-mandated ignition interlock breathalyzers, knocked out its cloud services from March 14 to March 22, leaving drivers across 46 US states unable to start their vehicles. The incident is a case study in what happens when legally mandated infrastructure has no offline fallback.
- NixOS Is the Right Infrastructure for AI Agents
AI agent environments are uniquely brittle in ways that traditional software is not. NixOS, with its declarative model, atomic rollbacks, and immutable base layer, addresses the specific failure modes that make agent infrastructure hard to operate at scale.
- Locked In: What $1 Trillion in AI Compute Capital Means for Your Infrastructure Decisions
At GTC 2026, Jensen Huang said he now sees at least $1 trillion in purchase orders for Blackwell and Vera Rubin through 2027. That capital is already committed and the hardware is already being manufactured -- and it has structural implications for every engineering team making build-vs-buy decisions over the next three years.
- Your AI Agent's Sandbox Has a Hole in It: DNS Exfiltration and the Bedrock AgentCore Flaw
AWS Bedrock AgentCore's Sandbox mode was documented as providing complete network isolation -- it doesn't. Researchers demonstrated a full bidirectional command-and-control channel over DNS, entirely bypassing egress controls. Here's what that means for every cloud-hosted AI agent.
- The Blast Radius Problem: Why AI Agent Security Is a Different Category
A capable AI agent must have access to do useful things. That access is also the attack surface. Using OpenClaw's documented security incidents as a case study, this piece examines why agent security is structurally different from traditional software security and what engineers should actually do about it.
- Nvidia's Open-Source Play: Nemotron 3 and the Agentic Token Tax
Running agentic AI workflows through closed APIs is getting expensive fast. Nvidia's Nemotron 3 Super is the most credible open-weight answer yet -- but the hardware strategy underneath it is worth understanding before you reach for the Ollama docs.
- Local AI Inference Has Crossed a Threshold
Three things converged in 2026: hardware that can actually run useful models, open-weight models that match cloud quality for most engineering tasks, and economics that make the API-forever assumption look increasingly expensive. The architectural question has shifted from 'can you run AI locally?' to 'why are you paying per-token when you don't have to?'
- 31.4 Tbps: The World's Largest DDoS Botnet, Taken Down
The DoJ disrupted four IoT botnets behind a world-record 31.4 Tbps DDoS attack. Three million infected devices, mostly off-brand Android TVs and set-top boxes. Kimwolf, AISURU, JackSkid, and Mossad are Mirai variants operated as a professional cybercrime-as-a-service business. C2 is down. The devices are still infected.
- arXiv After Cornell: When Research Infrastructure Goes Independent
arXiv is leaving Cornell after 35 years and establishing itself as an independent nonprofit. For the AI industry, which depends on arXiv for paper distribution, training data, and research circulation, this is a story about critical infrastructure going through a governance transition.
- NemoClaw: Nvidia's Enterprise Agent Security Stack
NemoClaw is Nvidia's enterprise agent security stack for OpenClaw -- a single-command install that adds OpenShell sandboxing, policy-based guardrails, and a privacy router to autonomous agents. Launched at GTC 2026 on March 16. This signal tracks how the enterprise AI agent security infrastructure layer develops.
- NVIDIA Vera Rubin: What 10x Cheaper Inference Actually Means
NVIDIA announced Vera Rubin at GTC 2026: 3.3-5x inference improvement over Blackwell, 10x inference token cost reduction, custom Vera ARM CPU, HBM4 at 22 TB/s. Ships H2 2026. The performance numbers matter for procurement. The cost numbers matter for every engineer deciding what to build.
- The Wiper Era: Why Your Ransomware IR Plan Has a Gap
Enterprise incident response has been ransomware-centric for a decade. Nation-state proxies using destructive wipers operate on completely different incentives -- and your playbook assumes an attacker who wants something.
- The Tool That Protects Your Enterprise Just Destroyed Stryker's
Handala, an Iran-linked hacktivist group, wiped 200,000+ Stryker endpoints by abusing Microsoft Intune's remote wipe capability after compromising Entra admin credentials. The attack is a case study in how your highest-trust security tooling becomes your largest attack surface.
- AI Agents Are Destroying Production Databases. This Is a Pattern.
Multiple documented incidents show AI coding agents -- primarily Claude Code -- executing irreversible destructive commands against production databases. This is not a one-off; it is a repeatable failure mode with a clear root cause.
- Europe Is Building Its Own Cloud. Here's What That Actually Means.
At MWC 2026, the European Commission unveiled EURO-3C -- a €75 million federated Telco-Edge-Cloud project backed by Europe's biggest telcos. Here's what it means in practice for engineers building global products.
- $130 Billion in Illegal Tariffs: What the Refund Ruling Means for Hardware Teams
A US trade court ordered refunds on $130B in tariffs ruled illegal by the Supreme Court, affecting ~300,000 importers including hardware buyers. Here's what it means for engineering budgets, CapEx planning, and procurement strategy.