Version History: State of AI
Changelog
| Date | Summary |
|---|---|
| 27 Mar 2026 | 27 Mar 2026 – First ARC-AGI-3 scores published: Symbolica’s Agentica scores 36.08% for $1,005; frontier CoT baselines (Opus 4.6 Max, GPT-5.4 High) score 0.2-0.3% at up to $8,900, a concrete illustration of the agent benchmark gap the post identifies. |
| 26 Mar 2026 | 26 Mar 2026 – ARC Prize launches ARC-AGI-3, an interactive benchmark measuring agent learning efficiency and long-horizon adaptation rather than static answers; a concrete response to the agent evaluation gap the post identifies. |
| 25 Mar 2026 | 25 Mar 2026 – Arm launches first-ever Arm-designed data center CPU for agentic AI workloads; LiteLLM PyPI supply chain attack introduces credential theft via compromised agent infrastructure dependency. |
| 24 Mar 2026 | 24 Mar 2026 – GPT-5.4 Pro solves a frontier math open problem; iPhone 17 Pro demonstrated running a 400B LLM locally, extending the local inference ceiling. |
| 23 Mar 2026 | 23 Mar 2026 – Quiet day, thesis holds. |
| 22 Mar 2026 | Tinybox ships commercially: purpose-built offline AI device runs 120B models on 384GB GPU RAM, adding a hardware product layer to the local inference maturation story. |
| 21 Mar 2026 | White House releases national AI framework to pre-empt state rules; OpenCode reaches 5M monthly developers as open-source CLI coding agent goes mainstream. |
| 20 Mar 2026 | OpenAI acquires Astral (Ruff/uv) for Codex team; autoresearch scaled to 16 GPUs runs 910 experiments in 8 hours with emergent heterogeneous hardware strategy. |
| 15 Mar 2026 | Meta planning 20%+ workforce layoffs explicitly to offset AI infrastructure costs, the largest single AI-driven displacement event yet reported at a major tech company. |
| 14 Mar 2026 | Anthropic makes 1M context GA for Opus 4.6 and Sonnet 4.6; Morgan Stanley projects 9-18 GW US power shortfall through 2028 and flags accelerating AI-driven workforce reductions. |
| 13 Mar 2026 | Washington and Utah pass significant AI disclosure and safety legislation; Nvidia GTC (March 16-19) previewed with NemoClaw agent platform and Groq inference chip expected. |
| 12 Mar 2026 | METR study finds half of SWE-bench-passing PRs would be rejected by real maintainers; Amazon insiders report AI tools slowing work even as company mandates adoption and cuts 30,000 staff. |
| 11 Mar 2026 | Yann LeCun raises $1B for AMI to pursue physical world models as a direct alternative to LLMs; overnight agent trust problem identified as structural challenge as coding agents scale. |
| 10 Mar 2026 | chardet copyleft controversy surfaces a new legal fault line for AI reimplementation; Claude Code cost rebuttal confirms ~10x gap between retail API pricing and actual inference costs. |
| 9 Mar 2026 | Agent Safehouse ships: macOS-native deny-first kernel sandboxing for local agents, 518 HN points, a direct community response to the governance gap the post identifies. |
| 8 Mar 2026 | Karpathy ships autoresearch: autonomous agent that runs ML experiments overnight unsupervised, directly illustrating his own December inflection observation. |
| 7 Mar 2026 | Tech employment worse than 2008 or 2020 recessions; China’s five-year plan makes AI dominance official national policy. |
| 6 Mar 2026 | GPT-5.4 launches with native computer-use; Anthropic sues DoD over supply chain risk designation; Anthropic labor research finds real but limited displacement. |
| 5 Mar 2026 | Alibaba Qwen leadership exodus – Justin Lin resigns, two others. |
| 2 Mar 2026 | 1.0 Inaugural edition |
| 3 Mar 2026 | OpenAI $110B raise. |
| 4 Mar 2026 | Knuth’s “Claude’s Cycles” paper on HN. |
Snapshots
No dated snapshots yet.