Self-Hosting Your AI Stack: A Practical Guide
Updated 6 March 2026: a quieter day, with no major new developments. The stack remains stable: Qwen3.5-35B-A3B on a local GPU, PersonaPlex 7B for voice on Apple Silicon, and Ollama or llama.cpp for inference serving.
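As a rough sketch of the serving layer described above: llama.cpp ships an HTTP server binary, and Ollama manages model pulls and serving itself. The model file path and model name below are placeholders, not official release artifacts; check your own download and registry tags.

```shell
# Option A: llama.cpp's built-in server (exposes an OpenAI-compatible API).
# ./models/model.gguf is a placeholder path to a locally downloaded GGUF file.
llama-server -m ./models/model.gguf \
  --port 8080 \
  -ngl 99   # offload all model layers to the local GPU

# Option B: Ollama, which handles both downloading and serving.
# <model-name> is a placeholder; substitute a tag from the Ollama registry.
ollama serve &
ollama pull <model-name>
ollama run <model-name>
```

Either option gives a local HTTP endpoint, so client code can stay the same regardless of which backend is serving.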