I Manage My Entire IT Infrastructure with AI and I will NEVER go back!
Three months ago I started an experiment. As a network engineer with 30 years in the field, I wanted to see how much infrastructure AI could actually manage, so I built a Claude Code agent to run my entire stack: gateway to workstations and everything in between.

It's been working amazingly well! I don't think I can ever go back to managing it myself. It saves me HOURS of manual configuration whenever I deploy anything.

It has stood up containers and applications, configured my VLANs, and configured my NetBird mesh. It manages the dual Pi-hole pair, the SIEM, the secrets vault, the workflow runtime, and the backup server. Anything I need to do in my stack, I just talk to it. Natural language conversation with Claude in the terminal. Done.

Plus a proactive layer on top: daily Slack alerts the agent acts on autonomously. New CVE drops on a package I'm running? Triage and a patch sequence land in Slack before I'm awake. Pi-hole drift across the resolver pair? Auto-corrected, journaled, summarized. I read the digest, not the dashboards.

The cluster it manages:
- 4 Proxmox nodes
- 18 LXCs + 1 VM
- 3 months continuous, weekly kernel updates on cron
- Only manual step: I loaded Proxmox onto the host boxes. The agent did everything else.

Today I documented it all. Public runbook on my website. Open-source GitHub repo for anyone who wants to run the same experiment.

And today I'm also breaking that single agent into 11 specialists: mesh, vault, substrate, edge, LAN, telemetry, workflow, docs, plus three cyber peers (security, threat detection, vuln management). Each one is 80–150 lines instead of one big 432-line spec. Each one runs on the smallest model that handles its work: Haiku on the read paths, Sonnet on change-prone domains, Opus reserved for incidents only. The split should drop my API cost to roughly 1/13 of what a typical agent call runs today. Faster responses, less wasted context, cleaner reasoning, same coverage.

→ Repo: https://github.com/Mfrostbutter/Infra-AI-IT-Team-Runbook
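
For anyone curious what the "smallest model that handles the work" rule might look like in code, here is a rough sketch. It is not taken from the repo. The `Task` type, its fields, and `pick_tier` are hypothetical names, the tier strings are labels rather than real API model identifiers, and a per-task `changes_state` flag stands in for "change-prone domain" as a simplifying assumption.

```python
from dataclasses import dataclass

# Tier labels only (assumption): these are NOT real API model identifiers.
HAIKU, SONNET, OPUS = "haiku", "sonnet", "opus"


@dataclass(frozen=True)
class Task:
    """Hypothetical description of one request sent to a specialist agent."""
    specialist: str          # e.g. "mesh", "vault", "docs"
    changes_state: bool      # True if the task writes config or deploys something
    incident: bool = False   # True only while handling an active incident


def pick_tier(task: Task) -> str:
    """Return the smallest tier the rule allows for this task."""
    if task.incident:
        return OPUS      # incidents only
    if task.changes_state:
        return SONNET    # change-prone work
    return HAIKU         # read paths


if __name__ == "__main__":
    for t in (
        Task("telemetry", changes_state=False),
        Task("lan", changes_state=True),
        Task("security", changes_state=True, incident=True),
    ):
        print(t.specialist, "->", pick_tier(t))
```

Checking the incident flag first means an incident always escalates to the top tier, even when the task itself is read-only.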