Alex Kao

Running Claude Code on Opus is powerful. But most turns in a session are mechanical: reading files, writing boilerplate, simple edits. You're paying top-tier rates for work that Haiku could handle in its sleep.

Anthropic just released the Advisor Tool: a server-side pattern where a cheap executor model consults Opus mid-generation for strategic guidance. But it's API-only. Can't use it inside Claude Code. So I rebuilt the pattern inside Claude Code using what already exists.

The setup: Opus stays in the orchestrator seat. It plans. It makes architecture decisions. It reviews output. It never touches files directly unless the task genuinely needs Opus-level judgment. Everything else gets dispatched to cheaper models:

- Agent({ model: "haiku" }): Claude subagents with full file access for simple edits
- Agent({ model: "sonnet" }): for multi-file changes that need moderate reasoning
- A CLI tool (ask.py) that routes to Gemini, Kimi, MiniMax, or local Gemma via Ollama: for code generation, research, video analysis, anything where you just need text back

The routing logic:

- Does the task need file access? → Agent tool (haiku/sonnet)
- Is the code complex? → sonnet or gemini
- Is the code simple? → haiku, kimi, or minimax (cheaper)
- Need video analysis? → gemini --video (native, no frame extraction)
- Need parallel research? → kimi-swarm (spawns up to 100 sub-agents)
- Want zero API cost? → gemma running locally via Ollama

When a subagent hits something it can't handle, it reports back NEEDS_GUIDANCE. Opus thinks it through and re-dispatches with better context. That's the advisor pattern: strategic guidance exactly when needed, cheap execution everywhere else.

Cost impact: In a 10-task session where 8 tasks go to Haiku and 2 to Sonnet, your Opus tokens are only spent on planning and review. Maybe 20% of the total token volume. The other 80% runs at Haiku rates. With local models mixed in, it drops further.

Open-sourced the whole thing:
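For readers who want to see the shape of the routing rules and the NEEDS_GUIDANCE loop described in the post, here is a minimal Python sketch. It is not the author's open-sourced code. The `Task` fields, the order in which the rules are checked, the choice of a single executor where the post lists several, and the `execute` and `advise` callables are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    # Hypothetical task descriptor; these field names are not from the post.
    needs_file_access: bool = False
    complex_code: bool = False
    needs_video: bool = False
    parallel_research: bool = False
    zero_api_cost: bool = False


def route(task: Task) -> str:
    """Pick a cheap executor for a task (rule order is an assumption)."""
    if task.needs_file_access:
        # File edits go to Claude subagents: sonnet if complex, haiku if simple.
        return "sonnet" if task.complex_code else "haiku"
    if task.needs_video:
        return "gemini --video"
    if task.parallel_research:
        return "kimi-swarm"
    if task.zero_api_cost:
        return "gemma"
    # Text-only code generation: the post lists several options per tier;
    # this sketch picks one of them for each.
    return "gemini" if task.complex_code else "kimi"


def run_with_escalation(
    task: Task,
    execute: Callable[[str, str], str],  # hypothetical: (executor, guidance) -> output
    advise: Callable[[str], str],        # hypothetical: orchestrator turns a report into guidance
    max_rounds: int = 2,
) -> str:
    """Dispatch to a cheap executor and re-dispatch with guidance when asked."""
    guidance = ""
    result = ""
    for _ in range(max_rounds + 1):
        result = execute(route(task), guidance)
        if not result.startswith("NEEDS_GUIDANCE"):
            return result
        # The executor is stuck: ask the orchestrator, then try again.
        guidance = advise(result)
    return result
```

In this sketch the orchestrator model is only invoked through `advise`, which mirrors the post's idea that Opus tokens are spent on planning and review while the executors do the rest.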

New comment 7d ago

Alex Kao

1 like • 12d

have you tested gemma for haiku and sonnet while keeping opus in the driver seat? does it test out well?

Alex Kao

13d •

📚 Resources & Finds

When Claude Code gets dumbed down: hard gate Folder Architecture, don’t use open source models

Has anyone else noticed Claude was actually really unintelligent this week after the mythos preview was announced? Thank god for Clief’s Folder Architecture: all I had to do was hard gate into the global Claude config and Claude became intelligent enough to use again. PSA: don’t use open-source models for coding + reasoning. No matter how good they are on benchmarks, if you plug one into Claude Code it will never be as good as Claude Code; just take a couple of days' break. Open-source models can do task execution well, but other than that, honestly, don’t waste your budget.

New comment 13d ago

Alex Kao

0 likes • 13d

@Mark Gubuan Claude announced a new advisor feature that I think can replace openclaw if you pair it with the scheduler command. Cowork burns too many tokens if you give it multi-tool tasks, but if it’s a one-tool task, like searching through Gmail and then creating an Excel report or any other report, it does that pretty cheaply and reliably in my experience. I might have to figure out a way to control it all from one chat like openclaw, but using the desktop app is an OK patch for now, I think, because you can use Claude Code in the desktop app as well.

Alex Kao

0 likes • 13d

@David C yeah I tried this before as well, but I got burned multiple times because every time a new frontier LLM came out, all the routing and all the experimenting I did became obsolete because the new model could do it in one shot 🥹 and then all the other open-source projects would come out with better models and I had to redo everything again 🥹

Alex Kao

27d •

💬 General discussion

New Features in Claude Code

Is everyone caught up with all the new claude code features? so many new features i don't know what new capabilities i should start learning 🥲

New comment 25d ago

Alex Kao

1 like • 26d

@Neil Napier i honestly don't trust claude enough for it to give me an honest answer, especially in regards to their new features. not in a bad way, because i understand that if i'm using claude, claude wants to keep me in its ecosystem. you can test it out by talking to different llms; it's very subtle but it's there

Alex Kao

1 like • 26d

@Roger Roland true agi? 😦😦😦

Alex Kao

27d •

💬 General discussion

Obsidian Vault = Persistent Memory

Has anyone tried to apply the folder structure learned here to their Obsidian vault? Clief folder architecture really helped with retrieval and how knowledge is linked. This new meta documents my current understanding of various subjects very well: in Claude Code sessions I don't have to explain as much context as before, because Claude Code "knows" what I mean even better thanks to the Obsidian vault. But I realized that there needs to be a naming convention that works with Clief folder architecture to make it truly optimized. Does anyone have suggestions for how to set up the naming convention?

New comment 15d ago

Alex Kao

0 likes • 26d

@Michael Roach-Duquette agreed, but i would argue that what you have brainstormed is only suitable for non-coding information. obsidian as a second brain to claude code lets coders avoid having to remember everything that was learned or done in past coding sessions. when something learned or done gets re-used in a future coding session, you don't have to waste time experimenting and trying things you already did once before. e.g. the gemini embedding 2 api is not optimized for helping with RAG and embedding retrieval: it often drops or takes too long. that could be due to high-volume usage because it's an api, or because their embeddings 2 is only suited for creating embeddings and not retrieval

Alex Kao

1 like • 26d

@Alex Shaddy hey man, thanks a million, i just realized how over-engineered my original approach was

David Bio Sounon

Mar 12 •

💬 General discussion

👋🏽 I would love your perspective

I would love to get some feedback on my situation/dilemmas.

I had accumulated some know-how in defining and encoding the workflow, context, structure, constraints, etc., for different tasks without using tailored personas/instructions for projects within the main AI chat apps. Then I thought that most employees don't have the time or energy to dive deep and dig through layers of AI influencers, prompt packs, etc., to actually optimize their work beyond one-time conversations or simple tasks, and so I could bring some value. Went all in on finding, learning, and designing the best ways to achieve that.

As Opus 4.6 and Codex 5.3 came out, I dove into the "agentic" area and its possibilities, where I feel my experience has whole new potential, as there is only so much that can be done within projects and instructions, and they allow for hardcoding some deterministic parts. That led me to realize that, in the current state, I can leverage my somewhat shallow understanding and insight into different areas, as models can compensate for my lack of technical background.

So I was able to create a website with an integrated reservation system for my wife's lessons. And, at least in my opinion, it escapes the generic feel of most AI websites. So I defined some workflows and principles to achieve similar results faster for other potential clients.

What I am trying to figure out are 3 things:

- Whether to keep exploring more complex solutions or to focus on improving people's workflows in the everyday layer of the main apps through instructions/projects. Since I feel most people are and will continue to use them as their primary layer.
- How to deliver the value in the best way, as I am not much of a salesperson and have an ick about selling courses or teaching for the sake of it.
- I feel the value is there, but it is hard for me to define it clearly and position myself. And it all kind of falls under generic categories of AI workflows, consulting, specialist which are oversaturated

New comment Mar 19

Alex Kao

1 like • Mar 15

Honestly I don’t think AI consulting is oversaturated. We are at a point in the market where more and more people are starting to recognize AI slop and what I call Slop Marketing, as AI is not a novel thing anymore. So as long as you provide value in a clear and straightforward way, like @Jake Van Clief does through his content, you’ll be able to cut through the noise. But I think this goes back to a general rule in business: as long as you provide actual value at a cheaper price, you are going to get market share regardless of what others do.

Alex Kao

0 likes • Mar 19

@David Bio Sounon totally agree. What do you think of NVIDIA's new Nemotron model? Tailor-made seems more plausible now with such a big model imo


Alex Kao

@alex-kao-3961

learning make automations

Active 3d ago

Joined Mar 15, 2026
