Joe G.

Clief Notes


Memberships

Clief Notes

40k members • Free

26 contributions to Clief Notes

Krystian Swierk

May 20 •

📚 Resources & Finds

Dual View Architecture - Full Orchestration Engine.

Here is a failure mode that ships more often than it should. The model that writes an output is also the model that checks it. You send the result back with "review this and flag anything weak," and the review skews toward approval, because a model reviewing its own work shares its own blind spots. This is a documented limitation, not a hypothesis.

It is not universal. Plenty of teams already run multi-step pipelines, separate critic models, and output validators. But self-review as the only quality gate is still common, and for enterprise-grade output, where a confident wrong answer is a liability, it is worth solving properly. This post is how we approached it, where the design is genuinely sound, and where it is not. I want this community to review both.

Why self-review is weak

Two things work against a single model checking itself.

The first is autoregressive momentum. A model picks each word partly from the words it already wrote, so the opening of an output conditions everything after it. A model that has generated thousands of reports has a strong format prior: summary, background, analysis, recommendation. Your spec might say to lead with the competitive threat and drop the background. That instruction competes with the prior, and a few sentences in, the prior often wins. The output looks like a report. It is not your report.

The second is that evaluation has a prior too. In training data, reviews of polished work skew positive, so a model asked to "evaluate this" leans toward approval. A reviewer that is the same model, or the same model family, shares the writer's biases.

Here is the honest version of the claim, and it is less dramatic than how this is usually sold. Prompting is not powerless, it is unreliable as your only quality control. Self-critique does catch real errors. It just catches fewer than an independent reviewer would, and you cannot tell from the output which case you got. So you do not throw prompting away. You add structure around it.
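One way to picture "structure around prompting" is a gate that refuses to let a draft be reviewed by the model family that wrote it. The sketch below is only an illustration of that idea, not the orchestration engine the post describes (the excerpt does not spell its design out). The `Model` class, the `family` label, and `draft_and_review` are all hypothetical names, and `call` stands in for whatever client function you use to reach a model.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns model text.
ModelFn = Callable[[str], str]


@dataclass
class Model:
    name: str     # illustrative label, e.g. "writer"
    family: str   # model family, used only to check independence
    call: ModelFn


def draft_and_review(spec: str, writer: Model, reviewer: Model) -> dict:
    """Draft with one model, then review with a model from another family."""
    # Assumption: sharing a family is a reasonable proxy for sharing biases.
    if writer.family == reviewer.family:
        raise ValueError("reviewer shares the writer's model family")

    draft = writer.call(f"Write the output for this spec:\n{spec}")
    # The reviewer sees the spec and the draft, not the writer's conversation.
    review = reviewer.call(
        "Check this draft against the spec and list every deviation.\n"
        f"SPEC:\n{spec}\n\nDRAFT:\n{draft}"
    )
    return {"draft": draft, "review": review, "reviewer": reviewer.name}
```

The family check is deliberately blunt. It does not make the reviewer correct; it only removes the case where the writer and the reviewer are guaranteed to share the same priors.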

New comment May 22

Joe G.

0 likes • May 22

@Krystian Swierk I found this out by accident. I spend my weekly Codex tokens from Sunday to Tuesday, and I knew that using both models to build was risky, so I used Claude to do the gap analysis, roadmap, and deferred work register, and to be the overall project critic. I had it write to new files (e.g. ROADMAP_REVIEW.md to analyze ROADMAP.md) and not touch anything Codex had produced. Then, when I got a new Codex ration, I had it process the review files, so I could hit the ground running. Codex is very good at processing them and justifying any rejections. Totally unanticipated result.
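The review-file convention in this comment (the critic writes ROADMAP_REVIEW.md and never edits ROADMAP.md) can be made mechanical with a small helper. This is a minimal sketch under assumptions: `review_path` and `write_review` are hypothetical names, and the `_REVIEW` suffix simply mirrors the example in the comment.

```python
from pathlib import Path


def review_path(source: Path, suffix: str = "_REVIEW") -> Path:
    """Return the sidecar path for a source file, e.g. ROADMAP_REVIEW.md."""
    # with_name swaps only the final path component and keeps the directory.
    return source.with_name(f"{source.stem}{suffix}{source.suffix}")


def write_review(source: Path, review_text: str) -> Path:
    """Save the critic's notes next to the source without modifying it."""
    target = review_path(source)
    if target == source:
        # Guard for an empty suffix, which would overwrite the builder's file.
        raise ValueError("review file would overwrite the source file")
    target.write_text(review_text, encoding="utf-8")
    return target
```

For example, `review_path(Path("docs/ROADMAP.md"))` gives `docs/ROADMAP_REVIEW.md`, so the builder model can be pointed at every `*_REVIEW.md` file when its next session starts.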

Oreoluwa Bruce

May 20 •

💬 General discussion

Folder structure vs Obsidian

Hi everyone, going through the class lessons here, I feel like the hype around “Obsidian second brain” is not as important as a good folder/file structure, but everywhere I turn someone is yelling “second brain” this and that. I am here to ask whether Obsidian really enhances anything in particular, aside from looking cool. AI performance? AI reasoning? Or is it just duplication? Can anyone help? I don’t wanna go down the hole… too many things to learn. 😅

New comment 24d ago

Joe G.

1 like • May 21

@Deborah T If you ever want to learn a skill germane to your project, just make it a project objective in CLAUDE.md.

Joe G.

1 like • May 21

@Deborah T Regarding plugins, I fully agree. Instruct your project to keep it simple. My list above is the result of a very sophisticated analysis that I did 3 percent of myself. Just cut and paste it into your project and tell Claude "USE ONLY THESE PLUGINS" and you will be able to make very sophisticated graphs in Obsidian.

Jamie Smith

May 18 •

💬 General discussion

Deployment

I'm struggling with a gap in knowledge. Forgive me if this is already covered somewhere else; if so, point me in the right direction. Building and working with the folder structure is great, but how are people deploying these projects? When you build for yourself, the project is obviously on your machine. But what about building for other people? Are you providing the repo to friends, colleagues, clients, etc., or sending them the folder structure so their own Claude can use it? I'm kind of getting hung up on this point, and I feel that understanding it better will give me more perspective on shaping how the projects are built.

New comment May 18

Joe G.

1 like • May 18

@Alex Brown @Jamie Smith Agreed. For so many questions like this one, whatever else you do, you can always ask the model for an analysis of options and a recommendation.

Joe G.

May 15 •

📚 Resources & Finds

Free LLM Tier and VSCode Extension

I relied on a recent post (which turns out to be from Reddit) about the "free tier" of LLMs and found some discrepancies; notably, there has not been a "data sharing credit" from xAI for almost a year now. So I asked Grok to dig deeper and verify the answers for a better picture of the landscape. The results are here: https://grok.com/share/bGVnYWN5LWNvcHk_f65155a2-6bdd-4e69-b51f-a7242b000ad0 I think this answer is more helpful, especially since a lot of free tokens is not helpful if the rates are not good.

I also found a VS Code extension called Cline (thanks again, Grok). So far it has been a bit clunky (it hung on me once and I had to restart it), but it connects to a lot of these "free" options and also replaces some of my need for Open WebUI. You cannot overstate the usefulness of VS Code if you are using an LLM to build a product, as opposed to building an LLM-powered product.

Finally, let me say in the gentlest possible way that a lot of people here are relying on each other for up-to-date, complete, and accurate answers. Let's reinforce that.

New comment May 15

Joe G.

0 likes • May 15

@David Vogel I'm setting it up right now. I am going to watch a few videos before I show it a project.

Joe G.

1 like • May 15

@Dominic Franco I think we should assume they are all zero-privacy unless you get an enterprise account with written terms. I am learning the benefits of being model-agnostic, such as future-proofing for the day your current cloud model goes dark. It makes sense to try several models, and the practical way is to use some free ones. This is where Open WebUI is helpful.

Moon Kim

May 15 •

💬 General discussion

Testing the effects of updated instructions?

When you update the .md files to fix the broken agentic flow, how do you check whether the LLM's behavior has actually changed? Do you run the same task in a clean session?

New comment May 16

Joe G.

2 likes • May 15

@Moon Kim @Blaine Chartier Not knowing the details of the issue or the fix, my general answer is to ask the model. Something like: "Design and run a test to see if [describe change] as a response to [describe issue] resulted in [desired behavior]. Output is a report of the differential behavior of the model before the change and after." This will work if the broken behavior is in context, or if the bad output was preserved. Two things in your session work also help with this general problem: lexicography and addendums. First, give the target behavior a name; the model remembers it in context (or in project files, if you put it there), and then you have a code word for your target. Second, examine your project file from the outside. For example, to fix an issue in ROADMAP.md, do not write the fix to ROADMAP.md; write it to a ROADMAP_FIX.md and tell the model to try the task twice, once with the fix and once without. I concede that is a lot of tokens, though.
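The "once with the fix and once without" idea can also be run as a small harness outside the chat. The sketch below assumes a hypothetical `run_task(instructions, task)` function that starts a clean session with the given instructions and returns the model's output, and a hypothetical `check(output)` function that decides whether the target behavior showed up. Neither comes from a real library. Because model output varies between runs, each variant is run several times and the passes are counted.

```python
from typing import Callable


def ab_test_fix(
    base_instructions: str,
    fix_addendum: str,
    task: str,
    run_task: Callable[[str, str], str],  # hypothetical: clean session per call
    check: Callable[[str], bool],         # hypothetical: did the behavior appear?
    runs: int = 3,
) -> dict:
    """Count how often the target behavior appears with and without the fix."""
    variants = {
        "without_fix": base_instructions,
        # The fix stays a separate addendum (e.g. the text of ROADMAP_FIX.md).
        "with_fix": f"{base_instructions}\n\n{fix_addendum}",
    }
    report = {}
    for label, instructions in variants.items():
        outputs = [run_task(instructions, task) for _ in range(runs)]
        report[label] = sum(1 for output in outputs if check(output))
    return report
```

A result such as 0 of 3 passes without the fix and 3 of 3 with it is reasonable evidence that the addendum changed behavior. Close counts suggest more runs are needed, which is where the token cost mentioned above comes in.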


Joe G.

@joe-gusmano-6692

Lifelong learner.

Active 3d ago

Joined Apr 5, 2026

INFP
