1 contribution to AI & QA Accelerator
AI Coding Agents for QA: Part 4 — Why the Same Model Gives Different Test Results
In Part 3 I introduced Cursor and why IDE tools beat CLI tools for QA automation. But before we go deeper into Cursor features, there is a bigger question worth answering.

────────────────────────────────────────

𝐓𝐰𝐨 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐬. 𝐒𝐚𝐦𝐞 𝐌𝐨𝐝𝐞𝐥. 𝐃𝐢𝐟𝐟𝐞𝐫𝐞𝐧𝐭 𝐑𝐞𝐬𝐮𝐥𝐭𝐬.

Engineer A asks GPT-5.4 to write a login test. Gets back: a clean, structured test. Uses their fixtures. Follows their naming convention. Works on the first run.

Engineer B does the same thing. Same model. Same task. Gets back: a generic, broken test. Hardcoded credentials. No page objects. Fails immediately.

────────────────────────────────────────

🚫 𝐌𝐨𝐬𝐭 𝐏𝐞𝐨𝐩𝐥𝐞 𝐁𝐥𝐚𝐦𝐞 𝐭𝐡𝐞 𝐌𝐨𝐝𝐞𝐥

"GPT is bad at tests." "GPT doesn't understand Playwright." "I need a better model."

That is the wrong diagnosis. The model is not the problem. All modern models can code well. Three other things determine quality.

────────────────────────────────────────

⚙️ 𝐋𝐚𝐲𝐞𝐫 𝟏: 𝐓𝐡𝐞 𝐓𝐨𝐨𝐥

As covered in Part 1, you never talk to the model directly.

► You ► Tool ► Model

The tool decides what to send to the model: what context, what files, what history. Cursor sends your repo structure, open files, and recent edits. A chat app sends nothing. Same model, different tool, completely different output.

────────────────────────────────────────

📁 𝐋𝐚𝐲𝐞𝐫 𝟐: 𝐑𝐞𝐩𝐨 𝐐𝐮𝐚𝐥𝐢𝐭𝐲

AI agents amplify whatever already exists in your project. Good framework? The agent writes tests that slot right in. No page objects, no fixtures, no structure? The agent writes whatever it can, which is usually a mess.

This is the hard truth: AI cannot rescue a bad codebase. It makes it worse, faster. The model is only as good as what it can see.

If your repo has:

∙ Clear fixture files
∙ Consistent naming
∙ Reusable page objects
∙ Good test examples

...the agent pattern-matches against all of that and writes code that fits. If it sees nothing, it invents everything. Pure lottery.

────────────────────────────────────────

📝 𝐋𝐚𝐲𝐞𝐫 𝟑: 𝐓𝐡𝐞 𝐓𝐚𝐬𝐤 𝐒𝐩𝐞𝐜

"Write a login test" is not a task spec. It is a hint.
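For contrast, a real task spec for the same login case might look like the sketch below. Every file path, fixture name, and convention in it is a made-up placeholder; the point is the shape of the spec, not the specifics.

```
Write a Playwright login test.
- Use the logged-out page fixture defined in our fixtures file
- Use the existing LoginPage page object; do not write raw selectors
- Read credentials from environment variables, never hardcode them
- After login, assert the dashboard heading is visible
- Match the naming convention of the existing auth spec files
```

Same model, same engineer; the difference is that the model no longer has to guess.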
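To make the repo-quality point concrete, here is a minimal sketch of the page-object pattern an agent can imitate. The `Page` interface below is a stub standing in for Playwright's real page API (in a real repo you would import `Page` from `@playwright/test`), and the selectors are illustrative, not from any actual project.

```typescript
// Stub of the two page methods this example needs. In a real suite this
// would be Playwright's Page, imported from '@playwright/test'.
interface Page {
  fill(selector: string, value: string): Promise<void>;
  click(selector: string): Promise<void>;
}

// A reusable page object: the one place that knows the login selectors.
// An agent that sees a file like this writes tests against `login(...)`
// instead of inventing its own selectors and hardcoding credentials.
class LoginPage {
  constructor(private readonly page: Page) {}

  async login(username: string, password: string): Promise<void> {
    await this.page.fill('#username', username);
    await this.page.fill('#password', password);
    await this.page.click('button[type="submit"]');
  }
}

// A test written against the page object reads as intent, not selectors:
//   const loginPage = new LoginPage(page);
//   await loginPage.login(user.name, user.password);
```

One clear example file like this is often worth more than a paragraph of instructions, because the agent pattern-matches structure, not prose.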
1 like • 28d
Good stuff, my friend. This makes a lot of sense, and explains why it works that way!
Matt Robbins
@matt-robbins-tlyaw
I help Dads get out of their jobs so they can be stay-at-home dads without sacrificing their income.