How to Install Playwright CLI for AI Test Automation
In the last post I explained why Playwright CLI is a better fit than Playwright MCP for AI coding agents. So before we talk about workflows, debugging, or best practices, let's get clear on the tool itself:
- What is it?
- How do you install it?
- And how do you run one simple command against a real website?
────────────────────────────────────────
🟢 𝐖𝐡𝐚𝐭 𝐏𝐥𝐚𝐲𝐰𝐫𝐢𝐠𝐡𝐭 𝐂𝐋𝐈 𝐀𝐜𝐭𝐮𝐚𝐥𝐥𝐲 𝐈𝐬
Playwright CLI is a command-line tool for controlling a browser. You run commands in the terminal, and Playwright CLI can:
➜ Open a website
➜ Click buttons
➜ Fill inputs
➜ Press keys
➜ Take screenshots
➜ Read a page snapshot
It was designed for AI coding agents, but it is not only for AI. You can use it yourself from the terminal to check that the browser opens, the page loads, and the command returns useful page information.
────────────────────────────────────────
🧠 𝐇𝐨𝐰 𝐈𝐭 𝐅𝐢𝐭𝐬 𝐖𝐢𝐭𝐡 𝐀𝐈 𝐂𝐨𝐝𝐢𝐧𝐠 𝐀𝐠𝐞𝐧𝐭𝐬
The workflow is simple:
1. You ask the AI agent to inspect a page or debug a UI flow.
2. The agent runs Playwright CLI commands in the terminal.
3. Playwright CLI controls the browser.
4. The agent reads the result and decides what to do next.
This does not replace Selenium, Cypress, or Playwright Test. It acts as a new layer on top of those testing frameworks.
────────────────────────────────────────
🍎 𝐈𝐧𝐬𝐭𝐚𝐥𝐥 𝐏𝐥𝐚𝐲𝐰𝐫𝐢𝐠𝐡𝐭 𝐂𝐋𝐈 𝐎𝐧 𝐌𝐚𝐜
You need `Node.js` and `npm` first. If you already have them, check in Terminal:
────────────────
> node -v
────────────────
> npm -v
────────────────
If those commands do not work, install Node.js LTS first:
────────────────
> brew install node
────────────────
Once `node` and `npm` work, install Playwright CLI:
────────────────
> npm install -g @playwright/cli@latest
────────────────
Then verify it:
────────────────
> playwright-cli --version
────────────────
You can also print the available commands:
────────────────
> playwright-cli --help
────────────────
Now go to the project where you want to use it:
────────────────
> cd your-project-folder
────────────────
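To answer the last question from the intro, here is a minimal sketch of running one simple command against a real website. Treat the subcommand names below (`open`, `snapshot`) as assumptions; the exact names may differ in your installed version, so confirm them with `playwright-cli --help` first:
────────────────
> playwright-cli open https://example.com
────────────────
> playwright-cli snapshot
────────────────
If a browser opens the page and the second command prints readable page information back to the terminal, the setup works, and an AI coding agent can drive it the same way from your project folder.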
Playwright CLI: The Practical Guide
🧠 𝗔𝘂𝘁𝗼𝗺𝗮𝘁𝗶𝗼𝗻 𝘁𝗼𝗼𝗹𝘀 𝘂𝘀𝗲𝗱 𝘁𝗼 𝗯𝗲 𝗯𝘂𝗶𝗹𝘁 𝗳𝗼𝗿 𝗵𝘂𝗺𝗮𝗻𝘀.
A QA engineer:
1. Wrote the code.
2. Read the errors.
3. Decided what to try next.
That was the normal workflow for years. But now everything has changed. Starting in early 2026, AI Coding Agents can handle all of those steps, while QA engineers act as managers and agentic leads.
────────────────────────────────────────
🟠 𝐏𝐥𝐚𝐲𝐰𝐫𝐢𝐠𝐡𝐭 𝐌𝐂𝐏
Playwright MCP was the first serious tool for this new AI QA workflow. It let an AI agent look at the page, click buttons, take page snapshots, and do basic browser tasks.
Main use cases for Playwright MCP in test automation:
- Gathering locators for UI tests
- Debugging flaky or failed tests
- Reading console and network logs
How it works:
1. The user asks an AI agent that has access to Playwright MCP to do a task.
2. The AI coding agent drives the browser through Playwright MCP.
For a while, that seemed like a great option, but soon enough it turned out to have a few fatal issues...
────────────────────────────────────────
🔴 𝗣𝗹𝗮𝘆𝘄𝗿𝗶𝗴𝗵𝘁 𝗠𝗖𝗣 𝗶𝘀 𝗻𝗼𝘁 𝘁𝗵𝗲 𝗯𝗲𝘀𝘁 𝗼𝗽𝘁𝗶𝗼𝗻 𝗳𝗼𝗿 𝘁𝗲𝘀𝘁 𝗮𝘂𝘁𝗼𝗺𝗮𝘁𝗶𝗼𝗻
Quick recap first: the AI agent's context is its working memory. It holds the current conversation, instructions, code, and everything else the agent needs to stay on track.
Here is how Playwright MCP works:
1. It loads a full page snapshot (HTML + CSS) into the AI agent's context after each page interaction.
2. It also loads large MCP metadata that tells the agent how to use the tool.
That means Playwright MCP can eat 20–30% of the context in a single use. And once the context crosses 50–60%, agents start making mistakes and losing track of earlier instructions.
So technically it works, but the context overhead and cost are not great.
────────────────────────────────────────
🟢 𝐏𝐥𝐚𝐲𝐰𝐫𝐢𝐠𝐡𝐭 𝐂𝐋𝐈
Playwright CLI was built to solve those problems. It gives AI agents a simple command-line utility they can call like any other terminal command:
- The agent runs small commands and gets back short results.
- It reads the full HTML of a page only when needed, not on every interaction like Playwright MCP does.
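To make the contrast concrete, here is a minimal sketch. Playwright MCP runs as a server process the agent connects to (assuming the published @playwright/mcp package; agents normally launch it for you from their MCP configuration), while Playwright CLI is just an ordinary terminal command the agent can call one step at a time:
────────────────
> npx @playwright/mcp@latest
────────────────
> playwright-cli --help
────────────────
The first command keeps a server running and streams full tool results, snapshots included, into the agent's context. Every Playwright CLI call looks like the second one: a short-lived command with a short text result, which is why the agent's context stays small.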
AI Coding Agents: 𝐀𝐈 𝐀𝐮𝐭𝐨𝐓𝐞𝐬𝐭 𝐋𝐢𝐯𝐞 𝐖𝐨𝐫𝐤𝐬𝐡𝐨𝐩
AI is changing Software Development. And it is changing QA with it.
QA Engineers who know how to use AI will:
⬩ Deliver in days what used to take two weeks
⬩ Do work that used to require deep expertise. With AI, basic knowledge can produce senior-level results
⬩ Get instant AI feedback on tests, code, and debugging decisions
The same applies to Software Developers. AI multiplies their delivery speed. QA becomes the bottleneck. That's why companies are fighting to hire QA Engineers who can match that speed.
💡 In fact, as of early 2026, many companies started adding AI coding tasks to their interview process.
QA Engineers who ignore AI won't just fall behind; they risk losing their career entirely. That's not doomsaying. In 2026, tech companies laid off 55,775 people (https://www.trueup.io/layoffs).
So, are those layoffs because AI is replacing people? No. AI is not replacing anyone. People who use AI are replacing people who don't.
Unlike the transition from Manual Testing to QA Automation, which took a decade, this shift is happening fast. Capable AI Coding Agents only became real in late 2025. Just a few months later, the entire tech world had changed.
That's what this community is about. It's for people who see this shift and understand that right now is not just a pivotal moment for them. It's a short golden window to become one of the first truly AI-Powered QA Automation Engineers / SDETs and set yourself up for a long, safe, and extremely high-paying QA career.
────────────────────────────────────────
𝐀𝐛𝐨𝐮𝐭 𝐌𝐞, 𝐚𝐧𝐝 𝐰𝐡𝐲 𝐈 𝐛𝐮𝐢𝐥𝐭 𝐀𝐈 𝐀𝐮𝐭𝐨𝐓𝐞𝐬𝐭 𝐋𝐢𝐯𝐞 𝐖𝐨𝐫𝐤𝐬𝐡𝐨𝐩
I'm 𝐌𝐚𝐭𝐯𝐢𝐲, a Vegas-based 𝐏𝐫𝐢𝐧𝐜𝐢𝐩𝐚𝐥 𝐒𝐃𝐄𝐓 with 𝟏𝟎+ 𝐲𝐞𝐚𝐫𝐬 𝐨𝐟 𝐞𝐱𝐩𝐞𝐫𝐢𝐞𝐧𝐜𝐞. I've worked across startups and large enterprises, building QA automation frameworks and testing infrastructure across pretty much all modern stacks and tools.
In 2025 I introduced AI coding agents into my team's QA Automation workflows. The team adopted them. Management noticed. To get there, I spent $3,000+ of my own money. Not on theory, but on practice.
AI Coding Agents for QA: Part 1 — What They Are and Why It Matters
AI is everywhere, and it's easy to feel overwhelmed. Codex. Claude Code. Cursor. Windsurf. Copilot. New names every week, new hype every day.
But they all describe the same concept: AI coding agents.
────────────────────────────────────────
𝐖𝐡𝐚𝐭 𝐈𝐬 𝐚𝐧 𝐀𝐈 𝐂𝐨𝐝𝐢𝐧𝐠 𝐀𝐠𝐞𝐧𝐭?
Simple: it's a tool that interacts with AI and generates code. That's it.
But like any tool in a QA engineer's kit, not all of them are equal. Some are great for specific tasks, some are poor at most things, and some are solid generalists you can use anywhere and get good results.
I spent over $3,000 testing them so you don't have to. In this series of posts I'll share exactly what I found. Today, we start with the fundamentals.
────────────────────────────────────────
🧠 𝐖𝐡𝐚𝐭 𝐈𝐬 𝐚𝐧 𝐋𝐋𝐌?
LLM stands for Large Language Model, the brain powering every AI coding agent. But here's the key thing to understand: you never talk to the LLM directly. There's always a tool sitting in between:
► YOU ► Tool (Cursor / Copilot / Claude Code) ► LLM (GPT-5 / Claude / Gemini)
The same pattern applies when you use AI chat apps, except the interface is built for conversation, not code.
────────────────────────────────────────
⚡ 𝐖𝐡𝐲 𝐓𝐡𝐢𝐬 𝐌𝐚𝐭𝐭𝐞𝐫𝐬 𝐟𝐨𝐫 𝐘𝐨𝐮
The tool you pick (Cursor, Copilot, etc.) is responsible for roughly 50% of your results. Here's why: the tool reads your code, decides what information to send to the LLM, and determines how much the AI actually understands about your project and how well it can write the actual code.
Different tools. Different developers. Different quality. Same LLM. Wildly different output.
This is exactly why the same engineer, using the same LLM but a different tool, can get completely different results. For example, using the exact same GPT model in Cursor versus Copilot for the same task will produce very different quality output.
────────────────────────────────────────
📌 𝐊𝐞𝐲 𝐓𝐚𝐤𝐞𝐚𝐰𝐚𝐲𝐬
- LLM = the brain. You can't access it directly.
- Tools (Cursor, Copilot, Claude Code) sit between you and the LLM.
- The tool accounts for ~50% of the quality you get.
- Different tools, different quality, different output, even with the same LLM underneath.
AI Coding Agents for QA: Part 5 — Stop Writing Prompts. Start Writing Task Specs
You open Cursor, Copilot, or whatever AI tool you like... You type: "write a login test"
The agent responds. It looks like a test. Imports are there. Structure looks familiar. But you look closer.
- Hardcoded credentials.
- Wrong file location.
- No page objects.
- Naming conventions are ignored.
- And on top of all that, you run it... it fails.
────────────────────────────────────────
🧠 𝐖𝐡𝐲 𝐭𝐡𝐞 𝐀𝐠𝐞𝐧𝐭 𝐆𝐮𝐞𝐬𝐬𝐞𝐬 𝐖𝐫𝐨𝐧𝐠
Most people at this point blame the model.
- "Claude is bad at tests."
- "GPT doesn't understand Playwright."
- "I need a better model."
But the reality is... the model did not fail you. You gave it nothing useful to work with.
Think of the agent like a new hire. Smart. Fast. Capable. But they have never seen your project before.
➤ They do not know where your fixtures live.
➤ They do not know how you name test files.
➤ They do not know what credential pattern you use.
➤ They do not know whether you run tests after every change.
You told them: "write a login test." So they try to find all that information themselves and make a lot of assumptions. Every assumption is a guess. Every guess is a risk of being wrong.
That is an onboarding problem: a lack of proper documentation.
────────────────────────────────────────
📝 𝐖𝐡𝐚𝐭 𝐚 𝐑𝐞𝐚𝐥 𝐓𝐚𝐬𝐤 𝐒𝐩𝐞𝐜 𝐋𝐨𝐨𝐤𝐬 𝐋𝐢𝐤𝐞
In the AI coding agents world, that documentation is often called a "task spec." A task spec is not a longer prompt. It is a precise set of constraints that leaves the agent very little room to guess.
Here is the difference.
𝗪𝗲𝗮𝗸 𝗽𝗿𝗼𝗺𝗽𝘁:
```
write a login test
```
𝗚𝗼𝗼𝗱 𝗧𝗮𝘀𝗸 𝗦𝗽𝗲𝗰:
```
Write a login test.
Before making any changes, inspect the existing tests in /tests/auth/ and follow the existing suite structure, naming, and conventions.
Task:
- Add a test for successful login using the existing credentials fixture.
- Place it in the appropriate existing auth test suite.
- Do not hardcode credentials or duplicate fixture data.
- Do not create new files unless no existing test file is appropriate.
```
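One constraint worth adding to a spec like this is a verification step, so the agent also covers the "whether you run tests after every change" point above. A minimal sketch, assuming a standard Playwright Test setup that can run the auth suite from the project root:
```
Verification:
- After making changes, run `npx playwright test tests/auth` and fix any failures before reporting the task as done.
```
With that line in place, the agent cannot stop at "it looks like a test." It has to prove the test actually passes.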