From ChatGPT for Testing to Full Automation: Why I Built My Own AI Testing Tool
I got tired of pasting DOM snippets, so I built a tool that uses video to generate Playwright scripts.
If you are anything like us, your browser history is likely a graveyard of Gemini conversations and ChatGPT threads named "Fix this Playwright Selector" or "Generate Test Script."
We have all tried using ChatGPT for testing. It usually goes like this: you open the dev tools, copy a massive chunk of your DOM, paste it into ChatGPT, and pray it understands which <button> is actually the "Submit" button.
It works, but it is exhausting. It feels less like "full automation" and more like advanced copy-pasting.
We recently went down a rabbit hole looking for a better AI testing tool. We needed something that could take the reasoning of an LLM and apply it directly to the browser, without the manual friction. When we couldn't find exactly what we needed, we teamed up with our co-founder, Sandeep, and the Bug0 crew to build it.
It is called Bug0 Studio (at vibe.bug0.com), and it has completely changed how we generate Playwright scripts. It runs on the same core technology we use to serve enterprise customers through our Forward Deployed Engineering (FDE) model for QA testing, so we knew it had to be robust.
Here is the story of why we built it, and why it has become the best AI software testing tool for our own workflow.
The Problem: Why Text Prompts Aren't Enough
The biggest limitation we faced when using ChatGPT for usability testing or script generation was blindness. An LLM (Large Language Model) doesn’t see your site; it only reads the text you feed it.
Sure, you can use ChatGPT for guerrilla usability testing ideas. You can ask it to generate user personas or hypothetical scenarios. But when it comes to the actual execution, specifically writing the code that clicks the button, it often fails because it lacks context. It doesn't know that your "Login" button is hidden behind a cookie banner, or that your specific React component renders differently on mobile.
We wanted a workflow that felt like ChatGPT. It needed to be conversational and smart, but we needed it to have "eyes" on our application.
Enter Bug0 Studio: The "Vibe" Testing Approach
We designed Bug0 Studio as a "Natural Language to Playwright" platform. The idea was straightforward. We did not want to write code or mess with complex prompts. We just wanted to show the AI what to do.
We designed the workflow to be as simple as possible:
1. Record, Upload, or Type: We made it flexible: you can record your browser tab directly, upload a video file (mp4/webm), or just describe the flow in natural language text.
2. Describe: You tell it what you are testing in plain English.
3. Get Code: It processes the video recording along with your text description and spits out a fully working Playwright test.
We needed a real AI tool for automation testing that connects the user's view directly to the code logic.

Bug0 Studio Screenshot: The entry point is simple: either record live or upload a video.
The "Lazy Developer's" Dream Setup
What we focused on most wasn't just the AI generation, but the specific configuration options. Most AI tools for testing require a lot of boilerplate setup. We tried to identify the exact pain points developers (including ourselves) have when setting up test suites.
When you start a session, a configuration panel handles the boring stuff:
Site Setup: You just drop in your Base URL.
Login Credentials: You can provide optional Email/Password fields so the AI understands authentication flows.
Storage State JSON: This is the killer feature we knew we needed.
We are paranoid about security, so we made a deliberate design choice here: your Base URL and credentials stay in localStorage on your machine. They aren't stored permanently on our servers.

Being able to paste a Playwright storage state JSON means you can test deep-link flows without re-logging in every time.
If you have ever written E2E tests for a dashboard behind a login wall, you know the pain. Usually, you have to script the login steps for every single test. With Bug0 Studio, you can just paste your Playwright storageState JSON, and the AI starts from that session. In effect, it automates the setup you would otherwise repeat by hand.
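For readers who have not seen one, a Playwright storage state file is plain JSON with a list of cookies and a list of per-origin localStorage entries. The sketch below types that shape and does a shallow check on a pasted string before it is used. The names `StorageStateFile` and `parseStorageState` are ours for illustration; this is not Bug0 Studio's internal code.

```typescript
// Shape of the JSON that Playwright writes via context.storageState().
interface StoredCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number; // Unix time in seconds, -1 for a session cookie
  httpOnly: boolean;
  secure: boolean;
  sameSite: "Strict" | "Lax" | "None";
}

interface StoredOrigin {
  origin: string;
  localStorage: { name: string; value: string }[];
}

interface StorageStateFile {
  cookies: StoredCookie[];
  origins: StoredOrigin[];
}

// Hypothetical helper: parse pasted text and do a shallow shape check.
// Returns null instead of throwing so a UI can show a friendly error.
function parseStorageState(pasted: string): StorageStateFile | null {
  let data: unknown;
  try {
    data = JSON.parse(pasted);
  } catch {
    return null; // not valid JSON at all
  }
  if (typeof data !== "object" || data === null) return null;

  const candidate = data as { cookies?: unknown; origins?: unknown };
  if (!Array.isArray(candidate.cookies) || !Array.isArray(candidate.origins)) {
    return null; // both top-level arrays are required
  }

  const cookiesOk = candidate.cookies.every(
    (c) => typeof c?.name === "string" && typeof c?.value === "string"
  );
  const originsOk = candidate.origins.every(
    (o) => typeof o?.origin === "string" && Array.isArray(o?.localStorage)
  );
  return cookiesOk && originsOk ? (data as StorageStateFile) : null;
}
```

In a regular Playwright project, a file with this shape can be loaded with `test.use({ storageState: 'auth.json' })`, where `auth.json` is a placeholder path.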
A Practical Example
A simple checkout flow is the best way to show how we use it. Checkout is usually where brittle selectors cause the most pain.
The Workflow:
We clicked "Record Tab" in Bug0 Studio.
We clicked on a product, added it to the cart, and proceeded to checkout.
We stopped the recording.
The Validation Step: We didn't want it to be a "black box," so we built an intermediate step. The AI analyzes the video and extracts a list of ordered steps. We can actually edit, add, or remove steps right there. This allows us to fix a logic error we made during recording before the AI writes any code.
We verified the steps and hit "Generate."
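The review step amounts to editing an ordered list before any code is generated. Here is a minimal sketch of that idea. The `RecordedStep` type, the helper names, and the step wording are all hypothetical and do not describe Bug0 Studio's actual data model.

```typescript
// Hypothetical model of one step extracted from a recording.
interface RecordedStep {
  description: string;
}

// Each helper returns a new array and leaves the input list untouched,
// so the original extraction is still available if an edit was a mistake.
function editStep(steps: RecordedStep[], index: number, description: string): RecordedStep[] {
  return steps.map((step, i) => (i === index ? { ...step, description } : step));
}

function insertStep(steps: RecordedStep[], index: number, description: string): RecordedStep[] {
  return [...steps.slice(0, index), { description }, ...steps.slice(index)];
}

function removeStep(steps: RecordedStep[], index: number): RecordedStep[] {
  return steps.filter((_, i) => i !== index);
}

// Steps as they might come out of a checkout recording, including an
// accidental click that should not end up in the generated test.
const extracted: RecordedStep[] = [
  { description: "Open the product page" },
  { description: "Click the search field" }, // accidental click
  { description: "Add the product to the cart" },
  { description: "Proceed to checkout" },
];

// Drop the mis-click, reword the first step, and append an assertion step.
const reviewed = insertStep(
  editStep(removeStep(extracted, 1), 0, "Click the first product on the home page"),
  3,
  "Verify the order summary is visible"
);

console.log(reviewed.map((s, i) => `${i + 1}. ${s.description}`).join("\n"));
```

Fixing the list at this stage is cheaper than fixing generated code, because a wrong step is corrected once instead of in every assertion that depends on it.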
The Result: Once the generation started, the tool spun up a live cloud browser session. We watched a split-screen view: on the left, the AI explained its logic in real time; on the right, we saw a live preview of the test running in the cloud browser.
It wasn't just "guessing" based on HTML; it was executing the "vibe" of the user journey in front of our eyes, ending with a generated Playwright script that used robust locators like getByRole.
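To make the locator point concrete, here is a rough sketch of what a role-based checkout script can look like. It is not output copied from Bug0 Studio: the URL path, the link and button names, and the `checkoutFlow` function are made up. The two small interfaces only describe the slice of Playwright's page API that the sketch touches (`goto`, `getByRole`, `click`), so the example type-checks without any dependency.

```typescript
// Minimal structural stand-ins for the parts of Playwright's Page and
// Locator used below. They are assumptions made to keep this self-contained.
interface LocatorLike {
  click(): Promise<void>;
}

interface PageLike {
  goto(url: string): Promise<unknown>;
  getByRole(role: string, options?: { name?: string | RegExp }): LocatorLike;
}

// Hypothetical checkout journey expressed with role-based locators.
// Each locator targets what the user perceives (a link, a button, its
// label) rather than where the element sits in the DOM tree.
async function checkoutFlow(page: PageLike, baseURL: string): Promise<void> {
  await page.goto(`${baseURL}/products`);
  await page.getByRole("link", { name: "Blue T-Shirt" }).click();
  await page.getByRole("button", { name: "Add to cart" }).click();
  await page.getByRole("button", { name: /checkout/i }).click();
}
```

In a real project you would call the same methods on the `page` fixture from `@playwright/test` instead of a hand-written interface.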

The Landscape: Why We Built This
When searching for an AI testing tool, we mostly found three types of solutions. Here is why each one frustrated us enough to build our own:
1. The Chatbots (ChatGPT, Claude)
We love them, but they are blind. You have to feed them HTML snapshots or screenshot uploads manually. If the DOM changes or uses dynamic classes (like CSS modules), the bot often hallucinates selectors that look correct but fail immediately.
The drawback: zero real-time visual context, plus constant manual copy-pasting.
2. The "Literal" Recorders (Playwright Codegen)
These tools are technically impressive, but they are literal to a fault. If we accidentally click the wrong field and then correct ourselves, the recorder scripts both actions. Depending on the markup, they can also fall back to positional selectors (e.g., div:nth-child(3)), which break the moment we add a new UI element.
The drawback: brittle scripts. They record actions, not intent.
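The brittleness is easy to reproduce without a browser. The toy model below is ours, not Playwright's: `ToyNode`, `nthChild`, and `byRole` are hypothetical stand-ins for a positional selector and a role-plus-name lookup. It shows the positional lookup drifting to a different element once a banner is added, while the role-based lookup still finds the button.

```typescript
// Hypothetical, heavily simplified element: a role plus its visible label.
interface ToyNode {
  role: string;
  label: string;
}

// Positional lookup, in the spirit of :nth-child(n): 1-based index among siblings.
function nthChild(siblings: ToyNode[], n: number): ToyNode | undefined {
  return siblings[n - 1];
}

// Intent-based lookup, in the spirit of getByRole(role, { name }).
function byRole(siblings: ToyNode[], role: string, label: string): ToyNode | undefined {
  return siblings.find((node) => node.role === role && node.label === label);
}

const cartBefore: ToyNode[] = [
  { role: "heading", label: "Your cart" },
  { role: "link", label: "Continue shopping" },
  { role: "button", label: "Checkout" },
];

// A new promo banner is inserted at the top of the same container.
const cartAfter: ToyNode[] = [{ role: "banner", label: "Free shipping" }, ...cartBefore];

console.log(nthChild(cartBefore, 3)?.label); // Checkout
console.log(nthChild(cartAfter, 3)?.label); // Continue shopping (wrong element)
console.log(byRole(cartAfter, "button", "Checkout")?.label); // Checkout
```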
3. The "Black Box" Enterprise Suites
There are plenty of "Autonomous Testing Agents" out there. They promise self-healing and zero-code, but they usually require us to install their heavy SDKs, run tests on their specific cloud infrastructure, and pay enterprise pricing.
The drawback: vendor lock-in. We designed Bug0 Studio to output standard Playwright code that we can run locally in VS Code or in our own CI/CD pipeline.
Conclusion
We are seeing a flood of AI tools right now, and it is hard to tell which ones are actually useful. But if you are looking for an alternative to ChatGPT for testing that is purpose-built for modern web apps, we invite you to check out what we've built.
It solves the context problem by using video, and it solves the setup problem by handling Auth and Storage States natively.
Give it a spin at Bug0 Studio. It’s surprisingly fun to just upload a video and watch clean code appear. We truly believe this is not just a temporary fix, but the blueprint for the best AI software testing tool of the future.
Feel free to hop into our Discord and share your thoughts or feature requests.