From ChatGPT for Testing to Full Automation: Why I Built My Own AI Testing Tool

Updated Nov 28, 2025

I got tired of pasting DOM snippets, so I built a tool that uses video to generate Playwright scripts.

If you are anything like us, your browser history is likely a graveyard of Gemini conversations and ChatGPT threads named "Fix this Playwright Selector" or "Generate Test Script."

We have all tried using ChatGPT for testing. It usually goes like this: you open the dev tools, copy a massive chunk of your DOM, paste it into ChatGPT, and pray it understands which <button> is actually the "Submit" button.

It works, but it is exhausting. It feels less like "full automation" and more like advanced copy-pasting.

We recently went down a rabbit hole looking for a better AI testing tool. We needed something that could take the logic of an LLM and apply it directly to the browser, without the manual friction. When we couldn't find exactly what we needed, we teamed up with Sandeep, our co-founder, and the rest of the Bug0 crew to build it.

It is called Bug0 Studio (at vibe.bug0.com), and it has completely changed how we generate Playwright scripts. It is the same core technology we use to serve enterprise customers under our Forward Deployed Engineering (FDE) model for QA testing, so we knew it had to be robust.

Here is the story of why we built it, and why it has become the best AI software testing tool for our own workflow.

The Problem: Why Text Prompts Aren't Enough

The biggest limitation we faced when using ChatGPT for usability testing or script generation was blindness. An LLM (Large Language Model) doesn’t see your site; it only reads the text you feed it.

Sure, you can use ChatGPT for guerrilla usability testing ideas. You can ask it to generate user personas or hypothetical scenarios. But when it comes to actual execution, specifically writing the code that clicks the button, it often fails because it lacks context. It doesn't know that your "Login" button is hidden behind a cookie banner, or that your specific React component renders differently on mobile.

We wanted a workflow that felt like ChatGPT. It needed to be conversational and smart, but we needed it to have "eyes" on our application.

Enter Bug0 Studio: The "Vibe" Testing Approach

We designed Bug0 Studio as a vibe testing platform. The idea was straightforward: instead of writing code or messing with complex prompts, we simply wanted to show the AI the "vibe" of the user journey and have it generate the script.

We designed the workflow to be as simple as possible:

Record, Upload, or Type: We made it flexible. You can record your browser tab directly, upload a video file (mp4/webm), or just describe the flow in natural-language text.
Describe: You tell it what you are testing in plain English.
Get Code: It processes the recording (when there is one) along with your text description and spits out a fully working Playwright test.

We needed a real AI tool for automation testing that connects the user's view directly to the code logic.

Screenshot: The Bug0 Studio landing page, showing the "Record Tab" and "Upload Video" options side by side. The entry point is simple: either record live or upload a video.

The "Lazy Developer's" Dream Setup

What we focused on most wasn't just the AI generation, but the specific configuration options. Most AI tools for testing require a lot of boilerplate setup. We tried to identify the exact pain points developers (including ourselves) have when setting up test suites.

When you start a session, a configuration panel handles the boring stuff:

Site Setup: You just drop in your Base URL.
Login Credentials: You can provide optional Email/Password fields so the AI understands authentication flows.
Storage State JSON: This is the killer feature we knew we needed.

We are paranoid about security, so we made a deliberate design choice here: your Base URL and credentials stay in localStorage on your machine. They aren't stored permanently on our servers.

Screenshot: A close-up of the Bug0 Studio configuration panel, highlighting the "Storage State JSON" field and the Login Credentials section.

Being able to paste a Playwright storage state JSON means you can test deep-link flows without re-logging in every time.

If you have ever written E2E tests for a dashboard behind a login wall, you know the pain. Usually, you have to script the login steps for every single test. With Bug0, you can just paste your Playwright storageState JSON, and the AI starts from that authenticated session. In effect, it automates the setup you would otherwise do by hand.
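For reference, a Playwright storage state is a JSON object with a `cookies` array and an `origins` array, where each origin carries its `localStorage` entries. The sketch below shows one way to sanity-check a pasted blob before relying on it. `parseStorageState` is a hypothetical helper written for illustration, not part of Playwright or Bug0 Studio, and the sample values are made up.

```typescript
// Shape of the JSON that Playwright's context.storageState() produces.
interface StorageCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number; // Unix time in seconds
  httpOnly: boolean;
  secure: boolean;
  sameSite: "Strict" | "Lax" | "None";
}
interface OriginState {
  origin: string;
  localStorage: { name: string; value: string }[];
}
interface StorageState {
  cookies: StorageCookie[];
  origins: OriginState[];
}

// Hypothetical helper: parse pasted text and do a shallow structural check.
// It only verifies the two top-level arrays, not every field inside them.
function parseStorageState(raw: string): StorageState {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("storage state must be a JSON object");
  }
  const { cookies, origins } = data as { cookies?: unknown; origins?: unknown };
  if (!Array.isArray(cookies) || !Array.isArray(origins)) {
    throw new Error('storage state needs "cookies" and "origins" arrays');
  }
  return { cookies, origins } as StorageState;
}

// Made-up sample, standing in for what you would paste into the field.
const pasted = JSON.stringify({
  cookies: [
    {
      name: "session",
      value: "abc123",
      domain: "example.com",
      path: "/",
      expires: 1767225600,
      httpOnly: true,
      secure: true,
      sameSite: "Lax",
    },
  ],
  origins: [
    { origin: "https://example.com", localStorage: [{ name: "theme", value: "dark" }] },
  ],
});

const state = parseStorageState(pasted);
console.log(state.cookies.length, state.origins.map((o) => o.origin));
```

One way to obtain such a blob is to log in once in a Playwright script and call `await context.storageState({ path: "state.json" })`, which writes the JSON to disk.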

A Practical Example

A simple checkout flow is the best way to show how we use it. Checkout flows are usually a nightmare for brittle selectors.

The Workflow:

We clicked "Record Tab" in Bug0 Studio.
We clicked on a product, added it to the cart, and proceeded to checkout.
We stopped the recording.
The Validation Step: We didn't want it to be a "black box," so we built an intermediate step. The AI analyzes the video and extracts a list of ordered steps. We can actually edit, add, or remove steps right there. This allows us to fix a logic error we made during recording before the AI writes any code.
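The edit-before-generate step can be pictured as operations on an ordered list. The sketch below is purely illustrative: the `Step` shape and the `removeStep` and `insertStep` helpers are hypothetical names, not Bug0 Studio's internal data model.

```typescript
// Hypothetical shape for one extracted step; not Bug0 Studio's actual model.
interface Step {
  action: "navigate" | "click" | "fill" | "assert";
  target: string; // human-readable description of the element or page
  value?: string;
}

// Return a new list without the step at `index` (the input is left untouched).
function removeStep(steps: readonly Step[], index: number): Step[] {
  return steps.filter((_, i) => i !== index);
}

// Return a new list with `step` inserted at position `index`.
function insertStep(steps: readonly Step[], index: number, step: Step): Step[] {
  return [...steps.slice(0, index), step, ...steps.slice(index)];
}

// Steps as they might be extracted from a recording, including a mis-click.
const extracted: Step[] = [
  { action: "click", target: "product card" },
  { action: "click", target: "newsletter banner" }, // accidental click
  { action: "click", target: "Add to cart button" },
  { action: "click", target: "Checkout link" },
];

// Drop the accidental click, then append a final check before generating code.
const cleaned = removeStep(extracted, 1);
const reviewed = insertStep(cleaned, cleaned.length, {
  action: "assert",
  target: "Checkout heading",
});

console.log(reviewed.map((s, i) => `${i + 1}. ${s.action} ${s.target}`).join("\n"));
```

Keeping the helpers non-mutating means the originally extracted list stays available, which makes an "undo" in a review screen straightforward.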
We verified the steps and hit "Generate."

The Result: Once the generation started, the tool spun up a live cloud browser session. We watched a split-screen view: on the left, the AI explained its logic in real time; on the right, we saw a live preview of the test actually running on the cloud browser.

It wasn't just "guessing" based on HTML; it was executing the "vibe" of the user journey in front of our eyes, ending with a generated Playwright script that used robust locators like getByRole.
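To give a feel for that kind of output, here is a rough sketch of a role-based checkout flow. It is not actual Bug0 Studio output: the URL and the link and button names are invented, and `PageLike` and `LocatorLike` are minimal stand-ins declared here so the snippet runs on its own without a browser. They mirror the `goto`, `getByRole`, and `click` calls found on Playwright's `Page` and `Locator`.

```typescript
// Minimal stand-ins for the parts of Playwright's Page and Locator used below.
interface LocatorLike {
  click(): Promise<void>;
}
interface PageLike {
  goto(url: string): Promise<unknown>;
  getByRole(role: string, options?: { name?: string }): LocatorLike;
}

// The recorded journey: open a product, add it to the cart, go to checkout.
// Elements are located by accessible role and name, not by DOM position.
async function checkoutFlow(page: PageLike): Promise<void> {
  await page.goto("/products");
  await page.getByRole("link", { name: "Blue T-Shirt" }).click();
  await page.getByRole("button", { name: "Add to cart" }).click();
  await page.getByRole("link", { name: "Checkout" }).click();
}

// A fake page that logs each call, so the flow can run without a browser.
function createFakePage(log: string[]): PageLike {
  return {
    async goto(url) {
      log.push(`goto ${url}`);
      return null;
    },
    getByRole(role, options) {
      return {
        async click() {
          log.push(`click ${role} "${options?.name ?? ""}"`);
        },
      };
    },
  };
}

const calls: string[] = [];
const flowDone = checkoutFlow(createFakePage(calls)).then(() => {
  console.log(calls.join("\n"));
});
```

In a real spec, the same body would sit inside `test("checkout", async ({ page }) => { ... })` from `@playwright/test`, typically ending with an assertion such as `await expect(page.getByRole("heading", { name: "Checkout" })).toBeVisible()`.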

Screenshot: A split-screen view. On the left, a frame of the video recording showing the checkout; on the right, the generated Playwright script in a code block.

The Landscape: Why We Built This

When searching for an AI testing tool, we mostly found three types of solutions. Here is why each one frustrated us enough to build our own:

1. The Chatbots (ChatGPT, Claude)

We love them, but they are blind. You have to feed them HTML snapshots or screenshot uploads manually. If the DOM changes or uses dynamic classes (like CSS modules), the bot often hallucinates selectors that look correct but fail immediately.

Zero real-time visual context; requires constant manual copy-pasting.

2. The "Literal" Recorders (Playwright Codegen)

These tools are technically impressive, but they are literal to a fault. If we accidentally click the wrong field and then correct ourselves, the recorder scripts both actions. When an element has no accessible role, label, or test ID to latch onto, they can also fall back to positional selectors (e.g., div:nth-child(3)), which break the moment we add a new UI element.

Brittle scripts. They record actions, not intent.
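To make "brittle" concrete, the sketch below flags selectors that depend on DOM position or on build-generated class names. `isBrittleSelector` and its two regular expressions are hypothetical heuristics written for illustration. They are not a Playwright API, not how any recorder scores selectors, and they will miss cases (for example, hashed class names that contain no digits).

```typescript
// Positional pseudo-classes tie a selector to the element's place in the DOM.
const POSITIONAL = /:nth-(child|of-type)\(/;

// Class names with a hash-like suffix after "__", e.g. ".Button_root__x7Yz2".
// Requiring a digit in the suffix avoids flagging BEM names like ".card__title".
const HASHED_CLASS = /\.[A-Za-z][\w-]*__(?=[A-Za-z0-9]*\d)[A-Za-z0-9]{5,}/;

// Hypothetical heuristic: true if the selector is likely to break on a UI change.
function isBrittleSelector(selector: string): boolean {
  return POSITIONAL.test(selector) || HASHED_CLASS.test(selector);
}

const samples = [
  "div:nth-child(3) > button",
  ".Button_root__x7Yz2",
  ".card__title",
  'button[aria-label="Submit"]',
];

for (const s of samples) {
  console.log(`${s} -> ${isBrittleSelector(s) ? "brittle" : "ok"}`);
}
```

A check like this could run over recorded scripts in code review, though locating elements by role and accessible name in the first place sidesteps the problem.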

3. The "Black Box" Enterprise Suites

There are plenty of "Autonomous Testing Agents" out there. They promise self-healing and zero-code, but they usually require us to install their heavy SDKs, run tests on their specific cloud infrastructure, and pay enterprise pricing.

Vendor lock-in. We designed Bug0 Studio to just give us standard Playwright code that we can run in our own local VS Code or CI/CD pipeline.
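Because the output is plain Playwright, wiring it into an existing project is ordinary configuration. A minimal `playwright.config.ts` might look like the sketch below. The test directory, the `BASE_URL` variable name, the fallback URL, and the `state.json` path are all placeholders chosen for this example, not values Bug0 Studio prescribes.

```typescript
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // Placeholder: wherever the generated specs are saved.
  testDir: "./tests",
  // Retry only on CI, where a flaky run is more expensive to investigate.
  retries: process.env.CI ? 2 : 0,
  use: {
    // Lets specs call page.goto("/cart") instead of hard-coding the host.
    baseURL: process.env.BASE_URL ?? "http://localhost:3000",
    // Reuse a saved login session instead of scripting login in every test.
    storageState: "state.json",
  },
});
```

With a config like this in place, `npx playwright test` runs the suite locally or as a step in a CI pipeline.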

Conclusion

We are seeing a flood of AI tools right now, and it is hard to tell which ones are actually useful. But if you are looking for an alternative to ChatGPT for testing that is purpose-built for modern web apps, we invite you to check out what we've built.

It solves the context problem by using video, and it solves the setup problem by handling Auth and Storage States natively.

Give it a spin at Bug0 Studio. It’s surprisingly fun to just upload a video and watch clean code appear. We truly believe this is not just a temporary fix, but the blueprint for the best AI software testing tool of the future.

Feel free to hop into our Discord and share your thoughts or feature requests.