When you write every line yourself, you develop an intuitive feel for the code — you remember the edge cases you handled, the logic you chose, the tradeoffs you made. Bugs feel surprising but traceable.
When AI writes the code, you don't have that intuition. The code might look right, run right, and still have subtle issues you'd never catch by reading it — especially under edge conditions, concurrent users, or unexpected inputs.
Testing is the process of replacing intuition with evidence. It's how vibe-coded software earns trust.
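What "replacing intuition with evidence" looks like in practice: instead of trusting that AI-written code looks right, encode your expectations as assertions and run them. A minimal sketch, where `slugify` is a stand-in for any function the AI produced for you (the name and behavior here are illustrative assumptions, not from the text above):

```python
import re

def slugify(title: str) -> str:
    # Imagine this body was AI-generated; you didn't write it line by line.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

# Evidence, not intuition: exercise the edge cases you would never
# catch just by reading the code.
assert slugify("Hello, World!") == "hello-world"
assert slugify("   ") == ""               # whitespace-only input
assert slugify("--already--") == "already"  # leading/trailing separators
```

Each assertion pins down one behavior you care about; when the AI later rewrites the function, the tests tell you whether your expectations still hold.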
Not all bugs are equal. Before deciding what to fix — and in what order — assess each finding on two dimensions: likelihood (how often will it happen?) and severity (how bad is it when it does?).
| Severity | Definition | Action |
|---|---|---|
| Critical | Data loss, security breach, core feature completely broken | Fix before shipping anything |
| High | Major feature broken, significant UX degradation, data inconsistency | Fix in current sprint |
| Medium | Feature partially broken, workaround exists, minor data issue | Schedule and track |
| Low | Cosmetic, minor UX friction, edge case with low impact | Backlog, fix when convenient |
Then log every finding so nothing gets lost. A minimal bug log needs only what happened, how to reproduce it, what you expected, and where it stands:
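The two-dimension triage above can be sketched as a simple scoring function. The numeric weights below are illustrative assumptions, not a standard; the point is that likelihood and severity multiply, so a certain-to-happen low bug can outrank a rare medium one:

```python
# Hypothetical weights for the two triage dimensions.
SEVERITY = {"critical": 4, "high": 3, "medium": 2, "low": 1}
LIKELIHOOD = {"certain": 3, "likely": 2, "rare": 1}

def triage(findings):
    """Return findings ordered by descending priority score."""
    return sorted(
        findings,
        key=lambda f: SEVERITY[f["severity"]] * LIKELIHOOD[f["likelihood"]],
        reverse=True,
    )

findings = [
    {"id": "002", "severity": "low", "likelihood": "certain"},   # score 3
    {"id": "001", "severity": "high", "likelihood": "likely"},   # score 6
    {"id": "003", "severity": "medium", "likelihood": "rare"},   # score 2
]
print([f["id"] for f in triage(findings)])  # ['001', '002', '003']
```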
| # | What happened | Steps to reproduce | Expected | Actual | Severity | Status |
|---|---|---|---|---|---|---|
| 001 | Form submits with empty required fields | 1. Leave name blank. 2. Click Submit. | Validation error shown | Form submits, creates empty record | High | Open |
| 002 | Delete button colour incorrect on hover | 1. Hover over delete button. | Red background | Blue background | Low | Fixed |
| 003 | Search is slow with large dataset | 1. Load 5000+ records. 2. Type in search box. | < 100ms response | ~2 second lag | Medium | In Progress |
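The bug log above can also live as structured data, so findings are filterable and reportable rather than trapped in a table. A sketch, assuming field names that mirror the log's columns (the records are the three examples from the table):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    id: str
    summary: str
    severity: str  # critical / high / medium / low
    status: str    # open / in progress / fixed

log = [
    Finding("001", "Form submits with empty required fields", "high", "open"),
    Finding("002", "Delete button colour incorrect on hover", "low", "fixed"),
    Finding("003", "Search is slow with large dataset", "medium", "in progress"),
]

# Example query: everything unfixed that blocks the current sprint.
blocking = [f.id for f in log
            if f.status != "fixed" and f.severity in ("critical", "high")]
print(blocking)  # ['001']
```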
Testing findings aren't just about fixing bugs — they're intelligence about how you and your AI work together. Track patterns: