Guide · Testing
Test Like You Mean It
Plan, run, assess, and use test findings to build better — with AI doing the heavy lifting
Vibe coding moves fast. Testing is how you stay honest about what you've actually built — and turn "it seems to work" into "I know it works."
This guide covers the full loop: planning what to test, running tests effectively, understanding what findings mean, and feeding results back into better code.
1. Plan → 2. Test → 3. Assess → 4. Improve → Repeat
Tools that fail the people relying on them cause real harm. Testing is how you make sure yours doesn't.
— 01

Why Testing Matters More in Vibe Coding

When you write every line yourself, you develop an intuitive feel for the code — you remember the edge cases you handled, the logic you chose, the tradeoffs you made. Bugs feel surprising but traceable.

When AI writes the code, you don't have that intuition. The code might look right, run right, and still have subtle issues you'd never catch by reading it — especially under edge conditions, concurrent users, or unexpected inputs.

Testing is the process of replacing intuition with evidence. It's how vibe-coded software earns trust.

"AI doesn't test — it generates. The responsibility for knowing whether it works is entirely yours."
⚠ Common AI Code Issues
AI-generated code tends to fail on: missing input validation · off-by-one errors in loops · race conditions · unclosed connections · incorrect error handling · silent failures · assumptions about data format
✓ What Good Testing Catches
Behaviour that doesn't match your intent · Edge cases AI didn't account for · Regression (new change broke old thing) · Performance under realistic conditions · Security issues that look fine on surface
— 02

Phase 1: Planning Your Tests

01
Start Here
Define what "working" means
Before writing a single test, write down the acceptance criteria for each feature. Not "the button submits the form" — but "when valid data is submitted, the user sees a success message and the record appears in the list within 2 seconds."
What is the expected outcome for the happy path?
What should happen when input is invalid or empty?
What should NOT happen? (negative acceptance criteria)
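One way to make acceptance criteria concrete is to write them as executable test stubs before the feature exists. This is a minimal sketch assuming a hypothetical `submit_form` handler and return shape — the names are illustrative, not from any real codebase:

```python
# Hypothetical sketch: three acceptance criteria as test stubs.
# `submit_form` is a toy stand-in for the real submission handler.

def submit_form(data):
    """Toy implementation so the criteria below are runnable."""
    if not data.get("name"):
        return {"ok": False, "error": "name is required"}
    return {"ok": True, "record": data}

def test_happy_path_creates_record():
    # Happy path: valid data is accepted and a record comes back.
    result = submit_form({"name": "Ada"})
    assert result["ok"] is True
    assert result["record"]["name"] == "Ada"

def test_empty_input_is_rejected():
    # Invalid/empty input: the user sees a meaningful error.
    result = submit_form({})
    assert result["ok"] is False
    assert "required" in result["error"]

def test_rejected_input_creates_nothing():
    # Negative criterion: a rejected submission must NOT create a record.
    result = submit_form({"name": ""})
    assert "record" not in result
```

Each test maps one-to-one onto a criterion, so when you hand the feature to the AI, "done" has a precise definition.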
02
Prioritise
Identify your critical paths
Not everything needs the same test depth. Focus testing energy on: the core user journey, data-handling logic, anything that touches money/auth/permissions, and anything the AI had to make complex decisions about.
What's the single most important thing the app does?
What would destroy user trust if it broke silently?
What was most complex for the AI to generate?
03
Choose Type
Match test type to the risk
Different layers need different testing. Don't apply the same approach everywhere. Unit tests for logic, integration tests for data flow, manual exploratory for UX and edge cases, automated e2e for the critical journey.
Is this a logic/calculation issue? → Unit test
Is this a flow issue? → Integration or E2E test
Is this a feel/UX issue? → Manual exploration
04
Use AI
Ask AI to write the test plan
Give your AI the feature description and acceptance criteria, then ask it to generate a test plan. It's excellent at thinking of edge cases you'd miss and writing the boilerplate. You review and approve.
"What edge cases should I test for [feature]?"
"Generate a test file for this component"
"What inputs might break this function?"
Test Planning Prompt Template
I've built the following feature: [describe it briefly]
Here's the code: [paste relevant code or file]

Please generate a comprehensive test plan covering:
1. Happy path scenarios (normal expected usage)
2. Edge cases and boundary conditions
3. Invalid/unexpected input handling
4. Things that could silently fail
5. Any security or data integrity concerns

For each scenario, specify:
- What to test
- What input to use
- What the expected output/behaviour is

Then write the test file using [Jest/Vitest/pytest — your framework].
— 03

Types of Testing & When to Use Them

Unit Testing
Tests a single function or component in isolation. Fast, precise, and easy to automate. Best for logic-heavy code — calculations, transformations, validators.
→ "Does calculateTotal() return the right value with a 20% discount applied?"
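The discount question above can be answered with a few lines of test code. This sketch assumes a hypothetical `calculate_total` that takes a fractional discount rate; the function and its signature are illustrative:

```python
# Hypothetical unit under test: apply a fractional discount to a subtotal.
def calculate_total(subtotal, discount_rate=0.0):
    if not 0.0 <= discount_rate <= 1.0:
        raise ValueError("discount_rate must be between 0 and 1")
    return round(subtotal * (1 - discount_rate), 2)

def test_twenty_percent_discount():
    # The exact question from the guide: is 20% off 100 equal to 80?
    assert calculate_total(100.00, 0.20) == 80.00

def test_no_discount_returns_subtotal():
    assert calculate_total(59.99) == 59.99

def test_invalid_discount_raises():
    # A rate over 100% is a caller bug — fail loudly, not silently.
    try:
        calculate_total(100.00, 1.5)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for discount_rate > 1")
```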
Integration Testing
Tests that multiple parts work together. Catches the bugs that live in the gaps between components — correct in isolation, broken when combined.
→ "When a user submits the form, does the data actually get saved to the database?"
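A minimal integration test for that question can use an in-memory database so it runs fast and leaves no state behind. Here is a sketch using SQLite; the `tasks` table and `save_task` function are assumptions for illustration:

```python
# Integration-style sketch: does saving actually persist to the database?
import sqlite3

def save_task(conn, title):
    # The "submit" side of the flow: write through the real DB layer.
    conn.execute("INSERT INTO tasks (title) VALUES (?)", (title,))
    conn.commit()

def test_submit_persists_to_database():
    # In-memory SQLite: real SQL, no cleanup needed.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT)")
    save_task(conn, "write the guide")
    # The assertion reads the data back — proving the pieces connect.
    rows = conn.execute("SELECT title FROM tasks").fetchall()
    assert rows == [("write the guide",)]
```

The point is that the test crosses the boundary — it exercises the write path and verifies via the read path, which is exactly where isolated-but-correct components break.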
End-to-End (E2E) Testing
Simulates a real user journey through the full application stack. Slow and brittle but catches what everything else misses. Reserve for your most critical flows.
→ "Can a new user sign up, create a task, and see it in their dashboard?"
Manual / Exploratory Testing
You use the product like a real user would — including trying unexpected things. Essential for UX, responsiveness, and the weird edge cases no one thought to script.
→ "What happens if I paste 10,000 characters into the text field?"
Regression Testing
Re-running old tests after a change to ensure nothing broke. Critical in vibe coding where AI changes can have unintended side effects. Automate this so it happens every time.
→ Run your full test suite before merging any AI-generated change
Smoke Testing
A quick check that the most essential functions still work before you do anything else. Run this immediately after deployment or any significant AI-assisted change.
→ "Does the app start? Can I log in? Can I do the core action?"
Performance Testing
Checks that the app works under realistic load. AI-generated code can have surprising performance characteristics — N+1 queries, re-renders on every keystroke, etc.
→ "Does the search still feel fast with 10,000 records?"
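A rough performance check can live in the same test suite. This sketch times a naive linear search over 10,000 records against an assumed 100 ms budget — both the dataset and the budget are illustrative numbers, not a benchmark methodology:

```python
# Illustrative performance check: search 10,000 records within a time budget.
import time

records = [f"item-{i}" for i in range(10_000)]

def search(query):
    # Deliberately naive linear scan — the thing being measured.
    return [r for r in records if query in r]

def test_search_stays_under_budget():
    start = time.perf_counter()
    results = search("item-99")
    elapsed = time.perf_counter() - start
    assert len(results) > 0
    # Assumed budget: fail the test if search exceeds 100 ms.
    assert elapsed < 0.1, f"search took {elapsed:.3f}s"
```

A timing assertion like this is coarse (it varies with the machine), but it will still catch an accidental N+1 query or an O(n²) regression long before a user does.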
Accessibility Testing
Verifies the app is usable with keyboard navigation, screen readers, and in low-vision scenarios. AI often generates visually correct but accessibility-poor interfaces.
→ Run axe DevTools or Lighthouse. Tab through every interactive element.
The Vibe Coder's Testing Priority Order
Start with: 1) manual smoke test → 2) critical path integration test → 3) unit tests for complex logic → 4) automated regression suite. Don't try to test everything at once — build coverage incrementally as the codebase stabilises.
— 04

Phase 2: Running Tests Effectively

1. Test before you change
Always run your existing tests before making AI-assisted changes. This creates a baseline — you'll know if a problem existed before your edit or was introduced by it.
2. Write the test first when possible
For new features, try writing the test (or acceptance criteria) before asking AI to build the feature. This clarifies requirements and gives the AI something concrete to satisfy. Even a rough test sketch helps.
3. Run tests after every significant change
Don't accumulate 5 AI-assisted changes before running tests. The more you stack, the harder it is to isolate which change caused a failure. One change → test → commit is the ideal loop.
4. Test at the boundary
Always test: exactly at the limit (if max 100 chars, test 100), just over (101), and with empty/null values. AI code often handles the middle correctly but breaks at the edges.
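The boundary rule above translates directly into tests. This sketch assumes a hypothetical 100-character comment validator — the function and limit are made up to illustrate the at/over/empty pattern:

```python
# Hypothetical validator with a 100-character limit.
MAX_LEN = 100

def validate_comment(text):
    if text is None or text == "":
        return False
    return len(text) <= MAX_LEN

def test_exactly_at_limit_is_accepted():
    # Exactly at the limit (100): should pass.
    assert validate_comment("x" * 100) is True

def test_just_over_limit_is_rejected():
    # Just over (101): should fail — classic off-by-one territory.
    assert validate_comment("x" * 101) is False

def test_empty_and_none_are_rejected():
    # Empty/null values: the edges AI code most often mishandles.
    assert validate_comment("") is False
    assert validate_comment(None) is False
```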
5. Use AI to debug failing tests
When a test fails, paste the test, the implementation, and the error output into your AI and ask it to explain why it's failing and how to fix it. This is one of the highest-value AI interactions in the testing workflow.
Manual Testing Checklist (per feature)
Automated Test Setup Prompt
# Ask AI to set up your test infrastructure
"Set up a testing environment for this project. Stack: [your stack].
I need:
- A test runner configured and working
- One example unit test for [function name]
- One integration test for [key flow]
- A script to run all tests
Keep the setup minimal. Show me what to run."
— 05

Phase 3: Assessing Your Findings

Not all bugs are equal. Before deciding what to fix — and in what order — assess each finding against two dimensions: how likely is it to happen, and how bad is it when it does?

The Severity/Likelihood Matrix

High impact, low likelihood → Monitor: high impact but rare. Log it, document it, fix it when you have capacity.
High impact, high likelihood → Fix Immediately: high impact AND common. Stop everything. Fix this now.
Low impact, low likelihood → Backlog: low impact, low frequency. Track it but deprioritise.
Low impact, high likelihood → Fix Soon: happens often but low impact. Schedule it for the next session.

Severity Classification

Severity | Definition | Action
Critical | Data loss, security breach, core feature completely broken | Fix before shipping anything
High | Major feature broken, significant UX degradation, data inconsistency | Fix in the current sprint
Medium | Feature partially broken, workaround exists, minor data issue | Schedule and track
Low | Cosmetic, minor UX friction, low-impact edge case | Backlog; fix when convenient
Finding Assessment Prompt
"I found this bug: [describe it]. Here's the code: [paste]. Help me understand: 1) the root cause, 2) what other areas of the app might be affected, 3) the severity, 4) the fix."

The Bug Log: What to Record

# | What happened | Steps to reproduce | Expected | Actual | Severity | Status
001 | Form submits with empty required fields | 1. Leave name blank. 2. Click Submit. | Validation error shown | Form submits, creates empty record | High | Open
002 | Delete button colour incorrect on hover | 1. Hover over delete button. | Red background | Blue background | Low | Fixed
003 | Search is slow with large dataset | 1. Load 5000+ records. 2. Type in search box. | < 100ms response | ~2 second lag | Medium | In Progress
— 06

Phase 4: Using Findings to Improve

🔍 Find the bug → 🧠 Understand the root cause → 🔧 Fix with AI → ✅ Write a test that catches it → 🚀 Never regress

The Fix Workflow

1. Isolate before fixing
Before asking AI to fix anything, reproduce the problem in the simplest possible case. The more isolated the example, the more accurate the fix will be.
2. Explain, then fix
Ask the AI to explain the root cause before implementing the fix. If the explanation doesn't make sense to you, push back. A fix you don't understand is a future mystery bug.
3. Write the regression test first
Before applying the fix: write a test that currently fails because of the bug. Apply the fix. The test should now pass. This proves the fix works and protects against regression.
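Here is what the regression-first loop looks like in miniature, using a made-up off-by-one bug: an earlier `paginate` dropped the final partial page. The test below encodes the expected behaviour, so it fails against the buggy version and passes once the fix is in — the function and bug are illustrative:

```python
# Fixed version; the hypothetical original bug dropped the last partial page
# by iterating only over full page_size chunks.
def paginate(items, page_size):
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]

def test_final_partial_page_is_included():
    # Regression test written BEFORE the fix: 7 items at size 3 -> 3 pages,
    # with the lone seventh item on its own final page.
    pages = paginate(list(range(7)), 3)
    assert len(pages) == 3
    assert pages[-1] == [6]
```

Because the test exists independently of the fix, it keeps failing the moment anyone — human or AI — reintroduces the same mistake.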
4. Check for similar issues
Many bugs are a symptom of a pattern. Ask: "Are there other places in the codebase where this same mistake might have been made?" Then search or ask AI to scan for similar patterns.
5. Run the full suite
After fixing, run all your tests — not just the new one. AI fixes frequently solve one problem and silently introduce another. The suite will catch it before users do.

Using Patterns to Improve Prompting

Testing findings aren't just about fixing bugs — they're intelligence about how you and your AI work together. Track patterns:

Pattern: Missing Validation
AI often skips input validation. Add to your standard prompt: "Always include input validation and error states. Never assume inputs are valid."
Pattern: Unhandled Async Errors
AI sometimes omits try/catch on async operations. Add: "Wrap all async calls in try/catch with meaningful error handling."
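The same pattern applies in Python's asyncio. This sketch shows the shape the prompt addition asks for — a caller that wraps the awaited call and returns a meaningful error instead of letting the task die silently. `fetch_profile` is a stand-in that always fails, purely for illustration:

```python
# Sketch: wrap async calls so failures surface as handled errors.
import asyncio

async def fetch_profile(user_id):
    # Stand-in for a network call; simulates an upstream failure.
    raise ConnectionError("upstream unavailable")

async def load_profile(user_id):
    try:
        return await fetch_profile(user_id)
    except ConnectionError as exc:
        # Meaningful fallback instead of a silent crash or unhandled task error.
        return {"error": str(exc)}

result = asyncio.run(load_profile(42))
```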
Pattern: No Loading States
AI-generated UIs often have no loading indicators. Add: "Always show a loading state during async operations. Disable interactive elements while loading."
Build Your Own System Prompt
Keep a running list of your recurring bugs. Paste them into your next session as "things to always do." This permanently improves AI output for your codebase.
— 07

Quick Reference: Prompts for Every Testing Stage

Generate Tests
# Unit tests
"Write unit tests for this function. Cover: happy path, empty input, boundary values, invalid types. Use [Jest/Vitest/pytest]."

# Component tests
"Write tests for this React component. Test: renders correctly, handles click events, shows error state, and is accessible."
Debug Failures
# Failing test
"This test is failing: [paste test]
Here's the implementation: [paste code]
Error: [paste error]
Explain why it's failing and suggest a fix."
Review for Issues
# Code review prompt
"Review this code for:
1. Missing error handling
2. Security vulnerabilities
3. Edge cases not handled
4. Performance issues
5. Missing input validation
Be specific. Show line numbers."