When you write every line yourself, you develop an intuitive feel for the code — you remember the edge cases you handled, the logic you chose, the tradeoffs you made. Bugs feel surprising but traceable.
When AI writes the code, you don't have that intuition. The code might look right, run right, and still have subtle issues you'd never catch by reading it — especially under edge conditions, concurrent users, or unexpected inputs.
Testing is the process of replacing intuition with evidence. It's how vibe-coded software earns trust.
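What "replacing intuition with evidence" looks like in practice: instead of trusting that AI-written code looks right, encode your expectations as assertions and run them. A minimal sketch, where `slugify` is a stand-in for any function the AI produced for you (the name and behavior here are illustrative assumptions, not from the text above):

```python
import re

def slugify(title: str) -> str:
    # Imagine this body was AI-generated; you didn't write it line by line.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

# Evidence, not intuition: exercise the edge cases you would never
# catch just by reading the code.
assert slugify("Hello, World!") == "hello-world"
assert slugify("   ") == ""               # whitespace-only input
assert slugify("--already--") == "already"  # leading/trailing separators
```

Each assertion pins down one behavior you care about; when the AI later rewrites the function, the tests tell you whether your expectations still hold.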
Not all bugs are equal. Before deciding what to fix — and in what order — assess each finding on two dimensions: likelihood (how often will it happen?) and severity (how bad is it when it does?).
| Severity | Definition | Action |
|---|---|---|
| Critical | Data loss, security breach, core feature completely broken | Fix before shipping anything |
| High | Major feature broken, significant UX degradation, data inconsistency | Fix in current sprint |
| Medium | Feature partially broken, workaround exists, minor data issue | Schedule and track |
| Low | Cosmetic, minor UX friction, edge case with low impact | Backlog, fix when convenient |
Then log every finding so nothing gets lost. A minimal bug log needs only what happened, how to reproduce it, what you expected, and where it stands:
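The two-dimension triage above can be sketched as a simple scoring function. The numeric weights below are illustrative assumptions, not a standard; the point is that likelihood and severity multiply, so a certain-to-happen low bug can outrank a rare medium one:

```python
# Hypothetical weights for the two triage dimensions.
SEVERITY = {"critical": 4, "high": 3, "medium": 2, "low": 1}
LIKELIHOOD = {"certain": 3, "likely": 2, "rare": 1}

def triage(findings):
    """Return findings ordered by descending priority score."""
    return sorted(
        findings,
        key=lambda f: SEVERITY[f["severity"]] * LIKELIHOOD[f["likelihood"]],
        reverse=True,
    )

findings = [
    {"id": "002", "severity": "low", "likelihood": "certain"},   # score 3
    {"id": "001", "severity": "high", "likelihood": "likely"},   # score 6
    {"id": "003", "severity": "medium", "likelihood": "rare"},   # score 2
]
print([f["id"] for f in triage(findings)])  # ['001', '002', '003']
```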
| # | What happened | Steps to reproduce | Expected | Actual | Severity | Status |
|---|---|---|---|---|---|---|
| 001 | Form submits with empty required fields | 1. Leave name blank. 2. Click Submit. | Validation error shown | Form submits, creates empty record | High | Open |
| 002 | Delete button colour incorrect on hover | 1. Hover over delete button. | Red background | Blue background | Low | Fixed |
| 003 | Search is slow with large dataset | 1. Load 5000+ records. 2. Type in search box. | < 100ms response | ~2 second lag | Medium | In Progress |
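The bug log above can also live as structured data, so findings are filterable and reportable rather than trapped in a table. A sketch, assuming field names that mirror the log's columns (the records are the three examples from the table):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    id: str
    summary: str
    severity: str  # critical / high / medium / low
    status: str    # open / in progress / fixed

log = [
    Finding("001", "Form submits with empty required fields", "high", "open"),
    Finding("002", "Delete button colour incorrect on hover", "low", "fixed"),
    Finding("003", "Search is slow with large dataset", "medium", "in progress"),
]

# Example query: everything unfixed that blocks the current sprint.
blocking = [f.id for f in log
            if f.status != "fixed" and f.severity in ("critical", "high")]
print(blocking)  # ['001']
```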
Testing findings aren't just about fixing bugs — they're intelligence about how you and your AI work together. Track patterns: