Complete Mode in Action: Real Examples
Learn through annotated examples showing how Complete Problem Solving Mode's multi-pass discovery reveals hidden issues that standard prompting misses.
Example 1: API Performance Degradation
The Problem
Production API response times increased from 150ms (p95) to 800ms over 24 hours. No obvious errors in logs.
Standard Approach (Single Pass)
User: "API is slow, check the logs?"
AI: "Found slow database queries. Add an index on users.email."
Result: Performance improves to 600ms but still 4x baseline. Root causes missed: connection pool exhaustion and memory leak.
Complete Mode Approach (Multi-Pass)
@complete-mode
Task = Fix API response time degradation
Context = REST API, p95 went 150ms → 800ms in 24h, have logs + Datadog + DB access
[DONE overlay — Software/DevOps]
- Performance: p95 back to baseline (≤200ms) for ≥30min
- Logs clean: no errors last 2min
- Zero criticals
- Evidence: Datadog dashboard, load test results
scope:runtime depth:deep strict:on
Pass 1 — Application Logs:
- Found: Slow queries (2-4s on users table)
- Action: Added index on users.email (see sketch below)
- Result: Queries now 50ms, API improves to 450ms
- Re-scan trigger: Still not at baseline
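The Pass 1 fix is a single statement. A minimal migration sketch, assuming node-postgres (the index name and module layout are illustrative, not from the incident):

```typescript
// Minimal index-migration sketch (assumes node-postgres; index name is illustrative).
import { Pool } from 'pg';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

async function addEmailIndex(): Promise<void> {
  // CONCURRENTLY avoids blocking writes while the index builds on a busy table.
  await pool.query(
    'CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email ON users (email)'
  );
}

addEmailIndex().finally(() => pool.end());
```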
Pass 2 — Database Connection Metrics:
- Found: "Connection pool exhausted" every 30s
- Action: Increased pool size 10 → 30 (see sketch below)
- Result: p95 improves to 180ms
- Re-scan trigger: Verify under sustained load
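The Pass 2 change is a pool-sizing tweak. A sketch assuming node-postgres (the option names are pg's Pool options; the values match this example):

```typescript
// Shared connection pool, resized after the exhaustion errors found in Pass 2.
import { Pool } from 'pg';

export const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 30,                        // raised from 10 after "Connection pool exhausted"
  idleTimeoutMillis: 30_000,      // close clients that sit idle this long
  connectionTimeoutMillis: 5_000, // fail fast instead of queueing indefinitely
});
```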
Pass 3 — Load Testing:
- Found: Memory grows to 85% under sustained 100 req/s
- Action: Fixed missing client.release() in error handlers (see sketch below)
- Result: Memory stable at 45%, p95=165ms
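The Pass 3 leak came from checked-out clients never being returned when a query threw. A minimal sketch of the fix, again assuming node-postgres (the query and module path are illustrative):

```typescript
// Always release the checked-out client, even when the query throws.
import { pool } from './db'; // hypothetical module exporting the shared Pool

export async function getUserByEmail(email: string) {
  const client = await pool.connect();
  try {
    const { rows } = await client.query(
      'SELECT id, email FROM users WHERE email = $1',
      [email]
    );
    return rows[0] ?? null;
  } finally {
    client.release(); // previously skipped in the error path, draining the pool
  }
}
```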
Final Evidence:
- Datadog: p95=165ms for 30min (see screenshot)
- Load test: 1000 reqs, p95=170ms, 0 errors
- PR #789 with all changes + tests
What Was Caught: The standard approach found 1 of 3 issues. Complete Mode found all 3 by using a different discovery method on each pass.
Example 2: Data Analysis — False Signal
The Problem
Product team claims new checkout flow increased conversion by 8%.
Standard Approach
User: "Did new checkout increase conversion?"
AI: "Yes! Conversion went from 2.1% to 2.9%. That's 38% relative improvement."
Result: The team announces success. It is later discovered that the increase was due to a Black Friday sale starting the same day; actual conversion slightly decreased.
Complete Mode Approach
@complete-mode
Task = Validate conversion increase claim
Context = Launched 2 weeks ago, 50K sessions, have Amplitude + DB
[DONE overlay — Data/Analytics]
- Objective met: increase is real with `p<0.05`
- Statistical validity: controlled for confounds
- Reproducibility: SQL + Amplitude provided
- Zero criticals
- Evidence: plots, stats, query, dashboard
scope:experiment-design depth:deep strict:on
Pass 1 — Basic Metrics:
- Found: 2.1% → 2.9% (p=0.03, appears significant)
- Re-scan trigger: Check for external events
Pass 2 — External Events:
- Found: Black Friday sale launched same day
- Action: Exclude sale days from analysis (see sketch below)
- Result: 2.1% → 2.0% (p=0.68, NOT significant)
- Re-scan trigger: Segment by device/demographic
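A sketch of the Pass 2 adjustment: recompute conversion with sale-window sessions excluded. The types and dates below are assumptions for illustration, not values from the real dataset:

```typescript
// Recompute conversion rate excluding the confounded sale window (dates hypothetical).
interface Session {
  startedAt: Date;
  converted: boolean;
}

const SALE_START = new Date('2024-11-29T00:00:00Z');
const SALE_END = new Date('2024-12-02T23:59:59Z');

function conversionRateExcludingSale(sessions: Session[]): number {
  const eligible = sessions.filter(
    (s) => s.startedAt < SALE_START || s.startedAt > SALE_END
  );
  if (eligible.length === 0) return 0;
  const converted = eligible.filter((s) => s.converted).length;
  return converted / eligible.length;
}
```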
Pass 3 — Segmentation:
- Found: Mobile conversion dropped 1.8% → 1.5% (p=0.04)
- Desktop: No change (2.4% → 2.6%, p=0.15)
- Critical finding: Mobile UX regression
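The per-segment significance checks in Pass 3 need nothing more than a two-proportion z-test. A self-contained sketch (no statistics library assumed), called with each segment's converted and total counts before and after launch:

```typescript
// Two-proportion z-test: are two conversion rates significantly different?
function twoProportionZTest(x1: number, n1: number, x2: number, n2: number) {
  const p1 = x1 / n1;
  const p2 = x2 / n2;
  const pooled = (x1 + x2) / (n1 + n2);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2));
  const z = (p1 - p2) / se;
  const pValue = 2 * (1 - normalCdf(Math.abs(z))); // two-sided
  return { z, pValue };
}

function normalCdf(x: number): number {
  return 0.5 * (1 + erf(x / Math.SQRT2));
}

function erf(x: number): number {
  // Abramowitz-Stegun 7.1.26 approximation, absolute error below ~1.5e-7.
  const sign = x < 0 ? -1 : 1;
  const a = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * a);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-a * a));
}
```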
Final Evidence:
- SQL queries showing confound-adjusted analysis
- Time series plot highlighting sale spike
- Segment breakdown revealing mobile issue
- Statistical tests: overall p=0.68, mobile p=0.04
What Was Caught: The standard approach missed the confound and the mobile regression. Complete Mode prevented a false announcement and identified an actionable UX fix.
Example 3: Deploy Failure (Vercel)
Standard Approach
User: "Vercel deploy failing, help?"
AI: "Looks like a build error. Check your TypeScript config."
Result: One issue is fixed, but the deploy still fails with a different error. It takes 5 iterations to resolve.
Complete Mode Approach
@complete-mode
Task = Fix Vercel production deploy
Context = Next.js 14 + TypeScript + Supabase, last success 3 days ago
[DONE overlay — Software/DevOps]
- Cloud health: /api/health 200 for ≥30min
- Logs clean: last 2min no errors
- Local gates: build + test + lint pass
- Zero criticals
- Evidence: health URL, logs, test output, deploy URL
scope:build depth:deep strict:on
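The overlay's cloud-health gate polls /api/health. If the project does not already expose one, a minimal Next.js App Router handler is enough (the file path follows the App Router convention; the payload shape is an assumption):

```typescript
// app/api/health/route.ts
// Minimal endpoint backing the "/api/health 200 for ≥30min" gate.
export async function GET() {
  return Response.json({ status: 'ok', timestamp: new Date().toISOString() });
}
```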
Pass 1 — Build Logs:
| ID | Sev | Category | Evidence | Root Cause | Action | Status | Conf |
|----|-----|----------|----------|------------|--------|--------|------|
| P-01 | P0 | Build | Type error dashboard/page.tsx:45 | Supabase types out of sync | Regen types | Resolved | 0.9 |
Pass 2 — Runtime Logs (Preview Deploy):
| ID | Sev | Category | Evidence | Root Cause | Action | Status | Conf |
|----|-----|----------|----------|------------|--------|--------|------|
| P-02 | P0 | Config | "Missing SUPABASE_SERVICE_KEY" | Not in Vercel env | Add to dashboard | Resolved | 1.0 |
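P-02 can also be caught before deploy with a startup guard that fails loudly when a required secret is absent. A sketch (only SUPABASE_SERVICE_KEY comes from this example; the other name is hypothetical):

```typescript
// Fail fast at build/boot when required secrets are missing from the environment.
const REQUIRED_ENV = ['SUPABASE_URL', 'SUPABASE_SERVICE_KEY'] as const;

for (const name of REQUIRED_ENV) {
  if (!process.env[name]) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
}
```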
Pass 3 — Manual Testing:
| ID | Sev | Category | Evidence | Root Cause | Action | Status | Conf |
|----|-----|----------|----------|------------|--------|--------|------|
| P-03 | P1 | Auth | OAuth callback 500, redirect_uri | Not in Supabase allowed list | Add preview URL pattern | Resolved | 0.85 |
What Was Caught: All 3 issues were found in sequence using different discovery methods (build logs → runtime logs → manual test). The standard approach would have required 5+ back-and-forth iterations.
Key Patterns Across Examples
Pattern 1: Different Discovery Methods Reveal Different Issues
| Method | Catches |
|---|---|
| Application logs | Runtime errors, exceptions |
| System metrics | Resource exhaustion, performance |
| Load testing | Behavior under stress |
| Manual testing | Integration issues |
| External data | Confounds, timing |
| Code review | Logic errors, security |
Lesson: Single method = single class of issues found. Multi-method = comprehensive discovery.
Pattern 2: Independent Verification Catches False Positives
- Primary: Tests pass → Independent: Load test reveals memory leak
- Primary: Metric improved → Independent: Confound analysis shows it's spurious
Lesson: Two verification methods prevent false confidence.
Pattern 3: Re-scan Triggers
Know when to keep going:
- ✅ Baseline not restored
- ✅ Only 1 issue found
- ✅ Fix seems "too easy"
- ✅ High-impact context
- ✅ Cannot explain full symptom
Before/After Comparison
Typical Standard Prompting Session
User: Problem X
AI: Try solution A
User: *applies A*
User: Still broken
AI: Try solution B
User: *applies B*
User: Still broken
AI: Try solution C
...
Characteristics:
- 5-10 iterations
- Reactive fixing
- No audit trail
- Partial solutions
- No verification strategy
Complete Mode Session
User: @complete-mode [Task + Context + Overlay]
AI: Pass 1 discovery → Fix P-01 → Verify → Re-scan
Pass 2 discovery → Fix P-02 → Verify → Re-scan
Pass 3 discovery → Fix P-03 → Verify → DONE
Evidence pack + Problem Register + What to Watch
Characteristics:
- 1 structured request
- Proactive discovery
- Full audit trail
- Comprehensive solution
- Multi-method verification
When Complete Mode Saves Time
Seems slower: 3 passes vs 1 quick fix
Actually faster when:
- Standard approach would need 5+ iterations to find all issues
- Problem recurs within days/weeks (rework cost)
- Handoff required (documentation time)
- High stakes (mistake cost)
ROI calculation (illustrative):
- Complete Mode: roughly 2x the time of a quick single-pass fix up front
- Avoided rework: roughly 10x the time that recurrence and follow-up fixes would have cost
- Net: about a 5x efficiency gain
Try It Yourself: Practice Prompts
Exercise 1: Debug a "Working" Feature
@complete-mode
Task = Verify password reset flow is production-ready
Context = Feature branch merged, manual test passed once
[DONE overlay — Software/DevOps]
[Add appropriate gates]
depth:deep strict:on
Expected discoveries:
- Email deliverability issues
- Token expiration edge cases
- Rate limiting gaps
- Security header misconfigurations
Exercise 2: Validate an A/B Test
@complete-mode
Task = Confirm blue CTA button outperformed red
Context = 1 week test, 5K users each variant, 12% vs 10% conversion
[DONE overlay — Data/Analytics]
[Add appropriate gates]
scope:experiment-design depth:deep
Expected discoveries:
- Sample size adequacy (see the sketch after this list)
- Multiple testing correction
- Temporal confounds (day of week, time of day)
- Segment heterogeneity
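For the sample-size check flagged above, the usual normal-approximation formula is enough to sanity-check whether 5K users per variant can detect 10% vs 12%. A sketch with conventional defaults (alpha = 0.05 two-sided, power = 0.80; neither value is specified by the exercise):

```typescript
// Approximate required sample size per variant for a two-proportion comparison.
function sampleSizePerVariant(p1: number, p2: number): number {
  const zAlpha = 1.96;  // two-sided 95% confidence
  const zBeta = 0.8416; // 80% power
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  const delta = Math.abs(p1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (delta * delta));
}

// Exercise 2's numbers: roughly 3,800+ per variant are needed,
// so 5K per variant is plausibly adequate, provided the other checks also pass.
console.log(sampleSizePerVariant(0.10, 0.12));
```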
Common Mistakes
Mistake 1: Using the same discovery method multiple times
Fix: Explicitly vary methods each pass (logs → metrics → tests)
Mistake 2: Accepting "looks good" without independent verification
Fix: Always require a second validation approach
Mistake 3: Stopping after the first issue is resolved
Fix: Add require_passes:3 to force deeper exploration
Mistake 4: Not documenting the Problem Register
Fix: Maintain the register from the start; it's your audit trail
Next Steps
- Pick a recent "fixed" issue — Rerun Complete Mode on it and see what was missed
- Practice with low-stakes problems — Build comfort with the pattern
- Compare outcomes — Track recurrence rates before/after adoption
- Customize gates — Adapt domain overlays to your standards
Related Resources
- Complete Mode Framework — Core concepts
- Quick Reference — Templates and toggles
- Software/DevOps Guide — Domain-specific patterns
- Windsurf Rule Setup — Installation