Intermediate · 10 min read · Guides

Complete Mode in Action: Real Conversation Examples

See how Complete Problem Solving Mode catches issues that standard prompting misses through multi-pass discovery and independent verification.

Published October 29, 2025



Example 1: API Performance Degradation

The Problem

Production API response times increased from 150ms (p95) to 800ms over 24 hours. No obvious errors in logs.

Standard Approach (Single Pass)

User: "API is slow, check the logs?"

AI: "Found slow database queries. Add an index on users.email."

Result: Performance improves to 600ms but is still 4x baseline. Two root causes are missed: connection pool exhaustion and a memory leak.


Complete Mode Approach (Multi-Pass)

@complete-mode

Task = Fix API response time degradation
Context = REST API, p95 went 150ms → 800ms in 24h, have logs + Datadog + DB access

[DONE overlay — Software/DevOps]
- Performance: p95 back to baseline (≤200ms) for ≥30min
- Logs clean: no errors last 2min
- Zero criticals
- Evidence: Datadog dashboard, load test results

scope:runtime depth:deep strict:on

Pass 1 — Application Logs:

  • Found: Slow queries (2-4s on users table)
  • Action: Added index on users.email
  • Result: Queries now 50ms, API improves to 450ms
  • Re-scan trigger: Still not at baseline
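A minimal sketch of the Pass 1 fix, assuming a Postgres database accessed through node-postgres; the `users.email` column comes from the example, while the connection setup and index name are illustrative:

```typescript
import { Pool } from "pg";

// Assumes DATABASE_URL is set in the environment (illustrative).
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// CONCURRENTLY builds the index without blocking writes on a live table.
// It must run outside an explicit transaction, which a plain pool.query does.
async function addEmailIndex(): Promise<void> {
  await pool.query(
    "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email ON users (email)"
  );
}
```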

Pass 2 — Database Connection Metrics:

  • Found: "Connection pool exhausted" every 30s
  • Action: Increased pool size 10 → 30
  • Result: p95 improves to 180ms
  • Re-scan trigger: Verify under sustained load

Pass 3 — Load Testing:

  • Found: Memory grows to 85% under sustained 100 req/s
  • Action: Fixed missing client.release() in error handlers
  • Result: Memory stable at 45%, p95=165ms
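Passes 2 and 3 together, sketched with node-postgres: raise the pool ceiling and release clients on every code path. The pool sizes match the example; the query, column names, and everything else around them are assumptions:

```typescript
import { Pool } from "pg";

// Pass 2: raise the pool ceiling so traffic bursts don't exhaust connections.
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 30, // was 10 in the example
});

// Pass 3: release the client even when the query throws.
// The leak came from error handlers that skipped client.release().
export async function getUserByEmail(email: string) {
  const client = await pool.connect();
  try {
    const { rows } = await client.query(
      "SELECT id, email FROM users WHERE email = $1",
      [email]
    );
    return rows[0] ?? null;
  } finally {
    client.release(); // returns the connection to the pool on every path
  }
}
```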

Final Evidence:

  • Datadog: p95=165ms for 30min (see screenshot)
  • Load test: 1000 reqs, p95=170ms, 0 errors
  • PR #789 with all changes + tests

What Was Caught: The standard approach found only 1 of 3 issues. Complete Mode found all 3 by using a different discovery method on each pass.


Example 2: Data Analysis — False Signal

The Problem

Product team claims new checkout flow increased conversion by 8%.

Standard Approach

User: "Did new checkout increase conversion?"

AI: "Yes! Conversion went from 2.1% to 2.9%. That's 38% relative improvement."

Result: The team announces success. It later turns out the increase was driven by a Black Friday sale that started the same day; actual conversion slightly decreased.


Complete Mode Approach

@complete-mode

Task = Validate conversion increase claim
Context = Launched 2 weeks ago, 50K sessions, have Amplitude + DB

[DONE overlay — Data/Analytics]
- Objective met: increase is real with `p<0.05`
- Statistical validity: controlled for confounds
- Reproducibility: SQL + Amplitude provided
- Zero criticals
- Evidence: plots, stats, query, dashboard

scope:experiment-design depth:deep strict:on

Pass 1 — Basic Metrics:

  • Found: 2.1% → 2.9% (p=0.03, appears significant)
  • Re-scan trigger: Check for external events
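The Pass 1 check is a standard two-proportion z-test. A minimal sketch, assuming you can pull raw session and conversion counts for each period; the function and the placeholder counts at the bottom are illustrative, not the example's actual data:

```typescript
// Two-proportion z-test for conversion rates (normal approximation).
function twoProportionZTest(conv1: number, n1: number, conv2: number, n2: number) {
  const p1 = conv1 / n1;
  const p2 = conv2 / n2;
  const pooled = (conv1 + conv2) / (n1 + n2);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2));
  const z = (p2 - p1) / se;
  return { p1, p2, z, pValue: 2 * (1 - normalCdf(Math.abs(z))) };
}

// Abramowitz–Stegun polynomial approximation of the standard normal CDF (x >= 0).
function normalCdf(x: number): number {
  const t = 1 / (1 + 0.2316419 * x);
  const d = Math.exp((-x * x) / 2) / Math.sqrt(2 * Math.PI);
  const poly =
    t *
    (0.31938153 +
      t * (-0.356563782 + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
  return 1 - d * poly;
}

// Arbitrary placeholder counts, not the article's data.
console.log(twoProportionZTest(120, 4800, 150, 5200));
```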

Pass 2 — External Events:

  • Found: Black Friday sale launched same day
  • Action: Exclude sale days from analysis
  • Result: 2.1% → 2.0% (p=0.68, NOT significant)
  • Re-scan trigger: Segment by device/demographic

Pass 3 — Segmentation:

  • Found: Mobile conversion dropped 1.8% → 1.5% (p=0.04)
  • Desktop: No change (2.4% → 2.6%, p=0.15)
  • Critical finding: Mobile UX regression
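A sketch of the kind of query behind Passes 2 and 3: exclude the sale window, then break conversion out by device. The table and column names (`sessions`, `converted`, `device_type`) are assumed, not the team's actual schema:

```typescript
// Confound-adjusted, segmented conversion query (hypothetical schema).
// $1/$2 bound the analysis window; $3/$4 bound the Black Friday sale window.
const conversionByDevice = `
  SELECT
    device_type,
    COUNT(*)                          AS sessions,
    COUNT(*) FILTER (WHERE converted) AS conversions,
    ROUND(100.0 * COUNT(*) FILTER (WHERE converted) / COUNT(*), 2) AS conversion_pct
  FROM sessions
  WHERE started_at >= $1
    AND started_at < $2
    AND started_at NOT BETWEEN $3 AND $4
  GROUP BY device_type
  ORDER BY device_type;
`;
```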

Final Evidence:

  • SQL queries showing confound-adjusted analysis
  • Time series plot highlighting sale spike
  • Segment breakdown revealing mobile issue
  • Statistical tests: overall p=0.68, mobile p=0.04

What Was Caught: The standard approach missed both the confound and the mobile regression. Complete Mode prevented a false announcement and identified an actionable UX fix.


Example 3: Deploy Failure (Vercel)

Standard Approach

User: "Vercel deploy failing, help?"

AI: "Looks like a build error. Check your TypeScript config."

Result: One issue is fixed, but the deploy still fails with a different error. It takes 5 iterations to resolve.


Complete Mode Approach

@complete-mode

Task = Fix Vercel production deploy
Context = Next.js 14 + TypeScript + Supabase, last success 3 days ago

[DONE overlay — Software/DevOps]
- Cloud health: /api/health 200 for ≥30min
- Logs clean: last 2min no errors
- Local gates: build + test + lint pass
- Zero criticals
- Evidence: health URL, logs, test output, deploy URL

scope:build depth:deep strict:on

Pass 1 — Build Logs:

| ID | Sev | Category | Evidence | Root Cause | Action | Status | Conf |
|----|-----|----------|----------|------------|--------|--------|------|
| P-01 | P0 | Build | Type error dashboard/page.tsx:45 | Supabase types out of sync | Regen types | Resolved | 0.9 |
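What "regenerate types" looks like on the consuming side, assuming the project uses the Supabase CLI's generated `Database` type; the import path and env var names follow common Next.js + Supabase conventions rather than anything confirmed by the example:

```typescript
import { createClient } from "@supabase/supabase-js";
// Regenerated with the Supabase CLI (`supabase gen types typescript`);
// the output path is a project convention, not a fixed location.
import type { Database } from "@/types/supabase";

// Typing the client with Database keeps table and column types in sync with
// the live schema, so drift surfaces as a compile-time error like P-01.
export const supabase = createClient<Database>(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
);
```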

Pass 2 — Runtime Logs (Preview Deploy):

| ID | Sev | Category | Evidence | Root Cause | Action | Status | Conf |
|----|-----|----------|----------|------------|--------|--------|------|
| P-02 | P0 | Config | "Missing SUPABASE_SERVICE_KEY" | Not in Vercel env | Add to dashboard | Resolved | 1.0 |
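For P-02, failing fast when a required secret is missing turns a confusing runtime 500 into an obvious startup error. A minimal sketch; the variable name comes from the example, the rest is assumed:

```typescript
// Server-only module: read the service key once and fail loudly if it's absent.
const serviceKey = process.env.SUPABASE_SERVICE_KEY;

if (!serviceKey) {
  // Without this guard the deploy "succeeds" and only fails on the first request.
  throw new Error("SUPABASE_SERVICE_KEY is not set in this environment");
}

export const SUPABASE_SERVICE_KEY: string = serviceKey;
```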

Pass 3 — Manual Testing:

| ID | Sev | Category | Evidence | Root Cause | Action | Status | Conf |
|----|-----|----------|----------|------------|--------|--------|------|
| P-03 | P1 | Auth | OAuth callback 500, redirect_uri | Not in Supabase allowed list | Add preview URL pattern | Resolved | 0.85 |

What Was Caught: All 3 issues were found in sequence using a different discovery method each time (build logs → runtime logs → manual test). The standard approach would have required 5+ back-and-forth iterations.


Key Patterns Across Examples

Pattern 1: Different Discovery Methods Reveal Different Issues

| Method | Catches |
|--------|---------|
| Application logs | Runtime errors, exceptions |
| System metrics | Resource exhaustion, performance |
| Load testing | Behavior under stress |
| Manual testing | Integration issues |
| External data | Confounds, timing |
| Code review | Logic errors, security |

Lesson: Single method = single class of issues found. Multi-method = comprehensive discovery.

Pattern 2: Independent Verification Catches False Positives

  • Primary: Tests pass → Independent: Load test reveals a memory leak

  • Primary: Metric improved → Independent: Confound analysis shows the improvement is spurious

Lesson: Two verification methods prevent false confidence.

Pattern 3: Re-scan Triggers

Know when to keep going:

  • ✅ Baseline not restored
  • ✅ Only 1 issue found
  • ✅ Fix seems "too easy"
  • ✅ High-impact context
  • ✅ Cannot explain full symptom

Before/After Comparison

Typical Standard Prompting Session

User: Problem X
AI: Try solution A
User: *applies A*
User: Still broken
AI: Try solution B
User: *applies B*
User: Still broken
AI: Try solution C
...

Characteristics:

  • 5-10 iterations
  • Reactive fixing
  • No audit trail
  • Partial solutions
  • No verification strategy

Complete Mode Session

User: @complete-mode [Task + Context + Overlay]
AI: Pass 1 discovery → Fix P-01 → Verify → Re-scan
    Pass 2 discovery → Fix P-02 → Verify → Re-scan  
    Pass 3 discovery → Fix P-03 → Verify → DONE
    Evidence pack + Problem Register + What to Watch

Characteristics:

  • 1 structured request
  • Proactive discovery
  • Full audit trail
  • Comprehensive solution
  • Multi-method verification

When Complete Mode Saves Time

Seems slower: 3 passes vs 1 quick fix

Actually faster when:

  • Standard approach would need 5+ iterations to find all issues
  • Problem recurs within days/weeks (rework cost)
  • Handoff required (documentation time)
  • High stakes (mistake cost)

ROI calculation:

  • Complete Mode: roughly 2x the initial time of a quick fix
  • Avoided rework: roughly 10x that time saved
  • Net: about a 5x efficiency gain (10x saved ÷ 2x spent)

Try It Yourself: Practice Prompts

Exercise 1: Debug a "Working" Feature

@complete-mode

Task = Verify password reset flow is production-ready
Context = Feature branch merged, manual test passed once

[DONE overlay — Software/DevOps]
[Add appropriate gates]

depth:deep strict:on

Expected discoveries:

  • Email deliverability issues
  • Token expiration edge cases
  • Rate limiting gaps
  • Security header misconfigurations
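One of the "token expiration edge cases" this kind of run tends to surface, sketched as a guard you could add or test; the names and the one-hour window are assumptions, not part of the exercise:

```typescript
// Reject password-reset tokens that are expired, missing a timestamp, or dated in the future.
const RESET_TOKEN_MAX_AGE_MS = 60 * 60 * 1000; // 1 hour, assumed policy

function isResetTokenValid(issuedAt: Date | null, now: Date = new Date()): boolean {
  if (!issuedAt) return false; // edge case: token record with no issue time
  const age = now.getTime() - issuedAt.getTime();
  if (age < 0) return false;   // edge case: clock skew or tampered timestamp
  return age <= RESET_TOKEN_MAX_AGE_MS;
}
```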

Exercise 2: Validate an A/B Test

@complete-mode

Task = Confirm blue CTA button outperformed red
Context = 1 week test, 5K users each variant, 12% vs 10% conversion

[DONE overlay — Data/Analytics]
[Add appropriate gates]

scope:experiment-design depth:deep

Expected discoveries:

  • Sample size adequacy
  • Multiple testing correction
  • Temporal confounds (day of week, time of day)
  • Segment heterogeneity
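For "sample size adequacy", a quick power calculation shows whether 5K users per variant could plausibly detect a 10% → 12% lift. This is one standard approximation (two-sided 5% significance, 80% power), not the only way to check:

```typescript
// Approximate per-variant sample size for a two-proportion test.
// zAlpha = 1.96 (two-sided 5% significance), zBeta = 0.84 (80% power).
function requiredSampleSizePerVariant(
  p1: number,
  p2: number,
  zAlpha = 1.96,
  zBeta = 0.84
): number {
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  const delta = p1 - p2;
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (delta * delta));
}

// Exercise 2's rates: 10% vs 12% conversion → roughly 3,800+ users per variant.
console.log(requiredSampleSizePerVariant(0.10, 0.12));
```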

Common Mistakes

Mistake 1: Using the same discovery method multiple times. Fix: Explicitly vary methods each pass (logs → metrics → tests).

Mistake 2: Accepting "looks good" without independent verification. Fix: Always require a second validation approach.

Mistake 3: Stopping after the first issue is resolved. Fix: Add require_passes:3 to force deeper exploration.

Mistake 4: Not documenting the Problem Register. Fix: Maintain the register from the start; it's your audit trail.


Next Steps

  1. Pick a recent "fixed" issue — Rerun Complete Mode on it and see what was missed
  2. Practice with low-stakes problems — Build comfort with the pattern
  3. Compare outcomes — Track recurrence rates before/after adoption
  4. Customize gates — Adapt domain overlays to your standards