
# Problem Register Template: Track Issues Systematically

Ready-to-use Problem Register templates for tracking issues during AI-assisted problem solving, with severity levels and a full audit trail.

Published October 29, 2025


The Problem Register is your audit trail during systematic problem solving. It tracks every issue discovered, its severity, root cause, proposed action, and resolution status.

## Why Use a Problem Register?

Benefits:

  • Complete view of all issues, not just the obvious ones
  • Prevents scope creep by tracking what's in/out of scope
  • Enables handoffs with full context for the next person
  • Creates accountability with explicit status tracking
  • Supports retrospectives to understand what went wrong

## Basic Template (Copy-Paste)

```markdown
## Problem Register

| ID | Sev | Category | Evidence | Root Cause | Proposed Action | Status | Confidence |
|----|-----|----------|----------|------------|-----------------|--------|------------|
| P-01 |  |  |  |  |  |  |  |
| P-02 |  |  |  |  |  |  |  |
| P-03 |  |  |  |  |  |  |  |
```
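
If you keep register entries in a structured form, a short script can render the same table. A minimal sketch in Python (the column names follow the template above; the `render_register` helper itself is hypothetical, not a required tool):

```python
# Render Problem Register entries as a markdown table.
COLUMNS = ["ID", "Sev", "Category", "Evidence", "Root Cause",
           "Proposed Action", "Status", "Confidence"]

def render_register(entries):
    """entries: list of dicts keyed by the column names above."""
    header = "| " + " | ".join(COLUMNS) + " |"
    divider = "|" + "|".join("-" * (len(c) + 2) for c in COLUMNS) + "|"
    rows = [
        "| " + " | ".join(str(e.get(c, "")) for c in COLUMNS) + " |"
        for e in entries
    ]
    return "\n".join([header, divider] + rows)

entries = [{"ID": "P-01", "Sev": "P1", "Category": "Database",
            "Evidence": "Queries taking 2-4s", "Root Cause": "Missing index",
            "Proposed Action": "CREATE INDEX", "Status": "Resolved",
            "Confidence": 0.85}]
print(render_register(entries))
```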

## Column Definitions

### ID

Format: P-01, P-02, etc.

Sequential identifier. Use leading zeros for sorting.
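
The leading zeros are what make plain string sorting work; for example, in Python:

```python
# Zero-padded IDs sort correctly as plain strings.
ids = [f"P-{n:02d}" for n in range(1, 12)]  # P-01 ... P-11
assert sorted(ids) == ids
# Without padding, "P-10" would sort before "P-2" lexicographically.
print(ids[:3])  # ['P-01', 'P-02', 'P-03']
```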

### Sev (Severity)

Values: P0, P1, P2

P0 (Critical):

  • Production completely down
  • Data loss or corruption
  • Active security vulnerability
  • Legal/financial impact
  • Must resolve before claiming done

P1 (High):

  • Partial outage (>10% users affected)
  • Core feature broken
  • Wrong business logic
  • Performance degradation (>2x normal)
  • Must resolve before claiming done

P2 (Medium):

  • Edge cases (<5% users)
  • Cosmetic issues
  • Technical debt
  • Nice-to-have improvements
  • Can defer with documented rationale

### Category

Examples:

  • Runtime, Build, Config, Database, Auth, API, Frontend, Backend, Performance, Security, Data Quality, UX, Documentation

Choose categories relevant to your domain.

### Evidence

Format: Short snippet or pointer

Good examples:

  • "EADDRINUSE" in startup logs
  • p95 latency 800ms (was 150ms)
  • 502 errors on /api/auth endpoint
  • Missing index on users.email, query takes 4s

Bad examples:

  • "It's slow" (not specific)
  • "Doesn't work" (no concrete observation)

### Root Cause

Format: Hypothesis about underlying cause

Good examples:

  • Start command doesn't bind to $PORT
  • Connection pool size too small (10 connections, 50 concurrent requests)
  • Missing environment variable DATABASE_URL

Bad examples:

  • "Bug" (too vague)
  • "User error" (deflecting responsibility)

If unknown, write: "Under investigation" and update when determined.

### Proposed Action

Format: Specific, reversible action

Good examples:

  • Update start script to use process.env.PORT
  • Increase pool size from 10 to 30
  • Add DATABASE_URL to Vercel environment variables

Bad examples:

  • "Fix it" (not actionable)
  • "Investigate more" (not an action; record the Root Cause as "Under investigation" instead)

### Status

Values: Planned, In Progress, Resolved, Blocked, Deferred

  • Planned: Root cause known, action identified, not started
  • In Progress: Currently being addressed
  • Resolved: Fixed and verified
  • Blocked: Cannot proceed (specify blocker in Evidence or Root Cause)
  • Deferred: Acknowledged but postponed (requires rationale)
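
If you want to enforce the lifecycle in tooling, a transition map is enough. A sketch (the allowed transitions are my reading of the definitions above, not a prescribed workflow):

```python
# Allowed status transitions, inferred from the definitions above.
TRANSITIONS = {
    "Planned":     {"In Progress", "Blocked", "Deferred"},
    "In Progress": {"Resolved", "Blocked", "Deferred"},
    "Blocked":     {"Planned", "In Progress"},
    "Deferred":    {"Planned"},
    "Resolved":    set(),  # don't reopen; add a new entry instead
}

def set_status(entry, new_status):
    """Update an entry's Status, rejecting transitions not in the map."""
    current = entry["Status"]
    if new_status not in TRANSITIONS.get(current, set()):
        raise ValueError(f"{entry['ID']}: {current} -> {new_status} not allowed")
    entry["Status"] = new_status
    return entry

p1 = {"ID": "P-01", "Status": "Planned"}
set_status(p1, "In Progress")
set_status(p1, "Resolved")
```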

### Confidence

Format: 0.0 to 1.0

Your confidence in the root cause hypothesis.

  • 0.0-0.3: Low confidence, multiple competing hypotheses
  • 0.4-0.6: Medium confidence, needs more investigation
  • 0.7-0.9: High confidence, clear evidence
  • 1.0: Certain (only use when verified with multiple methods)
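
The tips later in this guide suggest more investigation when confidence is below 0.7; that rule is easy to encode (the function name is my own, the 0.7 threshold comes from the tips):

```python
def needs_more_investigation(confidence):
    """Below 0.7 on the scale above, gather more evidence before fixing."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0.0, 1.0]")
    return confidence < 0.7

assert needs_more_investigation(0.5)       # medium confidence: keep digging
assert not needs_more_investigation(0.85)  # high confidence: proceed
```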


## Example: API Performance Issue

## Problem Register

| ID | Sev | Category | Evidence | Root Cause | Proposed Action | Status | Confidence |
|----|-----|----------|----------|------------|-----------------|--------|------------|
| P-01 | P1 | Database | Queries taking 2-4s to users table | Missing index on users.email | CREATE INDEX idx_users_email | Resolved | 0.85 |
| P-02 | P0 | Runtime | "Connection pool exhausted" every 30s in logs | Pool size (10) < concurrent requests (50) | Increase pool size to 30 | Resolved | 0.95 |
| P-03 | P1 | Performance | Memory grows to 85% under sustained load | Missing client.release() in error handlers | Add release() calls | Resolved | 0.75 |
| P-04 | P2 | Monitoring | No alert fired when p95 exceeded 500ms | Alert threshold too high (1000ms) | Lower threshold to 500ms | Deferred | 0.90 |

## Example: Deploy Failure

## Problem Register

| ID | Sev | Category | Evidence | Root Cause | Proposed Action | Status | Confidence |
|----|-----|----------|----------|------------|-----------------|--------|------------|
| P-01 | P0 | Build | Type error in dashboard/page.tsx:45 | Supabase types out of sync with schema | Regenerate types with npx supabase gen | Resolved | 0.90 |
| P-02 | P0 | Config | "Missing SUPABASE_SERVICE_KEY" in preview logs | Not set in Vercel env vars | Add to Vercel dashboard | Resolved | 1.0 |
| P-03 | P1 | Auth | OAuth callback returns 500, redirect_uri mismatch | Preview URL not in Supabase allowed list | Add *.vercel.app to allowed URLs | Resolved | 0.85 |

## Example: Data Analysis

## Problem Register

| ID | Sev | Category | Evidence | Root Cause | Proposed Action | Status | Confidence |
|----|-----|----------|----------|------------|-----------------|--------|------------|
| P-01 | P0 | Confound | Conversion 2.1% → 2.9% coincides with Black Friday | Sale inflated baseline comparison | Exclude sale days from analysis | Resolved | 0.95 |
| P-02 | P1 | Segment | Mobile conversion dropped 1.8% → 1.5% (p=0.04) | New checkout has UX issue on mobile | Flag to product team for investigation | Resolved | 0.80 |
| P-03 | P2 | Sample | Desktop segment has only 2K users | May lack power for desktop-specific conclusions | Document limitation, track for 2 more weeks | Deferred | 0.70 |

## Companion: Action Log Template

Use alongside Problem Register to document each fix:

### Pass N — Action Log

**Date**: YYYY-MM-DD HH:MM

**Problems Addressed**: P-01, P-02

**Changes Made**:
- `[file/command/decision]`
- `[file/command/decision]`

**Before → After**:
- Metric/observation before: [value/state]
- Metric/observation after: [value/state]

**Verification (Primary)**:
- Method: [how you verified, e.g., "health check endpoint"]
- Result: [what you observed]

**Verification (Independent)**:
- Method: [different method, e.g., "load test"]
- Result: [what you observed]

**New Signals Discovered?**
- [Yes/No]
- If yes: [brief description, added to Problem Register as P-XX]

**Next Pass Focus**:
- [What discovery method will you use next?]

## Example: Complete Pass Documentation

### Pass 1 — Action Log

**Date**: 2025-10-29 14:30

**Problems Addressed**: P-01 (Missing index)

**Changes Made**:
```sql
CREATE INDEX idx_users_email ON users(email);
```

**Before → After**:
- Query time: 2.4s → 0.05s
- API p95 latency: 800ms → 450ms

**Verification (Primary)**:
- Method: EXPLAIN ANALYZE on sample query
- Result: Index scan used, query completes in 50ms

**Verification (Independent)**:
- Method: Sample 10 API requests via curl
- Result: Average response time 450ms (improved but not at baseline)

**New Signals Discovered?**
- Yes: "Connection pool exhausted" appearing in logs every 30s
- Added to Problem Register as P-02

**Next Pass Focus**:
- Check database connection metrics and pool configuration

---

## Tips for Effective Problem Registers

### 1. Start Early
Create the register before making any changes. Capture your initial understanding.

### 2. Update Continuously
Add entries as you discover issues. Don't wait until the end.

### 3. Be Specific
Vague entries like "performance issue" don't help future you or teammates.

### 4. Link Evidence
Reference log timestamps, metric screenshots, PR numbers, commit hashes.

### 5. Track Confidence
If confidence is low (`<0.7`), consider additional investigation before applying the fix.

### 6. Don't Delete Entries
Mark as Resolved/Deferred instead. Preserves audit trail.

### 7. Review Before "Done"
Check: All P0/P1 items Resolved? Any new issues discovered in last pass?
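
This pre-done check can be automated. A minimal sketch that scans a register's markdown rows for open P0/P1 items (it assumes the exact column order of the template in this guide):

```python
def open_criticals(register_md):
    """Return IDs of P0/P1 entries whose Status is not Resolved."""
    open_ids = []
    for line in register_md.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 8 or not cells[0].startswith("P-"):
            continue  # skip header, divider, and non-data lines
        entry_id, sev, status = cells[0], cells[1], cells[6]
        if sev in ("P0", "P1") and status != "Resolved":
            open_ids.append(entry_id)
    return open_ids

register = """\
| ID | Sev | Category | Evidence | Root Cause | Proposed Action | Status | Confidence |
|----|-----|----------|----------|------------|-----------------|--------|------------|
| P-01 | P1 | Database | slow query | missing index | add index | Resolved | 0.85 |
| P-02 | P0 | Runtime | pool exhausted | pool too small | raise pool size | In Progress | 0.95 |
"""
print(open_criticals(register))  # ['P-02']
```

If the list is non-empty, you are not done yet.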

---

## Integration with Complete Mode

When using Complete Problem Solving Mode, the Problem Register is mandatory:

```
@complete-mode

Task = [Your task]
Context = [Context]

[DONE overlay — Domain]
- Zero criticals: Problem Register has 0 P0/P1 remaining
- Evidence pack: Problem Register with all entries
```

The AI will maintain the register throughout the conversation and refuse to claim done while P0/P1 items remain.

---

## Export Formats

### For GitHub Issues
```markdown
**Problem Register Summary**

Resolved (3):
- P-01: Missing index on users.email
- P-02: Connection pool too small
- P-03: Memory leak in error handlers

Deferred (1):
- P-04: Alert threshold too high (low priority)

[Full register in PR description]
```

### For Incident Reports

```markdown
## Root Cause Analysis

**Timeline** (from Problem Register):
1. 14:00 - P-01 identified: Slow queries
2. 14:15 - P-01 resolved: Index added
3. 14:20 - P-02 identified: Pool exhaustion
4. 14:35 - P-02 resolved: Pool increased
5. 14:40 - P-03 identified: Memory leak
6. 15:00 - P-03 resolved: Release calls added

**All Critical Issues**: Resolved
**Monitoring**: Added alerts for connection pool usage
```

### For Handoffs

```markdown
## Handoff: API Performance Work

**Completed**:
- See Problem Register P-01, P-02, P-03 (all Resolved)

**Remaining**:
- P-04 (Deferred): Alert tuning, low priority
- Consider adding connection pool metrics to dashboard

**Evidence Pack**:
- PR #789 with all changes
- Datadog screenshot showing p95 back to baseline
- Load test results attached
```

## Quick Start Checklist

  • Copy basic template to your working doc
  • Add first entry for known issue
  • Include Evidence and Sev for each entry
  • Update Status after each action
  • Verify all P0/P1 Resolved before done
  • Save final register for future reference