chore: initial monorepo scaffold + WDS Phase 1+2 artifacts

- Nx 22.7 monorepo (pnpm 11.1, TypeScript 5.9, Node 24) - apps/api: NestJS 11 (CJS conforme CODING-RULES.md PGD-DB-004) - apps/web: React 19 + Vite 8 (ESM) - libs/shared/api-interface: Zod contract base - Docker Compose dev: Postgres 18, Valkey 8, MinIO, Mailpit - WDS artifacts: - design-artifacts/A-Product-Brief/ (5 docs canônicos + 16 dialogs) - design-artifacts/B-Trigger-Map/ (hub + 4 personas + feature impact) - Stack canon: STACK.md v2.2 + CODING-RULES.md v2.0 + brand.md - AGENTS.md + README.md como entrada para devs/agentes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 14:34:20 +00:00
commit 17c08e6392
3631 changed files with 855518 additions and 0 deletions
--- a/.claude/skills/bmad-story-automator/data/retry-fallback-strategy.md
+++ b/.claude/skills/bmad-story-automator/data/retry-fallback-strategy.md
@@ -0,0 +1,131 @@
+# Retry & Fallback Strategy
+
+**Purpose:** Universal retry and fallback agent pattern for all workflow steps (create, dev, auto, review).
+
+**Version:** 2.0.0
+
+---
+
+## Core Principle
+
+**NEVER escalate to user on first failure.** Exhaust all retry options first:
+1. Try fallback agent (if configured for this task)
+2. Retry with alternating agents up to 5 total attempts
+3. Sleep between retries if network issues detected
+4. Only escalate after all attempts exhausted
+
+---
+
+## Agent Configuration (v3.0.0)
+
+**Deterministic agent resolution via agents file:**
+
+```bash
+# Resolve agent for a specific task (create, dev, auto, review)
+# Uses agents file generated during preflight (complexity-aware)
+resolve_agent_for_task() {
+    local task="$1"
+    local state_file="$2"
+    local story_id="$3"
+
+    result=$("$scripts" orchestrator-helper agents-resolve \
+        --state-file "$state_file" \
+        --story "$story_id" \
+        --task "$task")
+
+    primary_agent=$(echo "$result" | jq -r '.primary')
+    fallback_agent=$(echo "$result" | jq -r '.fallback')
+
+    # Handle "false"/null meaning disabled
+    [ "$fallback_agent" = "false" ] && fallback_agent=""
+}
+
+# Usage:
+resolve_agent_for_task "review" "$state_file" "{story_id}"
+echo "Review task: primary=$primary_agent, fallback=$fallback_agent"
+```
+
+**Fallback behavior:**
+- If `fallback_agent` is empty, "false", or same as primary → retry with primary only
+- If `fallback_agent` differs → alternate between agents on retries
+- Complexity overrides win per task, then per-task overrides, then defaults
+
+---
+
+## Retry Sequence (5 Attempts Max)
+
+| Attempt | Agent | Delay Before | Notes |
+|---------|-------|--------------|-------|
+| 1 | primary | none | Initial attempt |
+| 2 | fallback | 0-60s | Switch agent; delay if network error |
+| 3 | primary | 0-60s | Back to primary |
+| 4 | fallback | 60s | Always delay by attempt 4 |
+| 5 | primary | 60s | Final attempt |
+
+**If no fallback configured:** All 5 attempts use primary agent.
+
+---
+
+## Network Error Detection
+
+**Indicators of network/transient issues:**
+- Session output contains: "connection refused", "timeout", "rate limit", "503", "502"
+- Session crashed with zero output
+- `story-automator monitor-session` returns `final_state: "crashed"` with empty output
+- Session stuck at "never_active" state (no response from API)
+
+**On network error detection:**
+- Sleep 60 seconds before next attempt
+- Log: "Network issue detected, waiting 60s before retry..."
+
+---
+
+## Implementation & Validation Examples
+
+Detailed bash patterns and step-specific validation examples are moved to:
+
+- **`retry-fallback-implementation.md`** (implementation wrapper + per-step validation)
+
+---
+
+## Escalation (After All Attempts)
+
+Only after exhausting all 5 attempts:
+
+1. Update state: `status = "AWAITING_DECISION"`
+2. Log all attempt details:
+   ```
+   [timestamp] ESCALATION: {step} failed after 5 attempts
+   - Attempt 1 (primary): {result}
+   - Attempt 2 (fallback): {result}
+   - Attempt 3 (primary): {result}
+   - Attempt 4 (fallback): {result}
+   - Attempt 5 (primary): {result}
+   ```
+3. Present options to user:
+   - Retry with different settings
+   - Skip this story
+   - Abort orchestration
+
+---
+
+## Integration with Adaptive Retry
+
+This strategy **replaces** the simple retry logic. The adaptive-retry.md plateau detection still applies within this framework:
+
+- If same task plateau detected across 3+ attempts → DEFER instead of escalate
+- Plateau detection runs AFTER agent switching (so both agents hit same wall)
+
+---
+
+## Logging
+
+All retry attempts should be logged in the action log:
+```
+[timestamp] {step} attempt {N}/{max} with {agent}: {result}
+```
+
+On success after retry:
+```
+[timestamp] {step} succeeded on attempt {N} with {agent} (after {N-1} failures)
+```