chore: initial monorepo scaffold + WDS Phase 1+2 artifacts

- Nx 22.7 monorepo (pnpm 11.1, TypeScript 5.9, Node 24) - apps/api: NestJS 11 (CJS conforme CODING-RULES.md PGD-DB-004) - apps/web: React 19 + Vite 8 (ESM) - libs/shared/api-interface: Zod contract base - Docker Compose dev: Postgres 18, Valkey 8, MinIO, Mailpit - WDS artifacts: - design-artifacts/A-Product-Brief/ (5 docs canônicos + 16 dialogs) - design-artifacts/B-Trigger-Map/ (hub + 4 personas + feature impact) - Stack canon: STACK.md v2.2 + CODING-RULES.md v2.0 + brand.md - AGENTS.md + README.md como entrada para devs/agentes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 14:34:20 +00:00
commit 17c08e6392
3631 changed files with 855518 additions and 0 deletions
--- a/.claude/skills/bmad-story-automator/data/agent-fallback.md
+++ b/.claude/skills/bmad-story-automator/data/agent-fallback.md
@@ -0,0 +1,136 @@
+# Agent Fallback Strategy (v3.0.0)
+
+**Multi-Agent Support:** The orchestrator can use Claude or Codex as AI coding agents, with automatic fallback on failure.
+
+## Configuration
+
+From state document (v3.0.0):
+```yaml
+agentConfig:
+  defaultPrimary: "auto"
+  defaultFallback: false
+  perTask:
+    dev:
+      primary: "codex"
+      fallback: "claude"
+  complexityOverrides:
+    low:
+      dev:
+        primary: "claude"
+        fallback: false
+```
+
+Agent selection is resolved via the deterministic agents file created in preflight:
+`_bmad-output/story-automator/agents/agents-{state_filename}.md`
+
+## Agent Differences
+
+| Agent | CLI | Prompt Style | Timeout | Todo Tracking |
+|-------|-----|--------------|---------|---------------|
+| Claude | `claude --dangerously-skip-permissions` | Natural language skill prompt | 60min | ☒/☐ checkboxes |
+| Codex | `codex exec --full-auto` | Natural language prompt | 90min (1.5x) | Not supported |
+
+**CRITICAL:** Both Claude and Codex prompts must name the skill/workflow to execute and include the story ID.
+
+The `story-automator tmux-wrapper build-cmd` function automatically generates the correct prompt format based on `AI_AGENT` environment variable.
+
+**See `workflow-commands.md` for complete prompt templates.**
+
+## Fallback Behavior
+
+**When to fallback:**
+- Primary agent session crashes (non-zero exit)
+- Retries exhausted with primary agent
+- `fallback` is configured for the task and not disabled ("false")
+
+**Fallback procedure:**
+1. Log: "Primary agent ({primary}) failed after {retries} attempts. Trying fallback ({fallback})..."
+2. Set environment: `AI_AGENT={fallback}`
+3. Respawn session with fallback agent
+4. Monitor as normal (timeouts auto-adjust based on agent type)
+5. If fallback also fails → CRITICAL escalation
+
+**Environment Variable:**
+```bash
+# Set before spawning session
+export AI_AGENT="codex"  # or "claude"
+
+# story-automator tmux-wrapper reads this automatically and generates correct prompt format
+session=$("$scripts" tmux-wrapper spawn dev {epic} {story_id} \
+  --command "$("$scripts" tmux-wrapper build-cmd dev {story_id})")
+```
+
+## Codex Monitoring Notes
+
+- **No todo checkboxes:** Codex doesn't use ☒/☐ - `todos_done` and `todos_total` will be 0
+- **Longer waits:** Status check script returns 90s wait estimate for Codex (vs 60s for Claude)
+- **Different activity detection:** Uses output freshness + heartbeat (no marker reliance)
+- **Output staleness window:** `CODEX_OUTPUT_STALE_SECONDS` (default: 300)
+- **1.5x timeout multiplier:** `story-automator monitor-session` applies 1.5x multiplier when `--agent codex`
+- **Fake todo progress (v2.2):** When Codex is idle after activity, reports `1/1` to indicate "work done, needs verification"
+- **Idle vs Completed (v2.2):** Codex sessions report "idle" instead of "completed" when CLI stops but no terminal markers
+
+## ⚠️ Codex Code-Review Limitations (v1.5.0)
+
+**CRITICAL: Codex is NOT recommended for code-review workflow.**
+
+### Known Issue: Sprint-Status Not Updated
+
+Codex code-review sessions often complete (CLI exits) WITHOUT updating `sprint-status.yaml` to "done". This causes:
+- Monitor reports "completed" but sprint-status unchanged
+- Orchestrator loops indefinitely, spawning new review cycles
+- 8+ cycles with 0 progress (observed in Story 8.2)
+
+### Root Cause
+
+Codex runs non-interactively via `codex exec`. When it finishes:
+1. Tmux session goes idle (no active CLI process)
+2. Monitor sees "idle" and marks as "completed"
+3. But workflow step 5 (update sprint-status) may not have executed
+4. No way to verify workflow actually finished
+
+### Recommended Configuration
+
+```yaml
+agentConfig:
+  defaultPrimary: "codex"
+  defaultFallback: "claude"
+  perTask:
+    review:
+      primary: "claude"   # Never use Codex for code-review
+      fallback: false
+```
+
+### "incomplete" State (v2.2)
+
+The monitoring system now detects when Codex finishes but sprint-status wasn't updated:
+- `final_state: "completed"` → Verified: sprint-status shows "done"
+- `final_state: "incomplete"` → Session idle but sprint-status NOT "done"
+
+When "incomplete" is detected:
+- **Do NOT retry automatically** (prevents infinite loop)
+- Escalate to user with options:
+  1. Manual fix (update sprint-status yourself)
+  2. Run code-review with Claude
+  3. Skip this story
+
+### Verification Command (v2.2)
+
+Check if code-review actually completed:
+```bash
+"$scripts" orchestrator-helper verify-code-review {story_id}
+# Returns: {"verified":true/false, "sprint_status":"...", ...}
+```
+
+## Backwards Compatibility
+
+- If `agentConfig` is missing, the primary agent resolves from the active runtime provider and fallback is disabled
+- If `aiCommand` is set (legacy), use it directly with the generated natural language prompt
+- New orchestrations should use `agentConfig` instead of `aiCommand`
+- Agents file is authoritative when present
+
+---
+
+## Troubleshooting
+
+See `agent-fallback-troubleshooting.md` for detailed troubleshooting steps.