chore: initial monorepo scaffold + WDS Phase 1+2 artifacts
- Nx 22.7 monorepo (pnpm 11.1, TypeScript 5.9, Node 24)
- apps/api: NestJS 11 (CJS, per CODING-RULES.md PGD-DB-004)
- apps/web: React 19 + Vite 8 (ESM)
- libs/shared/api-interface: Zod contract base
- Docker Compose dev: Postgres 18, Valkey 8, MinIO, Mailpit
- WDS artifacts:
  - design-artifacts/A-Product-Brief/ (5 canonical docs + 16 dialogs)
  - design-artifacts/B-Trigger-Map/ (hub + 4 personas + feature impact)
- Stack canon: STACK.md v2.2 + CODING-RULES.md v2.0 + brand.md
- AGENTS.md + README.md as the entry point for devs/agents

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
154
.agents/skills/bmad-workflow-builder/references/build-process.md
Normal file
@@ -0,0 +1,154 @@
**Workspace.** Once intent is clear and the target skill is named (propose a kebab-case name for new skills if the user didn't give one — they can rename later, that's a logged decision not a redo), write `.decision-log.md` at the skill's root as a peer of `SKILL.md`. The decision log is canonical memory — load-bearing decisions, rejected alternatives, and overrides live on disk, not in the conversation. On resume, append a new session heading; at handoff, audit the log so the user signs off on how their thinking was handled.

## Phase 1: Classify

**Outcome:** you and the user agree on the skill type and whether it's part of a module. Reasoning is shared, not hidden.

| Type | When |
|---|---|
| **Simple Utility** | Composable building block with clear input → processing → output. Often deterministic. No multi-turn discovery. |
| **Simple Workflow** | Multi-step process that fits inline in SKILL.md as named sections (`## Discovery`, `## Constraints`, etc.). Default. |
| **Complex Workflow** | SKILL.md routing + carved-out sections in `references/` with descriptive filenames. Reserved for workflows whose SKILL.md would otherwise be too big to scan (~250+ lines). |

Default to Simple Workflow. Carving is a SIZE decision, not a stage-count decision.

If module-based: capture module code, other skills it'll invoke (with name / inputs / outputs), and config variables it needs.

For Workflows that produce an artifact: confirm whether `--headless` should be supported.

**On Edit:** classification is already set — read it from the existing skill or from `.decision-log.md` frontmatter. Skip this phase.

## Phase 2: Determine Spec

**Outcome:** you have everything needed to draft the skill — extracted from what the user has already shared (open-floor + decision log) plus targeted follow-ups for whatever's missing.

Through what's already known or further conversation, determine all of the following that are relevant:

| Field | Applies | Notes |
|---|---|---|
| Name | All | kebab-case. `{module-code}-{name}` for modules, `{name}` standalone. `bmad-` reserved for official. |
| Description | All | `[5-8 word summary]. [Use when user says 'specific phrase'.]` See `references/standard-fields.md`. |
| Overview | All | What / How / Why-Outcome. Domain framing + theory of mind for interactive or complex skills. |
| Role | Workflows | "Act as a [role/expert]" primer. |
| Design rationale | Where non-obvious | Choices the executing agent should understand so it doesn't optimize them away. |
| External skills | All | Which other skills this calls. |
| Scripts | All | Deterministic operations to push out of prompts; see `references/script-opportunities-reference.md`. List non-stdlib deps and get user approval (`uv` required). |
| Output documents | All | Yes/no — uses `{document_output_language}` if yes. |
| Revisable artifact | If output doc | If Update / Validate intents are likely, propose the Decision-Log Workspace pattern (`references/skill-quality-principles.md`). |
| Inputs / outputs | Simple Utility | Format, schema, required fields. |
| Stages | Workflows | Named sections (Simple) or carved files in `references/` with descriptive filenames (Complex). |
| Module capability | If module-based | phase-name, after, before, is-required, short description. |
| Customization | All | Fixed, or swappable templates / paths / hooks? Default no. If yes, walk each scalar (`<purpose>_template`, `<purpose>_output_path`, `on_<event>`); auto-promote in headless. |

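
The Name rule in the table above can be checked mechanically. A minimal sketch, assuming a helper called `validate_skill_name` and an `official` flag (both hypothetical; nothing in the builder defines them):

```python
import re

# Lowercase alphanumeric segments joined by single hyphens.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def validate_skill_name(name: str, official: bool = False) -> list[str]:
    """Return a list of problems with a proposed skill name (empty if none)."""
    problems = []
    if not KEBAB_CASE.match(name):
        problems.append("name must be kebab-case")
    # The `bmad-` prefix is reserved for official skills.
    if name.startswith("bmad-") and not official:
        problems.append("`bmad-` prefix is reserved for official skills")
    return problems
```

Returning a list rather than raising lets the builder surface every naming problem in one pass and log the outcome as a decision.
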

The customization opt-in question (interactive only):

> "Should this support end-user customization (activation hooks, swappable templates, output paths)? If no, it ships fixed — users who need changes fork it."

For path conventions and customize.toml schema, see `references/skill-quality-principles.md`.

**On Edit:** spec is already defined by the existing skill. Read what's relevant to the change, ignore the rest. Update the decision-log with what's actually changing and why.

## Phase 3: Draft & Refine

**Load `references/skill-quality-principles.md` before reviewing the plan** — same principles file the quality scanners verify against. Building against it upfront is cheaper than fixing afterwards.

Present a plan. Point out vague areas. Iterate with the user until the outcome and shape are clear. Apply the principles file's core test to every planned instruction: **would an LLM do this correctly without being told?** If yes, cut it.

## Phase 4: Build

**Load:**

- `references/skill-quality-principles.md` — what earns its place, BMad institutional knowledge, failure modes (already loaded in Phase 3; keep open)
- `references/standard-fields.md` — field-by-field schema reference for frontmatter, customize.toml, and the Overview formula
- `references/complex-workflow-patterns.md` (Complex Workflow only) — config integration, compaction survival, document-as-cache

Load `assets/SKILL-template.md` and `references/template-substitution-rules.md`. Default to writing the entire workflow inline in SKILL.md as named sections. Carve out to `references/` ONLY when SKILL.md would otherwise be too big to scan; when you do, use descriptive filenames (`press-release.md`), never numbered prefixes (`01-discover.md`). Output to `{bmad_builder_output_folder}`.

**If the SKILL.md references multiple internal files** (anything in `references/`, `assets/`, `scripts/`, `agents/`), stamp the Conventions block at the top of SKILL.md (after Overview, before On Activation):

```markdown
## Conventions

- Bare paths (e.g. `references/press-release.md`) resolve from the skill root.
- `{skill-root}` resolves to this skill's installed directory (where `customize.toml` lives).
- `{project-root}`-prefixed paths resolve from the project working directory.
- `{skill-name}` resolves to the skill directory's basename.
```

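
The four rules in the Conventions block amount to a small path resolver. The sketch below is illustrative only: `resolve_skill_path` is a hypothetical helper, and in practice the executing agent applies these rules while reading SKILL.md rather than calling code.

```python
from pathlib import Path


def resolve_skill_path(ref: str, skill_root: Path, project_root: Path) -> Path:
    """Resolve a path reference using the Conventions block rules."""
    # {skill-name} is the basename of the skill directory.
    ref = ref.replace("{skill-name}", skill_root.name)
    if ref.startswith("{project-root}"):
        return project_root / ref[len("{project-root}"):].lstrip("/")
    if ref.startswith("{skill-root}"):
        return skill_root / ref[len("{skill-root}"):].lstrip("/")
    # Bare paths resolve from the skill root.
    return skill_root / ref
```

For example, with a skill installed at `skills/my-skill`, `{project-root}/_bmad/custom/{skill-name}.toml` resolves under the project directory while `references/press-release.md` stays inside the skill.
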

**If `{customizable}` is yes:**

- Emit `customize.toml` alongside SKILL.md from `assets/customize-template.toml`. Fill `[workflow]` with the Phase 2 scalars.
- In SKILL.md, replace hardcoded references with `{workflow.<name>}` indirection. `assets/brief-template.md` → `{workflow.brief_template}` if lifted.
- Add the resolver activation step before config load:

```markdown
### Step 1: Resolve the Workflow Block

Run: `python3 {project-root}/_bmad/scripts/resolve_customization.py --skill {skill-root} --key workflow`

If the script fails, resolve the `workflow` block yourself by reading these three files in base → team → user order and applying structural merge rules: `{skill-root}/customize.toml`, `{project-root}/_bmad/custom/{skill-name}.toml`, `{project-root}/_bmad/custom/{skill-name}.user.toml`. Scalars override, tables deep-merge, arrays of tables keyed by `code`/`id` replace matching entries and append new ones, all other arrays append.
```

- Execute `{workflow.activation_steps_prepend}` before the workflow's first stage and `{workflow.activation_steps_append}` after greet but before Stage 1. Treat `{workflow.persistent_facts}` as foundational context loaded on activation (`file:` prefix = path/glob; bare entries = literal facts).

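
The fallback merge rules in the resolver step are easier to verify against a concrete implementation. The following is a sketch of those rules, not the contents of `resolve_customization.py`; the function names are hypothetical, and the three layers are assumed to be already parsed into dictionaries (for instance with the standard-library `tomllib` on Python 3.11+).

```python
from copy import deepcopy


def _key_field(items):
    """Return 'code' or 'id' if every item is a table carrying that key, else None."""
    if items and all(isinstance(item, dict) for item in items):
        for field in ("code", "id"):
            if all(field in item for item in items):
                return field
    return None


def merge(base, override):
    """Merge `override` onto `base` following the structural merge rules."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = deepcopy(base)
        for key, value in override.items():
            merged[key] = merge(base[key], value) if key in base else deepcopy(value)
        return merged  # tables deep-merge
    if isinstance(base, list) and isinstance(override, list):
        field = _key_field(base + override)
        if field is None:
            return deepcopy(base) + deepcopy(override)  # plain arrays append
        merged = deepcopy(base)
        index = {item[field]: i for i, item in enumerate(merged)}
        for item in override:
            if item[field] in index:
                merged[index[item[field]]] = deepcopy(item)  # replace matching entry
            else:
                index[item[field]] = len(merged)
                merged.append(deepcopy(item))  # append new entry
        return merged
    return deepcopy(override)  # scalars override


def resolve_workflow(base, team, user):
    """Apply layers in base -> team -> user order and return the `workflow` block."""
    result = {}
    for layer in (base, team, user):
        result = merge(result, layer)
    return result.get("workflow", {})
```

A missing team or user file can be passed as an empty dict, which leaves the earlier layers untouched.
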

**If `{customizable}` is no:** no `customize.toml`, no resolver step. SKILL.md uses hardcoded paths throughout.

**If the skill uses the Decision-Log Workspace pattern** (Phase 2 confirmed it produces a revisable artifact):

- Add `output_dir` and `output_folder_name` scalars to `customize.toml [workflow]`. Default shape:
  - `output_dir = "{planning_artifacts}/<purpose>"` (e.g. `briefs`, `analyses`)
  - `output_folder_name = "<purpose>-{project_name}-{date}"`
- This implies `{customizable}=yes` — if the user declined customization, ask whether to enable it for these two scalars.
- In SKILL.md Activation, after config resolution: bind `{doc_workspace} = {workflow.output_dir}/{workflow.output_folder_name}/`.
- Wire Create / Update / Validate intents and a Finalize audit per `references/skill-quality-principles.md` § Decision-Log Workspace Pattern. Follow the **Treatment style** sub-section there: state the principle once where it first applies, mention reads at the moments that matter, no prescribed frontmatter schema, no `## Workspace` header, no tree diagram. The workspace is just files.
- If the artifact will feed downstream LLM consumers: offer a `distillate.md` at finalize. Skip with a note if no distillation tool is available; never inline a substitute.

**Skill source tree** (only create folders that are needed):

```
{skill-name}/
├── SKILL.md              # Frontmatter, Overview, Activation, the workflow itself (default), routing if carved
├── customize.toml        # Only if {customizable} is yes
├── references/           # Carved-out workflow sections — descriptive names, no numbered prefixes
├── assets/               # Templates and other static content the workflow loads
├── scripts/              # Deterministic code with tests
│   └── tests/
```

Never put workflow content (`*.md` prompt files) directly at skill root — that's `SKILL.md`'s job. Carve-outs always go in `references/`.

| Location          | Contains                                                  | LLM relationship                     |
| ----------------- | --------------------------------------------------------- | ------------------------------------ |
| **SKILL.md**      | Overview, Activation, inline workflow OR routing to refs  | LLM identity, the workflow itself    |
| **`references/`** | Carved-out workflow sections (descriptive names)          | Loaded on demand by SKILL.md routing |
| **`assets/`**     | Templates, starter files, static content                  | Copied/transformed into output       |
| **`scripts/`**    | Python, shell scripts with tests                          | Invoked for deterministic operations |

**If the built skill includes scripts**, also load `references/script-standards.md` — ensures PEP 723 metadata, correct shebangs, and `uv run` invocation from the start.

**Lint gate** — validate and auto-fix. If subagents are available, delegate lint-fix; otherwise run inline.

1. Run both lint scripts in parallel:
   ```bash
   python3 scripts/scan-path-standards.py {skill-path}
   python3 scripts/scan-scripts.py {skill-path}
   ```
2. Fix high/critical findings, re-run (up to 3 attempts per script).
3. Run unit tests if scripts exist in the built skill.

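
The lint gate's control flow (two scripts in parallel, each retried up to three times with a fix step in between) can be sketched as follows. This is an illustration under assumptions: `lint_gate`, `run_lint`, and the `fix` callback are hypothetical names, and treating exit code 0 as "no high/critical findings" is a guess about the scripts' convention, not something the source states.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

LINT_SCRIPTS = ("scripts/scan-path-standards.py", "scripts/scan-scripts.py")
MAX_ATTEMPTS = 3


def run_lint(script: str, skill_path: str) -> bool:
    """Run one lint script; True means a clean exit (assumed convention)."""
    result = subprocess.run(["python3", script, skill_path], capture_output=True, text=True)
    return result.returncode == 0


def lint_gate(skill_path: str, run=run_lint, fix=lambda script: None) -> dict:
    """Run both lint scripts in parallel, retrying each up to MAX_ATTEMPTS times."""

    def gate(script: str) -> bool:
        for attempt in range(MAX_ATTEMPTS):
            if run(script, skill_path):
                return True
            if attempt < MAX_ATTEMPTS - 1:
                fix(script)  # placeholder for the fix step between attempts
        return False

    with ThreadPoolExecutor(max_workers=len(LINT_SCRIPTS)) as pool:
        return dict(zip(LINT_SCRIPTS, pool.map(gate, LINT_SCRIPTS)))
```

A script that still fails after the third attempt maps to `False`, which corresponds to the "persistent lint failures" blocked case in Phase 5.
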
## Phase 5: Handoff

**Interactive:** show what was built, lint results, and offer next steps (commit, run quality analysis). Decision log is at `{target-skill-path}/.decision-log.md`.

**Headless** (`{headless_mode}=true`): emit JSON only. `intent` is `"build"` for new, `"edit"` for existing.

```json
{
  "status": "complete",
  "intent": "build",
  "skill": "{target-skill-path}",
  "decision_log": "{target-skill-path}/.decision-log.md"
}
```

Blocked (ambiguous intent that couldn't be inferred, persistent lint failures, etc.): replace `"complete"` with `"blocked"` and add `"reason": "<one-line cause>"`. The log carries the detail.

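
The complete and blocked payloads differ only in `status` and the optional `reason`, so one builder can cover both. A minimal sketch, assuming a hypothetical helper named `handoff_result`:

```python
import json


def handoff_result(skill_path, intent="build", reason=None):
    """Build the headless handoff payload; a `reason` marks the run as blocked."""
    payload = {
        "status": "blocked" if reason else "complete",
        "intent": intent,
        "skill": skill_path,
        "decision_log": f"{skill_path}/.decision-log.md",
    }
    if reason:
        payload["reason"] = reason
    return json.dumps(payload, indent=2)
```

Calling `handoff_result("skills/foo", intent="edit", reason="persistent lint failures")` yields the blocked shape described above.
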
@@ -0,0 +1,95 @@
# Complex Workflow Patterns

Patterns for workflows whose SKILL.md got too big and had to carve out to `references/`. The default for any new skill is **inline** — a multi-stage coaching workflow lives in a single SKILL.md. Reach for these patterns only when SKILL.md genuinely won't fit.

## Carve-Out Conventions

When carving out to `references/`:

- Descriptive filenames (`press-release.md`, `customer-faq.md`, `verdict.md`). Never numbered prefixes — the carve-out is a section, not a "step." SKILL.md decides the order by routing.
- Each file works standalone (context compaction can drop SKILL.md). No "as described in the overview."
- SKILL.md keeps Overview, Activation, the Conventions block (see `references/skill-quality-principles.md`), and the routing logic. Everything else moves out.
- `assets/` is for templates and other static content the workflow loads, not for stages.

## Workflow Persona

BMad workflows treat the human operator as the expert. The agent facilitates — asks clarifying questions, presents options with trade-offs, validates before irreversible actions. The operator knows their domain; the workflow knows the process.

## Config Reading and Integration

Workflows read config from `{project-root}/_bmad/config.yaml` and `config.user.yaml`.

**Module-based skills** load with fallback and setup-skill awareness:

```
Load config from {project-root}/_bmad/config.yaml ({module-code} section) and config.user.yaml.
If missing: inform user that {module-setup-skill} is available, continue with sensible defaults.
```

**Standalone skills** load best-effort:

```
Load config from {project-root}/_bmad/config.yaml and config.user.yaml if available.
If missing: continue with defaults — no mention of a setup skill.
```

Resolved config variables already contain `{project-root}` — never prefix them with it a second time.

## Decision-Log Workspace Pattern (canonical compaction survival)

For workflows that produce revisable artifacts, the Decision-Log Workspace pattern is the default. See `references/skill-quality-principles.md` for the full treatment.

**The pattern in one paragraph.** The workspace folder (artifact + `.decision-log.md` + optional `addendum.md` + optional `distillate.md`) exists from the moment intent is confirmed. Decision-log captures every meaningful decision and rationale; addendum captures rejected alternatives. Resume on activation, conflict-detect on update, audit at finalize. The decision log is the load-bearing artifact — the document is what the user takes; the log is what carries identity across sessions.

**For Complex Workflows that route to carved-out files**, each carved file must work standalone (compaction can drop SKILL.md mid-flow). Carved files reference the workspace by config-resolved path (`{workflow.output_dir}/{workflow.output_folder_name}/`) — never assume in-context state.

**YAML frontmatter on the primary artifact** (status + inputs survive compaction):

```markdown
---
title: 'Analysis: Research Topic'
status: 'discovery'
inputs:
  - '{project-root}/docs/brief.md'
created: '2025-03-02T10:00:00Z'
updated: '2025-03-02T11:30:00Z'
---
```

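
Because the frontmatter lives on disk, a resumed session can recover `status` without any in-context state. The sketch below shows one way to read the flat scalars using only the standard library; `read_frontmatter_scalars` is a hypothetical helper, and it deliberately ignores lists such as `inputs` (a full YAML parser would be needed for those).

```python
def read_frontmatter_scalars(text: str) -> dict:
    """Return top-level `key: value` scalars from a leading `---` frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    scalars = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Skip list items and nested lines; this sketch handles flat scalars only.
        if line.startswith((" ", "-")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip().strip("'\"")
        if value:
            scalars[key.strip()] = value
    return scalars
```

Splitting on the first colon only keeps values such as `'Analysis: Research Topic'` and ISO timestamps intact.
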

**When NOT to apply:** purely conversational workflows, one-shot single-turn outputs, multi-artifact workflows where each artifact gets its own folder.

## Routing from SKILL.md

When SKILL.md routes to a carved-out file, the route is by descriptive name. Use a Stages table near the bottom of SKILL.md:

```markdown
## Stages

| # | Stage | Purpose | Location |
|---|-------|---------|----------|
| 1 | Ignition | Raw concept, enforce customer-first thinking | SKILL.md (above) |
| 2 | Press Release | Iterative drafting with hard coaching | `references/press-release.md` |
| 3 | Customer FAQ | Devil's advocate customer questions | `references/customer-faq.md` |
```

The `#` is a reading aid for the table, not a filename prefix.

## Module Metadata Reference

BMad module workflows require extended frontmatter metadata. See `references/metadata-reference.md` for the metadata template and field explanations.

## Architecture Checklist

Before finalizing a complex BMad workflow:

- [ ] Default reconsidered — would this fit inline as named sections in a single SKILL.md?
- [ ] Facilitator persona — treats the operator as expert?
- [ ] Config integration — language, output locations read and used?
- [ ] Conventions block stamped at top of SKILL.md (when multiple internal files are referenced)
- [ ] Carve-outs in `references/` use descriptive names, no numbered prefixes
- [ ] Each carved file works standalone (compaction survival)
- [ ] Decision-Log Workspace pattern applied (or explicit reason for skipping — Simple Utility, one-shot, purely conversational)
- [ ] Resume protocol — Activation checks for existing workspace and offers to resume
- [ ] Update mode reads `.decision-log.md` first; surfaces conflicts before applying changes
- [ ] Final polish — subagent polish step at the end?
- [ ] Finalize step includes decision-log audit (every entry → primary, addendum, or explicit process noise)

@@ -0,0 +1,140 @@
# Quality Analysis

Communicate with user in `{communication_language}`. Write report content in `{document_output_language}`.

You orchestrate quality analysis on a BMad workflow or skill. The pipeline is optimized for speed and completeness:

1. **Deterministic checks** (scripts) — zero tokens, instant
2. **LLM scanners** (parallel subagents) — judgment-based analysis against `skill-quality-principles.md`
3. **Fast JSON extraction** (deterministic script) — lossless capture of all scanner findings (~10 seconds, no LLM)
4. **HTML generation** — interactive, auto-opening report from JSON (no wait for synthesis)
5. **Optional markdown synthesis** (LLM subagent, background) — thematic analysis and archival markdown

The scanners verify against `references/skill-quality-principles.md` — the same file the build process loads at create/edit time. Findings cite the principle that's being violated rather than restating it.

## Your Role: Coordination, Not File Reading

**Do not read the target skill's files yourself.** Scripts and subagents do all analysis. You orchestrate: run deterministic scripts and pre-pass extractors, spawn LLM scanner subagents in parallel, hand off to the report creator for synthesis.

## Headless Mode

If `{headless_mode}=true`, skip user interaction, use safe defaults, note any warnings, and output structured JSON as specified in the Present to User section.

## Pre-Scan Checks

Check for uncommitted changes. In headless mode, note warnings and proceed. In interactive mode, inform the user, confirm before proceeding, and confirm the workflow is currently functioning.

## Analysis Principles

**Effectiveness over efficiency.** The analysis may suggest leaner phrasing, but if the current phrasing captures the right guidance, it should be kept. The report presents opportunities — the user applies judgment.

## Scanners

### Lint Scripts (Deterministic — Run First)

Run instantly, cost zero tokens, produce structured JSON:

| #  | Script                           | Focus                                   | Output File                |
| -- | -------------------------------- | --------------------------------------- | -------------------------- |
| S1 | `scripts/scan-path-standards.py` | Path conventions                        | `path-standards-temp.json` |
| S2 | `scripts/scan-scripts.py`        | Script portability, PEP 723, unit tests | `scripts-temp.json`        |

### Pre-Pass Scripts (Feed LLM Scanners)

Extract metrics so LLM scanners work from compact data instead of raw files:

| #  | Script                                  | Feeds                | Output File                       |
| -- | --------------------------------------- | -------------------- | --------------------------------- |
| P1 | `scripts/prepass-workflow-integrity.py` | architecture scanner | `workflow-integrity-prepass.json` |
| P2 | `scripts/prepass-prompt-metrics.py`     | architecture scanner | `prompt-metrics-prepass.json`     |
| P3 | `scripts/prepass-execution-deps.py`     | determinism scanner  | `execution-deps-prepass.json`     |

### LLM Scanners (Judgment-Based — Run After Scripts)

Each scanner loads `references/skill-quality-principles.md` and writes a free-form analysis document:

| #  | Scanner | Focus | Pre-Pass | Output File |
| -- | ------- | ----- | -------- | ----------- |
| L1 | `quality-scan-architecture.md` | Structural integrity, prose craft, cohesion (was: integrity + craft + cohesion) | Yes (P1, P2) | `architecture-analysis.md` |
| L2 | `quality-scan-determinism.md` | Intelligence placement, parallelization, subagent delegation, script opportunities (was: execution-efficiency + script-opportunities) | Yes (P3) | `determinism-analysis.md` |
| L3 | `quality-scan-customization.md` | customize.toml opportunities and abuse | No | `customization-analysis.md` |
| L4 | `quality-scan-enhancement.md` | Edge cases, UX gaps, headless potential, facilitative patterns | No | `enhancement-analysis.md` |

## Execution

Bind `{quality-report-dir} = {skill-path}/.analysis/{date-time-stamp}/` and create the directory. Use this single name in every script invocation and subagent prompt below. Quality analyses live at the skill's own root, as a peer of `.decision-log.md` and `SKILL.md` — the audit trail travels with the skill.

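
Binding the report directory once and reusing it avoids scanners and scripts writing to different timestamps. A minimal sketch, assuming a hypothetical helper `bind_quality_report_dir`; the `%Y%m%d-%H%M%S` stamp format is an assumption, since the text only names `{date-time-stamp}`.

```python
from datetime import datetime
from pathlib import Path


def bind_quality_report_dir(skill_path, now=None):
    """Create and return {skill-path}/.analysis/{date-time-stamp}/."""
    # The stamp format is an assumption; the source only names {date-time-stamp}.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    report_dir = Path(skill_path) / ".analysis" / stamp
    report_dir.mkdir(parents=True, exist_ok=True)
    return report_dir
```

The returned path is then the single value substituted for `{quality-report-dir}` in every later command and subagent prompt.
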
### Step 1: Run All Scripts (Parallel)

```bash
python3 scripts/scan-path-standards.py {skill-path} -o {quality-report-dir}/path-standards-temp.json
python3 scripts/scan-scripts.py {skill-path} -o {quality-report-dir}/scripts-temp.json
uv run scripts/prepass-workflow-integrity.py {skill-path} -o {quality-report-dir}/workflow-integrity-prepass.json
python3 scripts/prepass-prompt-metrics.py {skill-path} -o {quality-report-dir}/prompt-metrics-prepass.json
uv run scripts/prepass-execution-deps.py {skill-path} -o {quality-report-dir}/execution-deps-prepass.json
```

### Step 2: Spawn LLM Scanners (Parallel)

After scripts complete, spawn all four LLM scanners as parallel subagents.

Each subagent receives:

- Scanner file to load
- Skill path: `{skill-path}`
- Output directory: `{quality-report-dir}`
- Pre-pass file paths (L1: P1+P2; L2: P3)

The subagent loads its scanner file (which loads the principles file), analyzes the skill, writes its analysis to `{quality-report-dir}`, and returns the filename.

### Step 3: Synthesize Report (Parallel with Scanner 4)

Spawn report creator to synthesize scanner outputs into `report-data.json` and `quality-report.md`. This can run in parallel with the last scanner finishing.

```bash
# Spawn as background task — does not block step 4
Agent(description="Synthesize quality report", subagent_type="report-creator", run_in_background=true, prompt="...")
```

The report creator:

- Reads all 4 analysis files + prepass JSON
- Identifies thematic clusters (root-cause synthesis)
- Writes `report-data.json` with: broken, opportunities, strengths, recommendations, detailed_analysis
- Writes `quality-report.md` for archival

### Step 4: Generate & Open HTML Report (Do Not Block on Markdown)

As soon as `report-data.json` exists (the report creator writes it mid-synthesis), generate the interactive HTML report:

```bash
python3 scripts/generate-html-report.py {quality-report-dir} --open
```

**Important:** Do not wait for `quality-report.md` to be written. The JSON is the complete data source. Open HTML immediately. The markdown report finishes asynchronously and provides archival context.

### Step 5: Log the Run

After HTML opens, append a session heading to `{skill-path}/.decision-log.md`:

```markdown
## YYYY-MM-DD — Quality analysis

Grade: <grade from report-data.json>. Interactive HTML: `.analysis/<timestamp>/quality-report.html`. Full markdown: `.analysis/<timestamp>/quality-report.md`.
```

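
Appending, rather than rewriting, is what keeps earlier sessions in the decision log intact. A sketch of the append step, assuming a hypothetical helper `log_quality_run` whose `grade` and `timestamp` arguments come from `report-data.json` and the bound report directory:

```python
from datetime import date
from pathlib import Path


def log_quality_run(skill_path, grade, timestamp, today=None):
    """Append a quality-analysis session heading to the skill's decision log."""
    day = (today or date.today()).isoformat()
    entry = (
        # \u2014 is the dash used in the session-heading template.
        f"\n## {day} \u2014 Quality analysis\n\n"
        f"Grade: {grade}. "
        f"Interactive HTML: `.analysis/{timestamp}/quality-report.html`. "
        f"Full markdown: `.analysis/{timestamp}/quality-report.md`.\n"
    )
    log_path = Path(skill_path) / ".decision-log.md"
    # Append mode creates the file if missing and never rewrites earlier sessions.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return log_path
```

Running it twice leaves two dated headings in the file, one per analysis run.
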
## Present to User

**Headless** (`{headless_mode}=true`): emit JSON only.

```json
{
  "status": "complete",
  "intent": "analyze",
  "skill": "{skill-path}",
  "decision_log": "{skill-path}/.decision-log.md",
  "report": "{quality-report-dir}/quality-report.md"
}
```

Blocked (scanner failure, missing required input, etc.): replace `"complete"` with `"blocked"` and add `"reason": "<one-line cause>"`. The log + any partial report carry the detail.

**Interactive:** read `report-data.json` and present grade + 2-3 sentence narrative, broken items if any, top opportunities by theme, paths to the full report and HTML. Offer to apply fixes, walk findings, or discuss.

@@ -0,0 +1,63 @@
# Quality Scan: Skill Architecture

You are a senior skill architect reviewing a BMad skill. Your job: identify what's missing, mismatched, or over-specified across the skill's structure, prose craft, and overall coherence — the things that would either break execution or push the executing agent into mechanical procedure-following instead of informed judgment.

**Load `references/skill-quality-principles.md` first.** It is the bar you're testing against. Don't restate its rules; cite them when findings reference them.

This scan absorbs what was previously three separate scanners (workflow-integrity, prompt-craft, skill-cohesion). Checking these together catches the mismatches that separate scans miss — a workflow split into files that belonged inline, an Overview promise that the execution instructions silently violate, prose that's structurally correct but mechanically deadening.

## Scan Targets

- `SKILL.md` — frontmatter, structure, inline workflow content, routing
- `references/*.md` — carved-out workflow sections (only present when SKILL.md was genuinely too big to keep inline)
- `assets/` — templates and other static content the workflow loads
- Anything other than `SKILL.md`, `customize.toml`, and the standard folders at skill root is suspect

If pre-pass JSON files are provided (`workflow-integrity-prepass.json`, `prompt-metrics-prepass.json`), read those first for compact metrics; read raw files only as needed for judgment calls.

## What to Find

Run the principles file against the skill and surface findings in three buckets:

**Structural integrity** — does what should exist exist, and is it wired correctly?

- Frontmatter follows the description format with quoted trigger phrases; no extra fields
- `## Overview` and `## On Activation` present and meaningful
- When SKILL.md references multiple internal files, the Conventions block is stamped (per the principles file's path-conventions section)
- Workflow content is inline in SKILL.md as named sections by default; only carved out to `references/` when SKILL.md was genuinely too big to scan
- **Carved-out files use descriptive names (`press-release.md`), NOT numbered prefixes (`01-discover.md`).** Flag numbered-prefix filenames.
- **No prompt files at skill root other than `SKILL.md` itself.** Flag any `*.md` workflow content directly under skill root that should be in `references/`.
- Routing from SKILL.md uses bare paths from skill root (`references/foo.md`)
- References in SKILL.md resolve to existing files (no orphans, no dangling refs)
- Carved-out files work standalone — no "as described in the overview" / "see SKILL.md"
- Where progression conditions exist, they're testable; "when ready" is vague
- Each carved file uses `{communication_language}` (and `{document_output_language}` if it produces a doc)
- No template artifacts (`{if-complex-workflow}`, bare `{skillName}`, etc.)
- No `## On Exit` sections
- Workflow type claim matches actual structure (Complex Workflow with everything inline → reclassify; Simple Workflow with carved references → either inline back or reclassify)

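
Two of the structural checks above (numbered-prefix filenames and stray prompt files at skill root) are purely mechanical. The sketch below shows what such a check might look like; `scan_layout` is a hypothetical function, not one of the shipped pre-pass scripts, and the choice to skip dotfiles such as `.decision-log.md` is an assumption.

```python
import re
from pathlib import Path

# Matches names like 01-discover.md or 2_intro.md.
NUMBERED_PREFIX = re.compile(r"^\d+[-_.]")


def scan_layout(skill_root):
    """Return layout findings: numbered-prefix carve-outs and stray root prompt files."""
    root = Path(skill_root)
    findings = []
    for path in sorted((root / "references").glob("*.md")):
        if NUMBERED_PREFIX.match(path.name):
            findings.append(f"numbered-prefix filename: references/{path.name}")
    for path in sorted(root.glob("*.md")):
        # SKILL.md is the only prompt file allowed at skill root; dotfiles such as
        # .decision-log.md are skipped here (an assumption of this sketch).
        if path.name != "SKILL.md" and not path.name.startswith("."):
            findings.append(f"workflow file at skill root: {path.name}")
    return findings
```

Judgment calls, such as whether a root-level markdown file is workflow content at all, still belong to the scanner rather than to a script like this.
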

**Prose craft** — does the SKILL.md and reference prose enable judgment without bloat?

- Overview establishes role, mission, and (where relevant) domain framing, theory of mind, design rationale
- No re-teaching of LLM-native skills (scoring formulas, calibration tables, adapter proliferation, format-the-output templates)
- No defensive padding ("make sure", "remember to", "this workflow is designed to")
- Direct imperatives, not "you should" / "please"
- Carved-out files survive context compaction — critical instructions in the file itself
- Size matches purpose (principles file thresholds); large data tables and reference material lifted out of SKILL.md

**Cohesion** — does the skill hang together as a purposeful whole?

- Description matches what the skill actually does
- Workflow flows logically — earlier sections produce what later sections consume; no dead-ends, no overlaps
- **Promises-vs-behavior check** — if the Overview or design rationale states a principle ("we do X before Y"), trace through the workflow and verify the instructions enforce or at minimum don't contradict it. Implicit instructions ("acknowledge what you received") that violate stated principles are the most dangerous misalignment because they look correct on casual review.
- Complexity matches task — 10 phases for "format a file" is wrong; 2 phases for "architect a system" is wrong
- Dependency graph (`after` / `before` / `is-required`) reflects actual data flow, not artificial ordering

## Output

Write to `{quality-report-dir}/architecture-analysis.md`. Include:

- **Assessment** — 2-3 sentence verdict on the skill as a coherent whole
- **Findings** — each with severity, file:line, what's wrong, why, how to fix. Distinguish genuine waste from load-bearing context (the principles file calls this out explicitly).
- **Strengths** — what's working that should be preserved

Severity follows the principles file: anything that breaks execution or violates a stated promise is critical/high; over-specification, numbered-prefix filenames, or workflow files at skill root are high; coherence issues are medium; style is low.

Return only the filename when complete.

@@ -0,0 +1,48 @@
|
||||
# Quality Scan: Customization Surface
|
||||
|
||||
You are a customization-surface economist. Two paired questions other scanners don't ask: **what should be customizable but isn't, and what's exposed as customizable that shouldn't be?**
|
||||
|
||||
**Load `references/skill-quality-principles.md` first.** Its "Customization (customize.toml)" section is the schema, naming conventions, and merge rules. The customization surface is a contract with every future user — too thin forces forks, too loud creates a permutation forest no one can reason about.
|
||||
|
||||
This is purely advisory. Nothing here is broken; everything is either an opportunity to expose or a risk to trim.
|
||||
|
||||
## Scan Targets
|
||||
|
||||
- `customize.toml` — if present, the canonical schema for this workflow
|
||||
- `SKILL.md` — `{workflow.X}` references (signals customize.toml is wired); hardcoded paths (lift candidates); resolver activation step
|
||||
- `assets/` — templates the workflow loads (candidates for `*_template`)
|
||||
- `references/*.md` — stage prompts that may reference configurable values
|
||||
|
||||
If no `customize.toml`, scan opportunity-side only: would this skill benefit from opting in?
|
||||
|
||||
## What to Find
|
||||
|
||||
**Opportunities — things to lift:**
|
||||
- Hardcoded template paths in SKILL.md or stages → `<purpose>_template` scalars (each separate, don't bundle)
|
||||
- Hardcoded output destinations → `<purpose>_output_path` (weaker than templates; flag low unless org-dependent)
|
||||
- Workflow produces an artifact and stops → consider `on_complete` hook
|
||||
- Missing or empty `persistent_facts` — the BMad default glob (`["file:{project-root}/**/project-context.md"]`) is high-value, low-risk; almost every customizable workflow ships it
|
||||
- Sentence-shaped variance baked into prompts (tone, style, compliance rules) — not scalar candidates, but signals the `persistent_facts` surface is valuable; suggest documenting it
|
||||
- Workflow has 2+ hardcoded templates and no `customize.toml` at all → high-opportunity to opt in
|
||||
|
||||
**Abuse — things to trim:**
|
||||
- Boolean toggles (3+ in one file = the surface is doing the job of a variant skill; suggest two skills or fewer knobs)
|
||||
- Identity / communication-style / principles in `[workflow]` (those are agent-shape fields — point the author at agent-builder; remove from workflow surface)
|
||||
- 4+ `on_<event>` hooks (workflow internals leaking into the override surface; users can interleave hooks at so many points they break the workflow's contract)
|
||||
- Arrays of tables without `code` or `id` keys (resolver can't merge by key; falls back to append-only — users can't replace items)
|
||||
- Mixed keying (`code` on some, `id` on others) — pick one
|
||||
- Opaque scalar names (`style_config`, `mode`-as-path) — use the principles file's `*_template` / `*_output_path` / `on_<event>` patterns
|
||||
- `customize.toml` declares a scalar but SKILL.md hardcodes the same value (high-abuse — overrides silently no-op; SKILL.md must read `{workflow.<name>}`)
|
||||
- Scalars with no comment explaining when/why to override
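
A few of the abuse checks above have clear pass/fail criteria and could be sketched as a script. The following is a minimal illustration, not the scanner's actual implementation: `audit_workflow_table` is a hypothetical helper that assumes the `[workflow]` table has already been parsed into a plain dict (for example with `tomllib` on Python 3.11+), and the thresholds are copied from the list above.

```python
from typing import Any

BOOLEAN_TOGGLE_LIMIT = 3  # "3+ in one file" from the abuse list
HOOK_LIMIT = 4            # "4+ on_<event> hooks" from the abuse list


def audit_workflow_table(workflow: dict[str, Any]) -> list[str]:
    """Return abuse findings for an already-parsed [workflow] table (hypothetical helper)."""
    findings: list[str] = []

    # TOML booleans parse to Python bool, so an isinstance check is enough.
    toggles = sorted(k for k, v in workflow.items() if isinstance(v, bool))
    if len(toggles) >= BOOLEAN_TOGGLE_LIMIT:
        findings.append(f"boolean toggles: {toggles}")

    hooks = sorted(k for k in workflow if k.startswith("on_"))
    if len(hooks) >= HOOK_LIMIT:
        findings.append(f"too many hooks: {hooks}")

    for name, value in workflow.items():
        # An array of tables parses to a non-empty list of dicts.
        is_table_array = (
            isinstance(value, list)
            and bool(value)
            and all(isinstance(item, dict) for item in value)
        )
        if not is_table_array:
            continue
        total = len(value)
        with_code = sum("code" in item for item in value)
        with_id = sum("id" in item for item in value)
        if with_code == 0 and with_id == 0:
            findings.append(f"{name}: no `code` or `id` keys")
        elif with_code != total and with_id != total:
            # Neither key is present on every item, so merge-by-key is unreliable.
            findings.append(f"{name}: mixed keying")
    return findings
```

Judgment calls such as whether a scalar name is opaque, or whether a comment really explains when to override, stay with the LLM scanner.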
|
||||
|
||||
## Output
|
||||
|
||||
Write to `{quality-report-dir}/customization-analysis.md`. Include:
|
||||
|
||||
- **Customization posture** — opted in? Surface size and shape?
|
||||
- **Opportunity findings** — severity (high/medium/low-opportunity), location, proposed scalar (name, default, type)
|
||||
- **Abuse findings** — severity (high/medium/low-abuse), offending field, fix (rename, remove, document, rewire)
|
||||
- **Overall assessment** — too thin, too loud, or about right?
|
||||
- **Top 2-3 insights** distilled
|
||||
|
||||
Return only the filename when complete.
|
||||
@@ -0,0 +1,60 @@
|
||||
# Quality Scan: Determinism & Distribution
|
||||
|
||||
You are a performance and intelligence-placement reviewer. Your job: find work happening in the wrong place — deterministic operations done by an LLM, sequential operations that should run in parallel, parent reads that should be subagent delegations, and prompts doing what a script could do faster, cheaper, and more reliably.
|
||||
|
||||
**Load `references/skill-quality-principles.md` first.** Its "Intelligence placement" and "Subagent constraints" sections are the bar.
|
||||
|
||||
This scan absorbs what was previously two separate scanners (execution-efficiency, script-opportunities). Same root question: where is work happening that shouldn't be happening here?
|
||||
|
||||
## Scan Targets
|
||||
|
||||
- `SKILL.md` — On Activation patterns, inline operations
|
||||
- `*.md` prompt files at root — stage instructions
|
||||
- `references/*.md` — resource-loading patterns
|
||||
- `scripts/` — what already exists (avoid suggesting duplicates)
|
||||
|
||||
If `execution-deps-prepass.json` is provided, read it first for compact dependency metrics.
|
||||
|
||||
## What to Find
|
||||
|
||||
**Script opportunities** — for every operation in a prompt, ask: given identical input, will this always produce identical output? Could you write a unit test for it? If yes, it belongs in a script.
|
||||
|
||||
Patterns to surface:
|
||||
- Validation against schemas, frontmatter checks, naming-convention enforcement
|
||||
- Counting, aggregation, metrics extraction
|
||||
- Format conversion, parsing, structured-data extraction from large files
|
||||
- Cross-reference checks, dependency graph tracing, file-existence verification
|
||||
- **Pre-passes** that hand the LLM compact JSON instead of raw files (highest-value, often missed — the LLM scanner reads the JSON, not the source)
|
||||
- Post-processing validation of LLM-generated output
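
To make the pre-pass idea concrete, here is a minimal sketch of a script that reduces markdown sources to compact metrics. The function name `prepass` and the chosen metrics (line count, word count, headings) are assumptions made for illustration; a real pre-pass would emit whatever the downstream scanner needs.

```python
import re

HEADING = re.compile(r"^#{1,6} ")


def prepass(files: dict[str, str]) -> dict[str, dict]:
    """Summarize markdown sources (name -> text) as compact, JSON-ready metrics."""
    summary: dict[str, dict] = {}
    for name, text in files.items():
        lines = text.splitlines()
        headings: list[str] = []
        in_fence = False
        for line in lines:
            if line.lstrip().startswith("```"):
                # Toggle on fence lines so "# comment" inside code is not a heading.
                in_fence = not in_fence
            elif not in_fence and HEADING.match(line):
                headings.append(line.strip())
        summary[name] = {
            "lines": len(lines),
            "words": len(text.split()),
            "headings": headings,
        }
    return summary
```

The result can be serialized with `json.dumps(prepass(files))` and handed to the LLM scanner in place of the raw files.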
|
||||
|
||||
For each, estimate the LLM tax in tokens-per-invocation: heavy (500+) → high; moderate (100–499) → medium; light (<100) → low.
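The banding is itself deterministic, so it could live in a helper rather than in the scanner's head. A small sketch, assuming a hypothetical function name and that a value of exactly 500 counts as heavy:

```python
def llm_tax_severity(tokens_per_invocation: int) -> str:
    """Map an estimated per-invocation token cost to a severity band."""
    if tokens_per_invocation >= 500:  # heavy
        return "high"
    if tokens_per_invocation >= 100:  # moderate
        return "medium"
    return "low"                      # light
```

The estimate that feeds this function is still a judgment call; only the mapping is mechanical.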
|
||||
|
||||
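Scripts have access to the Python stdlib, PEP 723 deps, `git`, and the other safe shell commands listed in `references/script-standards.md`; that file asks for Python equivalents in place of `jq`, `grep`, and shell piping. Think broadly: a script that builds a dependency graph and feeds the LLM a compact summary costs zero tokens for work that would otherwise cost thousands.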
|
||||
|
||||
Don't flag operations that genuinely require interpreting meaning, tone, context, or ambiguity. Those stay in prompts.
|
||||
|
||||
**Distribution opportunities** — sequential or parent-bloating patterns:
|
||||
- Independent reads / tool calls / operations done sequentially → batch in one message or fan out to subagents
|
||||
- "Read all files, then analyze" → delegate the reading; parent stays lean
|
||||
- Implicit-read trap (per principles file): language like "review", "acknowledge", "summarize what you have" causes the parent to read files before delegating. Fix: explicit "note paths for subagent scanning; don't read them now"
|
||||
- Subagent prompts without exact return format / "ONLY return X" / token limit → verbose results
|
||||
- Subagent-spawning-from-subagent (will fail at runtime — chain through parent)
|
||||
- Resources loaded as a single block on every activation when they could be loaded selectively
|
||||
- Dependency graph over-constrained (`after` listing things that aren't real inputs) → blocks parallelism
|
||||
- "Gather then process" for independent items → each item should process independently
|
||||
- Validation stages placed AFTER expensive operations → fail-fast lost; cheap validation should run first
|
||||
|
||||
## Output
|
||||
|
||||
Write to `{quality-report-dir}/determinism-analysis.md`. Include:
|
||||
|
||||
- **Existing scripts inventory** — what's already there (so you don't propose duplicates)
|
||||
- **Assessment** — 2-3 sentence verdict on intelligence placement and execution efficiency
|
||||
- **Script findings** — each with severity (LLM tax band), file:line, what the LLM is currently doing, what a script would do, estimated token savings, language, pre-pass potential
|
||||
- **Distribution findings** — each with severity, file:line, current pattern, efficient alternative, estimated impact
|
||||
- **Aggregate token savings** estimate
|
||||
- **Strengths** — efficient patterns worth preserving
|
||||
|
||||
Severity comes from the principles file: anything that will fail at runtime is critical; heavy LLM tax or context-bloating reads are high; missed batching is medium; small parallelization wins are low.
|
||||
|
||||
Return only the filename when complete.
|
||||
@@ -0,0 +1,55 @@
|
||||
# Quality Scan: Enhancement Opportunities
|
||||
|
||||
You are the creative imagination on this review — the one who asks **"what's missing that nobody thought of?"** when other scanners only check what's there. Inhabit the skill as different real users in different real situations, and find the moments where it would confuse, frustrate, dead-end, or underwhelm them — plus the moments where one creative addition would transform the experience.
|
||||
|
||||
**Load `references/skill-quality-principles.md` first.** Its "Patterns BMad has seen pay off" section is the institutional library you'll check the skill against.
|
||||
|
||||
This is purely advisory. Nothing here is broken; everything is opportunity.
|
||||
|
||||
## Scan Targets
|
||||
|
||||
- `SKILL.md`, stage prompts, `references/*.md` — walk the skill end-to-end as users would experience it
|
||||
|
||||
## What to Find
|
||||
|
||||
**Inhabit user archetypes** — the first-timer, the expert who knows what they want, the confused user (invoked by accident or with wrong intent), the edge-case user (technically valid but unexpected input), the hostile environment (deps fail, files missing, context limited), and **the automator** (cron / pipeline / another agent invoking this headless with pre-supplied inputs and expecting a usable return value).
|
||||
|
||||
At each stage, ask:
|
||||
|
||||
- What if the user provides partial, ambiguous, or contradictory input?
|
||||
- What if they want to skip back, change their mind, or exit cleanly mid-flow?
|
||||
- What happens if an external dependency is unavailable?
|
||||
- What if context compaction drops critical state mid-conversation?
|
||||
- Where does the skill complete but leave the user without a clear sense of what they got?
|
||||
|
||||
**Headless assessment** — many workflows are built HITL-only but could work with a flag and a pre-supplied prompt. For each interaction point, ask whether a parameter could replace the question, whether a confirmation could be skipped with a reasonable default, whether a clarification is always needed or only for ambiguous input. Categorize:
|
||||
|
||||
- **Headless-ready** — works today with minimal changes
|
||||
- **Easily adaptable** — needs a headless path on 2-3 stages
|
||||
- **Partially adaptable** — core artifact creation could be headless, but discovery is fundamentally interactive — suggest a "skip to build" entry point
|
||||
- **Fundamentally interactive** — the value IS the conversation (coaching, brainstorming, exploration). That's OK; flag and move on.
|
||||
|
||||
**Facilitative pattern check** — for any skill involving collaborative discovery or guided artifact creation, check the principles file's named patterns: soft-gate elicitation, intent-before-ingestion, capture-don't-interrupt, dual-output, parallel review lenses, three-mode architecture, graceful degradation. Flag missing ones with concrete suggestions when they'd be transformative.
|
||||
|
||||
**Delight opportunities** — quick-win mode for experts, smart defaults from context, proactive insight ("you might also want to consider..."), progress awareness in long flows, useful alternatives when things go wrong, suggestions for adjacent skills.
|
||||
|
||||
**Stay in your lane.** Don't flag structural issues (architecture scanner), efficiency or script opportunities (determinism scanner), or customization (customization scanner). Your findings should be things only a creative thinker would notice.
|
||||
|
||||
## How to Think
|
||||
|
||||
Go wild first — the weirdest user, the worst timing, the most unexpected input. No idea is too crazy in this phase. Then temper. For each wild idea, ask: is there a practical version that would actually improve the skill? If yes, distill to a sharp suggestion. If genuinely impractical, drop it — don't pad findings with fantasies.
|
||||
|
||||
Prioritize by user impact. Preventing confusion outranks adding nice-to-haves.
|
||||
|
||||
## Output
|
||||
|
||||
Write to `{quality-report-dir}/enhancement-analysis.md`. Include:
|
||||
|
||||
- **Skill understanding** — purpose, primary user, key assumptions (2-3 sentences)
|
||||
- **User journeys** — for each archetype: brief narrative, friction points, bright spots
|
||||
- **Headless assessment** — level + which interaction points could auto-resolve + what a headless invocation would need (inputs, return format)
|
||||
- **Facilitative patterns check** — present/missing, which would be most valuable to add
|
||||
- **Findings** — severity (high/medium/low-opportunity), location, what you noticed, concrete suggestion
|
||||
- **Top 2-3 insights** distilled
|
||||
|
||||
Return only the filename when complete.
|
||||
@@ -0,0 +1,182 @@
|
||||
# BMad Quality Analysis Report Creator
|
||||
|
||||
You synthesize scanner output into a unified, actionable quality report. Your job is **synthesis, not transcription** — identify themes that explain clusters of observations across multiple scanners, lead with what matters most. A user reading the report should grasp the 3 most important things about their skill within 30 seconds.
|
||||
|
||||
## Inputs
|
||||
|
||||
- `{skill-path}` — the skill being analyzed
|
||||
- `{quality-report-dir}` — directory with all scanner output and where you write the report
|
||||
|
||||
## Read
|
||||
|
||||
- `*-temp.json` — lint script output (structured findings)
|
||||
- `*-prepass.json` — pre-pass metrics
|
||||
- `*-analysis.md` — LLM scanner analyses (free-form): `architecture-analysis.md`, `determinism-analysis.md`, `customization-analysis.md`, `enhancement-analysis.md`
|
||||
|
||||
## Synthesize Themes
|
||||
|
||||
This is the most important step. Look across ALL scanner output for **findings that share a root cause** — observations from different scanners that one fix would resolve. Ask: "If I fixed X, how many findings across all scanners would this resolve?"
|
||||
|
||||
Group related findings into 3-5 themes. Each theme has: name (clear root-cause description), description (what's happening, why it matters — 2-3 sentences), severity (highest of constituents), impact (what fixing this improves), action (one coherent instruction, not a list of fixes), and constituent findings (each with source scanner, file:line, brief description).
|
||||
|
||||
Findings that don't fit any theme become standalone items.
|
||||
|
||||
## Assess Overall Quality
|
||||
|
||||
- **Grade:** Excellent (no high+ issues, few medium) / Good (some high or several medium) / Fair (multiple high) / Poor (critical issues)
|
||||
- **Narrative:** 2-3 sentences capturing the skill's primary strength and primary opportunity. This is what the user reads first.
|
||||
|
||||
## Write Two Files
|
||||
|
||||
### 1. quality-report.md
|
||||
|
||||
```markdown
# BMad Quality Analysis: {skill-name}

**Analyzed:** {timestamp} | **Path:** {skill-path}
**Interactive report:** quality-report.html

## Assessment

**{Grade}** — {narrative}

## What's Broken

{Only if critical/high issues exist. Each with file:line, what's wrong, how to fix.}

## Opportunities

### 1. {Theme Name} ({severity} — {N} observations)

{Description.} **Fix:** {One coherent action.}

**Observations:**
- {finding} — file:line
- ...

{Repeat for each theme.}

## Strengths

{What works — preserve these.}

## Detailed Analysis

### Architecture
{Assessment + findings not covered by themes (structural integrity, prose craft, cohesion).}

### Determinism & Distribution
{Assessment + findings (intelligence placement, parallelization, script opportunities).}

### Customization Surface
{Assessment + opportunities and abuse findings.}

### User Experience
{Journeys, headless assessment, facilitative-pattern check, edge cases.}

## Recommendations

1. {Highest impact — resolves N observations}
2. ...
```
|
||||
|
||||
### 2. report-data.json
|
||||
|
||||
This is consumed by `scripts/generate-html-report.py`. Use the field names exactly. Arrays may be empty `[]` but must exist.
|
||||
|
||||
```json
{
  "meta": {
    "skill_name": "the-skill-name",
    "skill_path": "/full/path/to/skill",
    "timestamp": "2026-03-26T23:03:03Z",
    "scanner_count": 6
  },
  "narrative": "2-3 sentence synthesis shown at top of report",
  "grade": "Excellent|Good|Fair|Poor",
  "broken": [
    {
      "title": "Short headline",
      "file": "relative/path.md",
      "line": 25,
      "detail": "Why it's broken and what goes wrong",
      "action": "Specific fix",
      "severity": "critical|high",
      "source": "which-scanner"
    }
  ],
  "opportunities": [
    {
      "name": "Theme name",
      "description": "What's happening and why it matters",
      "severity": "high|medium|low",
      "impact": "What fixing this achieves",
      "action": "One coherent fix instruction for the whole theme",
      "finding_count": 9,
      "findings": [
        {
          "title": "Individual observation headline",
          "file": "relative/path.md",
          "line": 42,
          "detail": "What was observed",
          "source": "which-scanner"
        }
      ]
    }
  ],
  "strengths": [
    {
      "title": "What's strong",
      "detail": "Why it matters and should be preserved"
    }
  ],
  "detailed_analysis": {
    "architecture": {
      "assessment": "1-3 sentence summary from architecture scanner",
      "findings": []
    },
    "determinism": {
      "assessment": "1-3 sentence summary from determinism scanner",
      "token_savings": "estimated total from script opportunities",
      "findings": []
    },
    "customization": {
      "assessment": "1-3 sentence summary from customization scanner",
      "posture": "opted-in|not-opted-in|over-extended",
      "findings": []
    },
    "enhancement": {
      "assessment": "1-3 sentence summary from enhancement scanner",
      "journeys": [
        {
          "archetype": "first-timer|expert|confused|edge-case|hostile-environment|automator",
          "summary": "Brief narrative of this user's experience",
          "friction_points": ["moment where user struggles"],
          "bright_spots": ["moment where skill shines"]
        }
      ],
      "autonomous": {
        "potential": "headless-ready|easily-adaptable|partially-adaptable|fundamentally-interactive",
        "notes": "Brief assessment"
      },
      "findings": []
    }
  },
  "recommendations": [
    {
      "rank": 1,
      "action": "What to do",
      "resolves": 9,
      "effort": "low|medium|high"
    }
  ]
}
```
|
||||
|
||||
Required field names: `meta.skill_name`, opportunities use `name` and `finding_count`, strengths are objects with `title` and `detail`, recommendations use `action` and numeric `rank`, journeys use `archetype` / `summary` / `friction_points` / `bright_spots`, autonomous uses `potential` / `notes`. The four `detailed_analysis` keys are `architecture`, `determinism`, `customization`, `enhancement`.
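
Because the HTML generator depends on these exact names, a post-processing check on the written file is a natural script candidate. The sketch below is illustrative only: `check_report_data` is a hypothetical helper, and treating `broken`, `opportunities`, `strengths`, and `recommendations` as the arrays that "must exist" is an interpretation of the rule above, not a confirmed contract of `scripts/generate-html-report.py`.

```python
from typing import Any

TOP_LEVEL_ARRAYS = ("broken", "opportunities", "strengths", "recommendations")
DETAILED_KEYS = ("architecture", "determinism", "customization", "enhancement")
JOURNEY_FIELDS = ("archetype", "summary", "friction_points", "bright_spots")


def _dicts(value: Any) -> list[dict]:
    """Return only the dict items of a list; anything else yields an empty list."""
    return [v for v in value if isinstance(v, dict)] if isinstance(value, list) else []


def _require(item: dict, fields: tuple[str, ...], label: str, problems: list[str]) -> None:
    for field in fields:
        if field not in item:
            problems.append(f"{label}.{field} missing")


def check_report_data(data: dict[str, Any]) -> list[str]:
    """Return a list of problems with the required field names (empty if none)."""
    problems: list[str] = []

    meta = data.get("meta")
    if not isinstance(meta, dict) or "skill_name" not in meta:
        problems.append("meta.skill_name missing")

    # Arrays may be empty but must exist.
    for key in TOP_LEVEL_ARRAYS:
        if not isinstance(data.get(key), list):
            problems.append(f"{key} must be an array")

    for i, opp in enumerate(_dicts(data.get("opportunities"))):
        _require(opp, ("name", "finding_count"), f"opportunities[{i}]", problems)
    for i, strength in enumerate(_dicts(data.get("strengths"))):
        _require(strength, ("title", "detail"), f"strengths[{i}]", problems)
    for i, rec in enumerate(_dicts(data.get("recommendations"))):
        _require(rec, ("action", "rank"), f"recommendations[{i}]", problems)
        rank = rec.get("rank")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if "rank" in rec and (isinstance(rank, bool) or not isinstance(rank, (int, float))):
            problems.append(f"recommendations[{i}].rank must be numeric")

    detailed = data.get("detailed_analysis")
    detailed = detailed if isinstance(detailed, dict) else {}
    for key in DETAILED_KEYS:
        if key not in detailed:
            problems.append(f"detailed_analysis.{key} missing")

    enhancement = detailed.get("enhancement")
    enhancement = enhancement if isinstance(enhancement, dict) else {}
    for i, journey in enumerate(_dicts(enhancement.get("journeys"))):
        _require(journey, JOURNEY_FIELDS, f"journeys[{i}]", problems)
    autonomous = enhancement.get("autonomous")
    if isinstance(autonomous, dict):
        _require(autonomous, ("potential", "notes"), "autonomous", problems)

    return problems
```

An empty list means the structural contract holds; the quality of the narrative and themes still needs a reader.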
|
||||
|
||||
Write both files to `{quality-report-dir}/`.
|
||||
|
||||
## Return
|
||||
|
||||
Return only the path to `report-data.json` when complete.
|
||||
@@ -0,0 +1,100 @@
|
||||
# Script Opportunities Reference — Workflow Builder
|
||||
|
||||
**See `references/script-standards.md` for script creation guidelines.**
|
||||
|
||||
## Core Principle
|
||||
|
||||
Scripts handle deterministic operations (validate, transform, count). Prompts handle judgment (interpret, classify, decide). If a check has clear pass/fail criteria, it belongs in a script.
|
||||
|
||||
---
|
||||
|
||||
## How to Spot Script Opportunities
|
||||
|
||||
### The Determinism Test
|
||||
|
||||
1. **Given identical input, will it always produce identical output?** → Script candidate.
|
||||
2. **Could you write a unit test with expected output?** → Definitely a script.
|
||||
3. **Requires interpreting meaning, tone, or context?** → Keep as prompt.
|
||||
|
||||
### The Judgment Boundary
|
||||
|
||||
| Scripts Handle                   | Prompts Handle                       |
| -------------------------------- | ------------------------------------ |
| Fetch, Transform, Validate       | Interpret, Classify (ambiguous)      |
| Count, Parse, Compare            | Create, Decide (incomplete info)     |
| Extract, Format, Check structure | Evaluate quality, Synthesize meaning |
|
||||
|
||||
### Signal Verbs in Prompts
|
||||
|
||||
When you see these in a workflow's requirements, think scripts first: "validate", "count", "extract", "convert/transform", "compare", "scan for", "check structure", "against schema", "graph/map dependencies", "list all", "detect pattern", "diff/changes between"
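
Spotting these verbs is itself a deterministic scan. A minimal sketch follows, assuming a hypothetical helper name and whole-word matching on base forms only, so inflections such as "validating" are deliberately missed rather than guessed at:

```python
import re

# Base forms taken from the list above; slash-joined entries are split out.
SIGNAL_PHRASES = [
    "validate", "count", "extract", "convert", "transform", "compare",
    "scan for", "check structure", "against schema", "graph dependencies",
    "map dependencies", "list all", "detect pattern", "diff", "changes between",
]


def find_signal_phrases(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, phrase) for each signal phrase found."""
    hits: list[tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in SIGNAL_PHRASES:
            # Word boundaries keep "diff" from matching "different".
            if re.search(rf"\b{re.escape(phrase)}\b", lowered):
                hits.append((lineno, phrase))
    return hits
```

A hit is only a prompt to apply the determinism test; whether the operation really belongs in a script remains a judgment.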
|
||||
|
||||
### Script Opportunity Categories
|
||||
|
||||
| Category            | What It Does                                                | Example                                            |
| ------------------- | ----------------------------------------------------------- | -------------------------------------------------- |
| Validation          | Check structure, format, schema, naming                     | Validate frontmatter fields exist                  |
| Data Extraction     | Pull structured data without interpreting meaning           | Extract all `{variable}` references from markdown  |
| Transformation      | Convert between known formats                               | Markdown table to JSON                             |
| Metrics             | Count, tally, aggregate statistics                          | Token count per file                               |
| Comparison          | Diff, cross-reference, verify consistency                   | Cross-ref prompt names against SKILL.md references |
| Structure Checks    | Verify directory layout, file existence                     | Skill folder has required files                    |
| Dependency Analysis | Trace references, imports, relationships                    | Build skill dependency graph                       |
| Pre-Processing      | Extract compact data from large files BEFORE LLM reads them | Pre-extract file metrics into JSON for LLM scanner |
| Post-Processing     | Verify LLM output meets structural requirements             | Validate generated YAML parses correctly           |
|
||||
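As one worked instance of the Data Extraction row, the sketch below pulls `{variable}` references out of markdown text. The helper name and the accepted character set for variable names are assumptions; adjust the pattern to whatever naming the skill actually uses.

```python
import re

# Letters, digits, underscore, dot, and hyphen, starting with a letter or underscore.
VARIABLE = re.compile(r"\{([A-Za-z_][A-Za-z0-9_.-]*)\}")


def extract_variables(markdown: str) -> dict[str, list[int]]:
    """Map each `{variable}` name to the 1-based line numbers where it appears."""
    found: dict[str, list[int]] = {}
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for name in VARIABLE.findall(line):
            found.setdefault(name, []).append(lineno)
    return found
```

Braces that open with a quote or a space, such as inline JSON, do not match, which keeps the output limited to template-style references.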
|
||||
### Your Toolbox
|
||||
|
||||
**Python is the default** for all script logic (cross-platform: macOS, Linux, Windows/WSL). See `references/script-standards.md` for full rationale and safe bash commands.
|
||||
|
||||
- **Python:** Full standard library (`json`, `pathlib`, `re`, `argparse`, `collections`, `difflib`, `ast`, `csv`, `xml`, etc.) plus PEP 723 inline-declared dependencies (`tiktoken`, `jsonschema`, `pyyaml`, etc.)
|
||||
- **Safe shell commands:** `git`, `gh`, `uv run`, `npm`/`npx`/`pnpm`, `mkdir -p`
|
||||
- **Avoid bash for logic** — no piping, `jq`, `grep`, `sed`, `awk`, `find`, `diff`, `wc` in scripts. Use Python equivalents instead.
|
||||
|
||||
### The --help Pattern
|
||||
|
||||
All scripts use PEP 723 metadata and implement `--help`. Prompts can reference `scripts/foo.py --help` instead of inlining interface details — single source of truth, saves prompt tokens.
|
||||
|
||||
---
|
||||
|
||||
## Script Output Standard
|
||||
|
||||
All scripts MUST output structured JSON:
|
||||
|
||||
```json
{
  "script": "script-name",
  "version": "1.0.0",
  "skill_path": "/path/to/skill",
  "timestamp": "2025-03-08T10:30:00Z",
  "status": "pass|fail|warning",
  "findings": [
    {
      "severity": "critical|high|medium|low|info",
      "category": "structure|security|performance|consistency",
      "location": { "file": "SKILL.md", "line": 42 },
      "issue": "Clear description",
      "fix": "Specific action to resolve"
    }
  ],
  "summary": {
    "total": 0,
    "critical": 0,
    "high": 0,
    "medium": 0,
    "low": 0
  }
}
```
|
||||
|
||||
### Implementation Checklist
|
||||
|
||||
- [ ] `--help` with PEP 723 metadata
|
||||
- [ ] Accepts skill path as argument
|
||||
- [ ] `-o` flag for output file (defaults to stdout)
|
||||
- [ ] Diagnostics to stderr
|
||||
- [ ] Exit codes: 0=pass, 1=fail, 2=error
|
||||
- [ ] `--verbose` flag for debugging
|
||||
- [ ] Self-contained (PEP 723 for dependencies)
|
||||
- [ ] No interactive prompts, no network dependencies
|
||||
- [ ] Valid JSON to stdout
|
||||
- [ ] Tests in `scripts/tests/`
|
||||
@@ -0,0 +1,92 @@
|
||||
# Script Creation Standards
|
||||
|
||||
When building scripts for a skill, follow these standards to ensure portability and zero-friction execution. Skills must work across macOS, Linux, and Windows (native, Git Bash, and WSL).
|
||||
|
||||
## Python Over Bash
|
||||
|
||||
**Always favor Python for script logic.** Bash is not portable — it fails or behaves inconsistently on Windows (Git Bash is MSYS2-based, not a full Linux shell; WSL bash can conflict with Git Bash on PATH; PowerShell is a different language entirely). Python with `uv run` works identically on all platforms.
|
||||
|
||||
**Safe bash commands** — these work reliably across all environments and are fine to use directly:
|
||||
|
||||
- `git`, `gh` — version control and GitHub CLI
|
||||
- `uv run` — Python script execution with automatic dependency handling
|
||||
- `npm`, `npx`, `pnpm` — Node.js ecosystem
|
||||
- `mkdir -p` — directory creation
|
||||
|
||||
**Everything else should be Python** — piping, `jq`, `grep`, `sed`, `awk`, `find`, `diff`, `wc`, and any non-trivial logic. Even `sed -i` behaves differently on macOS vs Linux. If it's more than a single safe command, write a Python script.
|
||||
|
||||
## Favor the Standard Library
|
||||
|
||||
Always prefer Python's standard library over external dependencies. The stdlib ships with every Python installation, needs no `uv run`, and adds no supply-chain risk. Common stdlib modules that cover most script needs:
|
||||
|
||||
- `json` — JSON parsing and output
|
||||
- `pathlib` — cross-platform path handling
|
||||
- `re` — pattern matching
|
||||
- `argparse` — CLI interface
|
||||
- `collections` — counters, defaultdicts
|
||||
- `difflib` — text comparison
|
||||
- `ast` — Python source analysis
|
||||
- `csv`, `xml.etree` — data formats
|
||||
|
||||
Only pull in external dependencies when the stdlib genuinely cannot do the job (e.g., `tiktoken` for accurate token counting, `pyyaml` for YAML parsing, `jsonschema` for schema validation). **External dependencies must be confirmed with the user during the build process** — they add install-time cost, supply-chain surface, and require `uv` to be available.
|
||||
|
||||
## PEP 723 Inline Metadata (Required)
|
||||
|
||||
Every Python script MUST include a PEP 723 metadata block. For scripts with external dependencies, use the `uv run` shebang:
|
||||
|
||||
```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.10"
# dependencies = ["pyyaml>=6.0", "jsonschema>=4.0"]
# ///
```
|
||||
|
||||
For scripts using only the standard library, use a plain Python shebang but still include the metadata block:
|
||||
|
||||
```python
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.10"
# ///
```
|
||||
|
||||
**Key rules:**
|
||||
|
||||
- The shebang MUST be line 1 — before the metadata block
|
||||
- Always include `requires-python`
|
||||
- List all external dependencies with version constraints
|
||||
- Never use `requirements.txt`, `pip install`, or expect global package installs
|
||||
- The shebang is a Unix convenience only — cross-platform invocation always uses `uv run scripts/foo.py`
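
The first two key rules are mechanical enough to lint. The sketch below uses a hypothetical helper and a simplified pattern for the metadata block; it does not parse the TOML inside the block and is not the reference regular expression from PEP 723.

```python
import re

# "# /// script", then comment lines, then a closing "# ///".
BLOCK = re.compile(r"^# /// script\n((?:#(?: .*)?\n)*?)# ///$", re.MULTILINE)


def check_script_header(source: str) -> list[str]:
    """Return header problems for a Python script's source text (empty if none)."""
    problems: list[str] = []
    lines = source.splitlines()
    if not lines or not lines[0].startswith("#!"):
        problems.append("shebang must be line 1")

    match = BLOCK.search(source)
    if match is None:
        problems.append("PEP 723 script block missing")
        return problems

    # group(1) holds the comment lines between the opening and closing markers.
    if not re.search(r"^# requires-python\s*=", match.group(1), re.MULTILINE):
        problems.append("requires-python missing")
    return problems
```

Checking that dependencies carry version constraints would need the block's TOML parsed, which is left out of this sketch.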
|
||||
|
||||
## Invocation in SKILL.md
|
||||
|
||||
How a built skill's SKILL.md should reference its scripts:
|
||||
|
||||
- **Scripts with external dependencies:** `uv run scripts/analyze.py {args}`
|
||||
- **Stdlib-only scripts:** `python3 scripts/scan.py {args}` (also fine to use `uv run` for consistency)
|
||||
|
||||
`uv run` reads the PEP 723 metadata, silently caches dependencies in an isolated environment, and runs the script — no user prompt, no global install. Like `npx` for Python.
|
||||
|
||||
## Graceful Degradation
|
||||
|
||||
Skills may run in environments where Python or `uv` is unavailable (e.g., claude.ai web). Scripts should be the fast, reliable path — but the skill must still deliver its outcome when execution is not possible.
|
||||
|
||||
**Pattern:** When a script cannot execute, the LLM performs the equivalent work directly. The script's `--help` documents what it checks, making this fallback natural. Design scripts so their logic is understandable from their help output and the skill's context.
|
||||
|
||||
In SKILL.md, frame script steps as outcomes, not just commands:
|
||||
|
||||
- Good: "Validate path conventions (run `scripts/scan-paths.py --help` for details)"
|
||||
- Avoid: "Execute `python3 scripts/scan-paths.py`" with no context about what it does
|
||||
|
||||
## Script Interface Standards
|
||||
|
||||
- Implement `--help` via `argparse` (single source of truth for the script's API)
|
||||
- Accept target path as a positional argument
|
||||
- `-o` flag for output file (default to stdout)
|
||||
- Diagnostics and progress to stderr
|
||||
- Exit codes: 0=pass, 1=fail, 2=error
|
||||
- `--verbose` flag for debugging
|
||||
- Output valid JSON to stdout
|
||||
- No interactive prompts, no network dependencies
|
||||
- Tests in `scripts/tests/`
|
||||
@@ -0,0 +1,230 @@
|
||||
# Skill Quality Principles
|
||||
|
||||
What earns its place in a BMad skill, and what should be cut. Loaded at both build time (so the author follows the bar upfront) and at quality-analysis time (so scanners verify against the same bar).
|
||||
|
||||
## The Core Test
|
||||
|
||||
For every line you write or review: **would an LLM do this correctly without being told?** If yes, cut it. The instruction must earn its place by preventing a failure that would otherwise happen.
|
||||
|
||||
## What Earns Its Keep
|
||||
|
||||
The model already knows how to facilitate, ask questions, write prose, parse intent, and format markdown. Spend file weight on:
|
||||
|
||||
- **Project paths and outputs** — `{project-root}/...`, config-resolved paths, where the artifact lands.
|
||||
- **Schema** — frontmatter format, customize.toml shape, downstream contracts.
|
||||
- **BMad-specific conventions** — naming (`bmad-` prefix, module prefixes), description format, intelligence placement.
|
||||
- **Hard rules with body count** — the implicit-read trap, subagent-can't-spawn-subagent, compaction survival.
|
||||
- **Fragile-operation invocations** — exact script commands, exact API calls. One right way.
|
||||
- **Domain framing and theory-of-mind** for interactive workflows — context that enables judgment.
|
||||
- **Design rationale** for non-obvious choices — prevents the LLM from "optimizing" away constraints it doesn't understand.
|
||||
|
||||
## What Doesn't Earn Its Keep
|
||||
|
||||
- Numbered procedural steps for things the LLM does naturally
|
||||
- Per-platform adapter files for tools the LLM speaks fluently
|
||||
- Scoring formulas, weighted calibration tables, decision matrices for subjective judgment
|
||||
- Templates teaching output formatting, greeting users, or prompt assembly
|
||||
- "Why It Matters" prose attached to obvious checks
|
||||
- Defensive padding ("make sure", "don't forget", "remember to")
|
||||
- Meta-explanation ("This workflow is designed to...")
|
||||
- Bot personas with rubrics where role + outcome would do the same job
|
||||
- Explaining the model to itself ("You are an AI that...")
|
||||
- Multiple files that could be a single instruction
|
||||
|
||||
## Outcome vs Prescriptive
|
||||
|
||||
| Prescriptive (avoid) | Outcome-based (prefer) |
|
||||
| --- | --- |
|
||||
| "Step 1: Ask about goals. Step 2: Ask about constraints. Step 3: Summarize and confirm." | "Ensure the user's vision is fully captured — goals, constraints, and edge cases — before proceeding." |
|
||||
| "Load config. Read user_name. Read communication_language. Greet by name in their language." | "Load available config and greet the user appropriately." |
|
||||
| "Create a file. Write the header. Write section 1. Write section 2. Save." | "Produce a report covering X, Y, and Z." |
|
||||
|
||||
The prescriptive versions miss requirements the author didn't think of. The outcome-based versions let the LLM adapt.
|
||||
|
||||
## When Procedure IS Value
|
||||
|
||||
Reserve exact steps for fragile operations where deviation has consequences:
|
||||
|
||||
- Exact script invocations (`python3 scripts/foo.py {arg}`)
|
||||
- Specific file paths and config keys
|
||||
- API calls with precise parameters
|
||||
- Security-critical operations
|
||||
- The customize.toml resolver step
|
||||
|
||||
| Freedom | When | Example |
|
||||
| --- | --- | --- |
|
||||
| **High** (outcomes) | Multiple valid approaches, LLM judgment adds value | "Ensure the user's requirements are complete" |
|
||||
| **Medium** (guided) | Preferred approach exists, some variation OK | "Present findings in a structured report with an executive summary" |
|
||||
| **Low** (exact) | Fragile, one right way, consequences for deviation | `python3 scripts/scan-path-standards.py {skill-path}` |
|
||||
|
||||
## BMad Institutional Knowledge
|
||||
|
||||
Things the bare model genuinely won't know. This is what your file weight buys.
|
||||
|
||||
### Naming
|
||||
- Skill name = folder name (kebab-case)
|
||||
- Module skill: `{module-code}-{name}` (e.g. `bmm-create-prd`, `cis-brainstorm`)
|
||||
- Standalone: `{name}`
|
||||
- The `bmad-` prefix is reserved for official BMad creations
|
||||
|
||||
### Description format
|
||||
Two parts: `[5-8 word summary]. [Use when user says 'specific phrase' or 'specific phrase'.]`
|
||||
|
||||
Quote the trigger phrases. Default to conservative (explicit) triggering — most BMad skills are explicitly invoked. Organic triggering is reserved for skills that should activate on context (e.g. "Trigger when code imports anthropic SDK").
|
||||
|
||||
Bad: `Helps with PRDs and product requirements.` (too vague — hijacks unrelated conversations).
|
||||
|
||||
### Path conventions
|
||||
All file references in a skill use bare paths from the skill root. The canonical Conventions block (from `bmad-prfaq/SKILL.md`) — stamp it into any SKILL.md that references multiple internal files:
|
||||
|
||||
```
|
||||
## Conventions
|
||||
- Bare paths (e.g. `references/press-release.md`) resolve from the skill root.
|
||||
- `{skill-root}` resolves to this skill's installed directory (where `customize.toml` lives).
|
||||
- `{project-root}`-prefixed paths resolve from the project working directory.
|
||||
- `{skill-name}` resolves to the skill directory's basename.
|
||||
```
|
||||
|
||||
Additional rules:
|
||||
- Forward slashes only (cross-platform).
|
||||
- Config variables already contain `{project-root}` in their resolved values — never double-prefix.
|
||||
- `references/` is for prompt content carved out of SKILL.md. `assets/` is for templates and other static content the workflow loads. `scripts/` is for deterministic code. Never put workflow content directly at skill root.
|
||||
|
||||
### Customization (customize.toml)
|
||||
Always-present fields: `activation_steps_prepend`, `activation_steps_append`, `persistent_facts` (each is an array; overrides append).
|
||||
|
||||
Workflow-specific scalars (lifted during configurability discovery):
|
||||
- `<purpose>_template` for template file paths
|
||||
- `<purpose>_output_path` for writable destinations
|
||||
- `on_<event>` for hook scalars
|
||||
|
||||
Arrays of tables MUST key on `code` or `id` (resolver merges by key; without it, falls back to append-only).
|
||||
|
||||
Merge rules: scalars override, tables deep-merge, arrays-of-tables key-merge, plain arrays append.
|
||||
|
||||
Override files: `{project-root}/_bmad/custom/{skill-name}.toml` (team), `{project-root}/_bmad/custom/{skill-name}.user.toml` (personal). Merge order: base → team → user.
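
The merge rules and merge order can be sketched as below. This is not the BMad resolver; it is a minimal illustration that works on plain dicts standing in for parsed TOML. Two points are assumptions: that a key-matched table entry is deep-merged with the base entry (it might instead be replaced wholesale), and the `reviewers` field, which is hypothetical.

```python
def item_key(item):
    """Return ("code" | "id", value) for a keyed table entry, else None."""
    if isinstance(item, dict):
        for field in ("code", "id"):
            if field in item:
                return (field, item[field])
    return None


def merge_arrays(base, override):
    """Key-merge arrays of keyed tables; otherwise append."""
    if not all(item_key(item) is not None for item in base + override):
        return base + override  # plain arrays or unkeyed tables: append-only
    merged = {item_key(item): item for item in base}
    for item in override:
        key = item_key(item)
        # Assumption: a matching entry is deep-merged, not replaced.
        merged[key] = merge(merged[key], item) if key in merged else item
    return list(merged.values())


def merge(base, override):
    """Scalars override, tables deep-merge, arrays follow merge_arrays."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for key, value in override.items():
            merged[key] = merge(base[key], value) if key in base else value
        return merged
    if isinstance(base, list) and isinstance(override, list):
        return merge_arrays(base, override)
    return override  # scalar: override wins


def resolve(base, team, user):
    """Apply the merge order base, then team, then user."""
    return merge(merge(base, team), user)


# Illustrative data only.
BASE = {"workflow": {
    "brief_template": "assets/brief-template.md",
    "persistent_facts": ["file:{project-root}/**/project-context.md"],
    "reviewers": [{"code": "skeptic", "focus": "risks"}],
}}
TEAM = {"workflow": {
    "brief_template": "{project-root}/docs/templates/brief.md",
    "persistent_facts": ["Briefs are reviewed weekly."],
    "reviewers": [{"code": "skeptic", "focus": "cost"}, {"code": "legal", "focus": "compliance"}],
}}
RESOLVED = resolve(BASE, TEAM, {})
```

Here the team template path replaces the base scalar, `persistent_facts` ends up with both entries, and the `skeptic` reviewer is merged by its `code` while `legal` is added.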
|
||||
|
||||
Default `persistent_facts`: `["file:{project-root}/**/project-context.md"]` is BMad's convention.
|
||||
|
||||
SKILL.md must reference resolved values as `{workflow.<name>}`. Hardcoded paths next to a declared scalar = override silently no-ops.
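
A deterministic check for the silent no-op could look like the sketch below. It is a hypothetical helper, not part of the BMad scanner suite: `find_hardcoded_scalars` and its inputs are names invented for illustration, and the scalars are passed in as an already-parsed dict.

```python
def find_hardcoded_scalars(scalars: dict[str, str], skill_md: str) -> list[str]:
    """Return names of declared scalars whose default value is hardcoded in SKILL.md.

    A hardcoded default means an override of that scalar would have no effect
    at that spot, because the text never reads `{workflow.<name>}` there.
    """
    flagged = []
    for name, default in scalars.items():
        # Empty defaults (e.g. an unset hook) cannot be meaningfully searched for.
        if default and default in skill_md:
            flagged.append(name)
    return flagged


declared = {"brief_template": "assets/brief-template.md", "on_complete": ""}
broken = "Load the brief template from `assets/brief-template.md`."
fixed = "Load the brief template from `{workflow.brief_template}`."
```

With these inputs, `find_hardcoded_scalars(declared, broken)` flags `brief_template`, while the `fixed` text passes.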
|
||||
|
||||
### Intelligence placement
|
||||
- Scripts handle plumbing: fetch, parse, validate, count, transform.
|
||||
- Prompts handle judgment: interpret, classify, decide.
|
||||
- Script using regex to decide what content MEANS = intelligence leak into the script.
|
||||
- Prompt validating structure, counting items, comparing against schemas = determinism leak into the LLM.
|
||||
|
||||
### Workflows: inline first, carve out only when needed
|
||||
Default: write the entire workflow as named sections in SKILL.md (`## Discovery`, `## Constraints`, `## Finalize`, etc.). A multi-stage coaching workflow can live in one SKILL.md.
|
||||
|
||||
Carve out to `references/` only when SKILL.md genuinely gets too big to scan. When you do:
|
||||
- **Descriptive filenames.** `references/press-release.md`, `references/customer-faq.md`. Never numbered prefixes (`01-press-release.md`) — the carve-out is a section, not a "step." SKILL.md routes to references by name and the order is whatever SKILL.md specifies.
|
||||
- Each carved-out file works standalone — context compaction can drop SKILL.md mid-flow. No "as described in the overview."
|
||||
- Progression conditions, where they exist, must be testable ("when X is captured, route to Y"). "When ready" is vague.
|
||||
- The file uses `{communication_language}` (and `{document_output_language}` if it produces a doc).
|
||||
- There are NO exit hooks in the system. Don't add `## On Exit` sections — they'd never run.
|
||||
|
||||
### Headless mode
|
||||
|
||||
When a skill supports headless invocation, the decision log absorbs every assumption made without the user — intent inference, proposed names, customization defaults, conflict resolutions, lint-fix calls, anything the user would have weighed in on interactively. The JSON return is the smallest set of paths the caller needs (typically `skill` + `decision_log`, plus the report path for analysis flows); the log carries the reasoning. `status` is `complete` or `blocked`; on `blocked`, include a one-line `reason` and still return the log path so the caller can read the detail. Without this discipline, headless silently buries its calls and the audit trail breaks on the next session.
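
As a rough sketch of the return shape described above: the keys `status`, `skill`, `decision_log`, and `reason` come from the text, while the `report` key name and the example paths are assumptions for illustration.

```python
import json
from typing import Optional


def headless_result(
    skill: str,
    decision_log: str,
    report: Optional[str] = None,
    blocked_reason: Optional[str] = None,
) -> str:
    """Build the minimal JSON return for a headless run."""
    result = {
        "status": "blocked" if blocked_reason else "complete",
        "skill": skill,
        # The log path is returned even when blocked, so the caller can read the detail.
        "decision_log": decision_log,
    }
    if report:
        result["report"] = report  # assumed key name, for analysis flows
    if blocked_reason:
        result["reason"] = blocked_reason  # one line; the log carries the rest
    return json.dumps(result)


# Hypothetical paths, for illustration only.
done = headless_result("skills/my-skill", "skills/my-skill/.decision-log.md")
stuck = headless_result(
    "skills/my-skill",
    "skills/my-skill/.decision-log.md",
    blocked_reason="target folder already exists",
)
```

The payload stays small on purpose: reasoning belongs in the decision log, not in the JSON.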
|
||||
|
||||
### Subagent constraints
|
||||
- Subagents CANNOT spawn other subagents. Chain through parent.
|
||||
- Don't read files in parent if you can delegate the read — parent stays lean.
|
||||
- Subagent prompts must specify exact return format and "ONLY return X" constraint, or you get verbose prose.
|
||||
- **The implicit-read trap:** Language like "review", "acknowledge", "summarize what you have" causes the parent to read files even when you didn't ask for it. If a later stage delegates document analysis, earlier stages must NOT use that language. Use "note paths for subagent scanning; don't read them now".
|
||||
|
||||
### Size guidance
|
||||
Production targets, not hard limits. The "what fails if I delete this?" test still applies to every line.
|
||||
|
||||
- SKILL.md: ~80 lines target, hard ceiling ~130
|
||||
- Multi-branch SKILL.md: up to ~250 lines if each branch has brief contextual explanation
|
||||
- Single-purpose: up to ~500 lines (~5000 tokens) if focused
|
||||
- Past those: lift to `references/` or `assets/`
|
||||
|
||||
### Patterns BMad has seen pay off
|
||||
Institutional names for patterns the LLM won't generate by default:
|
||||
|
||||
- **Open-floor opening** — Conversational skills start with an explicit invitation for the user to share everything they have (goals, references, examples, paths to artifacts) before any structured Q&A. The dump replaces most of the question script that would otherwise follow; the agent then asks only what's missing. The form adapts to input — vague request gets "tell me everything", path/URL gets "what do you want focused on?". Costs almost nothing token-wise; drastically improves conversational feel.
|
||||
- **Soft-gate elicitation** — "Anything else, or shall we move on?" at natural transitions. Users always remember one more thing when given a graceful exit.
|
||||
- **Intent-before-ingestion** — Understand why the user is here before scanning artifacts. Without intent, scanning is noise.
|
||||
- **Capture-don't-interrupt** — Out-of-scope insights mid-flow get captured silently, not redirected. Users in flow share their best stuff unprompted.
|
||||
- **Dual-output** — Human artifact + LLM distillate, when the artifact will feed downstream agents.
|
||||
- **Parallel review lenses** — Fan out 2-3 review subagents (skeptic, opportunity-spotter, contextually-chosen lens) before finalizing significant artifacts.
|
||||
- **Three-mode architecture** — Guided / Yolo / Headless. Not all skills need all three; considering it during design prevents lock-in.
|
||||
- **Graceful degradation** — Subagent-dependent features fall back to sequential when subagents are unavailable.
|
||||
- **Decision-Log Workspace** — multi-turn workflows producing revisable artifacts. The decision log is the load-bearing artifact (carries identity across sessions, prevents railroading, audits overrides). Subsumes "document-as-cache" — see full treatment below.
|
||||
|
||||
### Writing
|
||||
- One term per concept; pick it and stick to it.
|
||||
- Third person in descriptions ("Processes files", not "I help process files").
|
||||
- Descriptive file names (`form-validation-rules.md`, not `doc2.md`).
|
||||
- One level deep for reference files: SKILL.md → reference, never SKILL.md → ref → ref chains.
|
||||
|
||||
## The Decision-Log Workspace Pattern
|
||||
|
||||
This pattern is the default for any multi-turn workflow that produces a substantive artifact, may be revisited (Update or Validate), or risks running long enough to compact.
|
||||
|
||||
**Core insight.** The decision log is the load-bearing artifact, not the document. The document is what the user takes; the decision log is what carries identity across sessions, prevents the agent from railroading the user, surfaces conflicts on update, and creates an audit trail when the user overrides their own past calls. Workflows that lack it look fine on the first pass and fall apart on revisit.
|
||||
|
||||
### Workspace layout
|
||||
|
||||
All files live in a single folder rooted at the primary artifact. Two cases:
|
||||
|
||||
- **The artifact is a single document** (a brief, a PRFAQ, etc.) → the workspace is the document's containing folder; the log + addendum + distillate sit as peers of the document.
|
||||
- **The artifact is itself a folder of files** (a built skill, a generated module) → the workspace IS the artifact's folder; the log + addendum sit as peers of the primary file (e.g. `SKILL.md`).
|
||||
|
||||
Either way, the workspace exists from the moment intent is confirmed — not at the end. The user knows the path immediately; state lives on disk, not in the conversation.
|
||||
|
||||
- `<primary>` — the artifact (or, for folder-artifacts, the primary file like `SKILL.md`). YAML frontmatter is the recoverable-state mechanism when the workflow needs it; fields are workflow-specific (the LLM picks what each workflow benefits from — some need none).
|
||||
- `.decision-log.md` — every meaningful decision and why, with alternatives considered. Append-only across sessions, with date-stamped session headings. Can carry its own frontmatter for session state when that's useful.
|
||||
- `addendum.md` — context the user surfaced that didn't earn a place in the primary (rejected alternatives, parked roadmap, options-considered matrices, in-depth personas). Created only when something earns its place.
|
||||
- `distillate.md` *(optional)* — token-efficient version of the primary for downstream LLM consumers.
|
||||
|
||||
### Resume protocol
|
||||
|
||||
On activation, check whether a workspace already exists for this artifact. If found, surface it (with the `updated` timestamp from the primary's frontmatter) and offer to resume. Reading `.decision-log.md` recovers full context regardless of compaction.
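
The detection half of this protocol can be sketched as follows. In practice the agent may simply do this by reading the folder; the helper names below are hypothetical, and the frontmatter is scanned line by line rather than parsed with a YAML library, which only handles a flat `updated: value` field.

```python
from pathlib import Path
from typing import Optional


def read_updated(primary: Path) -> Optional[str]:
    """Return the `updated` value from the primary's frontmatter, if present."""
    lines = primary.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter block
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "updated":
            return value.strip().strip("\"'")
    return None


def find_workspace(primary: Path) -> Optional[dict]:
    """Report an existing workspace: the primary plus a peer .decision-log.md."""
    log = primary.parent / ".decision-log.md"
    if not (primary.is_file() and log.is_file()):
        return None
    return {
        "primary": str(primary),
        "decision_log": str(log),
        "updated": read_updated(primary),
    }
```

If `find_workspace` returns a result, the workflow would surface the path and timestamp and offer to resume; otherwise it starts a fresh workspace.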
|
||||
|
||||
### Update mode
|
||||
|
||||
Read `.decision-log.md` and the addendum first. The change request enters as a "change signal" against the standing record. If the change contradicts a prior decision, surface the conflict before applying. Every change — clean or override — gets a new decision-log entry. Overrides also write to the addendum: the rejected reasoning needs to live somewhere.
|
||||
|
||||
### Validate mode
|
||||
|
||||
Read `.decision-log.md` first. A validation that ignores prior decisions or stated user criteria is shallow; it should challenge the artifact against the standards the user themselves set, not against generic rubrics.
|
||||
|
||||
### Finalize step
|
||||
|
||||
Decision-log audit. Every meaningful entry must be either captured in the primary, captured in the addendum, or explicitly set aside as process noise. The user ends the session with a shared accounting of how their thinking was handled — not a one-sided polish-and-deliver.
|
||||
|
||||
### When NOT to use
|
||||
|
||||
- Simple Utilities (no decisions to log; the input/output IS the contract).
|
||||
- One-shot code operations (the diff is the decision log).
|
||||
- Purely conversational skills (no artifact persists).
|
||||
|
||||
### Treatment style (writing it into a skill)
|
||||
|
||||
State the principle once where it first applies — typically inside the Create intent description as a single clause ("write the primary skeleton and `.decision-log.md` to the workspace; the decision log is canonical memory"). Mention reads at the moments that matter: Update reads decisions before changing them, Validate reads them before critiquing, Finalize audits the log at handoff. That's the entire treatment.
|
||||
|
||||
Do NOT:
|
||||
- Open with a "Decision-log discipline" enumeration of what kinds of things to log — the LLM knows. Trust it.
|
||||
- Write a separate `## Workspace` section header with meta-explanation of the pattern.
|
||||
- Include a tree diagram of the workspace layout — the workspace is just files; the LLM names them as it uses them.
|
||||
- Prescribe a YAML frontmatter schema for the decision log — fields are workflow-specific; let the building LLM pick what each workflow needs (or skip frontmatter entirely).
|
||||
- Split workspace creation into separate "for new" / "for existing" sub-sections — "create if absent, append a new session heading if present" is one sentence.
|
||||
|
||||
The scanner flags skills that bury Decision-Log Workspace (DLW) guidance under ceremony. `bmad-product-brief` is the canonical example of a brief treatment: ~5 sentences total, threaded through Create / Update / Validate / Constraints / Finalize at the points where each matters.
|
||||
|
||||
## Failure Modes With Body Count
|
||||
|
||||
- **Description over-broadens** → Skill hijacks unrelated conversations. Fix: quote trigger phrases.
|
||||
- **Vague progression conditions** ("when ready") → Stage never advances or advances early. Fix: testable conditions.
|
||||
- **Stage references SKILL.md** ("as above") → Breaks on compaction. Fix: stages self-contained.
|
||||
- **Subagent prompt without explicit return format** → Verbose prose responses. Fix: "Return ONLY {schema}. No other output."
|
||||
- **Parent reads then delegates analysis** → Context bloat that makes delegation pointless. Fix: delegate the read.
|
||||
- **Implicit-read trap** in a stage that precedes subagent delegation → Parent reads everything anyway. Fix: explicit "don't read these now".
|
||||
- **Scoring formulas for subjective judgment** → Rigidity that doesn't improve quality. Fix: state the outcome, let the model assess.
|
||||
- **Boolean toggles in customize.toml** → Author didn't decide what the skill does; surface becomes a permutation forest. Fix: pick a default; users fork if they want the other shape.
|
||||
- **Hardcoded path in SKILL.md while customize.toml declares the scalar** → Override silently does nothing. Fix: SKILL.md must read `{workflow.<name>}`.
|
||||
- **Identity / communication-style / principles in `[workflow]`** → Workflow wants to be an agent. Fix: point author at agent-builder; remove from workflow surface.
|
||||
@@ -0,0 +1,196 @@
|
||||
# Standard Workflow/Skill Fields
|
||||
|
||||
## Frontmatter Fields
|
||||
|
||||
Only these fields go in the YAML frontmatter block:
|
||||
|
||||
| Field | Description | Example |
|
||||
| ------------- | ---------------------------------------------------- | --------------------------------------------- |
|
||||
| `name` | Full skill name (kebab-case, same as folder name) | `validate-json`, `cis-brainstorm` |
|
||||
| `description` | [5-8 word summary]. [Use when user says 'X' or 'Y'.] | See Description Format below |
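
The two frontmatter rules that are purely structural (only these fields, and a kebab-case `name` equal to the folder name) lend themselves to a script. The sketch below assumes the frontmatter has already been parsed into a dict; `frontmatter_problems` is a hypothetical helper, not an existing BMad scanner.

```python
import re

ALLOWED_FIELDS = {"name", "description"}
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def frontmatter_problems(fields: dict[str, str], folder_name: str) -> list[str]:
    """List structural problems in a SKILL.md frontmatter block."""
    problems = []
    for extra in sorted(set(fields) - ALLOWED_FIELDS):
        problems.append(f"unexpected field: {extra}")
    for missing in sorted(ALLOWED_FIELDS - set(fields)):
        problems.append(f"missing field: {missing}")
    name = fields.get("name", "")
    if name and not KEBAB_CASE.match(name):
        problems.append(f"name is not kebab-case: {name}")
    if name and name != folder_name:
        problems.append(f"name does not match folder: {name} != {folder_name}")
    return problems
```

Whether the description's trigger phrases are specific enough is a judgment call and stays with the prompt, in line with the intelligence-placement rule.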
|
||||
|
||||
## Content Fields (All Types)
|
||||
|
||||
These are used within the SKILL.md body — never in frontmatter:
|
||||
|
||||
| Field | Description | Example |
|
||||
| --------------- | ----------------------------- | --------------------------------- |
|
||||
| `role-guidance` | Brief expertise primer | "Act as a senior DevOps engineer" |
|
||||
| `module-code` | Module code (if module-based) | `bmb`, `cis` |
|
||||
|
||||
## Simple Utility Fields
|
||||
|
||||
| Field | Description | Example |
|
||||
| --------------- | ----------------------------------- | ------------------------------------------- |
|
||||
| `input-format` | What it accepts | JSON file path, stdin text |
|
||||
| `output-format` | What it returns | Validated JSON, error report |
|
||||
| `standalone` | Fully standalone, no config needed? | true/false |
|
||||
| `composability` | How other skills use it | "Called by quality scanners for validation" |
|
||||
|
||||
## Simple Workflow Fields
|
||||
|
||||
| Field | Description | Example |
|
||||
| ------------ | --------------------- | ----------------------------------------- |
|
||||
| `steps` | Numbered inline steps | "1. Load config 2. Read input 3. Process" |
|
||||
| `tools-used` | CLIs/tools/scripts | gh, jq, python scripts |
|
||||
| `output` | What it produces | PR, report, file |
|
||||
|
||||
## Complex Workflow Fields
|
||||
|
||||
| Field | Description | Example |
|
||||
| ------------------------ | --------------------------------- | ------------------------------------- |
|
||||
| `stages` | Named stages (descriptive names, no numbered prefixes) | "discover, plan, build" |
|
||||
| `progression-conditions` | When stages complete | "User approves outline" |
|
||||
| `headless-mode` | Supports autonomous? | true/false |
|
||||
| `config-variables` | Beyond core vars | `planning_artifacts`, `output_folder` |
|
||||
| `output-artifacts` | What it creates (output-location) | "PRD document", "agent skill" |
|
||||
|
||||
## Customization Surface (`customize.toml`, opt-in)
|
||||
|
||||
Emitted only when the skill author opts in during Phase 3.5 (Configurability Discovery). The file sits next to SKILL.md and is loaded via `{project-root}/_bmad/scripts/resolve_customization.py` at activation.
|
||||
|
||||
### Always-present fields (when opted in)
|
||||
|
||||
| Field | Type | Purpose |
|
||||
| -------------------------- | ------------- | -------------------------------------------------------------------------- |
|
||||
| `activation_steps_prepend` | array[string] | Steps run before standard activation. Overrides append. |
|
||||
| `activation_steps_append` | array[string] | Steps run after the greeting, before the workflow's first stage. Overrides append. |
|
||||
| `persistent_facts` | array[string] | Facts (literal or `file:` prefixed paths/globs) loaded on activation. Overrides append. |
|
||||
|
||||
### Workflow-specific scalars (lifted during Phase 3.5)
|
||||
|
||||
Named by purpose and suffix. Override wins (scalar merge rule).
|
||||
|
||||
| Naming pattern | Use for | Example |
|
||||
| ------------------- | ---------------------------------------------------- | --------------------------------------------------- |
|
||||
| `<purpose>_template` | File path for templates the workflow loads | `brief_template = "assets/brief-template.md"` |
|
||||
| `<purpose>_output_path` | Writable destination paths | `brief_output_path = "{project-root}/docs/briefs"` |
|
||||
| `on_<event>` | Prompt or command executed at a hook point | `on_complete = ""` |
|
||||
|
||||
**Path resolution within scalar values:**
|
||||
|
||||
- Bare paths (e.g. `assets/brief-template.md`) resolve from the skill root.
|
||||
- `{project-root}/...` resolves from the project working directory — use for org-owned overrides.
|
||||
- Never mix `{project-root}` with config variables that already contain it (no double-prefix).
|
||||
|
||||
### How SKILL.md references the resolved values
|
||||
|
||||
After the resolver step runs, read customized values as `{workflow.<name>}`:
|
||||
|
||||
```markdown
|
||||
Load the brief template from `{workflow.brief_template}`.
|
||||
```
|
||||
|
||||
At runtime, that resolves to whatever the merged `[workflow].brief_template` scalar is — the default, a team override, or a personal override.
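
To make the substitution concrete, here is a minimal sketch of how `{workflow.<name>}` placeholders could be filled from the merged `[workflow]` values. It is an illustration only, not the behavior of `resolve_customization.py`; the function name and the choice to leave unknown names untouched are assumptions.

```python
import re

PLACEHOLDER = re.compile(r"\{workflow\.([A-Za-z0-9_]+)\}")


def resolve_placeholders(text: str, workflow: dict[str, str]) -> str:
    """Replace {workflow.<name>} with the merged value for <name>."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Assumption: unknown names are left as-is rather than raising.
        return str(workflow.get(name, match.group(0)))

    return PLACEHOLDER.sub(substitute, text)


line = "Load the brief template from `{workflow.brief_template}`."
default = resolve_placeholders(line, {"brief_template": "assets/brief-template.md"})
overridden = resolve_placeholders(
    line, {"brief_template": "{project-root}/docs/templates/brief.md"}
)
```

The same SKILL.md line yields the default path or the team path depending only on the merged values, which is why a hardcoded path would bypass the override.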
|
||||
|
||||
### Override files
|
||||
|
||||
Teams and users never edit the skill's own `customize.toml`; they place their overrides in these files instead:
|
||||
|
||||
- Team: `{project-root}/_bmad/custom/{skill-name}.toml`
|
||||
- Personal: `{project-root}/_bmad/custom/{skill-name}.user.toml`
|
||||
|
||||
Both use the same `[workflow]` block shape. Merge order: base (skill's `customize.toml`) → team → user.
|
||||
|
||||
## Overview Section Format
|
||||
|
||||
The Overview is the first section after the title — it primes the AI for everything that follows.
|
||||
|
||||
**3-part formula:**
|
||||
|
||||
1. **What** — What this workflow/skill does
|
||||
2. **How** — How it works (approach, key stages)
|
||||
3. **Why/Outcome** — Value delivered, quality standard
|
||||
|
||||
**Templates by skill type:**
|
||||
|
||||
**Complex Workflow:**
|
||||
|
||||
```markdown
|
||||
This skill helps you {outcome} through {approach}. Act as {role-guidance}, guiding users through {key stages}. Your output is {deliverable}.
|
||||
```
|
||||
|
||||
**Simple Workflow:**
|
||||
|
||||
```markdown
|
||||
This skill {what it does} by {approach}. Act as {role-guidance}. Use when {trigger conditions}. Produces {output}.
|
||||
```
|
||||
|
||||
**Simple Utility:**
|
||||
|
||||
```markdown
|
||||
This skill {what it does}. Use when {when to use}. Returns {output format} with {key feature}.
|
||||
```
|
||||
|
||||
## SKILL.md Description Format
|
||||
|
||||
The frontmatter `description` is the PRIMARY trigger mechanism — it determines when the AI invokes this skill. Most BMad skills are **explicitly invoked** by name (`/skill-name` or direct request), so descriptions should be conservative to prevent accidental triggering.
|
||||
|
||||
**Format:** Two parts, one sentence each:
|
||||
|
||||
```
|
||||
[What it does in 5-8 words]. [Use when user says 'specific phrase' or 'specific phrase'.]
|
||||
```
|
||||
|
||||
**The trigger clause** uses one of these patterns depending on the skill's activation style:
|
||||
|
||||
- **Explicit invocation (default):** `Use when the user requests to 'create a PRD' or 'edit an existing PRD'.` — Quotes around specific phrases the user would actually say. Conservative — won't fire on casual mentions.
|
||||
- **Organic/reactive:** `Trigger when code imports anthropic SDK, or user asks to use Claude API.` — For lightweight skills that should activate on contextual signals, not explicit requests.
|
||||
|
||||
**Examples:**
|
||||
|
||||
Good (explicit): `Builds workflows and skills through conversational discovery. Use when the user requests to 'build a workflow', 'modify a workflow', or 'quality check workflow'.`
|
||||
|
||||
Good (organic): `Initializes BMad project configuration. Trigger when any skill needs module-specific configuration values, or when setting up a new BMad project.`
|
||||
|
||||
Bad: `Helps with PRDs and product requirements.` — Too vague, would trigger on any mention of PRD even in passing conversation.
|
||||
|
||||
Bad: `Use on any mention of workflows, building, or creating things.` — Over-broad, would hijack unrelated conversations.
|
||||
|
||||
**Default to explicit invocation** unless the user specifically describes organic/reactive activation during discovery.
|
||||
|
||||
## Role Guidance Format
|
||||
|
||||
Every generated workflow SKILL.md includes a brief role statement in the Overview or as a standalone line:
|
||||
|
||||
```markdown
|
||||
Act as {role-guidance}. {brief expertise/approach description}.
|
||||
```
|
||||
|
||||
This provides quick prompt priming for expertise and tone. Workflows may also use full Identity/Communication Style/Principles sections when personality serves the workflow's purpose.
|
||||
|
||||
## Path Rules
|
||||
|
||||
### Skill-Internal References
|
||||
|
||||
Use bare paths from the skill root for any file inside this skill — including same-folder references between two files in `references/` or two files in `scripts/`:
|
||||
|
||||
- `references/build-process.md`
|
||||
- `references/standard-fields.md` (referenced from another file in `references/` — still bare path)
|
||||
- `scripts/validate.py`
|
||||
- `assets/template.md`
|
||||
|
||||
The convention is universal: bare paths from skill root. Never use `./` prefixes — they cause inconsistency and break under context compaction when the working directory shifts.
|
||||
|
||||
### Project-Scope Paths
|
||||
|
||||
Use `{project-root}/...` for any path relative to the project root:
|
||||
|
||||
- `{project-root}/_bmad/planning/prd.md`
|
||||
- `{project-root}/docs/report.md`
|
||||
|
||||
### Config Variables
|
||||
|
||||
Use directly — they already contain `{project-root}` in their resolved values:
|
||||
|
||||
- `{output_folder}/file.md`
|
||||
- `{planning_artifacts}/prd.md`
|
||||
|
||||
### Anti-patterns (negative examples — fenced so the linter doesn't fire on them)
|
||||
|
||||
```text
|
||||
{project-root}/{output_folder}/file.md # WRONG — double-prefix; config var already has {project-root}
|
||||
_bmad/planning/prd.md # WRONG — bare _bmad must have {project-root} prefix
|
||||
./references/foo.md # WRONG — never use ./ for skill-internal paths
|
||||
./scripts/foo.py # WRONG — same; bare paths from skill root only
|
||||
```
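
The rules and anti-patterns in this section can be expressed as a small checker. This sketch is not the `scan-path-standards.py` script mentioned elsewhere; `lint_path` is a hypothetical function, and `CONFIG_VARS` lists only the two config variables used as examples in this document.

```python
# Config variables whose resolved values already contain {project-root}.
CONFIG_VARS = ("output_folder", "planning_artifacts")


def lint_path(path: str) -> list[str]:
    """Return the path-rule violations found in a single path reference."""
    problems = []
    if "\\" in path:
        problems.append("use forward slashes only")
    if path.startswith("./"):
        problems.append("no ./ prefix: use a bare path from the skill root")
    if path.startswith("_bmad/"):
        problems.append("bare _bmad path: prefix with {project-root}")
    if any(path.startswith("{project-root}/{" + var + "}") for var in CONFIG_VARS):
        problems.append("double-prefix: config variable already contains {project-root}")
    return problems
```

For example, `lint_path("references/build-process.md")` returns an empty list, while each of the four anti-patterns above produces exactly one message.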
|
||||
@@ -0,0 +1,47 @@
|
||||
# Template Substitution Rules
|
||||
|
||||
The SKILL-template provides a minimal skeleton: frontmatter, overview, and activation with config loading. Everything beyond that is crafted by the builder based on what was learned during discovery and requirements phases.
|
||||
|
||||
## Frontmatter
|
||||
|
||||
- `{module-code-or-empty}` → Module code prefix with hyphen (e.g., `bmb-`) or empty for standalone. The `bmad-` prefix is reserved for official BMad creations; user skills should not include it.
|
||||
- `{skill-name}` → Skill functional name (kebab-case)
|
||||
- `{skill-description}` → Two parts: [5-8 word summary]. [trigger phrases]
|
||||
|
||||
## Module Conditionals
|
||||
|
||||
### For Module-Based Skills
|
||||
|
||||
- `{if-module}` ... `{/if-module}` → Keep the content inside
|
||||
- `{if-standalone}` ... `{/if-standalone}` → Remove the entire block including markers
|
||||
- `{module-code}` → Module code without trailing hyphen (e.g., `bmb`)
|
||||
- `{module-setup-skill}` → Name of the module's setup skill (e.g., `mymod-setup`)
|
||||
|
||||
### For Standalone Skills
|
||||
|
||||
- `{if-module}` ... `{/if-module}` → Remove the entire block including markers
|
||||
- `{if-standalone}` ... `{/if-standalone}` → Keep the content inside
|
||||
|
||||
## Customization Conditionals
|
||||
|
||||
### When Customization Is Opted In
|
||||
|
||||
- `{if-customizable}` ... `{/if-customizable}` → Keep the content inside; emit `customize.toml` alongside SKILL.md.
|
||||
- Lifted configurable scalars are referenced in SKILL.md body as `{workflow.<name>}` (e.g. `{workflow.brief_template}`). These are resolved at runtime by the resolver, not at build time — emit them verbatim.
|
||||
|
||||
### When Customization Is Not Opted In
|
||||
|
||||
- `{if-customizable}` ... `{/if-customizable}` → Remove the entire block including markers.
|
||||
- Do NOT emit `customize.toml`. Use hardcoded paths and values in SKILL.md throughout.
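
The keep-or-remove rules for the conditional markers can be sketched as below. This is a minimal illustration, not the builder's actual implementation: the `render` function and its parameters are hypothetical, and the sample template is invented. Note that `{workflow.<name>}` references pass through untouched, since they are resolved at runtime.

```python
import re


def apply_conditional(template: str, tag: str, keep: bool) -> str:
    """Keep the inside of {if-tag}...{/if-tag}, or drop the block with its markers."""
    pattern = re.compile(
        r"\{if-" + re.escape(tag) + r"\}(.*?)\{/if-" + re.escape(tag) + r"\}",
        re.DOTALL,
    )
    return pattern.sub(lambda m: m.group(1) if keep else "", template)


def render(template: str, skill_name: str, module_code: str = "",
           customizable: bool = False) -> str:
    """Apply module and customization conditionals, then simple substitutions."""
    is_module = bool(module_code)
    text = apply_conditional(template, "module", is_module)
    text = apply_conditional(text, "standalone", not is_module)
    text = apply_conditional(text, "customizable", customizable)
    text = text.replace("{module-code-or-empty}", f"{module_code}-" if is_module else "")
    text = text.replace("{module-code}", module_code)
    return text.replace("{skill-name}", skill_name)


# Invented sample template, for illustration only.
TEMPLATE = (
    "name: {module-code-or-empty}{skill-name}\n"
    "{if-module}Module: {module-code}{/if-module}"
    "{if-standalone}Standalone{/if-standalone}\n"
    "{if-customizable}Load `{workflow.brief_template}`.{/if-customizable}"
)
module_skill = render(TEMPLATE, "create-prd", module_code="bmm", customizable=True)
standalone_skill = render(TEMPLATE, "validate-json")
```

The module render keeps the `{if-module}` and `{if-customizable}` content; the standalone render keeps only the `{if-standalone}` content and drops the other blocks with their markers.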
|
||||
|
||||
## Beyond the Template
|
||||
|
||||
The builder determines the rest of the skill structure — body sections, phases, stages, scripts, external skills, headless mode, role guidance — based on the skill type classification and requirements gathered during the build process. The template intentionally does not prescribe these; the builder has the context to craft them.
|
||||
|
||||
## Path References
|
||||
|
||||
All generated skills use bare paths from the skill root, including same-folder references; never `./`:
|
||||
|
||||
- `references/{reference}.md` — Reference documents loaded on demand
|
||||
- `references/{stage}.md` — Stage prompts (complex workflows)
|
||||
- `scripts/` — Python/shell scripts for deterministic operations
|
||||