chore: initial monorepo scaffold + WDS Phase 1+2 artifacts

- Nx 22.7 monorepo (pnpm 11.1, TypeScript 5.9, Node 24)
- apps/api: NestJS 11 (CJS per CODING-RULES.md PGD-DB-004)
- apps/web: React 19 + Vite 8 (ESM)
- libs/shared/api-interface: Zod contract base
- Docker Compose dev: Postgres 18, Valkey 8, MinIO, Mailpit
- WDS artifacts:
  - design-artifacts/A-Product-Brief/ (5 canonical docs + 16 dialogs)
  - design-artifacts/B-Trigger-Map/ (hub + 4 personas + feature impact)
- Stack canon: STACK.md v2.2 + CODING-RULES.md v2.0 + brand.md
- AGENTS.md + README.md as the entry point for devs/agents

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 14:34:20 +00:00
commit 17c08e6392
3631 changed files with 855518 additions and 0 deletions


@@ -0,0 +1,38 @@
---
name: bmad-workflow-builder
description: Builds, edits, and analyzes workflows and skills. Use when the user requests to "build a workflow", "modify a workflow", "quality check workflow", or "analyze skill".
---
# Overview
You are a creative builder and facilitator of agent skills and workflows. Your job: turn the vision and ideas locked in a user's head into outcome-driven skills, where every line earns its place against the test "would an LLM do this correctly without being told?"
**Args:** `--headless` / `-H` for non-interactive; an initial description for a new build; or a path to an existing skill with keywords like analyze, edit, or rebuild. To re-shape an existing non-BMad skill, just point to it and describe what should change — the build flow handles it.
## Conventions
- Bare paths (e.g. `references/build-process.md`) resolve from the skill root.
- `{skill-root}` resolves to this skill's installed directory (where `customize.toml` lives).
- `{project-root}`-prefixed paths resolve from the project working directory.
- `{skill-name}` resolves to the skill directory's basename.
## On Activation
1. Detect intent. If `--headless` or `-H`, set `{headless_mode}=true` for all sub-prompts.
2. Load config from `{project-root}/_bmad/config.yaml` and `{project-root}/_bmad/config.user.yaml` (root and bmb section). Fall back to `{project-root}/_bmad/bmb/config.yaml` (legacy per-module format). If neither exists and the `bmad-builder-setup` skill is available, mention it. Resolve and apply throughout the session (defaults in parens):
- `{user_name}` (default: null) — address the user by name
- `{communication_language}` (default: user or system intent) — for all communications
- `{document_output_language}` (default: user or system intent) — for generated document content
- `{bmad_builder_output_folder}` (default: `{project-root}/skills`) — where new skills are created. Existing skills use their own path.
3. **Open the floor (interactive only).** Before any structured questions or routing, invite the user to share everything they have in mind: goals, references, examples, half-formed ideas, paths to existing skills or artifacts, anything they want you to read. If they already provided extensive detail, just ask whether they want to add anything before proceeding. Adapt the invitation to what they already gave you: for a vague "build me X," ask for the full picture; for a path or URL, ask what they want focused on or what context you should know. After they share, one soft "anything else?" surfaces what they almost forgot. The dump replaces most structured Q&A downstream; let it run. Skip in headless mode, and skip if the invocation already includes enough detail to act on.
4. **Resume detection.** Once a target skill is identified — either a path to an existing skill, or a new build with a target name — check `{target-skill-path}/.decision-log.md`. If found, read its frontmatter for state recovery (`phase`, `classification`, `last_touched`) and tail the body for full decision history. In headless mode, resume automatically and append a new session heading.
## Routing
| Intent | Load |
| ---------------------------- | --------------------------------- |
| Build new or edit existing | `references/build-process.md` |
| Analyze | `references/quality-analysis.md` |


@@ -0,0 +1,53 @@
---
name: {module-code-or-empty}{skill-name}
description: { skill-description } # [5-8 word summary]. [trigger phrases, e.g. Use when user says create xyz or wants to do abc]
---
# {skill-name}
## Overview
{overview: concise. State what the skill does, the args it supports, and the outcome for its single path or for each distinct path. Keep it succinct but complete for the LLM, because the overview is the main source of the skill's help output.}
## Conventions
- Bare paths (e.g. `references/guide.md`) resolve from the skill root.
- `{skill-root}` resolves to this skill's installed directory (where `customize.toml` lives).
- `{project-root}`-prefixed paths resolve from the project working directory.
- `{skill-name}` resolves to the skill directory's basename.
## On Activation
{if-customizable}
### Step 1: Resolve the Workflow Block
Run: `python3 {project-root}/_bmad/scripts/resolve_customization.py --skill {skill-root} --key workflow`
If the script fails, resolve the `workflow` block yourself by reading these three files in base → team → user order and applying structural merge rules: `{skill-root}/customize.toml`, `{project-root}/_bmad/custom/{skill-name}.toml`, `{project-root}/_bmad/custom/{skill-name}.user.toml`. Scalars override, tables deep-merge, arrays of tables keyed by `code`/`id` replace matching entries and append new ones, all other arrays append.
### Step 2: Execute Prepend Steps
Execute each entry in `{workflow.activation_steps_prepend}` in order before proceeding.
### Step 3: Load Persistent Facts
Treat every entry in `{workflow.persistent_facts}` as foundational context for the whole run. Entries prefixed `file:` are paths or globs — load the referenced contents as facts. All other entries are facts verbatim.
### Step 4: Load Config
{/if-customizable}
{if-module}
Load available config from `{project-root}/_bmad/config.yaml` and `{project-root}/_bmad/config.user.yaml` (root level and `{module-code}` section). If config is missing, let the user know `{module-setup-skill}` can configure the module at any time. Use sensible defaults for anything not configured — prefer inferring at runtime or asking the user over requiring configuration.
{/if-module}
{if-standalone}
Load available config from `{project-root}/_bmad/config.yaml` and `{project-root}/_bmad/config.user.yaml` if present. Use sensible defaults for anything not configured.
{/if-standalone}
{if-customizable}
### Step 5: Execute Append Steps
Execute each entry in `{workflow.activation_steps_append}` in order before entering the workflow's first stage.
{/if-customizable}
{The rest of the skill — body structure, sections, phases, stages, scripts, external skills — is determined entirely by what the skill needs. The builder crafts this based on the discovery and requirements phases.}


@@ -0,0 +1,56 @@
# DO NOT EDIT -- overwritten on every update.
#
# Workflow customization surface for {skill-name}.
# Team overrides: {project-root}/_bmad/custom/{skill-name}.toml
# Personal overrides: {project-root}/_bmad/custom/{skill-name}.user.toml
[workflow]
# --- Configurable below. Overrides merge per BMad structural rules: ---
# scalars: override wins • arrays (persistent_facts, activation_steps_*): append
# arrays-of-tables with `code`/`id`: replace matching items, append new ones.
# Steps to run before the standard activation (config load, greet).
# Overrides append. Use for pre-flight loads, compliance checks, etc.
activation_steps_prepend = []
# Steps to run after greet but before Stage 1 of the workflow.
# Overrides append. Use for context-heavy setup that should happen
# once the user has been acknowledged.
activation_steps_append = []
# Persistent facts the workflow keeps in mind for the whole run
# (standards, compliance constraints, stylistic guardrails).
# Distinct from the runtime memory sidecar -- these are static context
# loaded on activation. Overrides append.
#
# Each entry is either:
# - a literal sentence, e.g. "All briefs must include a regulatory-risk section."
# - a file reference prefixed with `file:`, e.g. "file:{project-root}/docs/standards.md"
# (glob patterns are supported; the file's contents are loaded and treated as facts).
persistent_facts = [
"file:{project-root}/**/project-context.md",
]
# Scalar: executed when the workflow reaches its terminal stage, after
# the main output has been delivered. Override wins. Leave empty for
# no custom post-completion behavior.
on_complete = ""
# --- Workflow-specific configurables (lifted during Configurability Discovery) ---
#
# Templates, output paths, and hooks the builder surfaced with the author.
# Bare paths resolve from the skill root; use `{project-root}/...` to point
# at an org-owned resource elsewhere in the repo. Override wins.
#
# Naming conventions:
# *_template -- file paths for templates the workflow loads
# *_output_path -- writable destinations
# on_<event> -- additional hook scalars beyond on_complete
#
# Example (from bmad-product-brief):
# brief_template = "resources/brief-template.md"


@@ -0,0 +1,66 @@
# DO NOT EDIT -- overwritten on every update.
#
# Workflow customization surface for bmad-product-brief.
#
# Override files (not edited here):
# {project-root}/_bmad/custom/bmad-product-brief.toml (team)
# {project-root}/_bmad/custom/bmad-product-brief.user.toml (personal)
[workflow]
# --- Configurable below. Overrides merge per BMad structural rules: ---
# scalars: override wins • arrays: append
# Steps to run before the standard activation (config load, greet).
# Use for pre-flight loads, compliance checks, etc.
activation_steps_prepend = []
# Steps to run after greet but before the workflow begins.
# Use for context-heavy setup that should happen once the user has been acknowledged.
activation_steps_append = []
# Persistent facts the workflow keeps in mind for the whole run
# (standards, compliance constraints, stylistic guardrails).
# Each entry is either a literal sentence, a skill prefixed with `skill:`, or a `file:`-prefixed path/glob
# whose contents are loaded as facts.
# Default is empty. Common opt-ins (set in your team/user override TOML):
# "file:{project-root}/_bmad-output/planning-artifacts/project-context.md" # bmad-generate-project-context output
# "skill:acme-co:terms-and-conditions" # a skill that contains some relevant info to the documents that may be generated
# "Elvis has left the building" # generic agent instructions
persistent_facts = []
# Executed when the workflow completes (after the user has been told the
# brief is ready). Accepts either a string scalar (single instruction)
# or an array of instructions executed in order. Empty for none.
on_complete = ""
# Default brief structure. Treated as a starting point — the LLM adapts it
# to the product, purpose, and domain. Override the path in team/user TOML
# to enforce a different structure (e.g. regulated-industry, investor-deck).
brief_template = "assets/brief-template.md"
# Run folder location. The brief, optional addendum, and optional distillate
# all land inside `{output_dir}/{output_folder_name}/`.
output_dir = "{planning_artifacts}/briefs"
output_folder_name = "brief-{project_name}-{date}"
# Document standards applied to human-consumed docs at finalize. Each entry is
# a `skill:`, `file:`, or plain-text directive; the parent LLM applies the
# findings before the user sees the draft. Encodes standards, not options.
#
# Examples:
# "skill:bmad-editorial-review-prose"
# "file:{project-root}/_bmad/style-guides/company-voice.md"
# "Convert all dates to ISO 8601 format."
#
# Suggested order (broader passes first, narrower last):
# 1. Structural (cuts, reorganization, section sizing)
# 2. Content/voice/conventions (org standards, tone, terminology, compliance)
# 3. Prose mechanics (grammar, clarity, typos)
#
# Override the array in team/user TOML to add additional standards. Append-only:
# base entries cannot be removed or replaced (resolver has no removal mechanism).
doc_standards = [
"skill:bmad-editorial-review-structure",
"skill:bmad-editorial-review-prose",
]


@@ -0,0 +1,154 @@
**Workspace.** Once intent is clear and the target skill is named (propose a kebab-case name for new skills if the user didn't give one — they can rename later, that's a logged decision not a redo), write `.decision-log.md` at the skill's root as a peer of `SKILL.md`. The decision log is canonical memory — load-bearing decisions, rejected alternatives, and overrides live on disk, not in the conversation. On resume, append a new session heading; at handoff, audit the log so the user signs off on how their thinking was handled.
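The resume read can be sketched as follows. This is a minimal illustration, assuming the log's frontmatter is a flat block of `key: value` lines between `---` fences; `read_log_state`, `append_session_heading`, and the heading format are hypothetical, not part of the builder.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def read_log_state(log_path: Path) -> dict:
    """Return flat frontmatter fields from a decision log, or {} if absent."""
    if not log_path.exists():
        return {}
    lines = log_path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    state = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing fence ends the frontmatter
            break
        key, sep, value = line.partition(":")
        if sep:
            state[key.strip()] = value.strip().strip("'\"")
    return state


def append_session_heading(log_path: Path, now: Optional[datetime] = None) -> str:
    """Append a new session heading (format is an assumption) and return it."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%d %H:%M")
    heading = f"## Session {stamp}"
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(f"\n{heading}\n")
    return heading
```

An empty result from `read_log_state` would mean there is nothing to resume, so the log is created fresh.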
## Phase 1: Classify
**Outcome:** you and the user agree on the skill type and whether it's part of a module. Reasoning is shared, not hidden.
| Type | When |
|---|---|
| **Simple Utility** | Composable building block with clear input → processing → output. Often deterministic. No multi-turn discovery. |
| **Simple Workflow** | Multi-step process that fits inline in SKILL.md as named sections (`## Discovery`, `## Constraints`, etc.). Default. |
| **Complex Workflow** | SKILL.md routing + carved-out sections in `references/` with descriptive filenames. Reserved for workflows whose SKILL.md would otherwise be too big to scan (~250+ lines). |
Default to Simple Workflow. Carving is a SIZE decision, not a stage-count decision.
If module-based: capture module code, other skills it'll invoke (with name / inputs / outputs), and config variables it needs.
For Workflows that produce an artifact: confirm whether `--headless` should be supported.
**On Edit:** classification is already set — read it from the existing skill or from `.decision-log.md` frontmatter. Skip this phase.
## Phase 2: Determine Spec
**Outcome:** you have everything needed to draft the skill — extracted from what the user has already shared (open-floor + decision log) plus targeted follow-ups for whatever's missing.
Through what's already known or further conversation, determine all of the following that are relevant:
| Field | Applies | Notes |
|---|---|---|
| Name | All | kebab-case. `{module-code}-{name}` for modules, `{name}` standalone. `bmad-` reserved for official. |
| Description | All | `[5-8 word summary]. [Use when user says 'specific phrase'.]` See `references/standard-fields.md`. |
| Overview | All | What / How / Why-Outcome. Domain framing + theory of mind for interactive or complex skills. |
| Role | Workflows | "Act as a [role/expert]" primer. |
| Design rationale | Where non-obvious | Choices the executing agent should understand so it doesn't optimize them away. |
| External skills | All | Which other skills this calls. |
| Scripts | All | Deterministic operations to push out of prompts; see `references/script-opportunities-reference.md`. List non-stdlib deps and get user approval (`uv` required). |
| Output documents | All | Yes/no — uses `{document_output_language}` if yes. |
| Revisable artifact | If output doc | If Update / Validate intents are likely, propose the Decision-Log Workspace pattern (`references/skill-quality-principles.md`). |
| Inputs / outputs | Simple Utility | Format, schema, required fields. |
| Stages | Workflows | Named sections (Simple) or carved files in `references/` with descriptive filenames (Complex). |
| Module capability | If module-based | phase-name, after, before, is-required, short description. |
| Customization | All | Fixed, or swappable templates / paths / hooks? Default no. If yes, walk each scalar (`<purpose>_template`, `<purpose>_output_path`, `on_<event>`); auto-promote in headless. |
The customization opt-in question (interactive only):
> "Should this support end-user customization (activation hooks, swappable templates, output paths)? If no, it ships fixed — users who need changes fork it."
For path conventions and customize.toml schema, see `references/skill-quality-principles.md`.
**On Edit:** spec is already defined by the existing skill. Read what's relevant to the change, ignore the rest. Update the decision-log with what's actually changing and why.
## Phase 3: Draft & Refine
**Load `references/skill-quality-principles.md` before reviewing the plan** — same principles file the quality scanners verify against. Building against it upfront is cheaper than fixing afterwards.
Present a plan. Point out vague areas. Iterate with the user until the outcome and shape are clear. Apply the principles file's core test to every planned instruction: **would an LLM do this correctly without being told?** If yes, cut it.
## Phase 4: Build
**Load:**
- `references/skill-quality-principles.md` — what earns its place, BMad institutional knowledge, failure modes (already loaded in Phase 3; keep open)
- `references/standard-fields.md` — field-by-field schema reference for frontmatter, customize.toml, and the Overview formula
- `references/complex-workflow-patterns.md` (Complex Workflow only) — config integration, compaction survival, document-as-cache
Load `assets/SKILL-template.md` and `references/template-substitution-rules.md`. Default to writing the entire workflow inline in SKILL.md as named sections. Carve out to `references/` ONLY when SKILL.md would otherwise be too big to scan; when you do, use descriptive filenames (`press-release.md`), never numbered prefixes (`01-discover.md`). Output to `{bmad_builder_output_folder}`.
**If the SKILL.md references multiple internal files** (anything in `references/`, `assets/`, `scripts/`, `agents/`), stamp the Conventions block at the top of SKILL.md (after Overview, before On Activation):
```markdown
## Conventions
- Bare paths (e.g. `references/press-release.md`) resolve from the skill root.
- `{skill-root}` resolves to this skill's installed directory (where `customize.toml` lives).
- `{project-root}`-prefixed paths resolve from the project working directory.
- `{skill-name}` resolves to the skill directory's basename.
```
**If `{customizable}` is yes:**
- Emit `customize.toml` alongside SKILL.md from `assets/customize-template.toml`. Fill `[workflow]` with the Phase 2 scalars.
- In SKILL.md, replace hardcoded references with `{workflow.<name>}` indirection. `assets/brief-template.md` → `{workflow.brief_template}` if lifted.
- Add the resolver activation step before config load:
```markdown
### Step 1: Resolve the Workflow Block
Run: `python3 {project-root}/_bmad/scripts/resolve_customization.py --skill {skill-root} --key workflow`
If the script fails, resolve the `workflow` block yourself by reading these three files in base → team → user order and applying structural merge rules: `{skill-root}/customize.toml`, `{project-root}/_bmad/custom/{skill-name}.toml`, `{project-root}/_bmad/custom/{skill-name}.user.toml`. Scalars override, tables deep-merge, arrays of tables keyed by `code`/`id` replace matching entries and append new ones, all other arrays append.
```
- Execute `{workflow.activation_steps_prepend}` before the standard activation (config load, greet) and `{workflow.activation_steps_append}` after greet but before Stage 1. Treat `{workflow.persistent_facts}` as foundational context loaded on activation (`file:` prefix = path/glob; bare entries = literal facts).
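The structural merge rules from the resolver fallback can be sketched in Python. This is an illustrative reading of the stated rules, not the actual `resolve_customization.py`. The function names are hypothetical, and treating a list as "keyed" only when every entry carries `code` or `id` is an assumption.

```python
from typing import Any, Optional


def _key_field(items: list) -> Optional[str]:
    """Return 'code' or 'id' if every item is a table carrying that key."""
    for field in ("code", "id"):
        if items and all(isinstance(i, dict) and field in i for i in items):
            return field
    return None


def merge(base: Any, override: Any) -> Any:
    """Merge one override layer onto a base value using the structural rules."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)  # tables deep-merge
        for key, value in override.items():
            merged[key] = merge(base[key], value) if key in base else value
        return merged
    if isinstance(base, list) and isinstance(override, list):
        field = _key_field(base + override)
        if field is None:
            return base + override  # all other arrays append
        result = list(base)
        index = {item[field]: pos for pos, item in enumerate(result)}
        for item in override:
            if item[field] in index:
                result[index[item[field]]] = item  # replace matching entry
            else:
                index[item[field]] = len(result)
                result.append(item)  # append new entry
        return result
    return override  # scalars: override wins


def resolve(base: dict, team: dict, user: dict) -> dict:
    """Apply layers in base -> team -> user order."""
    return merge(merge(base, team), user)
```

Each layer could be parsed with the standard-library `tomllib.load` (Python 3.11+) before being passed to `resolve`; a missing override file would simply contribute an empty dict.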
**If `{customizable}` is no:** no `customize.toml`, no resolver step. SKILL.md uses hardcoded paths throughout.
**If the skill uses the Decision-Log Workspace pattern** (Phase 2 confirmed it produces a revisable artifact):
- Add `output_dir` and `output_folder_name` scalars to `customize.toml [workflow]`. Default shape:
- `output_dir = "{planning_artifacts}/<purpose>"` (e.g. `briefs`, `analyses`)
- `output_folder_name = "<purpose>-{project_name}-{date}"`
- This implies `{customizable}=yes` — if the user declined customization, ask whether to enable it for these two scalars.
- In SKILL.md Activation, after config resolution: bind `{doc_workspace} = {workflow.output_dir}/{workflow.output_folder_name}/`.
- Wire Create / Update / Validate intents and a Finalize audit per `references/skill-quality-principles.md` § Decision-Log Workspace Pattern. Follow the **Treatment style** sub-section there: state the principle once where it first applies, mention reads at the moments that matter, no prescribed frontmatter schema, no `## Workspace` header, no tree diagram. The workspace is just files.
- If the artifact will feed downstream LLM consumers: offer a `distillate.md` at finalize. Skip with a note if no distillation tool is available; never inline a substitute.
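The `{doc_workspace}` binding above amounts to placeholder expansion followed by a path join. A minimal sketch, assuming config values arrive as a plain dict; `expand` and `doc_workspace` are hypothetical helpers, and the date format in the usage is an assumption.

```python
import re
from pathlib import Path

_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(template: str, values: dict) -> str:
    """Replace {name} placeholders; unknown names are left untouched."""
    return _PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), template)


def doc_workspace(workflow: dict, values: dict) -> Path:
    """Bind {doc_workspace} = {workflow.output_dir}/{workflow.output_folder_name}/."""
    return Path(expand(workflow["output_dir"], values)) / expand(
        workflow["output_folder_name"], values
    )


workflow = {
    "output_dir": "{planning_artifacts}/briefs",
    "output_folder_name": "brief-{project_name}-{date}",
}
values = {
    "planning_artifacts": "/repo/_bmad-output/planning-artifacts",  # illustrative value
    "project_name": "acme",
    "date": "2026-05-27",
}
workspace = doc_workspace(workflow, values)
```

Leaving unknown placeholders intact keeps an unresolved variable visible in the resulting path instead of silently dropping it.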
**Skill source tree** (only create folders that are needed):
```
{skill-name}/
├── SKILL.md # Frontmatter, Overview, Activation, the workflow itself (default), routing if carved
├── customize.toml # Only if {customizable} is yes
├── references/ # Carved-out workflow sections — descriptive names, no numbered prefixes
├── assets/ # Templates and other static content the workflow loads
├── scripts/ # Deterministic code with tests
│ └── tests/
```
Never put workflow content (`*.md` prompt files) directly at skill root — that's `SKILL.md`'s job. Carve-outs always go in `references/`.
| Location | Contains | LLM relationship |
| ----------------- | --------------------------------------------------------- | ------------------------------------ |
| **SKILL.md** | Overview, Activation, inline workflow OR routing to refs | LLM identity, the workflow itself |
| **`references/`** | Carved-out workflow sections (descriptive names) | Loaded on demand by SKILL.md routing |
| **`assets/`** | Templates, starter files, static content | Copied/transformed into output |
| **`scripts/`** | Python, shell scripts with tests | Invoked for deterministic operations |
**If the built skill includes scripts**, also load `references/script-standards.md` — ensures PEP 723 metadata, correct shebangs, and `uv run` invocation from the start.
**Lint gate** — validate and auto-fix. If subagents are available, delegate lint-fix; otherwise run inline.
1. Run both lint scripts in parallel:
```bash
python3 scripts/scan-path-standards.py {skill-path}
python3 scripts/scan-scripts.py {skill-path}
```
2. Fix high/critical findings, re-run (up to 3 attempts per script).
3. Run unit tests if scripts exist in the built skill.
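The fix-and-re-run loop of the lint gate could look like the sketch below, applied once per script. It assumes each finding is a dict with a `severity` field and reads "up to 3 attempts" as three scans. Neither is confirmed by the scripts themselves, and `lint_gate` is a hypothetical name.

```python
from typing import Callable


def lint_gate(
    scan: Callable[[], list],
    fix: Callable[[list], None],
    max_attempts: int = 3,
) -> list:
    """Scan, fix high/critical findings, and re-scan, up to max_attempts scans.

    Returns the blocking findings that remain; an empty list means the gate passed.
    """
    blocking = []
    for attempt in range(1, max_attempts + 1):
        findings = scan()
        blocking = [f for f in findings if f.get("severity") in ("high", "critical")]
        if not blocking:
            return []
        if attempt < max_attempts:
            fix(blocking)  # no point fixing after the final scan
    return blocking
```

A non-empty return value corresponds to the "persistent lint failures" case that the handoff reports as blocked.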
## Phase 5: Handoff
**Interactive:** show what was built, lint results, and offer next steps (commit, run quality analysis). Decision log is at `{target-skill-path}/.decision-log.md`.
**Headless** (`{headless_mode}=true`): emit JSON only. `intent` is `"build"` for new, `"edit"` for existing.
```json
{
"status": "complete",
"intent": "build",
"skill": "{target-skill-path}",
"decision_log": "{target-skill-path}/.decision-log.md"
}
```
Blocked (ambiguous intent that couldn't be inferred, persistent lint failures, etc.): replace `"complete"` with `"blocked"` and add `"reason": "<one-line cause>"`. The log carries the detail.
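For illustration, both headless payload variants can be produced by one small function. `handoff_payload` is a hypothetical helper; the field names come from the JSON above.

```python
import json
from typing import Optional


def handoff_payload(skill_path: str, intent: str, blocked_reason: Optional[str] = None) -> str:
    """Build the headless handoff JSON; intent is "build" or "edit"."""
    if intent not in ("build", "edit"):
        raise ValueError(f"unexpected intent: {intent!r}")
    payload = {
        "status": "blocked" if blocked_reason else "complete",
        "intent": intent,
        "skill": skill_path,
        "decision_log": f"{skill_path}/.decision-log.md",
    }
    if blocked_reason:
        payload["reason"] = blocked_reason  # one-line cause; detail stays in the log
    return json.dumps(payload, indent=2)
```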


@@ -0,0 +1,95 @@
# Complex Workflow Patterns
Patterns for workflows whose SKILL.md got too big and had to carve out to `references/`. The default for any new skill is **inline** — a multi-stage coaching workflow lives in a single SKILL.md. Reach for these patterns only when SKILL.md genuinely won't fit.
## Carve-Out Conventions
When carving out to `references/`:
- Descriptive filenames (`press-release.md`, `customer-faq.md`, `verdict.md`). Never numbered prefixes — the carve-out is a section, not a "step." SKILL.md decides the order by routing.
- Each file works standalone (context compaction can drop SKILL.md). No "as described in the overview."
- SKILL.md keeps Overview, Activation, the Conventions block (see `references/skill-quality-principles.md`), and the routing logic. Everything else moves out.
- `assets/` is for templates and other static content the workflow loads, not for stages.
## Workflow Persona
BMad workflows treat the human operator as the expert. The agent facilitates — asks clarifying questions, presents options with trade-offs, validates before irreversible actions. The operator knows their domain; the workflow knows the process.
## Config Reading and Integration
Workflows read config from `{project-root}/_bmad/config.yaml` and `config.user.yaml`.
**Module-based skills** load with fallback and setup-skill awareness:
```
Load config from {project-root}/_bmad/config.yaml ({module-code} section) and config.user.yaml.
If missing: inform user that {module-setup-skill} is available, continue with sensible defaults.
```
**Standalone skills** load best-effort:
```
Load config from {project-root}/_bmad/config.yaml and config.user.yaml if available.
If missing: continue with defaults — no mention of a setup skill.
```
Resolved config variables already contain `{project-root}`; never double-prefix.
## Decision-Log Workspace Pattern (canonical compaction survival)
For workflows that produce revisable artifacts, the Decision-Log Workspace pattern is the default. See `references/skill-quality-principles.md` for the full treatment.
**The pattern in one paragraph.** The workspace folder (artifact + `.decision-log.md` + optional `addendum.md` + optional `distillate.md`) exists from the moment intent is confirmed. Decision-log captures every meaningful decision and rationale; addendum captures rejected alternatives. Resume on activation, conflict-detect on update, audit at finalize. The decision log is the load-bearing artifact — the document is what the user takes; the log is what carries identity across sessions.
**For Complex Workflows that route to carved-out files**, each carved file must work standalone (compaction can drop SKILL.md mid-flow). Carved files reference the workspace by config-resolved path (`{workflow.output_dir}/{workflow.output_folder_name}/`) — never assume in-context state.
**YAML frontmatter on the primary artifact** (status + inputs survive compaction):
```markdown
---
title: 'Analysis: Research Topic'
status: 'discovery'
inputs:
- '{project-root}/docs/brief.md'
created: '2025-03-02T10:00:00Z'
updated: '2025-03-02T11:30:00Z'
---
```
**When NOT to apply:** purely conversational workflows, one-shot single-turn outputs, multi-artifact workflows where each artifact gets its own folder.
## Routing from SKILL.md
When SKILL.md routes to a carved-out file, the route is by descriptive name. Use a Stages table near the bottom of SKILL.md:
```markdown
## Stages
| # | Stage | Purpose | Location |
|---|-------|---------|----------|
| 1 | Ignition | Raw concept, enforce customer-first thinking | SKILL.md (above) |
| 2 | Press Release | Iterative drafting with hard coaching | `references/press-release.md` |
| 3 | Customer FAQ | Devil's advocate customer questions | `references/customer-faq.md` |
```
The `#` is a reading aid for the table, not a filename prefix.
## Module Metadata Reference
BMad module workflows require extended frontmatter metadata. See `references/metadata-reference.md` for the metadata template and field explanations.
## Architecture Checklist
Before finalizing a complex BMad workflow:
- [ ] Default reconsidered — would this fit inline as named sections in a single SKILL.md?
- [ ] Facilitator persona — treats the operator as expert?
- [ ] Config integration — language, output locations read and used?
- [ ] Conventions block stamped at top of SKILL.md (when multiple internal files are referenced)
- [ ] Carve-outs in `references/` use descriptive names, no numbered prefixes
- [ ] Each carved file works standalone (compaction survival)
- [ ] Decision-Log Workspace pattern applied (or explicit reason for skipping — Simple Utility, one-shot, purely conversational)
- [ ] Resume protocol — Activation checks for existing workspace and offers to resume
- [ ] Update mode reads `.decision-log.md` first; surfaces conflicts before applying changes
- [ ] Final polish — subagent polish step at the end?
- [ ] Finalize step includes decision-log audit (every entry → primary, addendum, or explicit process noise)


@@ -0,0 +1,140 @@
# Quality Analysis
Communicate with user in `{communication_language}`. Write report content in `{document_output_language}`.
You orchestrate quality analysis on a BMad workflow or skill. The pipeline is optimized for speed and completeness:
1. **Deterministic checks** (scripts) — zero tokens, instant
2. **LLM scanners** (parallel subagents) — judgment-based analysis against `skill-quality-principles.md`
3. **Fast JSON extraction** (deterministic script) — lossless capture of all scanner findings (~10 seconds, no LLM)
4. **HTML generation** — interactive, auto-opening report from JSON (no wait for synthesis)
5. **Optional markdown synthesis** (LLM subagent, background) — thematic analysis and archival markdown
The scanners verify against `references/skill-quality-principles.md` — the same file the build process loads at create/edit time. Findings cite the principle that's being violated rather than restating it.
## Your Role: Coordination, Not File Reading
**Do not read the target skill's files yourself.** Scripts and subagents do all analysis. You orchestrate: run deterministic scripts and pre-pass extractors, spawn LLM scanner subagents in parallel, hand off to the report creator for synthesis.
## Headless Mode
If `{headless_mode}=true`, skip user interaction, use safe defaults, note any warnings, and output structured JSON as specified in the Present Findings section.
## Pre-Scan Checks
Check for uncommitted changes. In headless mode, note warnings and proceed. In interactive mode, inform the user, confirm before proceeding, and confirm the workflow is currently functioning.
## Analysis Principles
**Effectiveness over efficiency.** The analysis may suggest leaner phrasing, but if the current phrasing captures the right guidance, it should be kept. The report presents opportunities — the user applies judgment.
## Scanners
### Lint Scripts (Deterministic — Run First)
Run instantly, cost zero tokens, produce structured JSON:
| # | Script | Focus | Output File |
| -- | -------------------------------- | --------------------------------------- | -------------------------- |
| S1 | `scripts/scan-path-standards.py` | Path conventions | `path-standards-temp.json` |
| S2 | `scripts/scan-scripts.py` | Script portability, PEP 723, unit tests | `scripts-temp.json` |
### Pre-Pass Scripts (Feed LLM Scanners)
Extract metrics so LLM scanners work from compact data instead of raw files:
| # | Script | Feeds | Output File |
| -- | --------------------------------------- | ---------------------- | --------------------------------- |
| P1 | `scripts/prepass-workflow-integrity.py` | architecture scanner | `workflow-integrity-prepass.json` |
| P2 | `scripts/prepass-prompt-metrics.py` | architecture scanner | `prompt-metrics-prepass.json` |
| P3 | `scripts/prepass-execution-deps.py` | determinism scanner | `execution-deps-prepass.json` |
### LLM Scanners (Judgment-Based — Run After Scripts)
Each scanner loads `references/skill-quality-principles.md` and writes a free-form analysis document:
| # | Scanner | Focus | Pre-Pass | Output File |
| -- | ------------------------------------ | ------------------------------------------------------------------------------ | -------- | ---------------------------- |
| L1 | `quality-scan-architecture.md` | Structural integrity, prose craft, cohesion (was: integrity + craft + cohesion)| Yes (P1, P2) | `architecture-analysis.md` |
| L2 | `quality-scan-determinism.md` | Intelligence placement, parallelization, subagent delegation, script opportunities (was: execution-efficiency + script-opportunities) | Yes (P3) | `determinism-analysis.md` |
| L3 | `quality-scan-customization.md` | customize.toml opportunities and abuse | No | `customization-analysis.md` |
| L4 | `quality-scan-enhancement.md` | Edge cases, UX gaps, headless potential, facilitative patterns | No | `enhancement-analysis.md` |
## Execution
Bind `{quality-report-dir} = {skill-path}/.analysis/{date-time-stamp}/` and create the directory. Use this single name in every script invocation and subagent prompt below. Quality analyses live at the skill's own root, as a peer of `.decision-log.md` and `SKILL.md` — the audit trail travels with the skill.
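A minimal sketch of that binding, assuming a `YYYYMMDD-HHMMSS` stamp; the text above does not fix a format, and `make_report_dir` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_report_dir(skill_path: Path, now: Optional[datetime] = None) -> Path:
    """Create {skill-path}/.analysis/{date-time-stamp}/ and return it."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")  # format is an assumption
    report_dir = skill_path / ".analysis" / stamp
    report_dir.mkdir(parents=True, exist_ok=True)
    return report_dir
```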
### Step 1: Run All Scripts (Parallel)
```bash
python3 scripts/scan-path-standards.py {skill-path} -o {quality-report-dir}/path-standards-temp.json
python3 scripts/scan-scripts.py {skill-path} -o {quality-report-dir}/scripts-temp.json
uv run scripts/prepass-workflow-integrity.py {skill-path} -o {quality-report-dir}/workflow-integrity-prepass.json
python3 scripts/prepass-prompt-metrics.py {skill-path} -o {quality-report-dir}/prompt-metrics-prepass.json
uv run scripts/prepass-execution-deps.py {skill-path} -o {quality-report-dir}/execution-deps-prepass.json
```
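The commands above are independent of each other, so they can be launched concurrently. One possible way to do that from Python is sketched here; `build_commands` and `run_all` are hypothetical helpers, not part of the skill, and the runner for each script is copied from the list above:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# (runner, script, output file) triples copied from the command list above.
SCRIPTS = [
    (["python3"], "scripts/scan-path-standards.py", "path-standards-temp.json"),
    (["python3"], "scripts/scan-scripts.py", "scripts-temp.json"),
    (["uv", "run"], "scripts/prepass-workflow-integrity.py", "workflow-integrity-prepass.json"),
    (["python3"], "scripts/prepass-prompt-metrics.py", "prompt-metrics-prepass.json"),
    (["uv", "run"], "scripts/prepass-execution-deps.py", "execution-deps-prepass.json"),
]


def build_commands(skill_path: str, report_dir: str) -> list:
    """Expand each triple into a full argv list."""
    return [
        [*runner, script, skill_path, "-o", f"{report_dir}/{out}"]
        for runner, script, out in SCRIPTS
    ]


def run_all(skill_path: str, report_dir: str) -> list:
    """Run every script concurrently and return exit codes in list order."""
    commands = build_commands(skill_path, report_dir)
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        results = pool.map(lambda cmd: subprocess.run(cmd).returncode, commands)
        return list(results)
```

Threads are enough here because each worker only waits on a child process.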
### Step 2: Spawn LLM Scanners (Parallel)
After scripts complete, spawn all four LLM scanners as parallel subagents.
Each subagent receives:
- Scanner file to load
- Skill path: `{skill-path}`
- Output directory: `{quality-report-dir}`
- Pre-pass file paths (L1: P1+P2; L2: P3)
The subagent loads its scanner file (which loads the principles file), analyzes the skill, writes its analysis to `{quality-report-dir}`, and returns the filename.
### Step 3: Synthesize Report (Parallel with Scanner 4)
Spawn the report creator to synthesize scanner outputs into `report-data.json` and `quality-report.md`. It can start while the last scanner is still finishing.
```text
# Spawn as background task — does not block step 4
Agent(description="Synthesize quality report", subagent_type="report-creator", run_in_background=true, prompt="...")
```
The report creator:
- Reads all 4 analysis files plus the lint (`*-temp.json`) and pre-pass (`*-prepass.json`) JSON
- Identifies thematic clusters (root-cause synthesis)
- Writes `report-data.json` with: broken, opportunities, strengths, recommendations, detailed_analysis
- Writes `quality-report.md` for archival
### Step 4: Generate & Open HTML Report (Do Not Block on Markdown)
As soon as `report-data.json` exists (the report creator writes it mid-synthesis), generate the interactive HTML report:
```bash
python3 scripts/generate-html-report.py {quality-report-dir} --open
```
**Important:** Do not wait for `quality-report.md` to be written. The JSON is the complete data source. Open HTML immediately. The markdown report finishes asynchronously and provides archival context.
### Step 5: Log the Run
After HTML opens, append a session heading to `{skill-path}/.decision-log.md`:
```markdown
## YYYY-MM-DD — Quality analysis
Grade: <grade from report-data.json>. Interactive HTML: `.analysis/<timestamp>/quality-report.html`. Full markdown: `.analysis/<timestamp>/quality-report.md`.
```
## Present to User
**Headless** (`{headless_mode}=true`): emit JSON only.
```json
{
"status": "complete",
"intent": "analyze",
"skill": "{skill-path}",
"decision_log": "{skill-path}/.decision-log.md",
"report": "{quality-report-dir}/quality-report.md"
}
```
Blocked (scanner failure, missing required input, etc.): replace `"complete"` with `"blocked"` and add `"reason": "<one-line cause>"`. The log + any partial report carry the detail.
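The complete and blocked payloads differ only in `status` and the extra `reason` key. A small sketch of building either one, assuming a hypothetical helper `headless_result` (the field names come from the JSON above):

```python
import json
from typing import Optional


def headless_result(skill_path: str, report_dir: str,
                    blocked_reason: Optional[str] = None) -> str:
    """Return the headless JSON payload as a string."""
    payload = {
        "status": "blocked" if blocked_reason else "complete",
        "intent": "analyze",
        "skill": skill_path,
        "decision_log": f"{skill_path}/.decision-log.md",
        "report": f"{report_dir}/quality-report.md",
    }
    if blocked_reason:
        # Only blocked runs carry a reason; keep it to one line.
        payload["reason"] = blocked_reason.splitlines()[0]
    return json.dumps(payload, indent=2)
```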
**Interactive:** read `report-data.json` and present grade + 2-3 sentence narrative, broken items if any, top opportunities by theme, paths to the full report and HTML. Offer to apply fixes, walk findings, or discuss.

@@ -0,0 +1,63 @@
# Quality Scan: Skill Architecture
You are a senior skill architect reviewing a BMad skill. Your job: identify what's missing, mismatched, or over-specified across the skill's structure, prose craft, and overall coherence — the things that would either break execution or push the executing agent into mechanical procedure-following instead of informed judgment.
**Load `references/skill-quality-principles.md` first.** It is the bar you're testing against. Don't restate its rules; cite them when findings reference them.
This scan absorbs what was previously three separate scanners (workflow-integrity, prompt-craft, skill-cohesion). Checking these together catches the mismatches that separate scans miss — a workflow split into files that belonged inline, an Overview promise that the execution instructions silently violate, prose that's structurally correct but mechanically deadening.
## Scan Targets
- `SKILL.md` — frontmatter, structure, inline workflow content, routing
- `references/*.md` — carved-out workflow sections (only present when SKILL.md was genuinely too big to keep inline)
- `assets/` — templates and other static content the workflow loads
- Anything other than `SKILL.md`, `customize.toml`, and the standard folders at skill root is suspect
If pre-pass JSON files are provided (`workflow-integrity-prepass.json`, `prompt-metrics-prepass.json`), read those first for compact metrics; read raw files only as needed for judgment calls.
## What to Find
Run the principles file against the skill and surface findings in three buckets:
**Structural integrity** — does what should exist exist, and is it wired correctly?
- Frontmatter follows the description format with quoted trigger phrases; no extra fields
- `## Overview` and `## On Activation` present and meaningful
- When SKILL.md references multiple internal files, the Conventions block is stamped (per the principles file's path-conventions section)
- Workflow content is inline in SKILL.md as named sections by default; only carved out to `references/` when SKILL.md was genuinely too big to scan
- **Carved-out files use descriptive names (`press-release.md`), NOT numbered prefixes (`01-discover.md`).** Flag numbered-prefix filenames.
- **No prompt files at skill root other than `SKILL.md` itself.** Flag any `*.md` workflow content directly under skill root that should be in `references/`.
- Routing from SKILL.md uses bare paths from skill root (`references/foo.md`)
- References in SKILL.md resolve to existing files (no orphans, no dangling refs)
- Carved-out files work standalone — no "as described in the overview" / "see SKILL.md"
- Where progression conditions exist, they're testable; "when ready" is vague
- Each carved file uses `{communication_language}` (and `{document_output_language}` if it produces a doc)
- No template artifacts (`{if-complex-workflow}`, bare `{skillName}`, etc.)
- No `## On Exit` sections
- Workflow type claim matches actual structure (Complex Workflow with everything inline → reclassify; Simple Workflow with carved references → either inline back or reclassify)
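Several checks in this bucket (dangling references, orphaned files, numbered-prefix filenames) are mechanical. The sketch below shows how they might be computed ahead of the scan. It is an illustration only: `check_references` is a hypothetical helper, it is not the skill's actual pre-pass script, and it assumes references appear in `SKILL.md` as backticked bare paths.

```python
import re
from pathlib import Path

# Backticked bare paths such as `references/foo.md`.
REF_PATTERN = re.compile(r"`((?:references|assets|scripts)/[^`\s]+)`")
NUMBERED_PREFIX = re.compile(r"^\d+[-_]")


def check_references(skill_root: Path) -> dict:
    """Report dangling refs, orphaned reference files, and numbered prefixes."""
    text = (skill_root / "SKILL.md").read_text(encoding="utf-8")
    cited = set(REF_PATTERN.findall(text))
    ref_dir = skill_root / "references"
    present = set()
    if ref_dir.is_dir():
        present = {f"references/{p.name}" for p in ref_dir.glob("*.md")}
    return {
        # Cited in SKILL.md but missing on disk.
        "dangling": sorted(c for c in cited if not (skill_root / c).exists()),
        # On disk but never cited from SKILL.md (may still be cited elsewhere).
        "orphans": sorted(present - cited),
        "numbered": sorted(p for p in present if NUMBERED_PREFIX.match(Path(p).name)),
    }
```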
**Prose craft** — does the SKILL.md and reference prose enable judgment without bloat?
- Overview establishes role, mission, and (where relevant) domain framing, theory of mind, design rationale
- No re-teaching of LLM-native skills (scoring formulas, calibration tables, adapter proliferation, format-the-output templates)
- No defensive padding ("make sure", "remember to", "this workflow is designed to")
- Direct imperatives, not "you should" / "please"
- Carved-out files survive context compaction — critical instructions in the file itself
- Size matches purpose (principles file thresholds); large data tables and reference material lifted out of SKILL.md
**Cohesion** — does the skill hang together as a purposeful whole?
- Description matches what the skill actually does
- Workflow flows logically — earlier sections produce what later sections consume; no dead-ends, no overlaps
- **Promises-vs-behavior check** — if the Overview or design rationale states a principle ("we do X before Y"), trace through the workflow and verify the instructions enforce or at minimum don't contradict it. Implicit instructions ("acknowledge what you received") that violate stated principles are the most dangerous misalignment because they look correct on casual review.
- Complexity matches task — 10 phases for "format a file" is wrong; 2 phases for "architect a system" is wrong
- Dependency graph (`after` / `before` / `is-required`) reflects actual data flow, not artificial ordering
## Output
Write to `{quality-report-dir}/architecture-analysis.md`. Include:
- **Assessment** — 2-3 sentence verdict on the skill as a coherent whole
- **Findings** — each with severity, file:line, what's wrong, why, how to fix. Distinguish genuine waste from load-bearing context (the principles file calls this out explicitly).
- **Strengths** — what's working that should be preserved
Severity follows the principles file: anything that breaks execution or violates a stated promise is critical/high; over-specification, numbered-prefix filenames, or workflow files at skill root are high; coherence issues are medium; style is low.
Return only the filename when complete.

@@ -0,0 +1,48 @@
# Quality Scan: Customization Surface
You are a customization-surface economist. Two paired questions other scanners don't ask: **what should be customizable but isn't, and what's exposed as customizable that shouldn't be?**
**Load `references/skill-quality-principles.md` first.** Its "Customization (customize.toml)" section is the schema, naming conventions, and merge rules. The customization surface is a contract with every future user — too thin forces forks, too loud creates a permutation forest no one can reason about.
This is purely advisory. Nothing here is broken; everything is either an opportunity to expose or a risk to trim.
## Scan Targets
- `customize.toml` — if present, the canonical schema for this workflow
- `SKILL.md` — `{workflow.X}` references (signals customize.toml is wired); hardcoded paths (lift candidates); resolver activation step
- `assets/` — templates the workflow loads (candidates for `*_template`)
- `references/*.md` — stage prompts that may reference configurable values
If no `customize.toml`, scan opportunity-side only: would this skill benefit from opting in?
## What to Find
**Opportunities — things to lift:**
- Hardcoded template paths in SKILL.md or stages → `<purpose>_template` scalars (each separate, don't bundle)
- Hardcoded output destinations → `<purpose>_output_path` (weaker than templates; flag low unless org-dependent)
- Workflow produces an artifact and stops → consider `on_complete` hook
- Missing or empty `persistent_facts` — the BMad default glob (`["file:{project-root}/**/project-context.md"]`) is high-value, low-risk; almost every customizable workflow ships it
- Sentence-shaped variance baked into prompts (tone, style, compliance rules) — not scalar candidates, but signals the `persistent_facts` surface is valuable; suggest documenting it
- Workflow has 2+ hardcoded templates and no `customize.toml` at all → high-opportunity to opt in
**Abuse — things to trim:**
- Boolean toggles (3+ in one file = the surface is doing the job of a variant skill; suggest two skills or fewer knobs)
- Identity / communication-style / principles in `[workflow]` (those are agent-shape fields — point the author at agent-builder; remove from workflow surface)
- 4+ `on_<event>` hooks (workflow internals leaking into the override surface; users can interleave hooks at so many points they break the workflow's contract)
- Arrays of tables without `code` or `id` keys (resolver can't merge by key; falls back to append-only — users can't replace items)
- Mixed keying (`code` on some, `id` on others) — pick one
- Opaque scalar names (`style_config`, `mode`-as-path) — use the principles file's `*_template` / `*_output_path` / `on_<event>` patterns
- `customize.toml` declares a scalar but SKILL.md hardcodes the same value (high-abuse — overrides silently no-op; SKILL.md must read `{workflow.<name>}`)
- Scalars with no comment explaining when/why to override
## Output
Write to `{quality-report-dir}/customization-analysis.md`. Include:
- **Customization posture** — opted in? Surface size and shape?
- **Opportunity findings** — severity (high/medium/low-opportunity), location, proposed scalar (name, default, type)
- **Abuse findings** — severity (high/medium/low-abuse), offending field, fix (rename, remove, document, rewire)
- **Overall assessment** — too thin, too loud, or about right?
- **Top 2-3 insights** distilled
Return only the filename when complete.

@@ -0,0 +1,60 @@
# Quality Scan: Determinism & Distribution
You are a performance and intelligence-placement reviewer. Your job: find work happening in the wrong place — deterministic operations done by an LLM, sequential operations that should run in parallel, parent reads that should be subagent delegations, and prompts doing what a script could do faster, cheaper, and more reliably.
**Load `references/skill-quality-principles.md` first.** Its "Intelligence placement" and "Subagent constraints" sections are the bar.
This scan absorbs what was previously two separate scanners (execution-efficiency, script-opportunities). Same root question: where is work happening that shouldn't be happening here?
## Scan Targets
- `SKILL.md` — On Activation patterns, inline operations
- `*.md` prompt files at root — stage instructions
- `references/*.md` — resource-loading patterns
- `scripts/` — what already exists (avoid suggesting duplicates)
If `execution-deps-prepass.json` is provided, read it first for compact dependency metrics.
## What to Find
**Script opportunities** — for every operation in a prompt, ask: given identical input, will this always produce identical output? Could you write a unit test for it? If yes, it belongs in a script.
Patterns to surface:
- Validation against schemas, frontmatter checks, naming-convention enforcement
- Counting, aggregation, metrics extraction
- Format conversion, parsing, structured-data extraction from large files
- Cross-reference checks, dependency graph tracing, file-existence verification
- **Pre-passes** that hand the LLM compact JSON instead of raw files (highest-value, often missed — the LLM scanner reads the JSON, not the source)
- Post-processing validation of LLM-generated output
For each, estimate the LLM tax in tokens-per-invocation: heavy (500+) → high; moderate (100-500) → medium; light (<100) → low.
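The banding can be written down as a tiny function. This sketch reads the moderate band as 100 to 499 tokens so that exactly 500 counts as heavy; the function name `llm_tax_severity` is hypothetical:

```python
def llm_tax_severity(tokens_per_invocation: int) -> str:
    """Map an estimated per-invocation token cost to a severity band."""
    if tokens_per_invocation >= 500:
        return "high"    # heavy
    if tokens_per_invocation >= 100:
        return "medium"  # moderate
    return "low"         # light
```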
Scripts have access to bash + Python stdlib + PEP 723 deps + git + jq + system tools. Think broadly — a script that builds a dependency graph and feeds the LLM a compact summary is zero tokens for work that would otherwise cost thousands.
Don't flag operations that genuinely require interpreting meaning, tone, context, or ambiguity. Those stay in prompts.
**Distribution opportunities** — sequential or parent-bloating patterns:
- Independent reads / tool calls / operations done sequentially → batch in one message or fan out to subagents
- "Read all files, then analyze" → delegate the reading; parent stays lean
- Implicit-read trap (per principles file): language like "review", "acknowledge", "summarize what you have" causes the parent to read files before delegating. Fix: explicit "note paths for subagent scanning; don't read them now"
- Subagent prompts without exact return format / "ONLY return X" / token limit → verbose results
- Subagent-spawning-from-subagent (will fail at runtime — chain through parent)
- Resources loaded as a single block on every activation when they could be loaded selectively
- Dependency graph over-constrained (`after` listing things that aren't real inputs) → blocks parallelism
- "Gather then process" for independent items → each item should process independently
- Validation stages placed AFTER expensive operations → fail-fast lost; cheap validation should run first
## Output
Write to `{quality-report-dir}/determinism-analysis.md`. Include:
- **Existing scripts inventory** — what's already there (so you don't propose duplicates)
- **Assessment** — 2-3 sentence verdict on intelligence placement and execution efficiency
- **Script findings** — each with severity (LLM tax band), file:line, what the LLM is currently doing, what a script would do, estimated token savings, language, pre-pass potential
- **Distribution findings** — each with severity, file:line, current pattern, efficient alternative, estimated impact
- **Aggregate token savings** estimate
- **Strengths** — efficient patterns worth preserving
Severity comes from the principles file: anything that will fail at runtime is critical; heavy LLM tax or context-bloating reads are high; missed batching is medium; small parallelization wins are low.
Return only the filename when complete.

@@ -0,0 +1,55 @@
# Quality Scan: Enhancement Opportunities
You are the creative imagination on this review — the one who asks **"what's missing that nobody thought of?"** when other scanners only check what's there. Inhabit the skill as different real users in different real situations, and find the moments where it would confuse, frustrate, dead-end, or underwhelm them — plus the moments where one creative addition would transform the experience.
**Load `references/skill-quality-principles.md` first.** Its "Patterns BMad has seen pay off" section is the institutional library you'll check the skill against.
This is purely advisory. Nothing here is broken; everything is opportunity.
## Scan Targets
- `SKILL.md`, stage prompts, `references/*.md` — walk the skill end-to-end as users would experience it
## What to Find
**Inhabit user archetypes** — the first-timer, the expert who knows what they want, the confused user (invoked by accident or with wrong intent), the edge-case user (technically valid but unexpected input), the hostile environment (deps fail, files missing, context limited), and **the automator** (cron / pipeline / another agent invoking this headless with pre-supplied inputs and expecting a usable return value).
At each stage, ask:
- What if the user provides partial, ambiguous, or contradictory input?
- What if they want to skip back, change their mind, or exit cleanly mid-flow?
- What happens if an external dependency is unavailable?
- What if context compaction drops critical state mid-conversation?
- Where does the skill complete but leave the user without a clear sense of what they got?
**Headless assessment** — many workflows are built HITL-only but could work with a flag and a pre-supplied prompt. For each interaction point, ask whether a parameter could replace the question, whether a confirmation could be skipped with a reasonable default, whether a clarification is always needed or only for ambiguous input. Categorize:
- **Headless-ready** — works today with minimal changes
- **Easily adaptable** — needs a headless path on 2-3 stages
- **Partially adaptable** — core artifact creation could be headless, but discovery is fundamentally interactive — suggest a "skip to build" entry point
- **Fundamentally interactive** — the value IS the conversation (coaching, brainstorming, exploration). That's OK; flag and move on.
**Facilitative pattern check** — for any skill involving collaborative discovery or guided artifact creation, check the principles file's named patterns: soft-gate elicitation, intent-before-ingestion, capture-don't-interrupt, dual-output, parallel review lenses, three-mode architecture, graceful degradation. Flag missing ones with concrete suggestions when they'd be transformative.
**Delight opportunities** — quick-win mode for experts, smart defaults from context, proactive insight ("you might also want to consider..."), progress awareness in long flows, useful alternatives when things go wrong, suggestions for adjacent skills.
**Stay in your lane.** Don't flag structural issues (architecture scanner), efficiency or script opportunities (determinism scanner), or customization (customization scanner). Your findings should be things only a creative thinker would notice.
## How to Think
Go wild first — the weirdest user, the worst timing, the most unexpected input. No idea is too crazy in this phase. Then temper. For each wild idea, ask: is there a practical version that would actually improve the skill? If yes, distill to a sharp suggestion. If genuinely impractical, drop it — don't pad findings with fantasies.
Prioritize by user impact. Preventing confusion outranks adding nice-to-haves.
## Output
Write to `{quality-report-dir}/enhancement-analysis.md`. Include:
- **Skill understanding** — purpose, primary user, key assumptions (2-3 sentences)
- **User journeys** — for each archetype: brief narrative, friction points, bright spots
- **Headless assessment** — level + which interaction points could auto-resolve + what a headless invocation would need (inputs, return format)
- **Facilitative patterns check** — present/missing, which would be most valuable to add
- **Findings** — severity (high/medium/low-opportunity), location, what you noticed, concrete suggestion
- **Top 2-3 insights** distilled
Return only the filename when complete.

@@ -0,0 +1,182 @@
# BMad Quality Analysis Report Creator
You synthesize scanner output into a unified, actionable quality report. Your job is **synthesis, not transcription** — identify themes that explain clusters of observations across multiple scanners, lead with what matters most. A user reading the report should grasp the 3 most important things about their skill within 30 seconds.
## Inputs
- `{skill-path}` — the skill being analyzed
- `{quality-report-dir}` — directory with all scanner output and where you write the report
## Read
- `*-temp.json` — lint script output (structured findings)
- `*-prepass.json` — pre-pass metrics
- `*-analysis.md` — LLM scanner analyses (free-form): `architecture-analysis.md`, `determinism-analysis.md`, `customization-analysis.md`, `enhancement-analysis.md`
## Synthesize Themes
This is the most important step. Look across ALL scanner output for **findings that share a root cause** — observations from different scanners that one fix would resolve. Ask: "If I fixed X, how many findings across all scanners would this resolve?"
Group related findings into 3-5 themes. Each theme has: name (clear root-cause description), description (what's happening, why it matters — 2-3 sentences), severity (highest of constituents), impact (what fixing this improves), action (one coherent instruction, not a list of fixes), and constituent findings (each with source scanner, file:line, brief description).
Findings that don't fit any theme become standalone items.
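The theme shape described above can be pictured as a pair of small records. This is a sketch for orientation only: the class names are hypothetical, and the fallback of `"low"` for a theme with no findings is an assumption. The one rule it encodes from the text is that a theme takes the highest severity among its constituents.

```python
from dataclasses import dataclass, field

# Ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass
class Finding:
    source: str       # which scanner reported it
    file: str
    line: int
    description: str
    severity: str


@dataclass
class Theme:
    name: str
    description: str
    impact: str
    action: str
    findings: list = field(default_factory=list)

    @property
    def severity(self) -> str:
        """A theme inherits the highest severity among its findings."""
        return max(
            (f.severity for f in self.findings),
            key=SEVERITY_ORDER.index,
            default="low",  # assumption: an empty theme ranks lowest
        )
```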
## Assess Overall Quality
- **Grade:** Excellent (no high+ issues, few medium) / Good (some high or several medium) / Fair (multiple high) / Poor (critical issues)
- **Narrative:** 2-3 sentences capturing the skill's primary strength and primary opportunity. This is what the user reads first.
## Write Two Files
### 1. quality-report.md
```markdown
# BMad Quality Analysis: {skill-name}
**Analyzed:** {timestamp} | **Path:** {skill-path}
**Interactive report:** quality-report.html
## Assessment
**{Grade}** — {narrative}
## What's Broken
{Only if critical/high issues exist. Each with file:line, what's wrong, how to fix.}
## Opportunities
### 1. {Theme Name} ({severity} — {N} observations)
{Description.} **Fix:** {One coherent action.}
**Observations:**
- {finding} — file:line
- ...
{Repeat for each theme.}
## Strengths
{What works — preserve these.}
## Detailed Analysis
### Architecture
{Assessment + findings not covered by themes (structural integrity, prose craft, cohesion).}
### Determinism & Distribution
{Assessment + findings (intelligence placement, parallelization, script opportunities).}
### Customization Surface
{Assessment + opportunities and abuse findings.}
### User Experience
{Journeys, headless assessment, facilitative-pattern check, edge cases.}
## Recommendations
1. {Highest impact — resolves N observations}
2. ...
```
### 2. report-data.json
This is consumed by `scripts/generate-html-report.py`. Use the field names exactly. Arrays may be empty `[]` but must exist.
```json
{
"meta": {
"skill_name": "the-skill-name",
"skill_path": "/full/path/to/skill",
"timestamp": "2026-03-26T23:03:03Z",
"scanner_count": 6
},
"narrative": "2-3 sentence synthesis shown at top of report",
"grade": "Excellent|Good|Fair|Poor",
"broken": [
{
"title": "Short headline",
"file": "relative/path.md",
"line": 25,
"detail": "Why it's broken and what goes wrong",
"action": "Specific fix",
"severity": "critical|high",
"source": "which-scanner"
}
],
"opportunities": [
{
"name": "Theme name",
"description": "What's happening and why it matters",
"severity": "high|medium|low",
"impact": "What fixing this achieves",
"action": "One coherent fix instruction for the whole theme",
"finding_count": 9,
"findings": [
{
"title": "Individual observation headline",
"file": "relative/path.md",
"line": 42,
"detail": "What was observed",
"source": "which-scanner"
}
]
}
],
"strengths": [
{
"title": "What's strong",
"detail": "Why it matters and should be preserved"
}
],
"detailed_analysis": {
"architecture": {
"assessment": "1-3 sentence summary from architecture scanner",
"findings": []
},
"determinism": {
"assessment": "1-3 sentence summary from determinism scanner",
"token_savings": "estimated total from script opportunities",
"findings": []
},
"customization": {
"assessment": "1-3 sentence summary from customization scanner",
"posture": "opted-in|not-opted-in|over-extended",
"findings": []
},
"enhancement": {
"assessment": "1-3 sentence summary from enhancement scanner",
"journeys": [
{
"archetype": "first-timer|expert|confused|edge-case|hostile-environment|automator",
"summary": "Brief narrative of this user's experience",
"friction_points": ["moment where user struggles"],
"bright_spots": ["moment where skill shines"]
}
],
"autonomous": {
"potential": "headless-ready|easily-adaptable|partially-adaptable|fundamentally-interactive",
"notes": "Brief assessment"
},
"findings": []
}
},
"recommendations": [
{
"rank": 1,
"action": "What to do",
"resolves": 9,
"effort": "low|medium|high"
}
]
}
```
Required field names: `meta.skill_name`, opportunities use `name` and `finding_count`, strengths are objects with `title` and `detail`, recommendations use `action` and numeric `rank`, journeys use `archetype` / `summary` / `friction_points` / `bright_spots`, autonomous uses `potential` / `notes`. The four `detailed_analysis` keys are `architecture`, `determinism`, `customization`, `enhancement`.
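Because the HTML generator depends on exact field names, a quick self-check before returning can catch drift. The sketch below is a hypothetical helper, not part of `scripts/generate-html-report.py`. Its list of top-level keys is taken from the example above and is an assumption about what the generator needs.

```python
REQUIRED_TOP_LEVEL = [
    "meta", "narrative", "grade", "broken", "opportunities",
    "strengths", "detailed_analysis", "recommendations",
]
DETAIL_KEYS = ["architecture", "determinism", "customization", "enhancement"]


def missing_fields(report: dict) -> list:
    """Return dotted names of required fields absent from report-data.json."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in report]
    if "skill_name" not in report.get("meta", {}):
        missing.append("meta.skill_name")
    detail = report.get("detailed_analysis", {})
    missing += [f"detailed_analysis.{k}" for k in DETAIL_KEYS if k not in detail]
    # Per-item checks for the arrays whose field names are called out above.
    per_item = {
        "opportunities": ("name", "finding_count"),
        "strengths": ("title", "detail"),
        "recommendations": ("action", "rank"),
    }
    for array, keys in per_item.items():
        for i, item in enumerate(report.get(array, [])):
            missing += [f"{array}[{i}].{k}" for k in keys if k not in item]
    return missing
```

An empty return value means every field named in this section is present; it does not validate types or values.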
Write both files to `{quality-report-dir}/`.
## Return
Return only the path to `report-data.json` when complete.

@@ -0,0 +1,100 @@
# Script Opportunities Reference — Workflow Builder
**See `references/script-standards.md` for script creation guidelines.**
## Core Principle
Scripts handle deterministic operations (validate, transform, count). Prompts handle judgment (interpret, classify, decide). If a check has clear pass/fail criteria, it belongs in a script.
---
## How to Spot Script Opportunities
### The Determinism Test
1. **Given identical input, will it always produce identical output?** → Script candidate.
2. **Could you write a unit test with expected output?** → Definitely a script.
3. **Requires interpreting meaning, tone, or context?** → Keep as prompt.
### The Judgment Boundary
| Scripts Handle | Prompts Handle |
| -------------------------------- | ------------------------------------ |
| Fetch, Transform, Validate | Interpret, Classify (ambiguous) |
| Count, Parse, Compare | Create, Decide (incomplete info) |
| Extract, Format, Check structure | Evaluate quality, Synthesize meaning |
### Signal Verbs in Prompts
When you see these in a workflow's requirements, think scripts first: "validate", "count", "extract", "convert/transform", "compare", "scan for", "check structure", "against schema", "graph/map dependencies", "list all", "detect pattern", "diff/changes between".
### Script Opportunity Categories
| Category | What It Does | Example |
| ------------------- | ----------------------------------------------------------- | -------------------------------------------------- |
| Validation | Check structure, format, schema, naming | Validate frontmatter fields exist |
| Data Extraction | Pull structured data without interpreting meaning | Extract all `{variable}` references from markdown |
| Transformation | Convert between known formats | Markdown table to JSON |
| Metrics | Count, tally, aggregate statistics | Token count per file |
| Comparison | Diff, cross-reference, verify consistency | Cross-ref prompt names against SKILL.md references |
| Structure Checks | Verify directory layout, file existence | Skill folder has required files |
| Dependency Analysis | Trace references, imports, relationships | Build skill dependency graph |
| Pre-Processing | Extract compact data from large files BEFORE LLM reads them | Pre-extract file metrics into JSON for LLM scanner |
| Post-Processing | Verify LLM output meets structural requirements | Validate generated YAML parses correctly |
### Your Toolbox
**Python is the default** for all script logic (cross-platform: macOS, Linux, Windows/WSL). See `references/script-standards.md` for full rationale and safe bash commands.
- **Python:** Full standard library (`json`, `pathlib`, `re`, `argparse`, `collections`, `difflib`, `ast`, `csv`, `xml`, etc.) plus PEP 723 inline-declared dependencies (`tiktoken`, `jsonschema`, `pyyaml`, etc.)
- **Safe shell commands:** `git`, `gh`, `uv run`, `npm`/`npx`/`pnpm`, `mkdir -p`
- **Avoid bash for logic** — no piping, `jq`, `grep`, `sed`, `awk`, `find`, `diff`, `wc` in scripts. Use Python equivalents instead.
### The --help Pattern
All scripts use PEP 723 metadata and implement `--help`. Prompts can reference `scripts/foo.py --help` instead of inlining interface details — single source of truth, saves prompt tokens.
---
## Script Output Standard
All scripts MUST output structured JSON:
```json
{
"script": "script-name",
"version": "1.0.0",
"skill_path": "/path/to/skill",
"timestamp": "2025-03-08T10:30:00Z",
"status": "pass|fail|warning",
"findings": [
{
"severity": "critical|high|medium|low|info",
"category": "structure|security|performance|consistency",
"location": { "file": "SKILL.md", "line": 42 },
"issue": "Clear description",
"fix": "Specific action to resolve"
}
],
"summary": {
"total": 0,
"critical": 0,
"high": 0,
"medium": 0,
"low": 0
}
}
```
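A sketch of assembling that envelope from a list of findings. The helper name `build_report` is hypothetical, and the rule used to derive `status` (critical or high fails, any other finding warns) is an assumption, since the standard above does not define it:

```python
from collections import Counter
from datetime import datetime, timezone


def build_report(script: str, skill_path: str, findings: list) -> dict:
    """Assemble the standard output envelope from a list of findings."""
    counts = Counter(f["severity"] for f in findings)
    # Assumption: critical/high findings fail the run, any other finding warns.
    if counts["critical"] or counts["high"]:
        status = "fail"
    elif findings:
        status = "warning"
    else:
        status = "pass"
    return {
        "script": script,
        "version": "1.0.0",
        "skill_path": skill_path,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "status": status,
        "findings": findings,
        "summary": {
            "total": len(findings),
            **{level: counts[level] for level in ("critical", "high", "medium", "low")},
        },
    }
```

Serializing the result with `json.dumps` yields the structure shown above; `info` findings count toward `total` but have no bucket of their own in `summary`.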
### Implementation Checklist
- [ ] `--help` with PEP 723 metadata
- [ ] Accepts skill path as argument
- [ ] `-o` flag for output file (defaults to stdout)
- [ ] Diagnostics to stderr
- [ ] Exit codes: 0=pass, 1=fail, 2=error
- [ ] `--verbose` flag for debugging
- [ ] Self-contained (PEP 723 for dependencies)
- [ ] No interactive prompts, no network dependencies
- [ ] Valid JSON to stdout
- [ ] Tests in `scripts/tests/`
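
A skeleton that ticks most of these boxes might look like the following. It is a sketch under assumptions: the check itself (does `SKILL.md` exist?) and the script name `check-skill-md` are invented for illustration, and the JSON is trimmed to two fields rather than the full output standard.

```python
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.10"
# ///
"""Hypothetical skeleton: report whether a skill folder contains SKILL.md."""
import argparse
import json
import sys
from pathlib import Path
from typing import Optional


def main(argv: Optional[list] = None) -> int:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("skill_path", help="path to the skill directory")
    parser.add_argument("-o", dest="output", help="output file (default: stdout)")
    parser.add_argument("--verbose", action="store_true", help="diagnostics on stderr")
    args = parser.parse_args(argv)

    root = Path(args.skill_path)
    if not root.is_dir():
        print(f"error: {root} is not a directory", file=sys.stderr)
        return 2  # error
    if args.verbose:
        print(f"scanning {root}", file=sys.stderr)

    passed = (root / "SKILL.md").is_file()
    report = json.dumps(
        {"script": "check-skill-md", "status": "pass" if passed else "fail"},
        indent=2,
    )
    if args.output:
        Path(args.output).write_text(report + "\n", encoding="utf-8")
    else:
        print(report)
    return 0 if passed else 1  # 0=pass, 1=fail

# A real script would end by calling: sys.exit(main())
```

Returning the exit code from `main` instead of calling `sys.exit` inside it keeps the function easy to call from tests in `scripts/tests/`.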

@@ -0,0 +1,92 @@
# Script Creation Standards
When building scripts for a skill, follow these standards to ensure portability and zero-friction execution. Skills must work across macOS, Linux, and Windows (native, Git Bash, and WSL).
## Python Over Bash
**Always favor Python for script logic.** Bash is not portable — it fails or behaves inconsistently on Windows (Git Bash is MSYS2-based, not a full Linux shell; WSL bash can conflict with Git Bash on PATH; PowerShell is a different language entirely). Python with `uv run` works identically on all platforms.
**Safe bash commands** — these work reliably across all environments and are fine to use directly:
- `git`, `gh` — version control and GitHub CLI
- `uv run` — Python script execution with automatic dependency handling
- `npm`, `npx`, `pnpm` — Node.js ecosystem
- `mkdir -p` — directory creation
**Everything else should be Python** — piping, `jq`, `grep`, `sed`, `awk`, `find`, `diff`, `wc`, and any non-trivial logic. Even `sed -i` behaves differently on macOS vs Linux. If it's more than a single safe command, write a Python script.
## Favor the Standard Library
Always prefer Python's standard library over external dependencies. The stdlib is pre-installed everywhere, requires no `uv run`, and has zero supply-chain risk. Common stdlib modules that cover most script needs:
- `json` — JSON parsing and output
- `pathlib` — cross-platform path handling
- `re` — pattern matching
- `argparse` — CLI interface
- `collections` — counters, defaultdicts
- `difflib` — text comparison
- `ast` — Python source analysis
- `csv`, `xml.etree` — data formats
Only pull in external dependencies when the stdlib genuinely cannot do the job (e.g., `tiktoken` for accurate token counting, `pyyaml` for YAML parsing, `jsonschema` for schema validation). **External dependencies must be confirmed with the user during the build process** — they add install-time cost, supply-chain surface, and require `uv` to be available.
## PEP 723 Inline Metadata (Required)
Every Python script MUST include a PEP 723 metadata block. For scripts with external dependencies, use the `uv run` shebang:
```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.10"
# dependencies = ["pyyaml>=6.0", "jsonschema>=4.0"]
# ///
```
For scripts using only the standard library, use a plain Python shebang but still include the metadata block:
```python
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.10"
# ///
```
**Key rules:**
- The shebang MUST be line 1 — before the metadata block
- Always include `requires-python`
- List all external dependencies with version constraints
- Never use `requirements.txt`, `pip install`, or expect global package installs
- The shebang is a Unix convenience only — cross-platform invocation always uses `uv run scripts/foo.py`
## Invocation in SKILL.md
How a built skill's SKILL.md should reference its scripts:
- **Scripts with external dependencies:** `uv run scripts/analyze.py {args}`
- **Stdlib-only scripts:** `python3 scripts/scan.py {args}` (also fine to use `uv run` for consistency)
`uv run` reads the PEP 723 metadata, silently caches dependencies in an isolated environment, and runs the script — no user prompt, no global install. Like `npx` for Python.
## Graceful Degradation
Skills may run in environments where Python or `uv` is unavailable (e.g., claude.ai web). Scripts should be the fast, reliable path — but the skill must still deliver its outcome when execution is not possible.
**Pattern:** When a script cannot execute, the LLM performs the equivalent work directly. The script's `--help` documents what it checks, making this fallback natural. Design scripts so their logic is understandable from their help output and the skill's context.
In SKILL.md, frame script steps as outcomes, not just commands:
- Good: "Validate path conventions (run `scripts/scan-paths.py --help` for details)"
- Avoid: "Execute `python3 scripts/scan-paths.py`" with no context about what it does
## Script Interface Standards
- Implement `--help` via `argparse` (single source of truth for the script's API)
- Accept target path as a positional argument
- `-o` flag for output file (default to stdout)
- Diagnostics and progress to stderr
- Exit codes: 0=pass, 1=fail, 2=error
- `--verbose` flag for debugging
- Output valid JSON to stdout
- No interactive prompts, no network dependencies
- Tests in `scripts/tests/`
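A minimal sketch of a stdlib-only script that follows the interface standards above. The check itself (`scan`, which passes when at least one `.md` file exists) and the JSON field names `status` and `files` are hypothetical; only the flags, the stream usage, and the exit codes come from the list.

```python
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.10"
# ///
"""Hypothetical example: list the markdown files under a skill directory."""
from __future__ import annotations

import argparse
import json
import sys
from pathlib import Path


def scan(target: Path) -> dict:
    """Illustrative check only: pass when at least one .md file exists."""
    files = sorted(p.relative_to(target).as_posix() for p in target.rglob("*.md"))
    return {"status": "pass" if files else "fail", "files": files}


def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("target", type=Path, help="path to the skill directory")
    parser.add_argument("-o", "--output", type=Path, help="write JSON here instead of stdout")
    parser.add_argument("--verbose", action="store_true", help="print diagnostics to stderr")
    args = parser.parse_args(argv)

    if not args.target.is_dir():
        print(f"error: {args.target} is not a directory", file=sys.stderr)
        return 2  # error: the check could not run

    if args.verbose:
        print(f"scanning {args.target}", file=sys.stderr)

    payload = json.dumps(result := scan(args.target), indent=2)
    if args.output:
        args.output.write_text(payload + "\n", encoding="utf-8")
    else:
        print(payload)  # stdout carries JSON only
    return 0 if result["status"] == "pass" else 1  # 0=pass, 1=fail
```

In a real script, finish with the usual `__main__` guard calling `sys.exit(main())`. Accepting `argv` as a parameter keeps `main` callable from tests in `scripts/tests/`.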

View File

@@ -0,0 +1,230 @@
# Skill Quality Principles
What earns its place in a BMad skill, and what should be cut. Loaded at both build time (so the author follows the bar upfront) and at quality-analysis time (so scanners verify against the same bar).
## The Core Test
For every line you write or review: **would an LLM do this correctly without being told?** If yes, cut it. The instruction must earn its place by preventing a failure that would otherwise happen.
## What Earns Its Keep
The model already knows how to facilitate, ask questions, write prose, parse intent, and format markdown. Spend file weight on:
- **Project paths and outputs** — `{project-root}/...`, config-resolved paths, where the artifact lands.
- **Schema** — frontmatter format, customize.toml shape, downstream contracts.
- **BMad-specific conventions** — naming (`bmad-` prefix, module prefixes), description format, intelligence placement.
- **Hard rules with body count** — the implicit-read trap, subagent-can't-spawn-subagent, compaction survival.
- **Fragile-operation invocations** — exact script commands, exact API calls. One right way.
- **Domain framing and theory-of-mind** for interactive workflows — context that enables judgment.
- **Design rationale** for non-obvious choices — prevents the LLM from "optimizing" away constraints it doesn't understand.
## What Doesn't Earn Its Keep
- Numbered procedural steps for things the LLM does naturally
- Per-platform adapter files for tools the LLM speaks fluently
- Scoring formulas, weighted calibration tables, decision matrices for subjective judgment
- Templates teaching output formatting, greeting users, or prompt assembly
- "Why It Matters" prose attached to obvious checks
- Defensive padding ("make sure", "don't forget", "remember to")
- Meta-explanation ("This workflow is designed to...")
- Bot personas with rubrics where role + outcome would do the same job
- Explaining the model to itself ("You are an AI that...")
- Multiple files that could be a single instruction
## Outcome vs Prescriptive
| Prescriptive (avoid) | Outcome-based (prefer) |
| --- | --- |
| "Step 1: Ask about goals. Step 2: Ask about constraints. Step 3: Summarize and confirm." | "Ensure the user's vision is fully captured — goals, constraints, and edge cases — before proceeding." |
| "Load config. Read user_name. Read communication_language. Greet by name in their language." | "Load available config and greet the user appropriately." |
| "Create a file. Write the header. Write section 1. Write section 2. Save." | "Produce a report covering X, Y, and Z." |
The prescriptive versions miss requirements the author didn't think of. The outcome-based versions let the LLM adapt.
## When Procedure IS Value
Reserve exact steps for fragile operations where deviation has consequences:
- Exact script invocations (`python3 scripts/foo.py {arg}`)
- Specific file paths and config keys
- API calls with precise parameters
- Security-critical operations
- The customize.toml resolver step
| Freedom | When | Example |
| --- | --- | --- |
| **High** (outcomes) | Multiple valid approaches, LLM judgment adds value | "Ensure the user's requirements are complete" |
| **Medium** (guided) | Preferred approach exists, some variation OK | "Present findings in a structured report with an executive summary" |
| **Low** (exact) | Fragile, one right way, consequences for deviation | `python3 scripts/scan-path-standards.py {skill-path}` |
## BMad Institutional Knowledge
Things the bare model genuinely won't know. This is what your file weight buys.
### Naming
- Skill name = folder name (kebab-case)
- Module skill: `{module-code}-{name}` (e.g. `bmm-create-prd`, `cis-brainstorm`)
- Standalone: `{name}`
- The `bmad-` prefix is reserved for official BMad creations
### Description format
Two parts: `[5-8 word summary]. [Use when user says 'specific phrase' or 'specific phrase'.]`
Quote the trigger phrases. Default to conservative (explicit) triggering — most BMad skills are explicitly invoked. Organic triggering is reserved for skills that should activate on context (e.g. "Trigger when code imports anthropic SDK").
Bad: `Helps with PRDs and product requirements.` (too vague — hijacks unrelated conversations).
### Path conventions
All file references in a skill use bare paths from the skill root. The canonical Conventions block (from `bmad-prfaq/SKILL.md`) — stamp it into any SKILL.md that references multiple internal files:
```
## Conventions
- Bare paths (e.g. `references/press-release.md`) resolve from the skill root.
- `{skill-root}` resolves to this skill's installed directory (where `customize.toml` lives).
- `{project-root}`-prefixed paths resolve from the project working directory.
- `{skill-name}` resolves to the skill directory's basename.
```
Additional rules:
- Forward slashes only (cross-platform).
- Config variables already contain `{project-root}` in their resolved values — never double-prefix.
- `references/` is for prompt content carved out of SKILL.md. `assets/` is for templates and other static content the workflow loads. `scripts/` is for deterministic code. Never put workflow content directly at skill root.
### Customization (customize.toml)
Always-present fields: `activation_steps_prepend`, `activation_steps_append`, `persistent_facts` (each is an array; overrides append).
Workflow-specific scalars (lifted during configurability discovery):
- `<purpose>_template` for template file paths
- `<purpose>_output_path` for writable destinations
- `on_<event>` for hook scalars
Arrays of tables MUST key on `code` or `id` (resolver merges by key; without it, falls back to append-only).
Merge rules: scalars override, tables deep-merge, arrays-of-tables key-merge, plain arrays append.
Override files: `{project-root}/_bmad/custom/{skill-name}.toml` (team) and `{project-root}/_bmad/custom/{skill-name}.user.toml` (personal). Merge order: base → team → user.
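The merge rules can be sketched on plain dictionaries as follows. This is an illustration under stated assumptions, not the actual resolver: the function names are hypothetical, and the inputs stand in for already-parsed TOML.

```python
from __future__ import annotations

from typing import Any


def _key_of(item: Any) -> tuple[str, Any] | None:
    """Return the merge key of a table, looking at `code` then `id`."""
    if isinstance(item, dict):
        for key in ("code", "id"):
            if key in item:
                return (key, item[key])
    return None


def _merge_arrays(base: list, override: list) -> list:
    items = base + override
    if not items or any(_key_of(item) is None for item in items):
        return items  # plain arrays and unkeyed tables: append-only
    result: list = []
    index: dict = {}
    for item in items:
        key = _key_of(item)
        if key in index:
            # same key seen before: the later table merges over the earlier one
            result[index[key]] = merge(result[index[key]], item)
        else:
            index[key] = len(result)
            result.append(item)
    return result


def merge(base: Any, override: Any) -> Any:
    """Scalars override, tables deep-merge, arrays append or key-merge."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for name, value in override.items():
            merged[name] = merge(base[name], value) if name in base else value
        return merged
    if isinstance(base, list) and isinstance(override, list):
        return _merge_arrays(base, override)
    return override
```

Applying the documented order is then `merge(merge(base, team), user)`.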
Default `persistent_facts`: `["file:{project-root}/**/project-context.md"]` is BMad's convention.
SKILL.md must reference resolved values as `{workflow.<name>}`. Hardcoded paths next to a declared scalar = override silently no-ops.
### Intelligence placement
- Scripts handle plumbing: fetch, parse, validate, count, transform.
- Prompts handle judgment: interpret, classify, decide.
- Script using regex to decide what content MEANS = intelligence leak into the script.
- Prompt validating structure, counting items, comparing against schemas = determinism leak into the LLM.
### Workflows: inline first, carve out only when needed
Default: write the entire workflow as named sections in SKILL.md (`## Discovery`, `## Constraints`, `## Finalize`, etc.). A multi-stage coaching workflow can live in one SKILL.md.
Carve out to `references/` only when SKILL.md genuinely gets too big to scan. When you do:
- **Descriptive filenames.** `references/press-release.md`, `references/customer-faq.md`. Never numbered prefixes (`01-press-release.md`) — the carve-out is a section, not a "step." SKILL.md routes to references by name and the order is whatever SKILL.md specifies.
- Each carved-out file works standalone — context compaction can drop SKILL.md mid-flow. No "as described in the overview."
- Progression conditions, where they exist, must be testable ("when X is captured, route to Y"). "When ready" is vague.
- The file uses `{communication_language}` (and `{document_output_language}` if it produces a doc).
- There are NO exit hooks in the system. Don't add `## On Exit` sections — they'd never run.
### Headless mode
When a skill supports headless invocation, the decision log absorbs every assumption made without the user — intent inference, proposed names, customization defaults, conflict resolutions, lint-fix calls, anything the user would have weighed in on interactively. The JSON return is the smallest set of paths the caller needs (typically `skill` + `decision_log`, plus the report path for analysis flows); the log carries the reasoning. `status` is `complete` or `blocked`; on `blocked`, include a one-line `reason` and still return the log path so the caller can read the detail. Without this discipline, headless silently buries its calls and the audit trail breaks on the next session.
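A minimal sketch of the JSON return described above. The keys `skill`, `decision_log`, `status`, and `reason` come from the text; the helper name `headless_result` and the `report` key name are assumptions made for illustration.

```python
from __future__ import annotations

import json


def headless_result(
    skill: str,
    decision_log: str,
    report: str | None = None,
    blocked_reason: str | None = None,
) -> str:
    """Build the smallest JSON payload a headless caller needs."""
    result = {
        "status": "blocked" if blocked_reason is not None else "complete",
        "skill": skill,
        "decision_log": decision_log,  # returned even when blocked
    }
    if report is not None:
        result["report"] = report  # hypothetical key name for analysis flows
    if blocked_reason is not None:
        # collapse whitespace so the reason stays a single line
        result["reason"] = " ".join(blocked_reason.split())
    return json.dumps(result)
```

The reasoning behind each call stays in the decision log; the payload only tells the caller where to look.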
### Subagent constraints
- Subagents CANNOT spawn other subagents. Chain through parent.
- Don't read files in parent if you can delegate the read — parent stays lean.
- Subagent prompts must specify exact return format and "ONLY return X" constraint, or you get verbose prose.
- **The implicit-read trap:** Language like "review", "acknowledge", "summarize what you have" causes the parent to read files even when you didn't ask for it. If a later stage delegates document analysis, earlier stages must NOT use that language. Use "note paths for subagent scanning; don't read them now".
### Size guidance
Production targets, not hard limits. The "what fails if I delete this?" test still applies to every line.
- SKILL.md: ~80 lines target, hard ceiling ~130
- Multi-branch SKILL.md: up to ~250 lines if each branch has brief contextual explanation
- Single-purpose: up to ~500 lines (~5000 tokens) if focused
- Past those: lift to `references/` or `assets/`
### Patterns BMad has seen pay off
Institutional names for patterns the LLM won't generate by default:
- **Open-floor opening** — Conversational skills start with an explicit invitation for the user to share everything they have (goals, references, examples, paths to artifacts) before any structured Q&A. The dump replaces most of the question script that would otherwise follow; the agent then asks only what's missing. The form adapts to input — vague request gets "tell me everything", path/URL gets "what do you want focused on?". Costs almost nothing token-wise; drastically improves conversational feel.
- **Soft-gate elicitation** — "Anything else, or shall we move on?" at natural transitions. Users always remember one more thing when given a graceful exit.
- **Intent-before-ingestion** — Understand why the user is here before scanning artifacts. Without intent, scanning is noise.
- **Capture-don't-interrupt** — Out-of-scope insights mid-flow get captured silently, not redirected. Users in flow share their best stuff unprompted.
- **Dual-output** — Human artifact + LLM distillate, when the artifact will feed downstream agents.
- **Parallel review lenses** — Fan out 2-3 review subagents (skeptic, opportunity-spotter, contextually-chosen lens) before finalizing significant artifacts.
- **Three-mode architecture** — Guided / Yolo / Headless. Not all skills need all three; considering it during design prevents lock-in.
- **Graceful degradation** — Subagent-dependent features fall back to sequential when subagents are unavailable.
- **Decision-Log Workspace** — multi-turn workflows producing revisable artifacts. The decision log is the load-bearing artifact (carries identity across sessions, prevents railroading, audits overrides). Subsumes "document-as-cache" — see full treatment below.
### Writing
- One term per concept; pick it and stick to it.
- Third person in descriptions ("Processes files", not "I help process files").
- Descriptive file names (`form-validation-rules.md`, not `doc2.md`).
- One level deep for reference files — SKILL.md → reference, never SKILL → ref → ref chains.
## The Decision-Log Workspace Pattern
The default for any multi-turn workflow that produces a substantive artifact, may be revisited (Update or Validate), or risks running long enough to compact.
**Core insight.** The decision log is the load-bearing artifact, not the document. The document is what the user takes; the decision log is what carries identity across sessions, prevents the agent from railroading the user, surfaces conflicts on update, and creates an audit trail when the user overrides their own past calls. Workflows that lack it look fine on the first pass and fall apart on revisit.
### Workspace layout
All files live in a single folder rooted at the primary artifact. Two cases:
- **The artifact is a single document** (a brief, a PRFAQ, etc.) → the workspace is the document's containing folder; the log + addendum + distillate sit as peers of the document.
- **The artifact is itself a folder of files** (a built skill, a generated module) → the workspace IS the artifact's folder; the log + addendum sit as peers of the primary file (e.g. `SKILL.md`).
Either way, the workspace exists from the moment intent is confirmed — not at the end. The user knows the path immediately; state lives on disk, not in the conversation.
- `<primary>` — the artifact (or, for folder-artifacts, the primary file like `SKILL.md`). YAML frontmatter is the recoverable-state mechanism when the workflow needs it; fields are workflow-specific (the LLM picks what each workflow benefits from — some need none).
- `.decision-log.md` — every meaningful decision and why, with alternatives considered. Append-only across sessions, with date-stamped session headings. Can carry its own frontmatter for session state when that's useful.
- `addendum.md` — context the user surfaced that didn't earn a place in the primary (rejected alternatives, parked roadmap, options-considered matrices, in-depth personas). Created only when something earns its place.
- `distillate.md` *(optional)* — token-efficient version of the primary for downstream LLM consumers.
### Resume protocol
On activation, check whether a workspace already exists for this artifact. If found, surface it (with the `updated` timestamp from the primary's frontmatter) and offer to resume. Reading `.decision-log.md` recovers full context regardless of compaction.
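The "create if absent, append a new session heading if present" behaviour can be sketched as below. The helper name `open_session`, the `# Decision Log` title, and the `## Session YYYY-MM-DD` heading format are assumptions; the text only requires date-stamped session headings in an append-only `.decision-log.md`.

```python
from __future__ import annotations

from datetime import date
from pathlib import Path

LOG_NAME = ".decision-log.md"


def open_session(workspace: Path, today: date | None = None) -> bool:
    """Start a session in the workspace; return True when resuming an existing log."""
    today = today or date.today()
    log = workspace / LOG_NAME
    resumed = log.exists()
    workspace.mkdir(parents=True, exist_ok=True)
    # append mode never rewrites earlier sessions
    with log.open("a", encoding="utf-8") as handle:
        if not resumed:
            handle.write("# Decision Log\n")
        handle.write(f"\n## Session {today.isoformat()}\n")
    return resumed
```

A `True` return is the signal to surface the existing workspace and offer to resume.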
### Update mode
Read `.decision-log.md` and the addendum first. The change request enters as a "change signal" against the standing record. If the change contradicts a prior decision, surface the conflict before applying. Every change — clean or override — gets a new decision-log entry. Overrides also write to the addendum: the rejected reasoning needs to live somewhere.
### Validate mode
Read `.decision-log.md` first. A validation that ignores prior decisions or stated user criteria is shallow; it should challenge the artifact against the standards the user themselves set, not against generic rubrics.
### Finalize step
Decision-log audit. Every meaningful entry must be either captured in the primary, captured in the addendum, or explicitly set aside as process noise. The user ends the session with a shared accounting of how their thinking was handled — not a one-sided polish-and-deliver.
### When NOT to use
- Simple Utilities (no decisions to log; the input/output IS the contract).
- One-shot code operations (the diff is the decision log).
- Purely conversational skills (no artifact persists).
### Treatment style (writing it into a skill)
State the principle once where it first applies — typically inside the Create intent description as a single clause ("write the primary skeleton and `.decision-log.md` to the workspace; the decision log is canonical memory"). Mention reads at the moments that matter: Update reads decisions before changing them, Validate reads them before critiquing, Finalize audits the log at handoff. That's the entire treatment.
Do NOT:
- Open with a "Decision-log discipline" enumeration of what kinds of things to log — the LLM knows. Trust it.
- Write a separate `## Workspace` section header with meta-explanation of the pattern.
- Include a tree diagram of the workspace layout — the workspace is just files; the LLM names them as it uses them.
- Prescribe a YAML frontmatter schema for the decision log — fields are workflow-specific; let the building LLM pick what each workflow needs (or skip frontmatter entirely).
- Split workspace creation into separate "for new" / "for existing" sub-sections — "create if absent, append a new session heading if present" is one sentence.
The scanner flags skills that bury DLW guidance under ceremony. `bmad-product-brief` is the canonical-brief example: ~5 sentences total, threaded through Create / Update / Validate / Constraints / Finalize at the points where each matters.
## Failure Modes With Body Count
- **Description over-broadens** → Skill hijacks unrelated conversations. Fix: quote trigger phrases.
- **Vague progression conditions** ("when ready") → Stage never advances or advances early. Fix: testable conditions.
- **Stage references SKILL.md** ("as above") → Breaks on compaction. Fix: stages self-contained.
- **Subagent prompt without explicit return format** → Verbose prose responses. Fix: "Return ONLY {schema}. No other output."
- **Parent reads then delegates analysis** → Context bloat that makes delegation pointless. Fix: delegate the read.
- **Implicit-read trap** in a stage that precedes subagent delegation → Parent reads everything anyway. Fix: explicit "don't read these now".
- **Scoring formulas for subjective judgment** → Rigidity that doesn't improve quality. Fix: state the outcome, let the model assess.
- **Boolean toggles in customize.toml** → Author didn't decide what the skill does; surface becomes a permutation forest. Fix: pick a default; users fork if they want the other shape.
- **Hardcoded path in SKILL.md while customize.toml declares the scalar** → Override silently does nothing. Fix: SKILL.md must read `{workflow.<name>}`.
- **Identity / communication-style / principles in `[workflow]`** → Workflow wants to be an agent. Fix: point author at agent-builder; remove from workflow surface.

View File

@@ -0,0 +1,196 @@
# Standard Workflow/Skill Fields
## Frontmatter Fields
Only these fields go in the YAML frontmatter block:
| Field | Description | Example |
| ------------- | ---------------------------------------------------- | --------------------------------------------- |
| `name` | Full skill name (kebab-case, same as folder name) | `validate-json`, `cis-brainstorm` |
| `description` | [5-8 word summary]. [Use when user says 'X' or 'Y'.] | See Description Format below |
## Content Fields (All Types)
These are used within the SKILL.md body — never in frontmatter:
| Field | Description | Example |
| --------------- | ----------------------------- | --------------------------------- |
| `role-guidance` | Brief expertise primer | "Act as a senior DevOps engineer" |
| `module-code` | Module code (if module-based) | `bmb`, `cis` |
## Simple Utility Fields
| Field | Description | Example |
| --------------- | ----------------------------------- | ------------------------------------------- |
| `input-format` | What it accepts | JSON file path, stdin text |
| `output-format` | What it returns | Validated JSON, error report |
| `standalone` | Fully standalone, no config needed? | true/false |
| `composability` | How other skills use it | "Called by quality scanners for validation" |
## Simple Workflow Fields
| Field | Description | Example |
| ------------ | --------------------- | ----------------------------------------- |
| `steps` | Numbered inline steps | "1. Load config 2. Read input 3. Process" |
| `tools-used` | CLIs/tools/scripts | gh, jq, python scripts |
| `output` | What it produces | PR, report, file |
## Complex Workflow Fields
| Field | Description | Example |
| ------------------------ | --------------------------------- | ------------------------------------- |
| `stages` | Named stages (descriptive names, no numeric prefixes) | "discover, plan, build" |
| `progression-conditions` | When stages complete | "User approves outline" |
| `headless-mode` | Supports autonomous? | true/false |
| `config-variables` | Beyond core vars | `planning_artifacts`, `output_folder` |
| `output-artifacts` | What it creates (output-location) | "PRD document", "agent skill" |
## Customization Surface (`customize.toml`, opt-in)
Emitted only when the skill author opts in during Phase 3.5 (Configurability Discovery). The file sits next to SKILL.md and is loaded via `{project-root}/_bmad/scripts/resolve_customization.py` at activation.
### Always-present fields (when opted in)
| Field | Type | Purpose |
| -------------------------- | ------------- | -------------------------------------------------------------------------- |
| `activation_steps_prepend` | array[string] | Steps run before standard activation. Overrides append. |
| `activation_steps_append` | array[string] | Steps run after greet, before the workflow's first stage. Overrides append. |
| `persistent_facts` | array[string] | Facts (literal or `file:` prefixed paths/globs) loaded on activation. Overrides append. |
### Workflow-specific scalars (lifted during Phase 3.5)
Named by purpose and suffix. Override wins (scalar merge rule).
| Naming pattern | Use for | Example |
| ------------------- | ---------------------------------------------------- | --------------------------------------------------- |
| `<purpose>_template` | File path for templates the workflow loads | `brief_template = "assets/brief-template.md"` |
| `<purpose>_output_path` | Writable destination paths | `brief_output_path = "{project-root}/docs/briefs"` |
| `on_<event>` | Prompt or command executed at a hook point | `on_complete = ""` |
**Path resolution within scalar values:**
- Bare paths (e.g. `assets/brief-template.md`) resolve from the skill root.
- `{project-root}/...` resolves from the project working directory — use for org-owned overrides.
- Never mix `{project-root}` with config variables that already contain it (no double-prefix).
### How SKILL.md references the resolved values
After the resolver step runs, read customized values as `{workflow.<name>}`:
```markdown
Load the brief template from `{workflow.brief_template}`.
```
At runtime, that resolves to whatever the merged `[workflow].brief_template` scalar is — the default, a team override, or a personal override.
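A hedged sketch of a check for this contract: every string scalar declared under `[workflow]` should appear in SKILL.md as `{workflow.<name>}`, and its default value should not be hardcoded in the body. The function name `find_unreferenced_scalars` is hypothetical, and the inputs stand in for the SKILL.md text and the parsed `[workflow]` table.

```python
from __future__ import annotations


def find_unreferenced_scalars(skill_md: str, workflow_defaults: dict[str, object]) -> list[str]:
    """Report declared string scalars that SKILL.md ignores or hardcodes."""
    problems: list[str] = []
    for name, default in workflow_defaults.items():
        if not isinstance(default, str):
            continue  # arrays and tables are not scalars
        if f"{{workflow.{name}}}" not in skill_md:
            problems.append(f"{name}: never referenced as {{workflow.{name}}}")
        if default and default in skill_md:
            problems.append(f"{name}: default value {default!r} is hardcoded")
    return problems
```

An empty list means an override of any declared scalar can actually take effect.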
### Override files
Teams and users override values without editing the skill's `customize.toml`, by creating these files instead:
- Team: `{project-root}/_bmad/custom/{skill-name}.toml`
- Personal: `{project-root}/_bmad/custom/{skill-name}.user.toml`
Both use the same `[workflow]` block shape. Merge order: base (skill's `customize.toml`) → team → user.
## Overview Section Format
The Overview is the first section after the title — it primes the AI for everything that follows.
**3-part formula:**
1. **What** — What this workflow/skill does
2. **How** — How it works (approach, key stages)
3. **Why/Outcome** — Value delivered, quality standard
**Templates by skill type:**
**Complex Workflow:**
```markdown
This skill helps you {outcome} through {approach}. Act as {role-guidance}, guiding users through {key stages}. Your output is {deliverable}.
```
**Simple Workflow:**
```markdown
This skill {what it does} by {approach}. Act as {role-guidance}. Use when {trigger conditions}. Produces {output}.
```
**Simple Utility:**
```markdown
This skill {what it does}. Use when {when to use}. Returns {output format} with {key feature}.
```
## SKILL.md Description Format
The frontmatter `description` is the PRIMARY trigger mechanism — it determines when the AI invokes this skill. Most BMad skills are **explicitly invoked** by name (`/skill-name` or direct request), so descriptions should be conservative to prevent accidental triggering.
**Format:** Two parts, one sentence each:
```
[What it does in 5-8 words]. [Use when user says 'specific phrase' or 'specific phrase'.]
```
**The trigger clause** uses one of these patterns depending on the skill's activation style:
- **Explicit invocation (default):** `Use when the user requests to 'create a PRD' or 'edit an existing PRD'.` — Quotes around specific phrases the user would actually say. Conservative — won't fire on casual mentions.
- **Organic/reactive:** `Trigger when code imports anthropic SDK, or user asks to use Claude API.` — For lightweight skills that should activate on contextual signals, not explicit requests.
**Examples:**
Good (explicit): `Builds workflows and skills through conversational discovery. Use when the user requests to 'build a workflow', 'modify a workflow', or 'quality check workflow'.`
Good (organic): `Initializes BMad project configuration. Trigger when any skill needs module-specific configuration values, or when setting up a new BMad project.`
Bad: `Helps with PRDs and product requirements.` — Too vague, would trigger on any mention of PRD even in passing conversation.
Bad: `Use on any mention of workflows, building, or creating things.` — Over-broad, would hijack unrelated conversations.
**Default to explicit invocation** unless the user specifically describes organic/reactive activation during discovery.
## Role Guidance Format
Every generated workflow SKILL.md includes a brief role statement in the Overview or as a standalone line:
```markdown
Act as {role-guidance}. {brief expertise/approach description}.
```
This provides quick prompt priming for expertise and tone. Workflows may also use full Identity/Communication Style/Principles sections when personality serves the workflow's purpose.
## Path Rules
### Skill-Internal References
Use bare paths from the skill root for any file inside this skill — including same-folder references between two files in `references/` or two files in `scripts/`:
- `references/build-process.md`
- `references/standard-fields.md` (referenced from another file in `references/` — still bare path)
- `scripts/validate.py`
- `assets/template.md`
The convention is universal: bare paths from skill root. Never use `./` prefixes — they cause inconsistency and break under context compaction when the working directory shifts.
### Project-Scope Paths
Use `{project-root}/...` for any path relative to the project root:
- `{project-root}/_bmad/planning/prd.md`
- `{project-root}/docs/report.md`
### Config Variables
Use directly — they already contain `{project-root}` in their resolved values:
- `{output_folder}/file.md`
- `{planning_artifacts}/prd.md`
### Anti-patterns (negative examples — fenced so the linter doesn't fire on them)
```text
{project-root}/{output_folder}/file.md # WRONG — double-prefix; config var already has {project-root}
_bmad/planning/prd.md # WRONG — bare _bmad must have {project-root} prefix
./references/foo.md # WRONG — never use ./ for skill-internal paths
./scripts/foo.py # WRONG — same; bare paths from skill root only
```
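The anti-patterns above can be detected with a few regular expressions. This is a rough sketch under stated assumptions, not the project's own path linter: the check names are hypothetical, config variables are assumed to be lowercase with underscores, and the function looks at one line at a time without skipping fenced blocks.

```python
import re

CHECKS = [
    # {project-root}/ followed directly by another {config_variable}
    ("double-prefix", re.compile(r"\{project-root\}/\{[a-z_]+\}")),
    # _bmad/ that is not preceded by a path segment or a closing brace
    ("bare-_bmad", re.compile(r"(?<![\w}/])_bmad/")),
    # ./ in front of a skill-internal folder (but not ../)
    ("dot-slash", re.compile(r"(?<![\w.])\./(?:references|scripts|assets)/")),
]


def find_path_problems(line: str) -> list[str]:
    """Return the names of every anti-pattern found in one line of text."""
    return [name for name, pattern in CHECKS if pattern.search(line)]
```

Because `{skill-name}` contains a hyphen, the documented override path `{project-root}/_bmad/custom/{skill-name}.toml` is not reported as a double-prefix.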

View File

@@ -0,0 +1,47 @@
# Template Substitution Rules
The SKILL-template provides a minimal skeleton: frontmatter, overview, and activation with config loading. Everything beyond that is crafted by the builder based on what was learned during discovery and requirements phases.
## Frontmatter
- `{module-code-or-empty}` → Module code prefix with hyphen (e.g., `bmb-`) or empty for standalone. The `bmad-` prefix is reserved for official BMad creations; user skills should not include it.
- `{skill-name}` → Skill functional name (kebab-case)
- `{skill-description}` → Two parts: [5-8 word summary]. [trigger phrases]
## Module Conditionals
### For Module-Based Skills
- `{if-module}` ... `{/if-module}` → Keep the content inside
- `{if-standalone}` ... `{/if-standalone}` → Remove the entire block including markers
- `{module-code}` → Module code without trailing hyphen (e.g., `bmb`)
- `{module-setup-skill}` → Name of the module's setup skill (e.g., `mymod-setup`)
### For Standalone Skills
- `{if-module}` ... `{/if-module}` → Remove the entire block including markers
- `{if-standalone}` ... `{/if-standalone}` → Keep the content inside
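The keep/remove rules for the module conditionals can be sketched as below. The helper names are hypothetical and this is not the builder's own implementation; it only shows the substitution the rules describe, assuming blocks are not nested.

```python
from __future__ import annotations

import re


def apply_conditional(template: str, tag: str, keep: bool) -> str:
    """Keep the content of {tag}...{/tag} blocks, or drop the blocks with their markers."""
    pattern = re.compile(
        r"\{" + re.escape(tag) + r"\}(.*?)\{/" + re.escape(tag) + r"\}",
        re.DOTALL,
    )
    return pattern.sub(lambda m: m.group(1) if keep else "", template)


def render_module_conditionals(template: str, module_code: str | None) -> str:
    """Resolve if-module / if-standalone blocks; module_code=None means standalone."""
    is_module = module_code is not None
    text = apply_conditional(template, "if-module", keep=is_module)
    text = apply_conditional(text, "if-standalone", keep=not is_module)
    if is_module:
        text = text.replace("{module-code}", module_code)
    return text
```

The same `apply_conditional` helper would cover `{if-customizable}` blocks, since they follow the identical marker shape.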
## Customization Conditionals
### When Customization Is Opted In
- `{if-customizable}` ... `{/if-customizable}` → Keep the content inside; emit `customize.toml` alongside SKILL.md.
- Lifted configurable scalars are referenced in SKILL.md body as `{workflow.<name>}` (e.g. `{workflow.brief_template}`). These are resolved at runtime by the resolver, not at build time — emit them verbatim.
### When Customization Is Not Opted In
- `{if-customizable}` ... `{/if-customizable}` → Remove the entire block including markers.
- Do NOT emit `customize.toml`. Use hardcoded paths and values in SKILL.md throughout.
## Beyond the Template
The builder determines the rest of the skill structure — body sections, phases, stages, scripts, external skills, headless mode, role guidance — based on the skill type classification and requirements gathered during the build process. The template intentionally does not prescribe these; the builder has the context to craft them.
## Path References
All generated skills use bare paths from the skill root, including for same-folder references (never `./`):
- `references/{reference}.md` — Reference documents loaded on demand
- `references/{stage}.md` — Stage prompts (complex workflows)
- `scripts/` — Python/shell scripts for deterministic operations

View File

@@ -0,0 +1,287 @@
#!/usr/bin/env python3
"""Deterministic extraction of report-data.json from analysis outputs.
Reads scanner outputs (markdown + JSON) and extracts structured data without
LLM synthesis. Ensures no data loss and completes in <10 seconds.
Usage:
python3 extract-report-json.py {skill-path} {quality-report-dir} -o {output-file}
"""
from __future__ import annotations
import argparse
import json
import re
import sys
from datetime import datetime, timezone
from pathlib import Path
def extract_section(content: str, section_name: str, level: int = 2) -> str | None:
    """Extract a section from markdown by heading name."""
    pattern = r'^#{' + str(level) + r'}\s+' + re.escape(section_name) + r'\s*\n(.*?)(?=^#{1,' + str(level) + r'}\s|\Z)'
    match = re.search(pattern, content, re.MULTILINE | re.DOTALL)
    return match.group(1).strip() if match else None
def extract_journeys(content: str) -> list[dict]:
    """Extract user journey archetypes from enhancement-analysis.md."""
    journeys = []
    # Match ### N. {Name}: {Description}
    pattern = r'^###\s+\d+\.\s+([^:]+):\s+(.+?)(?=^###|\Z)'
    for match in re.finditer(pattern, content, re.MULTILINE | re.DOTALL):
        name = match.group(1).strip()
        section = match.group(2)
        # Extract narrative (after "Narrative." or "Narrative\n")
        narrative_match = re.search(r'(?:Narrative[:.]\s*)?([^\n]+(?:\n[^*\n][^\n]*)*?)(?=\n\*\*|\n[A-Z])', section)
        summary = narrative_match.group(1).strip() if narrative_match else ""
        # Extract friction points
        friction_points = []
        friction_section = re.search(r'\*\*Friction points?[:\*]*\*\*\s*\n(.*?)(?=\n\*\*|\n[A-Z]|$)', section, re.DOTALL)
        if friction_section:
            for line in friction_section.group(1).split('\n'):
                line = line.strip()
                if line.startswith('- '):
                    friction_points.append(line[2:].strip())
        # Extract bright spots
        bright_spots = []
        bright_section = re.search(r'\*\*Bright spots?[:\*]*\*\*\s*\n(.*?)(?=\n\*\*|\n[A-Z]|$)', section, re.DOTALL)
        if bright_section:
            for line in bright_section.group(1).split('\n'):
                line = line.strip()
                if line.startswith('- '):
                    bright_spots.append(line[2:].strip())
        journeys.append({
            'archetype': name,
            'summary': summary,
            'friction_points': friction_points,
            'bright_spots': bright_spots
        })
    return journeys

def extract_autonomous(content: str) -> dict:
    """Extract headless/automation assessment from enhancement-analysis.md."""
    assessment_section = extract_section(content, 'Headless Assessment', level=2)
    if not assessment_section:
        return {}

    # Look for "Current Level:" or "Potential:" pattern
    potential_match = re.search(r'(?:Current Level|Potential)[:\*]*\s*([^\n.]+)', assessment_section)
    potential = potential_match.group(1).strip() if potential_match else "unknown"

    # Get the rest as notes
    notes = assessment_section
    if potential_match:
        notes = assessment_section[potential_match.end():].strip()

    return {
        'potential': potential,
        'notes': notes[:200] if notes else ""  # Truncate to 200 chars
    }


def extract_findings_from_md(content: str, source_scanner: str) -> list[dict]:
    """Extract individual findings from analysis markdown.

    Handles multiple formats:
    - Architecture: level 4 headings under severity sections (### HIGH, etc)
    - Determinism: bold headings with severity markers [HIGH], [LOW]
    - Customization: bold headings with opportunity markers (HIGH-OPPORTUNITY, etc)
    - Enhancement: numbered findings with severity/opportunity markers
    """
    findings = []

    if source_scanner == 'architecture':
        # Architecture format: ### SEVERITY followed by #### N. Title
        severity_pattern = r'^###\s+(CRITICAL|HIGH|MEDIUM|LOW)\s*$'
        # re.split with one capture group yields [preamble, severity, body, severity, body, ...]
        severity_sections = re.split(severity_pattern, content, flags=re.MULTILINE)
        for i in range(1, len(severity_sections), 2):
            severity = severity_sections[i].lower() if i < len(severity_sections) else "medium"
            section_content = severity_sections[i + 1] if i + 1 < len(severity_sections) else ""
            if not section_content.strip() or section_content.strip() == "None":
                continue
            # Extract level 4 findings (#### N. Title)
            finding_pattern = r'^####\s+(\d+\.\s+)?(.+?)$'
            for match in re.finditer(finding_pattern, section_content, re.MULTILINE):
                finding_title = match.group(2).strip()
                if finding_title:
                    findings.append({
                        'title': finding_title,
                        'severity': severity,
                        'source': source_scanner
                    })

    elif source_scanner == 'determinism':
        # Determinism format: ### **[SEVERITY] Title**
        pattern = r'###\s+\*\*\[([A-Z]+)\]\s+([^*]+)\*\*'
        for match in re.finditer(pattern, content, re.MULTILINE):
            severity = match.group(1).lower()
            title = match.group(2).strip()
            if title:
                findings.append({
                    'title': title,
                    'severity': severity,
                    'source': source_scanner
                })

    elif source_scanner == 'customization':
        # Customization format: ### N. **Title** (OPPORTUNITY-TYPE)
        pattern = r'###\s+\d+\.\s+\*\*([^*]+)\*\*\s+\(([A-Z-]+)\)'
        for match in re.finditer(pattern, content, re.MULTILINE):
            title = match.group(1).strip()
            opportunity = match.group(2).lower()
            # Map opportunity to severity
            severity = 'high' if 'high' in opportunity else 'medium' if 'medium' in opportunity else 'low'
            if title:
                findings.append({
                    'title': title,
                    'severity': severity,
                    'source': source_scanner
                })

    elif source_scanner == 'enhancement':
        # Enhancement format: ### LEVEL Findings section followed by #### N. Title
        # Extract opportunity sections (HIGH-OPPORTUNITY, SECONDARY-OPPORTUNITY, etc)
        opportunity_pattern = r'^###\s+([A-Z-]+)\s+(?:Findings|Opportunities?)'
        opportunity_sections = re.split(opportunity_pattern, content, flags=re.MULTILINE)
        for i in range(1, len(opportunity_sections), 2):
            opportunity = opportunity_sections[i].lower() if i < len(opportunity_sections) else "medium"
            section_content = opportunity_sections[i + 1] if i + 1 < len(opportunity_sections) else ""
            if not section_content.strip():
                continue
            # Map opportunity to severity
            severity = 'high' if 'high' in opportunity else 'medium' if 'secondary' in opportunity else 'low'
            # Extract level 4 findings (#### N. Title)
            finding_pattern = r'^####\s+(\d+\.\s+)?(.+?)$'
            for match in re.finditer(finding_pattern, section_content, re.MULTILINE):
                finding_title = match.group(2).strip()
                if finding_title:
                    findings.append({
                        'title': finding_title,
                        'severity': severity,
                        'source': source_scanner
                    })

    return findings

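
def merge_prepass_data(report_dir: Path) -> dict:
    """Load and merge all prepass JSON data."""
    merged = {}
    # Sort so later files override earlier ones in a stable, reproducible order.
    for json_file in sorted(report_dir.glob('*-prepass.json')):
        try:
            data = json.loads(json_file.read_text(encoding='utf-8'))
            merged.update(data)
        except (OSError, ValueError, TypeError):
            pass  # Skip files that are unreadable, not valid JSON, or not a JSON object
    return merged
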

def build_report_json(skill_path: str, quality_report_dir: str) -> dict:
    """Extract and build complete report-data.json."""
    report_dir = Path(quality_report_dir)
    skill_name = Path(skill_path).name
    timestamp = datetime.now(timezone.utc).isoformat()

    def read_or_empty(filename: str) -> str:
        """Return the file's text, or "" when that scanner output is absent."""
        path = report_dir / filename
        return path.read_text(encoding='utf-8') if path.exists() else ""

    # Read all analysis files
    architecture_content = read_or_empty('architecture-analysis.md')
    determinism_content = read_or_empty('determinism-analysis.md')
    customization_content = read_or_empty('customization-analysis.md')
    enhancement_content = read_or_empty('enhancement-analysis.md')

    # Extract assessments
    arch_assessment = extract_section(architecture_content, 'Assessment', level=2) or ""
    det_assessment = extract_section(determinism_content, 'Assessment', level=2) or ""
    cust_assessment = extract_section(customization_content, 'Overall Assessment', level=2) or ""
    enh_assessment = extract_section(enhancement_content, 'Summary', level=2) or ""

    # Extract journeys and autonomous from enhancement
    journeys = extract_journeys(enhancement_content)
    autonomous = extract_autonomous(enhancement_content)

    # Build detailed_analysis
    detailed_analysis = {
        'architecture': {
            'assessment': arch_assessment[:500],  # First 500 chars
            'findings': extract_findings_from_md(architecture_content, 'architecture')
        },
        'determinism': {
            'assessment': det_assessment[:500],
            'findings': extract_findings_from_md(determinism_content, 'determinism')
        },
        'customization': {
            'assessment': cust_assessment[:500],
            'posture': 'not-opted-in',  # Hardcoded default; not yet derived from the analysis content
            'findings': extract_findings_from_md(customization_content, 'customization')
        },
        'enhancement': {
            'assessment': enh_assessment[:500],
            'journeys': journeys,
            'autonomous': autonomous,
            'findings': extract_findings_from_md(enhancement_content, 'enhancement')
        }
    }

    # Build basic structure - minimal for now, will be expanded by report creator if needed
    report_data = {
        'meta': {
            'skill_name': skill_name,
            'skill_path': skill_path,
            'timestamp': timestamp,
            'scanner_count': 4
        },
        'narrative': enh_assessment[:150] if enh_assessment else "",  # Placeholder
        'grade': 'Good',  # Placeholder - report creator sets this
        'broken': [],
        'opportunities': [],
        'strengths': [],
        'recommendations': [],
        'detailed_analysis': detailed_analysis
    }
    return report_data


def main():
    parser = argparse.ArgumentParser(description='Extract report-data.json from analysis outputs')
    parser.add_argument('skill_path', help='Path to the skill being analyzed')
    parser.add_argument('quality_report_dir', help='Directory with analysis outputs and where to write report')
    parser.add_argument('-o', '--output', help='Output file path (default: {quality_report_dir}/report-data.json)')
    args = parser.parse_args()

    output_path = args.output or str(Path(args.quality_report_dir) / 'report-data.json')
    try:
        report_json = build_report_json(args.skill_path, args.quality_report_dir)
        # Write output
        output_file = Path(output_path)
        output_file.write_text(json.dumps(report_json, indent=2, ensure_ascii=False), encoding='utf-8')
        print(f'Report JSON written to {output_path}', file=sys.stderr)
        print(json.dumps({'status': 'success', 'output': output_path}, indent=2))
    except Exception as e:
        print(f'Error: {e}', file=sys.stderr)
        sys.exit(1)


if __name__ == '__main__':
    main()


@@ -0,0 +1,588 @@
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.9"
# ///
"""
Generate an interactive HTML quality analysis report from report-data.json.

Reads the structured report data produced by the report creator and renders
a self-contained HTML report with:
  - Grade + narrative at top
  - Broken items with fix prompts
  - Opportunity themes with "Fix This Theme" prompt generation
  - Expandable strengths
  - Expandable detailed analysis per dimension
  - Link to full markdown report

Usage:
    python3 generate-html-report.py {quality-report-dir} [--open]
"""
from __future__ import annotations

import argparse
import json
import platform
import subprocess
import sys
from pathlib import Path


def load_report_data(report_dir: Path) -> dict:
    """Load report-data.json from the report directory."""
    data_file = report_dir / 'report-data.json'
    if not data_file.exists():
        print(f'Error: {data_file} not found', file=sys.stderr)
        sys.exit(2)
    return json.loads(data_file.read_text(encoding='utf-8'))


def build_fix_prompt(skill_path: str, theme: dict) -> str:
    """Build a coherent fix prompt for an entire opportunity theme."""
    prompt = f"## Task: {theme['name']}\n"
    prompt += f"Skill path: {skill_path}\n\n"
    prompt += f"### Problem\n{theme['description']}\n\n"
    prompt += f"### Fix\n{theme['action']}\n\n"
    if theme.get('findings'):
        prompt += "### Specific observations to address:\n\n"
        for i, f in enumerate(theme['findings'], 1):
            # Use "file:line" when both are known, otherwise fall back to the file alone.
            loc = f"{f['file']}:{f['line']}" if f.get('file') and f.get('line') else f.get('file', '')
            prompt += f"{i}. **{f['title']}**"
            if loc:
                prompt += f" ({loc})"
            if f.get('detail'):
                prompt += f"\n   {f['detail']}"
            prompt += "\n"
    return prompt.strip()


def build_broken_prompt(skill_path: str, items: list) -> str:
    """Build a fix prompt for all broken items."""
    prompt = f"## Task: Fix Critical Issues\nSkill path: {skill_path}\n\n"
    for i, item in enumerate(items, 1):
        loc = f"{item['file']}:{item['line']}" if item.get('file') and item.get('line') else item.get('file', '')
        prompt += f"{i}. **[{item.get('severity','high').upper()}] {item['title']}**\n"
        if loc:
            prompt += f"   File: {loc}\n"
        if item.get('detail'):
            prompt += f"   Context: {item['detail']}\n"
        if item.get('action'):
            prompt += f"   Fix: {item['action']}\n"
        prompt += "\n"
    return prompt.strip()


HTML_TEMPLATE = r"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>BMad Method · Quality Analysis: SKILL_NAME</title>
<style>
:root {
--bg: #0d1117; --surface: #161b22; --surface2: #21262d; --border: #30363d;
--text: #e6edf3; --text-muted: #8b949e; --text-dim: #6e7681;
--critical: #f85149; --high: #f0883e; --medium: #d29922; --low: #58a6ff;
--strength: #3fb950; --suggestion: #a371f7;
--accent: #58a6ff; --accent-hover: #79c0ff;
--font: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif;
--mono: ui-monospace, SFMono-Regular, "SF Mono", Menlo, Consolas, monospace;
}
@media (prefers-color-scheme: light) {
:root {
--bg: #ffffff; --surface: #f6f8fa; --surface2: #eaeef2; --border: #d0d7de;
--text: #1f2328; --text-muted: #656d76; --text-dim: #8c959f;
--critical: #cf222e; --high: #bc4c00; --medium: #9a6700; --low: #0969da;
--strength: #1a7f37; --suggestion: #8250df;
--accent: #0969da; --accent-hover: #0550ae;
}
}
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: var(--font); background: var(--bg); color: var(--text); line-height: 1.5; padding: 2rem; max-width: 900px; margin: 0 auto; }
h1 { font-size: 1.5rem; margin-bottom: 0.25rem; }
.subtitle { color: var(--text-muted); font-size: 0.85rem; margin-bottom: 1.5rem; }
.subtitle a { color: var(--accent); text-decoration: none; }
.subtitle a:hover { text-decoration: underline; }
.grade { font-size: 2.5rem; font-weight: 700; margin: 0.5rem 0; }
.grade-Excellent { color: var(--strength); }
.grade-Good { color: var(--low); }
.grade-Fair { color: var(--medium); }
.grade-Poor { color: var(--critical); }
.narrative { color: var(--text-muted); font-size: 0.95rem; margin-bottom: 1.5rem; line-height: 1.6; }
.badge { display: inline-flex; align-items: center; padding: 0.15rem 0.5rem; border-radius: 2rem; font-size: 0.75rem; font-weight: 600; }
.badge-critical { background: color-mix(in srgb, var(--critical) 20%, transparent); color: var(--critical); }
.badge-high { background: color-mix(in srgb, var(--high) 20%, transparent); color: var(--high); }
.badge-medium { background: color-mix(in srgb, var(--medium) 20%, transparent); color: var(--medium); }
.badge-low { background: color-mix(in srgb, var(--low) 20%, transparent); color: var(--low); }
.badge-strength { background: color-mix(in srgb, var(--strength) 20%, transparent); color: var(--strength); }
.section { border: 1px solid var(--border); border-radius: 0.5rem; margin: 0.75rem 0; overflow: hidden; }
.section-header { display: flex; align-items: center; gap: 0.75rem; padding: 0.75rem 1rem; background: var(--surface); cursor: pointer; user-select: none; }
.section-header:hover { background: var(--surface2); }
.section-header .arrow { font-size: 0.7rem; transition: transform 0.15s; color: var(--text-muted); width: 1rem; }
.section-header.open .arrow { transform: rotate(90deg); }
.section-header .label { font-weight: 600; flex: 1; }
.section-header .count { font-size: 0.8rem; color: var(--text-muted); }
.section-header .actions { display: flex; gap: 0.5rem; }
.section-body { display: none; }
.section-body.open { display: block; }
.item { padding: 0.75rem 1rem; border-top: 1px solid var(--border); }
.item:hover { background: var(--surface); }
.item-title { font-weight: 600; font-size: 0.9rem; }
.item-file { font-family: var(--mono); font-size: 0.75rem; color: var(--text-muted); }
.item-desc { font-size: 0.85rem; color: var(--text-muted); margin-top: 0.25rem; }
.item-action { font-size: 0.85rem; margin-top: 0.25rem; }
.item-action strong { color: var(--strength); }
.opp { padding: 1rem; border-top: 1px solid var(--border); }
.opp-header { display: flex; align-items: center; gap: 0.75rem; }
.opp-name { font-weight: 600; font-size: 1rem; flex: 1; }
.opp-count { font-size: 0.8rem; color: var(--text-muted); }
.opp-desc { font-size: 0.9rem; color: var(--text-muted); margin: 0.5rem 0; }
.opp-impact { font-size: 0.85rem; color: var(--text-dim); font-style: italic; }
.opp-findings { margin-top: 0.75rem; padding-left: 1rem; border-left: 2px solid var(--border); display: none; }
.opp-findings.open { display: block; }
.opp-finding { font-size: 0.85rem; padding: 0.25rem 0; color: var(--text-muted); }
.opp-finding .source { font-size: 0.75rem; color: var(--text-dim); }
.btn { background: none; border: 1px solid var(--border); border-radius: 0.25rem; padding: 0.3rem 0.7rem; cursor: pointer; color: var(--text-muted); font-size: 0.8rem; transition: all 0.15s; }
.btn:hover { border-color: var(--accent); color: var(--accent); }
.btn-primary { background: var(--accent); color: #fff; border-color: var(--accent); font-weight: 600; }
.btn-primary:hover { background: var(--accent-hover); }
.btn.copied { border-color: var(--strength); color: var(--strength); }
.strength-item { padding: 0.5rem 1rem; border-top: 1px solid var(--border); }
.strength-item .title { font-weight: 600; font-size: 0.9rem; color: var(--strength); }
.strength-item .detail { font-size: 0.85rem; color: var(--text-muted); }
.analysis-section { padding: 0.75rem 1rem; border-top: 1px solid var(--border); }
.analysis-section h4 { font-size: 0.9rem; margin-bottom: 0.25rem; }
.analysis-section p { font-size: 0.85rem; color: var(--text-muted); }
.analysis-finding { font-size: 0.85rem; padding: 0.25rem 0 0.25rem 1rem; border-left: 2px solid var(--border); margin: 0.25rem 0; color: var(--text-muted); }
.modal-overlay { display: none; position: fixed; inset: 0; background: rgba(0,0,0,0.6); z-index: 200; align-items: center; justify-content: center; }
.modal-overlay.visible { display: flex; }
.modal { background: var(--surface); border: 1px solid var(--border); border-radius: 0.5rem; padding: 1.5rem; width: 90%; max-width: 700px; max-height: 80vh; overflow-y: auto; }
.modal h3 { margin-bottom: 0.75rem; }
.modal pre { background: var(--bg); border: 1px solid var(--border); border-radius: 0.375rem; padding: 1rem; font-family: var(--mono); font-size: 0.8rem; white-space: pre-wrap; word-wrap: break-word; max-height: 50vh; overflow-y: auto; }
.modal-actions { display: flex; gap: 0.75rem; margin-top: 1rem; justify-content: flex-end; }
.recs { padding: 0.75rem 1rem; border-top: 1px solid var(--border); }
.rec { padding: 0.3rem 0; font-size: 0.9rem; }
.rec-rank { font-weight: 700; color: var(--accent); margin-right: 0.5rem; }
.rec-resolves { font-size: 0.8rem; color: var(--text-dim); }
</style>
</head>
<body>
<div style="color:#a371f7;font-size:0.8rem;font-weight:600;letter-spacing:0.05em;text-transform:uppercase;margin-bottom:0.25rem">BMad Method</div>
<h1>Quality Analysis: <span id="skill-name"></span></h1>
<div class="subtitle" id="subtitle"></div>
<div id="grade-area"></div>
<div class="narrative" id="narrative"></div>
<div id="broken-section"></div>
<div id="opportunities-section"></div>
<div id="strengths-section"></div>
<div id="user-experience-section"></div>
<div id="recommendations-section"></div>
<div id="detailed-section"></div>
<div class="modal-overlay" id="modal" onclick="if(event.target===this)closeModal()">
<div class="modal">
<h3 id="modal-title">Generated Prompt</h3>
<pre id="modal-content"></pre>
<div class="modal-actions">
<button class="btn" onclick="closeModal()">Close</button>
<button class="btn btn-primary" onclick="copyModal()">Copy to Clipboard</button>
</div>
</div>
</div>
<script>
const RAW = JSON.parse(document.getElementById('report-data').textContent);
const DATA = normalize(RAW);
function normalize(d) {
// Fix meta field variants; default to an empty object so init() can read DATA.meta safely
d.meta = d.meta || {};
if (d.meta) {
d.meta.skill_name = d.meta.skill_name || d.meta.skill || d.meta.name || 'Unknown';
d.meta.scanner_count = typeof d.meta.scanner_count === 'number' ? d.meta.scanner_count
: Array.isArray(d.meta.scanners_run) ? d.meta.scanners_run.length
: d.meta.scanner_count || 0;
}
// Fix strengths: plain strings → objects
d.strengths = (d.strengths || []).map(s =>
typeof s === 'string' ? { title: s, detail: '' } : { title: s.title || '', detail: s.detail || '' }
);
// Fix opportunities: title→name, findings_resolved→findings
(d.opportunities || []).forEach(o => {
o.name = o.name || o.title || '';
o.finding_count = o.finding_count || (o.findings || o.findings_resolved || []).length;
if (!o.findings && o.findings_resolved) o.findings = [];
o.action = o.action || o.fix || '';
});
// Fix broken: description→detail, fix→action
(d.broken || []).forEach(b => {
b.detail = b.detail || b.description || '';
b.action = b.action || b.fix || '';
});
// Fix recommendations: description→action
(d.recommendations || []).forEach((r, i) => {
r.action = r.action || r.description || '';
r.rank = r.rank || i + 1;
});
// Fix journeys: persona→archetype, friction→friction_points
// Accept both `enhancement` (new) and `experience` (legacy) section keys
const expSection = d.detailed_analysis && (d.detailed_analysis.enhancement || d.detailed_analysis.experience);
if (expSection) {
expSection.journeys = (expSection.journeys || []).map(j => ({
archetype: j.archetype || j.persona || j.name || 'Unknown',
summary: j.summary || j.journey_summary || j.description || j.friction || '',
friction_points: j.friction_points || (j.friction ? [j.friction] : []),
bright_spots: j.bright_spots || (j.bright ? [j.bright] : [])
}));
}
return d;
}
function esc(s) {
if (!s) return '';
const d = document.createElement('div');
d.textContent = String(s);
return d.innerHTML;
}
function init() {
const m = DATA.meta;
document.getElementById('skill-name').textContent = m.skill_name;
document.getElementById('subtitle').innerHTML =
`${esc(m.skill_path)} &bull; ${m.timestamp ? esc(String(m.timestamp).split('T')[0]) : ''} &bull; ${m.scanner_count || 0} scanners &bull; <a href="quality-report.md">Full Report &nearr;</a>`;
document.getElementById('grade-area').innerHTML =
`<div class="grade grade-${DATA.grade}">${esc(DATA.grade)}</div>`;
document.getElementById('narrative').textContent = DATA.narrative || '';
renderBroken();
renderOpportunities();
renderStrengths();
renderUserExperience();
renderRecommendations();
renderDetailed();
}
function renderBroken() {
const items = DATA.broken || [];
if (!items.length) return;
let html = `<div class="section"><div class="section-header open" onclick="toggleSection(this)">`;
html += `<span class="arrow">&#9654;</span><span class="label">Broken / Critical (${items.length})</span>`;
html += `<div class="actions"><button class="btn btn-primary" onclick="event.stopPropagation();showBrokenPrompt()">Fix These</button></div>`;
html += `</div><div class="section-body open">`;
items.forEach(item => {
const loc = item.file ? `${item.file}${item.line ? ':'+item.line : ''}` : '';
html += `<div class="item">`;
html += `<span class="badge badge-${item.severity || 'high'}">${esc(item.severity || 'high')}</span> `;
if (loc) html += `<span class="item-file">${esc(loc)}</span>`;
html += `<div class="item-title">${esc(item.title)}</div>`;
if (item.detail) html += `<div class="item-desc">${esc(item.detail)}</div>`;
if (item.action) html += `<div class="item-action"><strong>Fix:</strong> ${esc(item.action)}</div>`;
html += `</div>`;
});
html += `</div></div>`;
document.getElementById('broken-section').innerHTML = html;
}
function renderOpportunities() {
const opps = DATA.opportunities || [];
if (!opps.length) return;
let html = `<div class="section"><div class="section-header open" onclick="toggleSection(this)">`;
html += `<span class="arrow">&#9654;</span><span class="label">Opportunities (${opps.length})</span>`;
html += `</div><div class="section-body open">`;
opps.forEach((opp, idx) => {
html += `<div class="opp">`;
html += `<div class="opp-header">`;
html += `<span class="badge badge-${opp.severity || 'medium'}">${esc(opp.severity || 'medium')}</span>`;
html += `<span class="opp-name">${idx+1}. ${esc(opp.name)}</span>`;
html += `<span class="opp-count">${opp.finding_count || (opp.findings||[]).length} observations</span>`;
html += `<button class="btn" onclick="toggleFindings(${idx})">Details</button>`;
html += `<button class="btn btn-primary" onclick="showThemePrompt(${idx})">Fix This</button>`;
html += `</div>`;
html += `<div class="opp-desc">${esc(opp.description)}</div>`;
if (opp.impact) html += `<div class="opp-impact">Impact: ${esc(opp.impact)}</div>`;
html += `<div class="opp-findings" id="findings-${idx}">`;
(opp.findings || []).forEach(f => {
const loc = f.file ? `${f.file}${f.line ? ':'+f.line : ''}` : '';
html += `<div class="opp-finding">`;
html += `<strong>${esc(f.title)}</strong>`;
if (loc) html += ` <span class="item-file">${esc(loc)}</span>`;
if (f.source) html += ` <span class="source">[${esc(f.source)}]</span>`;
if (f.detail) html += `<br>${esc(f.detail)}`;
html += `</div>`;
});
html += `</div></div>`;
});
html += `</div></div>`;
document.getElementById('opportunities-section').innerHTML = html;
}
function renderStrengths() {
const items = DATA.strengths || [];
if (!items.length) return;
let html = `<div class="section"><div class="section-header" onclick="toggleSection(this)">`;
html += `<span class="arrow">&#9654;</span><span class="label">Strengths (${items.length})</span>`;
html += `</div><div class="section-body">`;
items.forEach(s => {
html += `<div class="strength-item"><div class="title">${esc(s.title)}</div>`;
if (s.detail) html += `<div class="detail">${esc(s.detail)}</div>`;
html += `</div>`;
});
html += `</div></div>`;
document.getElementById('strengths-section').innerHTML = html;
}
function renderRecommendations() {
const recs = DATA.recommendations || [];
if (!recs.length) return;
let html = `<div class="section"><div class="section-header open" onclick="toggleSection(this)">`;
html += `<span class="arrow">&#9654;</span><span class="label">Recommendations</span>`;
html += `</div><div class="section-body open"><div class="recs">`;
recs.forEach(r => {
html += `<div class="rec">`;
html += `<span class="rec-rank">#${esc(r.rank)}</span>`;
html += `${esc(r.action)}`;
if (r.resolves) html += ` <span class="rec-resolves">(resolves ${esc(r.resolves)} observations)</span>`;
html += `</div>`;
});
html += `</div></div></div>`;
document.getElementById('recommendations-section').innerHTML = html;
}
function renderUserExperience() {
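// Accept the legacy `experience` key too, matching what normalize() handles
const ux = DATA.detailed_analysis && (DATA.detailed_analysis.enhancement || DATA.detailed_analysis.experience);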
if (!ux) return;
let html = `<div class="section"><div class="section-header open" onclick="toggleSection(this)">`;
html += `<span class="arrow">&#9654;</span><span class="label">User Experience</span>`;
html += `</div><div class="section-body open">`;
if (ux.assessment) html += `<p>${esc(ux.assessment)}</p>`;
if (ux.journeys && ux.journeys.length) {
html += `<div style="margin:1rem 0"><strong>User Journeys:</strong></div>`;
ux.journeys.forEach(j => {
html += `<div style="margin:0.75rem 0;padding:0.75rem;border-left:3px solid var(--accent);background:var(--surface2);">`;
html += `<div style="font-weight:600;margin-bottom:0.5rem">${esc(j.archetype)}</div>`;
html += `<p style="margin:0 0 0.5rem 0;font-size:0.95rem">${esc(j.summary || '')}</p>`;
if (j.friction_points && j.friction_points.length) {
html += `<div style="color:var(--high);font-size:0.85rem;margin:0.25rem 0"><strong>Friction Points:</strong></div>`;
html += `<ul style="margin:0.25rem 0 0.5rem 1.25rem;color:var(--high);font-size:0.85rem">`;
j.friction_points.forEach(fp => { html += `<li>${esc(fp)}</li>`; });
html += `</ul>`;
}
if (j.bright_spots && j.bright_spots.length) {
html += `<div style="color:var(--strength);font-size:0.85rem;margin:0.25rem 0"><strong>Bright Spots:</strong></div>`;
html += `<ul style="margin:0.25rem 0 0 1.25rem;color:var(--strength);font-size:0.85rem">`;
j.bright_spots.forEach(bs => { html += `<li>${esc(bs)}</li>`; });
html += `</ul>`;
}
html += `</div>`;
});
}
if (ux.autonomous) {
const a = ux.autonomous;
html += `<div style="margin:1rem 0;padding:0.75rem;background:var(--surface2);border-left:3px solid var(--suggestion);">`;
html += `<div style="font-weight:600;margin-bottom:0.5rem">Headless / Automation Potential</div>`;
html += `<div><strong>${esc(a.potential || '')}</strong>`;
if (a.notes) html += `: ${esc(a.notes)}`;
html += `</div></div>`;
}
(ux.findings || []).forEach(f => {
const loc = f.file ? `${f.file}${f.line ? ':'+f.line : ''}` : '';
html += `<div class="analysis-finding">`;
if (f.severity) html += `<span class="badge badge-${f.severity}">${esc(f.severity)}</span> `;
html += `${esc(f.title)}`;
if (loc) html += ` <span class="item-file">${esc(loc)}</span>`;
html += `</div>`;
});
html += `</div></div>`;
document.getElementById('user-experience-section').innerHTML = html;
}
function renderDetailed() {
const da = DATA.detailed_analysis;
if (!da) return;
const dims = [
['architecture', 'Architecture (Structure, Craft, Cohesion)'],
['determinism', 'Determinism & Distribution'],
['customization', 'Customization Surface']
];
let html = `<div class="section"><div class="section-header" onclick="toggleSection(this)">`;
html += `<span class="arrow">&#9654;</span><span class="label">Detailed Analysis</span>`;
html += `</div><div class="section-body">`;
dims.forEach(([key, label]) => {
const dim = da[key];
if (!dim) return;
html += `<div class="analysis-section"><h4>${label}</h4>`;
if (dim.assessment) html += `<p>${esc(dim.assessment)}</p>`;
if (dim.dimensions) {
html += `<table style="width:100%;font-size:0.85rem;margin:0.5rem 0;border-collapse:collapse;">`;
html += `<tr><th style="text-align:left;padding:0.3rem;border-bottom:1px solid var(--border)">Dimension</th><th style="text-align:left;padding:0.3rem;border-bottom:1px solid var(--border)">Score</th><th style="text-align:left;padding:0.3rem;border-bottom:1px solid var(--border)">Notes</th></tr>`;
Object.entries(dim.dimensions).forEach(([d, v]) => {
if (v && typeof v === 'object') {
html += `<tr><td style="padding:0.3rem;border-bottom:1px solid var(--border)">${esc(d.replace(/_/g,' '))}</td><td style="padding:0.3rem;border-bottom:1px solid var(--border)">${esc(v.score||'')}</td><td style="padding:0.3rem;border-bottom:1px solid var(--border)">${esc(v.notes||'')}</td></tr>`;
}
});
html += `</table>`;
}
if (dim.journeys && dim.journeys.length) {
dim.journeys.forEach(j => {
html += `<div style="margin:0.5rem 0"><strong>${esc(j.archetype)}</strong>: ${esc(j.summary || j.journey_summary || '')}`;
if (j.friction_points && j.friction_points.length) {
html += `<ul style="color:var(--high);font-size:0.85rem;padding-left:1.25rem">`;
j.friction_points.forEach(fp => { html += `<li>${esc(fp)}</li>`; });
html += `</ul>`;
}
html += `</div>`;
});
}
if (dim.autonomous) {
const a = dim.autonomous;
html += `<p><strong>Headless Potential:</strong> ${esc(a.potential||'')}`;
if (a.notes) html += ` — ${esc(a.notes)}`;
html += `</p>`;
}
(dim.findings || []).forEach(f => {
const loc = f.file ? `${f.file}${f.line ? ':'+f.line : ''}` : '';
html += `<div class="analysis-finding">`;
if (f.severity) html += `<span class="badge badge-${f.severity}">${esc(f.severity)}</span> `;
html += `${esc(f.title)}`;
if (loc) html += ` <span class="item-file">${esc(loc)}</span>`;
html += `</div>`;
});
html += `</div>`;
});
html += `</div></div>`;
document.getElementById('detailed-section').innerHTML = html;
}
// --- Interactions ---
function toggleSection(el) {
el.classList.toggle('open');
el.nextElementSibling.classList.toggle('open');
}
function toggleFindings(idx) {
document.getElementById('findings-'+idx).classList.toggle('open');
}
// --- Prompt Generation ---
function showThemePrompt(idx) {
const opp = DATA.opportunities[idx];
if (!opp) return;
let prompt = `## Task: ${opp.name}\nSkill path: ${DATA.meta.skill_path}\n\n`;
prompt += `### Problem\n${opp.description}\n\n`;
prompt += `### Fix\n${opp.action}\n\n`;
if (opp.findings && opp.findings.length) {
prompt += `### Specific observations to address:\n\n`;
opp.findings.forEach((f, i) => {
const loc = f.file ? (f.line ? `${f.file}:${f.line}` : f.file) : '';
prompt += `${i+1}. **${f.title}**`;
if (loc) prompt += ` (${loc})`;
if (f.detail) prompt += `\n ${f.detail}`;
prompt += `\n`;
});
}
document.getElementById('modal-title').textContent = `Fix: ${opp.name}`;
document.getElementById('modal-content').textContent = prompt.trim();
document.getElementById('modal').classList.add('visible');
}
function showBrokenPrompt() {
const items = DATA.broken || [];
let prompt = `## Task: Fix Critical Issues\nSkill path: ${DATA.meta.skill_path}\n\n`;
items.forEach((item, i) => {
const loc = item.file ? (item.line ? `${item.file}:${item.line}` : item.file) : '';
prompt += `${i+1}. **[${(item.severity||'high').toUpperCase()}] ${item.title}**\n`;
if (loc) prompt += ` File: ${loc}\n`;
if (item.detail) prompt += ` Context: ${item.detail}\n`;
if (item.action) prompt += ` Fix: ${item.action}\n`;
prompt += `\n`;
});
document.getElementById('modal-title').textContent = 'Fix Critical Issues';
document.getElementById('modal-content').textContent = prompt.trim();
document.getElementById('modal').classList.add('visible');
}
function closeModal() { document.getElementById('modal').classList.remove('visible'); }
function copyModal() {
const text = document.getElementById('modal-content').textContent;
navigator.clipboard.writeText(text).then(() => {
const btn = document.querySelector('.modal .btn-primary');
btn.textContent = 'Copied!';
setTimeout(() => { btn.textContent = 'Copy to Clipboard'; }, 1500);
});
}
init();
</script>
</body>
</html>"""
def generate_html(report_data: dict) -> str:
    """Inject report data into the HTML template."""
    # Substitute the skill name first so the placeholder is never replaced inside the embedded JSON.
    skill_name = report_data.get('meta', {}).get('skill_name', 'Unknown')
    html = HTML_TEMPLATE.replace('SKILL_NAME', skill_name)
    # Escape "</" so a string such as "</script>" in the data cannot close the tag early.
    data_json = json.dumps(report_data, indent=None, ensure_ascii=False).replace('</', '<\\/')
    data_tag = f'<script id="report-data" type="application/json">{data_json}</script>'
    return html.replace('<script>\nconst RAW', f'{data_tag}\n<script>\nconst RAW')
def main() -> int:
parser = argparse.ArgumentParser(
description='Generate interactive HTML quality analysis report',
)
parser.add_argument(
'report_dir',
type=Path,
help='Directory containing report-data.json',
)
parser.add_argument(
'--open',
action='store_true',
help='Open the HTML report in the default browser',
)
parser.add_argument(
'--output', '-o',
type=Path,
help='Output HTML file path (default: {report_dir}/quality-report.html)',
)
args = parser.parse_args()
if not args.report_dir.is_dir():
print(f'Error: {args.report_dir} is not a directory', file=sys.stderr)
return 2
report_data = load_report_data(args.report_dir)
html = generate_html(report_data)
output_path = args.output or (args.report_dir / 'quality-report.html')
output_path.write_text(html, encoding='utf-8')
# Output summary
opp_count = len(report_data.get('opportunities', []))
broken_count = len(report_data.get('broken', []))
print(json.dumps({
'html_report': str(output_path),
'grade': report_data.get('grade', 'Unknown'),
'opportunities': opp_count,
'broken': broken_count,
}))
    if args.open:
        system = platform.system()
        if system == 'Darwin':
            subprocess.run(['open', str(output_path)])
        elif system == 'Linux':
            subprocess.run(['xdg-open', str(output_path)])
        elif system == 'Windows':
            # `start` treats the first quoted argument as a window title, so pass an empty one.
            subprocess.run(['start', '', str(output_path)], shell=True)
return 0
if __name__ == '__main__':
sys.exit(main())
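
The prompt text assembled by `showBrokenPrompt` in the template can be easier to reason about when written out in Python. The following is a minimal sketch, not part of the script: `build_broken_prompt` is a hypothetical helper, and the three-space indent on the detail lines is an assumption, since the template's original whitespace is not visible here. It reads the same fields the JavaScript reads (`meta.skill_path`, and `severity`, `title`, `file`, `line`, `detail`, `action` on each `broken` item).

```python
def build_broken_prompt(data: dict) -> str:
    """Approximate Python mirror of the template's showBrokenPrompt() (illustrative only)."""
    lines = [
        "## Task: Fix Critical Issues",
        f"Skill path: {data['meta']['skill_path']}",
        "",
    ]
    for i, item in enumerate(data.get("broken", []), 1):
        # Location is "file:line" when both are present, else just the file.
        loc = item.get("file", "")
        if loc and item.get("line"):
            loc = f"{loc}:{item['line']}"
        # Severity falls back to "high", as in the JavaScript version.
        severity = (item.get("severity") or "high").upper()
        lines.append(f"{i}. **[{severity}] {item['title']}**")
        if loc:
            lines.append(f"   File: {loc}")
        if item.get("detail"):
            lines.append(f"   Context: {item['detail']}")
        if item.get("action"):
            lines.append(f"   Fix: {item['action']}")
        lines.append("")
    return "\n".join(lines).strip()
```

Calling it with a dict shaped like `report-data.json` returns the text that would be shown in the modal, which can help when checking a report's `broken` entries outside the browser.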


@@ -0,0 +1,288 @@
#!/usr/bin/env python3
"""Deterministic pre-pass for execution efficiency scanner.
Extracts dependency graph data and execution patterns from a BMad skill
so the LLM scanner can evaluate efficiency from compact structured data.
Covers:
- Dependency graph from skill structure
- Circular dependency detection
- Transitive dependency redundancy
- Parallelizable stage groups (independent nodes)
- Sequential pattern detection in prompts (numbered Read/Grep/Glob steps)
- Subagent-from-subagent detection
"""
# /// script
# requires-python = ">=3.9"
# ///
from __future__ import annotations
import argparse
import json
import re
import sys
from datetime import datetime, timezone
from pathlib import Path
def detect_cycles(graph: dict[str, list[str]]) -> list[list[str]]:
    """Detect circular dependencies in a directed graph using DFS."""
    cycles = []
    visited = set()
    path = []
    path_set = set()

    def dfs(node: str) -> None:
        if node in path_set:
            cycle_start = path.index(node)
            cycles.append(path[cycle_start:] + [node])
            return
        if node in visited:
            return
        visited.add(node)
        path.append(node)
        path_set.add(node)
        for neighbor in graph.get(node, []):
            dfs(neighbor)
        path.pop()
        path_set.discard(node)

    for node in graph:
        dfs(node)

    return cycles
def find_transitive_redundancy(graph: dict[str, list[str]]) -> list[dict]:
    """Find cases where A declares dependency on C, but A->B->C already exists."""
    redundancies = []

    def get_transitive(node: str, visited: set | None = None) -> set[str]:
        if visited is None:
            visited = set()
        for dep in graph.get(node, []):
            if dep not in visited:
                visited.add(dep)
                get_transitive(dep, visited)
        return visited

    for node, direct_deps in graph.items():
        for dep in direct_deps:
            # Check if dep is reachable through other direct deps
            other_deps = [d for d in direct_deps if d != dep]
            for other in other_deps:
                transitive = get_transitive(other)
                if dep in transitive:
                    redundancies.append({
                        'node': node,
                        'redundant_dep': dep,
                        'already_via': other,
                        'issue': f'"{node}" declares "{dep}" as dependency, but already reachable via "{other}"',
                    })

    return redundancies
def find_parallel_groups(graph: dict[str, list[str]], all_nodes: set[str]) -> list[list[str]]:
    """Find groups of nodes that have no dependencies on each other (can run in parallel)."""
    independent_groups = []

    # Simple approach: find all nodes at each "level" of the DAG
    remaining = set(all_nodes)
    while remaining:
        # Nodes whose dependencies are all satisfied (not in remaining)
        ready = set()
        for node in remaining:
            deps = set(graph.get(node, []))
            if not deps & remaining:
                ready.add(node)
        if not ready:
            break  # Circular dependency, can't proceed
        if len(ready) > 1:
            independent_groups.append(sorted(ready))
        remaining -= ready

    return independent_groups
def scan_sequential_patterns(filepath: Path, rel_path: str) -> list[dict]:
"""Detect sequential operation patterns that could be parallel."""
content = filepath.read_text(encoding='utf-8')
patterns = []
# Sequential numbered steps with Read/Grep/Glob
tool_steps = re.findall(
r'^\s*\d+\.\s+.*?\b(Read|Grep|Glob|read|grep|glob)\b.*$',
content, re.MULTILINE
)
if len(tool_steps) >= 3:
patterns.append({
'file': rel_path,
'type': 'sequential-tool-calls',
'count': len(tool_steps),
'issue': f'{len(tool_steps)} sequential tool call steps found — check if independent calls can be parallel',
})
# "Read all files" / "for each" loop patterns
loop_patterns = [
(r'[Rr]ead all (?:files|documents|prompts)', 'read-all'),
(r'[Ff]or each (?:file|document|prompt|stage)', 'for-each-loop'),
(r'[Aa]nalyze each', 'analyze-each'),
(r'[Ss]can (?:through|all|each)', 'scan-all'),
(r'[Rr]eview (?:all|each)', 'review-all'),
]
for pattern, ptype in loop_patterns:
matches = re.findall(pattern, content)
if matches:
patterns.append({
'file': rel_path,
'type': ptype,
'count': len(matches),
'issue': f'"{matches[0]}" pattern found — consider parallel subagent delegation',
})
# Subagent spawning from subagent (impossible)
if re.search(r'(?i)spawn.*subagent|launch.*subagent|create.*subagent', content):
# Check if this file IS a subagent (non-SKILL.md, non-numbered prompt at root)
if rel_path != 'SKILL.md' and not re.match(r'^\d+-', rel_path):
patterns.append({
'file': rel_path,
'type': 'subagent-chain-violation',
'count': 1,
'issue': 'Subagent file references spawning other subagents — subagents cannot spawn subagents',
})
return patterns
def scan_execution_deps(skill_path: Path) -> dict:
"""Run all deterministic execution efficiency checks."""
# Build dependency graph from skill structure
dep_graph: dict[str, list[str]] = {}
prefer_after: dict[str, list[str]] = {}
all_stages: set[str] = set()
# Check for stage-level prompt files at skill root
for f in sorted(skill_path.iterdir()):
if f.is_file() and f.suffix == '.md' and f.name != 'SKILL.md':
all_stages.add(f.stem)
# Cycle detection
cycles = detect_cycles(dep_graph)
# Transitive redundancy
redundancies = find_transitive_redundancy(dep_graph)
# Parallel groups
parallel_groups = find_parallel_groups(dep_graph, all_stages)
# Sequential pattern detection across all prompt and agent files at root
sequential_patterns = []
for f in sorted(skill_path.iterdir()):
if f.is_file() and f.suffix == '.md' and f.name != 'SKILL.md':
patterns = scan_sequential_patterns(f, f.name)
sequential_patterns.extend(patterns)
# Also scan SKILL.md
skill_md = skill_path / 'SKILL.md'
if skill_md.exists():
sequential_patterns.extend(scan_sequential_patterns(skill_md, 'SKILL.md'))
# Build issues from deterministic findings
issues = []
    for cycle in cycles:
        issues.append({
            'severity': 'critical',
            'category': 'circular-dependency',
            'issue': f'Circular dependency detected: {" -> ".join(cycle)}',
        })
for r in redundancies:
issues.append({
'severity': 'medium',
'category': 'dependency-bloat',
'issue': r['issue'],
})
for p in sequential_patterns:
severity = 'critical' if p['type'] == 'subagent-chain-violation' else 'medium'
issues.append({
'file': p['file'],
'severity': severity,
'category': p['type'],
'issue': p['issue'],
})
by_severity = {'critical': 0, 'high': 0, 'medium': 0, 'low': 0}
for issue in issues:
sev = issue['severity']
if sev in by_severity:
by_severity[sev] += 1
status = 'pass'
if by_severity['critical'] > 0:
status = 'fail'
elif by_severity['medium'] > 0:
status = 'warning'
return {
'scanner': 'execution-efficiency-prepass',
'script': 'prepass-execution-deps.py',
'version': '1.0.0',
'skill_path': str(skill_path),
'timestamp': datetime.now(timezone.utc).isoformat(),
'status': status,
'dependency_graph': {
'stages': sorted(all_stages),
'hard_dependencies': dep_graph,
'soft_dependencies': prefer_after,
'cycles': cycles,
'transitive_redundancies': redundancies,
'parallel_groups': parallel_groups,
},
'sequential_patterns': sequential_patterns,
'issues': issues,
'summary': {
'total_issues': len(issues),
'by_severity': by_severity,
},
}
def main() -> int:
parser = argparse.ArgumentParser(
description='Extract execution dependency graph and patterns for LLM scanner pre-pass',
)
parser.add_argument(
'skill_path',
type=Path,
help='Path to the skill directory to scan',
)
parser.add_argument(
'--output', '-o',
type=Path,
help='Write JSON output to file instead of stdout',
)
args = parser.parse_args()
if not args.skill_path.is_dir():
print(f"Error: {args.skill_path} is not a directory", file=sys.stderr)
return 2
result = scan_execution_deps(args.skill_path)
output = json.dumps(result, indent=2)
if args.output:
args.output.parent.mkdir(parents=True, exist_ok=True)
args.output.write_text(output)
print(f"Results written to {args.output}", file=sys.stderr)
else:
print(output)
return 0
if __name__ == '__main__':
sys.exit(main())
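
The level-by-level grouping used by `find_parallel_groups` can be tried on a small hand-written graph. The sketch below is an illustration under stated assumptions, not the script's own code: `parallel_levels` is a hypothetical name, and unlike the original it returns every level (including single-node ones) and raises on a cycle instead of stopping silently. The graph maps each stage to the stages it depends on, as `dep_graph` does.

```python
def parallel_levels(graph: dict[str, list[str]], nodes: set[str]) -> list[list[str]]:
    """Group nodes into levels; every node in a level depends only on earlier levels."""
    remaining = set(nodes)
    levels: list[list[str]] = []
    while remaining:
        # A node is ready when none of its dependencies are still waiting.
        ready = sorted(
            n for n in remaining
            if not set(graph.get(n, [])) & remaining
        )
        if not ready:
            # Nothing can start, so the remaining nodes depend on each other.
            raise ValueError(f"circular dependency among: {sorted(remaining)}")
        levels.append(ready)
        remaining -= set(ready)
    return levels


# Hypothetical stage names: "c" needs "a" and "b"; "d" needs "c".
example_graph = {"c": ["a", "b"], "d": ["c"]}
example_levels = parallel_levels(example_graph, {"a", "b", "c", "d"})
```

Stages that share a level have no dependency on one another, so they are the candidates for running in parallel. Raising on a cycle is a design choice for this sketch; the script reports cycles separately through `detect_cycles`.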


@@ -0,0 +1,285 @@
#!/usr/bin/env python3
"""Deterministic pre-pass for prompt craft scanner.
Extracts metrics and flagged patterns from SKILL.md and prompt files
so the LLM scanner can work from compact data instead of reading raw files.
Covers:
- SKILL.md line count and section inventory
- Overview section size
- Inline data detection (tables, fenced code blocks)
- Defensive padding pattern grep
- Meta-explanation pattern grep
- Back-reference detection ("as described above")
- Config header and progression condition presence per prompt
- File-level token estimates (chars / 4 rough approximation)
"""
# /// script
# requires-python = ">=3.9"
# ///
from __future__ import annotations
import argparse
import json
import re
import sys
from datetime import datetime, timezone
from pathlib import Path
# Defensive padding / filler patterns
WASTE_PATTERNS = [
(r'\b[Mm]ake sure (?:to|you)\b', 'defensive-padding', 'Defensive: "make sure to/you"'),
(r"\b[Dd]on'?t forget (?:to|that)\b", 'defensive-padding', "Defensive: \"don't forget\""),
(r'\b[Rr]emember (?:to|that)\b', 'defensive-padding', 'Defensive: "remember to/that"'),
(r'\b[Bb]e sure to\b', 'defensive-padding', 'Defensive: "be sure to"'),
(r'\b[Pp]lease ensure\b', 'defensive-padding', 'Defensive: "please ensure"'),
(r'\b[Ii]t is important (?:to|that)\b', 'defensive-padding', 'Defensive: "it is important"'),
(r'\b[Yy]ou are an AI\b', 'meta-explanation', 'Meta: "you are an AI"'),
(r'\b[Aa]s a language model\b', 'meta-explanation', 'Meta: "as a language model"'),
(r'\b[Aa]s an AI assistant\b', 'meta-explanation', 'Meta: "as an AI assistant"'),
(r'\b[Tt]his (?:workflow|skill|process) is designed to\b', 'meta-explanation', 'Meta: "this workflow is designed to"'),
(r'\b[Tt]he purpose of this (?:section|step) is\b', 'meta-explanation', 'Meta: "the purpose of this section is"'),
(r"\b[Ll]et'?s (?:think about|begin|start)\b", 'filler', "Filler: \"let's think/begin\""),
(r'\b[Nn]ow we(?:\'ll| will)\b', 'filler', "Filler: \"now we'll\""),
]
# Back-reference patterns (self-containment risk)
BACKREF_PATTERNS = [
(r'\bas described above\b', 'Back-reference: "as described above"'),
(r'\bper the overview\b', 'Back-reference: "per the overview"'),
(r'\bas mentioned (?:above|in|earlier)\b', 'Back-reference: "as mentioned above/in/earlier"'),
(r'\bsee (?:above|the overview)\b', 'Back-reference: "see above/the overview"'),
(r'\brefer to (?:the )?(?:above|overview|SKILL)\b', 'Back-reference: "refer to above/overview"'),
]
def count_tables(content: str) -> tuple[int, int]:
    """Count markdown tables and their total lines."""
    table_count = 0
    table_lines = 0
    in_table = False
    for line in content.split('\n'):
        if '|' in line and re.match(r'^\s*\|', line):
            if not in_table:
                table_count += 1
                in_table = True
            table_lines += 1
        else:
            in_table = False
    return table_count, table_lines


def count_fenced_blocks(content: str) -> tuple[int, int]:
    """Count fenced code blocks and their total lines."""
    block_count = 0
    block_lines = 0
    in_block = False
    for line in content.split('\n'):
        if line.strip().startswith('```'):
            if in_block:
                in_block = False
            else:
                in_block = True
                block_count += 1
        elif in_block:
            block_lines += 1
    return block_count, block_lines


def extract_overview_size(content: str) -> int:
    """Count lines in the ## Overview section."""
    lines = content.split('\n')
    in_overview = False
    overview_lines = 0
    for line in lines:
        if re.match(r'^##\s+Overview\b', line):
            in_overview = True
            continue
        elif in_overview and re.match(r'^##\s', line):
            break
        elif in_overview:
            overview_lines += 1
    return overview_lines
def scan_file_patterns(filepath: Path, rel_path: str) -> dict:
"""Extract metrics and pattern matches from a single file."""
content = filepath.read_text(encoding='utf-8')
lines = content.split('\n')
line_count = len(lines)
# Token estimate (rough: chars / 4)
token_estimate = len(content) // 4
# Section inventory
sections = []
for i, line in enumerate(lines, 1):
m = re.match(r'^(#{2,3})\s+(.+)$', line)
if m:
sections.append({'level': len(m.group(1)), 'title': m.group(2).strip(), 'line': i})
# Tables and code blocks
table_count, table_lines = count_tables(content)
block_count, block_lines = count_fenced_blocks(content)
# Pattern matches
waste_matches = []
for pattern, category, label in WASTE_PATTERNS:
for m in re.finditer(pattern, content):
line_num = content[:m.start()].count('\n') + 1
waste_matches.append({
'line': line_num,
'category': category,
'pattern': label,
'context': lines[line_num - 1].strip()[:100],
})
backref_matches = []
for pattern, label in BACKREF_PATTERNS:
for m in re.finditer(pattern, content, re.IGNORECASE):
line_num = content[:m.start()].count('\n') + 1
backref_matches.append({
'line': line_num,
'pattern': label,
'context': lines[line_num - 1].strip()[:100],
})
# Config header
has_config_header = '{communication_language}' in content or '{document_output_language}' in content
# Progression condition
prog_keywords = ['progress', 'advance', 'move to', 'next stage',
'when complete', 'proceed to', 'transition', 'completion criteria']
has_progression = any(kw in content.lower() for kw in prog_keywords)
result = {
'file': rel_path,
'line_count': line_count,
'token_estimate': token_estimate,
'sections': sections,
'table_count': table_count,
'table_lines': table_lines,
'fenced_block_count': block_count,
'fenced_block_lines': block_lines,
'waste_patterns': waste_matches,
'back_references': backref_matches,
'has_config_header': has_config_header,
'has_progression': has_progression,
}
return result
def scan_prompt_metrics(skill_path: Path) -> dict:
"""Extract metrics from all prompt-relevant files."""
files_data = []
# SKILL.md
skill_md = skill_path / 'SKILL.md'
if skill_md.exists():
data = scan_file_patterns(skill_md, 'SKILL.md')
content = skill_md.read_text(encoding='utf-8')
data['overview_lines'] = extract_overview_size(content)
data['is_skill_md'] = True
files_data.append(data)
# Prompt files at skill root (non-SKILL.md .md files)
for f in sorted(skill_path.iterdir()):
if f.is_file() and f.suffix == '.md' and f.name != 'SKILL.md':
data = scan_file_patterns(f, f.name)
data['is_skill_md'] = False
files_data.append(data)
# References (just sizes, for progressive disclosure assessment)
references_dir = skill_path / 'references'
reference_sizes = {}
if references_dir.exists():
for f in sorted(references_dir.iterdir()):
if f.is_file() and f.suffix in ('.md', '.json', '.yaml', '.yml'):
content = f.read_text(encoding='utf-8')
reference_sizes[f.name] = {
'lines': len(content.split('\n')),
'tokens': len(content) // 4,
}
# Aggregate stats
total_waste = sum(len(f['waste_patterns']) for f in files_data)
total_backrefs = sum(len(f['back_references']) for f in files_data)
total_tokens = sum(f['token_estimate'] for f in files_data)
prompts_with_config = sum(1 for f in files_data if not f.get('is_skill_md') and f['has_config_header'])
prompts_with_progression = sum(1 for f in files_data if not f.get('is_skill_md') and f['has_progression'])
total_prompts = sum(1 for f in files_data if not f.get('is_skill_md'))
skill_md_data = next((f for f in files_data if f.get('is_skill_md')), None)
return {
'scanner': 'prompt-craft-prepass',
'script': 'prepass-prompt-metrics.py',
'version': '1.0.0',
'skill_path': str(skill_path),
'timestamp': datetime.now(timezone.utc).isoformat(),
'status': 'info',
'skill_md_summary': {
'line_count': skill_md_data['line_count'] if skill_md_data else 0,
'token_estimate': skill_md_data['token_estimate'] if skill_md_data else 0,
'overview_lines': skill_md_data.get('overview_lines', 0) if skill_md_data else 0,
'table_count': skill_md_data['table_count'] if skill_md_data else 0,
'table_lines': skill_md_data['table_lines'] if skill_md_data else 0,
'fenced_block_count': skill_md_data['fenced_block_count'] if skill_md_data else 0,
'fenced_block_lines': skill_md_data['fenced_block_lines'] if skill_md_data else 0,
'section_count': len(skill_md_data['sections']) if skill_md_data else 0,
},
'prompt_health': {
'total_prompts': total_prompts,
'prompts_with_config_header': prompts_with_config,
'prompts_with_progression': prompts_with_progression,
},
'aggregate': {
'total_files_scanned': len(files_data),
'total_token_estimate': total_tokens,
'total_waste_patterns': total_waste,
'total_back_references': total_backrefs,
},
'reference_sizes': reference_sizes,
'files': files_data,
}
def main() -> int:
parser = argparse.ArgumentParser(
description='Extract prompt craft metrics for LLM scanner pre-pass',
)
parser.add_argument(
'skill_path',
type=Path,
help='Path to the skill directory to scan',
)
parser.add_argument(
'--output', '-o',
type=Path,
help='Write JSON output to file instead of stdout',
)
args = parser.parse_args()
if not args.skill_path.is_dir():
print(f"Error: {args.skill_path} is not a directory", file=sys.stderr)
return 2
result = scan_prompt_metrics(args.skill_path)
output = json.dumps(result, indent=2)
if args.output:
args.output.parent.mkdir(parents=True, exist_ok=True)
args.output.write_text(output)
print(f"Results written to {args.output}", file=sys.stderr)
else:
print(output)
return 0
if __name__ == '__main__':
sys.exit(main())
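
Both pattern loops in `scan_file_patterns` turn a regex match offset into a 1-based line number by counting the newlines before the match. A minimal sketch of that technique on its own follows; `flag_patterns` and the sample text are hypothetical, and the two patterns are copied from `WASTE_PATTERNS` and `BACKREF_PATTERNS`.

```python
import re


def flag_patterns(content: str, patterns: list[tuple[str, str]]) -> list[dict]:
    """Return one hit per regex match, with its line number and a short context."""
    lines = content.split("\n")
    hits = []
    for regex, label in patterns:
        for m in re.finditer(regex, content):
            # Newlines before the match start give the zero-based line index.
            line_num = content.count("\n", 0, m.start()) + 1
            hits.append({
                "line": line_num,
                "pattern": label,
                "context": lines[line_num - 1].strip()[:100],
            })
    return hits


sample = "Intro\nMake sure to save.\nDone, as described above."
sample_hits = flag_patterns(sample, [
    (r"\b[Mm]ake sure (?:to|you)\b", "defensive-padding"),
    (r"\bas described above\b", "back-reference"),
])
```

Using `str.count` with start and end offsets avoids building a slice for every match; the result is the same as `content[:m.start()].count('\n') + 1`.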


@@ -0,0 +1,475 @@
#!/usr/bin/env python3
"""Deterministic pre-pass for workflow integrity scanner.
Extracts structural metadata from a BMad skill that the LLM scanner
can use instead of reading all files itself. Covers:
- Frontmatter parsing and validation
- Section inventory (H2/H3 headers)
- Template artifact detection
- Stage file cross-referencing
- Stage numbering validation
- Config header detection in prompts
- Language/directness pattern grep
- On Exit / Exiting section detection (invalid)
"""
# /// script
# requires-python = ">=3.9"
# ///
from __future__ import annotations
import argparse
import json
import re
import sys
from datetime import datetime, timezone
from pathlib import Path
# Template artifacts that should NOT appear in finalized skills
TEMPLATE_ARTIFACTS = [
r'\{if-complex-workflow\}', r'\{/if-complex-workflow\}',
r'\{if-simple-workflow\}', r'\{/if-simple-workflow\}',
r'\{if-simple-utility\}', r'\{/if-simple-utility\}',
r'\{if-module\}', r'\{/if-module\}',
r'\{if-headless\}', r'\{/if-headless\}',
r'\{displayName\}', r'\{skillName\}',
]
# Runtime variables that ARE expected (not artifacts)
RUNTIME_VARS = {
'{user_name}', '{communication_language}', '{document_output_language}',
'{project-root}', '{output_folder}', '{planning_artifacts}',
}
# Directness anti-patterns
DIRECTNESS_PATTERNS = [
(r'\byou should\b', 'Suggestive "you should" — use direct imperative'),
(r'\bplease\b(?! note)', 'Polite "please" — use direct imperative'),
(r'\bhandle appropriately\b', 'Ambiguous "handle appropriately" — specify how'),
(r'\bwhen ready\b', 'Vague "when ready" — specify testable condition'),
]
# Invalid sections
INVALID_SECTIONS = [
(r'^##\s+On\s+Exit\b', 'On Exit section found — no exit hooks exist in the system, this will never run'),
(r'^##\s+Exiting\b', 'Exiting section found — no exit hooks exist in the system, this will never run'),
]
def parse_frontmatter(content: str) -> tuple[dict | None, list[dict]]:
"""Parse YAML frontmatter and validate."""
findings = []
fm_match = re.match(r'^---\s*\n(.*?)\n---\s*\n', content, re.DOTALL)
if not fm_match:
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'critical', 'category': 'frontmatter',
'issue': 'No YAML frontmatter found',
})
return None, findings
try:
# Frontmatter is YAML-like key: value pairs — parse manually
fm = {}
for line in fm_match.group(1).strip().split('\n'):
line = line.strip()
if not line or line.startswith('#'):
continue
if ':' in line:
key, _, value = line.partition(':')
fm[key.strip()] = value.strip().strip('"').strip("'")
except Exception as e:
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'critical', 'category': 'frontmatter',
'issue': f'Invalid frontmatter: {e}',
})
return None, findings
if not isinstance(fm, dict):
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'critical', 'category': 'frontmatter',
'issue': 'Frontmatter is not a YAML mapping',
})
return None, findings
# name check
name = fm.get('name')
if not name:
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'critical', 'category': 'frontmatter',
'issue': 'Missing "name" field in frontmatter',
})
elif not re.match(r'^[a-z0-9]+(-[a-z0-9]+)*$', name):
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'high', 'category': 'frontmatter',
'issue': f'Name "{name}" is not kebab-case',
})
# bmad- prefix check removed — bmad- is reserved for official BMad creations only
# description check
desc = fm.get('description')
if not desc:
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'high', 'category': 'frontmatter',
'issue': 'Missing "description" field in frontmatter',
})
elif 'Use when' not in desc and 'use when' not in desc:
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'medium', 'category': 'frontmatter',
'issue': 'Description missing "Use when..." trigger phrase',
})
# Extra fields check
allowed = {'name', 'description', 'menu-code'}
extra = set(fm.keys()) - allowed
if extra:
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'low', 'category': 'frontmatter',
'issue': f'Extra frontmatter fields: {", ".join(sorted(extra))}',
})
return fm, findings
def extract_sections(content: str) -> list[dict]:
    """Extract all H2 and H3 headers with line numbers."""
sections = []
for i, line in enumerate(content.split('\n'), 1):
m = re.match(r'^(#{2,3})\s+(.+)$', line)
if m:
sections.append({
'level': len(m.group(1)),
'title': m.group(2).strip(),
'line': i,
})
return sections
def check_required_sections(sections: list[dict]) -> list[dict]:
"""Check for required and invalid sections."""
findings = []
h2_titles = [s['title'] for s in sections if s['level'] == 2]
if 'Overview' not in h2_titles:
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'high', 'category': 'sections',
'issue': 'Missing ## Overview section',
})
if 'On Activation' not in h2_titles:
findings.append({
'file': 'SKILL.md', 'line': 1,
'severity': 'high', 'category': 'sections',
'issue': 'Missing ## On Activation section',
})
# Invalid sections
for s in sections:
if s['level'] == 2:
for pattern, message in INVALID_SECTIONS:
if re.match(pattern, f"## {s['title']}"):
findings.append({
'file': 'SKILL.md', 'line': s['line'],
'severity': 'high', 'category': 'invalid-section',
'issue': message,
})
return findings
def find_template_artifacts(filepath: Path, rel_path: str) -> list[dict]:
"""Scan for orphaned template substitution artifacts."""
findings = []
content = filepath.read_text(encoding='utf-8')
for pattern in TEMPLATE_ARTIFACTS:
for m in re.finditer(pattern, content):
matched = m.group()
if matched in RUNTIME_VARS:
continue
line_num = content[:m.start()].count('\n') + 1
findings.append({
'file': rel_path, 'line': line_num,
'severity': 'high', 'category': 'artifacts',
'issue': f'Orphaned template artifact: {matched}',
'fix': 'Resolve or remove this template conditional/placeholder',
})
return findings
def cross_reference_stages(skill_path: Path, skill_content: str) -> tuple[dict, list[dict]]:
"""Cross-reference stage files between SKILL.md and numbered prompt files at skill root."""
findings = []
# Get actual numbered prompt files at skill root (exclude SKILL.md)
actual_files = set()
for f in skill_path.iterdir():
if f.is_file() and f.suffix == '.md' and f.name != 'SKILL.md' and re.match(r'^\d+-', f.name):
actual_files.add(f.name)
# Find stage references in SKILL.md — look for both old prompts/ style and new root style
referenced = set()
# Match `prompts/XX-name.md` (legacy) or bare `XX-name.md` references
ref_pattern = re.compile(r'(?:prompts/)?(\d+-[^\s)`]+\.md)')
for m in ref_pattern.finditer(skill_content):
referenced.add(m.group(1))
# Missing files (referenced but don't exist)
missing = referenced - actual_files
for f in sorted(missing):
findings.append({
'file': 'SKILL.md', 'line': 0,
'severity': 'critical', 'category': 'missing-stage',
'issue': f'Referenced stage file does not exist: {f}',
})
# Orphaned files (exist but not referenced)
orphaned = actual_files - referenced
for f in sorted(orphaned):
findings.append({
'file': f, 'line': 0,
'severity': 'medium', 'category': 'naming',
'issue': f'Stage file exists but not referenced in SKILL.md: {f}',
})
# Stage numbering check
numbered = []
for f in sorted(actual_files):
m = re.match(r'^(\d+)-(.+)\.md$', f)
if m:
numbered.append((int(m.group(1)), f))
    if numbered:
        nums = sorted({n for n, _ in numbered})
        # Compare against the full range from the lowest to the highest stage number.
        gaps = set(range(nums[0], nums[-1] + 1)) - set(nums)
        if gaps:
            findings.append({
                'file': skill_path.name, 'line': 0,
                'severity': 'medium', 'category': 'naming',
                'issue': f'Stage numbering has gaps: missing {sorted(gaps)}',
            })
stage_summary = {
'total_stages': len(actual_files),
'referenced': sorted(referenced),
'actual': sorted(actual_files),
'missing_stages': sorted(missing),
'orphaned_stages': sorted(orphaned),
}
return stage_summary, findings
def check_prompt_basics(skill_path: Path) -> tuple[list[dict], list[dict]]:
"""Check each prompt file for config header and progression conditions."""
findings = []
prompt_details = []
# Look for numbered prompt files at skill root
prompt_files = sorted(
f for f in skill_path.iterdir()
if f.is_file() and f.suffix == '.md' and f.name != 'SKILL.md' and re.match(r'^\d+-', f.name)
)
if not prompt_files:
return prompt_details, findings
for f in prompt_files:
content = f.read_text(encoding='utf-8')
rel_path = f.name
detail = {'file': f.name, 'has_config_header': False, 'has_progression': False}
# Config header check
if '{communication_language}' in content or '{document_output_language}' in content:
detail['has_config_header'] = True
else:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'config-header',
'issue': 'No config header with language variables found',
})
        # Progression condition check (keyword search over the whole file)
lower = content.lower()
prog_keywords = ['progress', 'advance', 'move to', 'next stage', 'when complete',
'proceed to', 'transition', 'completion criteria']
if any(kw in lower for kw in prog_keywords):
detail['has_progression'] = True
else:
findings.append({
'file': rel_path, 'line': len(content.split('\n')),
'severity': 'high', 'category': 'progression',
'issue': 'No progression condition keywords found',
})
# Directness checks
for pattern, message in DIRECTNESS_PATTERNS:
for m in re.finditer(pattern, content, re.IGNORECASE):
line_num = content[:m.start()].count('\n') + 1
findings.append({
'file': rel_path, 'line': line_num,
'severity': 'low', 'category': 'language',
'issue': message,
})
# Template artifacts
findings.extend(find_template_artifacts(f, rel_path))
prompt_details.append(detail)
return prompt_details, findings
def detect_workflow_type(skill_content: str, has_prompts: bool) -> str:
    """Detect workflow type from SKILL.md content."""
    has_stage_refs = bool(re.search(r'(?:prompts/)?\d+-\S+\.md', skill_content))
    has_routing = bool(re.search(r'(?i)(rout|stage|branch|path)', skill_content))

    if has_stage_refs or (has_prompts and has_routing):
        return 'complex'
    elif re.search(r'(?m)^\d+\.\s', skill_content):
        return 'simple-workflow'
    else:
        return 'simple-utility'
def scan_workflow_integrity(skill_path: Path) -> dict:
"""Run all deterministic workflow integrity checks."""
all_findings = []
# Read SKILL.md
skill_md = skill_path / 'SKILL.md'
if not skill_md.exists():
return {
'scanner': 'workflow-integrity-prepass',
'script': 'prepass-workflow-integrity.py',
'version': '1.0.0',
'skill_path': str(skill_path),
'timestamp': datetime.now(timezone.utc).isoformat(),
'status': 'fail',
'issues': [{'file': 'SKILL.md', 'line': 1, 'severity': 'critical',
'category': 'missing-file', 'issue': 'SKILL.md does not exist'}],
'summary': {'total_issues': 1, 'by_severity': {'critical': 1, 'high': 0, 'medium': 0, 'low': 0}},
}
skill_content = skill_md.read_text(encoding='utf-8')
# Frontmatter
frontmatter, fm_findings = parse_frontmatter(skill_content)
all_findings.extend(fm_findings)
# Sections
sections = extract_sections(skill_content)
section_findings = check_required_sections(sections)
all_findings.extend(section_findings)
# Template artifacts in SKILL.md
all_findings.extend(find_template_artifacts(skill_md, 'SKILL.md'))
# Directness checks in SKILL.md
for pattern, message in DIRECTNESS_PATTERNS:
for m in re.finditer(pattern, skill_content, re.IGNORECASE):
line_num = skill_content[:m.start()].count('\n') + 1
all_findings.append({
'file': 'SKILL.md', 'line': line_num,
'severity': 'low', 'category': 'language',
'issue': message,
})
# Workflow type
has_prompts = any(
f.is_file() and f.suffix == '.md' and f.name != 'SKILL.md' and re.match(r'^\d+-', f.name)
for f in skill_path.iterdir()
)
workflow_type = detect_workflow_type(skill_content, has_prompts)
# Stage cross-reference
stage_summary, stage_findings = cross_reference_stages(skill_path, skill_content)
all_findings.extend(stage_findings)
# Prompt basics
prompt_details, prompt_findings = check_prompt_basics(skill_path)
all_findings.extend(prompt_findings)
# Build severity summary
by_severity = {'critical': 0, 'high': 0, 'medium': 0, 'low': 0}
for f in all_findings:
sev = f['severity']
if sev in by_severity:
by_severity[sev] += 1
status = 'pass'
if by_severity['critical'] > 0:
status = 'fail'
elif by_severity['high'] > 0:
status = 'warning'
return {
'scanner': 'workflow-integrity-prepass',
'script': 'prepass-workflow-integrity.py',
'version': '1.0.0',
'skill_path': str(skill_path),
'timestamp': datetime.now(timezone.utc).isoformat(),
'status': status,
'metadata': {
'frontmatter': frontmatter,
'sections': sections,
'workflow_type': workflow_type,
},
'stage_summary': stage_summary,
'prompt_details': prompt_details,
'issues': all_findings,
'summary': {
'total_issues': len(all_findings),
'by_severity': by_severity,
},
}
def main() -> int:
parser = argparse.ArgumentParser(
description='Deterministic pre-pass for workflow integrity scanning',
)
parser.add_argument(
'skill_path',
type=Path,
help='Path to the skill directory to scan',
)
parser.add_argument(
'--output', '-o',
type=Path,
help='Write JSON output to file instead of stdout',
)
args = parser.parse_args()
if not args.skill_path.is_dir():
print(f"Error: {args.skill_path} is not a directory", file=sys.stderr)
return 2
result = scan_workflow_integrity(args.skill_path)
output = json.dumps(result, indent=2)
if args.output:
args.output.parent.mkdir(parents=True, exist_ok=True)
args.output.write_text(output)
print(f"Results written to {args.output}", file=sys.stderr)
else:
print(output)
return 0 if result['status'] == 'pass' else 1
if __name__ == '__main__':
sys.exit(main())
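
The status rule and exit-code convention used by the pre-pass can be sketched in isolation. This is a minimal illustration, assuming findings are dicts carrying a `severity` key; the helper names `summarize` and `exit_code` are hypothetical and do not exist in the script.

```python
from __future__ import annotations


def summarize(findings: list[dict]) -> tuple[str, dict[str, int]]:
    """Count findings per severity and derive an overall status."""
    by_severity = {'critical': 0, 'high': 0, 'medium': 0, 'low': 0}
    for finding in findings:
        sev = finding.get('severity')
        if sev in by_severity:  # unknown levels such as 'info' are not counted
            by_severity[sev] += 1
    if by_severity['critical'] > 0:
        status = 'fail'
    elif by_severity['high'] > 0:
        status = 'warning'
    else:
        status = 'pass'
    return status, by_severity


def exit_code(status: str) -> int:
    """Mirror the script's convention: 0 only when the status is 'pass'."""
    return 0 if status == 'pass' else 1


status, counts = summarize([{'severity': 'low'}, {'severity': 'high'}])
print(status, counts, exit_code(status))
```

Note that under this convention a `warning` status also exits non-zero, so a caller that only wants to block on critical findings would need to read the JSON `status` field rather than the exit code.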


@@ -0,0 +1,298 @@
#!/usr/bin/env python3
"""Deterministic path standards scanner for BMad skills.
Validates all .md and .json files against BMad path conventions:
1. {project-root} for any project-scope path (not just _bmad)
2. Bare _bmad references must have {project-root} prefix
3. Config variables used directly — no double-prefix with {project-root}
4. ./ only for same-folder references — never ./subdir/ cross-directory
5. No ../ parent directory references
6. No absolute paths
7. Frontmatter allows only name and description
8. No .md files at skill root except SKILL.md
"""
# /// script
# requires-python = ">=3.9"
# ///
from __future__ import annotations
import argparse
import json
import re
import sys
from datetime import datetime, timezone
from pathlib import Path
# Patterns to detect
# Double-prefix: {project-root}/{config-variable} — config vars already contain project-root
DOUBLE_PREFIX_RE = re.compile(r'\{project-root\}/\{[^}]+\}')
# Bare _bmad without {project-root} prefix: match _bmad followed by "/" or
# whitespace, but not when immediately preceded by {project-root}/
BARE_BMAD_RE = re.compile(r'(?<!\{project-root\}/)_bmad[/\s]')
# Absolute paths
ABSOLUTE_PATH_RE = re.compile(r'(?:^|[\s"`\'(])(/(?:Users|home|opt|var|tmp|etc|usr)/\S+)', re.MULTILINE)
HOME_PATH_RE = re.compile(r'(?:^|[\s"`\'(])(~/\S+)', re.MULTILINE)
# Parent directory reference (still invalid)
RELATIVE_DOT_RE = re.compile(r'(?:^|[\s"`\'(])(\.\./\S+)', re.MULTILINE)
# Cross-directory ./ — ./subdir/ is wrong because ./ means same folder only
CROSS_DIR_DOT_SLASH_RE = re.compile(r'(?:^|[\s"`\'(])\./(?:references|scripts|assets)/\S+', re.MULTILINE)
# Fenced code block detection (to skip examples showing wrong patterns)
FENCE_RE = re.compile(r'^```', re.MULTILINE)
# Valid frontmatter keys
VALID_FRONTMATTER_KEYS = {'name', 'description'}
def is_in_fenced_block(content: str, pos: int) -> bool:
"""Check if a position is inside a fenced code block."""
fences = [m.start() for m in FENCE_RE.finditer(content[:pos])]
# Odd number of fences before pos means we're inside a block
return len(fences) % 2 == 1
def get_line_number(content: str, pos: int) -> int:
"""Get 1-based line number for a position in content."""
return content[:pos].count('\n') + 1
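
The two helpers above can be exercised on a small sample to see how fence parity and line numbering interact. This is a sketch under assumptions: the sample text and file names are invented, and the fence pattern is written as a counted repetition only so that the example contains no literal fence of its own.

```python
import re

# Same idea as FENCE_RE in the scanner: a fence marker at the start of a line.
FENCE_RE = re.compile(r'^`{3}', re.MULTILINE)


def is_in_fenced_block(content: str, pos: int) -> bool:
    """An odd number of fence markers before pos means pos is inside a block."""
    return len(FENCE_RE.findall(content[:pos])) % 2 == 1


def get_line_number(content: str, pos: int) -> int:
    """1-based line number of a character offset."""
    return content[:pos].count('\n') + 1


fence = '`' * 3
sample = '\n'.join([
    'See ../outside.md',   # line 1: outside any fence
    fence + 'text',        # line 2: opening fence
    'Bad: ../inside.md',   # line 3: inside the fence
    fence,                 # line 4: closing fence
])

outside = sample.index('../outside.md')
inside = sample.index('../inside.md')
print(is_in_fenced_block(sample, outside), get_line_number(sample, outside))
print(is_in_fenced_block(sample, inside), get_line_number(sample, inside))
```

The parity test only recognizes backtick fences that begin in column one; indented or tilde fences would not be counted by this pattern.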
def check_frontmatter(content: str, filepath: Path) -> list[dict]:
"""Validate SKILL.md frontmatter contains only allowed keys."""
findings = []
if filepath.name != 'SKILL.md':
return findings
if not content.startswith('---'):
findings.append({
'file': filepath.name,
'line': 1,
'severity': 'critical',
'category': 'frontmatter',
'title': 'SKILL.md missing frontmatter block',
'detail': 'SKILL.md must start with --- frontmatter containing name and description',
'action': 'Add frontmatter with name and description fields',
})
return findings
# Find closing ---
end = content.find('\n---', 3)
if end == -1:
findings.append({
'file': filepath.name,
'line': 1,
'severity': 'critical',
'category': 'frontmatter',
'title': 'SKILL.md frontmatter block not closed',
'detail': 'Missing closing --- for frontmatter',
'action': 'Add closing --- after frontmatter fields',
})
return findings
frontmatter = content[4:end]
for i, line in enumerate(frontmatter.split('\n'), start=2):
line = line.strip()
if not line or line.startswith('#'):
continue
if ':' in line:
key = line.split(':', 1)[0].strip()
if key not in VALID_FRONTMATTER_KEYS:
findings.append({
'file': filepath.name,
'line': i,
'severity': 'high',
'category': 'frontmatter',
'title': f'Invalid frontmatter key: {key}',
'detail': f'Only {", ".join(sorted(VALID_FRONTMATTER_KEYS))} are allowed in frontmatter',
'action': f'Remove {key} from frontmatter — use as content field in SKILL.md body instead',
})
return findings
def check_root_md_files(skill_path: Path) -> list[dict]:
"""Check that no .md files exist at skill root except SKILL.md."""
findings = []
for md_file in skill_path.glob('*.md'):
if md_file.name != 'SKILL.md':
findings.append({
'file': md_file.name,
'line': 0,
'severity': 'high',
'category': 'structure',
'title': f'Prompt file at skill root: {md_file.name}',
'detail': 'All progressive disclosure content must be in references/; only SKILL.md belongs at root',
'action': f'Move {md_file.name} to references/{md_file.name}',
})
return findings
def scan_file(filepath: Path, skip_fenced: bool = True) -> list[dict]:
"""Scan a single file for path standard violations."""
findings = []
content = filepath.read_text(encoding='utf-8')
rel_path = filepath.name
checks = [
(DOUBLE_PREFIX_RE, 'double-prefix', 'critical',
'Double-prefix: {project-root}/{variable} — config variables already contain {project-root} at runtime'),
(ABSOLUTE_PATH_RE, 'absolute-path', 'high',
'Absolute path found — not portable across machines'),
(HOME_PATH_RE, 'absolute-path', 'high',
'Home directory path (~/) found — environment-specific'),
(RELATIVE_DOT_RE, 'relative-prefix', 'high',
'Parent directory reference (../) found — fragile, breaks with reorganization'),
(CROSS_DIR_DOT_SLASH_RE, 'cross-dir-dot-slash', 'high',
'Cross-directory ./ reference — ./ means same folder only; use bare skill-root relative path (e.g., references/foo.md not ./references/foo.md)'),
]
for pattern, category, severity, message in checks:
for match in pattern.finditer(content):
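# The leading delimiter class can consume the previous line's newline;
# skip it so the finding is reported on the line that holds the path.
pos = match.start() + (1 if match.group().startswith('\n') else 0)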
if skip_fenced and is_in_fenced_block(content, pos):
continue
line_num = get_line_number(content, pos)
line_content = content.split('\n')[line_num - 1].strip()
findings.append({
'file': rel_path,
'line': line_num,
'severity': severity,
'category': category,
'title': message,
'detail': line_content[:120],
'action': '',
})
# Bare _bmad check — more nuanced, need to avoid false positives
# inside {project-root}/_bmad which is correct
for match in BARE_BMAD_RE.finditer(content):
pos = match.start()
if skip_fenced and is_in_fenced_block(content, pos):
continue
start = max(0, pos - 30)
before = content[start:pos]
if '{project-root}/' in before:
continue
line_num = get_line_number(content, pos)
line_content = content.split('\n')[line_num - 1].strip()
findings.append({
'file': rel_path,
'line': line_num,
'severity': 'high',
'category': 'bare-bmad',
'title': 'Bare _bmad reference without {project-root} prefix',
'detail': line_content[:120],
'action': '',
})
return findings
def scan_skill(skill_path: Path, skip_fenced: bool = True) -> dict:
"""Scan all .md and .json files in a skill directory."""
all_findings = []
# Check for .md files at root that aren't SKILL.md
all_findings.extend(check_root_md_files(skill_path))
# Check SKILL.md frontmatter
skill_md = skill_path / 'SKILL.md'
if skill_md.exists():
content = skill_md.read_text(encoding='utf-8')
all_findings.extend(check_frontmatter(content, skill_md))
# Find all .md and .json files
md_files = sorted(list(skill_path.rglob('*.md')) + list(skill_path.rglob('*.json')))
if not md_files:
print(f"Warning: No .md or .json files found in {skill_path}", file=sys.stderr)
files_scanned = []
for md_file in md_files:
rel = md_file.relative_to(skill_path)
files_scanned.append(str(rel))
file_findings = scan_file(md_file, skip_fenced)
for f in file_findings:
f['file'] = str(rel)
all_findings.extend(file_findings)
# Build summary
by_severity = {'critical': 0, 'high': 0, 'medium': 0, 'low': 0}
by_category = {
'double_prefix': 0,
'bare_bmad': 0,
'absolute_path': 0,
'relative_prefix': 0,
'cross_dir_dot_slash': 0,
'frontmatter': 0,
'structure': 0,
}
for f in all_findings:
sev = f['severity']
if sev in by_severity:
by_severity[sev] += 1
cat = f['category'].replace('-', '_')
if cat in by_category:
by_category[cat] += 1
return {
'scanner': 'path-standards',
'script': 'scan-path-standards.py',
'version': '3.0.0',
'skill_path': str(skill_path),
'timestamp': datetime.now(timezone.utc).isoformat(),
'files_scanned': files_scanned,
'status': 'pass' if not all_findings else 'fail',
'findings': all_findings,
'assessments': {},
'summary': {
'total_findings': len(all_findings),
'by_severity': by_severity,
'by_category': by_category,
'assessment': 'Path standards scan complete',
},
}
def main() -> int:
parser = argparse.ArgumentParser(
description='Scan BMad skill for path standard violations',
)
parser.add_argument(
'skill_path',
type=Path,
help='Path to the skill directory to scan',
)
parser.add_argument(
'--output', '-o',
type=Path,
help='Write JSON output to file instead of stdout',
)
parser.add_argument(
'--include-fenced',
action='store_true',
help='Also check inside fenced code blocks (by default they are skipped)',
)
args = parser.parse_args()
if not args.skill_path.is_dir():
print(f"Error: {args.skill_path} is not a directory", file=sys.stderr)
return 2
result = scan_skill(args.skill_path, skip_fenced=not args.include_fenced)
output = json.dumps(result, indent=2)
if args.output:
args.output.parent.mkdir(parents=True, exist_ok=True)
args.output.write_text(output)
print(f"Results written to {args.output}", file=sys.stderr)
else:
print(output)
return 0 if result['status'] == 'pass' else 1
if __name__ == '__main__':
sys.exit(main())
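
To see what the path patterns accept and reject, the sketch below reuses three of the scanner's regular expressions on invented sample lines. The sample paths, the `{output_folder}` variable, and the `violations` helper are assumptions made for illustration only.

```python
import re

# Copied from the scanner above.
DOUBLE_PREFIX_RE = re.compile(r'\{project-root\}/\{[^}]+\}')
BARE_BMAD_RE = re.compile(r'(?<!\{project-root\}/)_bmad[/\s]')
RELATIVE_DOT_RE = re.compile(r'(?:^|[\s"`\'(])(\.\./\S+)', re.MULTILINE)

# Hypothetical lines such as might appear in a skill's markdown files.
samples = {
    'ok': 'Load {project-root}/_bmad/config.yaml',
    'bare': 'Load _bmad/config.yaml',
    'double': 'Write to {project-root}/{output_folder}/report.md',
    'parent': 'See ../shared/notes.md',
}


def violations(text: str) -> list:
    """Return the category names whose pattern matches the text."""
    found = []
    if DOUBLE_PREFIX_RE.search(text):
        found.append('double-prefix')
    if BARE_BMAD_RE.search(text):
        found.append('bare-bmad')
    if RELATIVE_DOT_RE.search(text):
        found.append('relative-prefix')
    return found


for name, text in samples.items():
    print(name, violations(text))
```

The negative lookbehind in `BARE_BMAD_RE` is what lets the first sample pass: `_bmad/` is only flagged when the fifteen characters before it are not `{project-root}/`.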


@@ -0,0 +1,745 @@
#!/usr/bin/env python3
"""Deterministic scripts scanner for BMad skills.
Validates scripts in a skill's scripts/ folder for:
- PEP 723 inline dependencies (Python)
- Shebang, set -e, portability (Shell)
- Version pinning for npx/uvx
- Agentic design: no input(), has argparse/--help, JSON output, exit codes
- Unit test existence
- Over-engineering signals (line count, simple-op imports)
- External lint: ruff (Python), shellcheck (Bash), biome (JS/TS)
"""
# /// script
# requires-python = ">=3.9"
# ///
from __future__ import annotations
import argparse
import ast
import json
import re
import shutil
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
# =============================================================================
# External Linter Integration
# =============================================================================
def _run_command(cmd: list[str], timeout: int = 30) -> tuple[int, str, str]:
"""Run a command and return (returncode, stdout, stderr)."""
try:
result = subprocess.run(
cmd, capture_output=True, text=True, timeout=timeout,
)
return result.returncode, result.stdout, result.stderr
except FileNotFoundError:
return -1, '', f'Command not found: {cmd[0]}'
except subprocess.TimeoutExpired:
return -2, '', f'Command timed out after {timeout}s: {" ".join(cmd)}'
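
The sentinel return codes of `_run_command` can be tried without any linter installed. This is a minimal sketch: the function is restated under the name `run_command`, and the missing binary name is made up for the example.

```python
import subprocess
import sys


def run_command(cmd, timeout=30):
    """Run a command and return (returncode, stdout, stderr).

    -1 means the executable was not found, -2 means it timed out.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout,
        )
        return result.returncode, result.stdout, result.stderr
    except FileNotFoundError:
        return -1, '', f'Command not found: {cmd[0]}'
    except subprocess.TimeoutExpired:
        return -2, '', f'Command timed out after {timeout}s: {" ".join(cmd)}'


# A command that succeeds, one that does not exist, and one that times out.
ok = run_command([sys.executable, '-c', 'print("ok")'])
missing = run_command(['bmad-example-missing-binary'])
slow = run_command([sys.executable, '-c', 'import time; time.sleep(5)'], timeout=1)
print(ok[0], missing[0], slow[0])
```

On POSIX, `subprocess` reports a child terminated by signal N as return code -N, so the -1 and -2 sentinels could in principle be confused with a signal exit; the sketch keeps the original values for fidelity.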
def _find_uv() -> str | None:
"""Find uv binary on PATH."""
return shutil.which('uv')
def _find_npx() -> str | None:
"""Find npx binary on PATH."""
return shutil.which('npx')
def lint_python_ruff(filepath: Path, rel_path: str) -> list[dict]:
"""Run ruff on a Python file via uv. Returns lint findings."""
uv = _find_uv()
if not uv:
return [{
'file': rel_path, 'line': 0,
'severity': 'high', 'category': 'lint-setup',
'title': 'uv not found on PATH — cannot run ruff for Python linting',
'detail': '',
'action': 'Install uv: https://docs.astral.sh/uv/getting-started/installation/',
}]
rc, stdout, stderr = _run_command([
uv, 'run', 'ruff', 'check', '--output-format', 'json', str(filepath),
])
if rc == -1:
return [{
'file': rel_path, 'line': 0,
'severity': 'high', 'category': 'lint-setup',
'title': f'Failed to run ruff via uv: {stderr.strip()}',
'detail': '',
'action': 'Ensure uv can install and run ruff: uv run ruff --version',
}]
if rc == -2:
return [{
'file': rel_path, 'line': 0,
'severity': 'medium', 'category': 'lint',
'title': f'ruff timed out on {rel_path}',
'detail': '',
'action': '',
}]
# ruff outputs JSON array on stdout (even on rc=1 when issues found)
findings = []
try:
issues = json.loads(stdout) if stdout.strip() else []
except json.JSONDecodeError:
return [{
'file': rel_path, 'line': 0,
'severity': 'medium', 'category': 'lint',
'title': f'Failed to parse ruff output for {rel_path}',
'detail': '',
'action': '',
}]
for issue in issues:
fix_msg = issue.get('fix', {}).get('message', '') if issue.get('fix') else ''
findings.append({
'file': rel_path,
'line': issue.get('location', {}).get('row', 0),
'severity': 'high',
'category': 'lint',
'title': f'[{issue.get("code", "?")}] {issue.get("message", "")}',
'detail': '',
'action': fix_msg or f'See https://docs.astral.sh/ruff/rules/{issue.get("code", "")}',
})
return findings
def lint_shell_shellcheck(filepath: Path, rel_path: str) -> list[dict]:
"""Run shellcheck on a shell script via uv. Returns lint findings."""
uv = _find_uv()
if not uv:
return [{
'file': rel_path, 'line': 0,
'severity': 'high', 'category': 'lint-setup',
'title': 'uv not found on PATH — cannot run shellcheck for shell linting',
'detail': '',
'action': 'Install uv: https://docs.astral.sh/uv/getting-started/installation/',
}]
rc, stdout, stderr = _run_command([
uv, 'run', '--with', 'shellcheck-py',
'shellcheck', '--format', 'json', str(filepath),
])
if rc == -1:
return [{
'file': rel_path, 'line': 0,
'severity': 'high', 'category': 'lint-setup',
'title': f'Failed to run shellcheck via uv: {stderr.strip()}',
'detail': '',
'action': 'Ensure uv can install shellcheck-py: uv run --with shellcheck-py shellcheck --version',
}]
if rc == -2:
return [{
'file': rel_path, 'line': 0,
'severity': 'medium', 'category': 'lint',
'title': f'shellcheck timed out on {rel_path}',
'detail': '',
'action': '',
}]
findings = []
# shellcheck outputs JSON on stdout (rc=1 when issues found)
raw = stdout.strip() or stderr.strip()
try:
issues = json.loads(raw) if raw else []
except json.JSONDecodeError:
return [{
'file': rel_path, 'line': 0,
'severity': 'medium', 'category': 'lint',
'title': f'Failed to parse shellcheck output for {rel_path}',
'detail': '',
'action': '',
}]
# Map shellcheck levels to our severity
level_map = {'error': 'high', 'warning': 'high', 'info': 'high', 'style': 'medium'}
for issue in issues:
sc_code = issue.get('code', '')
findings.append({
'file': rel_path,
'line': issue.get('line', 0),
'severity': level_map.get(issue.get('level', ''), 'high'),
'category': 'lint',
'title': f'[SC{sc_code}] {issue.get("message", "")}',
'detail': '',
'action': f'See https://www.shellcheck.net/wiki/SC{sc_code}',
})
return findings
def lint_node_biome(filepath: Path, rel_path: str) -> list[dict]:
"""Run biome on a JS/TS file via npx. Returns lint findings."""
npx = _find_npx()
if not npx:
return [{
'file': rel_path, 'line': 0,
'severity': 'high', 'category': 'lint-setup',
'title': 'npx not found on PATH — cannot run biome for JS/TS linting',
'detail': '',
'action': 'Install Node.js 20+: https://nodejs.org/',
}]
rc, stdout, stderr = _run_command([
npx, '--yes', '@biomejs/biome', 'lint', '--reporter', 'json', str(filepath),
], timeout=60)
if rc == -1:
return [{
'file': rel_path, 'line': 0,
'severity': 'high', 'category': 'lint-setup',
'title': f'Failed to run biome via npx: {stderr.strip()}',
'detail': '',
'action': 'Ensure npx can run biome: npx @biomejs/biome --version',
}]
if rc == -2:
return [{
'file': rel_path, 'line': 0,
'severity': 'medium', 'category': 'lint',
'title': f'biome timed out on {rel_path}',
'detail': '',
'action': '',
}]
findings = []
# biome outputs JSON on stdout
raw = stdout.strip()
try:
result = json.loads(raw) if raw else {}
except json.JSONDecodeError:
return [{
'file': rel_path, 'line': 0,
'severity': 'medium', 'category': 'lint',
'title': f'Failed to parse biome output for {rel_path}',
'detail': '',
'action': '',
}]
for diag in result.get('diagnostics', []):
loc = diag.get('location', {})
start = loc.get('start', {})
findings.append({
'file': rel_path,
'line': start.get('line', 0),
'severity': 'high',
'category': 'lint',
'title': f'[{diag.get("category", "?")}] {diag.get("message", "")}',
'detail': '',
'action': diag.get('advices', [{}])[0].get('message', '') if diag.get('advices') else '',
})
return findings
# =============================================================================
# BMad Pattern Checks (Existing)
# =============================================================================
def scan_python_script(filepath: Path, rel_path: str) -> list[dict]:
"""Check a Python script for standards compliance."""
findings = []
content = filepath.read_text(encoding='utf-8')
lines = content.split('\n')
line_count = len(lines)
# PEP 723 check
if '# /// script' not in content:
# Only flag if the script has imports (not a trivial script)
if 'import ' in content:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'dependencies',
'title': 'No PEP 723 inline dependency block (# /// script)',
'detail': '',
'action': 'Add PEP 723 block with requires-python and dependencies',
})
else:
# Check requires-python is present
if 'requires-python' not in content:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'low', 'category': 'dependencies',
'title': 'PEP 723 block exists but missing requires-python constraint',
'detail': '',
'action': 'Add requires-python = ">=3.9" or appropriate version',
})
# requirements.txt reference
if 'requirements.txt' in content or 'pip install' in content:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'high', 'category': 'dependencies',
'title': 'References requirements.txt or pip install — use PEP 723 inline deps',
'detail': '',
'action': 'Replace with PEP 723 inline dependency block',
})
# Agentic design checks via AST
try:
tree = ast.parse(content)
except SyntaxError:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'critical', 'category': 'error-handling',
'title': 'Python syntax error — script cannot be parsed',
'detail': '',
'action': '',
})
return findings
has_argparse = False
has_json_dumps = False
has_sys_exit = False
imports = set()
for node in ast.walk(tree):
# Track imports
if isinstance(node, ast.Import):
for alias in node.names:
imports.add(alias.name)
elif isinstance(node, ast.ImportFrom):
if node.module:
imports.add(node.module)
# input() calls
if isinstance(node, ast.Call):
func = node.func
if isinstance(func, ast.Name) and func.id == 'input':
findings.append({
'file': rel_path, 'line': node.lineno,
'severity': 'critical', 'category': 'agentic-design',
'title': 'input() call found — blocks in non-interactive agent execution',
'detail': '',
'action': 'Use argparse with required flags instead of interactive prompts',
})
# json.dumps
if isinstance(func, ast.Attribute) and func.attr == 'dumps':
has_json_dumps = True
# sys.exit
if isinstance(func, ast.Attribute) and func.attr == 'exit':
has_sys_exit = True
if isinstance(func, ast.Name) and func.id == 'exit':
has_sys_exit = True
# argparse
if isinstance(node, ast.Attribute) and node.attr == 'ArgumentParser':
has_argparse = True
if not has_argparse and line_count > 20:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'agentic-design',
'title': 'No argparse found — script lacks --help self-documentation',
'detail': '',
'action': 'Add argparse with description and argument help text',
})
if not has_json_dumps and line_count > 20:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'agentic-design',
'title': 'No json.dumps found — output may not be structured JSON',
'detail': '',
'action': 'Use json.dumps for structured output parseable by workflows',
})
if not has_sys_exit and line_count > 20:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'low', 'category': 'agentic-design',
'title': 'No sys.exit() calls — may not return meaningful exit codes',
'detail': '',
'action': 'Return 0=success, 1=fail, 2=error via sys.exit()',
})
# Over-engineering: simple file ops in Python
simple_op_imports = {'shutil', 'glob', 'fnmatch'}
over_eng = imports & simple_op_imports
if over_eng and line_count < 30:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'low', 'category': 'over-engineered',
'title': f'Short script ({line_count} lines) imports {", ".join(over_eng)} — may be simpler as bash',
'detail': '',
'action': 'Consider if cp/mv/find shell commands would suffice',
})
# Very short script
if line_count < 5:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'over-engineered',
'title': f'Script is only {line_count} lines — could be an inline command',
'detail': '',
'action': 'Consider inlining this command directly in the prompt',
})
return findings
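
The `input()` check in `scan_python_script` relies on walking the syntax tree rather than searching text, which avoids false positives from strings and comments. The sketch below isolates that step; `find_input_calls` and the sample source are hypothetical names introduced for illustration.

```python
from __future__ import annotations

import ast


def find_input_calls(source: str) -> list[int]:
    """Return the line numbers of direct calls to the builtin input()."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == 'input'
    )


sample = (
    "import sys\n"
    "name = input('Name: ')\n"
    "text = 'input() inside a string is not a call'\n"
    "print(name, file=sys.stderr)\n"
)
print(find_input_calls(sample))
```

Only line 2 is reported, because the occurrence on line 3 is a string literal. A call made through an alias (for example `ask = input; ask()`) would not be caught by this name-based test.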
def scan_shell_script(filepath: Path, rel_path: str) -> list[dict]:
"""Check a shell script for standards compliance."""
findings = []
content = filepath.read_text(encoding='utf-8')
lines = content.split('\n')
line_count = len(lines)
# Shebang
if not lines[0].startswith('#!'):
findings.append({
'file': rel_path, 'line': 1,
'severity': 'high', 'category': 'portability',
'title': 'Missing shebang line',
'detail': '',
'action': 'Add #!/usr/bin/env bash or #!/usr/bin/env sh',
})
elif '/usr/bin/env' not in lines[0]:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'portability',
'title': f'Shebang uses hardcoded path: {lines[0].strip()}',
'detail': '',
'action': 'Use #!/usr/bin/env bash for cross-platform compatibility',
})
# set -e
if 'set -e' not in content:  # substring test also covers set -eu and set -euo pipefail
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'error-handling',
'title': 'Missing set -e — errors will be silently ignored',
'detail': '',
'action': 'Add set -e (or set -euo pipefail) near the top',
})
# Hardcoded interpreter paths
hardcoded_re = re.compile(r'/usr/bin/(python|ruby|node|perl)\b')
for i, line in enumerate(lines, 1):
if hardcoded_re.search(line):
findings.append({
'file': rel_path, 'line': i,
'severity': 'medium', 'category': 'portability',
'title': f'Hardcoded interpreter path: {line.strip()}',
'detail': '',
'action': 'Use /usr/bin/env or PATH-based lookup',
})
# GNU-only tools
gnu_re = re.compile(r'\b(gsed|gawk|ggrep|gfind)\b')
for i, line in enumerate(lines, 1):
m = gnu_re.search(line)
if m:
findings.append({
'file': rel_path, 'line': i,
'severity': 'medium', 'category': 'portability',
'title': f'GNU-only tool: {m.group()} — not available on all platforms',
'detail': '',
'action': 'Use POSIX-compatible equivalent',
})
# Unquoted variables (basic check)
unquoted_re = re.compile(r'(?<!")\$\w+(?!")')
for i, line in enumerate(lines, 1):
if line.strip().startswith('#'):
continue
for m in unquoted_re.finditer(line):
# Skip inside double-quoted strings (rough heuristic)
before = line[:m.start()]
if before.count('"') % 2 == 1:
continue
findings.append({
'file': rel_path, 'line': i,
'severity': 'low', 'category': 'portability',
'title': f'Potentially unquoted variable: {m.group()} — breaks with spaces in paths',
'detail': '',
'action': f'Use "{m.group()}" with double quotes',
})
# npx/uvx without version pinning
no_pin_re = re.compile(r'\b(npx|uvx)\s+([a-zA-Z][\w-]+)(?!\S*@)')
for i, line in enumerate(lines, 1):
if line.strip().startswith('#'):
continue
m = no_pin_re.search(line)
if m:
findings.append({
'file': rel_path, 'line': i,
'severity': 'medium', 'category': 'dependencies',
'title': f'{m.group(1)} {m.group(2)} without version pinning',
'detail': '',
'action': f'Pin version: {m.group(1)} {m.group(2)}@<version>',
})
# Very short script
if line_count < 5:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'over-engineered',
'title': f'Script is only {line_count} lines — could be an inline command',
'detail': '',
'action': 'Consider inlining this command directly in the prompt',
})
return findings
def scan_node_script(filepath: Path, rel_path: str) -> list[dict]:
"""Check a JS/TS script for standards compliance."""
findings = []
content = filepath.read_text(encoding='utf-8')
lines = content.split('\n')
line_count = len(lines)
# npx/uvx without version pinning
no_pin = re.compile(r'\b(npx|uvx)\s+([a-zA-Z][\w-]+)(?!\S*@)')
for i, line in enumerate(lines, 1):
m = no_pin.search(line)
if m:
findings.append({
'file': rel_path, 'line': i,
'severity': 'medium', 'category': 'dependencies',
'title': f'{m.group(1)} {m.group(2)} without version pinning',
'detail': '',
'action': f'Pin version: {m.group(1)} {m.group(2)}@<version>',
})
# Very short script
if line_count < 5:
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'over-engineered',
'title': f'Script is only {line_count} lines — could be an inline command',
'detail': '',
'action': 'Consider inlining this command directly in the prompt',
})
return findings
# =============================================================================
# Main Scanner
# =============================================================================
def scan_skill_scripts(skill_path: Path) -> dict:
"""Scan all scripts in a skill directory."""
scripts_dir = skill_path / 'scripts'
all_findings = []
lint_findings = []
script_inventory = {'python': [], 'shell': [], 'node': [], 'other': []}
missing_tests = []
if not scripts_dir.exists():
return {
'scanner': 'scripts',
'script': 'scan-scripts.py',
'version': '2.0.0',
'skill_path': str(skill_path),
'timestamp': datetime.now(timezone.utc).isoformat(),
'status': 'pass',
'findings': [{
'file': 'scripts/',
'severity': 'info',
'category': 'none',
'title': 'No scripts/ directory found — nothing to scan',
'detail': '',
'action': '',
}],
'assessments': {
'lint_summary': {
'tools_used': [],
'files_linted': 0,
'lint_issues': 0,
},
'script_summary': {
'total_scripts': 0,
'by_type': script_inventory,
'missing_tests': [],
},
},
'summary': {
'total_findings': 0,
'by_severity': {'critical': 0, 'high': 0, 'medium': 0, 'low': 0},
'assessment': '',
},
}
# Find all script files (exclude tests/ and __pycache__)
script_files = []
for f in sorted(scripts_dir.iterdir()):
if f.is_file() and f.suffix in ('.py', '.sh', '.bash', '.js', '.ts', '.mjs'):
script_files.append(f)
tests_dir = scripts_dir / 'tests'
lint_tools_used = set()
for script_file in script_files:
rel_path = f'scripts/{script_file.name}'
ext = script_file.suffix
if ext == '.py':
script_inventory['python'].append(script_file.name)
findings = scan_python_script(script_file, rel_path)
lf = lint_python_ruff(script_file, rel_path)
lint_findings.extend(lf)
if not any(f['category'] == 'lint-setup' for f in lf):  # a clean run still counts as using ruff
lint_tools_used.add('ruff')
elif ext in ('.sh', '.bash'):
script_inventory['shell'].append(script_file.name)
findings = scan_shell_script(script_file, rel_path)
lf = lint_shell_shellcheck(script_file, rel_path)
lint_findings.extend(lf)
if not any(f['category'] == 'lint-setup' for f in lf):  # a clean run still counts as using shellcheck
lint_tools_used.add('shellcheck')
elif ext in ('.js', '.ts', '.mjs'):
script_inventory['node'].append(script_file.name)
findings = scan_node_script(script_file, rel_path)
lf = lint_node_biome(script_file, rel_path)
lint_findings.extend(lf)
if not any(f['category'] == 'lint-setup' for f in lf):  # a clean run still counts as using biome
lint_tools_used.add('biome')
else:
script_inventory['other'].append(script_file.name)
findings = []
# Check for unit tests
if tests_dir.exists():
stem = script_file.stem
test_patterns = [
f'test_{stem}{ext}', f'test-{stem}{ext}',
f'{stem}_test{ext}', f'{stem}-test{ext}',
f'test_{stem}.py', f'test-{stem}.py',
]
has_test = any((tests_dir / t).exists() for t in test_patterns)
else:
has_test = False
if not has_test:
missing_tests.append(script_file.name)
findings.append({
'file': rel_path, 'line': 1,
'severity': 'medium', 'category': 'tests',
'title': f'No unit test found for {script_file.name}',
'detail': '',
'action': f'Create scripts/tests/test-{script_file.stem}{ext} with test cases',
})
all_findings.extend(findings)
# Check if tests/ directory exists at all
if script_files and not tests_dir.exists():
all_findings.append({
'file': 'scripts/tests/',
'line': 0,
'severity': 'high',
'category': 'tests',
'title': 'scripts/tests/ directory does not exist — no unit tests',
'detail': '',
'action': 'Create scripts/tests/ with test files for each script',
})
# Merge lint findings into all findings
all_findings.extend(lint_findings)
# Build summary
by_severity = {'critical': 0, 'high': 0, 'medium': 0, 'low': 0}
by_category: dict[str, int] = {}
for f in all_findings:
sev = f['severity']
if sev in by_severity:
by_severity[sev] += 1
cat = f['category']
by_category[cat] = by_category.get(cat, 0) + 1
total_scripts = sum(len(v) for v in script_inventory.values())
status = 'pass'
if by_severity['critical'] > 0:
status = 'fail'
elif by_severity['high'] > 0:
status = 'warning'
elif total_scripts == 0:
status = 'pass'
lint_issue_count = sum(1 for f in lint_findings if f['category'] == 'lint')
return {
'scanner': 'scripts',
'script': 'scan-scripts.py',
'version': '2.0.0',
'skill_path': str(skill_path),
'timestamp': datetime.now(timezone.utc).isoformat(),
'status': status,
'findings': all_findings,
'assessments': {
'lint_summary': {
'tools_used': sorted(lint_tools_used),
'files_linted': total_scripts,
'lint_issues': lint_issue_count,
},
'script_summary': {
'total_scripts': total_scripts,
'by_type': {k: len(v) for k, v in script_inventory.items()},
'scripts': {k: v for k, v in script_inventory.items() if v},
'missing_tests': missing_tests,
},
},
'summary': {
'total_findings': len(all_findings),
'by_severity': by_severity,
'by_category': by_category,
'assessment': '',
},
}
def main() -> int:
parser = argparse.ArgumentParser(
description='Scan BMad skill scripts for quality, portability, agentic design, and lint issues',
)
parser.add_argument(
'skill_path',
type=Path,
help='Path to the skill directory to scan',
)
parser.add_argument(
'--output', '-o',
type=Path,
help='Write JSON output to file instead of stdout',
)
args = parser.parse_args()
if not args.skill_path.is_dir():
print(f"Error: {args.skill_path} is not a directory", file=sys.stderr)
return 2
result = scan_skill_scripts(args.skill_path)
output = json.dumps(result, indent=2)
if args.output:
args.output.parent.mkdir(parents=True, exist_ok=True)
args.output.write_text(output)
print(f"Results written to {args.output}", file=sys.stderr)
else:
print(output)
return 0 if result['status'] == 'pass' else 1
if __name__ == '__main__':
sys.exit(main())
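
The version-pinning heuristic used for shell and Node scripts can be checked against a few command lines. This is a sketch only: the regular expression is copied from the scanner, while the `unpinned` helper, the package names, and the version numbers are illustrative assumptions.

```python
import re

# Copied from the scanner: npx/uvx followed by a package name with no "@version".
NO_PIN_RE = re.compile(r'\b(npx|uvx)\s+([a-zA-Z][\w-]+)(?!\S*@)')


def unpinned(line: str):
    """Return 'tool package' when the invocation lacks a version pin, else None."""
    m = NO_PIN_RE.search(line)
    return f'{m.group(1)} {m.group(2)}' if m else None


for line in (
    'npx prettier --check .',
    'npx prettier@3.3.3 --check .',
    'uvx ruff check',
    'npx --yes prettier',
):
    print(repr(line), '->', unpinned(line))
```

The last line shows a limitation of the heuristic: when a flag such as `--yes` sits between the tool and the package, the pattern does not match, so an unpinned package goes unreported. Scoped packages that begin with `@` are skipped for the same reason.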