From 7652faafa44a02aacd05491999788630f58a02a6 Mon Sep 17 00:00:00 2001 From: igerber Date: Sun, 1 Feb 2026 16:39:30 -0500 Subject: [PATCH 1/3] Add /review-plan skill for staff-engineer-level plan review New read-only skill that evaluates Claude Code plan files across 8 dimensions: completeness, codebase correctness, scope, edge cases, architecture, execution risks, backward compatibility, and testing strategy. Outputs structured feedback with severity-graded issues, checklist gaps, and a verdict. Co-Authored-By: Claude Opus 4.5 --- .claude/commands/review-plan.md | 211 ++++++++++++++++++++++++++++++++ 1 file changed, 211 insertions(+) create mode 100644 .claude/commands/review-plan.md diff --git a/.claude/commands/review-plan.md b/.claude/commands/review-plan.md new file mode 100644 index 0000000..5f358af --- /dev/null +++ b/.claude/commands/review-plan.md @@ -0,0 +1,211 @@ +--- +description: Review a Claude Code plan file from a staff engineer perspective +argument-hint: "" +--- + +# Review Plan + +Review a Claude Code plan file from a staff engineer perspective and provide structured feedback across 8 dimensions. + +## Arguments + +`$ARGUMENTS` should contain: +- **Plan file path** (required): Path to the plan file, e.g., `~/.claude/plans/dreamy-coalescing-brook.md` + +If no path is provided, use AskUserQuestion to request it: +``` +Which plan file would you like me to review? + +Options: +1. Enter the path (e.g., ~/.claude/plans/plan-name.md) +``` + +## Constraints + +- **Read-only**: Do NOT create, edit, or delete any files. This skill only reads and reports. +- **Advisory-only**: Provide feedback and recommendations. Do not implement fixes. +- **No code changes**: Do not modify any source code, test files, or documentation. +- Use the Read tool for files and the Glob/Grep tools for searching. Do not use Write, Edit, NotebookEdit, or any file-modifying Bash commands. + +## Instructions + +### Step 1: Read the Plan File + +Read the plan file at the path provided in `$ARGUMENTS`. + +If the file does not exist, report the error and stop: +``` +Error: Plan file not found at +``` + +### Step 2: Read CLAUDE.md for Project Context + +Read the project's `CLAUDE.md` file to understand: +- Module structure and architecture +- Estimator inheritance map +- Development checklists (adding parameters, methodology-critical code, etc.) +- Test structure and conventions +- Design patterns (sklearn-like API, formula interface, results objects, etc.) + +### Step 3: Read Referenced Files + +Identify all files the plan references (file paths, module names, class names). Then read them to validate the plan's assumptions: + +**Priority order for reading files:** +1. **Files the plan proposes to modify** — read ALL of these first +2. **Files referenced for context** (imports, call sites, existing patterns) — read selectively to verify specific claims + +**What to verify:** +- File paths exist +- Class names, function signatures, and method names match what the plan describes +- Line numbers (if referenced) are accurate +- The plan's description of existing code matches reality + +If the plan references more than ~15 files, use judgment: read all files slated for modification, then spot-check context files as needed rather than reading every one. + +### Step 4: Evaluate Across 8 Dimensions + +#### Dimension 1: Completeness & Executability + +Could a fresh Claude Code session — with no access to the conversation history that produced this plan — execute it without asking clarifying questions? + +Check for: +- Are all file paths explicit? (No "the relevant file" or "the test file") +- Are code changes described concretely? (Function signatures, parameter names, not just "add a method") +- Are decision points resolved, not deferred? ("We'll figure out the API later" is a red flag) +- Are there implicit assumptions that require conversation context to understand? + +#### Dimension 2: Codebase Correctness + +Do file paths, class names, function signatures, and line-number references in the plan match the actual codebase? + +Use your findings from Step 3. Flag: +- File paths that don't exist +- Function/class names that are misspelled or don't exist +- Line numbers that point to the wrong code +- Descriptions of existing behavior that don't match reality + +#### Dimension 3: Scope + +Is the scope right — not too much, not too little? + +Check for **missing related changes**: +- Tests for new/changed functionality +- `__init__.py` export updates +- `get_params()` / `set_params()` updates for new parameters +- Documentation updates (README, RST, tutorials, CLAUDE.md) +- For bug fixes: did the plan grep for ALL occurrences of the pattern, or just the one reported? + +Check for **unnecessary additions**: +- Docstrings/comments/type annotations for untouched code +- Premature abstractions or over-engineering +- Feature flags or backward-compatibility shims when the code can just be changed + +#### Dimension 4: Edge Cases & Failure Modes + +For methodology-critical code: +- NaN propagation through ALL inference fields (SE, t-stat, p-value, CI) +- Empty inputs / empty result sets +- Boundary conditions (single observation, single group, etc.) + +For all code: +- Error handling paths — are they tested with behavioral assertions (not just "runs without exception")? +- What happens when the feature interacts with other parameters/modes? + +#### Dimension 5: Architecture & Patterns + +Check against CLAUDE.md conventions: +- Does it respect the estimator inheritance map? (Adding a param to `DifferenceInDifferences` auto-propagates to `TwoWayFixedEffects` and `MultiPeriodDiD`; standalone estimators need individual updates) +- Does it use `linalg.py` for OLS/variance instead of reimplementing? +- Does it follow the sklearn-like `fit()` / results-object pattern? +- Is there a simpler alternative that avoids new abstraction? +- Does it match existing code patterns in the codebase? + +#### Dimension 6: Plan Execution Risks + +Plan-specific failure modes that wouldn't show up in a code review: + +- **Ordering issues**: Does the plan propose changes in an order that would break things mid-implementation? (e.g., changing an import before the module it imports from exists, deleting a function before updating its callers) +- **Ambiguous decision points**: Does the plan defer decisions that should be made now? Vague phrases like "choose an appropriate approach" or "handle edge cases" without specifying which ones +- **Missing rollback path**: For risky changes (public API modifications, data format changes), does the plan consider what happens if something goes wrong? +- **Implicit dependencies**: Does step N assume step M was completed, but this ordering isn't stated? + +#### Dimension 7: Backward Compatibility & API Risk + +- Does the plan add, remove, or rename public API surface (parameters, methods, classes)? +- If so, does it acknowledge the breaking change and state the versioning decision (deprecation period vs clean removal)? +- Downstream effects on: + - Convenience functions + - Re-exports in `__init__.py` + - Existing tutorials and documentation + - User code that may depend on the current API + +#### Dimension 8: Testing Strategy + +- Are tests included in the plan? Do they cover the happy path AND the edge cases from Dimension 4? +- Are test assertions behavioral (checking outcomes) rather than just "runs without exception"? +- For bug fixes: does the plan fix all pattern instances and test all of them? +- Are there missing test scenarios? (Parameter interactions, error paths, boundary conditions) + +### Step 5: Present Structured Feedback + +Present the review in the following format. Do NOT skip any section — if a section has no findings, write "None." for that section. + +``` +## Overall Assessment + +[2-3 sentences: what the plan does, whether it's ready for implementation, and the biggest concern if any] + +--- + +## Issues + +### CRITICAL +[Issues that would cause implementation failure, incorrect results, or breaking changes if not addressed. Each issue should include: file path and/or line number if applicable, what's wrong, and a concrete suggestion for fixing it.] + +### MEDIUM +[Issues that should be addressed but won't block implementation. Missing test cases, incomplete documentation updates, scope gaps.] + +### LOW +[Minor suggestions. Style consistency, optional improvements, things to consider.] + +--- + +## Checklist Gaps + +Cross-reference against the relevant CLAUDE.md checklists. List which checklist items are not addressed by the plan. + +[Identify which CLAUDE.md checklist applies (e.g., "Adding a New Parameter to Estimators", "Implementing Methodology-Critical Code", "Fixing Bugs Across Multiple Locations") and list any items from that checklist that the plan doesn't cover.] + +--- + +## Questions for the Author + +[Ambiguities or missing information that should be clarified before implementation begins. Phrase as specific questions.] + +--- + +## Summary + +| Category | Issues | +|----------|--------| +| Critical | [count] | +| Medium | [count] | +| Low | [count] | +| Checklist gaps | [count] | +| Questions | [count] | + +**Verdict**: [Ready / Ready with minor fixes / Needs revision] + +- **Ready**: No critical issues, few or no medium issues +- **Ready with minor fixes**: No critical issues, some medium issues that are straightforward to address +- **Needs revision**: Has critical issues or many medium issues that require rethinking the approach +``` + +## Notes + +- This skill is strictly read-only — it does not create, edit, or delete any files +- Plan files are typically located in `~/.claude/plans/` +- The review is displayed directly in the conversation, not saved to a file +- For best results, run this before implementing a plan to catch issues early +- The 8 dimensions are tuned for plan-specific failure modes, not generic code review From abc068628000ab08dbad8898dc5222d59b7d5212 Mon Sep 17 00:00:00 2001 From: igerber Date: Sun, 1 Feb 2026 16:47:15 -0500 Subject: [PATCH 2/3] Add repo-scope restriction to /review-plan file reading Addresses AI review P2 security feedback: the skill now only reads files within the project repository, preventing access to sensitive paths like ~/.ssh or /etc. External path references are flagged in review output instead of being read. Co-Authored-By: Claude Opus 4.5 --- .claude/commands/review-plan.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/.claude/commands/review-plan.md b/.claude/commands/review-plan.md index 5f358af..23ab506 100644 --- a/.claude/commands/review-plan.md +++ b/.claude/commands/review-plan.md @@ -55,6 +55,11 @@ Identify all files the plan references (file paths, module names, class names). 1. **Files the plan proposes to modify** — read ALL of these first 2. **Files referenced for context** (imports, call sites, existing patterns) — read selectively to verify specific claims +**Scope restriction:** +- Only read files that are within the project repository (the working directory tree). +- The plan file itself (the `$ARGUMENTS` input) is exempt — it can be anywhere (e.g., `~/.claude/plans/`). +- If the plan references paths outside the repo (home directory configs, SSH keys, `/etc/` files, etc.), do NOT read them. Instead, note in the review output under Dimension 2 (Codebase Correctness) that those external paths were not verified. + **What to verify:** - File paths exist - Class names, function signatures, and method names match what the plan describes From 8e060a425fc3886df5d6c3e096d427708d7b002e Mon Sep 17 00:00:00 2001 From: igerber Date: Sun, 1 Feb 2026 17:06:03 -0500 Subject: [PATCH 3/3] Direct /review-plan to consult methodology registry for estimator changes Co-Authored-By: Claude Opus 4.5 --- .claude/commands/review-plan.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.claude/commands/review-plan.md b/.claude/commands/review-plan.md index 23ab506..3bfe757 100644 --- a/.claude/commands/review-plan.md +++ b/.claude/commands/review-plan.md @@ -47,6 +47,8 @@ Read the project's `CLAUDE.md` file to understand: - Test structure and conventions - Design patterns (sklearn-like API, formula interface, results objects, etc.) +If the plan modifies estimator math, standard error formulas, inference logic, or edge-case handling, also read `docs/methodology/REGISTRY.md` to understand the academic foundations and reference implementations for the affected estimator(s). + ### Step 3: Read Referenced Files Identify all files the plan references (file paths, module names, class names). Then read them to validate the plan's assumptions: