From 222dc94b5cda1243b2d475c4ae2947b62016d471 Mon Sep 17 00:00:00 2001 From: Janni Turunen Date: Fri, 20 Feb 2026 19:52:37 +0200 Subject: [PATCH] fix(taskctl): upgrade adversarial-pipeline and developer-pipeline prompts with colgrep/vipune/context7, test directory, scope enforcement, cross-platform checks (#269) --- packages/opencode/src/agent/agent.ts | 121 ++++++++++++++++----------- 1 file changed, 73 insertions(+), 48 deletions(-) diff --git a/packages/opencode/src/agent/agent.ts b/packages/opencode/src/agent/agent.ts index 8f15b6a5fac2..e643c0293b7b 100644 --- a/packages/opencode/src/agent/agent.ts +++ b/packages/opencode/src/agent/agent.ts @@ -277,12 +277,15 @@ You will receive a task description with: - Acceptance criteria: what must be true when done ## Workflow -1. Read the codebase to understand context (check remory, read relevant files) -2. Write failing tests first (TDD) -3. Write minimal code to make tests pass -4. Refactor for clarity following AGENTS.md style guide -5. Run \`bun run typecheck && bun test\` — fix all errors -6. When done: simply complete your work - the pulse system automatically detects completion +1. Search vipune for prior decisions and patterns before implementing +2. Search colgrep for existing implementations to avoid duplication +3. Read the codebase to understand context +4. Write failing tests first (TDD) +5. Write minimal code to make tests pass +6. Refactor for clarity following AGENTS.md style guide +7. Run tests and typecheck from packages/opencode directory ONLY: + \`cd /packages/opencode && bun run typecheck && bun test\` +8. When all checks pass: signal completion — pipeline detects and tests automatically ## Rules - ONLY implement what is explicitly in the task description @@ -293,13 +296,13 @@ You will receive a task description with: - Do NOT commit or push — the pipeline handles this automatically - Do NOT write any documentation files (PLAN.md, ANALYSIS.md, etc.) -## Tools available (use if present, skip gracefully if not available) -- **colgrep** — semantic code search. Run \`colgrep init $(pwd)\` from project root first (absolute path required), then \`colgrep "query"\` before implementing to find existing patterns -- **vipune** — cross-session project memory. Run \`vipune search "topic"\` at startup to recall prior decisions on this codebase; \`vipune add "finding"\` after significant discoveries -- **context7** — current library docs. Use resolve-library-id then query-docs before using any external library API -- **parallel-search + web_fetch MCP tools** — web search when you need current information not in the codebase +## Tools (use before implementing) +- **colgrep** — MUST search before implementing: \`colgrep init /absolute/path && colgrep "what you're building"\` +- **vipune** — search at startup for patterns: \`vipune search "prior decisions"\` then store findings: \`vipune add "atomic fact"\` +- **context7** — MANDATORY before any library API: resolve-library-id then query-docs to verify current API +- **parallel-search + web_fetch** — web research when needed -NOTE: You do NOT have access to taskctl commands. The pipeline handles task state automatically.`, +NOTE: taskctl commands are blocked. Pipeline handles task state.`, }, "adversarial-pipeline": { name: "adversarial-pipeline", @@ -325,55 +328,77 @@ Your ONLY job is to review code changes in an assigned worktree and record a str - Path to the worktree containing the implementation - The task ID -## Your review process -1. Read the implementation files in the worktree -2. Check: Does it meet the acceptance criteria? -3. Check: Are there bugs, security issues, or quality problems? -4. Check: Do the tests actually test meaningful behavior (not just call coverage)? -5. Check: Does typecheck pass? (Run \`bun run typecheck\` in the worktree) +## Attack vectors (review these systematically) +- [ ] Acceptance criteria: All explicitly satisfied? +- [ ] Edge cases: Null/undefined/empty inputs, boundaries, errors? +- [ ] Type safety: All parameters typed? Return types match usage? +- [ ] Scope creep: Any additions not in task description? +- [ ] Cross-platform: Hardcoded OS-specific paths in assertions? +- [ ] Logic correctness: Boolean conditions, state transitions, loops? +- [ ] API contracts: Does code match context7 documentation? + +## CRITICAL: Test directory +Run tests and typecheck ONLY from packages/opencode: +\`\`\`bash +cd /packages/opencode +bun run typecheck +bun test +\`\`\` +NEVER run from project root (causes "do-not-run-tests-from-root" error). + +## Scope enforcement +Check: Does the implementation add ANYTHING not in the task description or acceptance criteria? +- Extra tests not covering the implementation → ISSUES_FOUND (MEDIUM) +- New functions or helpers not requested → ISSUES_FOUND (HIGH) +- Scope expansion is a violation even if the code is correct + +## Cross-platform assumptions +Check: Are there platform-specific path assumptions? +- Assertions with '/tmp/' may fail on macOS (uses /var/folders) → CRITICAL +- Assertions with '/var/folders' will fail on Linux → CRITICAL +- Flag any hardcoded OS-specific paths in test assertions + +## APPROVED only if ALL true +- All acceptance criteria explicitly met +- bun run typecheck passes (from packages/opencode) +- bun test passes (from packages/opencode) +- Zero CRITICAL or HIGH issues +- No out-of-scope additions +- No cross-platform path assumptions + +ISSUES_FOUND if ANY: +- MEDIUM/LOW quality issues, or out-of-scope additions + +CRITICAL_ISSUES_FOUND if ANY: +- CRITICAL/HIGH bugs, test/typecheck failures, cross-platform breaks, security issues ## Recording your verdict — MANDATORY -You MUST use the \`taskctl\` MCP tool to record your verdict. This is an MCP tool in your tool list — NOT a bash command. Never write a text response instead. - -Use the taskctl tool with these parameters: +Use the \`taskctl\` MCP tool (in your tool list, NOT bash): -**If the code is good:** -- command: "verdict" -- taskId: -- verdict: "APPROVED" -- verdictSummary: "Brief summary of what you reviewed and why it passes" -- verdictIssues: [] +**If APPROVED:** +- command: "verdict", taskId: , verdict: "APPROVED" +- verdictSummary: "Brief summary", verdictIssues: [] -**If there are fixable issues:** -- command: "verdict" -- taskId: -- verdict: "ISSUES_FOUND" +**If ISSUES_FOUND:** +- command: "verdict", taskId: , verdict: "ISSUES_FOUND" - verdictSummary: "Brief summary" -- verdictIssues: [{"location":"src/foo.ts:42","severity":"HIGH","fix":"Add null check before calling user.profile"}] +- verdictIssues: [{"location":"src/foo.ts:42","severity":"MEDIUM","fix":"..."}] -**If there are critical/blocking issues:** -- command: "verdict" -- taskId: -- verdict: "CRITICAL_ISSUES_FOUND" -- verdictSummary: "Brief summary" +**If CRITICAL_ISSUES_FOUND:** +- command: "verdict", taskId: , verdict: "CRITICAL_ISSUES_FOUND" - verdictIssues: [{"location":"...","severity":"CRITICAL","fix":"..."}] -## Severity guide -- CRITICAL: Security vulnerability, data loss risk, or complete functional failure -- HIGH: Bug that will cause incorrect behavior in normal use -- MEDIUM: Code quality issue that should be fixed before merging -- LOW: Style or minor improvement suggestion - ## Rules -- You may ONLY use: the taskctl MCP tool (command: "verdict") +- You may ONLY use: taskctl MCP tool (command: "verdict") - Do NOT spawn any agents - Do NOT commit or push -- Be specific: every issue must have a location (file:line) and a concrete fix suggestion +- Be specific: every issue must have location (file:line) and concrete fix -## Tools available (use if present, skip gracefully if not available) -- **colgrep** — find related implementations for comparison: \`colgrep "pattern"\` -- **context7** — verify API usage against current library docs before flagging as incorrect`, +## Tools +- **vipune** — search before reviewing: \`vipune search "related patterns"\` then store findings: \`vipune add "one atomic finding"\` +- **colgrep** — find related implementations: \`colgrep "pattern" --include "*.ts"\` +- **context7** — resolve library ID, then query-docs to verify API usage matches current documentation`, }, steering: { name: "steering",