Add unified AgentProvider trait, CLI adapters, and e2e pipeline tests by csells · Pull Request #4 · smartcomputer-ai/forge

csells · 2026-03-02T12:55:08Z

Summary

Unified AgentProvider trait — every backend (HTTP API or CLI subprocess) implements run_to_completion(), so Claude Code, Codex CLI, and Gemini CLI plug into Attractor pipelines as first-class backends
CLI adapters for claude-code, codex-cli, and gemini-cli with JSONL event parsing, proper UTF-8 truncation (floor_char_boundary), and subprocess lifecycle management
Attractor spec compliance — skipped status, failure_reason, retry presets, node timeouts, auto_status, manifest.json, real shell execution in ToolHandler, executor-based parallel branches, LLM-backed fan-in consolidation, goal-gate enforcement on all graph nodes (including unvisited), 4-level stylesheet specificity
8 full-stack e2e pipeline tests — DOT file → forge-cli → real CLI agent → JSONL events → disk artifacts → CXDB persistence, all running in isolated git sandboxes in /tmp
Test hygiene — #[ignore] is the only acceptable gating mechanism; removed all banned double-gating patterns; tests fail hard when prerequisites are missing
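As a rough illustration of the first bullet, the sketch below shows how a single trait with `run_to_completion()` lets pipeline code treat backends interchangeably. The names `AgentProvider`, `AgentRunOptions`, `AgentRunResult`, and `run_to_completion` come from this PR. Everything else is an assumption made for the example: the fields, `AgentError`, `EchoProvider`, `run_node`, and the synchronous signature. The real trait in `forge-llm` may well be async and shaped differently.

```rust
use std::fmt;

/// Hypothetical options for one agent run; field names are illustrative only.
#[derive(Debug, Clone)]
pub struct AgentRunOptions {
    pub prompt: String,
    pub max_turns: u32,
}

/// Hypothetical result of a completed run.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentRunResult {
    pub final_text: String,
    pub turns_used: u32,
}

/// Hypothetical error type for the sketch.
#[derive(Debug)]
pub struct AgentError(pub String);

impl fmt::Display for AgentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "agent error: {}", self.0)
    }
}

impl std::error::Error for AgentError {}

/// One trait for every backend: the provider owns its own agent loop.
pub trait AgentProvider {
    fn name(&self) -> &str;
    fn run_to_completion(&self, options: &AgentRunOptions) -> Result<AgentRunResult, AgentError>;
}

/// Stand-in backend that echoes the prompt instead of calling an API or CLI.
pub struct EchoProvider;

impl AgentProvider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    fn run_to_completion(&self, options: &AgentRunOptions) -> Result<AgentRunResult, AgentError> {
        if options.max_turns == 0 {
            return Err(AgentError("max_turns must be at least 1".to_string()));
        }
        Ok(AgentRunResult {
            final_text: format!("echo: {}", options.prompt),
            turns_used: 1,
        })
    }
}

/// Pipeline-side code depends only on the trait, never on a concrete backend.
pub fn run_node(provider: &dyn AgentProvider, prompt: &str) -> Result<String, AgentError> {
    let options = AgentRunOptions {
        prompt: prompt.to_string(),
        max_turns: 4,
    };
    provider.run_to_completion(&options).map(|result| result.final_text)
}
```

Under these assumptions, swapping an HTTP-backed provider for a CLI-backed one would only change the value passed to `run_node`.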

What changed

`forge-llm`

AgentProvider trait, AgentRunOptions/AgentRunResult types
CLI adapters: claude_code.rs, codex.rs, gemini.rs with JSONL parsing
Fix: truncate_json/truncate panicked when the cut point fell inside a multi-byte UTF-8 character; both now use floor_char_boundary()
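The truncation fix can be sketched as follows. Slicing a `&str` at a byte index that falls inside a multi-byte character panics, so the cut point is first moved back to the nearest character boundary. The PR uses `floor_char_boundary()`; this sketch gets the same effect with the long-stable `str::is_char_boundary` so it builds on older toolchains. The helper names `floor_boundary` and `truncate` are illustrative, not the adapters' actual signatures.

```rust
/// Largest byte index <= `index` that lies on a UTF-8 char boundary of `s`.
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Truncate `s` to at most `max_bytes` bytes without splitting a character.
fn truncate(s: &str, max_bytes: usize) -> &str {
    &s[..floor_boundary(s, max_bytes)]
}
```

For example, `truncate("héllo", 2)` returns `"h"`, because byte 2 falls in the middle of the two-byte `é`. A naive `&s[..2]` would panic on the same input.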

`forge-agent`

HttpApiAgentProvider extracting Session's tool loop into the AgentProvider interface
pub(crate) visibility on session utils for reuse

`forge-attractor`

AgentProviderSubmitter bridging AgentProvider → AgentSubmitter for pipeline integration
Condition evaluation fixes: missing keys → empty string, bare key truthiness, context prefix resolution
Goal-gate enforcement checks ALL graph nodes (including unvisited)
Stylesheet specificity: universal (0) < shape (1) < class (2) < node_id (3)
Lint: exactly one terminal node (was "at least one")
manifest.json now includes cxdb_context_id
spec_complete test suite + 5 agent_provider_integration tests
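The four-level specificity ordering above can be sketched as an ordered enum, with the most specific matching rule winning. All types here (`Selector`, `Node`, `Rule`, `resolve`) are hypothetical and are not taken from `forge-attractor`. Tie-breaking by "later rule wins" is also an assumption of this sketch.

```rust
/// Specificity levels in ascending order: universal < shape < class < node_id.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Specificity {
    Universal = 0,
    Shape = 1,
    Class = 2,
    NodeId = 3,
}

/// Hypothetical selector forms, one per specificity level.
#[derive(Debug, Clone)]
enum Selector {
    Universal,
    Shape(String),
    Class(String),
    NodeId(String),
}

/// Hypothetical graph node carrying the attributes selectors match on.
struct Node {
    id: String,
    shape: String,
    classes: Vec<String>,
}

/// Hypothetical stylesheet rule assigning a single value.
struct Rule {
    selector: Selector,
    value: String,
}

impl Selector {
    fn specificity(&self) -> Specificity {
        match self {
            Selector::Universal => Specificity::Universal,
            Selector::Shape(_) => Specificity::Shape,
            Selector::Class(_) => Specificity::Class,
            Selector::NodeId(_) => Specificity::NodeId,
        }
    }

    fn matches(&self, node: &Node) -> bool {
        match self {
            Selector::Universal => true,
            Selector::Shape(shape) => *shape == node.shape,
            Selector::Class(class) => node.classes.iter().any(|c| c == class),
            Selector::NodeId(id) => *id == node.id,
        }
    }
}

/// Pick the value from the most specific matching rule.
/// `max_by_key` returns the last maximum, so later rules win ties (assumed).
fn resolve<'a>(rules: &'a [Rule], node: &Node) -> Option<&'a str> {
    rules
        .iter()
        .filter(|rule| rule.selector.matches(node))
        .max_by_key(|rule| rule.selector.specificity())
        .map(|rule| rule.value.as_str())
}
```

Deriving `Ord` on the enum keeps the ordering declared in one place, so the comparison cannot drift from the documented levels.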

`forge-cli`

--backend claude-code|codex-cli|gemini-cli modes
8 e2e pipeline tests covering linear (×3 backends), HITL auto-approve, parallel fan-out, CXDB persistence, cross-provider parity, resume-from-checkpoint
Each test uses init_sandbox() for isolated git repos — no project files touched

Specs & docs

New spec/06-unified-agent-provider-spec.md
Updated spec/03-attractor-spec.md DoD (all checkboxes complete except optional HTTP server mode)
Updated AGENTS.md and README.md with test commands and three-tier test taxonomy

Test coverage

| Tier | Count | What |
|---|---|---|
| Default (`cargo test`) | 647 | Unit + integration, no external deps |
| Infrastructure (`--ignored`) | 51 | CXDB server, CLI agents (OAuth, no API keys) |
| API keys (`--ignored`) | 25 | OpenAI/Anthropic live calls (costs money) |

Test plan

cargo test — 647 passed, 0 failed
cargo test -p forge-cli --test e2e_pipeline -- --ignored — 8 passed (requires CLI agents + CXDB)
cargo test -p forge-llm --test cli_agent_e2e -- --ignored — 9 passed (requires CLI agents)
Verify against upstream CI if configured
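The gating rule from the summary (`#[ignore]` as the only gate, with a hard failure when a prerequisite is missing) might look roughly like the sketch below. The helper `find_on_path`, the test name, and the binary name `claude` are assumptions for illustration. They are not copied from the PR's test files.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical helper: locate an executable on PATH or report why it is missing.
fn find_on_path(name: &str) -> Result<PathBuf, String> {
    let path_var = env::var_os("PATH").ok_or_else(|| "PATH is not set".to_string())?;
    env::split_paths(&path_var)
        .map(|dir| dir.join(name))
        .find(|candidate| candidate.is_file())
        .ok_or_else(|| format!("prerequisite `{name}` not found on PATH"))
}

// The only gate is #[ignore]: the test is skipped by a plain `cargo test`
// and runs under `cargo test -- --ignored`.
#[test]
#[ignore = "requires a CLI agent on PATH"]
fn e2e_example_fails_hard_without_prerequisite() {
    // No early `return` when the binary is missing: a missing prerequisite is a failure.
    let bin = find_on_path("claude").expect("CLI agent must be installed for --ignored tests");
    assert!(bin.is_file());
}
```

The double-gating pattern this replaces would pair `#[ignore]` with a silent early return, which lets a misconfigured machine report a pass without exercising anything.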

🤖 Generated with Claude Code

Introduce a provider-owned agent loop abstraction so every backend (HTTP API or CLI subprocess) implements a single AgentProvider trait with run_to_completion(). This lets Claude Code, Codex CLI, and Gemini CLI plug into Attractor pipelines as first-class backends.

Key changes:
- forge-llm: AgentProvider trait, AgentRunOptions/Result types, CLI adapters for claude-code, codex, and gemini with JSONL parsing
- forge-agent: HttpApiAgentProvider extracting Session's tool loop, pub(crate) visibility on session utils
- forge-attractor: AgentProviderSubmitter bridging AgentProvider to AgentSubmitter for pipeline integration
- forge-cli: --backend claude-code|codex-cli|gemini-cli modes
- E2e tests for all three CLI providers (ungated)
- spec/06-unified-agent-provider-spec.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…st hygiene

- Implement missing attractor spec features: Skipped status, failure_reason, retry presets, node timeouts, auto_status, manifest.json, real shell execution in ToolHandler, executor-based parallel branches, LLM-backed fan-in consolidation, 100KB artifact threshold
- Fix condition evaluation: missing keys return empty string, bare key truthiness, context prefix resolution
- Fix goal-gate enforcement to check ALL graph nodes (including unvisited)
- Fix stylesheet specificity to 4-level: universal < shape < class < node_id
- Fix lint: exactly one terminal node (was "at least one")
- Add AgentProvider → pipeline integration with AgentProviderSubmitter adapter
- Add 5 agent_provider_integration tests and spec_complete test suite
- Remove banned double-gating pattern from all live test files: #[ignore] is the only acceptable gate; tests fail hard when prerequisites are missing
- Update CLI backends: --backend claude-code|codex-cli|gemini-cli
- Update specs (03-attractor, 06-unified-agent-provider) and docs
- Three-tier test taxonomy: default (647), infrastructure (18), API keys (25)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

E2E tests exercise the full stack: DOT file → forge-cli → real CLI agent → JSONL event stream → disk artifacts → CXDB persistence. All 8 tests pass across Claude Code, Codex CLI, and Gemini CLI backends.

Bugs found and fixed:
- Fix truncate/truncate_json panic on multi-byte UTF-8 char boundaries in all 3 CLI adapters (claude_code, codex, gemini) using floor_char_boundary()
- Add cxdb_context_id to manifest.json for e2e CXDB verification

Tests: 8 new e2e tests (Tier 2, #[ignore], no API keys): e2e_linear_{claude_code,codex,gemini}, e2e_hitl_auto_approve, e2e_parallel_pipeline, e2e_cxdb_persistence, e2e_cross_provider_parity, e2e_resume_from_checkpoint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CXDB server writes to ./data/ relative to cwd when no explicit CXDB_DATA_DIR is set. Since e2e tests run from the workspace root, this directory can appear in the project root during test runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Each test now creates a fresh git repo in /tmp via init_sandbox() instead of running CLI agents inside the live forge project. This prevents agents from reading/writing project files and avoids CXDB data directories appearing in the project root.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

csells and others added 6 commits February 27, 2026 17:09

Update docs and specs with e2e pipeline test coverage

f0c6413

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

csells wants to merge 6 commits into smartcomputer-ai:main from csells:unified-agent-provider
