Add unified AgentProvider trait, CLI adapters, and e2e pipeline tests #4

Open
csells wants to merge 6 commits into smartcomputer-ai:main from csells:unified-agent-provider

csells commented Mar 2, 2026

Summary

  • Unified AgentProvider trait — every backend (HTTP API or CLI subprocess) implements run_to_completion(), so Claude Code, Codex CLI, and Gemini CLI plug into Attractor pipelines as first-class backends
  • CLI adapters for claude-code, codex-cli, and gemini-cli with JSONL event parsing, proper UTF-8 truncation (floor_char_boundary), and subprocess lifecycle management
  • Attractor spec compliance — skipped status, failure_reason, retry presets, node timeouts, auto_status, manifest.json, real shell execution in ToolHandler, executor-based parallel branches, LLM-backed fan-in consolidation, goal-gate enforcement on all graph nodes (including unvisited), 4-level stylesheet specificity
  • 8 full-stack e2e pipeline tests — DOT file → forge-cli → real CLI agent → JSONL events → disk artifacts → CXDB persistence, all running in isolated git sandboxes in /tmp
  • Test hygiene: #[ignore] is the only acceptable gating mechanism; removed all banned double-gating patterns; tests fail hard when prerequisites are missing
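
To make the first bullet concrete, here is a minimal sketch of what a provider-owned agent loop abstraction could look like. Only the names AgentProvider, AgentRunOptions, AgentRunResult, and run_to_completion() come from this PR; the fields, the synchronous signature, the String error type, EchoProvider, and run_node are assumptions made for illustration, and the real trait may well be async.

```rust
// Hypothetical shapes: the fields below are illustrative, not the PR's actual types.
#[derive(Debug, Clone)]
pub struct AgentRunOptions {
    pub prompt: String,
    pub timeout_secs: Option<u64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct AgentRunResult {
    pub final_text: String,
    pub backend: String,
}

/// One trait for every backend, whether it talks HTTP or drives a CLI subprocess.
pub trait AgentProvider {
    fn name(&self) -> &str;
    fn run_to_completion(&self, options: &AgentRunOptions) -> Result<AgentRunResult, String>;
}

/// Stand-in backend that echoes the prompt. A real adapter would spawn a CLI
/// process or call an HTTP API inside `run_to_completion`.
pub struct EchoProvider;

impl AgentProvider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    fn run_to_completion(&self, options: &AgentRunOptions) -> Result<AgentRunResult, String> {
        if options.prompt.is_empty() {
            return Err("empty prompt".to_string());
        }
        Ok(AgentRunResult {
            final_text: options.prompt.clone(),
            backend: self.name().to_string(),
        })
    }
}

/// Pipeline-side code depends only on the trait object, never on a concrete backend.
pub fn run_node(provider: &dyn AgentProvider, prompt: &str) -> Result<AgentRunResult, String> {
    let options = AgentRunOptions {
        prompt: prompt.to_string(),
        timeout_secs: Some(60),
    };
    provider.run_to_completion(&options)
}
```

Calling run_node(&EchoProvider, "hello") returns a result whose final_text is "hello"; swapping in a different backend would not change the caller.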

What changed

forge-llm

  • AgentProvider trait, AgentRunOptions/AgentRunResult types
  • CLI adapters: claude_code.rs, codex.rs, gemini.rs with JSONL parsing
  • Fix: truncate_json/truncate panicked when the cut point fell inside a multi-byte UTF-8 character; both now use floor_char_boundary()
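
The UTF-8 fix can be illustrated with a small sketch. Slicing a &str at a byte index that is not a character boundary panics, so the cut point has to be rounded down first. The helper below is a hand-rolled stand-in for the standard library's str::floor_char_boundary (whose availability depends on the toolchain version); the function names and signatures here are assumptions, not the adapters' actual code.

```rust
/// Largest index <= `index` that lies on a UTF-8 character boundary of `s`.
/// Hand-rolled stand-in for `str::floor_char_boundary`.
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Truncate `s` to at most `max_bytes` bytes without splitting a character.
pub fn truncate(s: &str, max_bytes: usize) -> &str {
    &s[..floor_boundary(s, max_bytes)]
}
```

With this helper, truncate("héllo", 2) yields "h" rather than panicking, because byte 2 falls in the middle of the two-byte "é".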

forge-agent

  • HttpApiAgentProvider extracting Session's tool loop into the AgentProvider interface
  • pub(crate) visibility on session utils for reuse

forge-attractor

  • AgentProviderSubmitter bridging AgentProvider to AgentSubmitter for pipeline integration
  • Condition evaluation fixes: missing keys → empty string, bare key truthiness, context prefix resolution
  • Goal-gate enforcement checks ALL graph nodes (including unvisited)
  • Stylesheet specificity: universal (0) < shape (1) < class (2) < node_id (3)
  • Lint: exactly one terminal node (was "at least one")
  • manifest.json now includes cxdb_context_id
  • spec_complete test suite + 5 agent_provider_integration tests
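
The 4-level specificity ordering listed above could be resolved roughly as follows. This is a sketch under assumptions: Selector, Node, Rule, and resolve are hypothetical names, and the tie-breaking rule (the later rule wins among equal specificity) is an illustrative choice, not something this PR states.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Selector {
    Universal,
    Shape(String),
    Class(String),
    NodeId(String),
}

pub struct Node {
    pub id: String,
    pub shape: String,
    pub classes: Vec<String>,
}

pub struct Rule {
    pub selector: Selector,
    pub value: String,
}

impl Selector {
    /// universal (0) < shape (1) < class (2) < node_id (3)
    pub fn specificity(&self) -> u8 {
        match self {
            Selector::Universal => 0,
            Selector::Shape(_) => 1,
            Selector::Class(_) => 2,
            Selector::NodeId(_) => 3,
        }
    }

    pub fn matches(&self, node: &Node) -> bool {
        match self {
            Selector::Universal => true,
            Selector::Shape(shape) => node.shape == *shape,
            Selector::Class(class) => node.classes.iter().any(|c| c == class),
            Selector::NodeId(id) => node.id == *id,
        }
    }
}

/// Pick the value from the most specific matching rule.
/// `max_by_key` returns the last maximum, so later rules win ties (assumed policy).
pub fn resolve<'a>(rules: &'a [Rule], node: &Node) -> Option<&'a str> {
    rules
        .iter()
        .filter(|rule| rule.selector.matches(node))
        .max_by_key(|rule| rule.selector.specificity())
        .map(|rule| rule.value.as_str())
}
```

A node that matches both a class rule and a node_id rule takes the node_id rule's value regardless of the order in which the rules were declared.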

forge-cli

  • --backend claude-code|codex-cli|gemini-cli modes
  • 8 e2e pipeline tests covering linear (×3 backends), HITL auto-approve, parallel fan-out, CXDB persistence, cross-provider parity, resume-from-checkpoint
  • Each test uses init_sandbox() for isolated git repos — no project files touched
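
For readers wondering what the sandboxing amounts to, here is one way an init_sandbox() helper might be written. Only the function name and the idea of a fresh git repo under the temp directory come from this PR; the signature, the directory naming scheme, and the error handling are assumptions.

```rust
use std::io;
use std::path::PathBuf;
use std::process::Command;
use std::time::{SystemTime, UNIX_EPOCH};

/// Create a uniquely named directory under the system temp dir and run
/// `git init` inside it, so an agent never sees the real project checkout.
pub fn init_sandbox(label: &str) -> io::Result<PathBuf> {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?
        .as_nanos();
    // Assumed naming scheme: label + process id + timestamp keeps parallel tests apart.
    let dir = std::env::temp_dir().join(format!(
        "forge-e2e-{}-{}-{}",
        label,
        std::process::id(),
        nanos
    ));
    std::fs::create_dir_all(&dir)?;

    let status = Command::new("git")
        .arg("init")
        .arg("--quiet")
        .current_dir(&dir)
        .status()?;
    if !status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "git init failed"));
    }
    Ok(dir)
}
```

A test would then launch the CLI agent with its working directory set to the returned path and remove the directory afterwards.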

Specs & docs

  • New spec/06-unified-agent-provider-spec.md
  • Updated spec/03-attractor-spec.md DoD (all checkboxes complete except optional HTTP server mode)
  • Updated AGENTS.md and README.md with test commands and three-tier test taxonomy

Test coverage

| Tier | Count | What |
| --- | --- | --- |
| Default (cargo test) | 647 | Unit + integration, no external deps |
| Infrastructure (--ignored) | 51 | CXDB server, CLI agents (OAuth, no API keys) |
| API keys (--ignored) | 25 | OpenAI/Anthropic live calls (costs money) |

Test plan

  • cargo test — 647 passed, 0 failed
  • cargo test -p forge-cli --test e2e_pipeline -- --ignored — 8 passed (requires CLI agents + CXDB)
  • cargo test -p forge-llm --test cli_agent_e2e -- --ignored — 9 passed (requires CLI agents)
  • Verify against upstream CI if configured
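
The single-gate test policy can be sketched as below: #[ignore] keeps the test out of the default run, and once it is opted in with --ignored it panics rather than silently skipping when a prerequisite is absent. The helper names and the PATH lookup are assumptions for illustration; "double-gating" is taken here to mean pairing #[ignore] with a second runtime check that quietly returns early.

```rust
use std::path::PathBuf;

/// Hypothetical prerequisite check: look for a binary in each PATH entry.
fn find_on_path(binary: &str) -> Option<PathBuf> {
    let path = std::env::var_os("PATH")?;
    std::env::split_paths(&path)
        .map(|dir| dir.join(binary))
        .find(|candidate| candidate.is_file())
}

/// Fail hard instead of skipping when the prerequisite is missing.
fn require_cli(binary: &str) -> PathBuf {
    find_on_path(binary)
        .unwrap_or_else(|| panic!("prerequisite missing: `{}` is not on PATH", binary))
}

#[test]
#[ignore] // the only gate; opt in with `cargo test -- --ignored`
fn e2e_needs_claude_cli() {
    // No env-var check and no early return: a missing CLI is a test failure.
    let cli = require_cli("claude");
    assert!(cli.is_file());
}
```

The design choice is that an opted-in run either exercises the real backend or fails loudly, so a green result cannot hide a test that never ran.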

🤖 Generated with Claude Code

csells and others added 6 commits February 27, 2026 17:09
Introduce a provider-owned agent loop abstraction so every backend
(HTTP API or CLI subprocess) implements a single AgentProvider trait
with run_to_completion(). This lets Claude Code, Codex CLI, and
Gemini CLI plug into Attractor pipelines as first-class backends.

Key changes:
- forge-llm: AgentProvider trait, AgentRunOptions/Result types,
  CLI adapters for claude-code, codex, and gemini with JSONL parsing
- forge-agent: HttpApiAgentProvider extracting Session's tool loop,
  pub(crate) visibility on session utils
- forge-attractor: AgentProviderSubmitter bridging AgentProvider to
  AgentSubmitter for pipeline integration
- forge-cli: --backend claude-code|codex-cli|gemini-cli modes
- E2e tests for all three CLI providers (ungated)
- spec/06-unified-agent-provider-spec.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…st hygiene

- Implement missing attractor spec features: Skipped status, failure_reason,
  retry presets, node timeouts, auto_status, manifest.json, real shell
  execution in ToolHandler, executor-based parallel branches, LLM-backed
  fan-in consolidation, 100KB artifact threshold
- Fix condition evaluation: missing keys return empty string, bare key
  truthiness, context prefix resolution
- Fix goal-gate enforcement to check ALL graph nodes (including unvisited)
- Fix stylesheet specificity to 4-level: universal < shape < class < node_id
- Fix lint: exactly one terminal node (was "at least one")
- Add AgentProvider → pipeline integration with AgentProviderSubmitter adapter
- Add 5 agent_provider_integration tests and spec_complete test suite
- Remove banned double-gating pattern from all live test files: #[ignore] is
  the only acceptable gate; tests fail hard when prerequisites are missing
- Update CLI backends: --backend claude-code|codex-cli|gemini-cli
- Update specs (03-attractor, 06-unified-agent-provider) and docs
- Three-tier test taxonomy: default (647), infrastructure (18), API keys (25)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
E2E tests exercise the full stack: DOT file → forge-cli → real CLI agent
→ JSONL event stream → disk artifacts → CXDB persistence. All 8 tests
pass across Claude Code, Codex CLI, and Gemini CLI backends.

Bugs found and fixed:
- Fix truncate/truncate_json panic on multi-byte UTF-8 char boundaries
  in all 3 CLI adapters (claude_code, codex, gemini) using
  floor_char_boundary()
- Add cxdb_context_id to manifest.json for e2e CXDB verification

Tests: 8 new e2e tests (Tier 2, #[ignore], no API keys):
  e2e_linear_{claude_code,codex,gemini}, e2e_hitl_auto_approve,
  e2e_parallel_pipeline, e2e_cxdb_persistence,
  e2e_cross_provider_parity, e2e_resume_from_checkpoint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CXDB server writes to ./data/ relative to cwd when no explicit
CXDB_DATA_DIR is set. Since e2e tests run from the workspace root,
this directory can appear in the project root during test runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each test now creates a fresh git repo in /tmp via init_sandbox()
instead of running CLI agents inside the live forge project. This
prevents agents from reading/writing project files and avoids CXDB
data directories appearing in the project root.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>