
[PECOBLR-1928] Add AI coding agent detection to User-Agent header#740

Open
vikrantpuppala wants to merge 1 commit into main from agent-detection


@vikrantpuppala
Contributor

Summary

  • Adds agent.py module that detects 7 AI coding agents (Claude Code, Cursor, Gemini CLI, Cline, Codex, OpenCode, Antigravity) by checking well-known environment variables they set in spawned shell processes
  • Integrates detection into both Session (Thrift path) and build_client_context (SEA path) to append agent/<product> to the User-Agent header
  • Uses exactly-one detection rule: if zero or multiple agent env vars are set, no agent is attributed (avoids ambiguity)

Approach

Mirrors the implementation in databricks/cli#4287 and aligns with the latest agent list in libs/agent/agent.go.

| Agent | Product String | Environment Variable |
|---|---|---|
| Google Antigravity | antigravity | ANTIGRAVITY_AGENT |
| Claude Code | claude-code | CLAUDECODE |
| Cline | cline | CLINE_ACTIVE |
| OpenAI Codex | codex | CODEX_CI |
| Cursor | cursor | CURSOR_AGENT |
| Gemini CLI | gemini-cli | GEMINI_CLI |
| OpenCode | opencode | OPENCODE |

Adding a new agent requires only a new entry in the KNOWN_AGENTS list.
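
For readers without the diff at hand, the detection described here can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the names `KNOWN_AGENTS` and `detect` and the injectable env dict come from the description, while the tuple layout, the type hints, and the treatment of empty values as unset are assumptions.

```python
import os
from typing import Mapping, Optional

# (environment variable, product string) pairs, copied from the agent table.
# The tuple layout is an assumption; the PR only says KNOWN_AGENTS is a list.
KNOWN_AGENTS = [
    ("ANTIGRAVITY_AGENT", "antigravity"),
    ("CLAUDECODE", "claude-code"),
    ("CLINE_ACTIVE", "cline"),
    ("CODEX_CI", "codex"),
    ("CURSOR_AGENT", "cursor"),
    ("GEMINI_CLI", "gemini-cli"),
    ("OPENCODE", "opencode"),
]


def detect(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the product string only if exactly one known agent is detected."""
    if env is None:
        env = os.environ
    # Assumption: an empty value is treated the same as an unset variable.
    detected = [product for var, product in KNOWN_AGENTS if env.get(var)]
    # Zero or several matches are ambiguous, so no agent is attributed.
    return detected[0] if len(detected) == 1 else None
```

Under these assumptions, `detect({"CLAUDECODE": "1"})` yields `"claude-code"`, while an environment with both `CLAUDECODE` and `CURSOR_AGENT` set yields `None`.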

Changes

  • New: src/databricks/sql/common/agent.py — environment-variable-based agent detection with injectable env dict for testability
  • Modified: src/databricks/sql/session.py — appends agent/<product> to useragent_header (Thrift path)
  • Modified: src/databricks/sql/utils.py — appends agent/<product> in build_client_context() (SEA path)
  • New: tests/unit/test_agent_detection.py — 12 test cases covering all agents, no agent, multiple agents, and empty values
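
The four categories of test case named above (each agent alone, no agent, multiple agents, empty values) could be exercised along these lines. This is a hedged sketch rather than the contents of test_agent_detection.py: the `AGENTS` mapping and the compact `detect` below are stand-ins so the example runs on its own, and the class name is hypothetical.

```python
import unittest

# Stand-in mapping of environment variable -> product string (from the PR table).
AGENTS = {
    "ANTIGRAVITY_AGENT": "antigravity",
    "CLAUDECODE": "claude-code",
    "CLINE_ACTIVE": "cline",
    "CODEX_CI": "codex",
    "CURSOR_AGENT": "cursor",
    "GEMINI_CLI": "gemini-cli",
    "OPENCODE": "opencode",
}


def detect(env):
    # Stand-in detector; assumes empty values count as unset.
    found = [product for var, product in AGENTS.items() if env.get(var)]
    return found[0] if len(found) == 1 else None


class AgentDetectionSketch(unittest.TestCase):
    def test_each_agent_alone(self):
        # One sub-test per agent, passing a plain dict instead of os.environ.
        for var, product in AGENTS.items():
            with self.subTest(var=var):
                self.assertEqual(detect({var: "1"}), product)

    def test_no_agent(self):
        self.assertIsNone(detect({}))

    def test_multiple_agents(self):
        self.assertIsNone(detect({"CLAUDECODE": "1", "CURSOR_AGENT": "1"}))

    def test_empty_value(self):
        self.assertIsNone(detect({"CLAUDECODE": ""}))
```

Passing a plain dict keeps the tests independent of whatever agent variables happen to be set in the environment running them.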

Test plan

  • test_agent_detection.py — 12 unit tests pass
  • Manual: verified User-Agent contains agent/claude-code when run from Claude Code
    User-Agent: PyDatabricksSqlConnector/4.2.5 agent/claude-code
    User-Agent: PyDatabricksSqlConnector/4.2.5 (TestPartner) agent/claude-code
    
  • Executed SELECT 1 successfully against dogfood warehouse with the new header
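
The two header shapes shown in the manual check can be reproduced with a small helper. This is only an illustration of the string layout: `build_user_agent` and its parameters are hypothetical and do not exist in the connector, which builds the header inside Session and build_client_context.

```python
from typing import Optional


def build_user_agent(
    version: str,
    entry: Optional[str] = None,
    agent: Optional[str] = None,
) -> str:
    """Hypothetical helper that composes the User-Agent string."""
    user_agent = f"PyDatabricksSqlConnector/{version}"
    if entry:
        # Optional user-agent entry, e.g. a partner name, in parentheses.
        user_agent += f" ({entry})"
    if agent:
        # Suffix appended only when exactly one agent was detected.
        user_agent += f" agent/{agent}"
    return user_agent
```

With `version="4.2.5"` and `agent="claude-code"` this produces the first header line above; adding `entry="TestPartner"` produces the second. When no agent is detected the header is left unchanged.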

🤖 Generated with Claude Code

Detect when the Python SQL connector is invoked by an AI coding agent
(e.g. Claude Code, Cursor, Gemini CLI) by checking well-known
environment variables, and append `agent/<product>` to the User-Agent
string.

This enables Databricks to understand how much driver usage originates
from AI coding agents. An agent is attributed only when exactly one
is detected, which avoids ambiguous attribution.

Mirrors the approach in databricks/cli#4287.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

Copilot AI left a comment

Pull request overview

This pull request adds AI coding agent detection to the User-Agent header in the Databricks SQL Python connector. The implementation detects 7 AI coding agents (Claude Code, Cursor, Gemini CLI, Cline, Codex, OpenCode, and Antigravity) by checking for well-known environment variables they set, and appends an agent/<product> suffix to the User-Agent header when exactly one agent is detected.

Changes:

  • Adds a new agent.py module with environment-variable-based detection logic following an exactly-one detection rule
  • Integrates agent detection into both the Thrift path (Session) and SEA path (build_client_context)
  • Includes comprehensive unit tests covering all agents and edge cases

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| src/databricks/sql/common/agent.py | New module implementing AI agent detection with exactly-one rule via environment variables |
| src/databricks/sql/session.py | Integrates agent detection into User-Agent header for Thrift backend path |
| src/databricks/sql/utils.py | Integrates agent detection into User-Agent header for SEA backend path via build_client_context |
| tests/unit/test_agent_detection.py | Comprehensive unit tests covering all 7 agents, edge cases, and the exactly-one detection rule |

Comment on lines +925 to +927
agent_product = detect_agent()
if agent_product:
    user_agent += f" agent/{agent_product}"

Copilot AI Feb 24, 2026

Consider adding integration tests to verify that the agent detection is properly integrated into the User-Agent header in the SEA path. This would ensure the integration works end-to-end and the User-Agent header includes the agent suffix when an agent environment variable is set.

Contributor Author

The detection logic is fully covered by unit tests in test_agent_detection.py. The integration in build_client_context is a trivial string append. The SEA path uses the same detect() function already covered by tests.

)

# Build user agent
from databricks.sql.common.agent import detect as detect_agent

Copilot AI Feb 24, 2026

The import statement should be moved to the top of the function (after line 904) to group all imports together. This improves code readability and follows the convention established in this function where imports are placed at the beginning.

Contributor Author

The import is intentionally placed inside the function to avoid a circular import — utils.py is imported early in the module graph and agent.py lives in common/. This is consistent with the existing local import pattern used in this function (e.g. from databricks.sql.auth.common import ClientContext on line 903).

Comment on lines +68 to +70
agent_product = detect_agent()
if agent_product:
    self.useragent_header += " agent/{}".format(agent_product)

Copilot AI Feb 24, 2026

Consider adding integration tests to verify that the agent detection is properly integrated into the User-Agent header. The existing test_useragent_header in test_session.py could be extended to verify that when an agent environment variable is set, the User-Agent header includes the agent suffix. This would ensure the integration works end-to-end, not just the detection logic in isolation.

Contributor Author

The detection logic is fully covered by unit tests in test_agent_detection.py. The integration in session.py is a 3-line append that is straightforward. Adding an integration test here would require mocking the full Session constructor which adds complexity without meaningful coverage gain.


4 participants