[PECOBLR-1928] Add AI coding agent detection to User-Agent header #740
vikrantpuppala wants to merge 1 commit into main
Detect when the Python SQL connector is invoked by an AI coding agent (e.g. Claude Code, Cursor, Gemini CLI) by checking well-known environment variables, and append `agent/<product>` to the User-Agent string. This enables Databricks to understand how much driver usage originates from AI coding agents. Detection only succeeds when exactly one agent is detected to avoid ambiguous attribution. Mirrors the approach in databricks/cli#4287.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
Pull request overview
This pull request adds AI coding agent detection to the User-Agent header in the Databricks SQL Python connector. The implementation detects 7 AI coding agents (Claude Code, Cursor, Gemini CLI, Cline, Codex, OpenCode, and Antigravity) by checking for well-known environment variables they set, and appends an agent/<product> suffix to the User-Agent header when exactly one agent is detected.
Changes:
- Adds a new `agent.py` module with environment-variable-based detection logic following an exactly-one detection rule
- Integrates agent detection into both the Thrift path (Session) and SEA path (build_client_context)
- Includes comprehensive unit tests covering all agents and edge cases
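The exactly-one rule can be sketched roughly as follows. This is a minimal illustration, not the contents of `agent.py`: the environment variable names and product strings come from the PR description, while the tuple layout of `KNOWN_AGENTS`, the `env` parameter name, the empty-string return value, and treating an empty variable as unset are all assumptions.

```python
import os
from typing import Mapping, Optional

# (environment variable, product string) pairs from the PR description.
# The tuple layout is an assumption made for this sketch.
KNOWN_AGENTS = [
    ("ANTIGRAVITY_AGENT", "antigravity"),
    ("CLAUDECODE", "claude-code"),
    ("CLINE_ACTIVE", "cline"),
    ("CODEX_CI", "codex"),
    ("CURSOR_AGENT", "cursor"),
    ("GEMINI_CLI", "gemini-cli"),
    ("OPENCODE", "opencode"),
]


def detect(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the product string if exactly one known agent is present, else ""."""
    if env is None:
        env = os.environ
    # Assumption: a variable set to an empty string counts as not set.
    found = [product for var, product in KNOWN_AGENTS if env.get(var)]
    # Zero matches or several matches both yield no attribution.
    return found[0] if len(found) == 1 else ""
```

Passing a plain dict as `env` keeps tests independent of the real process environment, for example `detect({"CLAUDECODE": "1"})`.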
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/databricks/sql/common/agent.py | New module implementing AI agent detection with exactly-one rule via environment variables |
| src/databricks/sql/session.py | Integrates agent detection into User-Agent header for Thrift backend path |
| src/databricks/sql/utils.py | Integrates agent detection into User-Agent header for SEA backend path via build_client_context |
| tests/unit/test_agent_detection.py | Comprehensive unit tests covering all 7 agents, edge cases, and the exactly-one detection rule |
    agent_product = detect_agent()
    if agent_product:
        user_agent += f" agent/{agent_product}"
Consider adding integration tests to verify that the agent detection is properly integrated into the User-Agent header in the SEA path. This would ensure the integration works end-to-end and the User-Agent header includes the agent suffix when an agent environment variable is set.
The detection logic is fully covered by unit tests in test_agent_detection.py. The integration in build_client_context is a trivial string append. The SEA path uses the same detect() function already covered by tests.
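For reference, the append discussed in this thread behaves like the sketch below. The helper name `with_agent_suffix` and the base string `ExampleConnector/1.0` are hypothetical placeholders; the diff itself does the append inline.

```python
def with_agent_suffix(user_agent: str, agent_product: str) -> str:
    """Append " agent/<product>" only when a product was detected."""
    if agent_product:
        return f"{user_agent} agent/{agent_product}"
    # No agent (or ambiguous detection): leave the header untouched.
    return user_agent


base = "ExampleConnector/1.0"  # placeholder, not the connector's real User-Agent
print(with_agent_suffix(base, "cursor"))  # ExampleConnector/1.0 agent/cursor
print(with_agent_suffix(base, ""))        # ExampleConnector/1.0
```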
    )

    # Build user agent
    from databricks.sql.common.agent import detect as detect_agent
The import statement should be moved to the top of the function (after line 904) to group all imports together. This improves code readability and follows the convention established in this function where imports are placed at the beginning.
The import is intentionally placed inside the function to avoid a circular import — utils.py is imported early in the module graph and agent.py lives in common/. This is consistent with the existing local import pattern used in this function (e.g. from databricks.sql.auth.common import ClientContext on line 903).
    agent_product = detect_agent()
    if agent_product:
        self.useragent_header += " agent/{}".format(agent_product)
Consider adding integration tests to verify that the agent detection is properly integrated into the User-Agent header. The existing test_useragent_header in test_session.py could be extended to verify that when an agent environment variable is set, the User-Agent header includes the agent suffix. This would ensure the integration works end-to-end, not just the detection logic in isolation.
The detection logic is fully covered by unit tests in test_agent_detection.py. The integration in session.py is a 3-line append that is straightforward. Adding an integration test here would require mocking the full Session constructor which adds complexity without meaningful coverage gain.
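As a point of comparison for the trade-off discussed here, environment-dependent header logic can be checked without constructing a `Session` by patching `os.environ`. The sketch below is an assumption-laden illustration: `detect_from_environ` and `build_user_agent` are hypothetical stand-ins, not functions from the connector, and only one agent is modelled for brevity.

```python
import os
import unittest
from unittest import mock


def detect_from_environ() -> str:
    """Hypothetical stand-in for detect(); knows a single agent for brevity."""
    return "claude-code" if os.environ.get("CLAUDECODE") else ""


def build_user_agent(base: str) -> str:
    """Hypothetical helper mirroring the append shown in the diff."""
    product = detect_from_environ()
    return f"{base} agent/{product}" if product else base


class UserAgentSuffixTest(unittest.TestCase):
    def test_suffix_present_when_agent_env_set(self):
        # clear=True removes any agent variables inherited from the real shell.
        with mock.patch.dict(os.environ, {"CLAUDECODE": "1"}, clear=True):
            self.assertEqual(
                build_user_agent("Example/1.0"), "Example/1.0 agent/claude-code"
            )

    def test_no_suffix_without_agent_env(self):
        with mock.patch.dict(os.environ, {}, clear=True):
            self.assertEqual(build_user_agent("Example/1.0"), "Example/1.0")
```

Using `clear=True` matters when the tests themselves run inside an agent-spawned shell, where a variable such as `CLAUDECODE` may already be set.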
Summary
- Adds a new `agent.py` module that detects 7 AI coding agents (Claude Code, Cursor, Gemini CLI, Cline, Codex, OpenCode, Antigravity) by checking well-known environment variables they set in spawned shell processes
- Integrates detection into `Session` (Thrift path) and `build_client_context` (SEA path) to append `agent/<product>` to the User-Agent header

Approach
Mirrors the implementation in databricks/cli#4287 and aligns with the latest agent list in `libs/agent/agent.go`.

| Product | Environment variable |
|---|---|
| antigravity | ANTIGRAVITY_AGENT |
| claude-code | CLAUDECODE |
| cline | CLINE_ACTIVE |
| codex | CODEX_CI |
| cursor | CURSOR_AGENT |
| gemini-cli | GEMINI_CLI |
| opencode | OPENCODE |

Adding a new agent requires only a new entry in the `KNOWN_AGENTS` list.

Changes
- `src/databricks/sql/common/agent.py`: environment-variable-based agent detection with injectable env dict for testability
- `src/databricks/sql/session.py`: appends `agent/<product>` to `useragent_header` (Thrift path)
- `src/databricks/sql/utils.py`: appends `agent/<product>` in `build_client_context()` (SEA path)
- `tests/unit/test_agent_detection.py`: 12 test cases covering all agents, no agent, multiple agents, and empty values

Test plan
- `test_agent_detection.py`: 12 unit tests pass
- `agent/claude-code` appears in the User-Agent when run from Claude Code
- `SELECT 1` ran successfully against dogfood warehouse with the new header

🤖 Generated with Claude Code