[fix] Extra changes related to corrupt SDK evals #3795

Merged
bekossy merged 2 commits into release/v0.86.6 from fix/follow-up-on-corrupted-sdk-evals
Feb 20, 2026

Conversation

junaway (Contributor) commented Feb 20, 2026

No description provided.

Copilot AI review requested due to automatic review settings February 20, 2026 14:35
vercel bot commented Feb 20, 2026

The latest updates on your projects.

Project: agenta-documentation
Deployment: Ready
Updated (UTC): Feb 20, 2026 3:03pm

dosubot bot added the size:M label (This PR changes 30-99 lines, ignoring generated files.) Feb 20, 2026
junaway changed the base branch from main to release/v0.86.6 Feb 20, 2026
dosubot bot added the bug (Something isn't working) and SDK labels Feb 20, 2026

Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes issues related to corrupt SDK evaluations by improving robustness in application name handling and evaluation slug management.

Changes:

  • Replaces unreliable ApplicationRevision.name with authoritative SimpleApplication.name by fetching from the API
  • Removes manual slug computation in favor of using the application_revision.slug attribute directly
  • Adds defensive name sanitization for evaluation runs to handle None, empty, or whitespace-only names
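
The name sanitization described in the last bullet can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name `sanitize_run_name` and the constant `DEFAULT_RUN_NAME` are hypothetical, and only the "SDK Eval" fallback value comes from the review summary.

```python
from typing import Optional

# Fallback value taken from the review summary; the constant name is hypothetical.
DEFAULT_RUN_NAME = "SDK Eval"


def sanitize_run_name(name: Optional[str], fallback: str = DEFAULT_RUN_NAME) -> str:
    """Return a usable evaluation run name.

    Falls back when the input is None, not a string, empty,
    or contains only whitespace.
    """
    if not isinstance(name, str):
        return fallback
    stripped = name.strip()
    # An empty string is falsy, so whitespace-only input also hits the fallback.
    return stripped or fallback


if __name__ == "__main__":
    for candidate in (None, "", "   ", "  nightly regression  "):
        print(repr(candidate), "->", repr(sanitize_run_name(candidate)))
```

Stripping before the emptiness check is what lets a single expression cover the empty and whitespace-only cases; whether the actual change also trims surrounding whitespace from valid names is an assumption here.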

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File: sdk/agenta/sdk/managers/applications.py
Description: Adds _fetch_simple_application() function to retrieve authoritative application names and updates name preservation logic to use it instead of ApplicationRevision.name

File: sdk/agenta/sdk/evaluations/preview/evaluate.py
Description: Removes get_slug_from_name_and_id import/usage in favor of using application_revision.slug directly, adds name sanitization with "SDK Eval" fallback, and increases evaluator trace retry count from 20 to 30 for consistency
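
For readers unfamiliar with what a trace retry count controls, a bounded polling loop of the following shape is one common pattern. This is a generic sketch under stated assumptions, not the SDK's implementation: `poll_until_available`, its parameters, and the one-second delay are all hypothetical, and only the retry limit of 30 comes from the summary above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until_available(
    fetch: Callable[[], Optional[T]],
    max_retries: int = 30,  # limit mentioned in the review summary
    delay_seconds: float = 1.0,  # hypothetical delay between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch() until it returns a non-None value or retries run out."""
    for attempt in range(max_retries):
        result = fetch()
        if result is not None:
            return result
        # Skip the final sleep: there is no further attempt to wait for.
        if attempt < max_retries - 1:
            sleep(delay_seconds)
    return None
```

Raising the limit from 20 to 30 in a loop like this only extends the maximum wait; a caller still has to handle the `None` result when the trace never appears.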

bekossy (Member) commented Feb 20, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Swish!

dosubot bot added the lgtm label (This PR has been approved by a maintainer) Feb 20, 2026
bekossy merged commit 3f83c75 into release/v0.86.6 Feb 20, 2026
5 checks passed
Labels

bug: Something isn't working
lgtm: This PR has been approved by a maintainer
SDK
size:M: This PR changes 30-99 lines, ignoring generated files.

3 participants