Adding the driver-toolkit as a blocking job to nightly testing. #76327

Open
ybettan wants to merge 1 commit into openshift:main from ybettan:dtk-blocking-job
@ybettan
Member

ybettan commented Mar 16, 2026

Following up on 1f3172b, also making it a blocking job for 4.12 <= OCP <= 4.15.

The driver-toolkit image contains the kernel RPMs that match the exact kernel version on the nodes for a given OCP payload.

A kernel mismatch between RHCOS and DTK will break every DTK user, including the NVIDIA GPU operator.
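
The invariant this job is meant to protect can be sketched roughly as follows. This is only an illustration, not the actual driver-toolkit test: the function name `kernels_match` and the version strings are hypothetical, and how the two versions are obtained from a node and from the DTK image is left out.

```python
def kernels_match(node_kernel: str, dtk_kernel: str) -> bool:
    """True only when the two kernel version strings are identical.

    Surrounding whitespace is ignored; anything else counts as a mismatch.
    """
    return node_kernel.strip() == dtk_kernel.strip()


# Made-up version strings, for illustration only.
node_kernel = "5.14.0-100.el9.x86_64\n"   # e.g. what a node might report
dtk_kernel = "5.14.0-100.el9.x86_64"      # kernel the DTK image was built for
stale_dtk_kernel = "5.14.0-99.el9.x86_64"

print(kernels_match(node_kernel, dtk_kernel))        # True
print(kernels_match(node_kernel, stale_dtk_kernel))  # False
```

An exact string comparison is used on purpose: out-of-tree kernel modules are built against one specific kernel, so even a small release bump counts as a mismatch.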

openshift-ci-robot added the rehearsals-ack label (Signifies that rehearsal jobs have been acknowledged) Mar 16, 2026
@sdodson
Member

sdodson commented Mar 16, 2026

/lgtm
/approved

openshift-ci bot requested review from dgoodwin and neisw March 16, 2026 14:04
@openshift-ci-robot
Contributor

@ybettan, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not load configuration from base revision of release repo: could not checkout worktree: '[git checkout daaa8978cbb691c684260c2aa5430e173dca4ad1]' failed with out:  and error exec: Stdout already set
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.
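
The rehearsal limits listed above can be summarized in a small lookup. This is a sketch based only on the help text in this comment, not the pj-rehearse plugin's implementation; the function name `rehearsal_limit` is hypothetical.

```python
from typing import Optional

# Limits taken from the help text above; the bare command has no subcommand.
LIMITS = {"more": 10, "max": 25, "auto-ack": 5}
DEFAULT_LIMIT = 5


def rehearsal_limit(comment: str) -> Optional[int]:
    """Return the maximum number of rehearsals a comment would request.

    Returns None for control subcommands (skip, list, abort, ack, ...) and
    for explicit test names, where no "up to N" limit is stated.
    """
    parts = comment.split()
    if not parts or parts[0] != "/pj-rehearse":
        raise ValueError("not a /pj-rehearse command")
    if len(parts) == 1:
        return DEFAULT_LIMIT
    if len(parts) == 2 and parts[1] in LIMITS:
        return LIMITS[parts[1]]
    return None


print(rehearsal_limit("/pj-rehearse"))       # 5
print(rehearsal_limit("/pj-rehearse more"))  # 10
print(rehearsal_limit("/pj-rehearse max"))   # 25
print(rehearsal_limit("/pj-rehearse skip"))  # None
```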

openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Mar 16, 2026
@sdodson
Member

sdodson commented Mar 16, 2026

@ybettan still getting release-controller-config failure, needs make release-controllers

openshift-ci bot removed the lgtm label (Indicates that a PR is ready to be merged.) Mar 16, 2026
@ybettan
Member Author

ybettan commented Mar 16, 2026

> @ybettan still getting release-controller-config failure, needs make release-controllers

The job definition was missing for 4.13 and 4.14. Added them now.

@sdodson
Member

sdodson commented Mar 16, 2026

/lgtm
/approved

openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Mar 16, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 16, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sdodson, ybettan
Once this PR has been reviewed and has the lgtm label, please assign dgoodwin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot removed the rehearsals-ack label (Signifies that rehearsal jobs have been acknowledged) Mar 16, 2026
@ybettan
Member Author

ybettan commented Mar 16, 2026

/pj-rehearse

@openshift-ci-robot
Contributor

@ybettan: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@sdodson
Member

sdodson commented Mar 16, 2026

/label approved

openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Mar 16, 2026
openshift-ci bot removed the lgtm label (Indicates that a PR is ready to be merged.) Mar 16, 2026
@ybettan
Member Author

ybettan commented Mar 16, 2026

/pj-rehearse

@openshift-ci-robot
Contributor

@ybettan: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci
Contributor

openshift-ci bot commented Mar 16, 2026

New changes are detected. LGTM label has been removed.

openshift-ci bot removed the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Mar 16, 2026
Following up on 1f3172b, also making it a
blocking job for 4.12 <= OCP <= 4.15.

The driver-toolkit image contains the kernel RPMs that match the exact
kernel version on the nodes for a given OCP payload.

A kernel mismatch between RHCOS and DTK will break every DTK user,
including the NVIDIA GPU operator.

Signed-off-by: Yoni Bettan <yonibettan@gmail.com>
@ybettan
Member Author

ybettan commented Mar 16, 2026

/pj-rehearse

@openshift-ci-robot
Contributor

@ybettan: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot
Contributor

[REHEARSALNOTIFIER]
@ybettan: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-azure-upgrade-cnv | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-azure-sdn-fips-serial | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-gcp-ovn-csi | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-aws-ovn-serial | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-console-aws | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-aws-ovn-single-node-serial | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-azure-sdn-fips | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-azure-sdn-fips-serial | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-aws-sdn-crun | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-ovirt-sdn | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-metal-ipi-ovn-dualstack-local-gateway | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-metal-ovn-single-node-live-iso | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-gcp-ovn-upi | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-alibaba-csi | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-upgrade-from-stable-4.13-e2e-aws-sdn-upgrade | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-gcp-ovn-upgrade | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-metal-ipi-upgrade-ovn-ipv6 | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-metal-ipi-ovn-ipv4 | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-metal-ipi-serial-ovn-dualstack | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-aws-sdn-crun | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-aws-ovn-shared-vpc-phz-techpreview | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-aws-sdn-serial | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-e2e-vsphere-sdn | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.13-e2e-aws-csi | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-release-main-nightly-4.14-upgrade-from-stable-4.13-e2e-aws-upgrade-ovn-single-node | N/A | periodic | Ci-operator config changed |

A total of 174 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

A full list of affected jobs can be found here

@sdodson
Member

sdodson commented Mar 16, 2026

/pj-rehearse periodic-ci-openshift-release-main-nightly-4.12-e2e-aws-driver-toolkit periodic-ci-openshift-release-main-nightly-4.15-e2e-aws-driver-toolkit periodic-ci-openshift-release-main-nightly-4.13-e2e-aws-driver-toolkit

Looks like the 4.13 rehearsal failed to start the debug container. Hopefully that is just a flake.

@openshift-ci-robot
Contributor

@sdodson: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot
Contributor

@sdodson: job(s): periodic-ci-openshift-release-main-nightly-4.12-e2e-aws-driver-toolkit, periodic-ci-openshift-release-main-nightly-4.15-e2e-aws-driver-toolkit either don't exist or were not found to be affected, and cannot be rehearsed
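
The rejection above is consistent with a simple membership check of the requested jobs against the set of jobs affected by the PR. The sketch below illustrates that idea only; it is not the plugin's code, the helper `split_rehearsable` is hypothetical, and the `affected` set is a made-up subset (the real change affected 174 jobs).

```python
def split_rehearsable(requested, affected):
    """Split requested job names into (accepted, rejected), keeping request order."""
    affected_set = set(affected)
    accepted = [job for job in requested if job in affected_set]
    rejected = [job for job in requested if job not in affected_set]
    return accepted, rejected


prefix = "periodic-ci-openshift-release-main-nightly-"
requested = [
    prefix + "4.12-e2e-aws-driver-toolkit",
    prefix + "4.15-e2e-aws-driver-toolkit",
    prefix + "4.13-e2e-aws-driver-toolkit",
]
# Hypothetical subset of the affected jobs, for illustration only.
affected = {
    prefix + "4.13-e2e-aws-driver-toolkit",
    prefix + "4.13-e2e-aws-csi",
}

accepted, rejected = split_rehearsable(requested, affected)
print(accepted)  # only the 4.13 driver-toolkit job
print(rejected)  # the 4.12 and 4.15 driver-toolkit jobs
```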

@ybettan
Member Author

ybettan commented Mar 16, 2026

/pj-rehearse periodic-ci-openshift-release-main-nightly-4.13-e2e-alibaba-ovn

level=error msg=failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: [controlPlane.platform.alibabacloud.instanceType: Invalid value: "ecs.g6.2xlarge": no available availability zones found, compute[0].platform.alibabacloud.instanceType: Invalid value: "ecs.g6.2xlarge": no available availability zones found]

@openshift-ci-robot
Contributor

@ybettan: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci
Contributor

openshift-ci bot commented Mar 16, 2026

@ybettan: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/periodic-ci-openshift-release-main-ci-4.10-e2e-aws | 4d3e670 | link | unknown | /pj-rehearse periodic-ci-openshift-release-main-ci-4.10-e2e-aws |
| ci/rehearse/periodic-ci-openshift-release-main-nightly-4.13-console-aws | caef72c | link | unknown | /pj-rehearse periodic-ci-openshift-release-main-nightly-4.13-console-aws |
| ci/rehearse/periodic-ci-openshift-release-main-nightly-4.13-e2e-alibaba-ovn | caef72c | link | unknown | /pj-rehearse periodic-ci-openshift-release-main-nightly-4.13-e2e-alibaba-ovn |
| ci/rehearse/periodic-ci-openshift-release-main-ci-4.9-upgrade-from-stable-4.8-e2e-aws-upgrade-workload | 4d3e670 | link | unknown | /pj-rehearse periodic-ci-openshift-release-main-ci-4.9-upgrade-from-stable-4.8-e2e-aws-upgrade-workload |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
