Skip to content

OCPBUGS-66898: Implement mTLS authentication and authorization for CVO metrics endpoint#1271

Open
DavidHurta wants to merge 12 commits intoopenshift:mainfrom
DavidHurta:metrics-mtls
Open

OCPBUGS-66898: Implement mTLS authentication and authorization for CVO metrics endpoint#1271
DavidHurta wants to merge 12 commits intoopenshift:mainfrom
DavidHurta:metrics-mtls

Conversation

@DavidHurta
Copy link
Contributor

@DavidHurta DavidHurta commented Dec 10, 2025

This pull request proposes to migrate the CVO metrics endpoint from bearer token authentication to mutual TLS (mTLS) to comply with OpenShift metrics conventions.

Key changes:

  • Replaced custom certificate watching logic with the k8s.io/apiserver/pkg/server/dynamiccertificates package to simplify logic
  • Authenticate clients using mutual TLS (mTLS)
  • Authorize clients by verifying CN
  • Add configurable authentication/authorization options for HyperShift compatibility
  • Update ServiceMonitor to use mTLS instead of bearer tokens

Breaking Changes:

  • Prometheus must now authenticate using client certificates signed by a trusted cluster CA. Clients without valid certificates are rejected during the TLS handshake.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 10, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 10, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai
Copy link

coderabbitai bot commented Dec 10, 2025

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Switched metrics auth from bearer tokens to TLS client certificates with CN-based authorization; introduced dynamic serving-cert and client-CA controllers; added public MetricsOptions and updated RunMetrics/createHttpServer signatures; updated ServiceMonitor, go.mod, tests, startup flags, and call sites.

Changes

Cohort / File(s) Summary
Go module updates
go.mod
Reorganized dependencies: added direct k8s.io/apiserver v0.34.1, removed direct gopkg.in/fsnotify.v1 from primary require block, and added github.com/fsnotify/fsnotify v1.9.0 as an indirect dependency.
ServiceMonitor manifest
install/0000_90_cluster-version-operator_02_servicemonitor.yaml
Removed bearerTokenFile from the ServiceMonitor endpoint and configured TLS client-cert scraping via tlsConfig.certFile / tlsConfig.keyFile.
Metrics implementation
pkg/cvo/metrics.go
Replaced token-review flow with TLS client-cert auth and CN-based authorization; introduced public MetricsOptions; changed RunMetrics and createHttpServer signatures; added dynamic serving-cert and client-CA controllers and GetConfigForClient TLS wiring; removed tokenReview interface and file-watcher rotation.
Metrics tests
pkg/cvo/metrics_test.go
Rewrote tests to use generated TLS client certificates and a mock CA provider; expanded authn/authz scenarios (no security, authn-only, authz-only, both, untrusted/missing certs); removed token-based test scaffolding.
Startup, flags, and integration tests
pkg/start/start.go, cmd/cluster-version-operator/start.go, pkg/start/start_integration_test.go
Replaced top-level ListenAddr/ServingCertFile/ServingKeyFile with nested MetricsOptions on exported Options; updated flag bindings, validation, call sites, and integration test assignments to use MetricsOptions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@DavidHurta
Copy link
Contributor Author

/test ?

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 10, 2025
@DavidHurta
Copy link
Contributor Author

/test e2e-agnostic-ovn
/test e2e-hypershift
/test e2e-hypershift-conformance

1 similar comment
@DavidHurta
Copy link
Contributor Author

/test e2e-agnostic-ovn
/test e2e-hypershift
/test e2e-hypershift-conformance

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

DavidHurta commented Dec 13, 2025

Note: PromeCIeus and the e2e-hypershift job can be used to verify that the CVO metrics in a hosted control plane can be scraped (when HyperShift is specified to use the OCP monitoring).

@DavidHurta
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Dec 13, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/cvo/metrics.go (1)

201-299: Harden the listener TLS config: ensure MinVersion/ciphers apply even if GetConfigForClient returns nil.

The base TLS config is secured via crypto.SecureTLSConfig, but the listener uses a fresh tls.Config literal containing only GetConfigForClient. If the callback returns nil or fails, the listener operates with default TLS settings lacking explicit cipher suite and minimum version constraints. Wrap the listener config with crypto.SecureTLSConfig to guarantee hardened settings at the listener level.

-	tlsConfig := &tls.Config{GetConfigForClient: servingCertController.GetConfigForClient}
+	tlsConfig := crypto.SecureTLSConfig(&tls.Config{GetConfigForClient: servingCertController.GetConfigForClient})
 	go startListening(server, tlsConfig, listenAddress, resultChannel)

Also verify the process has RBAC to read kube-system/extension-apiserver-authentication and that client-ca-file is the intended trust root for Prometheus' client cert chain in all target environments.

🧹 Nitpick comments (3)
pkg/cvo/metrics_test.go (1)

4-6: CN-based TLS auth tests are aligned with the new implementation; consider using httptest.NewRequest for a fully-formed request.
Current http.NewRequest("GET", "url-not-important", nil) is probably fine for this handler, but httptest.NewRequest("GET", "https://example.invalid/metrics", nil) is more idiomatic and avoids edge cases if the handler later reads URL/Host.

Also applies to: 1030-1088

pkg/cvo/metrics.go (2)

126-143: Rename disableAuth to disableAuthorization to avoid confusion (authn vs authz).
Right now createHttpServer(disableAuth bool) is called with metricsOptions.DisableAuthorization, but the parameter name reads like “disable authentication”.

-func createHttpServer(disableAuth bool) *http.Server {
-	if disableAuth {
+func createHttpServer(disableAuthorization bool) *http.Server {
+	if disableAuthorization {
 		handler := http.NewServeMux()
 		handler.Handle("/metrics", promhttp.Handler())
 		server := &http.Server{
 			Handler: handler,
 		}
 		return server
 	}

149-166: CN authorization is very strict; confirm the allowed identity set is complete for all supported deployments.
Hardcoding only system:serviceaccount:openshift-monitoring:prometheus-k8s may be correct, but it will break scraping if the client cert CN differs in hosted control planes / alternate Prometheus setups. Consider allowing a small allowlist (still static) to reduce fragility.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between d4cb3b0 and 920ad71.

⛔ Files ignored due to path filters (69)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/fsnotify/fsnotify/.cirrus.yml is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/.gitignore is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/.mailmap is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/CHANGELOG.md is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/CONTRIBUTING.md is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/LICENSE is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/README.md is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_fen.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_inotify.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_kqueue.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_other.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/fsnotify.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/darwin.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_darwin.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_dragonfly.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_freebsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_kqueue.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_linux.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_netbsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_openbsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_solaris.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/freebsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/internal.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/unix.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/unix2.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/shared.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/staticcheck.conf is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/system_bsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/system_darwin.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/.editorconfig is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/.gitignore is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/.travis.yml is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/AUTHORS is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/CHANGELOG.md is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/CONTRIBUTING.md is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/LICENSE is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/README.md is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/fen.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/fsnotify.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/inotify.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/inotify_poller.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/kqueue.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/open_mode_bsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/open_mode_darwin.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/cert_key.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/client_ca.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/dynamic_cafile_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/dynamic_serving_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/dynamic_sni_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/interfaces.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/named_certificates.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/static_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/tlsconfig.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/union_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/util.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/OWNERS is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/doc.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/event_broadcaster.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/event_recorder.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/fake.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/helper.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/interfaces.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (5)
  • go.mod (2 hunks)
  • install/0000_90_cluster-version-operator_02_servicemonitor.yaml (1 hunks)
  • pkg/cvo/metrics.go (5 hunks)
  • pkg/cvo/metrics_test.go (2 hunks)
  • pkg/start/start.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • install/0000_90_cluster-version-operator_02_servicemonitor.yaml
  • pkg/cvo/metrics_test.go
  • pkg/start/start.go
  • pkg/cvo/metrics.go
  • go.mod
🧬 Code graph analysis (1)
pkg/start/start.go (1)
pkg/cvo/metrics.go (2)
  • MetricsOptions (196-199)
  • RunMetrics (208-299)
🪛 ast-grep (0.40.0)
pkg/cvo/metrics.go

[warning] 245-245: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 247-247: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{ClientAuth: tls.RequireAndVerifyClientCert}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 273-273: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{GetConfigForClient: servingCertController.GetConfigForClient}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🔇 Additional comments (5)
install/0000_90_cluster-version-operator_02_servicemonitor.yaml (1)

14-21: Verify Prometheus mounts metrics-client-certs at the referenced file paths (otherwise scraping will fail).
ServiceMonitor now expects client cert/key at /etc/prometheus/secrets/metrics-client-certs/{tls.crt,tls.key}; ensure the cluster-monitoring-managed Prometheus pod has that secret mounted and populated in all target topologies (self-managed + hosted).

go.mod (1)

29-29: Dependency changes look consistent with dynamiccertificates adoption; please verify module graph is tidy and conflict-free.
Run a go mod tidy-equivalent in CI and ensure no remaining imports of gopkg.in/fsnotify.v1, and no k8s.io/* version skew introduced by the new direct k8s.io/apiserver requirement.

Also applies to: 42-42

pkg/start/start.go (1)

353-358: Double-check HyperShift semantics: this disables both client-cert auth and CN authorization.
Wiring to o.HyperShift is clean, but please confirm the hosted-control-plane scraping model really expects unauthenticated metrics over TLS (vs. mTLS with different client identity).

pkg/cvo/metrics.go (2)

18-20: Good move switching to dynamiccertificates; verify k8s.io/apiserver integration matches the vendored K8s minor version.
This import-level change is appropriate for hot-reloading certs/CA; just ensure no k8s.io/* skew and that dynamiccertificates APIs used here match v0.34.1 expectations.


196-199: MetricsOptions is a clean API shape for callers.

@DavidHurta
Copy link
Contributor Author

/test e2e-hypershift
/test e2e-agnostic-ovn

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Dec 19, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/cvo/metrics.go (1)

149-166: Certificate-based authorization correctly implemented.

The CN validation logic is sound: it checks for the presence of peer certificates and validates that the Common Name matches the expected Prometheus service account. The HTTP status codes (401 for missing certificate, 403 for unauthorized CN) are appropriate.

Note: The CN "system:serviceaccount:openshift-monitoring:prometheus-k8s" is hardcoded. Consider whether this should be configurable via MetricsOptions if other clients might need access in the future.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 920ad71 and ff6d96e.

📒 Files selected for processing (4)
  • install/0000_90_cluster-version-operator_02_servicemonitor.yaml (1 hunks)
  • pkg/cvo/metrics.go (5 hunks)
  • pkg/cvo/metrics_test.go (2 hunks)
  • pkg/start/start.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/cvo/metrics_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • install/0000_90_cluster-version-operator_02_servicemonitor.yaml
  • pkg/start/start.go
  • pkg/cvo/metrics.go
🧬 Code graph analysis (1)
pkg/start/start.go (1)
pkg/cvo/metrics.go (2)
  • MetricsOptions (196-199)
  • RunMetrics (208-312)
🪛 ast-grep (0.40.0)
pkg/cvo/metrics.go

[warning] 256-256: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{ClientAuth: clientAuth}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 283-286: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{
ClientAuth: clientAuth,
GetConfigForClient: servingCertController.GetConfigForClient,
}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🔇 Additional comments (10)
install/0000_90_cluster-version-operator_02_servicemonitor.yaml (1)

14-21: LGTM! TLS client certificate authentication correctly configured.

The ServiceMonitor now uses TLS client certificates (certFile and keyFile) for authentication instead of bearer tokens, aligning with the mTLS implementation in pkg/cvo/metrics.go that validates the client certificate's CN.

pkg/cvo/metrics.go (8)

18-19: LGTM! Imports correctly updated for certificate-based approach.

The new imports for dynamiccertificates and kubernetes client replace the previous token-based authentication dependencies and support the dynamic certificate loading implementation.


126-143: LGTM! Simplified server creation aligns with TLS-based authentication.

The function signature correctly removes the authentication client dependency since authentication is now handled at the TLS layer via client certificates, with authorization delegated to the authHandler.


196-199: LGTM! Clean options struct for metrics configuration.

The MetricsOptions struct provides a clear, type-safe way to configure authentication and authorization behavior for the metrics server.


208-223: LGTM! Serving certificate controller correctly initialized.

The serving certificate controller is properly set up to watch and automatically reload the server's TLS certificate and key files. The RunOnce initialization before starting the background watcher ensures the files are loaded before the server starts.


225-249: LGTM! Client CA controller correctly set up when authentication is enabled.

The conditional creation and initialization of the client CA controller is appropriate—it's only needed when client certificate authentication is enabled. The controller watches the extension-apiserver-authentication ConfigMap in kube-system for CA changes, which aligns with Kubernetes' standard approach for extension API server authentication.


276-288: LGTM! Server and TLS configuration properly wired.

The HTTP server creation respects the DisableAuthorization setting, and the TLS configuration correctly uses GetConfigForClient to enable dynamic certificate updates. The ClientAuth setting in the final config ensures proper fallback behavior if the callback fails.


290-311: LGTM! Server lifecycle properly managed with graceful shutdown.

The implementation correctly handles the run/shutdown contexts, waits for server termination, performs graceful shutdown on context cancellation, and properly collects results from the server goroutine. Error handling and logging are appropriate.


252-267: This concern is not valid. crypto.SecureTLSConfig from github.com/openshift/library-go/pkg/crypto automatically sets MinVersion to the cluster's default TLS version (via DefaultTLSVersion()) when MinVersion is not explicitly set in the passed tls.Config. The code in metrics.go correctly relies on this behavior and does not need explicit MinVersion configuration.

Likely an incorrect or invalid review comment.

pkg/start/start.go (1)

353-357: LGTM! MetricsOptions correctly configured for HyperShift mode.

The creation of the MetricsOptions struct with both DisableAuthentication and DisableAuthorization set to o.HyperShift is appropriate. This ensures that in HyperShift deployments, both authentication and authorization are disabled together, while in standard deployments, both are enabled.

@DavidHurta
Copy link
Contributor Author

/retest

1 similar comment
@DavidHurta
Copy link
Contributor Author

/retest

@DavidHurta
Copy link
Contributor Author

/test e2e-hypershift-conformance
/test e2e-hypershift

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

@coderabbitai review

@DavidHurta
Copy link
Contributor Author

/hold

Developing and testing.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 4, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@pkg/cvo/metrics.go`:
- Around line 206-209: The HTTP handler in metrics.go is returning the rejected
common name in the response body (commonName vs metricsAllowedClientCommonName),
which can leak internal names; change the http.Error call to return a generic
message like "access denied" or "unauthorized" while keeping the detailed
klog.V(4).Infof that logs the actual commonName for internal debugging; ensure
status remains http.StatusForbidden and do not include commonName in any
client-facing response (leave klog line unchanged or adjust verbosity if
needed).
🧹 Nitpick comments (1)
pkg/cvo/metrics.go (1)

197-215: Hardcoded authorized CN limits extensibility.

The metricsAllowedClientCommonName constant is defined inline within the authorize function. While this is acceptable for the current use case (only Prometheus scraping), consider extracting it to a package-level constant or making it configurable via MetricsOptions if future requirements may need additional authorized clients.

@DavidHurta
Copy link
Contributor Author

/retest

@DavidHurta
Copy link
Contributor Author

/unhold

@DavidHurta
Copy link
Contributor Author

/test okd-scos-images

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 16, 2026
Copy link
Member

@hongkailiu hongkailiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a question about a0b08df.
Others are just NITs and remarks.

@hongkailiu
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 17, 2026
DavidHurta and others added 11 commits February 18, 2026 14:59
…updates

This commit's goal is to prepare the existing code for mTLS support.

In OpenShift, core operators SHOULD require authentication, and they
SHOULD support TLS client certificate authentication [1]. They also
SHOULD support local authorization and SHOULD allow the well-known
metrics scraping identity [1]. To achieve this, an operator must be able
to verify a client's certificate. To do this, the certificate can be
verified using the certificate authority (CA) bundle located in a
ConfigMap in the kube-system namespace [2].

This would entail an implementation of a new controller to watch the
ConfigMap for changes. To avoid such implementation to avoid
potential bugs and future maintenance, my goal is to utilize the
`k8s.io/apiserver/pkg/server/dynamiccertificates` package for this goal
as the package provides a functionality for this specific use case.

While doing so, we can also rework the existing, a bit complex,
implementation and utilize the package for existing use cases as well
to simplify the logic and use an existing, well-tested library.

[1]: https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#metrics
[2]: https://rhobs-handbook.netlify.app/products/openshiftmonitoring/collecting_metrics.md/#exposing-metrics-for-prometheus

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
In OpenShift, core operators SHOULD require authentication and they
SHOULD support TLS client certificate authentication [1]. They also
SHOULD support local authorization and SHOULD allow the well-known
metrics scraping identity [1]. To achieve this, an operator must be able
to verify a client's certificate. To do this, the certificate can be
verified using the certificate authority (CA) bundle located at the
client-ca-file key of the kube-system/extension-apiserver-authentication
ConfigMap [2].

Guarantee failed connections when the config from the GetConfigForClient
method is nil to ensure connections are only using the TLS config
from the serving cert controller.

[1]: https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#metrics
[2]: https://rhobs-handbook.netlify.app/products/openshiftmonitoring/collecting_metrics.md/#exposing-metrics-for-prometheus

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
In OpenShift, core operators SHOULD support local authorization and
SHOULD allow the well-known metrics scraping identity
(system:serviceaccount:openshift-monitoring:prometheus-k8s) to access
the /metrics endpoint. They MAY support delegated authorization check
via SubjectAccessReviews. [1]

The well-known metrics scraping identity's client certificate is issued
for the system:serviceaccount:openshift-monitoring:prometheus-k8s
Common Name (CN) and signed by the kubernetes.io/kube-apiserver-client
signer. [2]

Thus, the commit utilizes this fact to check the client's certificate
for this specific CN value. This is also done by the hardcodedauthorizer
package utilized by other OpenShift operators for the metrics
endpoint [3].

We could utilize the existing bearer token authorization as a fallback.
However, I would like to minimize the attack surface. Especially for
security things that we are implementing and testing, rather than
importing from well-established modules.

The commit implements a user information extraction from a
certificate to minimize the needed dependencies.

[1]: https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#metrics
[2]: https://rhobs-handbook.netlify.app/products/openshiftmonitoring/collecting_metrics.md/#exposing-metrics-for-prometheus
[3]: https://pkg.go.dev/github.com/openshift/library-go/pkg/authorization/hardcodedauthorizer

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The `o.HyperShift` option is not available in older release branches.
In newer branches, the option can be utilized.
In HyperShift, the CVO currently needs to have disabled both
authorization and authentication. Ensure the aspects are disabled so as
not break HyperShift.

However, in the future, the authentication will be enabled using mTLS
and a mounted CA bundle file. Thus, authentication needs to be
configurable.

Authorization needs to be configurable as well because HyperShift
allows a custom monitoring stack to scrape hosted control plane
components. In the future in HyperShift, authentication of the metrics
endpoint of the CVO will be enforced; however, the authorization will be
disabled. This commit prepares the code for these changes.
This is done to provide HTTP return values in failures to comply
with the origin test suite.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 18, 2026
@hongkailiu
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 18, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 18, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DavidHurta, hongkailiu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [DavidHurta,hongkailiu]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 18, 2026

@DavidHurta: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-agnostic-ovn 62f88dd link true /test e2e-agnostic-ovn
ci/prow/e2e-agnostic-ovn-techpreview-serial 62f88dd link true /test e2e-agnostic-ovn-techpreview-serial

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants

Comments