Forward‑porting Azure-related changes from Jammy to Noble #476

Open
s4heid wants to merge 14 commits into cloudfoundry:ubuntu-noble from s4heid:azure-cutover-to-noble
s4heid commented Feb 25, 2026

To keep the Noble stemcell aligned with recent fixes introduced in Jammy, this PR forward‑ports the following changes to the Noble branch using cherry‑picks.

Included pull requests

Note

There is no need to forward-port #443, because the SRIOV-udev rules are already included in the azure-vm-utils package, which was added to Noble as part of #469.

Resolves #450

s4heid and others added 14 commits March 4, 2026 09:45
The typo leads to problems when cloud-init attempts to load the logging
configuration. This is evident in the logs, as shown by the following
warning:

> cloud-init[515]: 2025-09-24 12:07:29,260 - util.py[WARNING]: Failed
>   loading yaml blob. Invalid format at line 10 column 1: "expected
>   '<document start>', but found '<block mapping start>'
> cloud-init[515]:   in "<unicode string>", line 10, column 1:
> cloud-init[515]:     _log:
> cloud-init[515]:     ^"
In a previous change (cloudfoundry#449), loading the floppy module was disabled by
redirecting the module's install command to `true`. However, Buffer I/O errors
observed in the `dmesg` output indicate that the kernel is still
attempting to load the floppy module when the hardware supports it.

```txt
blk_update_request: I/O error, dev fd0, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
floppy0: disk absent or changed during operation
```

This suggests that during boot:
1. Kernel detects floppy controller hardware
2. udev/kernel auto-loads floppy module (install directive not active yet)
3. Floppy driver starts, finds no disk → I/O errors
4. install directive becomes active (too late!)

The blacklist in /etc/modprobe.d only affects modprobe after the root
filesystem is active. If the initramfs contains a `floppy.ko` file and
is not rebuilt after the blacklist has been applied, the initramfs
auto-loads the floppy driver.
`update-initramfs` rebuilds the initramfs and includes the /etc/modprobe.d
rules from the root filesystem in it.
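
The two steps described above can be sketched as follows. This is a minimal illustration rather than the stage's actual code: the function name `write_floppy_rules` and the file name `blacklist-floppy.conf` are hypothetical, and the function takes a root prefix so it can be tried against a scratch directory.

```sh
# Sketch: write modprobe rules for the floppy module under a given root.
# The file name blacklist-floppy.conf is a hypothetical choice.
write_floppy_rules() {
  root="$1"
  conf_dir="$root/etc/modprobe.d"
  mkdir -p "$conf_dir"
  {
    # Stop alias-based auto-loading of the module.
    echo "blacklist floppy"
    # Turn explicit "modprobe floppy" calls into a no-op.
    echo "install floppy /bin/true"
  } > "$conf_dir/blacklist-floppy.conf"
}

# Inside the stemcell chroot, the initramfs would then be rebuilt so
# that the rules are copied into it, for example:
#   write_floppy_rules /
#   update-initramfs -u
```

The rebuild is left as a comment because it needs root privileges and an installed kernel.
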
The system_kernel_modules stage fails when running update-initramfs
with FIPS kernels because the FIPS initramfs hooks require /sys to be
mounted for hardware introspection when MODULES=dep is configured.

The run_in_chroot helper creates an isolated mount namespace via
unshare -m, which only mounts /dev and /proc but not /sys. This causes
mkinitramfs to fail with "MODULES dep requires mounted sysfs on /sys"
when the FIPS hooks attempt to scan hardware capabilities.

Mount /sys explicitly in system_kernel_modules/apply.sh before calling
update-initramfs to ensure the kernel device model and driver
information is available for initramfs generation.
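
A guarded mount along these lines would cover the case described above. It is a sketch under assumptions, not the contents of `system_kernel_modules/apply.sh`: the function name `ensure_sysfs_mounted` and the `MOUNT_CMD` hook are hypothetical, and the hook exists only so the function can be dry-run without root.

```sh
# Sketch: make sure sysfs is mounted before mkinitramfs runs.
# MOUNT_CMD is a hypothetical hook that allows a dry run.
ensure_sysfs_mounted() {
  sys_dir="${1:-/sys}"
  if mountpoint -q "$sys_dir" 2>/dev/null; then
    return 0   # already mounted, nothing to do
  fi
  ${MOUNT_CMD:-mount} -t sysfs sysfs "$sys_dir"
}

# Inside the chroot, before rebuilding the initramfs:
#   ensure_sysfs_mounted /sys
#   update-initramfs -u
```

Checking with `mountpoint` first keeps the step idempotent if a later change to run_in_chroot starts mounting /sys itself.
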
Azure runs on the Hyper-V hypervisor, and Linux VMs rely on Linux Integration
Services (LIS) to communicate with the host. These services include kernel
modules (like hv_utils) and daemons (hv_kvp_daemon, hv_vss_daemon, etc.) that
handle:

- Host-to-guest communication
- IP address reporting
- VM metadata exchange
- Backup coordination (via VSS)

The hv_kvp_daemon specifically manages key-value pair exchange between the
Azure host and the Linux guest. If the daemon is missing, Azure may not
be able to retrieve guest metadata (e.g., hostname, IP address). This can
affect Azure Resource Graph queries, scripts that rely on guest-reported
data, and backup and monitoring tools that use KVP for coordination.

cloud-init writes warnings about events that could not be sent to hv-kvp:

```txt
$ grep kvp /var/log/cloud-init.log
failed to truncate kvp pool file, [Errno 2] No such file or directory: '/var/lib/hyperv/.kvp_pool_1'
failed posting events to kvp, [Errno 2] No such file or directory: '/var/lib/hyperv/.kvp_pool_1'
```
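
A quick check for the pool file named in the warnings above might look like this. It is only a sketch: the function name `kvp_pool_status` is hypothetical, the optional root prefix exists just to make the check testable, and it assumes the pool file appears once the daemon is installed and running.

```sh
# Sketch: report whether the KVP pool file that cloud-init writes to exists.
# The optional root prefix is only there to make the check testable.
kvp_pool_status() {
  root="${1:-}"
  pool="$root/var/lib/hyperv/.kvp_pool_1"
  if [ -e "$pool" ]; then
    echo "present: $pool"
  else
    echo "missing: $pool"
    return 1
  fi
}
```

Called without arguments on a booted VM, it inspects `/var/lib/hyperv/.kvp_pool_1` directly.
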
Azure stemcells now properly use Azure-optimized APT mirrors
(azure.archive.ubuntu.com) by enabling the apt-configure module
in cloud-init configuration.

The LISA (Linux Integration Services Automation) test suite's
verify_repository_installed test was failing because Azure VMs
were using archive.ubuntu.com instead of azure.archive.ubuntu.com
for APT package repositories. While the Azure mirror configuration
existed in /etc/cloud/cloud.cfg.d/90-azure-apt-sources.cfg, it
was never applied because the apt-configure module was not enabled
in cloud-init's module list.

Added apt-configure to the cloud_init_modules list in cloud.cfg,
which instructs cloud-init to read and apply the Azure APT mirror
configuration at VM boot time. This ensures /etc/apt/sources.list
is automatically updated to use Azure-optimized mirrors.
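
Whether the module is actually listed can be checked with a small helper such as the one below. This is a sketch: the function name `apt_configure_enabled` is hypothetical, it only looks for a list entry anywhere in the file rather than parsing the YAML, and it accepts both the hyphenated and the underscore spelling because which one a given cloud-init release expects is an assumption to verify.

```sh
# Sketch: check whether a cloud.cfg lists the apt-configure module.
# Both spellings are accepted; this is a line match, not a YAML parse.
apt_configure_enabled() {
  cfg="${1:-/etc/cloud/cloud.cfg}"
  grep -Eq '^[[:space:]]*-[[:space:]]*apt[-_]configure[[:space:]]*$' "$cfg"
}

# Example:
#   apt_configure_enabled /etc/cloud/cloud.cfg && echo "apt-configure enabled"
```
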
 * upgrades waagent to the latest version
 * syncs the systemd config, which adds memory accounting and a dependency
   on cloud-init.service so that waagent.service does not start before
   cloud-init has finished.

<https://github.com/Azure/WALinuxAgent/blob/v2.15.0.1/init/ubuntu/walinuxagent.service>
waagent assumes that the `/var/log/azure` directory exists and raises an
exception in the telemetry module if it does not.

```txt
ERROR TelemetryEventsCollector ExtHandler Event: name=WALinuxAgent,
op=ExtensionTelemetryEventProcessing, message=Unknown error occurred
when trying to collect extension events:[Errno 2] No such file or
directory: '/var/log/azure'
```
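
Creating the directory at image build time avoids the error above. The helper below is a minimal sketch: the name `ensure_azure_log_dir` and the root prefix are hypothetical, and ownership and mode are left at the defaults because the source does not specify them.

```sh
# Sketch: pre-create the directory waagent expects for extension logs.
# Ownership and permissions are left at defaults (an assumption).
ensure_azure_log_dir() {
  root="${1:-}"
  mkdir -p "$root/var/log/azure"
}

# Example, inside the chroot:
#   ensure_azure_log_dir
```
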
Co-Authored-By: Sebastian Heid <8442432+s4heid@users.noreply.github.com>
Add a systemd override (FIFO scheduling, priority 50, OOMScoreAdjust -500) and
change the Hyper-V PTP refclock poll from 3 to -1 (maximum sampling interval
drops from 8s to 0.5s; with dpoll -2 the dynamic range shifts from 2-8s to
0.125-0.5s) for tighter offset/jitter. (cloudfoundry#442)

Co-Authored-By: Sebastian Heid <8442432+s4heid@users.noreply.github.com>
This change sets the correct path when creating the chrony.service
directory. The wrong path was introduced in cloudfoundry#442.
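
The override described in the two chrony commits above could be written roughly as follows. This is a sketch under assumptions: the function name `write_chrony_override`, the drop-in directory `chrony.service.d`, and the file name `override.conf` are not confirmed by the source, and only the three values come from the commit message.

```sh
# Sketch: write a systemd drop-in for chrony.service under a given root.
# The chrony.service.d location and the override.conf file name are
# assumptions; the three values come from the commit message.
write_chrony_override() {
  root="${1:-}"
  dropin_dir="$root/etc/systemd/system/chrony.service.d"
  mkdir -p "$dropin_dir"
  cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=50
OOMScoreAdjust=-500
EOF
}
```

On a running system the drop-in would only take effect after `systemctl daemon-reload` and a restart of the service. In an image build it is picked up on first boot.
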
s4heid force-pushed the azure-cutover-to-noble branch from 8b8a7c2 to 72c9bbe March 4, 2026 14:01
s4heid requested a review from ramonskie March 4, 2026 14:04
ragaskar left a comment

I think the number of changes is a little tricky to reason about. I know it's a lot of busy-work, but I think if each of these was its own PR I'd be more inclined to hit the Approve button (because if there's a break it would be a little easier to track which change caused it).

That said, these mostly LGTM? @ramonskie -- I see you've already looked at this and have been pinged to take another look given the fixes you suggested, and I definitely trust your knowledge here more than my own, so I'll let you click the Approve button.
