
[Test] Fix test_patching_cluster false failures on Lustre modules and private-OS AMI drift #7467

Merged
gmarciani merged 2 commits into aws:develop from gmarciani:wip/mgiacomo/3160/test-patching-0701-2
Jul 2, 2026


gmarciani (Contributor) commented Jul 1, 2026

Description of changes

Fix test_patching_cluster false failures on Lustre modules and private-OS AMI drift

In particular:

  1. The FSx Lustre mount is on-demand (x-systemd.automount), so its client modules aren't loaded on the head node right after the patch reboot. Trigger the mount before the post-patch snapshot and define the mount dir once in the test, injecting it into the cluster config.

  2. On cluster updates, pin private OSes (rocky8/rocky9) to the AMI the cluster was created with. The framework re-injects the latest private AMI on every render, so the update would otherwise drift to a newer, possibly unavailable, AMI.
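
The two fixes above can be sketched roughly as follows. This is an illustration, not the code merged in this PR: `build_trigger_mount_command` and `pin_custom_ami` are hypothetical helper names, the sketch assumes that listing the mount directory is enough to make the systemd automount unit mount the filesystem, and it assumes the rendered cluster config is available as a dict with the ParallelCluster `Image.CustomAmi` key.

```python
import copy
import shlex


def build_trigger_mount_command(mount_dir: str) -> str:
    """Return a shell command that lists the mount dir.

    Assumption: accessing an x-systemd.automount mount point makes systemd
    mount the filesystem, which in turn loads the Lustre client modules.
    """
    return f"ls {shlex.quote(mount_dir)} > /dev/null"


def pin_custom_ami(rendered_config: dict, original_ami: str) -> dict:
    """Return a copy of the rendered config pinned to the creation-time AMI.

    Hypothetical helper: overrides Image.CustomAmi so that a re-rendered
    config does not pick up a newer private AMI on update.
    """
    pinned = copy.deepcopy(rendered_config)
    pinned.setdefault("Image", {})["CustomAmi"] = original_ami
    return pinned
```

In a test, the command would be run on the head node before taking the post-patch snapshot, and the pinned config would be the one passed to the cluster update. The input config is left unmodified so that the freshly rendered version can still be inspected.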

Also added a log line reporting the active kernel after patching, to facilitate troubleshooting.
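
A minimal sketch of what such a log line could look like, assuming the test has some callable that runs a shell command on the head node and returns its stdout. The `run_command` parameter and the `log_active_kernel` name are hypothetical and not taken from this PR.

```python
import logging


def log_active_kernel(run_command, logger=None) -> str:
    """Log and return the kernel release reported by `uname -r`.

    `run_command` is a hypothetical callable: it takes a shell command
    string and returns the command's stdout as a string.
    """
    logger = logger or logging.getLogger(__name__)
    kernel = run_command("uname -r").strip()
    logger.info("Active kernel after patching: %s", kernel)
    return kernel
```

Returning the value as well as logging it lets the caller compare the post-patch kernel with the pre-patch one if needed.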

Tests

SUCCESS:

test-suites:
  patching:
    test_patching.py::test_patching_cluster:
      dimensions:
        - regions: [{{ g4dn_8xlarge_CAPACITY_RESERVATION_3_INSTANCES_2_HOURS_NOPG_rocky8 }}]
          instances: ["g4dn.8xlarge"]
          oss: ["rocky8"]
          schedulers: ["slurm"]

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

gmarciani requested review from a team as code owners July 1, 2026 23:23
gmarciani added the skip-changelog-update, 3.x, and Test labels Jul 1, 2026
gmarciani enabled auto-merge (rebase) July 2, 2026 03:08
gmarciani merged commit faba4a8 into aws:develop Jul 2, 2026
19 checks passed
gmarciani deleted the wip/mgiacomo/3160/test-patching-0701-2 branch July 2, 2026 16:17