Skip to content

Add IMDSv2 migration command for ROSA Classic clusters#909

Open
givaldolins wants to merge 1 commit into
openshift:masterfrom
givaldolins:imdsv2
Open

Add IMDSv2 migration command for ROSA Classic clusters#909
givaldolins wants to merge 1 commit into
openshift:masterfrom
givaldolins:imdsv2

Conversation

@givaldolins

@givaldolins givaldolins commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Implements automated IMDSv2 migration workflow with:

  • Pre-flight health checks (ClusterOperators, CPMS, nodes, etcd)
  • Hive MachinePool patching for infra/worker nodes
  • Sequential infra machine replacement with health validation
  • ControlPlaneMachineSet update for master node rollout
  • Post-migration validation

Supports selective migration via --nodes flag (all, master, infra, workers).

This MR automates SOP

Summary by CodeRabbit

  • New Features

    • Added osdctl cluster imdsv2 to migrate ROSA Classic clusters to require EC2 IMDSv2 authentication.
    • Supports targeting --nodes all|master|infra|workers, with pre-flight gating, interactive confirmations, and post-migration validation of node and provider-spec configuration.
  • Documentation

    • Added full command documentation and integrated it into existing cluster command references, including examples and --nodes/flag details.
  • Tests

    • Added a comprehensive test suite covering validation, pre-flight rules, configuration checks, and node readiness filtering.

@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Copy link
Copy Markdown

Walkthrough

This PR introduces osdctl cluster imdsv2, a new command that orchestrates AWS ROSA Classic cluster node migration to require IMDSv2. The implementation coordinates pre-flight validation, Hive MachinePool patching, infra machine sequential replacement, ControlPlaneMachineSet updates, and final verification across multiple Kubernetes and OCM resources.

Changes

IMDSv2 Migration Command

Layer / File(s) Summary
Cluster and Hive client utilities
cmd/cluster/helpers.go
Provides ClusterClients struct and SetupClusterClients/SetupHiveClients to establish read-only and elevated admin clients; includes health-check helpers (CheckClusterOperators, WaitForClusterOperatorsHealthy), CPMS monitoring (CheckCPMSState, MonitorCPMSRollout), node utilities (CountReadyNodes), and validation helpers (ValidateAWSClassicCluster, GetHiveNamespace).
Command registration and scaffolding
cmd/cluster/cmd.go, cmd/cluster/imdsv2.go (lines 1–44)
Registers newCmdIMDSv2() subcommand; defines module constants for IMDSv2 mode, timeouts, and Hive override annotation; imports required Kubernetes, OCM, and Hive API packages.
Command options and input validation
cmd/cluster/imdsv2.go (lines 46–109)
Defines imdsv2Options struct and Cobra command with flags (--cluster-id, --nodes, --reason); enforces cluster key format and allowed node-role values (all, master, infra, workers).
Client initialization and cluster validation
cmd/cluster/imdsv2.go (lines 111–149)
Establishes OCM connection, retrieves cluster, rejects non-AWS and HCP/Hypershift clusters, sets up standard and elevated Kubernetes clients, conditionally initializes Hive clients for infra/worker operations.
Pre-flight safety validation
cmd/cluster/imdsv2.go (lines 232–285)
Checks ClusterOperator health, validates CPMS Active state with correct replicas, confirms 3 Ready master nodes, verifies infra node readiness when applicable, and checks CPMS state for master migration.
Migration orchestration and execution
cmd/cluster/imdsv2.go (lines 151–673)
Orchestrates run sequence: infra MachinePool patching (skips if already configured, clones and patches, waits for operator health); worker MachinePool patching (lists, filters, prompts, patches with admin client); ControlPlaneMachineSet update (unmarshals spec, skips if configured, prompts for rollout, patches, monitors completion); final validation (polls node readiness, checks Machine provider specs, reports warnings).
Test coverage
cmd/cluster/imdsv2_test.go
Unit tests validate command options, node readiness polling with deletion/cordoning skipping logic, machine pool IMDSv2 configuration detection, CPMS state checking, preflight master node enforcement, node-role filtering, helper functions, and AWS Classic cluster validation across provider and hypershift scenarios.
Documentation
docs/README.md, docs/osdctl_cluster.md, docs/osdctl_cluster_imdsv2.md
Adds imdsv2 command to README cluster overview with detailed workflow and flags; updates cluster command reference; introduces dedicated osdctl_cluster_imdsv2.md page with usage, examples, and CLI interface documentation.

Sequence Diagram

sequenceDiagram
  participant User as User/CLI
  participant Cmd as osdctl imdsv2
  participant OCM as OCM API
  participant K8s as Kubernetes API
  participant Hive as Hive API

  User->>Cmd: cluster imdsv2 --cluster-id=... --nodes=...
  Cmd->>OCM: Fetch cluster and validate AWS Classic
  Cmd->>K8s: Initialize standard and admin clients
  Cmd->>Hive: Initialize Hive clients (if infra/workers)
  
  Cmd->>K8s: Pre-flight: Check operators, CPMS, nodes, etcd

  rect rgba(100, 200, 150, 0.5)
  Note over Cmd,Hive: Infra Migration (if infra/all)
  Cmd->>Hive: Get infra MachinePool
  Cmd->>Hive: Patch for IMDSv2
  Cmd->>K8s: Wait for operators healthy
  end

  rect rgba(200, 150, 100, 0.5)
  Note over Cmd,K8s: CPMS Update (if master/all)
  Cmd->>K8s: Get ControlPlaneMachineSet
  Cmd->>K8s: Patch for IMDSv2
  Cmd->>K8s: Monitor rollout completion
  end

  rect rgba(150, 150, 200, 0.5)
  Note over Cmd,Hive: Worker Migration (if workers/all)
  Cmd->>Hive: List worker MachinePools
  Cmd->>Hive: Patch for IMDSv2
  end

  Cmd->>K8s: Final validation: poll nodes Ready
  Cmd->>K8s: Check Machine provider specs
  Cmd->>User: Success/warnings
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 13 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 45.45% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ⚠️ Warning The test file uses standard Go testing (testing.T), not Ginkgo, so the Ginkgo-specific requirements don't apply. However, the tests fail general Go testing quality standards: (1) Multiple tests ign... Tests don't use Ginkgo framework (which the check specifically requires), but have significant Go testing quality issues: fix setup error handling with require.NoError(), correct line 397 to fmt.Sprintf("master-%d", i), add assertion mes...
✅ Passed checks (13 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the primary change: adding an IMDSv2 migration command for ROSA Classic clusters, which is exactly what the changeset implements across multiple files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed The PR's test file (imdsv2_test.go) uses standard Go testing with *testing.T, not Ginkgo. All test names are static and descriptive with no dynamic values; Ginkgo-specific constructs (It, Describe,...
Microshift Test Compatibility ✅ Passed This PR adds only regular Go unit tests (cmd/cluster/imdsv2_test.go using testing.T), not Ginkgo e2e tests. The custom check only applies to new Ginkgo e2e tests, so it does not apply here.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests are added in this PR. All test additions are unit tests using Go's standard testing package (func TestXxx(t *testing.T)) with testify assertions. This is a CLI tool (osdctl), no...
Topology-Aware Scheduling Compatibility ✅ Passed This PR adds a CLI tool (osdctl cluster imdsv2), not deployment manifests, controllers, or operator code. The check applies to "deployment manifests, operator code, or controllers" which this PR...
Ote Binary Stdout Contract ✅ Passed The custom check applies to OTE (OpenShift Tests Extension) test binaries, but osdctl is a regular CLI tool for SRE operations. The added imdsv2 command is not part of a test execution framework an...
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No Ginkgo e2e tests are present in this PR. The test file (imdsv2_test.go) uses standard Go testing with testing.T interface; no Ginkgo framework imports or patterns (It(), Describe(), Context(), e...
No-Weak-Crypto ✅ Passed PR contains no weak crypto usage. No crypto packages (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB) are imported. Code uses only JSON serialization and Kubernetes/OpenShift APIs for IMDSv2 migration or...
Container-Privileges ✅ Passed PR contains only Go CLI code and documentation; no container/K8s manifests with privileged configurations exist to check.
No-Sensitive-Data-In-Logs ✅ Passed No hardcoded secrets, API keys, passwords, tokens, or PII are exposed in logs. Node/machine names and cluster IDs are standard infrastructure identifiers. The reason flag is user input but intentio...

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from devppratik and petrkotas June 3, 2026 20:35

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
cmd/cluster/imdsv2.go (1)

348-395: ⚖️ Poor tradeoff

Consider using the Kubernetes client instead of shelling out to oc.

Shelling out to oc introduces a dependency on the CLI being installed and configured. The kubeconfig context used by oc may differ from the client initialized via backplane in init(). Consider using the o.client to fetch ClusterOperator resources directly via the config.openshift.io/v1 API group.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/cluster/imdsv2.go` around lines 348 - 395, The checkClusterOperators
function currently shells out to `oc` to list ClusterOperators; replace that
with direct API calls using the existing o.client so the same
kubeconfig/backplane client is used. Use the OpenShift config v1 ClusterOperator
type (e.g., configv1.ClusterOperatorList) or a dynamic client list against
"config.openshift.io/v1" to fetch all ClusterOperator resources with
o.client.List(ctx, &list) (or equivalent), then iterate list.Items and inspect
each item.Status.Conditions exactly as done now to build unhealthyOps, returning
errors on client.List or unmarshalling failures as appropriate; keep the same
final behavior/message and error formatting but remove the exec.CommandContext
call and JSON unmarshal logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/cluster/imdsv2.go`:
- Around line 728-754: The function monitorCPMSRollout creates a timeout context
from context.Background(), breaking cancellation propagation; change the context
creation to derive from the passed ctx (use context.WithTimeout(ctx,
imdsv2RolloutPollTimeout) and keep cancel deferred) so parent cancellations
(SIGINT or upstream cancellation) interrupt the polling loop; update the context
creation in monitorCPMSRollout where pollCtx and cancel are defined (and
continue to pass pollCtx into wait.PollUntilContextTimeout as before).

---

Nitpick comments:
In `@cmd/cluster/imdsv2.go`:
- Around line 348-395: The checkClusterOperators function currently shells out
to `oc` to list ClusterOperators; replace that with direct API calls using the
existing o.client so the same kubeconfig/backplane client is used. Use the
OpenShift config v1 ClusterOperator type (e.g., configv1.ClusterOperatorList) or
a dynamic client list against "config.openshift.io/v1" to fetch all
ClusterOperator resources with o.client.List(ctx, &list) (or equivalent), then
iterate list.Items and inspect each item.Status.Conditions exactly as done now
to build unhealthyOps, returning errors on client.List or unmarshalling failures
as appropriate; keep the same final behavior/message and error formatting but
remove the exec.CommandContext call and JSON unmarshal logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 993b8d49-1d85-43e7-9f2d-d04b1a1555f0

📥 Commits

Reviewing files that changed from the base of the PR and between 7117673 and 38748b6.

📒 Files selected for processing (2)
  • cmd/cluster/cmd.go
  • cmd/cluster/imdsv2.go

Comment thread cmd/cluster/imdsv2.go Outdated
@givaldolins

Copy link
Copy Markdown
Contributor Author

/retest

@givaldolins

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@clcollins clcollins left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of questions.

Also, Claude noticed ""preFlightChecks, monitorCPMSRollout, init, and Hive client setup are near-identical copies of changevolumetype.go" and suggested they should be consolidated/adjusted to work so there's less redundant code.

Comment thread cmd/cluster/imdsv2.go Outdated
}

// replaceInfraMachines replaces infra machines one at a time.
func (o *imdsv2Options) replaceInfraMachines(ctx context.Context) error {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to try to use

func RunMachinePoolDance(ctx context.Context, clients DanceClients, originalMp, newMp *hivev1.MachinePool, onTimeout func(ctx context.Context, nodes *corev1.NodeList) error) error {
for this piece.

Comment thread cmd/cluster/imdsv2.go Outdated
}

// Delete the machine (MachineSet will create replacement with IMDSv2)
if err := o.clientAdmin.Delete(ctx, machine); err != nil {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are confirmation functions elsewhere in this PR, but there's not one included here. Is that intended or desired?

Comment thread cmd/cluster/imdsv2.go Outdated
}
}

// Step 3: Clean up temporary override annotations from MachinePools

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if this process fails partway through, before it gets here? The annotations would continue to live on the machine pools:

Comment thread cmd/cluster/imdsv2.go Outdated
fmt.Printf("Hive namespace: %s\n", hiveNamespace)

// Retrieve all MachinePools for this cluster
mpList := &hivev1.MachinePoolList{}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance this would match machinepools you don't want to patch? Are ALL machinepools in that namespace supposed to be patched, every time? (Truly asking - i'm not 100% sure)

Comment thread cmd/cluster/imdsv2.go Outdated
fmt.Println("Worker nodes must be replaced by the customer.")
}

printer.PrintlnGreen("\n✓ IMDSv2 migration completed successfully!")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Everything before this is an if statement, so there's a chance (however small - errors, or changes in the future) where you might get here with no actual operation happening, and still get the "completed successfully!" message. Is there something you can check on to ensure that it happened? Some state on the nodes, etc?

@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: givaldolins
Once this PR has been reviewed and has the lgtm label, please assign zmird-r for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
cmd/cluster/imdsv2.go (1)

628-634: 💤 Low value

time.Sleep ignores context cancellation.

The retry loop uses time.Sleep(retryDelay) which doesn't respect the passed ctx. If the user sends SIGINT during the 30-second sleep, the command won't respond until the sleep completes.

Consider using a context-aware wait:

Suggested improvement
-		if attempt < maxRetries {
-			fmt.Printf("  ⏳ Waiting for %d nodes to become Ready (attempt %d/%d): %s\n",
-				len(notReadyNodes), attempt, maxRetries, strings.Join(notReadyNodes, ", "))
-			time.Sleep(retryDelay)
-			continue
+		if attempt < maxRetries {
+			fmt.Printf("  ⏳ Waiting for %d nodes to become Ready (attempt %d/%d): %s\n",
+				len(notReadyNodes), attempt, maxRetries, strings.Join(notReadyNodes, ", "))
+			select {
+			case <-ctx.Done():
+				return ctx.Err()
+			case <-time.After(retryDelay):
+			}
+			continue
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/cluster/imdsv2.go` around lines 628 - 634, The `time.Sleep(retryDelay)`
call in the retry loop blocks indefinitely and ignores context cancellation,
preventing responsive handling of SIGINT signals. Replace the
`time.Sleep(retryDelay)` statement with a context-aware wait mechanism using a
select statement that monitors both `ctx.Done()` for cancellation and a timer
for the delay. If the context is cancelled first, break out of the retry loop;
otherwise, continue to the next retry attempt after the timer fires.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/cluster/imdsv2.go`:
- Around line 399-406: The validWorkerNames map and filtering logic in the
MachinePool iteration only matches the default "worker" pool name, silently
skipping custom worker pools like "worker-2" or "gpu-workers" that ROSA
supports. Replace the inclusion-based filter (validWorkerNames map checking
mp.Spec.Name == "worker") with an exclusion-based approach that skips only
"infra" and "master" pool types, or add logging to indicate which pools are
being skipped so operators understand which pools require separate handling.
This ensures the function processes all worker pools as its comment claims
rather than just the single default pool.

---

Nitpick comments:
In `@cmd/cluster/imdsv2.go`:
- Around line 628-634: The `time.Sleep(retryDelay)` call in the retry loop
blocks indefinitely and ignores context cancellation, preventing responsive
handling of SIGINT signals. Replace the `time.Sleep(retryDelay)` statement with
a context-aware wait mechanism using a select statement that monitors both
`ctx.Done()` for cancellation and a timer for the delay. If the context is
cancelled first, break out of the retry loop; otherwise, continue to the next
retry attempt after the timer fires.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a8204653-bfa2-4998-9304-1c072b527ed8

📥 Commits

Reviewing files that changed from the base of the PR and between 07e230f and a43353a.

📒 Files selected for processing (2)
  • cmd/cluster/helpers.go
  • cmd/cluster/imdsv2.go

Comment thread cmd/cluster/imdsv2.go Outdated
This command automates the SOP for migrating ROSA Classic cluster nodes
to enforce IMDSv2 (Instance Metadata Service v2) by:
- Replacing infra nodes using the MachinePool dance pattern
- Updating ControlPlaneMachineSet for automatic master node rollout
- Patching worker MachinePools (customer performs machine replacement)
- Validating all nodes/machines are using IMDSv2

Key features:
- Pre-flight checks verify cluster health before making changes
- Confirmation prompts before all destructive operations
- Tracks actual changes and shows accurate success/skip messages
- Supports selective migration: --nodes all|master|infra|workers
- Context-aware with proper cancellation handling
- Comprehensive test coverage (11 test cases)

Implementation follows all review comments from @clcollins:
1. Code consolidation via helpers.go (shared with changevolumetype)
2. Uses RunMachinePoolDance for atomic infra replacement
3. Confirmation prompts before destructive operations
4. No annotation cleanup risk (eliminated pattern entirely)
5. MachinePool name validation for safety
6. Accurate change tracking and reporting

Additional improvements from CodeRabbit AI review:
- Exclusion-based worker pool filtering to support custom pools
  (worker-2, gpu-workers, etc.)
- Context-aware sleep for responsive cancellation handling

Files added:
- cmd/cluster/imdsv2.go (501 lines) - main command implementation
- cmd/cluster/helpers.go (235 lines) - shared helper functions
- cmd/cluster/imdsv2_test.go (580 lines) - comprehensive tests
- docs/osdctl_cluster_imdsv2.md - generated documentation

All tests passing, builds successfully.

Signed-off-by: Givaldo Lins <glins@redhat.com>
@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

@givaldolins: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@givaldolins

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/cluster/imdsv2_test.go`:
- Around line 224-333: The tests `TestCheckIMDSv2Configuration` and
`TestMachinePoolNameValidation` are tautological because they duplicate logic
locally instead of testing actual production functions. To fix this, identify
the real production functions or helpers in `imdsv2.go` that implement the
IMDSv2 configuration checking and machine pool name validation logic, then
rewrite these tests to call those actual functions with the test inputs and
assert on their outputs, rather than reimplementing the same logic inline. If no
suitable public functions exist, extract testable helper functions from
`imdsv2.go` and test those instead. This ensures the tests will catch actual
regressions in the production code.
- Line 397: The Name field assignment is using string(rune(i)) which converts
the integer loop counter to its corresponding Unicode character rather than its
numeric digit string representation, resulting in non-printable control
characters. Replace the rune conversion with proper numeric string formatting
such as using strconv.Itoa(i) or fmt.Sprintf with the %d verb to convert the
integer i to its string representation, so that the master node names are
generated correctly as "master-0", "master-1", etc.
- Around line 88-90: Replace all instances where `AddToScheme()` and `Build()`
errors are ignored using blank identifier assignments (like `_ =
corev1.AddToScheme(scheme)`) with proper error checking using
`require.NoError(t, err)` instead. This applies to all occurrences in
imdsv2_test.go, particularly at the locations mentioned (lines 88-90, 136-138,
183-185, 337, 366-368, and 527-553). Capture the error returned from each
AddToScheme() and Build() call and validate it immediately to ensure setup
failures are caught at the point of failure rather than causing misleading test
symptoms later.

In `@cmd/cluster/imdsv2.go`:
- Around line 514-518: Add a nil check for ProviderSpec.Value before
dereferencing Value.Raw in the code block that parses the AWS provider spec.
Before the json.Unmarshal call that accesses
cpms.Spec.Template.OpenShiftMachineV1Beta1Machine.Spec.ProviderSpec.Value.Raw,
verify that Value is not nil and return an appropriate error if it is missing.
This defensive check should follow the same pattern used in the validateIMDSv2
function to prevent potential panics from malformed or missing provider
specifications.

In `@docs/osdctl_cluster_imdsv2.md`:
- Around line 7-40: The command documentation advertises workers as a supported
value in the --nodes flag option, but the synopsis and examples only document
migration steps for infra and master nodes, creating a discrepancy. Either add a
documented worker node migration step and example to match the advertised
--nodes workers capability, or remove workers from the supported values in the
--nodes option. Since this documentation is auto-generated, make these changes
in the source code generator or template that produces this markdown file, not
directly in the emitted markdown.

In `@docs/README.md`:
- Around line 1767-1798: The documentation for the imdsv2 command advertises
support for the --nodes workers flag but does not describe the worker migration
workflow, leaving this documented CLI option undocumented. Instead of editing
the markdown file directly under docs/, locate and update the source generator
or template that produces this documentation (likely a source file or template
outside the docs/ directory). Add documentation describing the worker migration
path using the --nodes workers flag to ensure all advertised command options are
properly documented and the CLI contract is consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b71875da-6e93-49c0-bc85-cbd7a3620dc8

📥 Commits

Reviewing files that changed from the base of the PR and between a43353a and e743555.

📒 Files selected for processing (7)
  • cmd/cluster/cmd.go
  • cmd/cluster/helpers.go
  • cmd/cluster/imdsv2.go
  • cmd/cluster/imdsv2_test.go
  • docs/README.md
  • docs/osdctl_cluster.md
  • docs/osdctl_cluster_imdsv2.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • cmd/cluster/cmd.go
  • docs/osdctl_cluster.md
  • cmd/cluster/helpers.go

Comment thread cmd/cluster/imdsv2_test.go
Comment thread cmd/cluster/imdsv2_test.go
Comment thread cmd/cluster/imdsv2_test.go
Comment thread cmd/cluster/imdsv2.go
Comment thread docs/osdctl_cluster_imdsv2.md
Comment thread docs/README.md
@givaldolins

Copy link
Copy Markdown
Contributor Author

A couple of questions.

Also, Claude noticed ""preFlightChecks, monitorCPMSRollout, init, and Hive client setup are near-identical copies of changevolumetype.go" and suggested they should be consolidated/adjusted to work so there's less redundant code.

Hi @clcollins,
I made some changes and rebased the code to avoid unnecessary additional commits. I believe the changes address the observations you made. Could you please take another look whenever you can?
Thank you.

@givaldolins givaldolins requested a review from clcollins June 17, 2026 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants