Skip to content

OCPEDGE-2354: Automate OCP-89594-Verify LVMCluster deviceDiscoveryPolicy is configurable#2709

Merged
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
mmakwana30:OCP-89594
Jun 25, 2026
Merged

OCPEDGE-2354: Automate OCP-89594-Verify LVMCluster deviceDiscoveryPolicy is configurable#2709
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
mmakwana30:OCP-89594

Conversation

@mmakwana30

@mmakwana30 mmakwana30 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Logs: https://privatebin.corp.redhat.com/?2ff4c133762d8fc0#ZgUEoWtJpK1zQ1XameFbus1HcnGx8WKYpW7rY56Svq8

Summary by CodeRabbit

  • Tests
    • Added integration tests for LVMCluster device discovery policy runtime configuration, validating transitions between Static (explicit disk selection) and Dynamic (auto-discovery) modes.
    • Verified device exclusion tracking and status reporting across nodes.
    • Introduced supporting test utilities and templates.

@openshift-ci openshift-ci Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 23, 2026
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Walkthrough

Adds a Ginkgo integration test for LVMCluster deviceDiscoveryPolicy runtime configurability, covering Static with explicit disk paths, policy transitions between Static and Dynamic, device count preservation, and original cluster restoration. Backed by new query helpers and a parameterized OpenShift Template testdata file.

Changes

deviceDiscoveryPolicy Integration Test

Layer / File(s) Summary
Template and query helpers
test/integration/qe_tests/testdata/lvmcluster-devicediscovery-template.yaml, test/integration/qe_tests/lvms_utils.go
OpenShift Template parameterizes LVMCluster with deviceDiscoveryPolicy, device class, and fsType. Helper functions query lvmvolumegroupnodestatus for VG devices, policy, excluded devices, and reasons; createLVMClusterWithDeviceDiscoveryPolicy applies the template with defaults.
Test setup and Static policy with explicit disk
test/integration/qe_tests/lvms.go
Ginkgo It block selects a common worker disk, saves and deletes the original LVMCluster, creates a Static LVMCluster with one explicit disk path, asserts VG device count, excluded-device list, and excluded reason, then deletes the test cluster.
Policy transitions and cluster restore
test/integration/qe_tests/lvms.go
Recreates LVMCluster in Static mode (no explicit paths), verifies RuntimeStatic, patches to Dynamic, verifies RuntimeDynamic with preserved device count, patches back to Static, verifies RuntimeStatic, deletes the test cluster, and restores the original from saved JSON.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 12 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ⚠️ Warning Helper functions ignore oc command errors (e.g., getVGDevices, getExcludedDeviceReasons, getDeviceDiscoveryPolicyStatus swallow errors), violating "Never ignore error returns" pattern; deferred cle... Return errors from helper functions matching existing pattern (e.g., resourceExists), add error assertions to deferred cleanup, scope getExcludedDeviceReasons to requested deviceClass, and add meaningful failure messages to all assertions.
Single Node Openshift (Sno) Test Compatibility ⚠️ Warning The test assumes multiple worker nodes but lacks protection against SNO. Line 5842 expects device count to equal worker node count (one per node), failing to validate actual multi-node requirements. Add SNO skip check after line 5768: if workerNodeCount < 2, call g.Skip("Test requires multiple worker nodes"). Or remove SNO label from g.Label() on line 5758 if test is only for MNO.
✅ Passed checks (12 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed Test name is stable and deterministic with no dynamic values (pods, timestamps, UUIDs, nodes, namespaces, IPs). Title clearly describes what is validated.
Microshift Test Compatibility ✅ Passed Test uses TopoLVM/LVMS APIs (lvm.topolvm.io), which are available on MicroShift. No OpenShift-exclusive APIs are used. Test labels (SNO, MNO) indicate multi-platform design.
Topology-Aware Scheduling Compatibility ✅ Passed PR adds only test code and test-data templates; no operator code, controllers, or deployment manifests with scheduling constraints introduced.
Ote Binary Stdout Contract ✅ Passed The PR adds test code and helpers without any process-level stdout writes. The init() function (pre-existing, line 38) correctly configures klog and GinkgoWriter. All new code: test case (inside g....
Ipv6 And Disconnected Network Test Compatibility ✅ Passed Test contains no IPv4 hardcoded addresses, external connectivity, or IP family assumptions. All operations are cluster-API calls via oc, which work identically on IPv4 and IPv6 networks.
No-Weak-Crypto ✅ Passed No weak cryptography detected. PR adds integration tests and helpers for LVMCluster policy configuration with no MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB, custom crypto implementations, or insecure...
Container-Privileges ✅ Passed PR adds test code (Go files) and an LVMCluster template YAML that defines only storage config, not container workloads. No privileged container settings found.
No-Sensitive-Data-In-Logs ✅ Passed No sensitive data logging found. New code captures LVMCluster config JSON and cluster status YAML for resource management, but never logs credentials, passwords, tokens, API keys, or PII. Disk path...
Title check ✅ Passed The title accurately describes the main change: automating verification that LVMCluster deviceDiscoveryPolicy is configurable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@codecov-commenter

codecov-commenter commented Jun 23, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 53.40%. Comparing base (44e238a) to head (26736af).
⚠️ Report is 36 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2709   +/-   ##
=======================================
  Coverage   53.40%   53.40%           
=======================================
  Files          53       53           
  Lines        4033     4033           
=======================================
  Hits         2154     2154           
  Misses       1709     1709           
  Partials      170      170           
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
test/integration/qe_tests/lvms.go (1)

5915-5917: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Prefer polling over fixed sleep for the final preservation check.

time.Sleep(30 * time.Second) makes this phase slower and still timing-sensitive. Consistently on device count is more deterministic.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/integration/qe_tests/lvms.go` around lines 5915 - 5917, Replace the
fixed 30-second time.Sleep call before the device count preservation check with
a polling-based approach using Consistently to verify the device count remains
stable. Instead of sleeping once and then checking, use Consistently to
repeatedly call getVGDevices(deviceClassName) over a time period and assert that
the length remains equal to deviceCountAfterDynamic throughout, making the test
more deterministic and avoiding unnecessary delays.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/integration/qe_tests/lvms_utils.go`:
- Around line 2066-2069: The function getExcludedDeviceReasons receives a
deviceClassName parameter but does not use it, instead returning the raw YAML
output for all lvmvolumegroupnodestatus items. This causes tests to potentially
pass based on unrelated entries. Modify the function to use the deviceClassName
parameter to filter the query, parse the resulting YAML output, and return only
the excluded device reasons for the specific device class that was requested
rather than the entire unscoped output.
- Around line 2045-2063: The functions getVGDevices,
getDeviceDiscoveryPolicyStatus, and getExcludedDevices all ignore the error
return from cmd.CombinedOutput() by using blank identifier assignment, which
causes CLI failures to appear as valid empty data. Modify each of these three
functions to return an error as a second return value, capture the error from
cmd.CombinedOutput(), and return it if non-nil. Then update all call sites of
these functions to properly handle the returned error by checking for non-nil
values and failing the test appropriately instead of silently continuing with
empty results.

In `@test/integration/qe_tests/lvms.go`:
- Around line 5797-5802: The deferred cleanup function is not checking for
errors returned by createLVMClusterFromJSON and waitForLVMClusterReady, which
can cause the test suite to proceed with a broken cluster state if these
operations fail. Add proper error handling in the deferred cleanup block by
capturing the errors from both createLVMClusterFromJSON(originLVMJSON) and
waitForLVMClusterReady(originLVMClusterName, lvmsNamespace,
LVMClusterReadyTimeout) calls, and explicitly fail the spec or log critical
errors if any of these operations fail to ensure cascading failures are
prevented.

---

Nitpick comments:
In `@test/integration/qe_tests/lvms.go`:
- Around line 5915-5917: Replace the fixed 30-second time.Sleep call before the
device count preservation check with a polling-based approach using Consistently
to verify the device count remains stable. Instead of sleeping once and then
checking, use Consistently to repeatedly call getVGDevices(deviceClassName) over
a time period and assert that the length remains equal to
deviceCountAfterDynamic throughout, making the test more deterministic and
avoiding unnecessary delays.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8cf35574-10dc-4354-91a6-2882fe3748aa

📥 Commits

Reviewing files that changed from the base of the PR and between 46f16ee and 366e9e9.

📒 Files selected for processing (3)
  • test/integration/qe_tests/lvms.go
  • test/integration/qe_tests/lvms_utils.go
  • test/integration/qe_tests/testdata/lvmcluster-devicediscovery-template.yaml

Comment thread test/integration/qe_tests/lvms_utils.go Outdated
Comment thread test/integration/qe_tests/lvms_utils.go Outdated
Comment thread test/integration/qe_tests/lvms.go
@pacevedom

Copy link
Copy Markdown
Contributor

/hold
Do we have a ticket number for this?

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 25, 2026

@pacevedom pacevedom left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 25, 2026
@openshift-ci

openshift-ci Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mmakwana30, pacevedom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 25, 2026
@pacevedom pacevedom changed the title Automate OCP-89594-Verify LVMCluster deviceDiscoveryPolicy is configurable OCPEDGE-2354: Automate OCP-89594-Verify LVMCluster deviceDiscoveryPolicy is configurable Jun 25, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 25, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 25, 2026

Copy link
Copy Markdown

@mmakwana30: This pull request references OCPEDGE-2354 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Logs: https://privatebin.corp.redhat.com/?2ff4c133762d8fc0#ZgUEoWtJpK1zQ1XameFbus1HcnGx8WKYpW7rY56Svq8

Summary by CodeRabbit

  • Tests
  • Added integration tests for LVMCluster device discovery policy runtime configuration, validating transitions between Static (explicit disk selection) and Dynamic (auto-discovery) modes.
  • Verified device exclusion tracking and status reporting across nodes.
  • Introduced supporting test utilities and templates.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@pacevedom

Copy link
Copy Markdown
Contributor

/hold cancel

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 25, 2026
g.By("#9. Delete LVMCluster for next test phase")
deleteLVMClusterSafely(newLVMClusterName, lvmsNamespace, deviceClassName)

g.By("#10. Create LVMCluster with deviceDiscoveryPolicy: Static (auto-discovery, no explicit paths)")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

step #4 creates a lvmcluster with explicit forceWipeDevicesAndDestroyAllData: true, step 9 deletes it and step 10 recreates via template which does not include that field, the disk may retain vg metadata from step 4 causing vg-manager to reject them in step 10
Fix: Add forceWipeDevicesAndDestroyAllData: true to the template

`, newLVMClusterName, lvmsNamespace, deviceClassName, firstDiskPath)
err = createLVMClusterFromJSON(lvmClusterYAML)
o.Expect(err).NotTo(o.HaveOccurred())
defer func() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this defer here ? we are already doing it in step22 and step 3 right ?

newLVMClusterName := "test-lvmcluster-89594"
deviceClassName := "vg1"

g.By("#4. Create LVMCluster with deviceDiscoveryPolicy: Static and explicit path to ONE disk")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of this can we use the template ? extend it with optional paths and forceWipeDevicesAndDestroyAllData parameters

sizePercent: 90
overprovisionRatio: 10
`, newLVMClusterName, lvmsNamespace, deviceClassName, firstDiskPath)
err = createLVMClusterFromJSON(lvmClusterYAML)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while this might work the naming is a bit misleading. It says createlvmclusterfromjson but as an argument it passes yaml

@openshift-ci

openshift-ci Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

@mmakwana30: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit bdda898 into openshift:main Jun 25, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants