OCPEDGE-2354: Automate OCP-89594-Verify LVMCluster deviceDiscoveryPolicy is configurable#2709
Conversation
WalkthroughAdds a Ginkgo integration test for ChangesdeviceDiscoveryPolicy Integration Test
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 12 | ❌ 3❌ Failed checks (3 warnings)
✅ Passed checks (12 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2709 +/- ##
=======================================
Coverage 53.40% 53.40%
=======================================
Files 53 53
Lines 4033 4033
=======================================
Hits 2154 2154
Misses 1709 1709
Partials 170 170 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
test/integration/qe_tests/lvms.go (1)
5915-5917: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winPrefer polling over fixed sleep for the final preservation check.
time.Sleep(30 * time.Second)makes this phase slower and still timing-sensitive.Consistentlyon device count is more deterministic.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/integration/qe_tests/lvms.go` around lines 5915 - 5917, Replace the fixed 30-second time.Sleep call before the device count preservation check with a polling-based approach using Consistently to verify the device count remains stable. Instead of sleeping once and then checking, use Consistently to repeatedly call getVGDevices(deviceClassName) over a time period and assert that the length remains equal to deviceCountAfterDynamic throughout, making the test more deterministic and avoiding unnecessary delays.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/integration/qe_tests/lvms_utils.go`:
- Around line 2066-2069: The function getExcludedDeviceReasons receives a
deviceClassName parameter but does not use it, instead returning the raw YAML
output for all lvmvolumegroupnodestatus items. This causes tests to potentially
pass based on unrelated entries. Modify the function to use the deviceClassName
parameter to filter the query, parse the resulting YAML output, and return only
the excluded device reasons for the specific device class that was requested
rather than the entire unscoped output.
- Around line 2045-2063: The functions getVGDevices,
getDeviceDiscoveryPolicyStatus, and getExcludedDevices all ignore the error
return from cmd.CombinedOutput() by using blank identifier assignment, which
causes CLI failures to appear as valid empty data. Modify each of these three
functions to return an error as a second return value, capture the error from
cmd.CombinedOutput(), and return it if non-nil. Then update all call sites of
these functions to properly handle the returned error by checking for non-nil
values and failing the test appropriately instead of silently continuing with
empty results.
In `@test/integration/qe_tests/lvms.go`:
- Around line 5797-5802: The deferred cleanup function is not checking for
errors returned by createLVMClusterFromJSON and waitForLVMClusterReady, which
can cause the test suite to proceed with a broken cluster state if these
operations fail. Add proper error handling in the deferred cleanup block by
capturing the errors from both createLVMClusterFromJSON(originLVMJSON) and
waitForLVMClusterReady(originLVMClusterName, lvmsNamespace,
LVMClusterReadyTimeout) calls, and explicitly fail the spec or log critical
errors if any of these operations fail to ensure cascading failures are
prevented.
---
Nitpick comments:
In `@test/integration/qe_tests/lvms.go`:
- Around line 5915-5917: Replace the fixed 30-second time.Sleep call before the
device count preservation check with a polling-based approach using Consistently
to verify the device count remains stable. Instead of sleeping once and then
checking, use Consistently to repeatedly call getVGDevices(deviceClassName) over
a time period and assert that the length remains equal to
deviceCountAfterDynamic throughout, making the test more deterministic and
avoiding unnecessary delays.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 8cf35574-10dc-4354-91a6-2882fe3748aa
📒 Files selected for processing (3)
test/integration/qe_tests/lvms.gotest/integration/qe_tests/lvms_utils.gotest/integration/qe_tests/testdata/lvmcluster-devicediscovery-template.yaml
|
/hold |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mmakwana30, pacevedom The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@mmakwana30: This pull request references OCPEDGE-2354 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/hold cancel |
| g.By("#9. Delete LVMCluster for next test phase") | ||
| deleteLVMClusterSafely(newLVMClusterName, lvmsNamespace, deviceClassName) | ||
|
|
||
| g.By("#10. Create LVMCluster with deviceDiscoveryPolicy: Static (auto-discovery, no explicit paths)") |
There was a problem hiding this comment.
step #4 creates a lvmcluster with explicit forceWipeDevicesAndDestroyAllData: true, step 9 deletes it and step 10 recreates via template which does not include that field, the disk may retain vg metadata from step 4 causing vg-manager to reject them in step 10
Fix: Add forceWipeDevicesAndDestroyAllData: true to the template
| `, newLVMClusterName, lvmsNamespace, deviceClassName, firstDiskPath) | ||
| err = createLVMClusterFromJSON(lvmClusterYAML) | ||
| o.Expect(err).NotTo(o.HaveOccurred()) | ||
| defer func() { |
There was a problem hiding this comment.
why do we need this defer here ? we are already doing it in step22 and step 3 right ?
| newLVMClusterName := "test-lvmcluster-89594" | ||
| deviceClassName := "vg1" | ||
|
|
||
| g.By("#4. Create LVMCluster with deviceDiscoveryPolicy: Static and explicit path to ONE disk") |
There was a problem hiding this comment.
instead of this can we use the template ? extend it with optional paths and forceWipeDevicesAndDestroyAllData parameters
| sizePercent: 90 | ||
| overprovisionRatio: 10 | ||
| `, newLVMClusterName, lvmsNamespace, deviceClassName, firstDiskPath) | ||
| err = createLVMClusterFromJSON(lvmClusterYAML) |
There was a problem hiding this comment.
while this might work the naming is a bit misleading. It says createlvmclusterfromjson but as an argument it passes yaml
|
@mmakwana30: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Logs: https://privatebin.corp.redhat.com/?2ff4c133762d8fc0#ZgUEoWtJpK1zQ1XameFbus1HcnGx8WKYpW7rY56Svq8
Summary by CodeRabbit
LVMClusterdevice discovery policy runtime configuration, validating transitions between Static (explicit disk selection) and Dynamic (auto-discovery) modes.