
OCPBUGS-90086: [release-4.22] clear stale EtcdRecoveryActive failure condition when etcd is healthy #8806

Open
vsolanki12 wants to merge 1 commit into openshift:release-4.22 from vsolanki12:backport-c5ada28-to-release-4.22

Conversation

@vsolanki12

Contributor

Summary

Backport of #8406 to release-4.22.

  • Add a Reason check to the condition update logic so that the EtcdRecoveryActive condition can transition from reason EtcdRecoveryJobFailed to AsExpected even though both use Status=False
  • Clear the stale EtcdRecoveryJobFailed reason and message when etcd is fully healthy and no pods are failing
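
The Reason check in the first bullet can be illustrated with a minimal Go sketch. This is not the code from the PR: the `Condition` struct is a simplified stand-in for a Kubernetes status condition, and `needsUpdate` and the message text are hypothetical names and values chosen for illustration. The sketch only shows why comparing Status alone would leave a stale reason in place.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition.
// It is defined here only to keep the sketch self-contained.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// needsUpdate is a hypothetical helper, not a function from the PR.
// It reports whether desired should replace existing. Comparing Status
// alone would miss a change of Reason when both conditions are False,
// so Reason is compared as well.
func needsUpdate(existing *Condition, desired Condition) bool {
	if existing == nil {
		return true
	}
	return existing.Status != desired.Status || existing.Reason != desired.Reason
}

func main() {
	// The message text below is an illustrative placeholder.
	stale := &Condition{
		Type:    "EtcdRecoveryActive",
		Status:  "False",
		Reason:  "EtcdRecoveryJobFailed",
		Message: "recovery job failed",
	}
	healthy := Condition{Type: "EtcdRecoveryActive", Status: "False", Reason: "AsExpected"}

	fmt.Println(needsUpdate(stale, healthy))    // true: same Status, different Reason
	fmt.Println(needsUpdate(&healthy, healthy)) // false: nothing changed
}
```

With a Status-only comparison the first call would return false, which matches the stale-condition symptom described in this PR.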

Cherry-pick required manual conflict resolution because #8309 refactored the monolithic function on main but was not backported to release branches.

Original PR

JIRA

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

openshift-ci-robot added the jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.) and jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) labels Jun 23, 2026
openshift-ci Bot added the do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.) label Jun 23, 2026
@openshift-ci-robot


@vsolanki12: This pull request references Jira Issue OCPBUGS-90086, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected dependent Jira Issue OCPBUGS-84577 to target a version in 5.0.0, but it targets "5.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vsolanki12
Once this PR has been reviewed and has the lgtm label, please assign muraee for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


openshift-ci Bot added the area/hypershift-operator (Indicates the PR includes changes for the hypershift operator and API - outside an OCP release) label and removed the do-not-merge/needs-area label Jun 23, 2026
@codecov

codecov Bot commented Jun 23, 2026


Codecov Report

❌ Patch coverage is 58.33333% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 35.59%. Comparing base (e3ad989) to head (dfc89c3).
⚠️ Report is 5 commits behind head on release-4.22.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...perator/controllers/hostedcluster/etcd_recovery.go | 58.33% | 8 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@               Coverage Diff                @@
##           release-4.22    #8806      +/-   ##
================================================
+ Coverage         35.45%   35.59%   +0.14%     
================================================
  Files               767      767              
  Lines             93724    93771      +47     
================================================
+ Hits              33226    33381     +155     
+ Misses            57785    57673     -112     
- Partials           2713     2717       +4     

| Files with missing lines | Coverage Δ |
|---|---|
| ...perator/controllers/hostedcluster/etcd_recovery.go | 35.69% <58.33%> (+35.69%) ⬆️ |

... and 6 files with indirect coverage changes
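
The percentages in the report can be reproduced from the Hits and Lines rows of the coverage diff. The Go sketch below is only a worked check, under two assumptions that are not stated in the report: displayed percentages are truncated (not rounded) to two decimals, and the patch touches 24 tracked lines, a figure inferred from "10 lines missing" at 58.33% with partials counted as uncovered. The helper names are hypothetical.

```go
package main

import "fmt"

// hundredths returns hits/lines as a percentage expressed in hundredths
// of a percent, truncated toward zero (assumption: the report truncates).
func hundredths(hits, lines int) int {
	return hits * 10000 / lines
}

// format renders a value in hundredths of a percent as "NN.NN%".
func format(h int) string {
	return fmt.Sprintf("%d.%02d%%", h/100, h%100)
}

func main() {
	base := hundredths(33226, 93724) // Hits / Lines on release-4.22
	head := hundredths(33381, 93771) // Hits / Lines on the PR head

	fmt.Println(format(base), format(head), format(head-base)) // 35.45% 35.59% 0.14%

	// Assumption: 24 tracked lines in the patch, 14 of them fully covered
	// (8 missing + 2 partial = 10 counted as not covered).
	fmt.Println(format(hundredths(14, 24))) // 58.33%
}
```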


vsolanki12 marked this pull request as ready for review June 24, 2026 09:08
openshift-ci Bot removed the do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.) label Jun 24, 2026
openshift-ci Bot requested review from cblecker and enxebre June 24, 2026 09:10
@cblecker

Member

/uncc

openshift-ci Bot removed the request for review from cblecker June 24, 2026 17:28
@vsolanki12

Contributor Author

/jira refresh

@openshift-ci-robot


@vsolanki12: This pull request references Jira Issue OCPBUGS-90086, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected dependent Jira Issue OCPBUGS-84577 to target a version in 5.0.0, but it targets "5.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@vsolanki12

Contributor Author

/jira refresh

@openshift-ci-robot


@vsolanki12: This pull request references Jira Issue OCPBUGS-90086, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@vsolanki12

Contributor Author

/cc @bryan-cox @sdminonne

openshift-ci Bot requested review from bryan-cox and sdminonne June 26, 2026 10:41
vsolanki12 force-pushed the backport-c5ada28-to-release-4.22 branch from 75a3a26 to 821f069 June 26, 2026 10:53
…when etcd is healthy

When an etcd recovery job fails and etcd later recovers on its own,
the EtcdRecoveryActive condition retains the stale failure reason
and message. The console renders this as a red error icon even
though the cluster is healthy (Available=True, Degraded=False,
EtcdAvailable=True).

Clear the stale EtcdRecoveryActive condition by resetting its
reason and message when etcd quorum is available and no recovery
job is running. Add unit tests covering all condition state
transitions.

Signed-off-by: Vimal Solanki <vsolanki@redhat.com>
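
The clearing rule described in the commit message can be sketched in Go as follows. This is an illustration under stated assumptions, not the code from the commit: `Condition` is a simplified stand-in type, `clearStaleFailure` and its boolean parameters are hypothetical, and the replacement message is left empty because the actual text is not shown in this conversation.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition.
type Condition struct {
	Type, Status, Reason, Message string
}

// clearStaleFailure is a hypothetical helper. It resets a stale
// EtcdRecoveryJobFailed reason and its message, but only when etcd quorum
// is available and no recovery job is running. In every other case the
// condition is returned unchanged.
func clearStaleFailure(c Condition, quorumAvailable, recoveryJobRunning bool) Condition {
	if c.Reason != "EtcdRecoveryJobFailed" {
		return c // nothing stale to clear
	}
	if !quorumAvailable || recoveryJobRunning {
		return c // the failure may still be relevant
	}
	c.Reason = "AsExpected"
	c.Message = "" // assumption: the real message text is not shown in the source
	return c
}

func main() {
	// The message text below is an illustrative placeholder.
	stale := Condition{
		Type:    "EtcdRecoveryActive",
		Status:  "False",
		Reason:  "EtcdRecoveryJobFailed",
		Message: "recovery job failed",
	}

	fmt.Println(clearStaleFailure(stale, true, false).Reason)  // AsExpected
	fmt.Println(clearStaleFailure(stale, false, false).Reason) // EtcdRecoveryJobFailed
	fmt.Println(clearStaleFailure(stale, true, true).Reason)   // EtcdRecoveryJobFailed
}
```

Status stays False in every branch, which is why a Status-only comparison in the update logic would not notice the reset.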
vsolanki12 force-pushed the backport-c5ada28-to-release-4.22 branch from 821f069 to dfc89c3 June 26, 2026 11:03
@openshift-ci

openshift-ci Bot commented Jun 26, 2026

Contributor

@vsolanki12: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
