
github-actions: Add Slack notifications for kernelCI failures #1217

Open
shreeya-patel98 wants to merge 1 commit into main from shreeya_kernelci_slack

@shreeya-patel98
Collaborator

Notify the #linux-kernel Slack channel when the kernelCI pipeline fails on a supported release branch. Successful runs stay silent.

Triggers a notification for:

  • build / boot failures
  • kselftest or LTP execution infrastructure failures
  • kselftest regressions (>±3 test diff vs baseline)
  • pre-setup / matrix-setup infra failures
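
The ±3 threshold for kselftest regressions could be checked roughly as follows. This is a minimal sketch, not the code in this PR: the function name `is_kselftest_regression` is hypothetical, and it assumes the compared quantity is a single test count (for example, the number of passing tests) per run.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): succeeds (exit 0) when the current
# count differs from the baseline by more than the threshold in either
# direction, and fails (exit 1) when the diff is within tolerance.
is_kselftest_regression() {
    local baseline=$1
    local current=$2
    local threshold=${3:-3}   # default of 3 mirrors the ±3 rule above

    local diff=$(( current - baseline ))
    if (( diff < 0 )); then
        diff=$(( -diff ))     # absolute value
    fi

    if (( diff > threshold )); then
        return 0              # outside tolerance: treat as a regression
    fi
    return 1                  # within tolerance: stay silent
}
```

With these assumptions, `is_kselftest_regression 120 116` succeeds (a diff of 4), while `is_kselftest_regression 120 117` does not (a diff of exactly 3 is within tolerance).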

Stays silent for:

  • successful runs
  • branches whose base is not in VALID_BASES
  • [skip ci] runs (skip_ci sentinel from pre-setup)
  • LTP regressions (intentionally not classified — LTP runs informationally per existing pipeline policy: continue-on-error on test-ltp/compare-ltp, no PR-blocking in create-pr)
  • LTP test failures within tolerance
  • kselftest pass/fail diffs within ±3 threshold
  • create-pr failures (avoids noise from branch-name typos, regression-induced skips, transient gh API errors)
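
Taken together, the two lists above amount to a small gating decision. The sketch below illustrates it under stated assumptions: the function name `should_notify`, the `"true"` form of the skip_ci sentinel, and the placeholder contents of `VALID_BASES` are all hypothetical, and the failed-stage list is assumed to already exclude create-pr and LTP regressions.

```shell
#!/usr/bin/env bash

# VALID_BASES is named in the PR; the values here are placeholders only.
VALID_BASES="example-base-1 example-base-2"

# Hypothetical helper: succeeds (exit 0) when a Slack notification should
# be sent, fails (exit 1) when the run should stay silent.
should_notify() {
    local base=$1           # base branch of the run
    local skip_ci=$2        # assumed to be "true" for [skip ci] runs
    local failed_stages=$3  # space-separated list of classified failures

    # [skip ci] runs never notify.
    if [[ $skip_ci == "true" ]]; then
        return 1
    fi

    # Only branches whose base is in VALID_BASES notify.
    local candidate found=false
    for candidate in $VALID_BASES; do
        if [[ $candidate == "$base" ]]; then
            found=true
        fi
    done
    if [[ $found != true ]]; then
        return 1
    fi

    # Successful runs (no classified failures) stay silent.
    if [[ -z $failed_stages ]]; then
        return 1
    fi
    return 0
}
```

For example, `should_notify example-base-1 false "build boot"` succeeds, while the same call with an empty stage list or an unknown base does not.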

Implementation:

  • notify-slack-kernelci.sh follows the create-pr-body-multiarch.sh pattern: named args, lives in .github/scripts/, fetched fresh from main by the calling workflow on each run.
  • Posts via slackapi/slack-github-action pinned to SHA 45a88b9581bfab2566dc881e2cd66d334e621e2c (v3.0.3) using the org-wide GH_BOT_SLACK_TOKEN secret.
  • Channel ID stored as repo variable SLACK_CHANNEL_LINUX_KERNEL so the destination can change without code edits.
  • Message includes mention, failed-stage summary, and branch/commit/PR/run links.
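
To illustrate the named-args pattern and the message contents listed above, here is a rough sketch of how a script could assemble a `chat.postMessage` payload. It is not the contents of notify-slack-kernelci.sh: the flag names, the message wording, and both function names are hypothetical, and the escaping covers only backslashes and double quotes (a tool such as `jq` would be more robust for arbitrary input).

```shell
#!/usr/bin/env bash

# Escape backslashes and double quotes so a value can sit inside a JSON
# string. Control characters such as newlines are NOT handled here.
json_escape() {
    local s=$1
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    printf '%s' "$s"
}

# Hypothetical payload builder using named arguments.
build_payload() {
    local channel="" mention="" stages="" branch="" run_url=""
    while [[ $# -gt 0 ]]; do
        if [[ $# -lt 2 ]]; then
            echo "missing value for $1" >&2
            return 1
        fi
        case $1 in
            --channel) channel=$2 ;;
            --mention) mention=$2 ;;
            --stages)  stages=$2 ;;
            --branch)  branch=$2 ;;
            --run-url) run_url=$2 ;;
            *) echo "unknown argument: $1" >&2; return 1 ;;
        esac
        shift 2
    done

    # Illustrative wording only: mention, failed stages, branch, run link.
    local text="${mention} kernelCI failed on ${branch}: ${stages} (${run_url})"
    printf '{"channel":"%s","text":"%s"}\n' \
        "$(json_escape "$channel")" "$(json_escape "$text")"
}
```

A call such as `build_payload --channel C123 --mention '<!here>' --stages 'build' --branch my-branch --run-url https://example.invalid/run/1` prints a single-line JSON object with `channel` and `text` fields, which the workflow could then hand to the Slack action.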

Prereqs (already in place):

  • vars.SLACK_CHANNEL_LINUX_KERNEL set on the repo
  • GH_BOT_SLACK_TOKEN org secret scoped to kernel-src-tree
  • Bot user is a member of #linux-kernel

Copilot AI review requested due to automatic review settings May 12, 2026 20:07
@shreeya-patel98 shreeya-patel98 self-assigned this May 12, 2026
@shreeya-patel98 shreeya-patel98 requested review from a team May 12, 2026 20:08

Copilot AI left a comment

Pull request overview

Adds automated Slack notifications to the kernelCI multi-arch pipeline so that failures on supported release branches alert the #linux-kernel Slack channel, while successful runs remain silent.

Changes:

  • Introduces a new notify-slack job in the multi-arch workflow that detects failed stages/regressions and posts a Slack message.
  • Adds a .github/scripts/notify-slack-kernelci.sh helper to build a chat.postMessage payload consumed by slackapi/slack-github-action.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File | Description
.github/workflows/kernel-build-and-test-multiarch.yml | Adds a post-run Slack notification job gated by branch whitelist and failure classification.
.github/scripts/notify-slack-kernelci.sh | New Bash utility to generate the Slack API JSON payload for posting.

@shreeya-patel98 shreeya-patel98 force-pushed the shreeya_kernelci_slack branch from 9733c62 to be1fd83 Compare May 19, 2026 14:28
Collaborator

@bmastbergen bmastbergen left a comment

A couple of small things. I think we should update actions/checkout to 6.0.2 to match our other workflows.


      - name: Checkout kernel source
        if: steps.decide.outputs.should_notify == 'true'
        uses: actions/checkout@v4

We've updated all of the other workflows to actions/checkout 6.0.2, so we probably want to do that here. This is what it looks like in the clk-rebase workflow:

      - name: Checkout kernel-src-tree
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2

  notify-slack:
    name: Notify Slack on failure
    runs-on: ubuntu-latest
    needs: [pre-setup, setup, build, boot, test-kselftest, test-ltp, compare-kselftest, compare-ltp, create-pr]

I won't pretend to know exactly how this 'needs' list works, but:

Do we need create-pr in this list? I don't see where we ever fill FAILED_STAGES with a create-pr failure. Not a huge deal since I guess that'd only be a small delay

If the aarch64 build fails, are we waiting until all the stages run for x86_64 before we get a notification?
