Skip to content

fix(docs): skip ms.date updates and republish for metadata-only changes#2154

Open
MSBrett wants to merge 1 commit into
devfrom
fix/skip-metadata-only-doc-updates
Open

fix(docs): skip ms.date updates and republish for metadata-only changes#2154
MSBrett wants to merge 1 commit into
devfrom
fix/skip-metadata-only-doc-updates

Conversation

@MSBrett
Copy link
Copy Markdown
Contributor

@MSBrett MSBrett commented May 19, 2026

Problem

The update-mslearn-dates GitHub Action and Publish-Toolkit.ps1 both treated any change to a docs markdown file as a content change. Editing only YAML frontmatter (or other metadata-only fields) was enough to bump ms.date on PR merge and to produce a docs republish PR full of metadata-only churn.

Fix

Three coordinated changes that gate updates on actual article-body changes.

Workflow — .github/workflows/update-mslearn-dates.yml

  • Strip YAML frontmatter from both the working copy and the pull_request.base.sha version of each changed file and cmp the remaining article body.
  • Skip the ms.date rewrite when the bodies match.
  • Skip files that no longer exist.

Publish-Toolkit — src/scripts/Publish-Toolkit.ps1

  • Add Get-DocsArticleBody — strips frontmatter, markdownlint and prettier ignore comments, normalizes trailing whitespace and blank-line runs.
  • Add Remove-DocsMetadataOnlyChanges — walks every docs/*.md in the publish target and git restores files whose normalized body matches HEAD.
  • Invoked from Set-RepoContent when -Template is 'docs'.

Tests — src/powershell/Tests/Unit/Action.UpdateMsLearnDates.Tests.ps1

  • Three Pester assertions pinning the workflow to github.event.pull_request.base.sha, the strip_frontmatter helper, and the cmp -s body-diff guard.

Effect

  • PRs that touch only frontmatter no longer bump ms.date.
  • Docs republish branches no longer contain metadata-only file changes.
  • Pester tests fail loudly if the workflow guards are removed.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

The update-mslearn-dates GitHub Action and Publish-Toolkit both treated
any change to a docs file as a content change, so editing only YAML
frontmatter (or other metadata-only fields) was enough to bump ms.date
and trigger a docs republish.

Workflow (.github/workflows/update-mslearn-dates.yml)
- Strip YAML frontmatter from both the working copy and the
  pull_request.base.sha version of each changed file and cmp the
  remaining article body. Skip the ms.date rewrite when the bodies
  match. Skip files that no longer exist.

Publish-Toolkit (src/scripts/Publish-Toolkit.ps1)
- Add Get-DocsArticleBody to strip frontmatter, markdownlint and
  prettier ignore comments, and normalize trailing whitespace and
  blank-line runs.
- Add Remove-DocsMetadataOnlyChanges to walk every docs/*.md in the
  publish target and git-restore files whose normalized body matches
  HEAD. Invoked from Set-RepoContent when -Template is 'docs'.

Tests (src/powershell/Tests/Unit/Action.UpdateMsLearnDates.Tests.ps1)
- Add three Pester assertions that pin the workflow to
  github.event.pull_request.base.sha, the strip_frontmatter helper,
  and the cmp -s body-diff guard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@MSBrett MSBrett requested a review from flanakin as a code owner May 19, 2026 13:25
Copilot AI review requested due to automatic review settings May 19, 2026 13:25
@microsoft-github-policy-service microsoft-github-policy-service Bot added the Needs: Review 👀 PR that is ready to be reviewed label May 19, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates the docs publishing and update-mslearn-dates workflow to avoid treating metadata-only edits (YAML frontmatter/ignore directives) as content changes, preventing unnecessary ms.date bumps and republish churn.

Changes:

  • Workflow: compare PR base SHA vs working copy after stripping frontmatter; skip ms.date updates when article bodies match and skip deleted files.
  • Publish toolkit: normalize/strip docs metadata and auto-revert files whose article body matches HEAD for docs templates.
  • Tests: add Pester assertions to pin the workflow gating logic.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
src/scripts/Publish-Toolkit.ps1 Adds body-normalization and a cleanup pass to revert metadata-only docs changes during publish.
src/powershell/Tests/Unit/Action.UpdateMsLearnDates.Tests.ps1 Adds assertions ensuring the workflow uses base SHA + body-diff guardrails.
.github/workflows/update-mslearn-dates.yml Implements frontmatter-stripped body comparison against PR base SHA before updating ms.date.

Comment on lines +245 to +267
Set-Location $docsPath
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot

Get-ChildItem $docsPath -Recurse -Filter "*.md" | ForEach-Object {
$repoFile = [System.IO.Path]::GetRelativePath($repoRoot, $_.FullName).Replace("\", "/")
git cat-file -e "HEAD:$repoFile" 2>$null

if ($LASTEXITCODE -ne 0)
{
return
}

$previousContent = [string]::Join("`n", (git show "HEAD:$repoFile"))
$currentContent = Get-Content $_.FullName -Raw

if ((Get-DocsArticleBody $previousContent) -eq (Get-DocsArticleBody $currentContent))
{
git restore --source HEAD -- $repoFile
}
}

Pop-Location
Comment on lines +212 to +240
function Get-DocsArticleBody([string]$content)
{
$lines = $content -split "`r?`n"
$start = 0

if ($lines.Count -gt 0 -and $lines[0] -eq "---")
{
for ($i = 1; $i -lt $lines.Count; $i++)
{
if ($lines[$i] -eq "---")
{
$start = $i + 1
break
}
}
}

$ignoreLines = @(
"<!-- markdownlint-disable-next-line MD025 -->",
"<!-- prettier-ignore-start -->",
"<!-- prettier-ignore-end -->"
)

$body = $lines[$start..($lines.Count - 1)] `
| Where-Object { $ignoreLines -notcontains $_.Trim() } `
| ForEach-Object { $_.TrimEnd() }

return (($body -join "`n") -replace "`n{3,}", "`n`n").Trim()
}
Comment on lines +249 to +259
Get-ChildItem $docsPath -Recurse -Filter "*.md" | ForEach-Object {
$repoFile = [System.IO.Path]::GetRelativePath($repoRoot, $_.FullName).Replace("\", "/")
git cat-file -e "HEAD:$repoFile" 2>$null

if ($LASTEXITCODE -ne 0)
{
return
}

$previousContent = [string]::Join("`n", (git show "HEAD:$repoFile"))
$currentContent = Get-Content $_.FullName -Raw
return
}

$previousContent = [string]::Join("`n", (git show "HEAD:$repoFile"))

if ((Get-DocsArticleBody $previousContent) -eq (Get-DocsArticleBody $currentContent))
{
git restore --source HEAD -- $repoFile
Copy link
Copy Markdown
Collaborator

@RolandKrummenacher RolandKrummenacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of additional notes on top of Copilot's review:

1. Asymmetric "what counts as body" between workflow and publish script

The workflow's strip_frontmatter (awk) only removes the ---...--- block. The PowerShell Get-DocsArticleBody in Publish-Toolkit.ps1 also strips:

  • <!-- markdownlint-disable-next-line MD025 -->
  • <!-- prettier-ignore-start --> / <!-- prettier-ignore-end -->
  • Trailing whitespace
  • Triple+ blank lines collapsed to two

If a PR changes only ignore-comments or whitespace, the workflow treats it as a body change and bumps ms.date, then at publish time Remove-DocsMetadataOnlyChanges reverts the file to HEAD (with the OLD ms.date). Net result is correct, but the ms.date bump on the PR is wasted churn — and the two definitions will drift over time.

Worth extracting body normalization to a single source of truth (either both call the same helper, or both apply the same rule set).

2. Get-DocsArticleBody / Remove-DocsMetadataOnlyChanges have no behavior tests

The new Pester tests follow the repo convention of regex-matching workflow YAML, which is fine for the workflow guard. But the two new PowerShell functions are pure-ish and easy to unit test:

It 'Get-DocsArticleBody strips frontmatter'
It 'Get-DocsArticleBody strips markdownlint/prettier ignore lines'
It 'Get-DocsArticleBody normalizes blank-line runs to max two'
It 'Get-DocsArticleBody trims trailing whitespace per line'
It 'Get-DocsArticleBody returns "" for empty or frontmatter-only input'

These would also be the most useful guard against the asymmetry in #1 — if both sides are required to satisfy a shared test fixture, drift becomes visible.

Copy link
Copy Markdown
Collaborator

@flanakin flanakin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 [AI][Claude Opus 4.7] PR Review

Summary: Focused, well-scoped fix that achieves its stated goal. Workflow guards and the publish-side cleanup work together correctly. No blockers. Re-emphasizing two items already raised by @RolandKrummenacher at the line level for code-review visibility, plus two additional suggestions.

⚠️ Should fix (2)

  1. Asymmetric body normalization between workflow (strip_frontmatter only) and Get-DocsArticleBody (frontmatter + ignore comments + whitespace) — already raised in @RolandKrummenacher's top-level comment; pinning at the line for context.
  2. No behavior tests for Get-DocsArticleBody / Remove-DocsMetadataOnlyChanges — also from @RolandKrummenacher's comment.

💡 Suggestions (2)

  1. Get-DocsArticleBody produces a reversed range when a file is frontmatter-only ($start == $lines.Count).
  2. Remove-DocsMetadataOnlyChanges relies on cwd for git restore to resolve paths — git -C $repoRoot would be more robust and self-documenting.

Comment on lines +229 to +239
$ignoreLines = @(
"<!-- markdownlint-disable-next-line MD025 -->",
"<!-- prettier-ignore-start -->",
"<!-- prettier-ignore-end -->"
)

$body = $lines[$start..($lines.Count - 1)] `
| Where-Object { $ignoreLines -notcontains $_.Trim() } `
| ForEach-Object { $_.TrimEnd() }

return (($body -join "`n") -replace "`n{3,}", "`n`n").Trim()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 [AI][Claude Opus 4.7] ⚠️ Should fix

The workflow's strip_frontmatter (awk) only removes the ---...--- block, but Get-DocsArticleBody here additionally strips markdownlint/prettier ignore comments and normalizes trailing whitespace + blank-line runs. The two definitions disagree on "what counts as body."

A PR that changes only an ignore-comment or whitespace is seen as a real body change by the workflow (so ms.date gets bumped), then reverted to HEAD at publish time (with the OLD ms.date). Net result is correct today, but:

  1. The PR carries a wasted ms.date bump that will be silently undone.
  2. The two normalization implementations will drift over time.

Consider a single source of truth — e.g., a small helper script the workflow sources, or matching the workflow's awk to also drop the ignore-comment lines. (Re-raising @RolandKrummenacher's top-level point at the line level for visibility.)

| ForEach-Object { $_.TrimEnd() }

return (($body -join "`n") -replace "`n{3,}", "`n`n").Trim()
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 [AI][Claude Opus 4.7] ⚠️ Should fix

Get-DocsArticleBody is pure and easy to unit test, but there are no Pester tests covering it. The new tests in Action.UpdateMsLearnDates.Tests.ps1 regex-match the workflow YAML (fine for the workflow guard) but don't exercise this function at all.

Minimal coverage worth adding (paraphrasing @RolandKrummenacher's suggestion):

It 'Get-DocsArticleBody strips frontmatter'
It 'Get-DocsArticleBody strips markdownlint/prettier ignore lines'
It 'Get-DocsArticleBody collapses 3+ blank lines to 2'
It 'Get-DocsArticleBody trims trailing whitespace per line'
It 'Get-DocsArticleBody returns "" for empty or frontmatter-only input'

If both the workflow's strip_frontmatter and this function are required to satisfy a shared fixture, the asymmetry above becomes visible the moment one drifts.

"<!-- prettier-ignore-end -->"
)

$body = $lines[$start..($lines.Count - 1)] `
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 [AI][Claude Opus 4.7] 💡 Suggestion

Edge case: if a file is frontmatter-only (closing --- is the last line), $start equals $lines.Count and $lines[$start..($lines.Count - 1)] becomes $lines[Count..(Count-1)], which PowerShell evaluates as a reversed range and returns the last element instead of an empty slice. The function would then return that line (likely empty), which happens to still compare equal across versions, so the visible blast radius is small — but the logic is wrong.

Guard with a length check, e.g.:

Suggested change
$body = $lines[$start..($lines.Count - 1)] `
if ($start -ge $lines.Count)
{
return ""
}
$body = $lines[$start..($lines.Count - 1)] `

(Existing pipeline continues from | Where-Object ....)

Comment on lines +244 to +248
Push-Location
Set-Location $docsPath
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 [AI][Claude Opus 4.7] 💡 Suggestion

The function changes the process cwd (Set-Location $repoRoot) so that git restore --source HEAD -- $repoFile later resolves $repoFile against the correct repo. It works, but it's implicit — anyone reading the loop has to remember the cwd was changed 15 lines earlier.

Passing -C $repoRoot to each git call would make the intent explicit and remove the need to change cwd:

git -C $repoRoot cat-file -e "HEAD:$repoFile" 2>$null
...
$previousContent = [string]::Join("`n", (git -C $repoRoot show "HEAD:$repoFile"))
...
git -C $repoRoot restore --source HEAD -- $repoFile

Minor — purely a readability/robustness improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Needs: Review 👀 PR that is ready to be reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants