Add Storage TSGs: physical disk add (HowTo) + CanPool=False (Troubleshoot) #281
AlBurns-MSFT wants to merge 2 commits into Azure:main
…hoot)

Adds two new Storage TSGs derived from a customer disk-add engagement:

- HowTo-Storage-AddPhysicalDisksToS2DPool.md
  End-to-end safe procedure for online capacity expansion: pre-checks, symmetric insertion, automatic vs manual pool claim, monitoring storage jobs, capacity confirmation, and final validation.
- Troubleshoot-Storage-PhysicalDiskCanPoolFalse.md
  Resolution paths for every common CannotPoolReason value (In a Pool, Verification in progress / failed, Hardware/Firmware not compliant, Offline, Stale metadata), plus a data-collection checklist for support.

Both TSGs follow the HowTo-Template and Troubleshoot-Template and reference the existing Troubleshooting-Storage-With-Support-Diagnostics-Tool TSG. Internal tracking: msazure/One #37486687. Drafts were reviewed for public safety (no telemetry, KQL, customer names, or ARM URIs).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@microsoft-github-policy-service agree company="Microsoft"
Pull request overview
Adds two new Storage troubleshooting guides (TSGs) to document a safe disk-add workflow for S2D on Azure Local and a reason-by-reason resolution map for Get-PhysicalDisk cases where newly inserted disks show CanPool=False. Updates the Storage component index to include the new guides.
Changes:
- Added a HowTo guide describing the end-to-end, “safe sequence” procedure for online capacity expansion by adding physical disks to an existing S2D pool.
- Added a Troubleshoot guide mapping common `CannotPoolReason` values to recommended verification and remediation steps (including guarded destructive actions).
- Updated the Storage README to index the two new TSGs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| TSG/Storage/HowTo-Storage-AddPhysicalDisksToS2DPool.md | New HowTo for safe, step-by-step disk addition to an existing S2D pool, including pre-checks, monitoring, and validation. |
| TSG/Storage/Troubleshoot-Storage-PhysicalDiskCanPoolFalse.md | New Troubleshoot guide for diagnosing and resolving CanPool=False scenarios by CannotPoolReason. |
| TSG/Storage/README.md | Adds index links to the two new storage TSGs. |
<table border="1" cellpadding="6" cellspacing="0" style="border-collapse:collapse; margin-bottom:1em;">
<tr>
<th style="text-align:left; width: 180px;">Component</th>
<td><strong>Storage</strong></td>
</tr>
<tr>
<th style="text-align:left; width: 180px;">Topic</th>
<td><strong>Storage Spaces Direct</strong>: Add physical disks to an existing pool for online capacity expansion</td>
</tr>
<tr>
<th style="text-align:left; width: 180px;">Applicable Scenarios</th>
<td><strong>Day 2 Operations</strong>: Capacity expansion / Add disk</td>
</tr>
</table>
Thanks. Resolved in a6de64a. The HowTo template doesn't include an Affected Versions row, so the row was never present in this file — only the Troubleshoot file has it (the Troubleshoot template requires it). Updated the PR description so it no longer claims both files include the field. The Troubleshoot file's value was changed from All versions to All Azure Local releases (Storage Spaces Direct) to convey the actual scope.
```powershell
# Add the eligible disks to the target pool
Add-PhysicalDisk -StoragePoolFriendlyName $pool.FriendlyName -PhysicalDisks $eligibleDisks
```
Resolved in a6de64a. Replaced the manual-add snippet with a defensive variant that (a) throws if the non-primordial pool count is not exactly 1 (the documented healthy state for S2D — Enable-ClusterStorageSpacesDirect creates a single pool per cluster), and (b) requires the operator to enumerate intended new disks by serial number rather than blindly piping all CanPool=True disks into Add-PhysicalDisk.
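The defensive variant described in this reply is not quoted in the thread. A minimal sketch of the two guards might look like the following. It assumes a hypothetical helper named `Select-IntendedPoolDisk`, which is not part of the PR or of the Storage module, and it takes plain objects as input so the logic can be exercised without a cluster.

```powershell
# Hypothetical helper (an assumption, not the PR's actual snippet).
# Returns only the disks whose serial numbers the operator listed, and
# throws if the single-pool invariant or the serial list does not hold.
function Select-IntendedPoolDisk {
    param(
        # Non-primordial pools found on the cluster
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $Pool,
        # Disks currently reporting CanPool=True
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $CandidateDisk,
        # Serial numbers of the disks the operator intends to add
        [Parameter(Mandatory)] [string[]] $SerialNumber
    )

    # Guard (a): exactly one non-primordial pool is expected for S2D.
    if ($Pool.Count -ne 1) {
        throw "Expected exactly 1 non-primordial pool, found $($Pool.Count)."
    }

    # Guard (b): keep only the disks the operator explicitly named.
    $selected = @($CandidateDisk | Where-Object { $SerialNumber -contains $_.SerialNumber })

    # Every requested serial must resolve to a poolable disk.
    $missing = @($SerialNumber | Where-Object { $selected.SerialNumber -notcontains $_ })
    if ($missing.Count -gt 0) {
        throw "Not found among CanPool=True disks: $($missing -join ', ')"
    }

    return $selected
}
```

On a cluster node the inputs would presumably come from `Get-StoragePool -IsPrimordial $false` and `Get-PhysicalDisk -CanPool $true`, with serial numbers supplied in place of placeholders such as `<serial-number-1>`. Only the returned disks would then be passed to `Add-PhysicalDisk -PhysicalDisks`.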
```powershell
# Manually add the eligible disks to the target pool
Add-PhysicalDisk -StoragePoolFriendlyName $pool.FriendlyName -PhysicalDisks $eligibleDisks
```
Resolved in a6de64a. Same hardening applied here as in the HowTo: pool-count guard plus serial-number enumeration before Add-PhysicalDisk runs.
…arify version applicability

- Both files: replace manual Add-PhysicalDisk snippet with a defensive variant that validates exactly one non-primordial pool and requires the operator to enumerate intended new disks by serial number, so Add-PhysicalDisk cannot accidentally claim unintended CanPool=True disks.
- Troubleshoot file: clarify Affected Versions from 'All versions' to 'All Azure Local releases (Storage Spaces Direct)' to convey scope rather than appearing version-agnostic.

Addresses Copilot review threads on PR Azure#281.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
a6de64a to 267151c
Summary
Adds two new Storage TSGs and updates the Storage component README. Both TSG files follow the repo HowTo and Troubleshoot templates.
- `TSG/Storage/HowTo-Storage-AddPhysicalDisksToS2DPool.md`
- `TSG/Storage/Troubleshoot-Storage-PhysicalDiskCanPoolFalse.md`: … `CannotPoolReason` value when newly inserted disks are not claimed.
- `TSG/Storage/README.md`

What's in each TSG
HowTo: Add physical disks to an existing Azure Local cluster
Covers the full safe sequence:
- … `Add-PhysicalDisk` path when needed.
- … (`Get-StorageJob` must be empty).

Troubleshoot: Physical disks not claimed after insertion (`CanPool=False`)
Covers each `CannotPoolReason` value with a dedicated step:
- `In a Pool` (no fix needed; verify pool membership)
- `Verification in progress` (wait, do not reset)
- `Verification failed` (cluster + storage state checks; `Start-AzsSupportStorageDiagnostic`)
- `Hardware not compliant` (vendor escalation; do not bypass)
- `Firmware not compliant` (firmware alignment via vendor)
- `Offline` / read-only (identify exact disk first; `Set-Disk -IsOffline $false`)
- Stale metadata (`Reset-PhysicalDisk` only after positive confirmation the disk is wipeable)

Includes a data-collection checklist for opening a support case.
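The reason-by-reason mapping above could be condensed into a lookup for quick triage. This is a sketch only. The helper name `Get-CannotPoolNextStep` is hypothetical, the action strings paraphrase the list above, and the reason strings are the labels used in this description, which may not match the exact values `Get-PhysicalDisk` reports.

```powershell
# Hypothetical triage helper (an assumption, not part of the TSG).
# Maps the reason labels used in this PR description to the next step listed for each.
function Get-CannotPoolNextStep {
    param([Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $Disk)

    # Destructive paths (Reset-PhysicalDisk) are deliberately left out of the map.
    $nextStep = @{
        'In a Pool'                = 'No fix needed; verify pool membership.'
        'Verification in progress' = 'Wait; do not reset the disk.'
        'Verification failed'      = 'Check cluster and storage state; run Start-AzsSupportStorageDiagnostic.'
        'Hardware not compliant'   = 'Escalate to the hardware vendor; do not bypass.'
        'Firmware not compliant'   = 'Align firmware via the vendor.'
        'Offline'                  = 'Identify the exact disk first, then bring it online.'
    }

    foreach ($d in $Disk) {
        if ($d.CanPool) { continue }   # only disks that were not claimed

        # A disk may report more than one reason; emit one row per reason.
        foreach ($reason in @($d.CannotPoolReason)) {
            $step = $nextStep["$reason"]
            if (-not $step) { $step = 'Unmapped reason; collect data and open a support case.' }
            [pscustomobject]@{
                SerialNumber = $d.SerialNumber
                Reason       = "$reason"
                NextStep     = $step
            }
        }
    }
}
```

Anything the map does not recognize falls through to data collection instead of a guess, which keeps the helper on the safe side of the "do not reset" guidance.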
Why this matters
Disk-add operations are routine, but they change the storage infrastructure. The most common failure mode in the field is `CanPool=False` after insertion with no clear next step, which leads to either premature destructive actions (`Reset-PhysicalDisk` against the wrong disk) or unnecessary support escalations. These two TSGs give operators a single canonical safe sequence and a reason-by-reason resolution map.

Conventions followed
- … `TSG/Templates/HowTo-Template.md` and `TSG/Templates/Troubleshoot-Template.md`.
- … `CONTRIBUTING.md` (`Type-Topic-Specifics.md`).
- … `All Azure Local releases (Storage Spaces Direct)` because the cmdlet surface and `CannotPoolReason` enum are platform-stable across releases; the issue is scoped to S2D-backed clusters.
- … `#` comments on every PowerShell command per the template pattern.
- … (`> [!IMPORTANT]`, `> [!WARNING]`, `> [!CAUTION]`, `> [!NOTE]`) for risk callouts per `TSG/Templates/Markdown-Snippets.md`.
- … `Troubleshooting-Storage-With-Support-Diagnostics-Tool.md` TSG (verified path).
- … (`<disk-serial-number>`, `<disk-number>`, `<unique-id>`, `<serial-number-1>`); no real telemetry, KQL, customer names, or ARM URIs.

Validation
- … `Start-AzsSupportStorageDiagnostic` cmdlet referenced in the troubleshoot TSG is documented in the existing `Troubleshooting-Storage-With-Support-Diagnostics-Tool.md`.
- `Reset-PhysicalDisk` is gated behind explicit identity confirmation and an in-text warning, consistent with the repo's coding-standards guidance ("All code snippets MUST be safe to execute in a production environment").
- … `Add-PhysicalDisk` snippet (in both TSGs) requires exactly one non-primordial pool and explicit serial-number enumeration of the intended new disks, so it cannot accidentally claim other `CanPool=True` disks. The single-pool invariant matches the documented S2D design (`Enable-ClusterStorageSpacesDirect` creates one pool per cluster; Microsoft Learn explicitly recommends one pool per cluster).