fix(cpu): add legacy telemetry fallbacks#2011
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughCpuTopologyService adds hwmon-based sensor reading for CPU temperature and power telemetry. Temperature collection now scans ChangesCPU Telemetry Robustness
🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts`:
- Line 104: Reformat the indentation for the assignment to sensorLabel so it
matches the surrounding code style in cpu-topology.service.ts: locate the line
containing "sensorLabel = (await readFile(labelFile,
'utf8')).trim().toLowerCase();" (inside the function that reads sensor labels
using readFile and labelFile) and adjust whitespace/indentation to align with
the enclosing block and project formatter rules, then run the project's code
formatter/Prettier to ensure the Test API check passes.
- Around line 109-128: The current logic pushes all detected package-like
readings (packageTemps) into temps causing multiple entries per socket; update
the code around packageTemps/coreTemps handling in cpu-topology.service (the
block that currently does temps.push(...packageTemps) and
temps.push(Math.max(...coreTemps))) to collapse package-like sensors to a single
value per hwmon device: for each hwmon device choose one package temperature
preferring a sensor labeled 'tdie' first, then 'tctl', then fallback to
Math.max(packageTemps) if neither label exists, and push that single chosen
package temp into temps (otherwise keep the existing fallback of using
Math.max(coreTemps) when no package readings exist). Ensure the selection uses
the sensorLabel info collected earlier so you only add one package value per
device before calling generateTelemetry().
- Around line 272-273: The current code that writes power readings into
results[0][`${chipName}:${f}`] forces all hwmon chips into package 0; instead,
determine the correct package index for each hwmon device (the same way the
temperature path does) and write the reading into results[packageIndex] so each
socket remains distinct. Locate the generator that produces package mappings
(the temperature mapping logic used by generateTelemetry()) and reuse or call
that mapping for the hwmon chipName to compute packageIndex, ensure
results[packageIndex] is initialized (like results[0] was), and then set
results[packageIndex][`${chipName}:${f}`] = rounded. Ensure you do not collapse
multiple chips into package 0.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 017bf826-ae72-4a8e-91b8-2976641389d1
📒 Files selected for processing (1)
api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2011 +/- ##
==========================================
- Coverage 52.68% 52.62% -0.07%
==========================================
Files 1033 1033
Lines 71688 71795 +107
Branches 8211 8209 -2
==========================================
+ Hits 37771 37783 +12
- Misses 33791 33886 +95
Partials 126 126 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
🧹 Nitpick comments (2)
api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts (2)
314-314: 💤 Low valueUnit conversion heuristic is fragile for edge cases.
The threshold of
1000assumes values above it are microwatts. While this works for realistic CPU power (10–200W = 10M–200M µW), it mishandles sub-milliwatt readings (e.g., 800 µW would become 800 W). Standard hwmonpower*_inputis documented as microwatts, so unconditional division may be safer—or add a brief comment explaining the heuristic rationale.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts` at line 314, The current fragile heuristic in cpu-topology.service.ts sets watts with "const watts = parsed > 1000 ? parsed / 1_000_000 : parsed;", which misclassifies sub-milliwatt readings; change this to always treat hwmon power*_input as microwatts by dividing parsed by 1_000_000 unconditionally (i.e., compute watts = parsed / 1_000_000) and add a short comment by the watts calculation explaining hwmon uses µW so we convert to W, removing the >1000 heuristic; update any downstream uses of watts if necessary.
132-141: ⚖️ Poor tradeoffTemperature array indexing may misalign with package IDs on multi-socket systems.
tempsis populated in hwmon enumeration order (pushed sequentially), whilepowerDatais keyed by extracted package IDs. In line 355 ofgenerateTelemetry(), accessingtemps[pkgId]assumes the array index equals the physical package ID, which may not hold if hwmon devices enumerate out of order.For single-socket legacy CPUs (the PR target), this is not a practical issue since only one hwmon device exists. For multi-socket deployments, consider refactoring to return
Record<number, number>keyed by package ID (similar to the power path) to ensure correctness across all configurations.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts` around lines 132 - 141, The temps array is built in hwmon enumeration order (pushed from packageTdieTemps/packageTctlTemps/packageTemps/coreTemps) but generateTelemetry() later indexes it by extracted package ID (pkgId), which can misalign on multi-socket systems; change the temps collection to a map keyed by the actual package ID (e.g., Record<number, number>) instead of a positional array so lookups use the package ID, and update the code paths that populate temps (the block that currently pushes values) and the consumer in generateTelemetry() to use temps[pkgId] from the map; mirror the approach used by powerData for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts`:
- Line 314: The current fragile heuristic in cpu-topology.service.ts sets watts
with "const watts = parsed > 1000 ? parsed / 1_000_000 : parsed;", which
misclassifies sub-milliwatt readings; change this to always treat hwmon
power*_input as microwatts by dividing parsed by 1_000_000 unconditionally
(i.e., compute watts = parsed / 1_000_000) and add a short comment by the watts
calculation explaining hwmon uses µW so we convert to W, removing the >1000
heuristic; update any downstream uses of watts if necessary.
- Around line 132-141: The temps array is built in hwmon enumeration order
(pushed from packageTdieTemps/packageTctlTemps/packageTemps/coreTemps) but
generateTelemetry() later indexes it by extracted package ID (pkgId), which can
misalign on multi-socket systems; change the temps collection to a map keyed by
the actual package ID (e.g., Record<number, number>) instead of a positional
array so lookups use the package ID, and update the code paths that populate
temps (the block that currently pushes values) and the consumer in
generateTelemetry() to use temps[pkgId] from the map; mirror the approach used
by powerData for consistency.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b97f7eec-0971-498f-9fa7-b09601722a83
📒 Files selected for processing (1)
api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 32b7af310a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| continue; | ||
| } | ||
|
|
||
| if (!/fam15h_power|zenpower|amd_energy|rapl/i.test(chipName)) { |
There was a problem hiding this comment.
Handle amd_energy fallback with energy counters
When getPackagePower() falls back to hwmon, this method whitelists amd_energy chips but then only reads power*_input files. The amd_energy driver exposes energy*_input counters (not instantaneous power*_input), so on systems that rely on amd_energy and lack readable /sys/class/powercap entries, the fallback still returns no CPU package power (-1). Please either compute power from energy deltas for amd_energy or exclude it from this fallback path.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts`:
- Line 118: The conditional expression in the else-if branch checking
sensorLabel, f, and label (the clause: else if (!sensorLabel &&
/^temp1_input$/i.test(f) && /k10temp/i.test(label)) ) is too long for the
formatter; reflow it across multiple lines (one sub-condition per line with
proper indentation and parentheses if needed) inside the same surrounding
function/method so it meets the project's formatting rules and linter; locate
the else-if in cpu-topology.service.ts and split the condition into wrapped
lines while preserving the exact boolean logic and regex tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f1186747-2008-446b-a128-69002c24e7c8
📒 Files selected for processing (1)
api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts
|
This plugin has been deployed to Cloudflare R2 and is available for testing. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 369259732e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (!Number.isFinite(packageIndex)) { | ||
| packageIndex = nextFallbackPackageIndex; | ||
| nextFallbackPackageIndex += 1; |
There was a problem hiding this comment.
Defer fallback package index assignment until power data exists
Assigning packageIndex before checking for any power*_input files can shift all subsequent package IDs when a matched chip has no power readings (for example amd_energy, which is whitelisted but typically exposes energy counters only). In that case this block consumes index 0 without adding results, the next valid chip is assigned 1, and generateTelemetry() reports package power under the wrong package ID, producing -1 for the real first package.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
This is also a good suggestion I believe
elibosley
left a comment
There was a problem hiding this comment.
Seems safe but plz look at Codex reviews
For CPUs that don't support RAP'L and Temps use a fallback for these legancy CPUs.
Summary by CodeRabbit