Skip to content

fix(cpu): add legacy telemetry fallbacks#2011

Open
SimonFair wants to merge 5 commits into
mainfrom
fix/legacy-cpu-support
Open

fix(cpu): add legacy telemetry fallbacks#2011
SimonFair wants to merge 5 commits into
mainfrom
fix/legacy-cpu-support

Conversation

@SimonFair
Copy link
Copy Markdown
Contributor

@SimonFair SimonFair commented May 19, 2026

For CPUs that don't support RAP'L and Temps use a fallback for these legancy CPUs.

Summary by CodeRabbit

  • Bug Fixes
    • More robust CPU temperature collection: direct hwmon scanning, improved sensor classification (package, tdie, tctl, core), validation/logging of invalid values, and deriving package temperature from the hottest available source.
    • Improved CPU power telemetry: hwmon-based fallback when primary RAPL discovery fails or returns no data, unit normalization, rounded values, and per-package aggregation.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

CpuTopologyService adds hwmon-based sensor reading for CPU temperature and power telemetry. Temperature collection now scans /sys/class/hwmon for temp*_input and optional *_label to classify sensors; power collection prefers RAPL but falls back to hwmon via a new getPackagePowerFromHwmon() when discovery fails or yields no data.

Changes

CPU Telemetry Robustness

Layer / File(s) Summary
Temperature reading from hwmon sensors
api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts
getPackageTemps() scans /sys/class/hwmon for temp*_input sensors, parses millidegree integer values, reads optional *_label files to classify package/tdie/tctl/core sensors, logs invalid temperatures, and derives package temperature from hottest tdie→tctl→package-labeled sensor or from the hottest core if no package sensors exist.
Hwmon-based power reading implementation
api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts
getPackagePowerFromHwmon() probes hwmon chips by name, reads power*_input values, normalizes units (for large readings), rounds to two decimals, infers packageIndex from chip name or matching *_label values with a sequential fallback, records per-sensor readings under results[packageIndex][\${chipName}:${f}`], and computes a totalper package; returns{}` on overall discovery/read failures.
Power telemetry fallback control flow
api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts
getPackagePower() now immediately falls back to getPackagePowerFromHwmon() when RAPL discovery throws or finds no domains, and also falls back when RAPL processing returns an empty results object.

🎯 4 (Complex) | ⏱️ ~45 minutes

🐰 Sensors hum beneath the board, I hop to hear their tune,
I read each temp and watt by name beneath the moon,
When RAPL goes silent, hwmon takes the lead,
Carrots, chips, and labels find the package that they need,
A rabbit's code that listens — tiny tunes in noon.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'fix(cpu): add legacy telemetry fallbacks' directly aligns with the main objective of adding fallback behavior for CPU telemetry on legacy CPUs that lack RAPL or native temperature support.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/legacy-cpu-support

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts`:
- Line 104: Reformat the indentation for the assignment to sensorLabel so it
matches the surrounding code style in cpu-topology.service.ts: locate the line
containing "sensorLabel = (await readFile(labelFile,
'utf8')).trim().toLowerCase();" (inside the function that reads sensor labels
using readFile and labelFile) and adjust whitespace/indentation to align with
the enclosing block and project formatter rules, then run the project's code
formatter/Prettier to ensure the Test API check passes.
- Around line 109-128: The current logic pushes all detected package-like
readings (packageTemps) into temps causing multiple entries per socket; update
the code around packageTemps/coreTemps handling in cpu-topology.service (the
block that currently does temps.push(...packageTemps) and
temps.push(Math.max(...coreTemps))) to collapse package-like sensors to a single
value per hwmon device: for each hwmon device choose one package temperature
preferring a sensor labeled 'tdie' first, then 'tctl', then fallback to
Math.max(packageTemps) if neither label exists, and push that single chosen
package temp into temps (otherwise keep the existing fallback of using
Math.max(coreTemps) when no package readings exist). Ensure the selection uses
the sensorLabel info collected earlier so you only add one package value per
device before calling generateTelemetry().
- Around line 272-273: The current code that writes power readings into
results[0][`${chipName}:${f}`] forces all hwmon chips into package 0; instead,
determine the correct package index for each hwmon device (the same way the
temperature path does) and write the reading into results[packageIndex] so each
socket remains distinct. Locate the generator that produces package mappings
(the temperature mapping logic used by generateTelemetry()) and reuse or call
that mapping for the hwmon chipName to compute packageIndex, ensure
results[packageIndex] is initialized (like results[0] was), and then set
results[packageIndex][`${chipName}:${f}`] = rounded. Ensure you do not collapse
multiple chips into package 0.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 017bf826-ae72-4a8e-91b8-2976641389d1

📥 Commits

Reviewing files that changed from the base of the PR and between 329f66f and 6a297e8.

📒 Files selected for processing (1)
  • api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts

Comment thread api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts Outdated
Comment thread api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts
Comment thread api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

❌ Patch coverage is 7.14286% with 117 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.62%. Comparing base (329f66f) to head (3692597).

Files with missing lines Patch % Lines
...i/graph/resolvers/info/cpu/cpu-topology.service.ts 7.14% 117 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2011      +/-   ##
==========================================
- Coverage   52.68%   52.62%   -0.07%     
==========================================
  Files        1033     1033              
  Lines       71688    71795     +107     
  Branches     8211     8209       -2     
==========================================
+ Hits        37771    37783      +12     
- Misses      33791    33886      +95     
  Partials      126      126              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts (2)

314-314: 💤 Low value

Unit conversion heuristic is fragile for edge cases.

The threshold of 1000 assumes values above it are microwatts. While this works for realistic CPU power (10–200W = 10M–200M µW), it mishandles sub-milliwatt readings (e.g., 800 µW would become 800 W). Standard hwmon power*_input is documented as microwatts, so unconditional division may be safer—or add a brief comment explaining the heuristic rationale.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts` at line
314, The current fragile heuristic in cpu-topology.service.ts sets watts with
"const watts = parsed > 1000 ? parsed / 1_000_000 : parsed;", which
misclassifies sub-milliwatt readings; change this to always treat hwmon
power*_input as microwatts by dividing parsed by 1_000_000 unconditionally
(i.e., compute watts = parsed / 1_000_000) and add a short comment by the watts
calculation explaining hwmon uses µW so we convert to W, removing the >1000
heuristic; update any downstream uses of watts if necessary.

132-141: ⚖️ Poor tradeoff

Temperature array indexing may misalign with package IDs on multi-socket systems.

temps is populated in hwmon enumeration order (pushed sequentially), while powerData is keyed by extracted package IDs. In line 355 of generateTelemetry(), accessing temps[pkgId] assumes the array index equals the physical package ID, which may not hold if hwmon devices enumerate out of order.

For single-socket legacy CPUs (the PR target), this is not a practical issue since only one hwmon device exists. For multi-socket deployments, consider refactoring to return Record<number, number> keyed by package ID (similar to the power path) to ensure correctness across all configurations.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts` around
lines 132 - 141, The temps array is built in hwmon enumeration order (pushed
from packageTdieTemps/packageTctlTemps/packageTemps/coreTemps) but
generateTelemetry() later indexes it by extracted package ID (pkgId), which can
misalign on multi-socket systems; change the temps collection to a map keyed by
the actual package ID (e.g., Record<number, number>) instead of a positional
array so lookups use the package ID, and update the code paths that populate
temps (the block that currently pushes values) and the consumer in
generateTelemetry() to use temps[pkgId] from the map; mirror the approach used
by powerData for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts`:
- Line 314: The current fragile heuristic in cpu-topology.service.ts sets watts
with "const watts = parsed > 1000 ? parsed / 1_000_000 : parsed;", which
misclassifies sub-milliwatt readings; change this to always treat hwmon
power*_input as microwatts by dividing parsed by 1_000_000 unconditionally
(i.e., compute watts = parsed / 1_000_000) and add a short comment by the watts
calculation explaining hwmon uses µW so we convert to W, removing the >1000
heuristic; update any downstream uses of watts if necessary.
- Around line 132-141: The temps array is built in hwmon enumeration order
(pushed from packageTdieTemps/packageTctlTemps/packageTemps/coreTemps) but
generateTelemetry() later indexes it by extracted package ID (pkgId), which can
misalign on multi-socket systems; change the temps collection to a map keyed by
the actual package ID (e.g., Record<number, number>) instead of a positional
array so lookups use the package ID, and update the code paths that populate
temps (the block that currently pushes values) and the consumer in
generateTelemetry() to use temps[pkgId] from the map; mirror the approach used
by powerData for consistency.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b97f7eec-0971-498f-9fa7-b09601722a83

📥 Commits

Reviewing files that changed from the base of the PR and between 04ea837 and 32b7af3.

📒 Files selected for processing (1)
  • api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts

@SimonFair SimonFair marked this pull request as ready for review May 20, 2026 14:28
@SimonFair SimonFair marked this pull request as draft May 20, 2026 14:28
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 32b7af310a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

continue;
}

if (!/fam15h_power|zenpower|amd_energy|rapl/i.test(chipName)) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle amd_energy fallback with energy counters

When getPackagePower() falls back to hwmon, this method whitelists amd_energy chips but then only reads power*_input files. The amd_energy driver exposes energy*_input counters (not instantaneous power*_input), so on systems that rely on amd_energy and lack readable /sys/class/powercap entries, the fallback still returns no CPU package power (-1). Please either compute power from energy deltas for amd_energy or exclude it from this fallback path.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SimonFair this is worth a look I think

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts`:
- Line 118: The conditional expression in the else-if branch checking
sensorLabel, f, and label (the clause: else if (!sensorLabel &&
/^temp1_input$/i.test(f) && /k10temp/i.test(label)) ) is too long for the
formatter; reflow it across multiple lines (one sub-condition per line with
proper indentation and parentheses if needed) inside the same surrounding
function/method so it meets the project's formatting rules and linter; locate
the else-if in cpu-topology.service.ts and split the condition into wrapped
lines while preserving the exact boolean logic and regex tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f1186747-2008-446b-a128-69002c24e7c8

📥 Commits

Reviewing files that changed from the base of the PR and between 32b7af3 and 2eff257.

📒 Files selected for processing (1)
  • api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts

Comment thread api/src/unraid-api/graph/resolvers/info/cpu/cpu-topology.service.ts Outdated
@github-actions
Copy link
Copy Markdown
Contributor

This plugin has been deployed to Cloudflare R2 and is available for testing.
Download it at this URL:

https://preview.dl.unraid.net/unraid-api/tag/PR2011/dynamix.unraid.net.plg

@SimonFair SimonFair marked this pull request as ready for review May 20, 2026 18:31
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 369259732e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +305 to +307
if (!Number.isFinite(packageIndex)) {
packageIndex = nextFallbackPackageIndex;
nextFallbackPackageIndex += 1;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Defer fallback package index assignment until power data exists

Assigning packageIndex before checking for any power*_input files can shift all subsequent package IDs when a matched chip has no power readings (for example amd_energy, which is whitelisted but typically exposes energy counters only). In that case this block consumes index 0 without adding results, the next valid chip is assigned 1, and generateTelemetry() reports package power under the wrong package ID, producing -1 for the real first package.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also a good suggestion I believe

@SimonFair SimonFair requested a review from elibosley May 20, 2026 18:57
Copy link
Copy Markdown
Member

@elibosley elibosley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems safe but plz look at Codex reviews

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants