docs(monitoring): use alloy instead of promtail#260

Open
ma-hartma wants to merge 5 commits into main from monitoring-alloy
@ma-hartma ma-hartma commented May 6, 2026

Description

Update monitoring docs to reflect metal-stack/metal-roles#552.
Only merge this after metal-stack/metal-roles#592 and metal-stack/metal-roles#595 are merged and released.

Used AI-Tools ✨

  • Claude Sonnet 4.6 for wording

References:

ma-hartma requested a review from a team as a code owner May 6, 2026 14:07
metal-robot (bot) added the `area: documentation` label (Affects the documentation area.) May 6, 2026
metal-robot (bot) added this to Development May 6, 2026
netlify Bot commented May 6, 2026

Deploy Preview for metal-stack-io ready!

Name Link
🔨 Latest commit a367764
🔍 Latest deploy log https://app.netlify.com/projects/metal-stack-io/deploys/6a2912e125d01c00081b9920
😎 Deploy Preview https://deploy-preview-260--metal-stack-io.netlify.app
@ma-hartma ma-hartma marked this pull request as draft May 8, 2026 13:44
@ma-hartma ma-hartma marked this pull request as ready for review May 19, 2026 16:08

@vknabel vknabel left a comment

Contributor

Just a few nits. Waiting for review and merge of the actual changes before a full approval.

Comment thread docs/04-For Operators/05-monitoring.md Outdated
Comment thread docs/04-For Operators/05-monitoring.md Outdated
Comment thread docs/04-For Operators/05-monitoring.md Outdated

@Gerrit91 Gerrit91 left a comment

Contributor

Thanks for updating the docs accordingly. This section really misses a lot of information.


![Monitoring Stack](monitoring-stack.svg)

The diagram above shows the full monitoring and logging stack: partition hosts ship logs to Loki and expose metrics for Prometheus scraping; control-plane and Gardener seed Alloy instances push both logs and self-metrics centrally; Grafana provides unified dashboards and alerting across all tiers.

Contributor

I think this diagram is also somewhat wrong regarding Thanos. Can you correct that as well? Please add that the partition Prometheus instances remote-write their metrics to it.
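
For illustration only, the remote-write relationship mentioned in this comment might look roughly like the following in a partition Prometheus configuration. This is a minimal sketch and not the metal-stack deployment: `remote_write` is the standard Prometheus configuration key, while the endpoint URL is a hypothetical placeholder that assumes a central Thanos Receive component.

```yaml
# Hypothetical excerpt from a partition prometheus.yml.
# The URL is a placeholder, not a value taken from metal-roles.
remote_write:
  - url: "https://thanos-receive.example.internal/api/v1/receive"  # assumed central Thanos Receive endpoint
```

In the diagram this would correspond to an arrow from each partition Prometheus to Thanos, in addition to the scrape arrows already shown.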


#### Gardener

Gardener ships with a built-in logging stack (Vali + fluent-bit per seed). The metal-stack deployment disables this stack and instead uses Alloy to forward all logs centrally — giving platform operators a single place to query infrastructure logs across all Gardener clusters.

Contributor

Gardener logging does not have to be disabled. It's just that we do it. Users can always decide to enable it. We deploy our own logging in addition to theirs because we have our own centralized control plane.

- `grafana-dashboard-sonic-exporter`

and also some gardener related dashboards:
Metrics are supplied by

Contributor

These are not the only metrics; these are just the additional metrics exporters that we have.

- `ipmi-exporter`
- `sonic-exporter`
- `metal-core`
- `frr-exporter`

Contributor

Unfortunately, the frr-exporter is deployable for some reason, maybe we should remove it from the list.

Labels

area: documentation Affects the documentation area.

3 participants