title: API Reference
description: Full REST API documentation with authentication, endpoint details, and curl examples.
sidebar_position: 3

API Reference

Clustersight exposes a REST API at /api/v1/. All responses use the ApiResponse envelope unless noted.


Authentication

When CLUSTERSIGHT_PASSWORD is set, all endpoints (except /api/v1/health) require an Authorization header.

Token derivation

The bearer token is the lowercase hex SHA-256 of the plaintext password:

# Derive the token from your password (printf avoids shells where `echo -n` prints "-n")
printf '%s' "your_password" | sha256sum | cut -d' ' -f1
# Example output (shown for the password "password"): 5e884898da28047151d0e56f8dc629277...
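
The same derivation can be done without a shell. This is a minimal Python sketch, assuming the password is hashed as UTF-8 bytes (the encoding is not stated on this page); `derive_token` and `auth_header` are hypothetical helper names, not part of Clustersight.

```python
import hashlib


def derive_token(password: str) -> str:
    """Return the lowercase hex SHA-256 digest of the plaintext password."""
    # Assumption: the server hashes the UTF-8 bytes of the password.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def auth_header(password: str) -> dict[str, str]:
    """Build the Authorization header described in this section."""
    return {"Authorization": f"Bearer {derive_token(password)}"}
```

Like the shell pipeline, this hashes the password bytes only, with no trailing newline.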

Making authenticated requests

curl -H "Authorization: Bearer <token>" http://localhost:3001/api/v1/clusters

Dev mode (no authentication)

When CLUSTERSIGHT_PASSWORD is not set, no Authorization header is required. A warning is logged on startup (auth_mode: "dev_mode_no_auth").


Base URL & Versioning

All endpoints are prefixed with /api/v1/.

Base URL (default Docker mapping): http://localhost:3001/api/v1

Response envelope

All endpoints return the ApiResponse envelope:

{
  "data": <payload>,
  "meta": { "cache_hit": false },
  "error": null
}
| Field | Type | Description |
|-------|------|-------------|
| data | any | Response payload (object, array, or null) |
| meta | object | Metadata; may include cache_hit: true for cached panel responses |
| error | object \| null | Error details on failure; null on success |
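
A client will usually unwrap the envelope in one place. The sketch below is one way to do that, assuming the body is JSON text shaped as described above; `ApiError` and `unwrap` are hypothetical names introduced for illustration.

```python
import json


class ApiError(Exception):
    """Raised when the envelope's "error" field is not null."""


def unwrap(body: str) -> tuple[object, bool]:
    """Parse an ApiResponse envelope and return (data, cache_hit)."""
    envelope = json.loads(body)
    if envelope.get("error") is not None:
        raise ApiError(envelope["error"])
    # "meta" may be empty, so a missing cache_hit is treated as False.
    meta = envelope.get("meta") or {}
    return envelope.get("data"), bool(meta.get("cache_hit", False))
```

Raising on a non-null `error` keeps callers from reading `data` out of a failed response.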

Endpoints Overview

| Method | Path | Description | Auth |
|--------|------|-------------|------|
| GET | /health | Health check + version | No |
| GET | /overview | All clusters with health + alert summary | Yes |
| GET | /clusters | List configured clusters | Yes |
| POST | /clusters | Add a new cluster | Yes |
| POST | /clusters/test | Test connection & discover topology | Yes |
| GET | /clusters/{id} | Get a single cluster | Yes |
| PUT | /clusters/{id} | Update cluster configuration | Yes |
| DELETE | /clusters/{id} | Remove a cluster | Yes |
| GET | /panels/health-score | Cluster health score + components | Yes |
| GET | /panels/replication | Replication lag time-series | Yes |
| GET | /panels/merges | Active merges time-series | Yes |
| GET | /panels/disk | Disk usage per disk | Yes |
| GET | /panels/mutations | Active mutations | Yes |
| GET | /panels/broken-parts | Broken/detached parts count | Yes |
| GET | /panels/keeper | Keeper connection health | Yes |
| GET | /panels/keeper-nodes | Per-node Keeper TCP status | Yes |
| GET | /panels/zookeeper | ZooKeeper session health | Yes |
| GET | /panels/compression | Per-table compression ratios | Yes |
| GET | /panels/errors | Error event time-series | Yes |
| GET | /alerts/rules/metrics | Valid metric keys for custom rules | Yes |
| GET | /alerts/rules | List alert rules | Yes |
| POST | /alerts/rules | Create a custom alert rule | Yes |
| PUT | /alerts/rules/{id} | Update an alert rule | Yes |
| DELETE | /alerts/rules/{id} | Delete a custom alert rule | Yes |
| GET | /alerts/history | Alert history with filters | Yes |
| POST | /alerts/{id}/acknowledge | Acknowledge an alert | Yes |
| POST | /alerts/{id}/snooze | Snooze an alert | Yes |
| POST | /alerts/{id}/escalate | Escalate alert to Slack | Yes |
| GET | /alerts/summary | Unresolved alert count | Yes |
| GET | /alerts/{id} | Single alert detail | Yes |
| GET | /queries/slow | Slow queries analysis | Yes |
| POST | /queries/explain | Fetch EXPLAIN for a query | Yes |
| GET | /queries/failed | Failed queries breakdown | Yes |
| GET | /queries/parts-distribution | Parts size distribution | Yes |
| GET | /settings | Application settings | Yes |
| PUT | /settings | Update application settings | Yes |
| POST | /settings/test-notification | Send a test Slack notification | Yes |

Health

GET /api/v1/health

Health check. Does not require authentication. Safe to use as a liveness probe.

Response:

{
  "data": { "version": "0.1.0" },
  "meta": {},
  "error": null
}

curl:

curl http://localhost:3001/api/v1/health

Cluster Overview

GET /api/v1/overview

Returns a summary of all active clusters: health score, alert counts, collector status. Powers the Cluster Overview page.

Response: Array of ClusterOverviewItem inside data.

{
  "data": [
    {
      "id": "abc123",
      "name": "production",
      "host": "clickhouse.internal",
      "port": 8123,
      "health_score": 87.4,
      "grade": "B+",
      "active_alert_count": 1,
      "collector_status": "online",
      "last_collected_at": "2026-03-22T14:00:00Z",
      "logical_cluster_count": 3
    }
  ],
  "meta": {},
  "error": null
}
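
As an illustration of consuming this payload, the sketch below picks out clusters that may need a closer look. The 80.0 threshold and the `needs_attention` helper are assumptions made for this example; `"online"` is the only `collector_status` value shown on this page, so any other value is treated as a problem.

```python
def needs_attention(items: list[dict], min_score: float = 80.0) -> list[str]:
    """Return the names of clusters that look unhealthy in an /overview payload."""
    flagged = []
    for item in items:
        if (
            item["health_score"] < min_score       # assumed threshold
            or item["active_alert_count"] > 0
            or item["collector_status"] != "online"
        ):
            flagged.append(item["name"])
    return flagged
```

Pass it the `data` array from the response, not the whole envelope.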

Clusters

GET /api/v1/clusters

List all configured clusters.

curl:

curl -H "Authorization: Bearer <token>" http://localhost:3001/api/v1/clusters

Response: Array of ClusterResponse objects.


POST /api/v1/clusters

Add a new cluster.

Request body:

{
  "host": "clickhouse.internal",
  "port": 8123,
  "username": "clustersight_ro",
  "password": "readonly_password",
  "name": "production"
}

Response: ClusterResponse with assigned id.


POST /api/v1/clusters/test

Test a connection before saving. Returns topology information (discovered clusters, Keeper nodes).

Request body: Same as POST /clusters.

Response:

{
  "data": {
    "success": true,
    "message": "Connected successfully",
    "clusters": ["cluster1", "cluster2"],
    "keeper_nodes": ["keeper1:9181", "keeper2:9181"]
  },
  "meta": {},
  "error": null
}
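
A test-before-save flow might look like the following sketch. It only builds the request and summarises a decoded response; sending is left to the caller. `BASE_URL` is the default Docker mapping from this page, and `build_test_request` and `describe_topology` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:3001/api/v1"  # default Docker mapping


def build_test_request(cluster: dict, token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /clusters/test request."""
    headers = {"Content-Type": "application/json"}
    if token is not None:  # omit the header in dev mode
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{BASE_URL}/clusters/test",
        data=json.dumps(cluster).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def describe_topology(envelope: dict) -> str:
    """Summarise the "data" object of a test response in one line."""
    data = envelope["data"]
    if not data["success"]:
        return f"connection failed: {data['message']}"
    return f"{len(data['clusters'])} cluster(s), {len(data['keeper_nodes'])} Keeper node(s)"
```

The request can be sent with `urllib.request.urlopen(request)`, and the decoded JSON body passed to `describe_topology`.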

GET /api/v1/clusters/{id}

Get a single cluster by ID.


PUT /api/v1/clusters/{id}

Update cluster configuration (host, port, credentials, name).


DELETE /api/v1/clusters/{id}

Soft-delete a cluster. The cluster is marked inactive; its data is retained.

Response:

{ "data": { "id": "abc123", "deleted": true }, "meta": {}, "error": null }

Dashboard Panels

Each panel is a dedicated endpoint. Panel data is cached for CACHE_TTL seconds (default: 30).

GET /api/v1/panels/health-score

Returns the current health score, grade, trend, and per-component breakdown.

Response:

{
  "data": {
    "overall_score": 87.4,
    "grade": "B+",
    "trend": "stable",
    "previous_score": 85.0,
    "components": {
      "replication": { "score": 100.0, "label": "Replication" },
      "storage":     { "score": 82.0,  "label": "Storage" },
      "errors":      { "score": 95.0,  "label": "Errors" },
      "infrastructure": { "score": 80.0, "label": "Infrastructure" },
      "queries":     { "score": 70.0,  "label": "Queries" }
    },
    "calculated_at": "2026-03-22T14:00:00Z"
  },
  "meta": { "cache_hit": false },
  "error": null
}

This endpoint takes no ?range parameter; it always returns the most recently computed score.
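
The per-component breakdown is keyed by component name, so finding the weakest area takes only a few lines. A small sketch, where `weakest_component` is a hypothetical helper that receives the `data` object of the response:

```python
def weakest_component(panel: dict) -> tuple[str, float]:
    """Return (label, score) for the lowest-scoring health component."""
    components = panel["components"]
    key = min(components, key=lambda name: components[name]["score"])
    return components[key]["label"], components[key]["score"]
```

For the example response above, this returns `("Queries", 70.0)`.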


GET /api/v1/panels/replication

Query params: ?range=1h|6h|24h|7d (default: 1h)

Returns replication lag time-series per replicated table.


GET /api/v1/panels/merges

Query params: ?range=1h|6h|24h|7d (default: 1h)

Returns active merge counts and estimated completion times.


GET /api/v1/panels/disk

No range param. Returns current disk usage per disk path (used bytes, free bytes, usage %).


GET /api/v1/panels/mutations

No range param. Returns active mutations with estimated parts remaining.


GET /api/v1/panels/broken-parts

No range param. Returns count of detached/broken parts per table.


GET /api/v1/panels/keeper

No range param. Returns Keeper connection status (ONLINE / OFFLINE).


GET /api/v1/panels/keeper-nodes

No range param. Returns per-node TCP health status (status: 1 = reachable, 0 = unreachable).


GET /api/v1/panels/zookeeper

No range param. Returns ZooKeeper/Keeper session health metrics.


GET /api/v1/panels/compression

No range param. Returns per-table compression ratios (compressed / uncompressed bytes).


GET /api/v1/panels/errors

Query params: ?range=<seconds> (integer, 60–604800, default: 3600)

Note: Unlike replication and merges, the errors panel uses an integer seconds range, not the 1h|6h|24h|7d enum.

Returns error event count time-series from system.errors.
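
Because this panel takes seconds while the other time-series panels take labels, a client may want one helper that accepts either form. The sketch below assumes the labels map to their literal durations; `errors_range` and `errors_panel_url` are hypothetical names.

```python
from typing import Union

# Literal durations of the labels used by the replication and merges panels.
ENUM_TO_SECONDS = {"1h": 3600, "6h": 21600, "24h": 86400, "7d": 604800}


def errors_range(value: Union[int, str]) -> int:
    """Convert a 1h|6h|24h|7d label or raw seconds into a validated integer."""
    if isinstance(value, str):
        if value not in ENUM_TO_SECONDS:
            raise ValueError(f"unknown range label: {value!r}")
        seconds = ENUM_TO_SECONDS[value]
    else:
        seconds = int(value)
    if not 60 <= seconds <= 604800:
        raise ValueError(f"range must be 60-604800 seconds, got {seconds}")
    return seconds


def errors_panel_url(value: Union[int, str],
                     base_url: str = "http://localhost:3001/api/v1") -> str:
    """Build the errors panel URL with an integer seconds range."""
    return f"{base_url}/panels/errors?range={errors_range(value)}"
```

Validating on the client side avoids a round trip that would otherwise end in a 422.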

curl:

# Last 6 hours (21600 seconds)
curl -H "Authorization: Bearer <token>" \
  "http://localhost:3001/api/v1/panels/errors?range=21600"

Alerts

GET /api/v1/alerts/rules

List all alert rules (built-in and custom).

curl:

curl -H "Authorization: Bearer <token>" http://localhost:3001/api/v1/alerts/rules

POST /api/v1/alerts/rules

Create a custom alert rule.

Request body:

{
  "name": "My Custom Rule",
  "metric_key": "disk.usage_pct",
  "operator": ">",
  "threshold": 70.0,
  "severity": "warning",
  "cooldown_sec": 3600
}

Get valid metric keys: GET /api/v1/alerts/rules/metrics
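
One way to avoid a rejected request is to check the metric key before posting. In this sketch, `valid_metric_keys` is whatever the caller extracted from GET /api/v1/alerts/rules/metrics (that response's shape is not documented here), and `build_rule` is a hypothetical helper.

```python
from typing import Iterable


def build_rule(name: str, metric_key: str, operator: str, threshold: float,
               severity: str, cooldown_sec: int,
               valid_metric_keys: Iterable[str]) -> dict:
    """Build a POST /alerts/rules body after checking the metric key."""
    if metric_key not in set(valid_metric_keys):
        raise ValueError(f"unknown metric_key: {metric_key!r}")
    return {
        "name": name,
        "metric_key": metric_key,
        "operator": operator,
        "threshold": float(threshold),
        "severity": severity,
        "cooldown_sec": int(cooldown_sec),
    }
```

The operator and severity are passed through unchecked because their allowed values are not listed in this section.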


PUT /api/v1/alerts/rules/{id}

Update an existing rule. Built-in rules can have their threshold, severity, and cooldown updated but cannot be deleted.


DELETE /api/v1/alerts/rules/{id}

Delete a custom rule. Returns {"id": "...", "deleted": true}. Built-in rules (rule_type: "built_in") cannot be deleted.


GET /api/v1/alerts/history

List alert history with optional filters.

Query params:

| Param | Type | Default | Description |
|-------|------|---------|-------------|
| severity | string | (all) | Filter by warning or critical |
| status | string | (all) | Filter by active, acknowledged, or resolved |
| hours | int (1–168) | 24 | Lookback window in hours |
| limit | int (1–200) | 50 | Maximum results to return |
| offset | int (≥ 0) | 0 | Pagination offset |
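
The limit and offset parameters can be combined to walk the full history. The sketch below assumes that `data` is a list of alerts and that a page shorter than `limit` marks the end, since no total count is documented. `fetch_page` is a caller-supplied function that performs the GET for a query string and returns that list; all names here are hypothetical.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode


def history_query(offset: int, limit: int = 50, hours: int = 24, **filters: str) -> str:
    """Build the /alerts/history query string, enforcing the documented bounds."""
    if not 1 <= hours <= 168:
        raise ValueError("hours must be between 1 and 168")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if offset < 0:
        raise ValueError("offset must be >= 0")
    return urlencode({**filters, "hours": hours, "limit": limit, "offset": offset})


def iter_alerts(fetch_page: Callable[[str], list], limit: int = 50,
                **params) -> Iterator[dict]:
    """Yield alerts page by page until a short page signals the end."""
    offset = 0
    while True:
        page = fetch_page(history_query(offset, limit=limit, **params))
        yield from page
        if len(page) < limit:  # assumed end-of-results signal
            return
        offset += limit
```

For example, `iter_alerts(fetch_page, limit=100, status="active", severity="critical")` requests offsets 0, 100, 200 and so on.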

curl:

# Active critical alerts in the last 24 hours
curl -H "Authorization: Bearer <token>" \
  "http://localhost:3001/api/v1/alerts/history?status=active&severity=critical&hours=24"

POST /api/v1/alerts/{id}/acknowledge

Mark an alert as acknowledged. Sets status to acknowledged.


POST /api/v1/alerts/{id}/snooze

Snooze an alert for a specified duration.

Request body:

{ "duration_minutes": 60 }

Response: { "snoozed_until": "<ISO timestamp>", "rule_id": "..." }
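
To act on the returned timestamp, a client needs to parse it. This sketch assumes `snoozed_until` is an ISO 8601 string, possibly with a trailing `Z`, and treats a value without an offset as UTC; both are assumptions, and the helper names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def snoozed_until(response: dict) -> datetime:
    """Parse "snoozed_until" into a timezone-aware datetime."""
    raw = response["snoozed_until"]
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11,
    # so normalise it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:  # assumption: naive timestamps are UTC
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_snoozed(response: dict, now: Optional[datetime] = None) -> bool:
    """Return True while the snooze window is still open."""
    now = now or datetime.now(timezone.utc)
    return now < snoozed_until(response)
```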


POST /api/v1/alerts/{id}/escalate

Escalate an alert to Slack immediately, regardless of notification cooldown.


GET /api/v1/alerts/summary

Returns the count of currently unresolved alerts.

Response:

{ "data": { "count": 3 }, "meta": {}, "error": null }

GET /api/v1/alerts/{id}

Get full detail for a single alert, including metric history and fix command.


Query Inspector

GET /api/v1/queries/slow

Returns the slowest queries from system.query_log.

Query params: ?hours=<1–168> (default: 24), ?limit=<1–200> (default: 50)

curl:

# Top 20 slowest queries in the last 6 hours
curl -H "Authorization: Bearer <token>" \
  "http://localhost:3001/api/v1/queries/slow?hours=6&limit=20"

POST /api/v1/queries/explain

Fetch the EXPLAIN output for a specific query by its hash.

Request body:

{ "query_hash": "abc123def456..." }

GET /api/v1/queries/failed

Returns failed queries grouped by error type.

Query params: ?hours=<1–168> (default: 24), ?limit=<1–200> (default: 50)


GET /api/v1/queries/parts-distribution

Returns parts size distribution across tables.

Query params: ?limit=<1–500> (default: 100)


Settings

GET /api/v1/settings

Get current application settings. The slack_webhook_url field is masked in the response.


PUT /api/v1/settings

Update application settings.

Request body (all fields optional):

{
  "slack_webhook_url": "https://hooks.slack.com/services/...",
  "collection_interval_sec": 30,
  "tier1_enabled": true,
  "tier2_enabled": true,
  "retention_raw_days": 7,
  "retention_hourly_days": 30,
  "retention_daily_days": 365
}
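
Since every field is optional, a client can send only what it wants to change. The sketch below builds such a partial body, assuming that omitted fields are left unchanged by the server (this page does not say so explicitly) and that `None` means "not changed". `settings_patch` is a hypothetical helper.

```python
# Field names taken from the request body shown above.
SETTINGS_FIELDS = {
    "slack_webhook_url",
    "collection_interval_sec",
    "tier1_enabled",
    "tier2_enabled",
    "retention_raw_days",
    "retention_hourly_days",
    "retention_daily_days",
}


def settings_patch(**changes) -> dict:
    """Return a PUT /settings body containing only the fields being changed."""
    unknown = set(changes) - SETTINGS_FIELDS
    if unknown:
        raise ValueError(f"unknown settings fields: {sorted(unknown)}")
    return {key: value for key, value in changes.items() if value is not None}
```

Rejecting unknown keys locally catches typos such as `retention_raw_day` before the request is sent.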

POST /api/v1/settings/test-notification

Send a test Slack notification using the configured webhook URL.

Response: { "data": { "sent": true }, "meta": {}, "error": null }


Error Codes

| HTTP Status | Meaning in Clustersight |
|-------------|-------------------------|
| 401 Unauthorized | Missing or invalid Authorization header. Provide Bearer <sha256-token> or disable password protection. |
| 404 Not Found | The requested resource (cluster, alert, rule) does not exist or has been deleted. |
| 409 Conflict | Duplicate resource, e.g., a cluster with the same host:port already exists. |
| 422 Unprocessable Entity | Validation error. Response body contains field-level details: { "error": { "detail": [...] } } |
| 503 Service Unavailable | ClickHouse is unreachable. The panel or query endpoint could not connect to the cluster. Verify ClickHouse is running and accessible. |
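
A client can turn these statuses into a handling decision. The mapping and the backoff policy below are suggestions made for this example, not behaviour prescribed by Clustersight; `classify_status` and `backoff_delays` are hypothetical helpers.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status from the table above to a suggested client action."""
    actions = {
        401: "reauthenticate",  # token missing or wrong
        404: "give_up",         # resource does not exist
        409: "give_up",         # duplicate resource
        422: "fix_request",     # inspect error.detail and correct the input
        503: "retry",           # ClickHouse unreachable, may be transient
    }
    if status in actions:
        return actions[status]
    return "ok" if 200 <= status < 300 else "unexpected"


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff delays in seconds for retrying 503 responses."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]
```

Only 503 is treated as retryable here, because the other statuses describe problems that a repeated request would not fix.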

curl Examples

List all clusters

TOKEN=$(printf '%s' "your_password" | sha256sum | cut -d' ' -f1)

curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:3001/api/v1/clusters

Get health score

curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:3001/api/v1/panels/health-score

List active alerts (last 24 hours)

curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:3001/api/v1/alerts/history?status=active&hours=24"

Add a cluster

curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"host":"clickhouse.internal","port":8123,"username":"clustersight_ro","password":"pw","name":"prod"}' \
  http://localhost:3001/api/v1/clusters

Acknowledge an alert

curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  http://localhost:3001/api/v1/alerts/<alert-id>/acknowledge