Skip to content

telemetryflow/telemetryflow-uptime

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
TelemetryFlow Logo

TelemetryFlow Uptime - AI-Powered Uptime & Status Page Platform

Version License NestJS Vue 3 TypeScript PostgreSQL ClickHouse Redis DDD RBAC Docker


Overview

TelemetryFlow Uptime is a full-stack monorepo providing Uptime Monitoring, Status Pages, IAM, Alerting, Audit, Tenancy, and System management. Built with NestJS backend and Vue 3 frontend, extracted and adapted from the TelemetryFlow Platform monolith.

Architecture

Monorepo Structure (pnpm + Turborepo)

telemetryflow-uptime/
├── backend/                    # NestJS API (@telemetryflow/backend)
│   └── src/
│       ├── main.ts
│       ├── app.module.ts
│       ├── modules/
│       │   ├── monitoring/
│       │   │   ├── uptime/         # Uptime Monitoring (HTTP/TCP/Ping checks)
│       │   │   └── status-page/    # Status Pages & Incident Management
│       │   ├── iam/            # Identity & Access Management
│       │   ├── auth/           # Authentication (JWT, MFA, Sessions)
│       │   ├── sso/            # Single Sign-On (SAML, OAuth2)
│       │   ├── api-keys/       # API Key Management
│       │   ├── audit/          # Audit Logging (ClickHouse)
│       │   ├── tenancy/        # Multi-tenancy Management
│       │   ├── alerting/       # Alert Rules Engine
│       │   ├── query/          # TFQL Query Engine
│       │   ├── llm/            # AI Assistant (LLM Integration)
│       │   ├── retention/      # Data Retention Policies
│       │   ├── notification/   # Notification Service
│       │   ├── data-masking/   # Data Masking Policies
│       │   └── cache/          # Caching Service
│       ├── shared/             # Shared utilities (queue, domain primitives)
│       ├── database/           # PostgreSQL + ClickHouse migrations/seeds
│       ├── logger/             # Winston logger module
├── frontend/                   # Vue 3 SPA (@telemetryflow/viz)
│   └── src/
│       ├── views/              # Page components
│       │   └── monitoring/     # Uptime & Status Page views
│       ├── components/         # Reusable components
│       │   └── monitoring/     # Uptime graphs, check history, response time charts
│       ├── layouts/            # SideNav layout
│       ├── api/                # API client modules
│       ├── store/              # Pinia stores
│       ├── router/             # Vue Router
│       └── composables/        # Vue composables
├── config/                     # Service configs (PostgreSQL, ClickHouse)
├── docs/                       # Documentation
├── docker-compose.yml          # Full stack deployment
├── turbo.json                  # Turborepo task config
├── pnpm-workspace.yaml         # Workspace definition
└── package.json                # Root scripts (turbo-based)

Sidebar Navigation (Frontend)

Section Items
Home Overview Dashboard
Monitor Uptime Monitoring, Status Pages
Alert Alerts, Alert Rules
IAM Users, Roles, Permissions, Matrix, Groups
Tenancy Tenants, Organizations, Workspaces, Regions
System Setup, Channels, Notifications, AI Assistant, API Keys, PII Masking, Audit Logs, Retention
Account Profile, Security, Sessions, Preferences
Public Status Page (/status/:slug)

Features

Uptime Monitoring

  • Multi-protocol checks: HTTP(S), TCP, ICMP Ping, DNS
  • Configurable intervals: Per-monitor check frequency (10s–5min)
  • Response time tracking: Latency measurement with ClickHouse time-series storage
  • Uptime calculation: Real-time uptime percentage per monitor (24h, 7d, 30d, 90d)
  • SSL/TLS monitoring: Certificate expiry tracking
  • Keyword matching: HTTP body content validation
  • Geographic distributed checks: Multi-region monitoring support
  • Scheduled maintenance windows: Planned downtime exclusion from uptime calculations
  • Domain events: Event-driven architecture for check results, state changes
  • CQRS pattern: Separate read/write operations with dedicated check scheduler

Status Pages

  • Public status pages: Custom-branded pages at /status/:slug
  • Incident management: Create, update, and resolve incidents with timeline
  • Incident timeline: Real-time incident updates with status history
  • Component status: Individual service/component status tracking
  • Subscriber notifications: Email subscriptions for status updates
  • Custom branding: Logo, colors, and custom domain support
  • Uptime graphs: Visual uptime history on public pages
  • Scheduled maintenances: Planned maintenance window display

IAM Module

  • Multi-tenant architecture: Tenant -> Organization -> Workspace hierarchy
  • User management: Complete CRUD with role-based access
  • 5-Tier RBAC System: Super Admin, Administrator, Developer, Viewer, Demo
  • Role-Based Access Control: Hierarchical roles with 22+ permissions
  • Group management: User groups with permission inheritance
  • Region support: Multi-region tenant deployment
  • CQRS pattern: Separate read/write operations (33 commands, 18 queries)
  • Domain events: Event-driven architecture (25+ events)

Retention

  • Retention Policies: Configurable data retention per data type (logs, metrics, traces, audit)
  • Automated Enforcement: Scheduled cleanup via cron jobs
  • Archive Support: Optional archival before deletion

Alerting Module

  • Alert Rules Engine: Define threshold-based alert rules via TFQL
  • Alert Management: Real-time alert tracking and history
  • Notification Integration: Alerts trigger notifications

AI Assistant (LLM)

  • LLM Integration: Multi-provider support (OpenAI, Anthropic, etc.)
  • Encrypted Configuration: Secure API key storage
  • Chat Interface: AI-powered assistant in the frontend

Authentication & Security

  • JWT Authentication: Secure token-based auth with refresh tokens
  • MFA Support: TOTP-based multi-factor authentication
  • SSO Integration: SAML and OAuth2 providers
  • API Key Management: Scoped API keys with permissions
  • Data Masking: Policy-based data redaction
  • Argon2 Password Hashing: Industry-standard password security

Observability

  • Swagger/OpenAPI: Interactive API documentation at /api
  • Winston Logging: Structured logging with multiple transports
  • Health Checks: Built-in health endpoint

Quick Start

Prerequisites

  • Node.js 18+
  • pnpm 8+
  • Docker & Docker Compose

One-Command Setup

# Start all services
docker-compose --profile all up -d

# Or start specific profiles
docker-compose --profile uptime up -d                      # Uptime services only

Docker Profiles

Available profiles:

  • uptime - Backend, Frontend, PostgreSQL, ClickHouse, Redis, NATS
  • tools - Portainer
  • all - Everything

See docs/DOCKER_SETUP.md for details.

Manual Setup

# Clone repository
git clone https://github.com/telemetryflow/telemetryflow-uptime.git
cd telemetryflow-uptime
pnpm install

# Configure environment
cp .env.example .env
# Edit .env with your configuration

# Generate secrets
pnpm run generate:secrets

# Start infrastructure (PostgreSQL + ClickHouse + Redis + NATS)
docker-compose up -d

# Run migrations and seeds
pnpm run db:migrate:seed

# Start development (backend + frontend)
pnpm run dev

Development

Available Scripts

# Development
pnpm dev                  # Start backend + frontend (turbo)
pnpm dev:backend          # Backend only
pnpm dev:frontend         # Frontend only

# Build
pnpm build                # Build all (turbo)
pnpm build:backend        # Backend only
pnpm build:frontend       # Frontend only

# Database
pnpm db:migrate           # Run all migrations (PostgreSQL + ClickHouse)
pnpm db:migrate:seed      # Run migrations + seeds (full setup)
pnpm db:fresh             # Full reset (cleanup + migrate + seed)
pnpm db:seed              # Seed all data
pnpm db:cleanup           # Clean all databases

# Docker
pnpm docker:up            # Start all containers
pnpm docker:down          # Stop all containers
pnpm docker:logs          # View logs
pnpm docker:clean         # Clean volumes

# Code Quality
pnpm lint                 # Lint backend code
pnpm lint:fix             # Lint and fix

# Security
pnpm generate:secrets     # Generate JWT & Session secrets

5-Tier RBAC System

Role Hierarchy

  1. Super Administrator (Global)

    • Platform management across all organizations
    • All permissions
  2. Administrator (Organization-scoped)

    • Full CRUD within organization
    • Cannot manage platform
  3. Developer (Organization-scoped)

    • Create/Read/Update (no delete)
    • Cannot manage users/roles
  4. Viewer (Organization-scoped)

    • Read-only access
    • Cannot modify resources
  5. Demo (Demo org only)

    • Developer access in demo organization
    • Isolated from production data

Default Users

Email Password Role Tier
superadmin.telemetryflow@telemetryflow.id SuperAdmin@123456 Super Administrator 1
administrator.telemetryflow@telemetryflow.id Admin@123456 Administrator 2
developer.telemetryflow@telemetryflow.id Developer@123456 Developer 3
viewer.telemetryflow@telemetryflow.id Viewer@123456 Viewer 4
demo.telemetryflow@telemetryflow.id Demo@123456 Demo 5

API Documentation

Once running, access Swagger UI at: http://localhost:3000/api

Key Endpoints

  • Uptime Monitors: /api/v2/uptime/monitors - Monitor CRUD and check management
  • Uptime Checks: /api/v2/uptime/monitors/:id/checks - Check history and results
  • Status Pages: /api/v2/status-pages - Status page management
  • Incidents: /api/v2/status-pages/:id/incidents - Incident management
  • Public Status: /api/v2/status/public/:slug - Public status page (no auth)
  • Subscribers: /api/v2/status-pages/:id/subscribers - Status page subscriptions
  • Auth: /api/v2/auth - Login, logout, refresh, MFA
  • Users: /api/v2/iam/users - User management
  • Roles: /api/v2/iam/roles - Role management
  • Permissions: /api/v2/iam/permissions - Permission management
  • Tenants: /api/v2/iam/tenants - Tenant management
  • Organizations: /api/v2/iam/organizations - Organization management
  • Workspaces: /api/v2/iam/workspaces - Workspace management
  • Groups: /api/v2/iam/groups - Group management
  • Regions: /api/v2/iam/regions - Region management
  • API Keys: /api/v2/api-keys - API key management
  • SSO: /api/v2/sso - SSO provider management
  • Audit: /api/v2/audit - Audit log queries
  • Alerts: /api/v2/alerting/alerts - Alert management
  • Alert Rules: /api/v2/alerting/alert-rules - Alert rule management
  • LLM: /api/v2/llm - AI Assistant configuration
  • Data Masking: /api/v2/data-masking - Data masking policies
  • Retention: /api/v2/retention - Data retention policies
  • Health: /health - Health check

Docker Deployment

Docker Images

Image Registry Purpose
telemetryflow/telemetryflow-uptime Docker Hub NestJS Backend
telemetryflow/telemetryflow-uptime-viz Docker Hub Vue 3 Frontend (Nginx)

Services

Service IP Port(s) Purpose
Frontend 172.145.145.15 8080 Vue 3 SPA (Nginx)
Backend 172.145.145.10 3000 NestJS API
PostgreSQL 172.145.145.20 5432 IAM + config data
Redis 172.145.145.50 6379 Cache + queues
NATS 172.145.145.55 4222 Event streaming
ClickHouse 172.145.145.40 8123/9000 Audit + uptime

Network: 172.145.0.0/16


Technology Stack

Backend

  • Framework: NestJS 11.x
  • Language: TypeScript 5.9
  • Database: PostgreSQL 16 + ClickHouse
  • ORM: TypeORM 0.3
  • Cache: Redis 7
  • Messaging: NATS JetStream
  • Queues: BullMQ
  • Architecture: DDD + CQRS
  • API Docs: Swagger/OpenAPI
  • Logger: Winston
  • Password Hashing: Argon2

Frontend

  • Framework: Vue 3 + Vite
  • State: Pinia
  • Routing: Vue Router 4
  • UI: Custom component library

Infrastructure

  • Containerization: Docker + Docker Compose
  • Package Manager: pnpm (workspaces)
  • Build Orchestration: Turborepo

Comparison with TelemetryFlow Platform

Feature Platform Uptime
Modules 25+ 15
Frontend Full observability UI Uptime + Status Page + IAM + Alerts + AI
Uptime Monitoring Yes Yes
Status Pages Yes Yes
Telemetry Ingestion Metrics, Logs, Traces No
Alerts Yes Yes
AI Assistant Yes Yes
Retention Yes Yes
Agent Management Yes No
Dashboards Full dashboard builder Overview dashboards
Architecture Monolith Monorepo (Backend + Frontend)
Databases PostgreSQL + ClickHouse PostgreSQL + ClickHouse
Docker Images Backend + Frontend Backend + Frontend
Infrastructure Redis + NATS + Loki + OS Redis + NATS

See docs/ARCHITECTURE.md for detailed architecture diagrams.


Documentation

Core

Architecture & Modules

API & Frontend

Security & Alerting

Configuration


Contributing

See CONTRIBUTING.md for detailed guidelines.

Security

See SECURITY.md for security policy and vulnerability reporting.

License

Apache-2.0 License - see LICENSE file for details

Acknowledgments

Extracted from TelemetryFlow Platform - Community Enterprise Observability Platform (CEOP).


Built with ❤️ by Telemetri Data Indonesia collaboration with Kiro