TelemetryFlow Uptime is a full-stack monorepo providing Uptime Monitoring, Status Pages, IAM, Alerting, Audit, Tenancy, and System management. Built with NestJS backend and Vue 3 frontend, extracted and adapted from the TelemetryFlow Platform monolith.
telemetryflow-uptime/
├── backend/ # NestJS API (@telemetryflow/backend)
│ └── src/
│ ├── main.ts
│ ├── app.module.ts
│ ├── modules/
│ │ ├── monitoring/
│ │ │ ├── uptime/ # Uptime Monitoring (HTTP/TCP/Ping checks)
│ │ │ └── status-page/ # Status Pages & Incident Management
│ │ ├── iam/ # Identity & Access Management
│ │ ├── auth/ # Authentication (JWT, MFA, Sessions)
│ │ ├── sso/ # Single Sign-On (SAML, OAuth2)
│ │ ├── api-keys/ # API Key Management
│ │ ├── audit/ # Audit Logging (ClickHouse)
│ │ ├── tenancy/ # Multi-tenancy Management
│ │ ├── alerting/ # Alert Rules Engine
│ │ ├── query/ # TFQL Query Engine
│ │ ├── llm/ # AI Assistant (LLM Integration)
│ │ ├── retention/ # Data Retention Policies
│ │ ├── notification/ # Notification Service
│ │ ├── data-masking/ # Data Masking Policies
│ │ └── cache/ # Caching Service
│ ├── shared/ # Shared utilities (queue, domain primitives)
│ ├── database/ # PostgreSQL + ClickHouse migrations/seeds
│ ├── logger/ # Winston logger module
├── frontend/ # Vue 3 SPA (@telemetryflow/viz)
│ └── src/
│ ├── views/ # Page components
│ │ └── monitoring/ # Uptime & Status Page views
│ ├── components/ # Reusable components
│ │ └── monitoring/ # Uptime graphs, check history, response time charts
│ ├── layouts/ # SideNav layout
│ ├── api/ # API client modules
│ ├── store/ # Pinia stores
│ ├── router/ # Vue Router
│ └── composables/ # Vue composables
├── config/ # Service configs (PostgreSQL, ClickHouse)
├── docs/ # Documentation
├── docker-compose.yml # Full stack deployment
├── turbo.json # Turborepo task config
├── pnpm-workspace.yaml # Workspace definition
└── package.json # Root scripts (turbo-based)
| Section | Items |
|---|---|
| Home | Overview Dashboard |
| Monitor | Uptime Monitoring, Status Pages |
| Alert | Alerts, Alert Rules |
| IAM | Users, Roles, Permissions, Matrix, Groups |
| Tenancy | Tenants, Organizations, Workspaces, Regions |
| System | Setup, Channels, Notifications, AI Assistant, API Keys, PII Masking, Audit Logs, Retention |
| Account | Profile, Security, Sessions, Preferences |
| Public | Status Page (/status/:slug) |
- Multi-protocol checks: HTTP(S), TCP, ICMP Ping, DNS
- Configurable intervals: Per-monitor check frequency (10s–5min)
- Response time tracking: Latency measurement with ClickHouse time-series storage
- Uptime calculation: Real-time uptime percentage per monitor (24h, 7d, 30d, 90d)
- SSL/TLS monitoring: Certificate expiry tracking
- Keyword matching: HTTP body content validation
- Geographic distributed checks: Multi-region monitoring support
- Scheduled maintenance windows: Planned downtime exclusion from uptime calculations
- Domain events: Event-driven architecture for check results, state changes
- CQRS pattern: Separate read/write operations with dedicated check scheduler
- Public status pages: Custom-branded pages at
/status/:slug - Incident management: Create, update, and resolve incidents with timeline
- Incident timeline: Real-time incident updates with status history
- Component status: Individual service/component status tracking
- Subscriber notifications: Email subscriptions for status updates
- Custom branding: Logo, colors, and custom domain support
- Uptime graphs: Visual uptime history on public pages
- Scheduled maintenances: Planned maintenance window display
- Multi-tenant architecture: Tenant -> Organization -> Workspace hierarchy
- User management: Complete CRUD with role-based access
- 5-Tier RBAC System: Super Admin, Administrator, Developer, Viewer, Demo
- Role-Based Access Control: Hierarchical roles with 22+ permissions
- Group management: User groups with permission inheritance
- Region support: Multi-region tenant deployment
- CQRS pattern: Separate read/write operations (33 commands, 18 queries)
- Domain events: Event-driven architecture (25+ events)
- Retention Policies: Configurable data retention per data type (logs, metrics, traces, audit)
- Automated Enforcement: Scheduled cleanup via cron jobs
- Archive Support: Optional archival before deletion
- Alert Rules Engine: Define threshold-based alert rules via TFQL
- Alert Management: Real-time alert tracking and history
- Notification Integration: Alerts trigger notifications
- LLM Integration: Multi-provider support (OpenAI, Anthropic, etc.)
- Encrypted Configuration: Secure API key storage
- Chat Interface: AI-powered assistant in the frontend
- JWT Authentication: Secure token-based auth with refresh tokens
- MFA Support: TOTP-based multi-factor authentication
- SSO Integration: SAML and OAuth2 providers
- API Key Management: Scoped API keys with permissions
- Data Masking: Policy-based data redaction
- Argon2 Password Hashing: Industry-standard password security
- Swagger/OpenAPI: Interactive API documentation at
/api - Winston Logging: Structured logging with multiple transports
- Health Checks: Built-in health endpoint
- Node.js 18+
- pnpm 8+
- Docker & Docker Compose
# Start all services
docker-compose --profile all up -d
# Or start specific profiles
docker-compose --profile uptime up -d # Uptime services onlyAvailable profiles:
uptime- Backend, Frontend, PostgreSQL, ClickHouse, Redis, NATStools- Portainerall- Everything
See docs/DOCKER_SETUP.md for details.
# Clone repository
git clone https://github.com/telemetryflow/telemetryflow-uptime.git
cd telemetryflow-uptime
pnpm install
# Configure environment
cp .env.example .env
# Edit .env with your configuration
# Generate secrets
pnpm run generate:secrets
# Start infrastructure (PostgreSQL + ClickHouse + Redis + NATS)
docker-compose up -d
# Run migrations and seeds
pnpm run db:migrate:seed
# Start development (backend + frontend)
pnpm run dev# Development
pnpm dev # Start backend + frontend (turbo)
pnpm dev:backend # Backend only
pnpm dev:frontend # Frontend only
# Build
pnpm build # Build all (turbo)
pnpm build:backend # Backend only
pnpm build:frontend # Frontend only
# Database
pnpm db:migrate # Run all migrations (PostgreSQL + ClickHouse)
pnpm db:migrate:seed # Run migrations + seeds (full setup)
pnpm db:fresh # Full reset (cleanup + migrate + seed)
pnpm db:seed # Seed all data
pnpm db:cleanup # Clean all databases
# Docker
pnpm docker:up # Start all containers
pnpm docker:down # Stop all containers
pnpm docker:logs # View logs
pnpm docker:clean # Clean volumes
# Code Quality
pnpm lint # Lint backend code
pnpm lint:fix # Lint and fix
# Security
pnpm generate:secrets # Generate JWT & Session secrets-
Super Administrator (Global)
- Platform management across all organizations
- All permissions
-
Administrator (Organization-scoped)
- Full CRUD within organization
- Cannot manage platform
-
Developer (Organization-scoped)
- Create/Read/Update (no delete)
- Cannot manage users/roles
-
Viewer (Organization-scoped)
- Read-only access
- Cannot modify resources
-
Demo (Demo org only)
- Developer access in demo organization
- Isolated from production data
| Password | Role | Tier | |
|---|---|---|---|
| superadmin.telemetryflow@telemetryflow.id | SuperAdmin@123456 | Super Administrator | 1 |
| administrator.telemetryflow@telemetryflow.id | Admin@123456 | Administrator | 2 |
| developer.telemetryflow@telemetryflow.id | Developer@123456 | Developer | 3 |
| viewer.telemetryflow@telemetryflow.id | Viewer@123456 | Viewer | 4 |
| demo.telemetryflow@telemetryflow.id | Demo@123456 | Demo | 5 |
Once running, access Swagger UI at: http://localhost:3000/api
- Uptime Monitors:
/api/v2/uptime/monitors- Monitor CRUD and check management - Uptime Checks:
/api/v2/uptime/monitors/:id/checks- Check history and results - Status Pages:
/api/v2/status-pages- Status page management - Incidents:
/api/v2/status-pages/:id/incidents- Incident management - Public Status:
/api/v2/status/public/:slug- Public status page (no auth) - Subscribers:
/api/v2/status-pages/:id/subscribers- Status page subscriptions - Auth:
/api/v2/auth- Login, logout, refresh, MFA - Users:
/api/v2/iam/users- User management - Roles:
/api/v2/iam/roles- Role management - Permissions:
/api/v2/iam/permissions- Permission management - Tenants:
/api/v2/iam/tenants- Tenant management - Organizations:
/api/v2/iam/organizations- Organization management - Workspaces:
/api/v2/iam/workspaces- Workspace management - Groups:
/api/v2/iam/groups- Group management - Regions:
/api/v2/iam/regions- Region management - API Keys:
/api/v2/api-keys- API key management - SSO:
/api/v2/sso- SSO provider management - Audit:
/api/v2/audit- Audit log queries - Alerts:
/api/v2/alerting/alerts- Alert management - Alert Rules:
/api/v2/alerting/alert-rules- Alert rule management - LLM:
/api/v2/llm- AI Assistant configuration - Data Masking:
/api/v2/data-masking- Data masking policies - Retention:
/api/v2/retention- Data retention policies - Health:
/health- Health check
| Image | Registry | Purpose |
|---|---|---|
telemetryflow/telemetryflow-uptime |
Docker Hub | NestJS Backend |
telemetryflow/telemetryflow-uptime-viz |
Docker Hub | Vue 3 Frontend (Nginx) |
| Service | IP | Port(s) | Purpose |
|---|---|---|---|
| Frontend | 172.145.145.15 | 8080 | Vue 3 SPA (Nginx) |
| Backend | 172.145.145.10 | 3000 | NestJS API |
| PostgreSQL | 172.145.145.20 | 5432 | IAM + config data |
| Redis | 172.145.145.50 | 6379 | Cache + queues |
| NATS | 172.145.145.55 | 4222 | Event streaming |
| ClickHouse | 172.145.145.40 | 8123/9000 | Audit + uptime |
Network: 172.145.0.0/16
- Framework: NestJS 11.x
- Language: TypeScript 5.9
- Database: PostgreSQL 16 + ClickHouse
- ORM: TypeORM 0.3
- Cache: Redis 7
- Messaging: NATS JetStream
- Queues: BullMQ
- Architecture: DDD + CQRS
- API Docs: Swagger/OpenAPI
- Logger: Winston
- Password Hashing: Argon2
- Framework: Vue 3 + Vite
- State: Pinia
- Routing: Vue Router 4
- UI: Custom component library
- Containerization: Docker + Docker Compose
- Package Manager: pnpm (workspaces)
- Build Orchestration: Turborepo
| Feature | Platform | Uptime |
|---|---|---|
| Modules | 25+ | 15 |
| Frontend | Full observability UI | Uptime + Status Page + IAM + Alerts + AI |
| Uptime Monitoring | Yes | Yes |
| Status Pages | Yes | Yes |
| Telemetry Ingestion | Metrics, Logs, Traces | No |
| Alerts | Yes | Yes |
| AI Assistant | Yes | Yes |
| Retention | Yes | Yes |
| Agent Management | Yes | No |
| Dashboards | Full dashboard builder | Overview dashboards |
| Architecture | Monolith | Monorepo (Backend + Frontend) |
| Databases | PostgreSQL + ClickHouse | PostgreSQL + ClickHouse |
| Docker Images | Backend + Frontend | Backend + Frontend |
| Infrastructure | Redis + NATS + Loki + OS | Redis + NATS |
See docs/ARCHITECTURE.md for detailed architecture diagrams.
- CHANGELOG.md - Version history
- CONTRIBUTING.md - Contribution guidelines
- SECURITY.md - Security policy
- docs/ARCHITECTURE.md - System architecture with mermaid diagrams
- docs/UPTIME_MODULE.md - Uptime monitoring module documentation
- docs/STATUS_PAGE_MODULE.md - Status page module documentation
- docs/DATABASE_SCHEMA.md - Database schema documentation
- docs/DOCKER_SETUP.md - Docker deployment guide
- docs/API_REFERENCE.md - API reference documentation
- docs/FRONTEND_GUIDE.md - Frontend development guide
- docs/SECURITY_ARCHITECTURE.md - Security architecture
- docs/UPTIME_ALERTING.md - Alerting & notification guide
- docs/EMAIL_TEMPLATES.md - Email template customization
- config/README.md - All service configurations
See CONTRIBUTING.md for detailed guidelines.
See SECURITY.md for security policy and vulnerability reporting.
Apache-2.0 License - see LICENSE file for details
Extracted from TelemetryFlow Platform - Community Enterprise Observability Platform (CEOP).
Built with ❤️ by Telemetri Data Indonesia collaboration with Kiro