SchedSim is a comprehensive simulation and testing platform designed for Kubernetes scheduling teams. It enables full-cycle verification of scheduler plugins (kube-scheduler, crane-scheduler, descheduler, pooling-scheduler), covering functional correctness, algorithm effectiveness, performance benchmarking, and component coordination.
- Visual Scenario Orchestration: Design complex testing scenarios using a DAG-based visual editor (React Flow).
- Hybrid Simulation Environment: Supports Kind + KWOK for large-scale node simulation (hundreds to thousands of nodes) with minimal resource usage.
- Cluster Mirroring: Capture state from production clusters and replay it in a local simulation environment for realistic testing.
- Multi-Scheduler Support: Built-in support for kube-scheduler, crane-scheduler, descheduler, and pooling-scheduler.
- Chaos Engineering: Inject faults (node failure, network partition) to verify scheduler resilience.
- Comprehensive Reporting: Generate detailed test reports with metrics, logs, and pass/fail criteria.
- Real-time Monitoring: WebSocket-based real-time execution logs and status updates.
- Go: 1.22+
- Node.js: 20+
- Docker: 20.10+
- Kind: v0.20.0+
- KWOK: v0.5.0+
- Clone the repository:

      git clone https://github.com/gocrane/sched-sim.git
      cd sched-sim

- Start in development mode. This starts both the Go backend (with hot reload via air, if available) and the React frontend (Vite):

      make dev

  Access the Web UI at http://localhost:5173 and the API at http://localhost:8080.
- Build the image:

      make docker

- Run the container:

      docker run -d -p 8080:8080 --name schedsim schedsim:dev

  Access the application at http://localhost:8080.
- Package the Helm chart:

      make helm-package

- Install the chart:

      helm install schedsim ./deploy/helm/schedsim
SchedSim is configured via configs/schedsim.yaml. The default configuration includes:

    server:
      port: 8080
      mode: debug # debug or release
    database:
      driver: sqlite # sqlite or postgres
      dsn: "./data/schedsim.db"
    environment:
      kindImageRepo: "kindest/node"
      kwokVersion: "v0.5.0"
      maxConcurrentExecutions: 3

See configs/schedsim.yaml for the full configuration reference.
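As a rough illustration of how these settings could be represented and sanity-checked in Go, the sketch below mirrors the documented keys and defaults. The type names, field names, and the `Validate` rules are hypothetical and are not taken from the SchedSim source. In particular, requiring `maxConcurrentExecutions` to be at least 1 is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Config mirrors the keys shown in the default configuration.
// All type and field names here are hypothetical.
type Config struct {
	Server      ServerConfig
	Database    DatabaseConfig
	Environment EnvironmentConfig
}

type ServerConfig struct {
	Port int
	Mode string // "debug" or "release"
}

type DatabaseConfig struct {
	Driver string // "sqlite" or "postgres"
	DSN    string
}

type EnvironmentConfig struct {
	KindImageRepo           string
	KwokVersion             string
	MaxConcurrentExecutions int
}

// DefaultConfig returns the default values listed in this README.
func DefaultConfig() Config {
	return Config{
		Server:   ServerConfig{Port: 8080, Mode: "debug"},
		Database: DatabaseConfig{Driver: "sqlite", DSN: "./data/schedsim.db"},
		Environment: EnvironmentConfig{
			KindImageRepo:           "kindest/node",
			KwokVersion:             "v0.5.0",
			MaxConcurrentExecutions: 3,
		},
	}
}

// Validate checks the documented choices for mode and driver.
// The port range and the minimum of 1 concurrent execution are assumptions.
func (c Config) Validate() error {
	if c.Server.Port < 1 || c.Server.Port > 65535 {
		return fmt.Errorf("server.port %d is out of range", c.Server.Port)
	}
	switch c.Server.Mode {
	case "debug", "release":
	default:
		return fmt.Errorf("server.mode %q must be debug or release", c.Server.Mode)
	}
	switch c.Database.Driver {
	case "sqlite", "postgres":
	default:
		return fmt.Errorf("database.driver %q must be sqlite or postgres", c.Database.Driver)
	}
	if c.Database.DSN == "" {
		return errors.New("database.dsn must not be empty")
	}
	if c.Environment.MaxConcurrentExecutions < 1 {
		return errors.New("environment.maxConcurrentExecutions must be at least 1")
	}
	return nil
}

func main() {
	cfg := DefaultConfig()
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("port=%d driver=%s\n", cfg.Server.Port, cfg.Database.Driver)
}
```

Validating at startup like this would surface a mistyped driver or mode before any environment is created.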
SchedSim consists of a React+TypeScript frontend and a Go (Gin) backend.
- Frontend: React 18, Ant Design 5, React Flow, ECharts.
- Backend: Go 1.22, Gin Web Framework, GORM.
- Infrastructure: Kind (Kubernetes in Docker), KWOK (Kubernetes WithOut Kubelet).
- Storage: SQLite (default) or PostgreSQL.
For a detailed architecture view, see System Architecture.
Navigate to Environments to create a new simulation cluster. You can choose a "Standard" template (Kind+KWOK) and define the number of simulated nodes (e.g., 100 nodes).
Use the Scenario Editor to create a test flow:
- Deploy: Add a step to deploy a workload (Deployment/StatefulSet).
- Assert: Add a step to wait for "All Pods Scheduled".
- Validate: Add a step to check "Load Distribution" or "Resource Utilization".
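Because scenarios are modelled as a DAG, the steps above have to run in an order that respects their dependencies. The sketch below shows one common way to derive such an order, Kahn's topological sort. The `Step` type and `executionOrder` function are hypothetical names for illustration; the actual SchedSim execution engine may work differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is a hypothetical scenario node. DependsOn lists steps that must finish first.
type Step struct {
	Name      string
	DependsOn []string
}

// executionOrder returns step names in a dependency-respecting order using
// Kahn's algorithm. It fails on unknown dependencies and on cycles.
func executionOrder(steps []Step) ([]string, error) {
	indegree := make(map[string]int, len(steps))
	dependents := make(map[string][]string, len(steps))
	for _, s := range steps {
		indegree[s.Name] = 0
	}
	for _, s := range steps {
		for _, dep := range s.DependsOn {
			if _, ok := indegree[dep]; !ok {
				return nil, fmt.Errorf("step %q depends on unknown step %q", s.Name, dep)
			}
			indegree[s.Name]++
			dependents[dep] = append(dependents[dep], s.Name)
		}
	}
	// Seed the queue in declaration order so the result is deterministic.
	var queue []string
	for _, s := range steps {
		if indegree[s.Name] == 0 {
			queue = append(queue, s.Name)
		}
	}
	order := make([]string, 0, len(steps))
	for len(queue) > 0 {
		name := queue[0]
		queue = queue[1:]
		order = append(order, name)
		for _, next := range dependents[name] {
			indegree[next]--
			if indegree[next] == 0 {
				queue = append(queue, next)
			}
		}
	}
	// Any step left unprocessed still has an unmet dependency, which implies a cycle.
	if len(order) != len(steps) {
		return nil, errors.New("scenario contains a dependency cycle")
	}
	return order, nil
}

func main() {
	steps := []Step{
		{Name: "validate", DependsOn: []string{"assert"}},
		{Name: "deploy"},
		{Name: "assert", DependsOn: []string{"deploy"}},
	}
	order, err := executionOrder(steps)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(order) // prints [deploy assert validate]
}
```

Rejecting cycles up front means a malformed scenario fails before any workload is deployed.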
Run the scenario against your created environment. Watch the real-time logs in the Execution view. Once finished, download the PDF/HTML report from the Reports center.
The backend exposes a RESTful API. Key endpoints include:
- POST /api/v1/auth/login: User authentication.
- GET /api/v1/environments: List managed clusters.
- POST /api/v1/executions: Trigger a new test execution.
- GET /api/v1/reports: Retrieve test reports.
- GET /api/v1/ws: WebSocket connection for real-time updates.
See internal/api/router/router.go for all defined routes.
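For scripting against these endpoints, a small request builder can keep the base URL and credentials in one place. The sketch below is a minimal example using only the Go standard library. The helper name `newAPIRequest` is hypothetical, and sending the login token as an `Authorization: Bearer` header is an assumption; check the router and its middleware for the scheme SchedSim actually expects.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newAPIRequest builds a request for one of the documented API paths.
// The Bearer token scheme is an assumption, not confirmed by this README.
func newAPIRequest(baseURL, method, path, token string) (*http.Request, error) {
	url := strings.TrimRight(baseURL, "/") + path
	req, err := http.NewRequest(method, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/json")
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	// "example-token" is a placeholder for whatever the login endpoint returns.
	req, err := newAPIRequest("http://localhost:8080", http.MethodGet, "/api/v1/environments", "example-token")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

The returned request can be passed to `http.DefaultClient.Do` once a SchedSim instance is running.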