From b3c667c18b4d2c402acdedb6962c1e77951db5a1 Mon Sep 17 00:00:00 2001 From: GatewayJ <835269233@qq.com> Date: Thu, 30 Apr 2026 15:42:27 +0800 Subject: [PATCH 1/2] chore(git): stop tracking docs and ignore .codex --- .gitignore | 4 +- docs/DEVELOPMENT-NOTES.md | 230 ------ docs/DEVELOPMENT.md | 992 -------------------------- docs/POOL-STATUS-EXPLANATION.md | 285 -------- docs/RUSTFS-K8S-INTEGRATION.md | 405 ----------- docs/RUSTFS-OBJECT-STORAGE-USAGE.md | 664 ----------------- docs/architecture-decisions.md | 469 ------------ docs/multi-pool-use-cases.md | 430 ----------- docs/prd-tenant-events-sse.md | 127 ---- docs/tech-design-tenant-events-sse.md | 239 ------- 10 files changed, 3 insertions(+), 3842 deletions(-) delete mode 100755 docs/DEVELOPMENT-NOTES.md delete mode 100755 docs/DEVELOPMENT.md delete mode 100755 docs/POOL-STATUS-EXPLANATION.md delete mode 100755 docs/RUSTFS-K8S-INTEGRATION.md delete mode 100755 docs/RUSTFS-OBJECT-STORAGE-USAGE.md delete mode 100755 docs/architecture-decisions.md delete mode 100755 docs/multi-pool-use-cases.md delete mode 100644 docs/prd-tenant-events-sse.md delete mode 100644 docs/tech-design-tenant-events-sse.md diff --git a/.gitignore b/.gitignore index c3f77e0..754ba70 100755 --- a/.gitignore +++ b/.gitignore @@ -26,4 +26,6 @@ console-web/node_modules/ # Docs / summaries (local or generated) CONSOLE-INTEGRATION-SUMMARY.md SCRIPTS-UPDATE.md -AGENTS.md \ No newline at end of file +AGENTS.md +docs/ +.codex/ diff --git a/docs/DEVELOPMENT-NOTES.md b/docs/DEVELOPMENT-NOTES.md deleted file mode 100755 index 77a3e8b..0000000 --- a/docs/DEVELOPMENT-NOTES.md +++ /dev/null @@ -1,230 +0,0 @@ -# Development Notes - -> **Scope**: This file records **historical analysis sessions and design notes**. It is **not** the canonical development guide. For setup and quality gates, use [`DEVELOPMENT.md`](./DEVELOPMENT.md) and [`CONTRIBUTING.md`](../CONTRIBUTING.md). For current ports and operator behavior, see [`CLAUDE.md`](../CLAUDE.md) and the source tree. - -**Port terminology (do not confuse):** - -- **RustFS inside a Tenant** (Services created by the operator): S3 API **9000**, RustFS Console UI **9001** (see `src/types/v1alpha1/tenant/services.rs`). -- **Operator HTTP Console** (`cargo run -- console`, default **9090**): separate management API for the operator itself, not the same as the Tenant’s `{tenant}-console` Service. - -## Analysis Sessions - -### Initial Bug Analysis (2025-11-05) - -See [CHANGELOG.md](../CHANGELOG.md) for complete list of bugs found and fixed. - -**Key Discovery** (historical—**since fixed** in this repo): Through analysis of RustFS source and early operator output, several mismatches were found versus RustFS defaults, including: - -- Wrong **RustFS** service ports in older operator revisions (e.g. IO **90** instead of **9000**, console **9090** instead of **9001** for the in-cluster RustFS Console Service) -- Missing environment variables -- Non-standard volume paths - -**Methodology**: Analyzed RustFS repository at `~/git/rustfs` to verify correct implementation. - -### Multi-Pool Enhancements (2025-11-08) - -Added comprehensive Kubernetes scheduling capabilities to Pool struct. - -**Design Decision**: Use `SchedulingConfig` struct with `#[serde(flatten)]` -- Better code organization -- Maintains flat YAML structure -- Follows industry patterns (MongoDB, PostgreSQL operators) - -See [architecture-decisions.md](./architecture-decisions.md) for detailed rationale. 
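To make the pattern concrete, here is a minimal sketch of how `#[serde(flatten)]` keeps the YAML flat while the Rust side stays organized. The field names below are illustrative, not the actual operator structs:

```rust
use serde::{Deserialize, Serialize};

// Hypothetical, simplified versions of the real structs, for illustration only.
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct SchedulingConfig {
    #[serde(skip_serializing_if = "Option::is_none")]
    node_selector: Option<std::collections::BTreeMap<String, String>>,
}

#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct Pool {
    name: String,
    servers: i32,
    // Flattened: scheduling fields appear at the pool level in YAML,
    // not nested under a separate `scheduling:` key.
    #[serde(flatten)]
    scheduling: SchedulingConfig,
}
```

With `flatten`, users write fields like `nodeSelector:` directly under the pool entry in YAML, while the Rust code keeps scheduling concerns grouped in their own struct.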
- -### RustFS Architecture Deep Dive (2025-11-08) - -**Critical Finding**: All pools form ONE unified RustFS cluster, not independent storage tiers. - -#### How RustFS Actually Works - -From RustFS source code analysis (`~/git/rustfs`): - -**1. Unified Cluster Architecture** (`crates/ecstore/src/pools.rs`): -- All pools combined into ONE `RUSTFS_VOLUMES` environment variable -- Single distributed hash ring across all volumes -- No pool independence - -**2. Uniform Erasure Coding** (`crates/ecstore/src/erasure.rs`): -- Reed-Solomon erasure coding across ALL volumes -- Shards distributed uniformly (no preference for fast disks) -- Parity calculated for total drive count across all pools - -**3. No Storage Class Awareness** (`crates/ecstore/src/config/storageclass.rs`): -- Storage class controls PARITY levels (EC:4, EC:2), NOT disk selection -- Does NOT control data placement or prefer certain disks -- No hot/warm/cold data awareness - -**4. External Tiering Only** (`crates/ecstore/src/tier/tier.rs`): -- Tiering = transitioning to EXTERNAL cloud storage -- Types: `TierType::S3`, `TierType::Azure`, `TierType::GCS` -- NOT for internal disk class differentiation - -#### Performance Implications of Storage Class Mixing - -**Problem**: Mixing NVMe/SSD/HDD in one Tenant - -**What Actually Happens**: -- Object is erasure-coded into shards -- Shards distributed across ALL volumes (NVMe + SSD + HDD) -- Write completes when ALL shards written (limited by slowest = HDD) -- Read requires fetching shards (limited by slowest = HDD) -- **Result**: Entire cluster performs at HDD speed, NVMe wasted - -**Conclusion**: Do NOT mix storage classes for "performance tiers" - it doesn't work. - -#### Valid Multi-Pool Purposes - -✅ **What Works**: -- Cluster expansion (add pools for capacity) -- Geographic distribution (compliance/DR, not performance) -- Spot vs on-demand (compute cost, same storage class) -- Same class, different sizes (utilize mixed hardware) -- Resource differentiation (CPU/memory per pool) - -❌ **What Doesn't Work**: -- NVMe for hot data, HDD for cold data -- Storage performance tiering via multi-pool -- Automatic intelligent data placement - -**For Real Tiering**: Use RustFS lifecycle policies to external cloud storage (S3 Glacier, Azure Cool, GCS Nearline). - -## Design Principles - -### 1. Verify Against RustFS Source - -All implementation decisions verified against official RustFS source code, not assumptions. - -**Sources**: -- RustFS constants: `crates/config/src/constants/app.rs` -- RustFS config: `rustfs/src/config/mod.rs` -- RustFS Helm chart: `helm/rustfs/` - -### 2. Follow Kubernetes Conventions - -- Use recommended labels (`app.kubernetes.io/name`, etc.) -- Server-side apply for idempotency -- Owner references for garbage collection -- Industry-standard CRD patterns - -### 3. Backward Compatibility - -- All new fields are `Option` -- Use `#[serde(flatten)]` to avoid breaking YAML structure -- Maintain existing behavior by default - -### 4. User Experience First - -- Clear, accurate examples -- Prominent warnings about gotchas -- Comprehensive documentation -- Prevent costly mistakes (storage class mixing) - -## Testing Strategy - -### Unit Tests - -- Test resource structure creation -- Test field propagation (scheduling, RBAC, etc.) 
-- Test edge cases (None values, overrides) -- Currently: 47 library unit tests (run `cargo test --all` for the exact count), all passing - -### Integration Tests (Future) - -- Deploy actual Tenant -- Verify RustFS cluster formation -- Test multi-pool behavior -- Validate RUSTFS_VOLUMES expansion - -## Code Organization - -### Module Structure - -``` -src/ -├── types/ -│ └── v1alpha1/ -│ ├── pool.rs (SchedulingConfig + Pool) -│ ├── persistence.rs -│ ├── tenant.rs -│ └── tenant/ -│ ├── rbac.rs (RBAC factory methods) -│ ├── services.rs (Service factory methods) -│ └── workloads.rs (StatefulSet factory methods) -├── reconcile.rs (reconciliation logic) -└── context.rs (Kubernetes API wrapper) -``` - -### Pattern: Factory Methods - -Each resource type has a factory method on Tenant: -- `new_role()`, `new_service_account()`, `new_role_binding()` -- `new_io_service()`, `new_console_service()`, `new_headless_service()` -- `new_statefulset(pool)` - -This keeps logic organized and testable. - -## Common Pitfalls to Avoid - -### 1. Storage Class Mixing - -❌ **Don't**: Create pools with different storage classes for "performance tiering" -```yaml -pools: - - name: fast - storageClassName: nvme # ← Don't mix - - name: slow - storageClassName: hdd # ← Performance tiers -``` - -✅ **Do**: Use same storage class, different sizes -```yaml -pools: - - name: large - storageClassName: ssd # ← Same class - storage: 10Ti - - name: small - storageClassName: ssd # ← Same class - storage: 2Ti -``` - -### 2. Assuming Pool Independence - -❌ **Don't**: Think pools are independent clusters - -✅ **Do**: Understand all pools form ONE unified cluster via RUSTFS_VOLUMES - -### 3. Missing Required Fields - -Always set in operator (users don't need to): -- RUSTFS_VOLUMES (generated) -- RUSTFS_ADDRESS (auto-set) -- RUSTFS_CONSOLE_ADDRESS (auto-set) -- RUSTFS_CONSOLE_ENABLE (auto-set) - -## Future Enhancements - -### Planned - -- Status field population -- Configuration secret mounting -- Image pull policy application -- Health probes -- Per-pool status tracking - -### Under Consideration - -- Dynamic pool addition API -- Pool decommissioning automation -- Pool-specific service endpoints -- Advanced topology awareness - -## References - -- [Multi-Pool Use Cases](./multi-pool-use-cases.md) -- [Architecture Decisions](./architecture-decisions.md) -- [CHANGELOG](../CHANGELOG.md) - ---- - -**Last Updated**: 2026-03-28 diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md deleted file mode 100755 index c44b874..0000000 --- a/docs/DEVELOPMENT.md +++ /dev/null @@ -1,992 +0,0 @@ -# RustFS Operator Development Guide - -This guide will help you set up a local development environment for the RustFS Kubernetes Operator. - -## Documentation map - -- **Code quality and PR gates** (format, clippy, tests, console lint) are defined in [`CONTRIBUTING.md`](../CONTRIBUTING.md) and enforced by [`Makefile`](../Makefile). Run **`make pre-commit`** from the repo root before opening a PR. - -- **`just` vs `make`**: The [`Justfile`](../Justfile) provides optional tasks (`just pre-commit` runs `fmt` + clippy + `cargo check` + `cargo nextest`; it does **not** run `console-web` checks). For parity with [`CONTRIBUTING.md`](../CONTRIBUTING.md) and [`Makefile`](../Makefile), prefer **`make pre-commit`**. - -- **This guide** focuses on toolchain setup, clusters, and day-to-day workflows—not on duplicating the full command matrix from CONTRIBUTING. 
- -- **[`DEVELOPMENT-NOTES.md`](./DEVELOPMENT-NOTES.md)** records past analysis sessions; it is **not** a substitute for CONTRIBUTING or this file. - ---- - -## 📋 Prerequisites - -### Required Tools - -1. **Rust Toolchain** (1.91+) - - Project uses Rust Edition 2024 - - Required components: `rustfmt`, `clippy`, `rust-src`, `rust-analyzer` - -2. **Kubernetes Cluster** - - Kubernetes v1.27+ (current target: v1.30) - - For local development, use: - - [kind](https://kind.sigs.k8s.io/) (recommended) - - [minikube](https://minikube.sigs.k8s.io/) - - [k3s](https://k3s.io/) - - Docker Desktop (built-in Kubernetes) - -3. **kubectl** - - For interacting with Kubernetes clusters - -4. **Optional Tools** - - `just` - Task runner (project includes Justfile) - - `cargo-nextest` - Faster test runner - - `docker` - For building container images - - `OpenLens` - Kubernetes cluster management GUI - ---- - -## 🚀 Quick Start - -### 1. Install Rust Toolchain - -The project uses `rust-toolchain.toml` to automatically manage the Rust version: - -```bash -# If Rust is not installed yet -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh - -# Navigate to project directory (Rust will auto-install correct toolchain version) -cd ~/operator - -# Verify installation -rustc --version -cargo --version -``` - -The toolchain will automatically install: -- `rustfmt` - Code formatter -- `clippy` - Code linter -- `rust-src` - Rust source code -- `rust-analyzer` - IDE support - -### 2. Install Optional Development Tools - -```bash -# Install cargo-nextest (faster test runner) -cargo install cargo-nextest - -# Install just (task runner) -# macOS -brew install just - -# Linux -# Download from https://github.com/casey/just/releases -# Or use package manager -``` - -### 3. Clone the Project (if not already done) - -```bash -git clone https://github.com/rustfs/operator.git -cd operator -``` - -### 4. Verify Project Setup - -```bash -# Check Rust toolchain -rustc --version # Should be 1.91+ - -# Check project dependencies -cargo check - -# Run formatting check -cargo fmt --all --check - -# Run clippy check -cargo clippy --all-targets --all-features -- -D warnings -``` - ---- - -## 🔨 Building the Operator - -### How to Compile the Operator - -The operator can be built using Cargo (standard Rust build tool) or the Justfile task runner. - -#### Method 1: Using Cargo (Standard) - -```bash -# Debug build (faster compilation, larger binary, slower runtime) -cargo build - -# Release build (slower compilation, smaller binary, faster runtime) -cargo build --release - -# Binary locations: -# Debug: target/debug/operator -# Release: target/release/operator -``` - -#### Method 2: Using Justfile (Recommended) - -```bash -# Build Debug binary -just build - -# Build Release binary -just build MODE=release -``` - -#### Build Output - -After building, the operator binary will be located at: -- **Debug**: `target/debug/operator` -- **Release**: `target/release/operator` - -You can run it directly: -```bash -# Run debug binary -./target/debug/operator --help - -# Run release binary -./target/release/operator --help -``` - -#### Build Options - -```bash -# Format code before building -just fmt && just build - -# Run all checks before building (use make for full gate including console-web) -make pre-commit && just build MODE=release - -# Clean and rebuild -cargo clean && cargo build --release -``` - ---- - -## 🐳 Installing kind - -kind (Kubernetes in Docker) is the recommended tool for local Kubernetes development. 
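Because kind runs each cluster node as a Docker container, a working Docker daemon is a prerequisite for everything below. A quick sanity check before installing (plain shell, nothing project-specific):

```bash
# kind requires a running Docker daemon; verify it responds before creating clusters
if docker info >/dev/null 2>&1; then
  echo "Docker is running"
else
  echo "Start Docker (Desktop or daemon) first" >&2
fi
```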
- -### Installation - -#### macOS - -```bash -# Using Homebrew (recommended) -brew install kind - -# Verify installation -kind --version -``` - -#### Linux - -```bash -# Download binary from releases -curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64 -chmod +x ./kind -sudo mv ./kind /usr/local/bin/kind - -# Or using package manager (if available) -# Verify installation -kind --version -``` - -#### Windows - -```bash -# Using Chocolatey -choco install kind - -# Or download from: https://kind.sigs.k8s.io/docs/user/quick-start/ -``` - -### Creating a kind Cluster - -```bash -# Create a cluster named 'rustfs-dev' -kind create cluster --name rustfs-dev - -# Verify cluster is running -kubectl cluster-info --context kind-rustfs-dev - -# List clusters -kind get clusters - -# Check cluster nodes -kubectl get nodes -``` - -### kind Cluster Management - -#### Starting a Cluster - -```bash -# If cluster exists but is stopped, restart it -# Note: kind clusters run in Docker containers, so they persist until deleted -# To "restart", you may need to recreate if Docker was restarted - -# Check if cluster containers are running -docker ps | grep rustfs-dev - -# If containers are stopped, restart Docker or recreate cluster -kind create cluster --name rustfs-dev -``` - -#### Stopping a Cluster - -```bash -# kind clusters run in Docker containers -# To stop, you can stop Docker or delete the cluster - -# Stop Docker Desktop (macOS/Windows) -# Or stop Docker daemon (Linux) -sudo systemctl stop docker - -# Note: Stopping Docker will stop all kind clusters -``` - -#### Restarting a Cluster - -```bash -# If Docker was restarted, kind clusters may need to be recreated -# Check cluster status -kind get clusters - -# If cluster exists but kubectl can't connect, recreate it -kind delete cluster --name rustfs-dev -kind create cluster --name rustfs-dev - -# Restore kubectl context -kubectl cluster-info --context kind-rustfs-dev -``` - -#### Deleting a Cluster - -```bash -# Delete a specific cluster -kind delete cluster --name rustfs-dev - -# Delete all kind clusters -kind delete cluster --all - -# Verify deletion -kind get clusters -``` - -#### Advanced kind Configuration - -Create a custom kind configuration file `kind-config.yaml`: - -```yaml -kind: Cluster -apiVersion: kind.x-k8s.io/v1alpha4 -nodes: -- role: control-plane - kubeadmConfigPatches: - - | - kind: InitConfiguration - nodeRegistration: - kubeletExtraArgs: - node-labels: "ingress-ready=true" - extraPortMappings: - - containerPort: 80 - hostPort: 80 - protocol: TCP - - containerPort: 443 - hostPort: 443 - protocol: TCP -``` - -Create cluster with custom config: -```bash -kind create cluster --name rustfs-dev --config kind-config.yaml -``` - ---- - -## 🖥️ Installing OpenLens - -OpenLens is a powerful Kubernetes IDE for managing clusters visually. - -### Installation - -#### macOS - -```bash -# Using Homebrew -brew install --cask openlens - -# Or download from: https://github.com/MuhammedKalkan/OpenLens/releases -``` - -#### Linux - -```bash -# Download AppImage from releases -wget https://github.com/MuhammedKalkan/OpenLens/releases/latest/download/OpenLens-.AppImage -chmod +x OpenLens-.AppImage -./OpenLens-.AppImage - -# Or install via Snap -snap install openlens -``` - -#### Windows - -```bash -# Using Chocolatey -choco install openlens - -# Or download installer from: https://github.com/MuhammedKalkan/OpenLens/releases -``` - -### Connecting OpenLens to kind Cluster - -1. 
**Get kubeconfig path**: - ```bash - # kind stores kubeconfig in ~/.kube/config - # Or get specific context - kubectl config view --minify --context kind-rustfs-dev - ``` - -2. **Open OpenLens**: - - Click "Add Cluster" or "+" button - - Select "Add from kubeconfig" - - Navigate to `~/.kube/config` (or paste kubeconfig content) - - Select context: `kind-rustfs-dev` - - Click "Add" - -3. **Verify Connection**: - - You should see your kind cluster in the cluster list - - Click on it to view nodes, pods, services, etc. - -### Using OpenLens for Development - -- **View Resources**: Browse Tenants, Pods, StatefulSets, Services -- **View Logs**: Click on any Pod to see logs -- **Terminal Access**: Open terminal in Pods directly -- **Resource Editor**: Edit YAML files directly -- **Event Viewer**: Monitor Kubernetes events in real-time - ---- - -## 🏃 Installing and Running the Operator - -### Step 1: Install CRD (Custom Resource Definition) - -The operator requires the Tenant CRD to be installed in your cluster: - -```bash -# Generate CRD YAML -cargo run -- crd > tenant-crd.yaml - -# Or output directly to file -cargo run -- crd -f tenant-crd.yaml - -# Install CRD -kubectl apply -f tenant-crd.yaml - -# Verify CRD is installed -kubectl get crd tenants.rustfs.com - -# View CRD details -kubectl describe crd tenants.rustfs.com -``` - -### Step 2: Configure kubectl Access - -Ensure `kubectl` can access your cluster: - -```bash -# Check current context -kubectl config current-context - -# List all contexts -kubectl config get-contexts - -# Switch to correct context (if needed) -kubectl config use-context kind-rustfs-dev - -# Verify cluster connection -kubectl cluster-info -kubectl get nodes -``` - -### Step 3: Run Operator Locally (Development Mode) - -#### Option A: Run from Source (Recommended for Development) - -```bash -# Set log level (optional) -export RUST_LOG=debug -export RUST_LOG=operator=debug,kube=info - -# Run operator in debug mode -cargo run -- server - -# Or run in release mode (faster) -cargo run --release -- server -``` - -The operator will: -- Connect to your Kubernetes cluster -- Watch for Tenant CRD changes -- Reconcile resources (StatefulSets, Services, RBAC) - -#### Option B: Run Pre-built Binary - -```bash -# Build the binary first -cargo build --release - -# Run the binary -./target/release/operator server -``` - -#### Option C: Deploy as Pod in Cluster - -```bash -# Build Docker image -docker build -t rustfs/operator:dev . - -# Load image into kind cluster -kind load docker-image rustfs/operator:dev --name rustfs-dev - -# Deploy using Helm (see deploy/README.md) -helm install rustfs-operator deploy/rustfs-operator/ \ - --namespace rustfs-system \ - --create-namespace \ - --set image.tag=dev \ - --set image.pullPolicy=Never -``` - -### Step 4: Test the Operator - -In another terminal: - -```bash -# Create a test Tenant -kubectl apply -f examples/minimal-dev-tenant.yaml - -# Watch Tenant status -kubectl get tenant dev-minimal -w - -# View created resources -kubectl get pods -l rustfs.tenant=dev-minimal -kubectl get statefulset -l rustfs.tenant=dev-minimal -kubectl get svc -l rustfs.tenant=dev-minimal -kubectl get pvc -l rustfs.tenant=dev-minimal -``` - ---- - -## 🐛 Debugging the Operator - -### Debugging Methods - -#### 1. Local Development Debugging - -**Run with verbose logging**: -```bash -# Set detailed log levels -export RUST_LOG=debug -export RUST_LOG=operator=debug,kube=info,tracing=debug - -# Run operator -cargo run -- server -``` - -**Use a debugger** (VS Code): -1. 
Install "CodeLLDB" extension -2. Create `.vscode/launch.json`: -```json -{ - "version": "0.2.0", - "configurations": [ - { - "type": "lldb", - "request": "launch", - "name": "Debug Operator", - "cargo": { - "args": ["build", "--bin", "operator"], - "filter": { - "name": "operator", - "kind": "bin" - } - }, - "args": ["server"], - "cwd": "${workspaceFolder}", - "env": { - "RUST_LOG": "debug" - } - } - ] -} -``` -3. Set breakpoints and press F5 - -#### 2. Cluster-based Debugging - -**View operator logs** (if deployed in cluster): -```bash -# Get operator pod name -kubectl get pods -n rustfs-system - -# View logs -kubectl logs -f -n rustfs-system -l app.kubernetes.io/name=rustfs-operator - -# View logs with timestamps -kubectl logs -f -n rustfs-system -l app.kubernetes.io/name=rustfs-operator --timestamps - -# View previous logs (if pod restarted) -kubectl logs -f -n rustfs-system -l app.kubernetes.io/name=rustfs-operator --previous -``` - -**Debug operator pod**: -```bash -# Exec into operator pod -kubectl exec -it -n rustfs-system -- /bin/sh - -# Check environment variables -kubectl exec -n rustfs-system -- env -``` - -#### 3. Resource Debugging - -**Check reconciliation status**: -```bash -# View Tenant status -kubectl get tenant -o yaml - -# View Tenant events -kubectl describe tenant - -# View all events -kubectl get events --sort-by='.lastTimestamp' --all-namespaces - -# Watch events in real-time -kubectl get events --watch --all-namespaces -``` - -**Check created resources**: -```bash -# View StatefulSet details -kubectl get statefulset -l rustfs.tenant= -o yaml - -# View Pod status -kubectl get pods -l rustfs.tenant= -o wide - -# View Pod logs -kubectl logs -f -l rustfs.tenant= -``` - ---- - -## 📋 Logging and Log Locations - -### Log Levels - -The operator uses the `tracing` crate for structured logging. 
Log levels:

- `ERROR` - Errors that need attention
- `WARN` - Warnings about potential issues
- `INFO` - General informational messages
- `DEBUG` - Detailed debugging information
- `TRACE` - Very detailed tracing (very verbose)

### Setting Log Levels

#### Environment Variables

```bash
# Set global log level
export RUST_LOG=debug

# Set per-module log levels
export RUST_LOG=operator=debug,kube=info,tracing=warn

# Common configurations:
# Development
export RUST_LOG=operator=debug,kube=info

# Production
export RUST_LOG=operator=info,kube=warn

# Troubleshooting
export RUST_LOG=operator=trace,kube=debug
```

#### Log Location

**When running locally**:
- Logs are output to **stdout/stderr**
- View in terminal where operator is running
- Can redirect to file: `cargo run -- server 2>&1 | tee operator.log`

**When deployed in cluster**:
- Logs are stored in **Pod logs**
- View with: `kubectl logs -f <pod-name> -n rustfs-system`
- Logs persist until Pod is deleted
- Use log aggregation tools (e.g., Loki, Fluentd) for long-term storage

### Viewing Logs

#### Local Development

```bash
# Terminal 1: Run operator with logging
export RUST_LOG=debug
cargo run -- server

# Terminal 2: View logs in real-time (if redirected to file)
tail -f operator.log

# Or use system log viewer (macOS)
log stream --predicate 'process == "operator"'
```

#### Cluster Deployment

```bash
# View current logs
kubectl logs -f -n rustfs-system -l app.kubernetes.io/name=rustfs-operator

# View logs with timestamps
kubectl logs -f -n rustfs-system -l app.kubernetes.io/name=rustfs-operator --timestamps

# View last 100 lines
kubectl logs --tail=100 -n rustfs-system -l app.kubernetes.io/name=rustfs-operator

# View logs since specific time
kubectl logs --since=10m -n rustfs-system -l app.kubernetes.io/name=rustfs-operator

# View logs from previous container (if pod restarted)
kubectl logs --previous -n rustfs-system -l app.kubernetes.io/name=rustfs-operator

# Export logs to file
kubectl logs -n rustfs-system -l app.kubernetes.io/name=rustfs-operator > operator.log
```

#### Using OpenLens

1. Open OpenLens
2. Select your cluster
3. Navigate to **Workloads** → **Pods**
4. Find operator pod in `rustfs-system` namespace
5. Click on pod → **Logs** tab
6. View real-time logs with filtering options

### Common Log Patterns

**Successful reconciliation**:
```
INFO reconcile: reconciled successful, object: <object-name>
```

**Reconciliation errors**:
```
ERROR reconcile: reconcile failed: <error>
WARN error_policy: <error>
```

**Resource creation**:
```
DEBUG Creating StatefulSet
INFO StatefulSet created successfully
```

**Status updates**:
```
DEBUG Updating tenant status: <status>
```

---

## 🧪 Running Tests

```bash
# Run all tests
cargo test

# Use nextest (faster)
cargo nextest run

# Or use just
just test

# Run specific test
cargo test test_statefulset_no_update_needed

# Run ignored tests (includes TLS tests)
cargo test -- --ignored

# Run tests with output
cargo test -- --nocapture

# Run tests in single thread (for debugging)
cargo test -- --test-threads=1
```

---

## 🛠️ Development Workflow

### Daily Development Process

1. **Create feature branch**
   ```bash
   git checkout -b feature/your-feature-name
   ```

2. **Write code**

3. **Format code**
   ```bash
   cargo fmt --all
   # or
   just fmt
   ```

4.
**Run checks** - ```bash - make pre-commit - # For optional Just tasks instead, see "Documentation map" — `just pre-commit` differs (no console-web). - ``` - -5. **Run tests** - ```bash - cargo test - # or - just test - ``` - -6. **Test operator locally** - ```bash - # Terminal 1: Run operator - cargo run -- server - - # Terminal 2: Create test resources - kubectl apply -f examples/minimal-dev-tenant.yaml - kubectl get tenant -w - ``` - -7. **Commit code** - ```bash - git add . - git commit -m "feat: your feature description" - ``` - -### Code Quality Checks - -The project enforces strict code quality standards: - -```bash -# Run all checks (Rust + console-web; matches CONTRIBUTING / Makefile) -make pre-commit - -# Optional: Justfile tasks (no console-web in `just pre-commit`) -just fmt-check # Check formatting -just clippy # Code linting -just check # Compilation check -just test # Tests (cargo nextest) -``` - -**Note**: The project has `deny`-level clippy rules: -- `unwrap_used = "deny"` - Prohibits `unwrap()` -- `expect_used = "deny"` - Prohibits `expect()` - ---- - -## 🧹 Cleaning Up - -### Clean Test Resources - -```bash -# Delete test Tenant (automatically deletes all related resources) -kubectl delete tenant dev-minimal - -# Delete all Tenants -kubectl delete tenant --all -``` - -### Clean Cluster - -```bash -# Delete kind cluster -kind delete cluster --name rustfs-dev - -# Delete all kind clusters -kind delete cluster --all - -# minikube -minikube delete -``` - -### Clean Build Artifacts - -```bash -# Clean target directory -cargo clean - -# Clean and rebuild -cargo clean && cargo build -``` - ---- - -## 🐛 Troubleshooting - -### 1. Rust Version Mismatch - -**Problem**: `error: toolchain 'stable' is not installed` - -**Solution**: -```bash -# Navigate to project directory, rustup will auto-install correct toolchain -cd /path/to/operator -rustup show -``` - -### 2. Cannot Connect to Kubernetes Cluster - -**Problem**: `Failed to connect to Kubernetes API` - -**Solution**: -```bash -# Check kubectl configuration -kubectl config current-context -kubectl cluster-info - -# Ensure cluster is running -kubectl get nodes - -# For kind: check if cluster containers are running -docker ps | grep rustfs-dev -``` - -### 3. CRD Not Found - -**Problem**: `the server could not find the requested resource` - -**Solution**: -```bash -# Reinstall CRD -cargo run -- crd | kubectl apply -f - - -# Verify CRD is installed -kubectl get crd tenants.rustfs.com -``` - -### 4. Clippy Errors - -**Problem**: Clippy reports `unwrap_used` or `expect_used` errors - -**Solution**: -- Use `Result` and `?` operator -- Use `match` or `if let` to handle `Option` -- Use `snafu` for error handling - -### 5. Test Failures - -**Problem**: Tests cannot run or fail - -**Solution**: -```bash -# Run single test with detailed output -cargo test -- --nocapture test_name - -# Run all tests (including ignored) -cargo test -- --include-ignored -``` - -### 6. 
kind Cluster Issues

**Problem**: Cannot connect to kind cluster after Docker restart

**Solution**:
```bash
# Recreate cluster
kind delete cluster --name rustfs-dev
kind create cluster --name rustfs-dev

# Restore kubectl context
kubectl cluster-info --context kind-rustfs-dev
```

---

## 📚 Useful Command Reference

### Cargo Commands

```bash
# Build
cargo build                 # Debug build
cargo build --release       # Release build

# Check
cargo check                 # Quick compilation check
cargo clippy                # Code linting

# Test
cargo test                  # Run tests
cargo test -- --ignored     # Run ignored tests
cargo nextest run           # Use nextest

# Format
cargo fmt                   # Format code
cargo fmt --all --check     # Check formatting

# Documentation
cargo doc --open            # Generate and open docs
```

### kubectl Commands

```bash
# CRD operations
kubectl get crd                        # List all CRDs
kubectl get tenant                     # List all Tenants
kubectl describe tenant <tenant-name>  # View Tenant details

# Resource operations
kubectl get pods -l rustfs.tenant=<tenant-name>
kubectl get statefulset -l rustfs.tenant=<tenant-name>
kubectl get svc -l rustfs.tenant=<tenant-name>

# Logs
kubectl logs -f <pod-name>
kubectl logs -f -l rustfs.tenant=<tenant-name>

# Events
kubectl get events --sort-by='.lastTimestamp'
```

### kind Commands

```bash
# Cluster management
kind create cluster --name <cluster-name>   # Create cluster
kind delete cluster --name <cluster-name>   # Delete cluster
kind get clusters                           # List clusters
kind get nodes --name <cluster-name>        # List nodes

# Image management
kind load docker-image <image> --name <cluster-name>   # Load image
```

---

## 🎯 Next Steps

- View [CONTRIBUTING.md](../CONTRIBUTING.md) for contribution guidelines
- View [DEVELOPMENT-NOTES.md](./DEVELOPMENT-NOTES.md) for development notes
- View [architecture-decisions.md](./architecture-decisions.md) for architecture decisions
- View [../examples/](../examples/) for usage examples

---

**Happy coding!** 🚀
diff --git a/docs/POOL-STATUS-EXPLANATION.md b/docs/POOL-STATUS-EXPLANATION.md deleted file mode 100755 index a7dfed9..0000000 --- a/docs/POOL-STATUS-EXPLANATION.md +++ /dev/null @@ -1,285 +0,0 @@ -# Pool Status Structure Explanation

This document explains what the `Pool` structure in `src/types/v1alpha1/status/pool.rs` represents.

---

## 📋 Overview

The `Pool` struct in `src/types/v1alpha1/status/pool.rs` represents the **runtime status** of a storage pool in a RustFS Tenant. It is part of the Tenant's status field and tracks the actual state of the StatefulSet that manages the pool's Pods.

---

## 🔍 Key Concepts

### Two Different `Pool` Types

There are **two different** `Pool` structures in the codebase:

1. **`src/types/v1alpha1/pool.rs::Pool`** - **Spec (Desired State)**
   - User-defined configuration
   - Part of `TenantSpec`
   - Defines what the user wants (e.g., `servers: 4`, `volumesPerServer: 2`)

2. **`src/types/v1alpha1/status/pool.rs::Pool`** - **Status (Actual State)**
   - Runtime status information
   - Part of `TenantStatus`
   - Tracks what actually exists (e.g., `replicas: 4`, `ready_replicas: 3`)

### Relationship

```
Tenant CRD
├── spec.pools[]    ← User configuration (pool.rs::Pool)
│   └── name: "pool-0"
│       servers: 4
│       volumesPerServer: 2
│
└── status.pools[]  ← Runtime status (status/pool.rs::Pool)
    └── ss_name: "tenant-pool-0"
        state: "RolloutComplete"
        replicas: 4
        ready_replicas: 4
```

---

## 📊 Pool Status Structure

### Fields Explained

```rust
pub struct Pool {
    /// Name of the StatefulSet for this pool
    pub ss_name: String,

    /// Current state of the pool
    pub state: PoolState,

    /// Total number of non-terminated pods targeted by this pool's StatefulSet
    pub replicas: Option<i32>,

    /// Number of pods with Ready condition
    pub ready_replicas: Option<i32>,

    /// Number of pods with current revision
    pub current_replicas: Option<i32>,

    /// Number of pods with updated revision
    pub updated_replicas: Option<i32>,

    /// Current revision hash of the StatefulSet
    pub current_revision: Option<String>,

    /// Update revision hash of the StatefulSet (different from current during rollout)
    pub update_revision: Option<String>,

    /// Last time the pool status was updated
    pub last_update_time: Option<String>,
}
```

### Field Details

#### `ss_name: String`
- **Meaning**: The name of the StatefulSet that manages this pool
- **Format**: `{tenant-name}-{pool-name}`
- **Example**: `dev-minimal-dev-pool`
- **Purpose**: Used to identify and query the StatefulSet resource

#### `state: PoolState`
- **Meaning**: Current operational state of the pool
- **Possible Values**: See `PoolState` enum below
- **Purpose**: Quick status indicator for monitoring and debugging

#### `replicas: Option<i32>`
- **Meaning**: Total number of Pods that should exist (desired replicas)
- **Source**: `StatefulSet.status.replicas`
- **Example**: `4` means 4 Pods should exist
- **Purpose**: Track desired vs actual Pod count

#### `ready_replicas: Option<i32>`
- **Meaning**: Number of Pods that are Ready (passing readiness probe)
- **Source**: `StatefulSet.status.readyReplicas`
- **Example**: `3` means 3 out of 4 Pods are ready
- **Purpose**: Determine if pool is fully operational

#### `current_replicas: Option<i32>`
- **Meaning**: Number of Pods running the current (old) revision
- **Source**: `StatefulSet.status.currentReplicas`
- **Example**: During update, `2` means 2 Pods still on old version
- **Purpose**: Track rollout progress

#### `updated_replicas: Option<i32>`
- **Meaning**: Number of Pods running the updated (new) revision
- **Source**: `StatefulSet.status.updatedReplicas`
- **Example**: During update, `2` means 2 Pods on new version
- **Purpose**: Track rollout progress

#### `current_revision: Option<String>`
- **Meaning**: Revision hash of the current StatefulSet template
- **Source**: `StatefulSet.status.currentRevision`
- **Example**: `"tenant-pool-0-abc123"`
- **Purpose**: Identify which template version Pods are running

#### `update_revision: Option<String>`
- **Meaning**: Revision hash of the updated StatefulSet template (during rollout)
- **Source**: `StatefulSet.status.updateRevision`
- **Example**: `"tenant-pool-0-def456"`
- **Purpose**: Identify which template version is being rolled out

#### `last_update_time: Option<String>`
- **Meaning**: Timestamp when this status was last updated
- **Format**: RFC3339 timestamp
- **Example**: `"2025-01-15T10:30:00Z"`
-
**Purpose**: Track when status was last refreshed - ---- - -## 🎯 PoolState Enum - -The `PoolState` enum represents the operational state of a pool: - -```rust -pub enum PoolState { - Created, // PoolCreated - StatefulSet exists - NotCreated, // PoolNotCreated - StatefulSet doesn't exist or has 0 replicas - Initialized, // PoolInitialized - Pool is initialized but not all replicas ready - Updating, // PoolUpdating - Rollout in progress - RolloutComplete, // PoolRolloutComplete - All replicas ready and updated - RolloutFailed, // PoolRolloutFailed - Rollout failed - Degraded, // PoolDegraded - Some replicas not ready -} -``` - -### State Determination Logic - -The state is determined based on StatefulSet status: - -```rust -if desired == 0 { - PoolState::NotCreated -} else if ready == desired && updated == desired { - PoolState::RolloutComplete // All good! -} else if updated < desired || current < desired { - PoolState::Updating // Rollout in progress -} else if ready < desired { - PoolState::Degraded // Some Pods not ready -} else { - PoolState::Initialized // Initialized but not fully ready -} -``` - ---- - -## 🔄 How It's Used - -### 1. Status Collection - -During reconciliation, the operator: - -1. **Queries StatefulSets** for each pool in `spec.pools` -2. **Extracts status** from each StatefulSet -3. **Builds Pool status** using `build_pool_status()` method -4. **Aggregates** all pool statuses into `TenantStatus.pools[]` - -### 2. Status Update Flow - -``` -Reconciliation Loop - ↓ -For each pool in spec.pools: - ↓ -Get StatefulSet: {tenant-name}-{pool-name} - ↓ -Extract StatefulSet.status - ↓ -Build Pool status object - ↓ -Add to TenantStatus.pools[] - ↓ -Update Tenant.status -``` - -### 3. Example Status Output - -```yaml -apiVersion: rustfs.com/v1alpha1 -kind: Tenant -metadata: - name: dev-minimal -status: - currentState: "Ready" - availableReplicas: 4 - pools: - - ssName: "dev-minimal-dev-pool" - state: "PoolRolloutComplete" - replicas: 4 - readyReplicas: 4 - currentReplicas: 4 - updatedReplicas: 4 - currentRevision: "dev-minimal-dev-pool-abc123" - updateRevision: "dev-minimal-dev-pool-abc123" - lastUpdateTime: "2025-01-15T10:30:00Z" -``` - ---- - -## 💡 Use Cases - -### 1. Monitoring Pool Health - -```bash -# Check pool status -kubectl get tenant dev-minimal -o jsonpath='{.status.pools[*].state}' - -# Check ready replicas -kubectl get tenant dev-minimal -o jsonpath='{.status.pools[*].readyReplicas}' -``` - -### 2. Detecting Rollout Progress - -```bash -# Check if pool is updating -kubectl get tenant dev-minimal -o jsonpath='{.status.pools[?(@.state=="PoolUpdating")]}' - -# Compare current vs updated replicas -kubectl get tenant dev-minimal -o jsonpath='{.status.pools[*].currentReplicas}' -kubectl get tenant dev-minimal -o jsonpath='{.status.pools[*].updatedReplicas}' -``` - -### 3. 
Debugging Issues - -```bash -# Check if pool is degraded -kubectl get tenant dev-minimal -o jsonpath='{.status.pools[?(@.state=="PoolDegraded")]}' - -# View full pool status -kubectl get tenant dev-minimal -o jsonpath='{.status.pools[*]}' | jq -``` - ---- - -## 🔗 Related Code - -- **Status Collection**: `src/types/v1alpha1/tenant.rs::build_pool_status()` -- **Status Aggregation**: `src/reconcile.rs` (reconciliation loop) -- **Status Definition**: `src/types/v1alpha1/status.rs::Status` -- **Pool Spec**: `src/types/v1alpha1/pool.rs::Pool` - ---- - -## Summary - -**`status/pool.rs::Pool`** represents: -- ✅ **Runtime status** of a storage pool -- ✅ **StatefulSet status** information -- ✅ **Pod replica counts** and readiness -- ✅ **Rollout progress** during updates -- ✅ **Operational state** (Ready, Updating, Degraded, etc.) - -**Key Distinction**: -- `spec.pools[]` = What you want (configuration) -- `status.pools[]` = What actually exists (runtime status) - -This separation allows the operator to track the difference between desired and actual state, enabling proper reconciliation and status reporting. diff --git a/docs/RUSTFS-K8S-INTEGRATION.md b/docs/RUSTFS-K8S-INTEGRATION.md deleted file mode 100755 index ef4133e..0000000 --- a/docs/RUSTFS-K8S-INTEGRATION.md +++ /dev/null @@ -1,405 +0,0 @@ -# RustFS Encapsulation in Kubernetes - -This document explains in detail how the RustFS Kubernetes Operator encapsulates RustFS into Kubernetes and how it handles RustFS's dependency on system paths. - ---- - -## 📋 Project Overview - -### What Does This Project Do? - -**RustFS Kubernetes Operator** is a Kubernetes Operator that: - -1. **Automates RustFS Deployment**: Automatically creates and manages RustFS storage clusters through declarative configuration (CRD) -2. **Encapsulates Complexity**: Hides the complexity of Kubernetes resource creation (StatefulSet, Service, PVC, RBAC, etc.) -3. **Lifecycle Management**: Automatically handles creation, updates, scaling, and deletion of RustFS clusters -4. **Configuration Management**: Automatically generates environment variables and configurations required by RustFS - -### Core Value - -**Without Operator**, deploying RustFS requires manually creating: -- StatefulSet (managing Pods) -- PersistentVolumeClaim (storage volumes) -- Service (service discovery) -- RBAC (permissions) -- ConfigMap/Secret (configuration) -- Manually configuring `RUSTFS_VOLUMES` environment variable - -**With Operator**, you only need: -```yaml -apiVersion: rustfs.com/v1alpha1 -kind: Tenant -metadata: - name: my-rustfs -spec: - pools: - - name: primary - servers: 2 - persistence: - volumesPerServer: 2 -``` - -The Operator automatically creates all necessary resources! - ---- - -## 🔍 RustFS Path Dependency Problem - -### How Does RustFS Work? - -RustFS is a distributed object storage system that requires: - -1. **Local Storage Paths**: Each node needs to access local disk paths to store data - - Example: `/data/rustfs0`, `/data/rustfs1`, `/data/rustfs2`, `/data/rustfs3` - - These paths must exist and be writable - -2. **Network Communication**: Nodes need to communicate over the network to coordinate data distribution - - RustFS uses the `RUSTFS_VOLUMES` environment variable to discover other nodes - - Format: `http://node1:9000/data/rustfs{0...N} http://node2:9000/data/rustfs{0...N} ...` - -3. 
**Path Convention**: RustFS follows a specific path naming convention
   - Base path + `/rustfs{index}`
   - Example: `/data/rustfs0`, `/data/rustfs1`

### Problems with Traditional Deployment

Deploying RustFS on traditional servers:

```bash
# 1. Create directories
mkdir -p /data/rustfs{0..3}

# 2. Set permissions
chown -R rustfs:rustfs /data

# 3. Configure environment variables
export RUSTFS_VOLUMES="http://node1:9000/data/rustfs{0...3} http://node2:9000/data/rustfs{0...3}"

# 4. Start RustFS
rustfs server
```

**Problems**:
- ❌ Paths are hardcoded and inflexible
- ❌ Requires manual management of multiple nodes
- ❌ Difficult to use in container environments (container filesystems are ephemeral)
- ❌ Cannot leverage Kubernetes storage abstractions

---

## ✅ Kubernetes Solution

### Core Idea: Use PersistentVolume + VolumeMount

Kubernetes solves the path dependency problem through the following mechanisms:

1. **PersistentVolumeClaim (PVC)**: Abstracts storage without caring about the underlying implementation
2. **VolumeMount**: Mounts PVCs to specified paths in containers
3. **StatefulSet**: Ensures stable network identity and storage for Pods

### Implementation Principles

#### 1. Create PersistentVolumeClaim Templates

The Operator creates PVCs for each volume:

```rust
// Code location: src/types/v1alpha1/tenant/workloads.rs

fn volume_claim_templates(&self, pool: &Pool) -> Result<Vec<PersistentVolumeClaim>> {
    // Create PVC template for each volume
    // Example: vol-0, vol-1, vol-2, vol-3
    let templates: Vec<_> = (0..pool.persistence.volumes_per_server)
        .map(|i| PersistentVolumeClaim {
            metadata: ObjectMeta {
                name: Some(format!("vol-{}", i)), // vol-0, vol-1, ...
                ..Default::default()
            },
            spec: Some(PersistentVolumeClaimSpec {
                access_modes: Some(vec!["ReadWriteOnce".to_string()]),
                resources: Some(VolumeResourceRequirements {
                    requests: Some(resources),
                    ..Default::default()
                }),
                ..Default::default()
            }),
            ..Default::default()
        })
        .collect();
    Ok(templates)
}
```

**Generated PVCs**:
```yaml
# StatefulSet automatically creates these PVCs for each Pod
# Pod 0: dev-minimal-dev-pool-0-vol-0, dev-minimal-dev-pool-0-vol-1, ...
# Pod 1: dev-minimal-dev-pool-1-vol-0, dev-minimal-dev-pool-1-vol-1, ...
```

#### 2. Mount PVCs to Container Paths

The Operator creates VolumeMounts to mount PVCs to paths expected by RustFS:

```rust
// Code location: src/types/v1alpha1/tenant/workloads.rs

let base_path = pool.persistence.path.as_deref().unwrap_or("/data");
let mut volume_mounts: Vec<VolumeMount> = (0..pool.persistence.volumes_per_server)
    .map(|i| VolumeMount {
        name: format!("vol-{}", i),                       // Corresponds to PVC name
        mount_path: format!("{}/rustfs{}", base_path, i), // /data/rustfs0, /data/rustfs1, ...
        ..Default::default()
    })
    .collect();
```

**Result**:
- PVC `vol-0` → mounted to `/data/rustfs0`
- PVC `vol-1` → mounted to `/data/rustfs1`
- PVC `vol-2` → mounted to `/data/rustfs2`
- PVC `vol-3` → mounted to `/data/rustfs3`

#### 3. Automatically Generate RUSTFS_VOLUMES Environment Variable
The Operator automatically generates `RUSTFS_VOLUMES` to tell RustFS how to find other nodes:

```rust
// Code location: src/types/v1alpha1/tenant/workloads.rs

fn rustfs_volumes_env_value(&self) -> Result<String> {
    // Generated format:
    // http://{tenant}-{pool}-{0...servers-1}.{service}.{namespace}.svc.cluster.local:9000{path}/rustfs{0...volumes-1}

    format!(
        "http://{tenant}-{pool}-{{0...{}}}.{service}.{namespace}.svc.cluster.local:9000{}/rustfs{{0...{}}}",
        servers - 1,
        base_path,              // /data
        volumes_per_server - 1
    )
}
```

**Example Output** (2 servers, 2 volumes each):
```
http://dev-minimal-dev-pool-{0...1}.dev-minimal-hl.default.svc.cluster.local:9000/data/rustfs{0...1}
```

**Expanded**:
```
http://dev-minimal-dev-pool-0.dev-minimal-hl.default.svc.cluster.local:9000/data/rustfs0
http://dev-minimal-dev-pool-0.dev-minimal-hl.default.svc.cluster.local:9000/data/rustfs1
http://dev-minimal-dev-pool-1.dev-minimal-hl.default.svc.cluster.local:9000/data/rustfs0
http://dev-minimal-dev-pool-1.dev-minimal-hl.default.svc.cluster.local:9000/data/rustfs1
```

---

## 🏗️ Complete Architecture Diagram

```
User creates Tenant CRD
        ↓
Operator reconciliation loop
        ↓
┌─────────────────────────────────────────┐
│ 1. Create RBAC Resources                │
│    - Role                               │
│    - ServiceAccount                     │
│    - RoleBinding                        │
└─────────────────────────────────────────┘
        ↓
┌─────────────────────────────────────────┐
│ 2. Create Services                      │
│    - IO Service (port 9000)             │
│    - Console Service (port 9001)        │
│    - Headless Service (DNS)             │
└─────────────────────────────────────────┘
        ↓
┌─────────────────────────────────────────┐
│ 3. Create StatefulSet for each Pool     │
│    ├─ Pod Template                      │
│    │   ├─ Container: rustfs/rustfs      │
│    │   ├─ VolumeMounts:                 │
│    │   │   ├─ vol-0 → /data/rustfs0     │
│    │   │   ├─ vol-1 → /data/rustfs1     │
│    │   │   ├─ vol-2 → /data/rustfs2     │
│    │   │   └─ vol-3 → /data/rustfs3     │
│    │   └─ Env:                          │
│    │       └─ RUSTFS_VOLUMES=...        │
│    └─ VolumeClaimTemplates:             │
│        ├─ vol-0 (10Gi)                  │
│        ├─ vol-1 (10Gi)                  │
│        ├─ vol-2 (10Gi)                  │
│        └─ vol-3 (10Gi)                  │
└─────────────────────────────────────────┘
        ↓
Kubernetes creates resources
        ↓
┌──────────────────────────────────────────┐
│ StatefulSet Controller creates Pods      │
│ ├─ Pod: dev-minimal-dev-pool-0           │
│ │   ├─ PVC: dev-minimal-dev-pool-0-vol-0 │
│ │   ├─ PVC: dev-minimal-dev-pool-0-vol-1 │
│ │   ├─ PVC: dev-minimal-dev-pool-0-vol-2 │
│ │   └─ PVC: dev-minimal-dev-pool-0-vol-3 │
│ └─ Pod: dev-minimal-dev-pool-1           │
│     ├─ PVC: dev-minimal-dev-pool-1-vol-0 │
│     ├─ PVC: dev-minimal-dev-pool-1-vol-1 │
│     ├─ PVC: dev-minimal-dev-pool-1-vol-2 │
│     └─ PVC: dev-minimal-dev-pool-1-vol-3 │
└──────────────────────────────────────────┘
        ↓
Storage Provider (StorageClass) creates PV
        ↓
Pod starts, RustFS accesses mounted paths
```

---

## 🔄 Data Persistence Flow

### Data Persists Across Pod Restarts

1. **StatefulSet Guarantees**:
   - Stable Pod names: `dev-minimal-dev-pool-0`
   - Stable PVC names: `dev-minimal-dev-pool-0-vol-0`
   - Even if Pod restarts, PVCs remain unchanged

2. **Storage Persistence**:
   ```
   Pod deleted → PVC retained → Pod recreated → PVC remounted → Data restored
   ```

3. **Path Consistency**:
   - PVCs are always mounted to the same paths (`/data/rustfs0`)
   - RustFS doesn't need to know what the underlying storage is (local disk, network storage, cloud storage)

---

## 💡 Key Design Decisions

### 1. Why Use StatefulSet?
- -- ✅ **Stable Network Identity**: Pods have stable DNS names for `RUSTFS_VOLUMES` -- ✅ **Ordered Deployment**: Can control Pod startup order -- ✅ **Stable Storage**: Each Pod has independent PVCs, data persists when Pod is recreated - -### 2. Why Use VolumeClaimTemplates? - -- ✅ **Automation**: No need to manually create PVCs -- ✅ **Dynamic Creation**: StatefulSet automatically creates PVCs for each Pod -- ✅ **Naming Convention**: PVC names are associated with Pod names - -### 3. Why Are Paths `/data/rustfs{0...N}`? - -- ✅ **RustFS Convention**: Follows RustFS path naming conventions -- ✅ **Configurable**: Users can customize the base path via `persistence.path` -- ✅ **Clear**: Path names clearly indicate the volume's purpose - ---- - -## 📝 Practical Example - -### User Configuration - -```yaml -apiVersion: rustfs.com/v1alpha1 -kind: Tenant -metadata: - name: my-rustfs -spec: - pools: - - name: primary - servers: 2 - persistence: - volumesPerServer: 2 - path: /data # Optional, defaults to /data -``` - -### Resources Generated by Operator - -#### StatefulSet - -```yaml -apiVersion: apps/v1 -kind: StatefulSet -metadata: - name: my-rustfs-primary -spec: - replicas: 2 - serviceName: my-rustfs-hl - template: - spec: - containers: - - name: rustfs - image: rustfs/rustfs:latest - env: - - name: RUSTFS_VOLUMES - value: "http://my-rustfs-primary-{0...1}.my-rustfs-hl.default.svc.cluster.local:9000/data/rustfs{0...1}" - volumeMounts: - - name: vol-0 - mountPath: /data/rustfs0 - - name: vol-1 - mountPath: /data/rustfs1 - volumeClaimTemplates: - - metadata: - name: vol-0 - spec: - accessModes: ["ReadWriteOnce"] - resources: - requests: - storage: 10Gi - - metadata: - name: vol-1 - spec: - accessModes: ["ReadWriteOnce"] - resources: - requests: - storage: 10Gi -``` - -#### Actually Created Pods and PVCs - -**Pod 0**: -- Pod name: `my-rustfs-primary-0` -- PVC: `my-rustfs-primary-0-vol-0` → mounted to `/data/rustfs0` -- PVC: `my-rustfs-primary-0-vol-1` → mounted to `/data/rustfs1` - -**Pod 1**: -- Pod name: `my-rustfs-primary-1` -- PVC: `my-rustfs-primary-1-vol-0` → mounted to `/data/rustfs0` -- PVC: `my-rustfs-primary-1-vol-1` → mounted to `/data/rustfs1` - ---- - -## 🎯 Summary - -### Solution to RustFS Path Dependency - -| Problem | Traditional Approach | Kubernetes Approach | -|---------|---------------------|---------------------| -| **Path Management** | Manually create directories | VolumeMount automatically mounts | -| **Storage Abstraction** | Direct use of local disk | PVC abstraction, supports multiple storage backends | -| **Data Persistence** | Depends on physical disk | PVC ensures data persistence | -| **Multi-node Coordination** | Manually configure IPs | Headless Service + DNS | -| **Configuration Management** | Manually set environment variables | Operator automatically generates | - -### Core Advantages - -1. **Declarative Configuration**: Users only declare "what they want", Operator handles "how to do it" -2. **Storage Abstraction**: Doesn't care if the underlying storage is local disk, NFS, cloud storage, or others -3. **Automation**: Automatically creates, configures, and manages all resources -4. **Portability**: Same configuration can run on any Kubernetes cluster -5. 
**Scalability**: Easily add nodes and scale storage - ---- - -## 🔗 Related Documentation - -- [Architecture Decisions](./architecture-decisions.md) -- [Development Notes](./DEVELOPMENT-NOTES.md) -- [Usage Examples](../examples/README.md) - ---- - -**Key Understanding**: RustFS does depend on system paths, but Kubernetes uses the VolumeMount mechanism to "disguise" persistent storage as filesystem paths, making RustFS think it's accessing local disk when it's actually accessing Kubernetes-managed persistent storage. This is the core idea of containerized storage systems! diff --git a/docs/RUSTFS-OBJECT-STORAGE-USAGE.md b/docs/RUSTFS-OBJECT-STORAGE-USAGE.md deleted file mode 100755 index 488dc82..0000000 --- a/docs/RUSTFS-OBJECT-STORAGE-USAGE.md +++ /dev/null @@ -1,664 +0,0 @@ -# RustFS Object Storage Configuration and Usage Guide - -This document explains in detail the meaning of RustFS configuration parameters and how to use RustFS as an object storage system. - ---- - -## 📋 Configuration Parameters Explained - -### Example Configuration - -```yaml -pools: - - name: dev-pool - servers: 1 # Number of server nodes - persistence: - volumesPerServer: 4 # Number of storage volumes per server -``` - -### Parameter Meanings - -#### `servers: 1` - -**Meaning**: Number of server nodes in the RustFS cluster - -- **Purpose**: Determines how many Pods to create (each Pod represents a RustFS server node) -- **Examples**: - - `servers: 1` → Creates 1 Pod (single node, suitable for development) - - `servers: 4` → Creates 4 Pods (4-node cluster, suitable for production) - - `servers: 16` → Creates 16 Pods (large-scale cluster) - -**Actual Effect**: -- Operator creates a StatefulSet with replicas = `servers` -- Each Pod runs a RustFS server instance -- Pod naming format: `{tenant-name}-{pool-name}-{0...servers-1}` - -#### `volumesPerServer: 4` - -**Meaning**: Number of storage volumes on each server node - -- **Purpose**: Determines how many persistent storage volumes each Pod mounts -- **Examples**: - - `volumesPerServer: 4` → Each Pod has 4 storage volumes - - `volumesPerServer: 8` → Each Pod has 8 storage volumes - -**Actual Effect**: -- Operator creates `volumesPerServer` PVCs for each Pod -- Each PVC is mounted to container paths: `/data/rustfs0`, `/data/rustfs1`, `/data/rustfs2`, `/data/rustfs3` -- PVC naming format: `{pod-name}-vol-0`, `{pod-name}-vol-1`, ... - -#### Total Storage Volume Count - -**Calculation Formula**: `Total volumes = servers × volumesPerServer` - -**Total volumes for example configuration**: -``` -servers: 1 -volumesPerServer: 4 -→ Total volumes = 1 × 4 = 4 storage volumes -``` - -**Minimum Requirement**: `servers × volumesPerServer >= 4` - -This is RustFS's Erasure Coding requirement, which needs at least 4 storage volumes to function properly. - ---- - -## 🏗️ Actually Created Resources - -### What Does the Example Configuration Create? - -```yaml -pools: - - name: dev-pool - servers: 1 - persistence: - volumesPerServer: 4 -``` - -#### 1. 
StatefulSet - -```yaml -apiVersion: apps/v1 -kind: StatefulSet -metadata: - name: dev-minimal-dev-pool -spec: - replicas: 1 # servers: 1 - template: - spec: - containers: - - name: rustfs - image: rustfs/rustfs:latest - env: - - name: RUSTFS_VOLUMES - value: "http://dev-minimal-dev-pool-{0...0}.dev-minimal-hl.default.svc.cluster.local:9000/data/rustfs{0...3}" - volumeMounts: - - name: vol-0 - mountPath: /data/rustfs0 - - name: vol-1 - mountPath: /data/rustfs1 - - name: vol-2 - mountPath: /data/rustfs2 - - name: vol-3 - mountPath: /data/rustfs3 - volumeClaimTemplates: - - metadata: - name: vol-0 - spec: - accessModes: ["ReadWriteOnce"] - resources: - requests: - storage: 10Gi - - metadata: - name: vol-1 - spec: - accessModes: ["ReadWriteOnce"] - resources: - requests: - storage: 10Gi - - metadata: - name: vol-2 - spec: - accessModes: ["ReadWriteOnce"] - resources: - requests: - storage: 10Gi - - metadata: - name: vol-3 - spec: - accessModes: ["ReadWriteOnce"] - resources: - requests: - storage: 10Gi -``` - -#### 2. PersistentVolumeClaims (PVCs) - -``` -dev-minimal-dev-pool-0-vol-0 (10Gi) -dev-minimal-dev-pool-0-vol-1 (10Gi) -dev-minimal-dev-pool-0-vol-2 (10Gi) -dev-minimal-dev-pool-0-vol-3 (10Gi) -``` - -**Total Storage Capacity**: 4 × 10Gi = 40Gi (default 10Gi per volume) - -#### 3. Pod - -``` -dev-minimal-dev-pool-0 -``` - -Paths mounted inside the Pod: -- `/data/rustfs0` ← PVC `dev-minimal-dev-pool-0-vol-0` -- `/data/rustfs1` ← PVC `dev-minimal-dev-pool-0-vol-1` -- `/data/rustfs2` ← PVC `dev-minimal-dev-pool-0-vol-2` -- `/data/rustfs3` ← PVC `dev-minimal-dev-pool-0-vol-3` - ---- - -## 💾 How RustFS Object Storage Works - -### 1. Data Distribution Mechanism - -RustFS uses **Erasure Coding** to distribute data: - -``` -User uploads object - ↓ -RustFS splits object into data shards - ↓ -Calculates parity shards - ↓ -Distributes data and parity shards across all storage volumes - ↓ -Data redundantly stored, can recover even if some volumes fail -``` - -**Example** (with 4 volumes): -- Object is split into 2 data shards + 2 parity shards -- Each shard is stored on a different volume -- Even if 2 volumes fail, data can still be recovered from the remaining 2 volumes - -### 2. Role of Storage Volumes - -Each storage volume (`/data/rustfs0`, `/data/rustfs1`, ...): -- **Stores data shards**: Part of the object's data -- **Stores metadata**: Object metadata, indexes, etc. -- **Participates in erasure coding**: Works with other volumes to provide data redundancy - -### 3. Why Are At Least 4 Volumes Required? - -RustFS's erasure coding algorithm requires: -- **Minimum data shards**: At least 2 data shards -- **Minimum parity shards**: At least 2 parity shards -- **Total**: At least 4 shards → At least 4 storage volumes - -**Configuration Examples**: -- ✅ `servers: 1, volumesPerServer: 4` → 4 volumes (minimum configuration) -- ✅ `servers: 2, volumesPerServer: 2` → 4 volumes (minimum configuration) -- ✅ `servers: 4, volumesPerServer: 1` → 4 volumes (minimum configuration) -- ❌ `servers: 1, volumesPerServer: 2` → 2 volumes (insufficient, won't work) -- ❌ `servers: 2, volumesPerServer: 1` → 2 volumes (insufficient, won't work) - ---- - -## 🚀 How to Use RustFS Object Storage - -### 1. 
Deploy RustFS Cluster - -```bash -# Apply configuration -kubectl apply -f examples/minimal-dev-tenant.yaml - -# Wait for Pods to be ready -kubectl wait --for=condition=ready pod -l rustfs.tenant=dev-minimal --timeout=300s - -# Check status -kubectl get tenant dev-minimal -kubectl get pods -l rustfs.tenant=dev-minimal -``` - -### 2. Access S3 API - -RustFS provides S3-compatible object storage API. The Service type created by the Operator is `ClusterIP`, which means: - -- **Inside cluster**: Can directly use Service DNS names to access (**no port-forward needed**) -- **Outside cluster**: Requires port forwarding, Ingress, or LoadBalancer - -#### Method 1: Cluster-Internal Access (Recommended for Production) - -**Use Case**: Applications running inside Kubernetes cluster accessing RustFS - -**Service DNS Name Format**: -- S3 API: `http://rustfs.{namespace}.svc.cluster.local:9000` -- Console UI: `http://{tenant-name}-console.{namespace}.svc.cluster.local:9001` - -**Example** (in Pod or cluster-internal application): - -```bash -# Use Service DNS name (no port-forward needed) -# S3 API endpoint -http://rustfs.default.svc.cluster.local:9000 - -# Console UI endpoint -http://dev-minimal-console.default.svc.cluster.local:9001 -``` - -**Using MinIO Client (Cluster-Internal)**: -```bash -# Execute in Pod inside cluster -mc alias set rustfs http://rustfs.default.svc.cluster.local:9000 rustfsadmin rustfsadmin -mc mb rustfs/my-bucket -mc cp file.txt rustfs/my-bucket/ -``` - -**Using AWS CLI (Cluster-Internal)**: -```bash -# Execute in Pod inside cluster -aws --endpoint-url http://rustfs.default.svc.cluster.local:9000 s3 ls -aws --endpoint-url http://rustfs.default.svc.cluster.local:9000 s3 mb s3://my-bucket -``` - -**Using Python SDK (Cluster-Internal)**: -```python -import boto3 -from botocore.client import Config - -# Use Service DNS (no port-forward needed) -s3 = boto3.client( - 's3', - endpoint_url='http://rustfs.default.svc.cluster.local:9000', # Cluster-internal DNS - aws_access_key_id='rustfsadmin', - aws_secret_access_key='rustfsadmin', - config=Config(signature_version='s3v4'), - region_name='us-east-1' -) -``` - -#### Method 2: Port Forwarding (Local Development/Testing) - -**Use Case**: Accessing RustFS in cluster from local machine (development, testing, debugging) - -⚠️ **Note**: This method requires keeping the `kubectl port-forward` command running - -```bash -# Terminal 1: Forward S3 API port (9000) -kubectl port-forward svc/rustfs 9000:9000 - -# Terminal 2: Use localhost to access (requires port forwarding) -mc alias set devlocal http://localhost:9000 rustfsadmin rustfsadmin -mc mb devlocal/my-bucket -mc cp file.txt devlocal/my-bucket/ -``` - -**Using MinIO Client (Requires port-forward)**: -```bash -# Must execute port forwarding first -kubectl port-forward svc/rustfs 9000:9000 - -# Then use localhost -mc alias set devlocal http://localhost:9000 rustfsadmin rustfsadmin -mc mb devlocal/my-bucket -mc cp /path/to/file.txt devlocal/my-bucket/ -mc ls devlocal/my-bucket -``` - -**Using AWS CLI (Requires port-forward)**: -```bash -# Must execute port forwarding first -kubectl port-forward svc/rustfs 9000:9000 - -# Then use localhost -export AWS_ACCESS_KEY_ID=rustfsadmin -export AWS_SECRET_ACCESS_KEY=rustfsadmin -aws --endpoint-url http://localhost:9000 s3 ls -aws --endpoint-url http://localhost:9000 s3 mb s3://my-bucket -aws --endpoint-url http://localhost:9000 s3 cp file.txt s3://my-bucket/ -``` - -**Using Python SDK (Requires port-forward)**: -```python -import boto3 -from botocore.client 
import Config
-
-# Must execute first: kubectl port-forward svc/rustfs 9000:9000
-s3 = boto3.client(
-    's3',
-    endpoint_url='http://localhost:9000',  # Requires port forwarding
-    aws_access_key_id='rustfsadmin',
-    aws_secret_access_key='rustfsadmin',
-    config=Config(signature_version='s3v4'),
-    region_name='us-east-1'
-)
-```
-
-#### Method 3: Using Ingress (Recommended for Production)
-
-**Use Case**: Production environment, requires HTTPS and domain name access
-
-Create Ingress resource:
-
-```yaml
-apiVersion: networking.k8s.io/v1
-kind: Ingress
-metadata:
-  name: rustfs-ingress
-  namespace: default
-spec:
-  rules:
-  - host: rustfs.example.com
-    http:
-      paths:
-      - path: /
-        pathType: Prefix
-        backend:
-          service:
-            name: rustfs
-            port:
-              number: 9000
-```
-
-Then access using domain name:
-```bash
-mc alias set production https://rustfs.example.com rustfsadmin rustfsadmin
-```
-
-#### Method 4: Using LoadBalancer (Cloud Environments)
-
-**Use Case**: Cloud environments (AWS, GCP, Azure), requires external IP
-
-Modify Service type (requires manual modification or Helm values):
-
-```yaml
-# Note: Operator creates ClusterIP by default; manually change to LoadBalancer
-apiVersion: v1
-kind: Service
-metadata:
-  name: rustfs
-spec:
-  type: LoadBalancer  # Change to LoadBalancer
-  ports:
-  - port: 9000
-```
-
-Then access using external IP:
-```bash
-# Get external IP
-kubectl get svc rustfs
-
-# Use external IP
-mc alias set production http://<EXTERNAL-IP>:9000 rustfsadmin rustfsadmin
-```
-
----
-
-### Access Method Comparison
-
-| Access Method | Requires port-forward? | Use Case | Endpoint Example |
-|--------------|----------------------|----------|------------------|
-| **Cluster-Internal** | ❌ **No** | Cluster-internal applications | `http://rustfs.default.svc.cluster.local:9000` |
-| **Port Forwarding** | ✅ **Yes** | Local development/testing | `http://localhost:9000` |
-| **Ingress** | ❌ No | Production environment (HTTPS) | `https://rustfs.example.com` |
-| **LoadBalancer** | ❌ No | Cloud environments | `http://<EXTERNAL-IP>:9000` |
-
----
-
-### 3. 
Access Web Console - -#### Cluster-Internal Access (No port-forward needed) - -```bash -# In Pod inside cluster -curl http://dev-minimal-console.default.svc.cluster.local:9001 -``` - -#### Port Forwarding Access (Requires port-forward) - -```bash -# Forward console port (9001) -kubectl port-forward svc/dev-minimal-console 9001:9001 - -# Open in browser -open http://localhost:9001 -``` - -**Default Credentials**: -- Username: `rustfsadmin` -- Password: `rustfsadmin` - -⚠️ **Must change default credentials in production!** - ---- - -## 📊 Configuration Examples Comparison - -### Development Environment (Minimal Configuration) - -```yaml -pools: - - name: dev-pool - servers: 1 # 1 node - persistence: - volumesPerServer: 4 # 4 volumes per node -``` - -**Result**: -- 1 Pod -- 4 PVCs (10Gi each) -- Total storage: 40Gi -- **Use Case**: Local development, testing, learning - -### Production Environment (High Availability) - -```yaml -pools: - - name: production - servers: 8 # 8 nodes - persistence: - volumesPerServer: 4 # 4 volumes per node - volumeClaimTemplate: - resources: - requests: - storage: 100Gi # 100Gi per volume -``` - -**Result**: -- 8 Pods (distributed across multiple nodes) -- 32 PVCs (100Gi each) -- Total storage: 3.2Ti -- **Use Case**: Production environment, high availability, large capacity - -### Multi-Pool Configuration (Scaling Storage) - -```yaml -pools: - - name: pool-0 - servers: 4 - persistence: - volumesPerServer: 4 # 16 volumes - - - name: pool-1 - servers: 4 - persistence: - volumesPerServer: 4 # 16 volumes -``` - -**Result**: -- 8 Pods (2 StatefulSets) -- 32 PVCs -- **All pools form a unified cluster** -- **Use Case**: Need to scale storage capacity - ---- - -## 🔍 Data Storage Flow - -### Write Data Flow - -``` -1. Client uploads object to S3 API (port 9000) - ↓ -2. RustFS receives object - ↓ -3. RustFS uses erasure coding algorithm: - - Splits object into data shards - - Calculates parity shards - ↓ -4. Distributes shards across multiple storage volumes: - - /data/rustfs0 ← Data shard 1 - - /data/rustfs1 ← Data shard 2 - - /data/rustfs2 ← Parity shard 1 - - /data/rustfs3 ← Parity shard 2 - ↓ -5. Data persisted to PVC (underlying storage) -``` - -### Read Data Flow - -``` -1. Client requests object - ↓ -2. RustFS locates object shard positions - ↓ -3. Reads shards from multiple storage volumes: - - /data/rustfs0 → Data shard 1 - - /data/rustfs1 → Data shard 2 - - /data/rustfs2 → Parity shard 1 (if needed) - ↓ -4. Uses erasure coding algorithm to reconstruct complete object - ↓ -5. Returns object to client -``` - -### Failure Recovery Flow - -``` -Scenario: /data/rustfs0 volume failure - ↓ -1. RustFS detects volume unavailable - ↓ -2. Reads data and parity shards from other volumes: - - /data/rustfs1 → Data shard 2 - - /data/rustfs2 → Parity shard 1 - - /data/rustfs3 → Parity shard 2 - ↓ -3. Uses erasure coding algorithm to reconstruct lost data shard - ↓ -4. 
When volume recovers, automatically rebuilds data -``` - ---- - -## 📈 Capacity Planning - -### Storage Capacity Calculation - -**Formula**: `Total capacity = servers × volumesPerServer × single volume capacity` - -**Example**: -```yaml -servers: 4 -volumesPerServer: 4 -volumeClaimTemplate: - resources: - requests: - storage: 100Gi -``` - -**Calculation**: -- Total volumes: 4 × 4 = 16 volumes -- Total capacity: 16 × 100Gi = 1.6Ti - -### Usable Capacity - -Due to erasure coding redundancy, **usable capacity < total capacity**: - -- **EC:2** (2 data shards + 2 parity shards): Usable capacity = Total capacity × 50% -- **EC:4** (4 data shards + 4 parity shards): Usable capacity = Total capacity × 50% - -**Example**: -- Total capacity: 1.6Ti -- Usable capacity: Approximately 800Gi (50%) - -### Performance Considerations - -- **More volumes**: Better parallel I/O, higher throughput -- **More nodes**: Better load distribution, higher availability -- **Storage type**: SSD > HDD (performance) - ---- - -## 🎯 Use Cases - -### 1. Application Data Storage - -```yaml -# Use RustFS as object storage backend in application configuration -apiVersion: v1 -kind: ConfigMap -metadata: - name: app-config -data: - S3_ENDPOINT: "http://rustfs.default.svc.cluster.local:9000" - S3_BUCKET: "app-data" - S3_ACCESS_KEY: "rustfsadmin" - S3_SECRET_KEY: "rustfsadmin" -``` - -### 2. Backup Storage - -```yaml -# Velero backup to RustFS -apiVersion: velero.io/v1 -kind: BackupStorageLocation -metadata: - name: rustfs-backup -spec: - provider: aws - objectStorage: - bucket: velero-backups - prefix: backups - config: - region: us-east-1 - s3ForcePathStyle: "true" - s3Url: "http://rustfs.default.svc.cluster.local:9000" -``` - -### 3. CI/CD Build Artifact Storage - -```yaml -# GitLab CI configuration -build: - script: - - aws s3 cp build.tar.gz s3://artifacts/myapp/ --endpoint-url http://rustfs:9000 -``` - ---- - -## 🔗 Related Documentation - -- [RustFS Kubernetes Integration](./RUSTFS-K8S-INTEGRATION.md) -- [Development Environment Setup](./DEVELOPMENT.md) -- [Usage Examples](../examples/README.md) - ---- - -## Summary - -**Configuration Meanings**: -- `servers: 1` → 1 RustFS server node (Pod) -- `volumesPerServer: 4` → 4 storage volumes per node -- **Total volumes** = 1 × 4 = 4 volumes (meets minimum requirement) - -**Object Storage Usage**: -1. RustFS provides S3-compatible API (port 9000) -2. Data is distributed across all storage volumes via erasure coding -3. Supports standard S3 clients and SDKs -4. Provides Web console (port 9001) for management - -**Key Understanding**: -- Storage volumes are RustFS's physical storage units -- Multiple volumes provide data redundancy and performance -- At least 4 volumes are required for normal operation (erasure coding requirement) diff --git a/docs/architecture-decisions.md b/docs/architecture-decisions.md deleted file mode 100755 index 40568f7..0000000 --- a/docs/architecture-decisions.md +++ /dev/null @@ -1,469 +0,0 @@ -# Architecture Decisions - -This document records key architectural decisions made in the design of the RustFS Kubernetes Operator. - ---- - -## ADR-001: StatefulSet Per Pool - -**Status**: Accepted - -**Context**: -Each Tenant can have multiple Pools with different configurations. We needed to decide how to represent pools in Kubernetes. - -**Options Considered**: - -1. **Single StatefulSet with all pools**: One StatefulSet with complex pod indexing -2. **StatefulSet per pool**: Separate StatefulSet for each pool -3. 
**Deployment per pool**: Use Deployments instead of StatefulSets
-
-**Decision**: Create one StatefulSet per pool.
-
-**Rationale**:
-
-- **Independent Scaling**: Each pool can be scaled independently
-- **Different Configurations**: Each pool can have different resources, node selectors, etc.
-- **Clear Ownership**: Pool-specific labels and selectors
-- **StatefulSet Benefits**: Stable network identity, ordered/parallel deployment
-- **Kubernetes Native**: Standard pattern for distributed stateful applications
-
-**Implementation**:
-- StatefulSet name: `{tenant-name}-{pool-name}`
-- Pod naming: `{tenant-name}-{pool-name}-{index}`
-- Shared headless service for DNS across all pools
-
-**Tradeoffs**:
-- More Kubernetes resources (one StatefulSet per pool)
-- Slightly more complex reconciliation loop
-- Better flexibility and independence per pool
-
----
-
-## ADR-002: Unified RUSTFS_VOLUMES for All Pools
-
-**Status**: Accepted
-
-**Context**:
-RustFS requires a RUSTFS_VOLUMES environment variable. With multiple pools, we needed to decide how to configure this.
-
-**Options Considered**:
-
-1. **Separate RUSTFS_VOLUMES per pool**: Each pool runs an independent RustFS cluster
-2. **Combined RUSTFS_VOLUMES**: All pools in one unified cluster
-3. **Configurable**: Let users choose
-
-**Decision**: Combine all pools into a single RUSTFS_VOLUMES, forming one unified cluster.
-
-**Rationale**:
-
-- **RustFS Design**: RustFS is designed for single-cluster architecture
-- **Erasure Coding**: Maximum redundancy across all volumes
-- **Resource Efficiency**: Single cluster is more efficient than multiple
-- **Simpler Operation**: One S3 endpoint, not multiple
-- **Follows RustFS Patterns**: Official Helm chart uses same approach
-
-**Implementation**:
-```rust
-fn rustfs_volumes_env_value(&self) -> Result<String> {
-    let volume_specs: Vec<String> = self.spec.pools.iter()
-        .map(|pool| { /* generate pool spec */ })
-        .collect();
-    Ok(volume_specs.join(" "))  // Space-separated
-}
-```
-
-**Tradeoffs**:
-- Cannot have pool-independent clusters within one Tenant
-- All pools share performance characteristics
-- Simpler user model (one cluster, not N clusters)
-
-**Consequences**:
-- Storage class mixing across pools degrades performance to the slowest tier
-- Multi-pool is for scheduling/placement, not storage isolation
-- For separate clusters, users should create separate Tenants
-
----
-
-## ADR-003: SchedulingConfig Struct with Flattened Serialization
-
-**Status**: Accepted
-
-**Context**:
-Pools need Kubernetes scheduling fields (nodeSelector, affinity, tolerations, etc.). We needed to decide how to structure these in the Pool CRD.
-
-**Options Considered**:
-
-1. **Individual fields in Pool**: Each scheduling field as a separate Pool field
-2. **PodTemplateSpec**: Full Kubernetes PodTemplateSpec override
-3. **SchedulingConfig with flatten**: Grouped struct, flat YAML
-
-**Decision**: Use `SchedulingConfig` struct with `#[serde(flatten)]`.
-
-**Rationale**:
-
-**Why Not PodTemplateSpec**:
-- No industry precedent (no operators use this pattern)
-- Complex merging logic (how to merge containers, volumes, etc.)
-- Users could set fields that break operator assumptions
-- Hard to validate with CEL
-
-**Why Not Individual Fields**:
-- Code organization suffers
-- Harder to maintain
-- No clear grouping of related fields
-
-**Why SchedulingConfig**:
-- ✅ Industry standard (MongoDB, PostgreSQL operators use similar pattern)
-- ✅ Better code organization
-- ✅ Flat YAML structure (backward compatible)
-- ✅ Type-safe with clear scope
-- ✅ Can add methods/validation to SchedulingConfig
-- ✅ Reusable if needed elsewhere
-
-**Implementation**:
-```rust
-pub struct SchedulingConfig {
-    pub node_selector: Option<BTreeMap<String, String>>,
-    pub affinity: Option<corev1::Affinity>,
-    // ... other fields
-}
-
-pub struct Pool {
-    pub name: String,
-    pub servers: i32,
-    pub persistence: PersistenceConfig,
-
-    #[serde(flatten)]  // Key: maintains flat YAML
-    pub scheduling: SchedulingConfig,
-}
-```
-
-**YAML Structure** (unchanged from individual fields):
-```yaml
-pools:
-  - name: my-pool
-    servers: 4
-    nodeSelector: {...}   # Still flat
-    affinity: {...}       # Still flat
-    resources: {...}      # Still flat
-```
-
-**Tradeoffs**:
-- Code access is `pool.scheduling.field` vs `pool.field` (one extra level)
-- Better organization worth the extra level
-
----
-
-## ADR-004: Server-Side Apply for Resource Management
-
-**Status**: Accepted
-
-**Context**:
-Resources created by the operator need to be managed declaratively and idempotently.
-
-**Options Considered**:
-
-1. **Create/Update pattern**: Check if exists, create or update
-2. **Server-side apply**: Kubernetes server-side apply
-3. **Client-side apply**: kubectl-style apply
-
-**Decision**: Use server-side apply.
-
-**Rationale**:
-
-- **Idempotent**: Safe to call repeatedly
-- **Field Ownership**: Operator owns specific fields, other managers can own others
-- **Conflict Resolution**: Kubernetes handles conflicts
-- **Declarative**: Matches Kubernetes philosophy
-- **Official Pattern**: Recommended by Kubernetes sig-api-machinery
-
-**Implementation**:
-```rust
-pub async fn apply<T>(&self, resource: &T, namespace: &str) -> Result<T> {
-    let api: Api<T> = Api::namespaced(self.client.clone(), namespace);
-    api.patch(
-        &resource.name_any(),
-        &PatchParams::apply("rustfs-operator"),  // Field manager
-        &Patch::Apply(resource),
-    ).await
-}
-```
-
-**Field Manager**: `"rustfs-operator"`
-
-**Tradeoffs**:
-- Requires understanding of field ownership
-- More sophisticated than simple create/update
-- Correct pattern for operators
-
----
-
-## ADR-005: Owner References for Garbage Collection
-
-**Status**: Accepted
-
-**Context**:
-When a Tenant is deleted, all created resources (StatefulSets, Services, RBAC) should be deleted automatically.
-
-**Options Considered**:
-
-1. **Manual Cleanup**: Finalizers with manual deletion logic
-2. **Owner References**: Kubernetes automatic garbage collection
-3. **Hybrid**: Owner references + finalizers for external resources
-
-**Decision**: Use owner references for automatic garbage collection.
- -**Rationale**: - -- **Kubernetes Native**: Built-in garbage collection -- **Automatic**: No manual cleanup code needed -- **Reliable**: Kubernetes guarantees cleanup -- **Standard Pattern**: Used by most operators -- **No External Resources**: We only create Kubernetes resources (no external systems to clean) - -**Implementation**: -```rust -pub fn new_owner_ref(&self) -> metav1::OwnerReference { - metav1::OwnerReference { - api_version: Self::api_version(&()).to_string(), - kind: Self::kind(&()).to_string(), - name: self.name(), - uid: self.meta().uid.clone().unwrap_or_default(), - controller: Some(true), - block_owner_deletion: Some(true), - } -} -``` - -All created resources include `owner_references: Some(vec![self.new_owner_ref()])`. - -**Tradeoffs**: -- No control over deletion order (Kubernetes decides) -- Fine for our use case (no external dependencies) - -**Future**: If we add external resources (cloud storage, DNS), add finalizers. - ---- - -## ADR-006: Pool-Level Priority Class Override - -**Status**: Accepted - -**Context**: -Both Tenant and Pool can specify priority class. We needed to decide precedence. - -**Options Considered**: - -1. **Tenant-only**: Pool cannot override -2. **Pool-only**: Ignore tenant-level -3. **Pool overrides tenant**: Pool takes precedence if set - -**Decision**: Pool-level priority class overrides tenant-level. - -**Rationale**: - -- **Flexibility**: Different pools can have different priorities -- **Use Case**: Critical pool on high priority, elastic pool on standard -- **Fallback**: Use tenant-level if pool-level not set -- **Principle**: More specific wins (pool more specific than tenant) - -**Implementation**: -```rust -priority_class_name: pool.scheduling.priority_class_name.clone() - .or_else(|| self.spec.priority_class_name.clone()), -``` - -**Example**: -```yaml -spec: - priorityClassName: standard # Tenant default - - pools: - - name: critical - priorityClassName: high # Override - - name: normal - # Uses tenant default (standard) -``` - ---- - -## ADR-007: Shared Services Across All Pools - -**Status**: Accepted - -**Context**: -Should each pool have its own services or share services? - -**Options Considered**: - -1. **Shared Services**: One set of services for all pools -2. **Per-Pool Services**: Separate services per pool -3. **Hybrid**: Shared API, separate console per pool - -**Decision**: Shared services across all pools. - -**Rationale**: - -- **Unified Cluster**: All pools form one RustFS cluster -- **Single S3 Endpoint**: Users access one S3 API, not multiple -- **Simpler**: Fewer resources, easier management -- **RustFS Design**: RustFS expects to be accessed as single cluster - -**Implementation**: -- One IO service (port 9000) for all pools -- One Console service (port 9001) for all pools -- One headless service for StatefulSet DNS - -**Service Selectors**: `rustfs.tenant={name}` (matches all pools) - -**Tradeoffs**: -- Cannot have pool-specific endpoints -- All pools accessed via same service -- Simpler for users (one endpoint to remember) - ---- - -## ADR-008: Automatic Environment Variable Management - -**Status**: Accepted - -**Context**: -RustFS requires specific environment variables. Should users set them or operator? - -**Decision**: Operator automatically sets required RustFS environment variables. 
- -**Rationale**: - -- **User Experience**: Users don't need to know RustFS internals -- **Correctness**: Operator ensures correct configuration -- **Consistency**: Same environment across all deployments -- **Override**: Users can still override if needed (their vars applied after) - -**Automatically Set**: -- `RUSTFS_VOLUMES` - Generated from pools -- `RUSTFS_ADDRESS` - 0.0.0.0:9000 -- `RUSTFS_CONSOLE_ADDRESS` - 0.0.0.0:9001 -- `RUSTFS_CONSOLE_ENABLE` - true - -**Implementation**: -```rust -let mut env_vars = Vec::new(); -env_vars.push(/* RUSTFS_VOLUMES */); -env_vars.push(/* RUSTFS_ADDRESS */); -env_vars.push(/* RUSTFS_CONSOLE_ADDRESS */); -env_vars.push(/* RUSTFS_CONSOLE_ENABLE */); - -// User vars can override -for user_env in &self.spec.env { - env_vars.retain(|e| e.name != user_env.name); - env_vars.push(user_env.clone()); -} -``` - -**Tradeoffs**: -- Less user control over these specific vars -- Better user experience (works out of box) -- Advanced users can still override - ---- - -## ADR-009: Label Strategy - -**Status**: Accepted - -**Context**: -Resources need labels for selection, grouping, and management. - -**Decision**: Use minimal selectors, comprehensive labels. - -**Rationale**: - -**Selectors** (stable, minimal): -```yaml -# Tenant selector -rustfs.tenant: {tenant-name} - -# Pool selector -rustfs.tenant: {tenant-name} -rustfs.pool: {pool-name} -``` - -**Labels** (comprehensive, can change): -```yaml -app.kubernetes.io/name: rustfs -app.kubernetes.io/instance: {tenant-name} -app.kubernetes.io/managed-by: rustfs-operator -app.kubernetes.io/component: storage # Pool resources -rustfs.tenant: {tenant-name} -rustfs.pool: {pool-name} # Pool resources -``` - -**Why Minimal Selectors**: -- Selectors are immutable in StatefulSet -- Cannot be changed without recreating resource -- Minimal selectors provide stability - -**Why Comprehensive Labels**: -- Labels can be added/changed -- Useful for grouping, monitoring, policies -- Follow Kubernetes recommended labels - ---- - -## ADR-010: RBAC Conditional Creation - -**Status**: Accepted - -**Context**: -Users may want to use custom ServiceAccounts. How should RBAC be handled? - -**Decision**: Conditional RBAC creation based on configuration. - -**Logic**: -```rust -let custom_sa = spec.service_account_name.is_some(); -let create_rbac = spec.create_service_account_rbac.unwrap_or(false); - -if !custom_sa || create_rbac { - // Create Role - if !custom_sa { - // Create ServiceAccount + RoleBinding - } else { - // Create RoleBinding only (bind custom SA) - } -} -``` - -**Scenarios**: -1. No custom SA → Create SA, Role, RoleBinding -2. Custom SA + `createServiceAccountRbac=true` → Create Role, RoleBinding -3. Custom SA + `createServiceAccountRbac=false` → Skip all RBAC - -**Rationale**: -- **Flexibility**: Support both managed and custom SA -- **Cloud Integration**: Allow workload identity (AWS IAM, GCP, Azure) -- **Security**: Users can provide their own RBAC with additional permissions -- **Default Simplicity**: Works out of box without custom SA - ---- - -## Future Architectural Decisions - -### Under Consideration - -1. **Finalizers for External Resources**: If we add external integrations -2. **Status Subresource Population**: When to update status, conflict handling -3. **Per-Pool Status Tracking**: Whether to track pool health separately -4. 
**Dynamic Pool Addition**: API for adding pools without recreation - ---- - -## Related Documents - -- [Multi-Pool Use Cases](./multi-pool-use-cases.md) - Valid multi-pool patterns -- [DEVELOPMENT-NOTES.md](./DEVELOPMENT-NOTES.md) - Implementation details and discoveries - ---- - -**Format**: Loosely based on [Architecture Decision Records (ADR)](https://adr.github.io/) -**Last Updated**: 2025-11-08 diff --git a/docs/multi-pool-use-cases.md b/docs/multi-pool-use-cases.md deleted file mode 100755 index c3b46b1..0000000 --- a/docs/multi-pool-use-cases.md +++ /dev/null @@ -1,430 +0,0 @@ -# Multi-Pool Use Cases - -## Overview - -The RustFS Operator supports multiple pools within a single Tenant, enabling advanced deployment patterns for capacity scaling, compliance, cost control, and high availability. - -## ⚠️ Critical Architecture Understanding - -### Unified Cluster Behavior - -**All pools in a Tenant form ONE unified RustFS erasure-coded cluster**, not independent storage tiers. - -**Key Points:** -1. **Single RUSTFS_VOLUMES**: All pools combined into one space-separated environment variable -2. **Uniform Data Distribution**: Erasure coding stripes data across ALL volumes in ALL pools -3. **No Storage Class Awareness**: RustFS does NOT prefer fast disks over slow disks -4. **Performance Limitation**: Cluster performs at the speed of the SLOWEST storage class - -### Common Misconception - -**WRONG Assumption:** -"I can create an NVMe pool for hot data and an HDD pool for cold data, and RustFS will intelligently place data on the appropriate tier." - -**REALITY:** -- RustFS has NO hot/warm/cold data awareness for internal pools -- ALL data is uniformly distributed across ALL volumes -- An object will have shards on NVMe AND HDD -- Read/write performance limited by slowest tier (HDD) -- Expensive NVMe provides ZERO performance benefit - -### What RustFS Tiering Actually Is - -**RustFS tiering** (from `crates/ecstore/src/tier/tier.rs`): -- Transitions data to **EXTERNAL** cloud storage (S3, Azure, GCS, MinIO) -- Configured via **bucket lifecycle policies** -- NOT for internal disk class differentiation - -**Example**: Transition old objects to AWS S3 Glacier for cost savings. 
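-
-As a sketch of what such an external transition could look like (assuming a remote tier has already been registered with RustFS under the name `COLD-TIER`, and that the cluster accepts a standard S3 lifecycle configuration; both are assumptions, not verified against this repo):
-
-```bash
-# Hypothetical: move objects older than 90 days to a pre-configured
-# external tier named COLD-TIER via a bucket lifecycle rule.
-cat > lifecycle.json <<'EOF'
-{
-  "Rules": [
-    {
-      "ID": "archive-old-objects",
-      "Status": "Enabled",
-      "Filter": {"Prefix": ""},
-      "Transitions": [{"Days": 90, "StorageClass": "COLD-TIER"}]
-    }
-  ]
-}
-EOF
-aws --endpoint-url http://rustfs.default.svc.cluster.local:9000 \
-  s3api put-bucket-lifecycle-configuration \
-  --bucket my-bucket --lifecycle-configuration file://lifecycle.json
-```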
-
-## Architecture
-
-### Single Tenant, Multiple Pools
-
-Each Tenant can contain multiple Pools:
-- **One StatefulSet per Pool** with independent configuration
-- **Unified distributed cluster** via combined RUSTFS_VOLUMES
-- **Shared services** (IO, Console, Headless) across all pools
-- **Pool-specific scheduling** for node targeting and resource allocation
-
-### Example Structure
-
-```yaml
-apiVersion: rustfs.com/v1alpha1
-kind: Tenant
-metadata:
-  name: my-tenant
-spec:
-  pools:
-    - name: pool-a
-      servers: 4
-      persistence: {...}
-      nodeSelector: {storage-type: nvme}   # Pool-A specific
-      resources: {requests: {cpu: "8"}}    # Pool-A specific
-
-    - name: pool-b
-      servers: 8
-      persistence: {...}
-      nodeSelector: {storage-type: ssd}    # Pool-B specific
-      resources: {requests: {cpu: "4"}}    # Pool-B specific
-```
-
-## Per-Pool Configuration Options
-
-### Storage Configuration
-- **Storage Class**: Different storage types per pool (NVMe, SSD, HDD)
-- **Storage Size**: Different capacities per pool
-- **Volume Count**: Different server/volume ratios
-- **Mount Paths**: Custom paths (default: `/data/rustfs{N}`)
-
-### Kubernetes Scheduling
-- **Node Selector**: Target specific nodes by labels
-- **Affinity**: Complex node/pod affinity rules
-- **Tolerations**: Schedule on tainted/dedicated nodes
-- **Topology Spread**: Distribute across zones/regions
-- **Resources**: CPU/memory requests and limits
-- **Priority Class**: Override tenant-level priority (per-pool)
-
-### Metadata
-- **Labels**: Custom PVC labels for each pool
-- **Annotations**: Backup policies, monitoring tags
-
-## Use Case Examples
-
-### 1. Hardware-Targeted Pools
-
-**Scenario**: Different pools on NVMe, SSD, and HDD nodes
-
-**Example**: [hardware-pools-tenant.yaml](../examples/hardware-pools-tenant.yaml)
-
-**Benefits**:
-- Performance optimization (hot data on NVMe)
-- Cost optimization (cold data on HDD)
-- Hardware utilization (match workload to hardware)
-- Resource differentiation (more CPU/memory for NVMe)
-
-**Implementation**:
-```yaml
-pools:
-  - name: nvme-pool
-    nodeSelector: {storage-type: nvme}
-    resources:
-      requests: {cpu: "8", memory: "32Gi"}
-
-  - name: hdd-pool
-    nodeSelector: {storage-type: hdd}
-    resources:
-      requests: {cpu: "2", memory: "8Gi"}
-```
-
-### 2. Geographic Distribution
-
-**Scenario**: Pools in different regions for compliance/latency
-
-**Example**: [geographic-pools-tenant.yaml](../examples/geographic-pools-tenant.yaml)
-
-**Benefits**:
-- GDPR compliance (EU data stays in EU)
-- Data sovereignty enforcement
-- Low latency for regional users
-- Disaster recovery across regions
-
-**Implementation**:
-```yaml
-pools:
-  - name: us-region
-    affinity:
-      nodeAffinity:
-        requiredDuringSchedulingIgnoredDuringExecution:
-          nodeSelectorTerms:
-            - matchExpressions:
-                - key: topology.kubernetes.io/region
-                  operator: In
-                  values: ["us-east-1"]
-    topologySpreadConstraints:
-      - maxSkew: 1
-        topologyKey: topology.kubernetes.io/zone
-```
-
-### 3. Cost Optimization (Spot Instances)
-
-**Scenario**: Mix of on-demand and spot instances
-
-**Example**: [spot-instance-tenant.yaml](../examples/spot-instance-tenant.yaml)
-
-**Benefits**:
-- 70-90% cost reduction
-- Critical data on guaranteed instances
-- Elastic capacity on cheaper spot instances
-- Automatic failure handling via erasure coding
-
-**Implementation**:
-```yaml
-pools:
-  - name: critical-pool
-    nodeSelector: {instance-lifecycle: on-demand}
-    priorityClassName: system-cluster-critical
-
-  - name: elastic-pool
-    nodeSelector: {instance-lifecycle: spot}
-    tolerations:
-      - key: "spot-instance"
-        operator: "Equal"
-        value: "true"
-        effect: "NoSchedule"
-```
-
-### 4. Workload Separation
-
-**Scenario**: Different pools for different workload types
-
-**Benefits**:
-- Batch processing isolation from real-time
-- Performance guarantees per workload
-- Resource allocation by priority
-
-**Implementation**:
-```yaml
-pools:
-  - name: realtime-pool
-    servers: 4
-    nodeSelector: {workload-type: realtime}
-    resources:
-      requests: {cpu: "8", memory: "32Gi"}
-    priorityClassName: high-priority
-
-  - name: batch-pool
-    servers: 16
-    nodeSelector: {workload-type: batch}
-    resources:
-      requests: {cpu: "4", memory: "16Gi"}
-```
-
-### 5. Multi-Tenant SaaS
-
-**Scenario**: Separate pools per customer tier
-
-**Benefits**:
-- SLA guarantees for premium customers
-- Hardware isolation
-- Security boundaries
-- Cost differentiation
-
-**Implementation**:
-```yaml
-pools:
-  - name: enterprise-pool
-    nodeSelector: {tenant-tier: enterprise}
-    tolerations:
-      - key: "enterprise-only"
-        effect: "NoSchedule"
-    resources:
-      requests: {cpu: "8", memory: "32Gi"}
-
-  - name: standard-pool
-    nodeSelector: {tenant-tier: standard}
-    resources:
-      requests: {cpu: "2", memory: "8Gi"}
-```
-
-### 6. Failure Domain Separation
-
-**Scenario**: Pools distributed across availability zones
-
-**Benefits**:
-- Survive entire zone failures
-- Network locality within zones
-- Balanced distribution
-
-**Implementation**:
-```yaml
-pools:
-  - name: zone-a
-    affinity:
-      nodeAffinity:
-        requiredDuringSchedulingIgnoredDuringExecution:
-          nodeSelectorTerms:
-            - matchExpressions:
-                - key: topology.kubernetes.io/zone
-                  operator: In
-                  values: ["us-east-1a"]
-
-  - name: zone-b
-    affinity:
-      nodeAffinity:
-        requiredDuringSchedulingIgnoredDuringExecution:
-          nodeSelectorTerms:
-            - matchExpressions:
-                - key: topology.kubernetes.io/zone
-                  operator: In
-                  values: ["us-east-1b"]
-```
-
-## Technical Details
-
-### How Pools are Combined
-
-From `workloads.rs:37-66`:
-```rust
-fn rustfs_volumes_env_value(&self) -> Result<String> {
-    let volume_specs: Vec<String> = self.spec.pools.iter()
-        .map(|pool| {
-            format!(
-                "http://{}-{}-{{0...{}}}.{}.{}.svc.cluster.local:9000{}/rustfs{{0...{}}}",
-                tenant_name, pool_name, servers-1, headless, namespace, path, volumes-1
-            )
-        })
-        .collect();
-    Ok(volume_specs.join(" "))  // Space-separated
-}
-```
-
-**Result**: All pools are combined into a single RUSTFS_VOLUMES environment variable.
-
-**Example** (2 pools):
-```
-http://tenant-pool-a-{0...3}.tenant-hl.ns.svc.cluster.local:9000/data/rustfs{0...7} http://tenant-pool-b-{0...7}.tenant-hl.ns.svc.cluster.local:9000/data/rustfs{0...3}
-```
-
-RustFS then treats all 12 servers (4 from pool-a + 8 from pool-b) as a unified distributed cluster.
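-
-To confirm the combined value a running deployment actually received, one option (a sketch; the pod name is illustrative) is to read the environment of any pool pod:
-
-```bash
-# Inspect the unified volume spec injected by the operator
-kubectl exec tenant-pool-a-0 -- env | grep RUSTFS_VOLUMES
-```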
-
-### Scheduling Field Propagation
-
-From `workloads.rs:236-250`:
-```rust
-spec: Some(corev1::PodSpec {
-    service_account_name: Some(self.service_account_name()),
-    containers: vec![container],
-    scheduler_name: self.spec.scheduler.clone(),
-    priority_class_name: pool.priority_class_name.clone()
-        .or_else(|| self.spec.priority_class_name.clone()),
-    node_selector: pool.node_selector.clone(),
-    affinity: pool.affinity.clone(),
-    tolerations: pool.tolerations.clone(),
-    topology_spread_constraints: pool.topology_spread_constraints.clone(),
-    ..Default::default()
-}),
-```
-
-**Pool-level** fields override or extend **tenant-level** settings.
-
-## Best Practices
-
-### 1. Ensure Minimum Viable Configuration
-
-Each pool must satisfy: `servers * volumesPerServer >= 4`
-
-**Example**:
-- ✅ 1 server × 4 volumes = 4 ✓
-- ✅ 2 servers × 2 volumes = 4 ✓
-- ❌ 2 servers × 1 volume = 2 ✗
-
-### 2. Plan for Failures
-
-With multi-pool, plan for the worst case:
-- What if an entire pool goes down?
-- With erasure coding, can you afford to lose N/2 volumes?
-- Ensure critical data has redundancy across pools
-
-### 3. Label Nodes Appropriately
-
-Use clear, consistent node labels:
-```bash
-# Good
-kubectl label node <node-name> storage-type=nvme
-kubectl label node <node-name> instance-lifecycle=spot
-kubectl label node <node-name> topology.kubernetes.io/region=us-east-1
-
-# Avoid
-kubectl label node <node-name> type=1   # Unclear
-```
-
-### 4. Use Topology Spread for High Availability
-
-Distribute pool pods across failure domains:
-```yaml
-topologySpreadConstraints:
-  - maxSkew: 1
-    topologyKey: topology.kubernetes.io/zone
-    whenUnsatisfiable: DoNotSchedule
-    labelSelector:
-      matchLabels:
-        rustfs.pool: my-pool-name
-```
-
-### 5. Monitor Pool Health Separately
-
-```bash
-# Check pods per pool
-kubectl get pods -l rustfs.pool=nvme-pool
-kubectl get pods -l rustfs.pool=ssd-pool
-kubectl get pods -l rustfs.pool=hdd-pool
-
-# Check distribution
-kubectl get pods -l rustfs.tenant=my-tenant -o wide
-```
-
-## Limitations
-
-### Current Limitations
-
-1. **No Per-Pool Status**: Status tracking is tenant-level only
-2. **Shared Services**: All pools share the same IO/Console services
-3. **Single RUSTFS_VOLUMES**: All pools in one environment variable
-4. **No Dynamic Pool Addition**: Must update the tenant spec (no hot-add)
-
-### Design Constraints
-
-1. **Erasure Coding**: Each pool must meet the 4-volume minimum
-2. **StatefulSet Per Pool**: Each pool creates a separate StatefulSet
-3. **Shared Headless Service**: All pools use the same headless service for DNS
-4. **Unified Cluster**: RustFS treats all pools as one cluster
-
-## Troubleshooting
-
-### Pool Pods Not Scheduling
-
-**Symptom**: Pods stuck in Pending state
-
-**Check**:
-```bash
-kubectl describe pod <pod-name>
-# Look for: "0/N nodes are available: N node(s) didn't match Node-Selector"
-```
-
-**Solution**: Verify node labels match the pool's nodeSelector
-
-### Uneven Distribution
-
-**Symptom**: All pods in one zone
-
-**Check**:
-```bash
-kubectl get pods -l rustfs.pool=my-pool -o wide
-```
-
-**Solution**: Add topology spread constraints to the pool
-
-### Resource Starvation
-
-**Symptom**: Some pools not getting resources
-
-**Check**:
-```bash
-kubectl describe nodes
-# Look for resource pressure
-```
-
-**Solution**: Set appropriate resource requests per pool
-
-## Related Documentation
-
-- [Pool Configuration](../docs/pool-configuration.md)
-- [Hardware-Targeted Example](../examples/hardware-pools-tenant.yaml)
-- [Geographic Distribution Example](../examples/geographic-pools-tenant.yaml)
-- [Spot Instance Example](../examples/spot-instance-tenant.yaml)
-
----
-
-**Version**: v0.2.0 (with pool scheduling fields)
-**Last Updated**: 2025-11-08
diff --git a/docs/prd-tenant-events-sse.md b/docs/prd-tenant-events-sse.md
deleted file mode 100644
index 8ceb989..0000000
--- a/docs/prd-tenant-events-sse.md
+++ /dev/null
@@ -1,127 +0,0 @@
-# PRD: Tenant Events Multi-Resource Aggregation and SSE Push
-
-**Status:** Draft
-**Scope:** Console / Operator
-**Updated:** 2026-03-29
-
----
-
-## 1. Background and Problem
-
-The **Events** tab on the Tenant detail page lists only Kubernetes `Event` objects whose `involvedObject.name` equals the Tenant name; events on child resources such as Pods, StatefulSets, and PVCs are **not visible**. The detail page is mostly **client-side routing** plus a full `loadTenant()` (or equivalent data load) and does **not necessarily** trigger a full browser page reload; however, **the Events sub-view cannot incrementally refresh the event list on its own**. By default you must **re-enter the detail page** or rely on a **full `loadTenant()`** to update event data, which makes troubleshooting inefficient.
-
-**Relationship to `kubectl describe`:** These events share the **same data source** as the **Events** section in `kubectl describe` output: both are cluster `Event` objects (Phase 1 focuses on `core/v1`, see §3). The event rows seen when running `kubectl describe …` against the Tenant / Pod / StatefulSet / PVC individually are **semantically identical** to the entries merged on this page per §4 (the same Event on the same `involvedObject`). The only differences: the Console **merges multiple resources**, **deduplicates**, **sorts uniformly**, and may **truncate the count** (e.g. default 200), so ordering and completeness do not necessarily match a per-resource describe line for line.
-
-## 2. Goals
-
-1. Show multi-resource events **belonging to the Tenant** (**Tenant CR, Pod, StatefulSet, PVC**) in a single view.
-2. Push the **merged event snapshot** to the browser via **SSE (Server-Sent Events)**; do **not** provide a separate `GET .../events` HTTP aggregation endpoint (the route and its handler can be removed in the implementation).
-
-**Product cost of SSE-only with REST removed (strong decision, required reading for review):**
-
-- Once the public `GET .../events` JSON is gone, **scripts / curl / automation** cannot pull the merged list in a single request (unless an additional **internal / debug / ops** read-only endpoint is added).
-- **Integration tests** depend more on an **SSE client** or a **browser environment**, which costs more than plain REST assertions.
-
-**Optional variant:** If removing REST for all callers is unacceptable: close `GET .../events` for the **user UI** but **keep a read-only ops API** (separate auth or network policy); choose one of this or "full removal", and make the choice explicit in implementation and review. Costs and the variant are expanded in **§7**.
-
-## 3. Phase 1 Non-Goals
-
-- Not a replacement for K8s audit logs or RustFS application logs.
-- The first version does not force a migration to `events.k8s.io`; if the cluster is primarily `core/v1` `Event`, keep using it.
-- The first version does not introduce WebSocket (unless a strong need emerges later).
-
-## 4. Determining What "Belongs to the Tenant"
-
-| Resource | Phase 1 rule |
-|------|----------------|
-| Tenant | `metadata.name == {tenant}`; on the event side, **`involvedObject.name={tenant}` and an `involvedObject.kind` matching the Kind registered by the CRD (usually `Tenant`)** are required (see §4.1). |
-| Pod | See **§4.1**; same source as the Console **`GET .../pods`** (`list_pods`). |
-| StatefulSet | See **§4.1**; same source as the STS used by the Console **`GET .../pools`** (`list_pools`). |
-| PersistentVolumeClaim | See **§4.1**; the Console has no standalone PVC list API, so discover by the same **labels** the Operator uses. |
-
-### 4.1 Alignment with Pod / StatefulSet / PVC discovery (fixed convention)
-
-The **resource-name whitelist** used for merging events must follow **the same label and naming rules** as the current Console implementation (same namespace, same path parameter `{tenant}`), so the Events and Pods / Pools views do not each compute their own scope.
-
-| Resource | Alignment with existing behavior |
-|------|---------------------|
-| **Pod** | `Pod` uses the **`ListParams` label `rustfs.tenant=<tenant>`**. Consistent with **`list_pods`** in `src/console/handlers/pods.rs`. |
-| **StatefulSet** | `StatefulSet` uses the **same label `rustfs.tenant=<tenant>`**; STS names are **`{tenant}-{pool}`** (`pool` comes from the Tenant `spec.pools`), consistent with **`list_pools`** in `src/console/handlers/pools.rs`. |
-| **PersistentVolumeClaim** | The Operator injects **`rustfs.tenant`**, **`rustfs.pool`**, etc. on PVC templates (see `Tenant::pool_labels` and `volume_claim_templates` in `src/types/v1alpha1/tenant/workloads.rs`). On the event side, list the name set for **`PersistentVolumeClaim`** with **the same tenant label `rustfs.tenant=<tenant>` as Pods**, which matches the PVCs the Operator creates. |
-
-**Implementation requirement:** The Events merge logic must **reuse or extract** the **same label strings and STS naming formula** as `list_pods` / `list_pools`; writing a second set of queries is forbidden. Whenever Pod/Pool discovery changes, Events must change in lockstep or share the module.
-
-**The Tenant CR itself:** `involvedObject.name={tenant}` and `involvedObject.kind` matching the Kind registered by the CRD (usually `Tenant`). **Current gap:** `src/console/handlers/events.rs` filters only by `involvedObject.name` and does **not** constrain kind; this feature **must add** the kind condition (combine `involvedObject.kind` into the field selector if supported; otherwise apply an **equivalent post-filter** after list/watch) to avoid mixing in events from **same-name, different-kind** resources in the same namespace.
-
-**Implementation principle:** Stay consistent with **`list_pods` / `list_pools` and the PVC label convention** (§4.1).
-
-**Scope boundary (mandatory):** `{tenant}` in the SSE path is the Tenant of the current detail page; **only** events of resources attributed to **that Tenant** per the table above are merged and displayed. Mixing in Pod/STS/PVC events of other Tenants in the same namespace is **forbidden**; the server filters against the whitelist formed by "the current tenant's discovery set", the frontend **renders only this page's tenant**, and the old list state must be **discarded** when switching Tenants or leaving the page, to avoid cross-tenant data bleed.
-
-## 5. Functional Requirements
-
-### 5.1 SSE: `GET /api/v1/namespaces/{ns}/tenants/{tenant}/events/stream`
-
-No separate `GET .../tenants/{tenant}/events` HTTP aggregation endpoint is provided; the merged event list is delivered **only** through this SSE endpoint as **JSON snapshots** (the implementation may delete the existing events REST route and handler). The **cost** of removing REST and the **optional variant** are expanded in **§7**.
-
-- **Tenant scope:** Every event in a snapshot must belong to the discovery set of the **path parameter `{tenant}`** (see §4); events of other Tenants' resources must not be included.
-- **Merge** sources: **Tenant CR:** `involvedObject.name={tenant}` **and** `involvedObject.kind` is Tenant (or the CRD-equivalent Kind; if the field selector cannot combine kind, see the **post-filter** in §4.1); **additionally** merge, **within that tenant's scope**, the **kind-matching** `involvedObject` events for every **Pod name, StatefulSet name, and PVC name** (server-side multiple lists merged, or an equivalent implementation).
-- **Dedupe:** Prefer `metadata.uid`; otherwise weakly dedupe by `(kind, name, reason, firstTimestamp, message)`.
-- **Sort:** Descending by `lastTimestamp` / `eventTime`; **at most 200 entries per snapshot frame by default** (a configurable constant; must be written into the API docs).
-- **Errors:** Return an **explicit HTTP error** on **critical failures** such as before the connection is established or when the Watch cannot start; after a successful `200` / established stream, failures must not be masked indefinitely by "empty snapshots" (consistent with the existing Console error policy).
-- **Auth:** Same as the existing Console (JWT + user K8s token).
-- **Content-Type:** `text/event-stream`.
-- **Behavior:** Watch `Event` within the namespace (or equivalent); the server pushes only after filtering by the involvedObject set of the **current path `{tenant}`**; events of unrelated Tenants **must not** be pushed into this connection.
-- **Payload:** Each push delivers a **full snapshot** JSON: `{ "events": [ ... ] }`; field conventions are written into the API docs, with the same 200-entry cap.
-- **First frame:** After the connection is established, at least one **snapshot** **must** be sent as soon as possible, serving as the first-paint data source (there is no separate REST fallback).
-- **Disconnects:** The client `EventSource` reconnects with backoff; the server uses `resourceVersion` etc. to resume the Watch within a reasonable window.
-
-### 5.2 Frontend (console-web)
-
-- Entering the Events tab: **establish SSE** and update state from the **first frame and subsequent snapshots** (no separate HTTP fetch for events).
-- **Auth and `EventSource`:** While the current session is a **Cookie** (consistent with the existing Console), the stream must be **same-site / credentialed** (e.g. `credentials: 'include'` / `withCredentials: true`) and consistent with the **CORS** policy. **If** auth ever moves to an **Authorization header**: native `EventSource` **cannot set custom headers**; the alternatives are staying with the **Cookie**, or a **query token** (its leak risk needs a separate assessment), to be settled in design and review.
-- SSE failure: **non-blocking** toast, keep the last data, offer **Retry** or **manual refresh**.
-- Table column semantics unchanged: display and filter enums for **Type** and **Object** are in **§5.3**; the **Object** column renders as `Kind/Name` (`Name` is the resource name, not an enum).
-- **Current Tenant only:** The list and filter results must **not** contain other Tenants' events; **clear** the events state when the `tenant` route parameter changes or before unmounting, to avoid leftovers.
-- **Filtering (client-side):** On the merged list **already loaded for the current tenant**, support filtering the display by **Type**, **Object (Kind)**, and **time** (a range or relative window based on `lastTimestamp` / `eventTime`); on the object side, combine a **Kind multi-select** with a **name keyword** (matching `involvedObject.name`); server-side filter parameters in the SSE payload or URL are **not** required (Phase 1).
-
-### 5.3 Type and object enums (Phase 1)
-
-Aligned with `core/v1` `Event` and this page's merge scope; used by the **table columns** and the **filters**.
-
-| Dimension | Enum (fixed) | K8s field | Notes |
-|------|----------------|-----------------|------|
-| **Type** | `Normal`, `Warning` | `Event.type` | **No standard `Error` type:** Kubernetes `Event.type` only specifies `Normal` / `Warning`; events with "error semantics" such as failures or unschedulable pods are generally **`Warning`** in the API rather than a separate `Error`. Consistent with [Event v1](https://kubernetes.io/docs/reference/kubernetes-api/cluster-resources/event-v1/); the filter offers only these two. If the API returns an empty or different string (including custom values from individual components), the **Type** column **shows it as-is** and the item **does not participate** in the "Type" enum filter (or is grouped into an "Other" option; pick one and state it clearly in the UI copy). |
-| **Object (Kind)** | `Tenant`, `Pod`, `StatefulSet`, `PersistentVolumeClaim` | `involvedObject.kind` | Phase 1 matches the §4 resource scope. Filtering is a **Kind multi-select**; `involvedObject.name` is displayed as a **string** with an **optional keyword** filter, no enum. |
-
-**Implementation hint:** The frontend can express the enums above with TypeScript literal unions or constant arrays, avoiding scattered magic strings.
-
-## 6. Non-Functional Requirements
-
-| Dimension | Requirement |
-|------|------|
-| RBAC | The user must be able to `list`/`watch` `events` and `list` the resources used to discover Pods/STS/PVCs. |
-| Performance | The merged list is capped; a dropped connection must release its Watch; avoid unbounded tasks per tab. |
-| Multi-replica | Without session affinity, document that **SSE requires sticky sessions**, or run Phase 1 as a single replica; avoid a Watch landing on the wrong instance and hanging indefinitely. |
-| Gateway / proxy | Common **Nginx / Ingress** default **read timeouts (e.g. 60s)** cut SSE streams that go a long time without response bytes, showing up as **silent stream drops** and **frequent client reconnects**. **Launch checklist:** raise `proxy_read_timeout` (or the **equivalent timeout** in Envoy etc.), alongside **multi-replica stickiness**; the exact value is decided jointly by ops and by whether server-side comment lines / heartbeats are adopted. |
-| Security | The SSE snapshot DTO contains no Secret content; **tenant isolation**: the stream and the UI expose only events within the current `{tenant}` scope. |
-
-## 7. Release Strategy
-
-1. **Ship SSE directly** as the only channel for events; **delete** (or never implement) the `GET .../tenants/{tenant}/events` aggregation HTTP endpoint, avoiding dual-path maintenance.
-2. **Product cost (consistent with §2):** With the public JSON removed, **scripts / curl / automation** cannot pull the merged events in a single request (unless an additional **internal / debug / ops** endpoint is added); **integration tests** depend more on an **SSE client** or a **browser environment**.
-3. **Optional variant:** If the team does not accept removing REST for all callers: close `GET .../events` for the **user UI** and **keep a read-only ops API** (separate auth or network policy); choose one of this or "full removal" and write the choice down in the docs.
-4. No phased "REST first, SSE later" rollout or **SSE-off-by-default** switch is needed; the SSE first-frame snapshot covers first paint and updates.
-
-## 8. Acceptance Criteria
-
-1. Artificially produce a Pod-level **Warning** event (e.g. unschedulable); the corresponding row appears in the table **within ~15s**, with **Object** as `Pod/...`, and no full page reload.
-2. With no events REST, the SSE **first frame and subsequent snapshots** alone yield a consistent **merged, sorted, truncated** list.
-3. On insufficient RBAC or connection failure, an **explicit error** (or a reasonable SSE failure semantic) is returned; no "misleading empty table".
-4. After the tab is closed, the server **stops** the corresponding Watch/SSE (verifiable via logs in development).
-5. With **multiple Tenants** in the same namespace, Tenant B's Pod/STS/PVC events do **not appear** in Tenant A's detail Events (both server and frontend must satisfy this).
-6. The Pod / StatefulSet / PVC name sets used for merging match **§4.1** and the corresponding handler behavior (code review or unit tests can check against the `rustfs.tenant` and `{tenant}-{pool}` rules).
-7. **Tenant CR events** contain only those with **`involvedObject.kind=Tenant` (or the CRD-equivalent Kind) and `involvedObject.name={tenant}`**; events of other resources must **not** leak in via same-name, different-kind objects; verifiable through a **field selector that includes kind** or the documented **equivalent post-filter** (§4.1).
-
----
-
-*End of the one-page PRD.*
diff --git a/docs/tech-design-tenant-events-sse.md b/docs/tech-design-tenant-events-sse.md
deleted file mode 100644
index 9089671..0000000
--- a/docs/tech-design-tenant-events-sse.md
+++ /dev/null
@@ -1,239 +0,0 @@
-# Technical Design: Tenant Events SSE
-
-**Status:** Draft
-**Scope:** `console` (Rust) + `console-web` (Next.js)
-**Related PRD:** [prd-tenant-events-sse.md](./prd-tenant-events-sse.md)
-**Updated:** 2026-03-29
-
----
-
-## 1. Goals (from PRD)
-
-- Aggregate `core/v1` `Event` for **Tenant CR, Pod, StatefulSet, PVC** scoped to one tenant (see PRD §4 / §4.1).
-- Deliver updates via **SSE** only; **remove** `GET /api/v1/namespaces/{ns}/tenants/{tenant}/events` (see PRD §5 / §7).
-- Frontend: **no** separate HTTP fetch for events; **client-side** filters for type / kind / time.
-
-### 1.1 PRD context (revision)
-
-- **UX / loading:** The PRD §1 clarifies that the tenant detail view is typically **client-side routing** + full `loadTenant()` (or equivalent), **not** necessarily a browser full-page reload. The Events sub-tab still could not incrementally refresh events alone before SSE; after SSE, **the Events tab self-updates** via the stream without requiring a full `loadTenant()` for event rows (keep this distinction when testing and documenting).
-
-- **Removing REST (product / engineering):** Dropping public `GET .../events` means **curl/scripts** cannot fetch merged JSON in one shot, and **integration tests** lean on **SSE clients** or **browser** (PRD §2 / §7). If the team chooses the **optional variant**, keep a **read-only ops/internal** JSON endpoint behind separate auth/networking; document the chosen path in deploy docs and OpenAPI.
-
----
-
-## 2. Architecture
-
-```mermaid
-flowchart LR
-    subgraph browser [console-web]
-        UI[Tenant Detail Events tab]
-        ES[EventSource withCredentials]
-        UI --> ES
-    end
-    subgraph console [Operator Console API]
-        H[SSE handler]
-        D[Tenant scope discovery]
-        W[Event watch + filter]
-        M[Merge dedupe sort cap]
-        H --> D --> W --> M
-    end
-    subgraph k8s [Kubernetes API]
-        API[core/v1 Event watch/list]
-        R[Pod STS PVC Tenant lists]
-    end
-    ES <-->|cookie session| H
-    D --> R
-    W --> API
-```
-
-- **Discovery** reuses the same label/name rules as `list_pods` / `list_pools` + PVC list by `rustfs.tenant` (PRD §4.1).
-- **Watch**: namespace-scoped `Event` stream, **in-memory filter** by `involvedObject` ∈ scope (see §3.3).
-- **Transport**: SSE `text/event-stream`, each message body = full snapshot JSON `{ "events": [...] }` (PRD §5.1).
-
----
-
-## 3. Backend (Rust / `src/console`)
-
-### 3.1 Routes
-
-| Action | Path |
-|--------|------|
-| **Add** | `GET /api/v1/namespaces/:namespace/tenants/:tenant/events/stream` |
-| **Remove** | `GET .../tenants/:tenant/events` → delete route + `list_tenant_events` (per PRD) |
-
-Register in [`src/console/routes/mod.rs`](src/console/routes/mod.rs) (`event_routes` or a dedicated stream route). Merge in `api_routes()` in [`server.rs`](src/console/server.rs).
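-
-As a quick orientation for reviewers, a minimal wiring sketch follows (assuming axum-style routing as used by the console; `stream_tenant_events` and `event_stream_routes` are placeholder names, not existing code, and the `:param` syntax follows the table above):
-
-```rust
-use axum::{response::sse::{Event, Sse}, routing::get, Router};
-use futures::stream::{self, Stream};
-use std::convert::Infallible;
-
-// Placeholder handler: emits a single empty snapshot frame and ends.
-// The real handler would authenticate, discover the tenant scope, and
-// push one snapshot per relevant Event change (see §3.3-§3.5).
-async fn stream_tenant_events() -> Sse<impl Stream<Item = Result<Event, Infallible>>> {
-    let first = Event::default().data(r#"{"events":[]}"#);
-    Sse::new(stream::once(async move { Ok(first) }))
-}
-
-pub fn event_stream_routes() -> Router {
-    Router::new().route(
-        "/api/v1/namespaces/:namespace/tenants/:tenant/events/stream",
-        get(stream_tenant_events),
-    )
-}
-```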
- -### 3.2 Module layout (suggested) - -| Piece | Responsibility | -|-------|------------------| -| `handlers/events.rs` or `handlers/events_stream.rs` | Axum handler: auth `Claims`, build K8s client, spawn stream task | -| `tenant_event_scope.rs` (new) | `async fn discover_tenant_event_scope(client, ns, tenant) -> Result`: pod names, STS names, PVC names, tenant name; **shared helpers** with `list_pods` / `list_pools` label strings (`rustfs.tenant=...`) and `{tenant}-{pool}` | -| Reuse `EventItem` / `EventListResponse` shape | Snapshot JSON field names stay stable for the UI; optional thin wrapper `EventSnapshot { events: Vec }` for SSE | - -**Refactor note:** Extract `format!("rustfs.tenant={}", tenant)` and STS name building into a small module used by `pods`, `pools`, and `tenant_event_scope` to satisfy PRD §4.1 “single source of truth”. - -### 3.3 Kubernetes interaction - -**API version:** `core/v1` `Event` only in Phase 1 (PRD §3; aligns with existing `list_tenant_events`). - -#### 3.3.1 Tenant `involvedObject.kind` (correction vs current code) - -**Gap:** [`src/console/handlers/events.rs`](src/console/handlers/events.rs) today uses **only** `involvedObject.name={tenant}` in a field selector; it does **not** filter by `involvedObject.kind`. PRD §4 / §4.1 / §8 require **Tenant** rows to match **`kind` + `name`**, so a different resource kind with the same name could theoretically be mixed in. - -**Fix (implementation contract):** - -1. **Read the CRD Kind** used at runtime (e.g. constant aligned with `deploy/rustfs-operator/crds/` or `kubectl get crd`—typically **`Tenant`**). Store in `Scope` as `tenant_event_kind: String` (or `&'static str` if fixed). -2. **Filtering:** For every candidate `Event`, when attributing to the “Tenant CR” row, require: - - `involved_object.name == tenant` **and** - - `involved_object.kind == tenant_event_kind` (case-sensitive as returned by the API). -3. **List/watch strategy:** Kubernetes field selectors for `Event` may not support combining `involvedObject.kind` and `involvedObject.name` reliably across versions. **Recommended default:** namespace-scoped **list + watch** of `Event`, then **post-filter** all legs (Tenant / Pod / STS / PVC) in Rust—same code path for snapshot and watch updates. Optionally use field selectors where they reduce list size only after verification on target K8s versions. - -**Scope set:** Build `involved: Set<(Kind, Name)>` with **fully qualified kind strings** as returned by `Event.involved_object.kind` (e.g. `Pod`, `StatefulSet`, `PersistentVolumeClaim`, `Tenant`). - -#### 3.3.2 Scope discovery (initial + periodic refresh) - -1. `Pod`: `Api::list` with `labels: rustfs.tenant=` (same as [`handlers/pods.rs`](src/console/handlers/pods.rs)). -2. `StatefulSet`: same label; names must match `{tenant}-{pool}` for each `pool` in `Tenant` spec (same as [`handlers/pools.rs`](src/console/handlers/pools.rs)). -3. `PersistentVolumeClaim`: `Api::list` with `labels: rustfs.tenant=`. -4. `Tenant`: name = path param; **kind** from CRD / constant (see §3.3.1). - -**Filtering (watch path):** - -- `watcher` / `WatchStream` on `Api` **in the namespace** (list+watch with `ListParams::default()` or minimal params). -- For each `Applied`/`Deleted` event, **accept** iff `(involved.kind, involved.name)` ∈ `Scope` (with kind matching rules above). -- On **reconnect**, use `resource_version` from last bookmark/object when possible (kube-rs patterns). 
-
-**Initial snapshot:** Before or right after the watch starts, **list** events in the namespace and filter the same way, then **dedupe / sort / cap 200** (PRD §5.1). Emit the first SSE `data:` line immediately so the UI can render without a separate REST call.
-
-**Periodic scope refresh:** Re-run discovery every **N** seconds (e.g. 30–60s) or when the watch errors, so new Pods/PVCs enter the whitelist without requiring a reconnect. Document the chosen **N** in a code comment.
-
-### 3.4 Dedupe, sort, cap
-
-- **Dedupe:** `metadata.uid` first; else weak key `(kind, name, reason, firstTimestamp, message)` (PRD §5.1).
-- **Sort:** `lastTimestamp` / `eventTime` descending.
-- **Cap:** default **200** events per snapshot (constant + comment for ops).
-
-### 3.5 SSE response (Axum)
-
-- `Content-Type: text/event-stream`
-- `Cache-Control: no-cache`, `Connection: keep-alive` as appropriate
-- Body: async **stream** of UTF-8 lines: `data: <json>\n\n`
-- On **fatal** errors **before** the stream starts → return **4xx/5xx JSON** (same error envelope as other console handlers), **not** an empty stream.
-- On **watch failure after** the stream started → optionally send a final SSE event with an error shape **or** close the connection; **do not** silently send endless empty snapshots (PRD §5.1).
-
-**Compression:** SSE is long-lived; ensure `CompressionLayer` does not buffer the stream indefinitely (verify `tower-http` behavior or disable compression for this path if needed).
-
-### 3.6 Auth
-
-- **HTTP session:** Middleware uses the **`session` cookie JWT** ([`middleware/auth.rs`](src/console/middleware/auth.rs)). **EventSource** sends cookies on same-site / credentialed CORS; the frontend must use `{ withCredentials: true }` for cross-origin dev.
-- **K8s API:** `Claims` still carries `k8s_token` for the impersonated `kube::Client`; unchanged from other handlers.
-
-**PRD note:** "JWT + user K8s token" in the PRD refers to this combined model; SSE does **not** use `Authorization` headers for browser transport.
-
-### 3.7 OpenAPI
-
-- Remove or mark deprecated the old `GET .../events` in [`openapi.rs`](src/console/openapi.rs).
-- Document `GET .../events/stream` (response = `text/event-stream`, example snapshot schema).
-
----
-
-## 4. Frontend (`console-web`)
-
-### 4.1 Transport
-
-- **Prefer `EventSource`** with `{ withCredentials: true }` so the **session cookie** is sent (matches existing auth).
-- Parse `message` events: `event.data` → JSON `EventListResponse`-compatible `{ events: EventItem[] }`.
-- **URL:** `${apiBase}/api/v1/namespaces/${ns}/tenants/${tenant}/events/stream` (add a helper next to the removed `listTenantEvents`).
-
-**Limitations (PRD §5.2):**
-
-- Standard `EventSource` does **not** send custom `Authorization` headers; a cookie session is the primary fit.
-- If you ever move to **Bearer-only** auth, plan **fetch streaming** or a **query token** (security review) instead of native `EventSource` with headers.
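-
-For the connection itself, a sketch of a helper (names are illustrative, not existing code; it assumes the cookie-session model above):
-
-```ts
-// Open the tenant event stream with the session cookie attached.
-// `onSnapshot` receives each full snapshot; the caller owns close().
-export function openTenantEventStream(
-  apiBase: string,
-  ns: string,
-  tenant: string,
-  onSnapshot: (events: unknown[]) => void,
-): EventSource {
-  const url = `${apiBase}/api/v1/namespaces/${ns}/tenants/${tenant}/events/stream`;
-  const es = new EventSource(url, { withCredentials: true });
-  es.onmessage = (e) => {
-    const parsed = JSON.parse(e.data) as { events: unknown[] };
-    onSnapshot(parsed.events); // full snapshot replaces previous state (§4.2)
-  };
-  return es; // caller must close() on route change / unmount
-}
-```
-
-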
- -### 4.2 Lifecycle (Tenant detail client) - -| Moment | Behavior | -|--------|----------| -| User opens **Events** tab | `EventSource` connect; show loading until first `data` or error | -| First `data` | `setEvents(parsed.events)` | -| Further `data` | Replace list with new snapshot (PRD: full snapshot each time) | -| SSE error / disconnect | Non-blocking **toast**; keep last good list; offer **Retry** (close + reopen EventSource) | -| `namespace` / `name` route change | **Close** EventSource, **clear** events state, open new stream | -| Leave page / unmount | `eventSource.close()` | - -Do **not** load events in the initial `Promise.allSettled` batch that currently calls `listTenantEvents`; remove that call. - -### 4.3 CORS and cookies - -- Align with [`server.rs`](src/console/server.rs) `CORS_ALLOWED_ORIGINS` for dev split-host (e.g. Next on `localhost:3000`, API on another port). -- `credentials: "include"` for `fetch` is already used; **EventSource** must mirror with **`withCredentials: true`** so preflight + cookie behavior matches PRD §5.2 / §6. - -### 4.4 Client-side filters (PRD §5.2 / §5.3) - -- **Type:** `Normal` | `Warning` + optional “show raw / Other” for unknown `event_type`. -- **Kind:** multi-select `Tenant` | `Pod` | `StatefulSet` | `PersistentVolumeClaim`. -- **Name:** substring on `involved_object` or `involvedObject.name` if exposed separately in DTO. -- **Time:** filter by parsed `last_timestamp` (and `first_timestamp` if needed) within UI range. - -Keep filter state **local** to the Events tab; do not add query params to SSE URL in Phase 1. - -### 4.5 Types - -- Reuse `EventItem` / `EventListResponse` in `types/api.ts`. -- Add const arrays / unions for **kind** and **type** enums (PRD §5.3). - -### 4.6 i18n - -- Reuse existing “No events” / error strings; add short strings for filter labels and retry if missing. - ---- - -## 5. Non-functional - -| Topic | Design choice | -|-------|----------------| -| **RBAC** | User must `list`/`watch` `events` and `list` pods, statefulsets, persistentvolumeclaims, tenants (same as today + PVC list). Document in deploy notes. | -| **Multi-replica Console** | SSE is sticky to one pod unless using a shared informer; PRD §6: document **ingress sticky sessions** or single replica for Phase 1. | -| **Gateway / proxy (PRD §6)** | Default **read timeouts** (e.g. Nginx **60s**) can **silently close** idle SSE connections → client reconnects. **Deploy checklist:** increase `proxy_read_timeout` (or Envoy equivalent) for the console API route; tune together with optional **server heartbeat** (comment lines) if needed. | -| **Limits** | One watch + periodic discovery per **active** SSE connection; cap snapshots at 200 rows. | - ---- - -## 6. Testing & verification - -| Layer | Suggestion | -|-------|------------| -| **Rust** | Unit tests for `Scope` building from fake `list` results; **Tenant kind filter** (same name, different kind → excluded); dedupe/sort/cap pure functions. | -| **E2E / manual** | PRD §8: Pod Warning ~15s; two tenants same NS isolation; tab close drops connection (server log); **§8.7** Tenant events only (`kind` + `name`). | -| **Integration** | Without REST, prefer **SSE client** (e.g. `curl -N` with cookie, or headless browser) or add **temporary internal** JSON endpoint if product selects PRD §7 variant. | -| **Frontend** | Component test: mock `EventSource` or stream parser; filter logic unit tests. | - -**Project gate:** `make pre-commit` before merge. - ---- - -## 7. Implementation order (suggested) - -1. 
Extract shared **tenant scope** / label helpers; add **PVC** list by label (aligned with PRD §4.1). -2. Implement **Tenant kind** + `(kind, name)` filtering; remove reliance on name-only field selector for Tenant leg. -3. Implement **SSE handler** + snapshot pipeline; manual `curl -N` with cookie or browser. -4. **Remove** `GET .../events` and frontend `listTenantEvents`; wire **EventSource** on Events tab (`withCredentials`). -5. Add **filters** UI + polish errors / retry. -6. OpenAPI + CHANGELOG + **deploy notes** (sticky + **proxy read timeout** + optional ops-only REST variant if chosen). - ---- - -## 8. Risks & follow-ups - -| Risk | Mitigation | -|------|------------| -| High event volume in namespace | Namespace-wide watch + filter; tune refresh; monitor CPU. | -| `events.k8s.io` only clusters | Out of Phase 1; add later if needed. | -| EventSource CORS in dev | Align `CORS_ALLOWED_ORIGINS` and `withCredentials`. | -| Ingress/proxy idle timeout | **proxy_read_timeout** / equivalent; document in runbook (PRD §6). | -| REST removal | Scripts/tests use SSE or optional internal API; track in PRD §2 / §7 decision. | - ---- - -*End of technical design.* From 2ae838c026dadaee3c61ed009815ec115609f028 Mon Sep 17 00:00:00 2001 From: GatewayJ <835269233@qq.com> Date: Thu, 30 Apr 2026 15:43:52 +0800 Subject: [PATCH 2/2] chore(git): ignore local .codex file --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 754ba70..2518983 100755 --- a/.gitignore +++ b/.gitignore @@ -28,4 +28,5 @@ CONSOLE-INTEGRATION-SUMMARY.md SCRIPTS-UPDATE.md AGENTS.md docs/ +.codex .codex/