34 changes: 33 additions & 1 deletion Makefile
@@ -62,7 +62,39 @@ vet: ## Run go vet against code.

.PHONY: test
test: manifests generate fmt vet envtest ## Run tests.
	KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" go test $$(go list ./... | grep -v /test/e2e) -coverprofile cover.out

##@ E2E Testing

KIND_CLUSTER_NAME ?= elx-nodegroup-e2e
# E2E_IMG is the controller image to load into the kind cluster.
# Override if you want to use a pre-built image: make test-e2e E2E_IMG=myregistry/myimage:tag
E2E_IMG ?= elx-nodegroup-controller:e2e

.PHONY: kind-create
kind-create: ## Create a kind cluster for e2e testing (requires kind: https://kind.sigs.k8s.io)
	kind create cluster --name $(KIND_CLUSTER_NAME) --config test/e2e/kind-config.yaml

.PHONY: kind-delete
kind-delete: ## Delete the e2e kind cluster
	kind delete cluster --name $(KIND_CLUSTER_NAME)

.PHONY: kind-load-and-deploy
kind-load-and-deploy: docker-build ## Build the controller image, load it into the kind cluster, and deploy
	$(CONTAINER_TOOL) tag $(IMG) $(E2E_IMG)
	kind load docker-image $(E2E_IMG) --name $(KIND_CLUSTER_NAME)
	$(KUSTOMIZE) build config/crd | kubectl --context kind-$(KIND_CLUSTER_NAME) apply -f -
	cd config/manager && $(KUSTOMIZE) edit set image controller=$(E2E_IMG)
	$(KUSTOMIZE) build config/default | kubectl --context kind-$(KIND_CLUSTER_NAME) apply -f -
	kubectl --context kind-$(KIND_CLUSTER_NAME) -n elx-nodegroup-controller-system \
		wait --for=condition=available deployment/controller-manager --timeout=120s

.PHONY: test-e2e
test-e2e: ## Run e2e tests against the cluster referenced by KUBECONFIG (controller must be running)
	go test ./test/e2e/... -v -timeout 5m

.PHONY: e2e-full
e2e-full: kind-create kind-load-and-deploy test-e2e kind-delete ## Full e2e lifecycle: create cluster, deploy, test, destroy

##@ Build

239 changes: 226 additions & 13 deletions README.md
@@ -1,30 +1,243 @@
# elx-nodegroup-controller

A Kubernetes controller that automatically applies and persists labels and taints across groups of nodes. Define a `NodeGroup` resource and the controller will ensure all member nodes carry the specified labels and taints — even if they are manually removed or if nodes are replaced.

## Table of Contents

- [Overview](#overview)
- [How It Works](#how-it-works)
- [Installation](#installation)
- [Usage](#usage)
- [NodeGroup API Reference](#nodegroup-api-reference)
- [Examples](#examples)
- [Development](#development)
- [Architecture](#architecture)

## Overview

Managing labels and taints on Kubernetes nodes is often tedious and error-prone — especially in clusters with dynamic node provisioning where nodes come and go. The `elx-nodegroup-controller` solves this by introducing the `NodeGroup` custom resource, which acts as a declarative specification of which nodes should carry which labels and taints.

Key features:

- **Declarative**: Define the desired state once; the controller continuously reconciles it.
- **Persistent**: Labels and taints are automatically reapplied if manually removed.
- **Cleanup on deletion**: When a `NodeGroup` is deleted, the controller removes the labels and taints it previously applied.
- **Dynamic membership**: Nodes can be selected by exact name or by node group name patterns (useful for auto-scaled node groups).

## How It Works

1. You create a `NodeGroup` resource listing the nodes (by name or naming pattern) and the labels/taints you want applied.
2. The controller watches both `NodeGroup` and `Node` resources.
3. On each reconciliation loop, the controller ensures every member node has the specified labels and taints.
4. If a `NodeGroup` is deleted, the controller cleans up all labels and taints it originally applied via a finalizer before allowing the resource to be removed.
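The label-application step can be pictured as an idempotent merge: desired labels are layered over whatever the node currently carries, and an update is issued only when something actually changed. The sketch below is illustrative (plain maps instead of the real `corev1.Node` object), not the controller's exact code.

```go
package main

import "fmt"

// ensureLabels merges the desired labels into the node's current labels and
// reports whether an update is needed. Existing unrelated labels are kept;
// the merge is idempotent, so re-running it is a no-op.
func ensureLabels(current, desired map[string]string) (map[string]string, bool) {
	merged := make(map[string]string, len(current))
	for k, v := range current {
		merged[k] = v
	}
	changed := false
	for k, v := range desired {
		if merged[k] != v {
			merged[k] = v
			changed = true
		}
	}
	return merged, changed
}

func main() {
	current := map[string]string{"kubernetes.io/hostname": "worker-node-1"}
	desired := map[string]string{"workload-type": "compute"}
	merged, changed := ensureLabels(current, desired)
	fmt.Println(changed, merged["workload-type"]) // true compute
}
```

Because the merge reports whether anything changed, the controller can skip the API update on quiet reconciles.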

## Installation

### Prerequisites

- Kubernetes cluster >= 1.24
- `kubectl` configured to talk to your cluster
- `kustomize` (or use `kubectl apply -k`)

### Deploy the Controller and CRD

```bash
kustomize build config/default | kubectl apply -f -
```

This creates the following resources in the `elx-nodegroup-controller-system` namespace:

| Resource | Name |
|----------|------|
| CRD | `nodegroups.k8s.elx.cloud` |
| Namespace | `elx-nodegroup-controller-system` |
| ServiceAccount | `controller-manager` |
| ClusterRole | `manager-role` |
| ClusterRoleBinding | `manager-rolebinding` |
| Deployment | `controller-manager` |

### Uninstall

```bash
kustomize build config/default | kubectl delete -f -
```

## Usage

### NodeGroup API Reference

```yaml
apiVersion: k8s.elx.cloud/v1alpha2
kind: NodeGroup
```

`NodeGroup` is a cluster-scoped resource (no namespace required).

#### Spec

| Field | Type | Description |
|-------|------|-------------|
| `members` | `[]string` | Explicit list of Kubernetes node names to include in this group. |
| `nodeGroupNames` | `[]string` | Node naming patterns for dynamic membership. A node is included if any segment of its name (split on `-`) matches one of these values. |
| `labels` | `map[string]string` | Labels to apply to all member nodes. |
| `taints` | `[]corev1.Taint` | Taints to apply to all member nodes. Each taint has `key`, `value` (optional), and `effect` (`NoSchedule`, `PreferNoSchedule`, or `NoExecute`). |

You can use `members`, `nodeGroupNames`, or both together — their results are combined.
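The combined membership rule can be expressed compactly. The following Go sketch shows the semantics described above (exact-name match on `members`, split-on-`-` segment match on `nodeGroupNames`); it is illustrative and may differ from the controller's actual matching code.

```go
package main

import (
	"fmt"
	"strings"
)

// isMember reports whether a node belongs to a group, combining the explicit
// members list with the nodeGroupNames segment rule: a node matches a pattern
// if any '-'-separated segment of its name equals that pattern.
func isMember(nodeName string, members, nodeGroupNames []string) bool {
	for _, m := range members {
		if m == nodeName {
			return true
		}
	}
	for _, seg := range strings.Split(nodeName, "-") {
		for _, p := range nodeGroupNames {
			if seg == p {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(isMember("gpu-a100-abc123", nil, []string{"gpu"}))          // true
	fmt.Println(isMember("worker-node-1", []string{"worker-node-1"}, nil))  // true
	fmt.Println(isMember("cpu-node-1", nil, []string{"gpu"}))               // false
}
```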

### Examples

#### Apply labels and taints to specific nodes

```yaml
apiVersion: k8s.elx.cloud/v1alpha2
kind: NodeGroup
metadata:
  name: compute-nodes
spec:
  members:
    - worker-node-1
    - worker-node-2
  labels:
    workload-type: compute
    environment: production
  taints:
    - key: workload-type
      value: compute
      effect: NoSchedule
```

#### Dynamic membership via node group name patterns

Useful in clusters where auto-scaling creates nodes with predictable name prefixes. A node named `gpu-a100-abc123` would match the pattern `gpu` because `gpu` is one of the `-`-separated segments of the name.

```yaml
apiVersion: k8s.elx.cloud/v1alpha2
kind: NodeGroup
metadata:
  name: gpu-nodes
spec:
  nodeGroupNames:
    - gpu
  labels:
    hardware: gpu
  taints:
    - key: nvidia.com/gpu
      value: "true"
      effect: NoSchedule
```

#### Mixed: explicit members and dynamic patterns

```yaml
apiVersion: k8s.elx.cloud/v1alpha2
kind: NodeGroup
metadata:
  name: spot-nodes
spec:
  members:
    - specific-spot-node-1
  nodeGroupNames:
    - spot
  labels:
    capacity-type: spot
  taints:
    - key: spot-instance
      value: "true"
      effect: NoExecute
```

#### Common operations

```bash
# Create or update a NodeGroup
kubectl apply -f nodegroup.yaml

# List all NodeGroups in the cluster
kubectl get nodegroups

# Inspect a NodeGroup
kubectl describe nodegroup gpu-nodes

# Delete a NodeGroup (cleans up labels and taints from member nodes)
kubectl delete nodegroup gpu-nodes
```

## Development

### Prerequisites

- Go 1.22+
- Docker
- `controller-gen`
- `kustomize`
- `envtest` (for running tests)

### Build

```bash
make build
```

### Run tests

```bash
make test
```

### Generate CRD manifests and DeepCopy methods

```bash
make manifests
make generate
```

### Build and push the container image

```bash
make docker-build docker-push IMG=<registry>/<image>:<tag>
```

### Deploy from a custom image

```bash
make deploy IMG=<registry>/<image>:<tag>
```

## Architecture

The controller is built with [controller-runtime](https://github.com/kubernetes-sigs/controller-runtime) (Kubebuilder v3 layout).

### Reconciliation flow

```
NodeGroup created/updated
        │
        ▼
Build member list
(spec.members + nodes matching spec.nodeGroupNames)
        │
        ▼
NodeGroup being deleted?
    │           │
   Yes          No
    │           │
    ▼           ▼
Remove        Add finalizer
labels/taints (if missing)
from nodes         │
    │              ▼
    ▼         Apply labels to
Remove        member nodes
finalizer          │
                   ▼
              Apply taints to
              member nodes
```
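The branching above can be sketched as a single reconcile function. The types below are simplified stand-ins for the real API objects (`Deleting` stands in for `metadata.deletionTimestamp`, and finalizer bookkeeping is elided); this is a sketch of the flow, not the controller's implementation.

```go
package main

import "fmt"

// Node and NodeGroup are simplified stand-ins for the real API objects.
type Node struct {
	Name   string
	Labels map[string]string
}

type NodeGroup struct {
	Name     string
	Deleting bool // true when metadata.deletionTimestamp is set
	Labels   map[string]string
	Members  []string
}

// reconcile follows the diagram: on deletion, strip the group's labels from
// member nodes (then the finalizer would be removed); otherwise apply them.
func reconcile(ng *NodeGroup, nodes map[string]*Node) {
	if ng.Deleting {
		for _, name := range ng.Members {
			if n, ok := nodes[name]; ok {
				for k := range ng.Labels {
					delete(n.Labels, k)
				}
			}
		}
		return // finalizer removal elided
	}
	// finalizer addition elided; apply labels (taints follow the same pattern)
	for _, name := range ng.Members {
		if n, ok := nodes[name]; ok {
			for k, v := range ng.Labels {
				n.Labels[k] = v
			}
		}
	}
}

func main() {
	nodes := map[string]*Node{"worker-node-1": {Name: "worker-node-1", Labels: map[string]string{}}}
	ng := &NodeGroup{Name: "compute-nodes", Labels: map[string]string{"workload-type": "compute"}, Members: []string{"worker-node-1"}}
	reconcile(ng, nodes)
	fmt.Println(nodes["worker-node-1"].Labels["workload-type"]) // compute
	ng.Deleting = true
	reconcile(ng, nodes)
	fmt.Println(len(nodes["worker-node-1"].Labels)) // 0
}
```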

### Watching Nodes

The controller also watches `Node` resources. When a node changes (e.g., it is recreated after being replaced by the autoscaler), the controller looks up which `NodeGroup`s list that node as a member and triggers reconciliation for each of them — ensuring labels and taints are immediately reapplied.

### Finalizer

The controller uses the `k8s.elx.cloud/finalizer` finalizer on each `NodeGroup`. This ensures the controller has the opportunity to clean up labels and taints from member nodes before Kubernetes removes the `NodeGroup` object.
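Finalizer handling typically comes down to two small slice helpers, as in the standard Kubebuilder pattern. A minimal sketch, assuming the finalizer name stated above:

```go
package main

import "fmt"

const finalizerName = "k8s.elx.cloud/finalizer"

// containsString reports whether a finalizer is already present.
func containsString(slice []string, s string) bool {
	for _, item := range slice {
		if item == s {
			return true
		}
	}
	return false
}

// removeString returns the slice with the given finalizer removed.
func removeString(slice []string, s string) []string {
	var out []string
	for _, item := range slice {
		if item != s {
			out = append(out, item)
		}
	}
	return out
}

func main() {
	var finalizers []string
	if !containsString(finalizers, finalizerName) {
		finalizers = append(finalizers, finalizerName) // on create/update
	}
	fmt.Println(finalizers)
	finalizers = removeString(finalizers, finalizerName) // after node cleanup
	fmt.Println(len(finalizers)) // 0
}
```

Only once the finalizer is removed does Kubernetes garbage-collect the `NodeGroup` object.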