feat: federated workload scheduling across POP cells by scotwells · Pull Request #116 · datum-cloud/compute

scotwells · 2026-05-28T16:37:01Z

Summary

Before this PR, a Workload could only run within a single control plane cell. There was no mechanism to schedule workloads across geographically distributed POP cells, no way to enforce resource quotas across cells, and no way for the management plane to aggregate status from instances running in different cells.

This PR delivers the full federated deployment scheduling pipeline:

- Placement-driven scheduling: the WorkloadReconciler (running in the management cluster) reads a Workload's spec.placements[].cityCodes and creates a WorkloadDeployment for each city. Deployments are reconciled by the per-cell WorkloadDeploymentReconciler, which manages Instance lifecycle within the cell.
- Federated write-back: the WorkloadDeploymentFederator pushes each WorkloadDeployment into a shared downstream control plane (Karmada) using namespace-scoped projections. Karmada propagates the deployment to the matching edge cluster via PropagationPolicy.
- Instance projection: the InstanceReconciler in each POP cell writes a copy of every Instance back to the downstream control plane. The InstanceProjector (management plane) reads these write-backs and creates read-only projections in the project namespace, so status from all cells is visible through a single API surface.
- Quota enforcement per cell: each Instance creates a ResourceClaim routed to the Milo project control plane. Quota is evaluated before the scheduling gate is removed from the Instance spec, preventing over-provisioning across cells.
- Location-aware admission: workload admission and scheduling now consult LocationBinding objects (project-scoped, created by the service catalog) rather than the global Location list. A project only sees locations that are both healthy and enabled for that project.
- Webhook TLS via CSI: cert-manager Certificate resources are removed. The webhook server mounts its TLS cert directly from a CSI volume, eliminating the cert-manager dependency for in-cluster issuance.
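
The placement fan-out and LocationBinding filtering described in the summary can be sketched roughly as follows. This is a minimal illustration, not the controller's code: the `Placement` and `plannedDeployment` types, the `planDeployments` helper, and the `<workload>-<placement>-<city>` naming scheme are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Placement models only the part of spec.placements[] this sketch needs.
type Placement struct {
	Name      string
	CityCodes []string
}

// plannedDeployment is a hypothetical stand-in for a WorkloadDeployment.
type plannedDeployment struct {
	Name     string
	CityCode string
}

// planDeployments returns one planned deployment per (placement, city code)
// pair, skipping city codes for which the project has no LocationBinding.
// The naming scheme is an assumption for illustration only.
func planDeployments(workload string, placements []Placement, bound map[string]bool) []plannedDeployment {
	seen := map[string]bool{}
	var out []plannedDeployment
	for _, p := range placements {
		for _, city := range p.CityCodes {
			if !bound[city] {
				continue // city not bound to this project: no deployment
			}
			name := fmt.Sprintf("%s-%s-%s", workload, p.Name, city)
			if seen[name] {
				continue // ignore duplicate city codes within a placement
			}
			seen[name] = true
			out = append(out, plannedDeployment{Name: name, CityCode: city})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	placements := []Placement{{Name: "edge", CityCodes: []string{"DFW", "LHR", "XXX"}}}
	bound := map[string]bool{"DFW": true, "LHR": true}
	for _, d := range planDeployments("web", placements, bound) {
		fmt.Println(d.Name, d.CityCode)
	}
}
```

With two bound city codes and one unbound, only two deployments are planned, which matches the behavior the test plan checks for.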

Test plan

- go test ./... passes locally
- Chainsaw e2e tests: full-federation, instance-writeback, instance-projection, propagation-policy-lifecycle
- Create a Workload with two city-code placements and confirm two WorkloadDeployment objects appear
- Confirm PropagationPolicy is created in the downstream control plane for each city code
- Confirm Instance write-backs appear in the downstream control plane and are projected back into the project namespace
- Verify quota claim is created per Instance and the scheduling gate is removed once granted
- Verify a placement referencing a city code not in the project's LocationBindings is skipped (no deployment created)
- Verify management controllers do not run in edge-cell mode (--enable-management-controllers=false)
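
The quota step in the test plan (the gate is removed only once the claim is granted) can be illustrated with a small sketch. The gate name, the `quotaState` values, and the `instance` type below are hypothetical placeholders, not the project's actual API.

```go
package main

import "fmt"

// quotaGate is a hypothetical scheduling gate name used only in this sketch.
const quotaGate = "example.com/quota"

type quotaState string

const (
	quotaGranted quotaState = "Granted"
	quotaDenied  quotaState = "Denied"
	quotaUnknown quotaState = "Unknown"
)

// instance models only the scheduling gates of an Instance spec.
type instance struct {
	SchedulingGates []string
}

// reconcileGate removes the quota gate only when the claim has been granted.
// In every other state the instance is returned unchanged, so it stays gated.
func reconcileGate(inst instance, state quotaState) instance {
	if state != quotaGranted {
		return inst
	}
	kept := make([]string, 0, len(inst.SchedulingGates))
	for _, g := range inst.SchedulingGates {
		if g != quotaGate {
			kept = append(kept, g)
		}
	}
	inst.SchedulingGates = kept
	return inst
}

func main() {
	inst := instance{SchedulingGates: []string{quotaGate, "example.com/other"}}
	fmt.Println(reconcileGate(inst, quotaUnknown).SchedulingGates)
	fmt.Println(reconcileGate(inst, quotaGranted).SchedulingGates)
}
```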

Breaking changes

- The WorkloadDeployment.spec.location field is removed; location is now derived from spec.cityCode.
- Cert-manager Certificate resources for webhook TLS are removed; overlays must provide a CSI volume source instead (see config/overlays/).

Defines the Karmada-based federation architecture for compute workload scheduling. Covers control plane topology, resource locations, creation and deletion flows, instance visibility, operator changes, auto scaling model, and namespace mapping conventions.
Resolves #85
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Workloads targeting a city location are now automatically routed to the correct physical site via a Karmada-based federation layer. Each POP cell operates independently, instance health is surfaced back to the control plane in real time, and the platform remains available even when parts of the control plane are temporarily unreachable.
Controllers added:
- WorkloadDeploymentFederator: replicates WDs into Karmada and manages PropagationPolicies per city code
- InstanceProjector: mirrors Instance write-backs from Karmada into the project namespace on the control plane
ResourceInterpreterCustomization deployed at config time teaches Karmada how to aggregate replica counts and conditions across POP cells.
Operator flags --enable-management-controllers and --enable-cell-controllers allow each deployment to opt into only the controllers it needs.
Includes a 6-test Chainsaw e2e suite covering federation, deletion cascade, propagation policy lifecycle, instance projection, instance write-back, and the full end-to-end chain.
Resolves #85
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…edge
Introduces management-plane and cell overlay paths to the compute OCI artifact so the infra repo can deploy compute-manager in the correct mode for each tier of the federation architecture.
The management-plane overlay deploys compute-manager with only WorkloadDeploymentFederator and InstanceProjector enabled, connected to the Karmada downstream control plane via projected ServiceAccount token auth.
The cell overlay deploys compute-manager with only WorkloadDeploymentReconciler and InstanceReconciler enabled, with no downstream connection or webhook server.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Remove the hardcoded datum-control-plane ClusterIssuer from the csi-webhook-cert component. DNS names stay since they are fixed by the service name and namespace. Each consuming overlay now supplies the issuer via a strategic merge patch, allowing different environments to use different cert issuers without forking the component.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Each WorkloadDeployment is routed to exactly one cell cluster via its PropagationPolicy, so aggregation across multiple members is not needed. Replace the summing logic with a direct pass-through of the single member's status.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
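
The difference between summing and pass-through can be shown with a short Go sketch. It only illustrates the logic; however the Karmada customization is actually expressed, the `memberStatus` type and `passThroughStatus` helper here are hypothetical, and treating more than one member as an error is an assumption of this example rather than something the commit states.

```go
package main

import (
	"errors"
	"fmt"
)

// memberStatus is a hypothetical per-cluster status item.
type memberStatus struct {
	Cluster       string
	Replicas      int
	ReadyReplicas int
}

// passThroughStatus returns the single member's status unchanged.
// With one-to-one routing there is nothing to sum; more than one member
// is reported as an error here (an assumption of this sketch).
func passThroughStatus(members []memberStatus) (memberStatus, error) {
	switch len(members) {
	case 0:
		return memberStatus{}, nil // nothing reported yet
	case 1:
		return members[0], nil
	default:
		return memberStatus{}, errors.New("expected at most one member cluster")
	}
}

func main() {
	st, err := passThroughStatus([]memberStatus{{Cluster: "cell-dfw", Replicas: 3, ReadyReplicas: 2}})
	fmt.Println(st, err)
}
```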

The cert issuer name is environment-specific configuration that belongs in the infra repo, not the compute overlay. The infra repo's base manager patch already owns the full webhook-server-tls volume definition including the issuer. Consumers deploying outside infra must patch the issuer in their own overlay.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…moval
- dev: inline self-signed Issuer + Certificate for host.docker.internal, replace kustomize replacements block with direct annotation patch, remove Certificate-patching from webhook_patch.yaml, and clear webhookServer secretRef from config.yaml.
- single-cluster: replace cert-manager Certificate approach with the csi-webhook-cert component, matching the main branch overlay.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The WorkloadReconciler watches networkingv1alpha.Network objects, which requires the network-services-operator CRDs to be installed. Cell clusters don't have those CRDs, causing the manager to crash on startup. Gate the WorkloadReconciler behind enableManagementControllers so it only runs where the Network CRDs are present.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Extracts server config file reading and decoding into a dedicated loadServerConfig helper, reducing main's cyclomatic complexity from 31 to 29 to satisfy the gocyclo linter limit of 30.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Milo's authorization webhook uses Extra claims on the admission request (iam.miloapis.com/parent-name, iam.miloapis.com/parent-type, etc.) to resolve the correct project-scoped policy binding. Dropping them caused the SAR to return Allowed=false even for users with networks.use, because the authorizer couldn't locate the binding without the project context.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

metricRules belongs under spec.quota, not spec.billing. The field is not declared in the ServiceBillingConfig schema, causing Flux dry-run failures in staging with:
.spec.billing.metricRules: field not declared in schema

Previously, InstanceReconciler wrote ResourceClaim objects against the local deployment cluster via managementCluster.GetClient(). Those claims were never seen by the Milo quota system, leaving every Instance in QuotaGranted=Unknown indefinitely.
This change routes claim creation and deletion to the correct Milo project control plane for each instance using a new ProjectQuotaClientManager that builds per-project REST clients by rewriting the host path, mirroring the URL construction already used by the milomulticluster provider.
The management-cluster claim watch is replaced with a multicluster Watches call so that grant/denial status changes in project control planes re-trigger instance reconciles. Claims are stamped with a source-cluster label (discovery.clusterName) so each edge controller only reacts to the claims it created.
Co-Authored-By: Claude <claude@anthropic.com>
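
A rough sketch of the per-project host rewriting idea follows. It uses only the standard library; the `hostCache` type is a hypothetical stand-in for the manager, and the `/projects/<id>/control-plane` path suffix is a placeholder assumption, not the real Milo URL layout.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"path"
	"sync"
)

// projectHost appends a project-specific path to the base API host.
// The path layout is a placeholder assumption for this sketch.
func projectHost(baseHost, project string) (string, error) {
	if project == "" {
		return "", errors.New("project must not be empty")
	}
	u, err := url.Parse(baseHost)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("base host %q must include scheme and host", baseHost)
	}
	u.Path = path.Join("/", u.Path, "projects", project, "control-plane")
	return u.String(), nil
}

// hostCache memoizes rewritten hosts per project, standing in for a
// manager that would cache one REST client per project.
type hostCache struct {
	mu    sync.Mutex
	base  string
	hosts map[string]string
}

func newHostCache(base string) *hostCache {
	return &hostCache{base: base, hosts: map[string]string{}}
}

func (c *hostCache) hostFor(project string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if h, ok := c.hosts[project]; ok {
		return h, nil
	}
	h, err := projectHost(c.base, project)
	if err != nil {
		return "", err
	}
	c.hosts[project] = h
	return h, nil
}

func main() {
	cache := newHostCache("https://milo.example.com")
	fmt.Println(cache.hostFor("proj-1"))
}
```

Caching per project avoids rebuilding the same target on every reconcile; a real manager would hold a client rather than a string.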

The admission webhook requires that all metrics referenced in spec.quota.limits[].metric and spec.quota.metricRules[].metricCosts match a name declared in spec.metrics[]. The four quota-tracking metrics (workloads, instances, vcpus, memory) were missing from spec.metrics[], causing the webhook to reject the resource.
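
The rule the webhook enforces can be approximated as a set check. The sketch below is an illustration under assumptions: `undeclaredMetrics` is a hypothetical helper, and the metric names are passed as plain strings rather than read from the real resource types.

```go
package main

import (
	"fmt"
	"sort"
)

// undeclaredMetrics returns, sorted and de-duplicated, every referenced
// metric name that does not appear in the declared list.
func undeclaredMetrics(declared, referenced []string) []string {
	known := make(map[string]bool, len(declared))
	for _, name := range declared {
		known[name] = true
	}
	missingSet := map[string]bool{}
	for _, name := range referenced {
		if !known[name] {
			missingSet[name] = true
		}
	}
	missing := make([]string, 0, len(missingSet))
	for name := range missingSet {
		missing = append(missing, name)
	}
	sort.Strings(missing)
	return missing
}

func main() {
	declared := []string{"workloads"}
	referenced := []string{"workloads", "instances", "vcpus", "memory", "vcpus"}
	fmt.Println(undeclaredMetrics(declared, referenced))
}
```

An empty result means every reference resolves to a declared metric; anything else would be grounds for rejection under the rule described above.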

…o cell setup
Controller flags --enable-management-controllers and --enable-cell-controllers now default to false so kustomize components must explicitly opt in, rather than both groups running by default. This prevents the management-plane deployment from crashing when discovery.clusterName is unset: that field is only required by the InstanceReconciler (a cell controller), so the validation now lives in InstanceReconciler.SetupWithManager instead of initializeClusterDiscovery.
Also adds cell-controllers and management-controllers components to the single-cluster overlay, which was silently running with no controllers enabled.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
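
The opt-in defaults and the relocated validation can be sketched with the standard `flag` package. The two flag names come from the commit message; `controllerFlags`, `parseControllerFlags`, and `validateClusterName` are hypothetical helpers, and the cluster name is passed as a plain string rather than read from the real config file.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// controllerFlags holds the two opt-in switches.
type controllerFlags struct {
	Management bool
	Cell       bool
}

// parseControllerFlags parses the opt-in flags; both default to false.
func parseControllerFlags(args []string) (controllerFlags, error) {
	var f controllerFlags
	fs := flag.NewFlagSet("compute-manager", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch
	fs.BoolVar(&f.Management, "enable-management-controllers", false, "run management-plane controllers")
	fs.BoolVar(&f.Cell, "enable-cell-controllers", false, "run cell controllers")
	if err := fs.Parse(args); err != nil {
		return controllerFlags{}, err
	}
	return f, nil
}

// validateClusterName requires a cluster name only when cell controllers
// are enabled, so a management-only deployment starts without one.
func validateClusterName(f controllerFlags, clusterName string) error {
	if f.Cell && clusterName == "" {
		return errors.New("discovery.clusterName is required when cell controllers are enabled")
	}
	return nil
}

func main() {
	f, err := parseControllerFlags([]string{"--enable-management-controllers"})
	fmt.Println(f, err, validateClusterName(f, ""))
}
```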

…scovery
The rebase during cherry-pick propagation introduced a mixed state where cmd/main.go had the edgeClusterName/projectRestConfig return values partially reverted. This cleans up the function signature and call sites to be consistent, while keeping the validation removed from initializeClusterDiscovery (it belongs in InstanceReconciler.SetupWithManager per the original fix intent).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… RBAC
The workload-deployment-federator calls ensureDownstreamNamespace before federating WorkloadDeployment resources, but the compute-manager ClusterRole was missing core-group namespace permissions, causing every reconcile to fail with a forbidden error.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Workload scheduling and admission now consult LocationBinding objects (project-scoped, created by the service catalog) rather than the global Location list. This ensures consumers only see locations that are both healthy and available to their specific project.
Also upgrades network-services-operator and milo dependencies to versions that introduce LocationBinding and address multicluster-runtime v0.23 API changes (ClusterName type, ProviderRunnable Start lifecycle, generic webhook builder).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ources
WorkloadDeploymentReconciler creates and owns NetworkBinding and SubnetClaim resources, and watches Location, NetworkContext, and Subnet. InstanceReconciler watches ResourceClaim for quota. Neither was granted the necessary ClusterRole rules, causing watch failures on cell clusters.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

From the cell cluster's perspective, Karmada is upstream (the federation control plane), not downstream. Rename the flag, env var, and related variables throughout to reflect the actual relationship.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…viderRunnable fix
Points go.miloapis.com/milo to the feature branch commit that implements multicluster.ProviderRunnable on the Milo provider, enabling the mc manager to auto-call provider.Start() and set p.mcAware so project clusters can be registered. Without this, p.mcAware was always nil and every project reconcile logged "Multicluster manager not yet started" forever.
Also removes the & from ResourceRef in ResourceClaimSpec: the feature branch has ResourceRef as a value type, not a pointer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Remove non-existent QuotaRestConfig() call and fix SetupWithManager argument count; pass nil quota config to skip quota enforcement for now. Single-tenant cell mode uses namespace-as-project-id and the fixed 'single' cluster name.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Wires up Milo ResourceClaim-based quota accounting for cells running in single-cell discovery mode (mode: single), where the multicluster ClusterName is always "single" rather than the Milo project name.
Key changes:
- Add QuotaKubeconfigPath config field and QuotaRestConfig() method so quota REST config can be configured independently of discovery mode. Returns (nil, nil) when neither path is set, disabling quota rather than silently targeting the local apiserver.
- Add projectIDForInstance and clusterNameForProject func fields to InstanceReconciler. In single mode, project ID is derived from instance.Namespace; the watch map func always enqueues ClusterName "single" rather than the project namespace, avoiding ErrClusterNotFound on every quota-grant event.
- Guard ResourceClaim watch map func against claims with empty ResourceRef to prevent a nil-dereference panic when a label-matching claim from another actor has no ResourceRef set.
- Add TestReconcileQuotaSingleMode covering the full single-mode quota flow: project ID from namespace, watch re-enqueue to "single" cluster.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
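
The guarded watch map function and the single-mode cluster name mapping might look roughly like this. Everything here is a simplified assumption: the label key, the `resourceClaim`, `resourceRef`, and `reconcileRequest` types, and the choice to derive the project from the ResourceRef namespace are all hypothetical and stand in for the real controller-runtime and Milo types.

```go
package main

import "fmt"

// sourceClusterLabel is a hypothetical label key for this sketch.
const sourceClusterLabel = "example.com/source-cluster"

type resourceRef struct {
	Namespace string
	Name      string
}

type resourceClaim struct {
	Labels      map[string]string
	ResourceRef resourceRef
}

type reconcileRequest struct {
	ClusterName string
	Namespace   string
	Name        string
}

// claimMapper turns claim events into instance reconcile requests.
type claimMapper struct {
	sourceCluster         string
	clusterNameForProject func(project string) string
}

// mapClaim ignores claims created by other clusters and claims whose
// ResourceRef is empty, then enqueues the referenced instance.
func (m claimMapper) mapClaim(c resourceClaim) []reconcileRequest {
	if c.Labels[sourceClusterLabel] != m.sourceCluster {
		return nil // not created by this edge controller
	}
	if c.ResourceRef.Name == "" {
		return nil // nothing to enqueue; avoids acting on an unset ref
	}
	return []reconcileRequest{{
		ClusterName: m.clusterNameForProject(c.ResourceRef.Namespace),
		Namespace:   c.ResourceRef.Namespace,
		Name:        c.ResourceRef.Name,
	}}
}

// singleModeClusterName always returns the fixed single-mode cluster name.
func singleModeClusterName(string) string { return "single" }

func main() {
	m := claimMapper{sourceCluster: "cell-a", clusterNameForProject: singleModeClusterName}
	c := resourceClaim{
		Labels:      map[string]string{sourceClusterLabel: "cell-a"},
		ResourceRef: resourceRef{Namespace: "proj-1", Name: "vm-0"},
	}
	fmt.Println(m.mapClaim(c))
}
```

Injecting `clusterNameForProject` as a function field keeps the mapping logic identical across discovery modes; only the function supplied differs.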

…tatus change
Write-back was only triggered inside the statusChanged||readyChanged block, so instances stuck in a scheduling gate (no status transitions) were never replicated to Karmada.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…llers
Karmada is upstream from every controller's perspective: cell controllers write instances up to it, management controllers read/write through it.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

scotwells and others added 30 commits May 18, 2026 12:46

Merge branch 'main' into docs/issue-85-karmada-federation-design

105c335

Merge branch 'main' into docs/issue-85-karmada-federation-design

9734bf6

Merge branch 'main' into docs/issue-85-karmada-federation-design

9d96bd5

feat: replace cert-manager certificate resources with CSI volume mounts for webhook TLS

0f69956

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat: remove webhook CA injection — Milo trusts the cert issuer directly

a11861e

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ci: bump Go version to 1.25 to match go.mod requirement

81e73c3

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ci: bump golangci-lint to v2.2.2 for Go 1.25 compatibility

bed3d12

v2.1.5 was built with Go 1.24 and refuses to lint Go 1.25 modules. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ci: bump golangci-lint to v2.12.2 (latest, built with Go 1.25)

0d26598

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

scotwells and others added 5 commits May 28, 2026 15:20

fix: skip upstream write-back when spec, labels, and status are unchanged

3ac5115

Use apiequality.Semantic.DeepEqual to avoid unnecessary API calls to Karmada on every reconcile when nothing has actually changed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
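
The skip-if-unchanged check can be sketched with the standard library. To stay self-contained this example swaps in `reflect.DeepEqual` for the `apiequality.Semantic.DeepEqual` helper named in the commit; note that `reflect.DeepEqual` treats a nil map and an empty map as different, so results may differ from the semantic helper in that case. The `snapshot` type and `needsWriteBack` function are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// snapshot holds the three parts compared before a write-back.
// Plain maps stand in for the real spec, labels, and status types.
type snapshot struct {
	Spec   map[string]string
	Labels map[string]string
	Status map[string]string
}

// needsWriteBack reports whether the upstream copy must be created or
// updated. A nil existing copy always requires a write.
func needsWriteBack(existing *snapshot, desired snapshot) bool {
	if existing == nil {
		return true
	}
	return !reflect.DeepEqual(existing.Spec, desired.Spec) ||
		!reflect.DeepEqual(existing.Labels, desired.Labels) ||
		!reflect.DeepEqual(existing.Status, desired.Status)
}

func main() {
	current := snapshot{
		Spec:   map[string]string{"image": "v1"},
		Labels: map[string]string{"city": "DFW"},
		Status: map[string]string{"phase": "Pending"},
	}
	same := current
	fmt.Println(needsWriteBack(&current, same))

	changed := current
	changed.Status = map[string]string{"phase": "Running"}
	fmt.Println(needsWriteBack(&current, changed))
}
```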

refactor: rename writeBackToDownstream -> writeBackToUpstream

c15161e

From the cell cluster's perspective, Karmada is upstream. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix: remove clusterName requirement in Milo mode for management plane

a5916b9

clusterName is only needed when enableCellControllers is true (cell/edge deployments). Management plane deployments use Milo mode without it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

scotwells wants to merge 35 commits into main from feat/federated-deployment-scheduling
