feat: datumctl compute plugin — deploy and manage workloads from the CLI#113
Draft
scotwells wants to merge 24 commits into
scotwells added a commit that referenced this pull request on May 29, 2026
…cheduling base
After rebasing onto feat/federated-deployment-scheduling, go.mod had picked up the wrong versions of two deps via conflict resolution:
- go.datum.net/network-services-operator was left at v0.1.0 (from #113's old go.mod side) instead of v0.21.10-... required by HEAD's LocationBinding usage
- go.miloapis.com/service-catalog v0.0.0-20260527221104 transitively requires milo v0.26.1, which has a broken downstreamclient (Apply method missing, ClusterName type mismatch).
Add a replace directive to pin milo to v0.25.2 (the version used by the federated-scheduling base) so downstreamclient compiles cleanly. service-catalog is updated to the latest available version.
Also apply gofmt alignment fixes surfaced by the rebase on instance_controller.go.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
a63c87a to
c1186cb
Adds the datumctl-compute plugin binary with commands for deploying and managing containerized workloads on Datum Cloud via the developer CLI.
Commands:
- deploy: create or update a workload from flags or a manifest file
- destroy: delete a workload and clean up its revision history
- status: show health, placement summary, and recent revision info
- instances: list and describe running instances across cities
- scale: adjust minimum replica count across placements
- rollout: watch live progress, view history, and roll back revisions
- restart: trigger a rolling restart of a workload or specific city
- quota: inspect per-city instance usage and quota headroom
Closes #98. Depends on datum-cloud/datumctl#198.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Within a project's virtual control plane, all resources live in the
"default" namespace — the project slug is only used to route to the
right control plane URL. Updated all commands to use
util.ResourceNamespace ("default") instead of the project name as the
k8s namespace.
Also corrects the instance type default from "d1-standard-2" to
"datumcloud/d1-standard-2" to match the format the admission webhook
requires.
Discovered while testing against the staging environment.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The datumctl module requirement was upgrading controller-runtime to v0.23.3, which broke compatibility with multicluster-runtime and milo. Eliminated the dependency by:
- Inlining the --plugin-manifest protocol in main.go
- Reading DATUM_API_HOST and DATUM_CREDENTIALS_HELPER from env directly in util/client.go instead of via plugin.Context()/plugin.Token()
- Reading DATUM_ORG from env in root.go instead of via plugin.NewRootCmd
- Dropping the now-unreachable internal/cmd/compute/client.go
Also updates CI workflows to use go-version-file instead of a pinned go 1.24.0, and bumps golangci-lint to v2.12.2, which supports go 1.25.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
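A minimal sketch of reading those settings straight from the environment, as the commit above describes. Only the three variable names come from the commit; the `pluginEnv` type, the `loadPluginEnv` helper, and the choice to treat `DATUM_ORG` as optional are assumptions made for illustration, not the code in util/client.go or root.go.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// pluginEnv groups the values read from the environment.
// The struct itself is hypothetical; only the variable names are from the commit.
type pluginEnv struct {
	APIHost           string // DATUM_API_HOST
	CredentialsHelper string // DATUM_CREDENTIALS_HELPER
	Org               string // DATUM_ORG (assumed optional in this sketch)
}

// loadPluginEnv reads the settings through an injectable lookup function
// so the logic can be exercised without touching the real environment.
func loadPluginEnv(lookup func(string) (string, bool)) (pluginEnv, error) {
	host, ok := lookup("DATUM_API_HOST")
	if !ok || host == "" {
		return pluginEnv{}, errors.New("DATUM_API_HOST is not set")
	}
	helper, ok := lookup("DATUM_CREDENTIALS_HELPER")
	if !ok || helper == "" {
		return pluginEnv{}, errors.New("DATUM_CREDENTIALS_HELPER is not set")
	}
	org, _ := lookup("DATUM_ORG")
	return pluginEnv{APIHost: host, CredentialsHelper: helper, Org: org}, nil
}

func main() {
	env, err := loadPluginEnv(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Printf("host=%s org=%s\n", env.APIHost, env.Org)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` inside the helper, keeps the sketch testable with a plain map.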
Upgrades controller-runtime from v0.21.0 to v0.23.3 and multicluster-runtime from v0.21.0-alpha.8 to v0.23.3, which unblocks adding go.datum.net/datumctl as a direct dependency.
The CLI plugin (datumctl-compute) now uses the official datumctl plugin SDK:
- plugin.ServeManifest() for the --plugin-manifest protocol
- plugin.NewRootCmd() for pre-wired org/project/output flags
- plugin.Context() and plugin.Token() for credential access
Controller breaking changes addressed: ClusterName distinct type, Watches callback signature, NewWebhookManagedBy generic API.
A local milo provider fork is added at internal/provider/milo since the upstream package hasn't been updated for the ClusterName type change.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses 63 lint findings across errcheck, goconst, gocyclo, gofmt, prealloc, staticcheck, and unparam linters:
- gofmt/goimports: reformat cmd/main.go, deploy.go, util/client.go, webhook
- errcheck: assign discarded fmt.Fprint* and Flush returns to _
- staticcheck: update webhook to generic admission.Defaulter[T]/Validator[T] with WithDefaulter/WithValidator; fix SA4010 unused append in quota.go; remove redundant .ObjectMeta selectors in restart.go
- unparam: rename four never-used function parameters to _
- gocyclo: extract helpers from watch.Rollout and quota.runQuota to reduce cyclomatic complexity below threshold
- goconst: extract repeated string literals to named constants across controllers, validation, and tests
- prealloc: preallocate slices with known capacity in validation and tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- errcheck: fix unchecked fmt.Fprint* returns in deploy, quota, rollout, scale
- prealloc: preallocate allErrs in workload_validation.go and stateful test
- gofmt: reformat destroy.go, instances.go, rollout.go
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- golangci.yml: exclude errcheck for internal/cmd/*; ignoring write errors on stdout/stderr is idiomatic in CLI tools
- prealloc: preallocate allErrs in validateScaleSettingMetrics
- gofmt: reformat status.go, instance_controller_test.go
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire ValidArgsFunction on every command that accepts a workload name (deploy, destroy, restart, rollout, rollout history, rollout undo, scale, status) and register flag completion for instances --workload. All completions call a shared CompleteWorkloadNames helper in internal/cmd/compute/util that fetches live workload names from the API and always returns ShellCompDirectiveNoFileComp so the shell never falls back to filename completion.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove ValidArgsFunction from deploy and replace with util.CompleteWorkloadNamesAndFlags, which wraps CompleteWorkloadNames with plugin.WithFlagCompletion from the datumctl SDK.
- Add plugin.WithFlagCompletion to the datumctl plugin SDK so any plugin can get the same behaviour by wrapping their own ValidArgsFunction.
- Bump go.datum.net/datumctl to b44de1c (adds WithFlagCompletion).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the hardcoded datum-control-plane ClusterIssuer from the csi-webhook-cert component. DNS names stay since they are fixed by the service name and namespace. Each consuming overlay now supplies the issuer via a strategic merge patch, allowing different environments to use different cert issuers without forking the component.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The cert issuer name is environment-specific configuration that belongs in the infra repo, not the compute overlay. The infra repo's base manager patch already owns the full webhook-server-tls volume definition including the issuer. Consumers deploying outside infra must patch the issuer in their own overlay.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a printer.go with PrintJSON and PrintYAML helpers that commands can use to emit API resources as structured output. Extend completion.go with CompleteInstanceNames, CompleteCityCodes, and CompleteOutputFormats so all -o/--output, --city, and instance-name completions are driven from a single shared source.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
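For illustration, a JSON printer helper of the kind named above could look roughly like this. The `PrintJSON` name is from the commit, but its signature here, the `RenderJSON` convenience wrapper, the `instanceRow` type, and the sample values are assumptions; the actual printer.go may differ. A `PrintYAML` counterpart would need a third-party YAML package and is left out of this stdlib-only sketch.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// PrintJSON writes obj to w as two-space-indented JSON followed by a newline.
// The signature is assumed for this sketch.
func PrintJSON(w io.Writer, obj any) error {
	enc := json.NewEncoder(w)
	enc.SetIndent("", "  ")
	return enc.Encode(obj)
}

// RenderJSON is a hypothetical convenience wrapper that returns the same
// output as a string, which is handy in tests.
func RenderJSON(obj any) (string, error) {
	var buf bytes.Buffer
	if err := PrintJSON(&buf, obj); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// instanceRow is an illustrative stand-in for an API resource.
type instanceRow struct {
	Name string `json:"name"`
	City string `json:"city"`
}

func main() {
	rows := []instanceRow{{Name: "web-0", City: "dfw"}}
	if err := PrintJSON(os.Stdout, rows); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
	}
}
```

Taking an `io.Writer` rather than printing to stdout directly lets every command share the helper while tests capture the output in a buffer.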
Both commands now accept -o/--output with tab-completion. json/yaml emit the underlying API resource (InstanceList) or structured quota rows respectively. wide adds an INSTANCE TYPE column for instances. --no-headers suppresses the header row for table and wide. City completion is wired to CompleteCityCodes and instance describe gains tab-completion via CompleteInstanceNames.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add datumctl compute workloads (list) and workloads describe <name> commands.
The list command shows NAME/HEALTH/READY/PLACEMENTS/IMAGE/AGE columns with --health and --city filters, -o table|wide|json|yaml, and a footer summary.
The describe command replaces status with a unified config+health view: header block, per-placement per-city ready counts with inline degradation annotations, and a container spec block.
Remove the now-redundant status command from root.go and delete its package.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix duplicate TYPE/INSTANCE TYPE columns in instances -o wide (W3): populate TYPE from runtimeKind (sandbox/vm), INSTANCE TYPE from instType
- Fix footer bucketing in instances list (W4): compute Running/Pending/Failed from actual status strings instead of hardcoding Failed=0
- Skip revision ConfigMap Gets in workloads list table mode (W5): only fetch per-workload revision when -o wide is requested, avoiding N round-trips on every list invocation
- Compute health footer tallies after filters are applied (W9): previously counted all workloads then printed a filtered subset, making the summary misleading when --health or --city filters were active
- Fix gofmt import ordering in workloads.go (B1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Before creating a workload, the deploy command now checks whether the required network(s) exist. If a network is missing, the user is offered the option to create a minimal auto-IPAM network in-place rather than hitting an opaque NetworkNotFound error post-submission.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… API
- Add EnsureComputeEntitlement to gate all compute commands on an active service entitlement; prompts TTY users to request access and surfaces approval status
- Rewrite quota command to query AllowanceBucket resources from the project VCP (milo-system namespace) instead of deriving usage from instance quota conditions
- Add NewPlatformClient targeting the platform API server for ResourceRegistration lookups
- Extract ListServiceQuota into util so other service plugins can reuse the quota display logic with their own resource type prefix and display metadata overrides
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hand-rolled HTTP entitlement code with a proper client-go implementation using go.miloapis.com/service-catalog types. Uses client.WithWatch to stream events from the API server and unblocks as soon as the Ready condition appears, with no polling interval.
Also adds ASCII progress bar to quota table output.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
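The commit does not show how the quota progress bar is drawn, so the following is only one plausible way to render usage against a limit as fixed-width ASCII. The `progressBar` name, the `#`/`-` characters, and the clamping behaviour are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// progressBar renders used/limit as a fixed-width ASCII bar such as [###-------].
// Usage is clamped to the range [0, limit]; a non-positive limit yields an empty bar.
func progressBar(used, limit int64, width int) string {
	if width <= 0 {
		return "[]"
	}
	filled := 0
	if limit > 0 {
		if used < 0 {
			used = 0
		}
		if used > limit {
			used = limit
		}
		// Integer division rounds down, so the bar is full only at the limit.
		filled = int(used * int64(width) / limit)
	}
	return "[" + strings.Repeat("#", filled) + strings.Repeat("-", width-filled) + "]"
}

func main() {
	fmt.Println(progressBar(3, 10, 10), "3/10") // prints "[###-------] 3/10"
}
```

Rounding down is a deliberate choice in this sketch: a bar that looks full while headroom remains would be more misleading than one that slightly under-reports.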
The compute CLI client now serializes network-services-operator types (Network, NetworkBinding, SubnetClaim), so deploy can preflight and create networks on the user's behalf.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Deployment revisions are becoming a platform concept rather than a client concern. Remove the ConfigMap-backed revision ledger the CLI maintained per workload, along with the 'rollout history' and 'rollout undo' subcommands and the revision column in 'workloads'. 'rollout' remains as a live-progress watch.
This also removes the only code path that serialized core/v1 ConfigMaps from the CLI, so the missing-corev1-scheme warning on deploy no longer occurs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… resolution
The first conflict resolution in the aa9dc15 commit accidentally truncated workload_webhook.go, dropping the ValidateCreate method, its kubebuilder marker, and producing a syntactically invalid Default function body (extra brace + wrong return signature). Restore the file to match 5486adf's content (the authoritative post-lint-migration version).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
c1186cb to
8c15212
The platform now stamps city-code, workload-name, workload-deployment-name, and placement-name directly onto Instances at creation time. The CLI can therefore resolve CITY/WORKLOAD/placement directly from those labels without performing cross-object joins.
The prior approach keyed the WorkloadDeployment map on UID and looked up instances via WorkloadDeploymentUIDLabel. That UID is the edge/Karmada WD UID, which differs from the project-cluster WD UID, causing the join to fail across federation planes and producing "unknown"/"orphaned" output.
The new label-first path reads CityCodeLabel, WorkloadNameLabel, PlacementNameLabel, and WorkloadDeploymentNameLabel (name is identical across all planes) before falling back to the WD Get/List join. A wdNameFromInstanceName helper strips the trailing ordinal suffix from the Instance name as a last-resort fallback for instances created before the labels existed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
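The label-first lookup with a name-based last resort can be sketched as follows. `wdNameFromInstanceName` is named in the commit, but its body here is a guess at "strip the trailing ordinal suffix"; the label key strings are placeholders (the real keys sit behind the constants the commit names), and the intermediate WD Get/List join is omitted from this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// Placeholder label keys; the real values are defined by the platform.
const (
	cityCodeLabel               = "example.test/city-code"
	workloadDeploymentNameLabel = "example.test/workload-deployment-name"
)

// wdNameFromInstanceName strips a trailing "-<digits>" ordinal from an
// instance name. Names without such a suffix are returned unchanged.
func wdNameFromInstanceName(name string) string {
	i := strings.LastIndex(name, "-")
	if i <= 0 || i == len(name)-1 {
		return name
	}
	for _, r := range name[i+1:] {
		if r < '0' || r > '9' {
			return name
		}
	}
	return name[:i]
}

// resolveDeploymentName prefers the label and falls back to the name.
// The WD Get/List join that sits between the two in the commit is not modeled.
func resolveDeploymentName(instanceName string, labels map[string]string) string {
	if v := labels[workloadDeploymentNameLabel]; v != "" {
		return v
	}
	return wdNameFromInstanceName(instanceName)
}

// resolveCity reads the city from the label, or reports "unknown".
func resolveCity(labels map[string]string) string {
	if v := labels[cityCodeLabel]; v != "" {
		return v
	}
	return "unknown"
}

func main() {
	// An instance created before the labels existed: no labels at all.
	fmt.Println(resolveDeploymentName("web-dfw-0", nil), resolveCity(nil))
	// prints "web-dfw unknown"
}
```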
Summary
Adds the `datumctl compute` plugin so developers can deploy and manage containerized workloads on Datum Cloud directly from the CLI.
Commands shipped:
- `deploy`: push a container image as a workload with flags or a manifest file; waits for rollout
- `destroy`: tear down a workload with a confirmation prompt
- `status`: show workload health, per-city placement summary, and the active revision
- `instances`: list all running instances across cities, with describe for full detail
- `scale`: adjust minimum replica count across all placements
- `rollout`: watch live rollout progress, browse revision history, and roll back to any prior revision
- `restart`: trigger a rolling restart of a workload or a specific city
- `quota`: inspect per-city instance usage and surface quota-exceeded messages
Revision history is stored as a ConfigMap per workload so `rollout history` and `rollout undo` work without server-side tracking.
Dependencies
`go.mod` currently uses a `replace` directive pointing at that PR's worktree; the directive should be removed and replaced with a release tag once that PR merges.
What's not included
- `logs`: telemetry service not yet implemented
- `cities` / `instance-types` resource listing commands
Related
Closes #98. Design proposal in #111.