feat: add gateway-api support #299

Open
l0wl3vel wants to merge 18 commits into master from feat/gatewayapi

@l0wl3vel l0wl3vel commented May 6, 2026

Description

  • Add kind-cloud-controller-manager to provide services of type LoadBalancer
  • Introduce envoy-gateway as the Gateway API implementation
  • Move metal-stack control plane kind cluster into the mini_lab_external docker network
    • The default Docker bridge does not allow selecting the container IP, which we need for the predefined *.nip.io DNS records
  • Keep ingress-nginx for now; it is still required for Dex, Thanos, Gardener, and PowerDNS
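
As a rough illustration of why a fixed container IP matters here: nip.io resolves any hostname of the form <label>.<ip>.nip.io to <ip>, so a predefined record only works if the container actually holds that address. The sketch below builds such a hostname. The function name, the label `api`, and the address 203.0.113.10 are placeholders for illustration, not values taken from this PR.

```python
import ipaddress


def nip_io_hostname(label: str, ip: str) -> str:
    """Return '<label>.<ip>.nip.io', a name that nip.io resolves to <ip>."""
    # IPv4Address raises ValueError for anything that is not a dotted IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    return f"{label}.{addr}.nip.io"


if __name__ == "__main__":
    # Placeholder label and documentation-range address, for illustration only.
    print(nip_io_hostname("api", "203.0.113.10"))
```

If the container came up with a different address than the one baked into the hostname, the record would point at the wrong place, which is presumably why the cluster has to live in a network where the IP can be chosen.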

WIPs

  • Certificates are still a bit messed up (using the default-gateway cert for gRPC termination)
  • Link the metal-roles PR branch to run CI in this pull request until the metal-roles PR is merged
  • CI is failing

Used AI-Tools ✨

  • none used for generation

Closes: #297

Requires: metal-stack/helm-charts#156 and metal-stack/metal-roles#594

Tested configurations

  • Sonic
  • Dell Sonic
  • Gardener (looks good, deploys correctly with metal-stack on GWAPI and Gardener components still on ingress-nginx, further testing required)
  • Kamaji
    • Non-functional, so likely a wontfix unless it gets integrated into mini-lab. Only usable in capi-lab, which uses an old pinned version of mini-lab.

@metal-robot metal-robot Bot added this to Development May 6, 2026
@l0wl3vel l0wl3vel force-pushed the feat/gatewayapi branch 2 times, most recently from 28079c5 to f84c000 Compare May 8, 2026 14:58
@ma-hartma

Contributor

Dell Sonic does actually work, but you need credentials to pull from r.metal-stack.io.

@l0wl3vel l0wl3vel mentioned this pull request May 26, 2026
9 tasks
@vknabel

vknabel commented May 28, 2026

Contributor

Sadly I got the following error:

deploy-control-plane  | TASK [ansible-common/roles/helm-chart : Copy over custom helm charts] **********
deploy-control-plane  | fatal: [localhost]: FAILED! => 
deploy-control-plane  |     changed: false
deploy-control-plane  |     cmd: /usr/bin/rsync --delay-updates -F --compress --delete-after --archive --out-format='<<CHANGED>>%i
deploy-control-plane  |         %n%L' /helm-charts/charts/metal-control-plane /tmp/helm-chart
deploy-control-plane  |     msg: |-
deploy-control-plane  |         rsync: [sender] change_dir "/helm-charts/charts" failed: No such file or directory (2)
deploy-control-plane  |         rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1338) [sender=3.4.1]
deploy-control-plane  |     rc: 23

I had the following override: metal_roles_version: gatewayapi

@l0wl3vel

Author

@vknabel fixed in 629cb02

@l0wl3vel

Author

@Sven-Ric Would you mind taking a look at the network changes?

@l0wl3vel l0wl3vel requested review from Sven-Ric and vknabel May 29, 2026 13:10
@l0wl3vel l0wl3vel marked this pull request as ready for review June 1, 2026 06:53
@l0wl3vel l0wl3vel requested review from a team as code owners June 1, 2026 06:53
@Sven-Ric

Sven-Ric commented Jun 5, 2026

It seems like the kind node always ends up in the default kind network on a clean first run. The kind network is read from .env, which is written by env.sh. However, the Makefile reads .env before env.sh is invoked, so the kind node network falls back to the default. Because .env persists between runs, the bug is masked on all subsequent runs.
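
The ordering problem described above can be simulated without Make or Docker. This is only a sketch under assumptions: the key name `KIND_NETWORK` is hypothetical (the real variable in .env may be named differently), `write_env` stands in for env.sh, and `run_once_fixed` shows one possible ordering rather than the change actually made in this PR.

```python
from pathlib import Path
import tempfile


def read_env(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines from a dotenv-style file; a missing file yields {}."""
    if not path.exists():
        return {}
    env = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def write_env(path: Path) -> None:
    """Stand-in for env.sh: persist the desired network (hypothetical key name)."""
    path.write_text("KIND_NETWORK=mini_lab_internal\n")


def run_once(workdir: Path) -> str:
    """Buggy order: .env is read first and only generated afterwards."""
    env = read_env(workdir / ".env")
    write_env(workdir / ".env")
    return env.get("KIND_NETWORK", "kind")  # falls back to the default network


def run_once_fixed(workdir: Path) -> str:
    """One possible fix: generate .env before anything reads it."""
    write_env(workdir / ".env")
    return read_env(workdir / ".env").get("KIND_NETWORK", "kind")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_once(Path(tmp)))  # clean first run
        print(run_once(Path(tmp)))  # second run, .env already exists
```

In this model the first call sees no .env and returns the default, while the second call picks up the file left behind by the first, which matches the "only broken on a clean run" symptom.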

On initial run:

# docker inspect metal-control-plane-control-plane
[
    {
        <SNIP>
        "NetworkSettings": {
            <SNIP>
            "Networks": {
                "kind": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "DriverOpts": null,
                    "GwPriority": 0,
                    "NetworkID": "6530b19e41b397d41d37f6a38d6b1bbd74c9ba2b7478df95f6a6270cc84c4d0e",
                    "EndpointID": "6d56f5f0fa83330b85e0b0ebbd04175a93d8586c48492c9b545beb7eeecce015",
                    "Gateway": "172.18.0.1",
                    "IPAddress": "172.18.0.2",
                    "MacAddress": "12:98:42:c8:e4:ec",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "fc00:f853:ccd:e793::1",
                    "GlobalIPv6Address": "fc00:f853:ccd:e793::2",
                    "GlobalIPv6PrefixLen": 64,
                    "DNSNames": [
                        "metal-control-plane-control-plane",
                        "bd976835cec0"
                    ]
                }
            }
        },
        "ImageManifestDescriptor": {
            "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
            "digest": "sha256:21c46cf61fd45873f89e6a1bfcba4b7904dffa84c2bec88aeeca9a0409af4725",
            "size": 743,
            "platform": {
                "architecture": "amd64",
                "os": "linux"
            }
        }
    }
]

On all subsequent runs:

# docker inspect metal-control-plane-control-plane
[
    {
        <SNIP>
        "NetworkSettings": {
            <SNIP>
            "Networks": {
                "mini_lab_internal": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "DriverOpts": null,
                    "GwPriority": 0,
                    "NetworkID": "2734b8f942cae84d8693ecd43ab3bb9d5cd71905faf992fbfe5c3df17ddc376b",
                    "EndpointID": "62f8b2a6eb379bb65f13f6441a9249417fc9ce754218a29b699cd7511b393d29",
                    "Gateway": "172.42.0.1",
                    "IPAddress": "172.42.0.2",
                    "MacAddress": "66:e7:b9:9c:2e:39",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "DNSNames": [
                        "metal-control-plane-control-plane",
                        "5b12fbbedfdc"
                    ]
                }
            }
        },
        "ImageManifestDescriptor": {
            "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
            "digest": "sha256:21c46cf61fd45873f89e6a1bfcba4b7904dffa84c2bec88aeeca9a0409af4725",
            "size": 743,
            "platform": {
                "architecture": "amd64",
                "os": "linux"
            }
        }
    }
]
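
To compare the two outputs above without reading the whole JSON, the attached network names can be pulled out programmatically. A minimal sketch, assuming the complete `docker inspect` output is available as a string (the `<SNIP>` markers above are not valid JSON); the function name and the trimmed `SAMPLE` are illustrative only.

```python
import json


def attached_networks(inspect_output: str) -> list[str]:
    """Return the sorted Docker network names of the first inspected container."""
    containers = json.loads(inspect_output)
    return sorted(containers[0]["NetworkSettings"]["Networks"])


# Trimmed-down sample in the shape of the first-run output above.
SAMPLE = '[{"NetworkSettings": {"Networks": {"kind": {"IPAddress": "172.18.0.2"}}}}]'

if __name__ == "__main__":
    print(attached_networks(SAMPLE))
```

A result containing `kind` on a clean first run would indicate that the node fell back to the default network.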

@l0wl3vel

l0wl3vel commented Jun 5, 2026

Author

Thank you so much for checking it out @Sven-Ric. Fixed in baddf29. A few people checked this PR and it worked fine for them, but CI was failing and I had no clue why. You saved me a lot of time 😅

@l0wl3vel l0wl3vel force-pushed the feat/gatewayapi branch 2 times, most recently from e502dc3 to 01fb3d1 Compare June 8, 2026 14:58
l0wl3vel added 10 commits June 10, 2026 05:21
Signed-off-by: Benjamin Ritter <benjamin.ritter@x-cellent.com>
l0wl3vel added 7 commits June 10, 2026 05:24
Development

Successfully merging this pull request may close these issues.

Add GatewayAPI support to mini-lab