Remove bootstrap nodes and replace with operator peers by lrsaturnino · Pull Request #3909 · threshold-network/keep-core

lrsaturnino · 2026-03-25T04:23:43Z

Problem

The network relies on centrally-managed bootstrap nodes for initial peer discovery. Their embedded public keys bypass firewall IsRecognized() staking checks via the AllowList, meaning an unstaked or slashed embedded peer retains permanent network access. Operators hardcoding bootstrap addresses in --network.peers will lose connectivity when bootstrap infrastructure is decommissioned.

Solution

- Replace bootstrap node entries with operator peers (/dns4/ or /ip4/ format)
- Decouple firewall AllowList from discovery peers: pass firewall.EmptyAllowList so all peers must pass IsRecognized() staking checks
- Remove dead ExtractPeersPublicKeys function
- Deprecate --network.bootstrap flag with runtime warning
- Rename connected_bootstrap_count metric to connected_wellknown_peers_count
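
To illustrate the AllowList decoupling, here is a minimal, self-contained sketch of how an allowlist bypass interacts with a staking check. It is not the keep-core firewall code: the `Application` interface, the `AllowList` type, `NewAllowList`, `Validate`, `stakedOnly`, and the string-based public keys are all hypothetical simplifications chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Application is a hypothetical stand-in for whatever performs the
// staking check that the PR refers to as IsRecognized().
type Application interface {
	IsRecognized(peerPublicKey string) (bool, error)
}

// AllowList is a hypothetical set of public keys that skip the staking check.
type AllowList struct {
	keys map[string]bool
}

// NewAllowList builds an allowlist from the given keys; with no
// arguments it returns an empty allowlist.
func NewAllowList(keys ...string) *AllowList {
	m := make(map[string]bool, len(keys))
	for _, k := range keys {
		m[k] = true
	}
	return &AllowList{keys: m}
}

// Contains reports whether the key is allowlisted.
func (al *AllowList) Contains(key string) bool { return al.keys[key] }

var errNotRecognized = errors.New("peer is not recognized by the application")

// Validate admits a peer if it is allowlisted, or otherwise only if the
// application recognizes it.
func Validate(peerPublicKey string, allowList *AllowList, app Application) error {
	if allowList.Contains(peerPublicKey) {
		return nil // bypass: the staking check never runs for this peer
	}
	recognized, err := app.IsRecognized(peerPublicKey)
	if err != nil {
		return fmt.Errorf("could not check peer: %w", err)
	}
	if !recognized {
		return errNotRecognized
	}
	return nil
}

// stakedOnly is a toy application that recognizes only the keys in its set.
type stakedOnly map[string]bool

func (s stakedOnly) IsRecognized(key string) (bool, error) { return s[key], nil }

func main() {
	app := stakedOnly{"staked-operator": true}
	oldList := NewAllowList("embedded-bootstrap") // keys extracted from embedded peers
	emptyList := NewAllowList()                   // behavior after the change

	fmt.Println(Validate("embedded-bootstrap", oldList, app) == nil)   // admitted via bypass
	fmt.Println(Validate("embedded-bootstrap", emptyList, app) == nil) // must now be staked
	fmt.Println(Validate("staked-operator", emptyList, app) == nil)    // admitted via staking check
}
```

With an empty allowlist, the only way through `Validate` is the `IsRecognized` path, which is the property the three `TestValidate_EmptyAllowList_*` tests describe.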

Tests

- TestValidate_EmptyAllowList_RecognizedPeerAccepted: recognized peer passes via IsRecognized path
- TestValidate_EmptyAllowList_UnrecognizedPeerRejected: unrecognized peer rejected with no AllowList bypass
- TestValidate_EmptyAllowList_PreviouslyAllowlistedPeerMustPassIsRecognized: previously allowlisted peer no longer bypasses checks
- TestResolvePeers: updated expectations for new operator peer entries (mainnet and testnet)
- TestNetworkBootstrapFlagDescription_ContainsDeprecationNotice: flag description includes deprecation text
- TestIsBootstrap: returns correct boolean value
- TestConnectedWellknownPeersCountMetricName: metric constant has correct value
- TestObserveConnectedWellknownPeersCount_Callable: renamed function exists and executes without panic

Summary by CodeRabbit

Deprecations
- The --network.bootstrap CLI flag is now deprecated and scheduled for removal in v3.0.0.
Network Configuration
- Updated peer node addresses for mainnet and testnet connectivity.
- Updated Electrum service endpoint for testnet.
Metrics
- Enhanced network connectivity metrics for peer monitoring.

Replace all Boar bootstrap node entries with curated DNS-backed operator peers to complete the bootstrap infrastructure decommission. This is the final phase; Staked nodes were removed in v2.5.1.

Peer list changes:
- Mainnet: 2 Boar entries replaced with 5 operator peers at keep-nodes.io
- Testnet: 1 Boar entry replaced with 2 operator peers at test.keep-nodes.io
- All entries use /dns4/ format for IP change resilience

Security (AllowList decoupling):
- Pass firewall.EmptyAllowList instead of extracting embedded peer keys
- All peers (including embedded operators) must now pass IsRecognized() staking checks with no firewall bypass
- Remove dead ExtractPeersPublicKeys function and its tests

Deprecations and renames:
- Deprecate --network.bootstrap flag with runtime warning
- Rename connected_bootstrap_count metric to connected_wellknown_peers_count

Documentation:
- Add operator migration guide covering Boar address removal, --network.peers override behavior, and monitoring updates

Operator migration guidance will be distributed separately from the code release.

Replace placeholder mainnet entries with the beta staker node (143.198.18.229:3919) as the sole embedded peer for initial testing. Testnet placeholders remain until operator coordination is complete.

gotestsum's default `./...` package pattern only applies when no args are passed after `--`. With `-- -timeout 15m`, it forwards args directly to `go test`, which defaults to `.` (root package only). The root package has no test files, so CI has been silently running 0 tests.

Replace placeholder hostnames with actual values from config/_peers/testnet.

Remove TestConnectedWellknownPeersCountMetricName and TestMetricConstants which only assert that string constants equal themselves. The compiler already ensures rename safety. Keep the callable integration test which validates the function exists and executes without panicking.

…d var Change EmptyAllowList from an exported mutable package-level var to an exported function returning the package-level singleton. This prevents external code from accidentally mutating the shared empty allowlist.
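
The var-to-function change described above might look roughly like the following sketch. The `AllowList` type here is a simplified, hypothetical stand-in, and everything sits in `package main` only so the example runs on its own; in a real layout the unexported variable would live in its own package, which is what keeps other packages from reassigning it.

```go
package main

import "fmt"

// AllowList is a hypothetical, simplified allowlist type. Its key set is
// unexported, so callers can query it but cannot add entries directly.
type AllowList struct {
	keys map[string]bool
}

// Contains reports whether the key is allowlisted.
func (al *AllowList) Contains(key string) bool { return al.keys[key] }

// Before (sketch): an exported variable that any importing package could
// reassign, for example `firewall.EmptyAllowList = someOtherList`.
//
//	var EmptyAllowList = &AllowList{keys: map[string]bool{}}

// After (sketch): the singleton is unexported and reachable only through
// an accessor, so code outside the package cannot replace it.
var emptyAllowList = &AllowList{keys: map[string]bool{}}

// EmptyAllowList returns the shared empty allowlist.
func EmptyAllowList() *AllowList { return emptyAllowList }

func main() {
	fmt.Println(EmptyAllowList().Contains("any-key")) // false
	fmt.Println(EmptyAllowList() == EmptyAllowList()) // true: same singleton
}
```

The accessor guards against reassignment of the package-level variable; it relies on the type itself exposing no mutating methods or exported fields to keep the shared value effectively read-only.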

Add a note that connected_wellknown_peers_count was previously named connected_bootstrap_count, so operators can update Prometheus queries and Grafana dashboards accordingly.

Specify concrete removal version so the deprecated flag does not linger indefinitely.
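
As an illustration of a deprecation notice that names a removal version and also warns at runtime, here is a minimal sketch. It uses the standard library `flag` package purely to stay self-contained, which is an assumption and may differ from how the client actually defines its flags; `parseFlags`, the flag set name, and the message wording are hypothetical. The v3.0.0 target is the one stated in the PR summary.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// bootstrapDeprecation is illustrative wording, not the project's actual text.
const bootstrapDeprecation = "DEPRECATED: --network.bootstrap is scheduled for removal in v3.0.0"

// parseFlags parses args and returns the flag value plus any deprecation
// warnings for flags the caller explicitly set.
func parseFlags(args []string) (bool, []string, error) {
	fs := flag.NewFlagSet("client", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output out of the example
	bootstrap := fs.Bool(
		"network.bootstrap",
		false,
		"Run the client in bootstrap mode. "+bootstrapDeprecation,
	)
	if err := fs.Parse(args); err != nil {
		return false, nil, err
	}

	var warnings []string
	// Visit only walks flags that were actually set on the command line,
	// so the warning is emitted only when the deprecated flag is used.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "network.bootstrap" {
			warnings = append(warnings, "warning: "+bootstrapDeprecation)
		}
	})
	return *bootstrap, warnings, nil
}

func main() {
	_, warnings, err := parseFlags(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	for _, w := range warnings {
		fmt.Fprintln(os.Stderr, w)
	}
}
```

Putting the version in both the flag description and the runtime warning means operators see it in `--help` output and in logs.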

Add a TODO comment noting that at least one additional mainnet peer across a different operator/ASN should be added before production rollout to avoid a single point of failure for initial peer discovery.

Go 1.24 vet rejects non-constant format strings in fmt.Errorf. This pre-existing issue was hidden because CI was not running tests.
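
For context, the vet finding concerns calls of the form `fmt.Errorf(msg)` where `msg` is not a constant and no other arguments are passed. The sketch below shows two common ways to avoid it; `describe`, `newError`, and `newErrorf` are hypothetical names for this example and are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// describe builds a message at runtime. The text may contain '%'
// characters, which a format function would try to interpret as verbs.
func describe(peer string) string {
	return "connection to " + peer + " failed at 100%"
}

// Flagged pattern, shown only as a comment so this file passes vet:
//
//	return fmt.Errorf(describe(peer))

// newError uses errors.New, which performs no formatting at all.
func newError(peer string) error {
	return errors.New(describe(peer))
}

// newErrorf keeps fmt.Errorf but passes the message as an argument to a
// constant format string.
func newErrorf(peer string) error {
	return fmt.Errorf("%s", describe(peer))
}

func main() {
	fmt.Println(newError("peer-a"))
	fmt.Println(newErrorf("peer-a"))
}
```

Both variants preserve the message text verbatim. `errors.New` is the simpler choice when no wrapping with `%w` is needed.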

lionakhnazarov

lgtm

## Summary

- CI has been silently running **0 tests** because `gotestsum -- -timeout 15m` (without `./...`) only tests the root package, which has no test files
- Fix: add explicit `./...` so all subpackages are tested
- Also fix `peers_test.go` placeholder hostnames to match actual `config/_peers/testnet` values; this test failure was hidden by the above bug

## Root cause

`gotestsum`'s default `./...` package pattern only applies when **no args** are passed after `--`. With `-- -timeout 15m`, gotestsum forwards args directly to `go test`, which defaults to `.` (current directory only). CI log confirms: `DONE 0 tests in 8.107s`.

## Test plan

- [ ] CI should now run all Go tests (expect 100+ tests instead of 0)
- [ ] `TestResolvePeers/sepolia_network` should pass with corrected hostnames

🤖 Generated with [Claude Code](https://claude.com/claude-code)