sat-geoip builds a satellite-internet intelligence dataset from operator GeoIP feeds, subnet-to-PoP mappings, live BGP announcements, RIR/RPKI ownership evidence, and historical snapshots. It emits CSV, JSONL, and MaxMind DB artifacts that keep geolocation, PoP assignment, PTR observations, and routing state as separate evidence layers.
| Resource | Location |
|---|---|
| Interactive dashboard | GitHub Pages |
| Latest dataset release | GitHub Releases |
| Generated artifacts | outputs/ |
| Operator registry | config/operators.yaml |
| Example evidence | examples/acceptance_evidence.json |
Satellite networks do not map cleanly to conventional GeoIP assumptions. A customer subnet, a declared PoP, a reverse DNS hint, and a BGP origin are different facts with different failure modes. sat-geoip preserves those facts as separate fields and records the quality signals that connect or contradict them.
The pipeline is designed for data-engineering and infrastructure workflows:
- ingest operator-published geofeeds and PoP maps;
- correlate them with live BGP origin state;
- retain explicit semantics for every derived field;
- generate stable machine-readable artifacts for enrichment, routing analytics, and infrastructure inventories.
flowchart LR
A["Operator GeoIP feeds"] --> E["Evidence model"]
B["Operator PoP feeds"] --> E
C["RIPEstat BGP state"] --> E
D["RIR / RPKI / PTR layers"] --> E
E --> R["Resolver"]
R --> J["sat-geoip-prefixes.jsonl"]
R --> C1["CSV reports"]
R --> M["sat-geoip.mmdb"]
R --> S["release statistics"]
The resolver applies fixed precedence rules:
| Field | Winning source | Notes |
|---|---|---|
| Operator | live BGP origin, then RIR org match | ASNs are treated as a discovered set |
| GeoIP | operator geofeed | customer-subnet location semantics |
| PoP | official PoP feed | PTR is corroboration only |
| Routing state | BGP collectors | geofeeds do not imply live routing |
| Ground station claim | constant false | never inferred from GeoIP or PoP data |
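The precedence table can be read as a small pure function. The following is a minimal sketch under stated assumptions, not the project's resolver: the `Evidence` and `Resolved` types, their field names, and the `not_announced` state string are hypothetical and chosen only to mirror the table rows.

```go
package main

import "fmt"

// Evidence holds hypothetical per-source observations for one prefix.
// The field names are illustrative, not taken from the sat-geoip source.
type Evidence struct {
	BGPOriginOperator string // operator implied by the live BGP origin, "" if none
	RIROrgOperator    string // operator implied by an RIR organisation match
	GeofeedCountry    string // country declared in the operator geofeed
	PoPFeedCode       string // PoP code from the official PoP feed
	PTRPoPCode        string // PoP hint derived from reverse DNS
	BGPAnnounced      bool   // whether BGP collectors currently see the prefix
}

// Resolved is the outcome after applying the precedence rules.
type Resolved struct {
	Operator           string
	GeoIPCountry       string
	PoPCode            string
	PoPCorroborated    bool
	BGPState           string
	GroundStationClaim bool
}

// Resolve applies one rule per field, in the order of the table above.
func Resolve(e Evidence) Resolved {
	r := Resolved{
		// GeoIP comes only from the operator geofeed.
		GeoIPCountry: e.GeofeedCountry,
		// PoP comes only from the official feed; PTR never sets it.
		PoPCode: e.PoPFeedCode,
		// PTR can corroborate an existing PoP assignment, nothing more.
		PoPCorroborated: e.PoPFeedCode != "" && e.PoPFeedCode == e.PTRPoPCode,
		// "not_announced" is an assumed label; only "announced" appears in the README.
		BGPState: "not_announced",
		// The ground station claim is constant false and never inferred.
		GroundStationClaim: false,
	}
	// Operator: live BGP origin wins, the RIR org match is the fallback.
	r.Operator = e.BGPOriginOperator
	if r.Operator == "" {
		r.Operator = e.RIROrgOperator
	}
	// Routing state comes from BGP collectors, never from a geofeed entry.
	if e.BGPAnnounced {
		r.BGPState = "announced"
	}
	return r
}

func main() {
	r := Resolve(Evidence{
		BGPOriginOperator: "starlink",
		GeofeedCountry:    "PH",
		PoPFeedCode:       "mnlaphl1",
		PTRPoPCode:        "mnlaphl1",
		BGPAnnounced:      true,
	})
	fmt.Printf("%+v\n", r)
}
```

Keeping each rule on its own line makes it harder for a PTR hint or a geofeed entry to leak into a field it is not allowed to decide.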
| Dataset metric | Count |
|---|---|
| Prefixes | 13859 |
| Announced prefixes | 10323 |
| GeoFeed-only prefixes | 3536 |
| BGP-only prefixes | 9398 |
| Prefixes with PoP assignment | 3303 |
| Ground station claims | 0 |
| Operator | Prefixes |
|---|---|
| anuvu | 59 |
| avanti | 23 |
| bentley_walker | 34 |
| caprock | 3 |
| carnival | 1 |
| castor_marine | 5 |
| china_satcom | 84 |
| esa | 81 |
| eutelsat_skylogic | 278 |
| gazprom_space_systems | 45 |
| gilat_telecom | 233 |
| gogo_business_aviation | 1 |
| hispasat | 40 |
| hughes | 675 |
| inmarsat | 55 |
| intelsat | 94 |
| intelsat_general | 3 |
| iridium | 12 |
| itc_global | 8 |
| kacific | 26 |
| kt_sat | 4 |
| kuiper | 2 |
| kvh | 62 |
| marlink | 29 |
| nasa_jpl | 15 |
| navarino | 12 |
| nbn_sky_muster | 450 |
| omniaccess | 13 |
| oneweb | 17 |
| panasonic_avionics | 13 |
| rignet | 1 |
| rocket_lab | 2 |
| royal_caribbean | 1 |
| rscc | 30 |
| satcom_direct | 4 |
| ses_o3b | 22 |
| sky_perfect_jsat | 6 |
| spacex_infrastructure | 2 |
| speedcast | 106 |
| starlink | 5632 |
| swarm | 1 |
| tampnet | 6 |
| telesat | 44 |
| telespazio | 13 |
| thales_avionics | 1 |
| thuraya | 10 |
| turksat | 948 |
| usap | 19 |
| viasat | 4534 |
| yahsat | 100 |
| Orbit class | Prefixes |
|---|---|
| deep_space | 15 |
| geo | 2723 |
| geo_mss | 29 |
| geo_or_hybrid_satellite | 4914 |
| geo_or_multi_orbit | 94 |
| hybrid_satellite_offshore | 6 |
| leo | 5668 |
| meo | 22 |
| mixed_satellite | 388 |
The checked-in outputs/ directory is generated from live public feeds. The example evidence fixture remains in the repository to exercise acceptance cases and deterministic tests.
The current registry covers LEO, MEO, GEO, mobility, maritime, aviation, research, space-agency, and launch-infrastructure networks. The full machine-readable ASN list is emitted in outputs/satellite-asns.csv; the maintained registry is config/operators.yaml.
| Coverage class | Examples | Evidence layers | GeoFeed |
|---|---|---|---|
| LEO / MEO satellite internet | Starlink, OneWeb, Iridium, Amazon Leo / Kuiper, SES/O3b | GeoIP/PoP where published; BGP, RDAP/RPKI model | Starlink active |
| GEO / hybrid satellite internet | Viasat, Hughes, Inmarsat, Telesat, Yahsat, Hispasat, Kacific, Thaicom, Turksat, China Satcom, KT Sat | GeoIP where published; BGP, RDAP/RPKI model | Viasat active |
| Maritime, aviation, and remote service providers | Marlink, Speedcast, KVH, Anuvu, Panasonic Avionics, Satcom Direct, Gogo, NSSLGlobal, OmniAccess, Castor Marine, Navarino, Tampnet | BGP, RDAP/RPKI model | Anuvu/MTNSAT active |
| Regional VSAT and teleport operators | Avanti, Eutelsat/Skylogic, Telespazio, Sky Perfect JSAT, RSCC, Gazprom Space Systems, Gilat Telecom, APSTAR, NBN Sky Muster | BGP, RDAP/RPKI model | not found |
| Space and research infrastructure | Swarm, SpaceX infrastructure, KSAT, USAP, NASA JPL, ESA, CNES, Rocket Lab | BGP, RDAP/RPKI model | not found |
| Mobility and cruise-line networks | Carnival, Royal Caribbean, Thales Avionics, Lufthansa Systems, RigNet, CapRock, ITC Global | BGP, RDAP/RPKI model | not found |
- Go resolver with typed evidence and canonical resolved-prefix records.
- Operator registry covering satellite internet operators, MSS providers, mobility integrators, cruise-line networks, research networks, and space infrastructure ASNs.
- RFC 8805 geofeed parser and Starlink PoP CSV parser.
- RIPEstat announced-prefix parser for live BGP state.
- CSV, JSONL, and MaxMind DB outputs.
- Release statistics in JSON and Markdown.
- Static GitHub Pages dashboard generated from release outputs.
- GitHub Actions workflow for scheduled dataset builds and release publishing.
- Tests for the acceptance cases that guard field semantics and confidence separation.
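As an illustration of the geofeed parsing listed above, here is a minimal sketch of an RFC 8805 line parser. It assumes the standard `prefix,country,region,city,postal` column order with `#` comment lines, and it accepts only CIDR prefixes (RFC 8805 also allows a bare IP address, which this sketch rejects). `GeofeedEntry` and `ParseGeofeed` are hypothetical names, not the project's API.

```go
package main

import (
	"bufio"
	"fmt"
	"net/netip"
	"strings"
)

// GeofeedEntry is one parsed geofeed row (hypothetical type for this sketch).
type GeofeedEntry struct {
	Prefix  netip.Prefix
	Country string
	Region  string
	City    string
}

// ParseGeofeed parses geofeed text, skipping blank lines and '#' comments.
func ParseGeofeed(text string) ([]GeofeedEntry, error) {
	var out []GeofeedEntry
	sc := bufio.NewScanner(strings.NewReader(text))
	for n := 1; sc.Scan(); n++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		f := strings.Split(line, ",")
		// Pad short rows so the optional trailing columns can be read safely.
		for len(f) < 4 {
			f = append(f, "")
		}
		p, err := netip.ParsePrefix(strings.TrimSpace(f[0]))
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", n, err)
		}
		out = append(out, GeofeedEntry{
			Prefix:  p.Masked(), // normalise host bits to zero
			Country: strings.ToUpper(strings.TrimSpace(f[1])),
			Region:  strings.TrimSpace(f[2]),
			City:    strings.TrimSpace(f[3]),
		})
	}
	return out, sc.Err()
}

func main() {
	feed := "# sample feed\n14.1.64.0/24,PH,,Manila,\n2001:db8::/32,US,,,\n"
	entries, err := ParseGeofeed(feed)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Println(e.Prefix, e.Country, e.City)
	}
}
```

Splitting on commas is enough here because RFC 8805 fields are not quoted; a feed that deviates from the RFC would need a more tolerant reader.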
git clone https://github.com/ipanalytics/Sat-geoip.git
cd Sat-geoip
go test ./...
go run ./cmd/sat-geoip -format release -evidence examples/acceptance_evidence.json -out outputs
Build from live public sources:
go run ./cmd/sat-geoip -format live-release -out outputs
sat-geoip is a standard Go module.
go install ./cmd/sat-geoip
For reproducible CI builds, use Go 1.24 or newer.
Generate resolved records from an evidence file:
go run ./cmd/sat-geoip \
-format jsonl \
-evidence examples/acceptance_evidence.json
Generate all release artifacts:
go run ./cmd/sat-geoip \
-format release \
-evidence examples/acceptance_evidence.json \
-out outputs
Generate all artifacts from live public feeds:
go run ./cmd/sat-geoip \
-format live-release \
-out outputs
Update the README statistics block from release stats:
go run ./cmd/sat-geoip \
-format update-readme-stats \
-stats outputs/stats.json \
-readme README.md
| File | Description |
|---|---|
| sat-geoip-prefixes.jsonl | Canonical resolved records, one prefix per line |
| sat-geoip-prefixes.csv | Flattened resolved-prefix table |
| sat-geoip.mmdb | MaxMind DB for prefix lookups |
| satellite-asns.csv | Operator ASN seed registry |
| operator-geofeeds.csv | Known operator feed URLs and formats |
| operator-gateway-reference.csv | Gateway country reference metadata; not customer GeoIP |
| prefix-changes.jsonl | Per-prefix change events compared with the previous committed output |
| prefix-changes.csv | Flattened change event table |
| history-summary.json | Release-level history counters |
| starlink-geoip-vs-bgp.csv | Starlink geofeed and BGP comparison |
| starlink-pop-mapping.csv | Starlink prefix-to-PoP mapping |
| pops-vs-ptr-mismatch.csv | PTR/PoP disagreement report |
| stats.json | Machine-readable release statistics |
| RELEASE_NOTES.md | Markdown body for GitHub Releases |
Canonical JSONL records follow the resolved-prefix schema:
{
"prefix": "14.1.64.0/24",
"operator": "starlink",
"operator_group": "spacex",
"service_type": "satellite_internet",
"orbit_class": "leo",
"origin_asn": 45700,
"origin_as_name": "IDNIC-STARLINK-AS-ID",
"geoip_country": "PH",
"geoip_city": "Manila",
"geoip_source": "starlink_feed_csv",
"geoip_semantics": "customer_subnet_geoip_location",
"pop_code": "mnlaphl1",
"pop_iata": "mnl",
"pop_source": "starlink_pops_csv",
"bgp_state": "announced",
"ground_station_claim": false,
"active_user_claim": true,
"quality_flags": ["geoip_valid", "bgp_announced", "origin_asn_expected"],
"data_confidence": {
"attribution": 0.997,
"geo": 0.85
}
}
data_confidence.attribution and data_confidence.geo are intentionally separate. Attribution answers whether a prefix belongs to the operator set. Geo confidence answers whether the declared location label is internally consistent.
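A consumer can keep the two confidence values apart when decoding a JSONL line. The sketch below maps a subset of the schema fields shown above onto Go structs; the `Record` type, the `Usable` helper, and its threshold parameters are assumptions made for illustration, not part of sat-geoip.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Confidence mirrors the data_confidence object from the schema above.
type Confidence struct {
	Attribution float64 `json:"attribution"`
	Geo         float64 `json:"geo"`
}

// Record covers a subset of the resolved-prefix fields.
type Record struct {
	Prefix         string     `json:"prefix"`
	Operator       string     `json:"operator"`
	GeoIPCountry   string     `json:"geoip_country"`
	BGPState       string     `json:"bgp_state"`
	QualityFlags   []string   `json:"quality_flags"`
	DataConfidence Confidence `json:"data_confidence"`
}

// DecodeRecord parses one JSONL line into a Record.
func DecodeRecord(line []byte) (Record, error) {
	var rec Record
	if err := json.Unmarshal(line, &rec); err != nil {
		return Record{}, fmt.Errorf("decode record: %w", err)
	}
	return rec, nil
}

// Usable is a hypothetical consumer policy: each question gets its own
// threshold, so a strong attribution never vouches for a weak location.
func Usable(rec Record, minAttribution, minGeo float64) (attributionOK, geoOK bool) {
	return rec.DataConfidence.Attribution >= minAttribution,
		rec.DataConfidence.Geo >= minGeo
}

func main() {
	line := []byte(`{"prefix":"14.1.64.0/24","operator":"starlink","geoip_country":"PH",` +
		`"bgp_state":"announced","data_confidence":{"attribution":0.997,"geo":0.85}}`)
	rec, err := DecodeRecord(line)
	if err != nil {
		panic(err)
	}
	attrOK, geoOK := Usable(rec, 0.9, 0.9)
	fmt.Println(rec.Prefix, "attribution ok:", attrOK, "geo ok:", geoOK)
}
```

With the example thresholds, the record is accepted for operator attribution but not for location, which is the distinction the two fields are meant to preserve.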
The pipeline includes local reference datasets under data/reference:
| Source | Use |
|---|---|
| GeoNames countryInfo, admin1CodesASCII, cities1000 | country, subdivision, and city-country validation |
| OurAirports airports.csv | IATA airport code to country validation |
These files are validation references, not GeoIP sources. They improve quality flags such as geoip_invalid_country_city_pair and support PoP/gateway sanity checks without overriding operator-published geofeed semantics.
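A validation pass of this kind might look like the sketch below. It assumes a lookup from city name to the set of countries that contain a city of that name, which in practice would be loaded from GeoNames cities1000; here it is a tiny hand-written map. `CityPairFlags` is a hypothetical helper, and treating `geoip_valid` as the positive counterpart of `geoip_invalid_country_city_pair` is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// cityCountries maps a lower-cased city name to the ISO country codes that
// contain a city with that name. Hand-written stand-in for a GeoNames table.
var cityCountries = map[string]map[string]bool{
	"manila":  {"PH": true},
	"seattle": {"US": true},
}

// CityPairFlags returns quality flags for a declared (country, city) pair.
// It only annotates the record; the declared values are never rewritten.
func CityPairFlags(country, city string) []string {
	countries, known := cityCountries[strings.ToLower(strings.TrimSpace(city))]
	if !known {
		// Unknown or empty city: the reference data gives no verdict.
		return nil
	}
	if countries[strings.ToUpper(strings.TrimSpace(country))] {
		return []string{"geoip_valid"}
	}
	return []string{"geoip_invalid_country_city_pair"}
}

func main() {
	fmt.Println(CityPairFlags("PH", "Manila")) // consistent pair
	fmt.Println(CityPairFlags("US", "Manila")) // inconsistent pair
	fmt.Println(CityPairFlags("PH", ""))       // no city declared
}
```

The function returns flags rather than a corrected location, which matches the stated role of the reference data: it can contradict a geofeed entry but not override it.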
- Scheduled releases run from GitHub Actions and publish a date-tagged dataset release.
- Live builds fetch public operator feeds and RIPEstat Data API responses.
- Release jobs update README.md, outputs/, and the GitHub Release body with dataset statistics.
- first_seen, last_seen, changed_at, and change_type are repository snapshot history fields. They describe when sat-geoip first observed or changed a record, not when the operator originally allocated or routed the prefix.
- Prefix change reports compare the current build against the previously committed sat-geoip-prefixes.jsonl artifact.
- Raw snapshot retention is part of the long-term roadmap; current checked-in outputs represent the latest generated release artifact set.
- The resolver keeps evidence layers separate by design. Consumers should select the field appropriate to their workflow instead of collapsing fields into a single location.
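The comparison between the current build and the previously committed artifact can be sketched as a map diff keyed by prefix. This is an illustration only: the `Change` type, the `added`, `removed`, and `changed` labels, and the idea of comparing one serialized string per prefix are assumptions, since the README does not list the actual change_type values.

```go
package main

import (
	"fmt"
	"sort"
)

// Change is one per-prefix change event (hypothetical shape).
type Change struct {
	Prefix     string
	ChangeType string // assumed labels: "added", "removed", "changed"
}

// Diff compares two snapshots. Each map goes from prefix to a fingerprint
// of the resolved record, for example its serialized JSONL line.
func Diff(prev, curr map[string]string) []Change {
	var out []Change
	for p, now := range curr {
		old, existed := prev[p]
		switch {
		case !existed:
			out = append(out, Change{p, "added"})
		case old != now:
			out = append(out, Change{p, "changed"})
		}
	}
	for p := range prev {
		if _, still := curr[p]; !still {
			out = append(out, Change{p, "removed"})
		}
	}
	// Map iteration order is random; sort so reports are stable across runs.
	sort.Slice(out, func(i, j int) bool { return out[i].Prefix < out[j].Prefix })
	return out
}

func main() {
	prev := map[string]string{
		"192.0.2.0/24":    `{"pop_code":"a"}`,
		"198.51.100.0/24": `{"pop_code":"b"}`,
	}
	curr := map[string]string{
		"192.0.2.0/24":   `{"pop_code":"c"}`,
		"203.0.113.0/24": `{"pop_code":"d"}`,
	}
	for _, c := range Diff(prev, curr) {
		fmt.Println(c.Prefix, c.ChangeType)
	}
}
```

Because the diff is computed against the last committed output, its timestamps describe repository observations, consistent with the note on first_seen and changed_at above.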
- satellite ISP prefix enrichment in network inventory systems;
- comparing operator-declared GeoIP data with live BGP announcements;
- tracking Starlink PoP assignment changes;
- generating MMDB enrichment files for edge and analytics pipelines;
- auditing feed consistency across country, city, PoP, and origin-AS layers.
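For the enrichment use case, a consumer that loads the prefix table into memory needs a longest-prefix match. The sketch below uses a linear scan with the standard `net/netip` package, which is adequate for small tables; a production pipeline would more likely query the MMDB artifact or a trie. `Entry`, `MustEntry`, and `LookupOperator` are hypothetical names, and the covering /16 is invented to show the tie-break.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Entry pairs a prefix with the operator label to attach during enrichment.
type Entry struct {
	Prefix   netip.Prefix
	Operator string
}

// MustEntry builds an Entry and panics on a malformed prefix (sketch-only helper).
func MustEntry(prefix, operator string) Entry {
	return Entry{Prefix: netip.MustParsePrefix(prefix).Masked(), Operator: operator}
}

// LookupOperator returns the operator of the most specific prefix containing ip.
func LookupOperator(entries []Entry, ip string) (string, bool) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", false
	}
	best, found := Entry{}, false
	for _, e := range entries {
		// Prefer the longest matching prefix, as a routing table would.
		if e.Prefix.Contains(addr) && (!found || e.Prefix.Bits() > best.Prefix.Bits()) {
			best, found = e, true
		}
	}
	return best.Operator, found
}

func main() {
	table := []Entry{
		MustEntry("14.1.0.0/16", "example_covering"), // invented covering prefix
		MustEntry("14.1.64.0/24", "starlink"),
	}
	op, ok := LookupOperator(table, "14.1.64.10")
	fmt.Println(op, ok)
}
```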
sat-geoip covers satellite-internet data engineering: operator feeds, BGP state, ownership evidence, PoP mappings, release artifacts, and historical change tracking. It does not score users, reputation, abuse, anonymity, or risk.
- Live BGP collection currently uses RIPEstat REST APIs, not MRT/RIS-Live streams.
- PTR and RPKI enrichment are represented in the model but not fully collected in the first release pipeline.
- OneWeb/Eutelsat and most operators other than Starlink and Viasat are BGP-derived until public operator geofeeds are found.
- SES/O3b, Hughes, Marlink, Intelsat, Avanti, Speedcast, Inmarsat, and Thuraya are BGP-derived in the first release because no public RFC 8805 geofeed is known for those operators.
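For the REST-based BGP collection mentioned in the first limitation, the sketch below decodes a RIPEstat announced-prefixes style response from an in-memory sample rather than calling the API. The assumed response shape, a `data.prefixes` array of objects with a `prefix` field, should be checked against the RIPEstat Data API documentation; `ParseAnnounced` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// announcedPrefixes models only the fields this sketch needs from the
// assumed response shape; all other fields are ignored by encoding/json.
type announcedPrefixes struct {
	Data struct {
		Prefixes []struct {
			Prefix string `json:"prefix"`
		} `json:"prefixes"`
	} `json:"data"`
}

// ParseAnnounced extracts the announced prefix strings from a response body.
func ParseAnnounced(body []byte) ([]string, error) {
	var resp announcedPrefixes
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode announced-prefixes response: %w", err)
	}
	out := make([]string, 0, len(resp.Data.Prefixes))
	for _, p := range resp.Data.Prefixes {
		out = append(out, p.Prefix)
	}
	return out, nil
}

func main() {
	// Documentation prefixes stand in for a real operator's announcements.
	sample := []byte(`{"data":{"prefixes":[{"prefix":"192.0.2.0/24"},{"prefix":"2001:db8::/32"}]}}`)
	prefixes, err := ParseAnnounced(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(prefixes)
}
```

A REST snapshot like this reflects routing state at query time only, which is one reason it differs from an MRT or RIS-Live based collection.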
.
├── cmd/sat-geoip/ # CLI entry point
├── config/ # operator registry
├── data/reference/ # validation-only GeoNames and OurAirports datasets
├── examples/ # acceptance evidence fixtures
├── internal/collectors/ # feed and BGP collector helpers
├── internal/export/ # CSV and JSONL writers
├── internal/history/ # per-prefix snapshot history and change reports
├── internal/live/ # live public-source dataset builder
├── internal/mmdb/ # MaxMind DB writer
├── internal/release/ # artifact and statistics generation
├── internal/resolver/ # core evidence resolution engine
├── internal/validators/ # RFC 8805 and PoP parsers
├── outputs/ # generated dataset artifacts
└── site/ # README assets
The repository includes a scheduled GitHub Actions workflow:
.github/workflows/dataset-release.yml
It runs tests, builds live release artifacts, updates README statistics, commits generated files, and publishes a GitHub Release containing the dataset files.
Manual release:
gh workflow run dataset-release.yml
sat-geoip is licensed under the Apache License 2.0.
sat-geoip publishes derived infrastructure data from public sources. Operator feeds and public BGP APIs can be incomplete, delayed, or internally inconsistent; downstream systems should preserve the source semantics included in each record.