15 changes: 15 additions & 0 deletions CLAUDE.md
@@ -219,6 +219,21 @@ phydm parser.
- **rmmod/sysfs-unbind actively de-inits the chip** (RF off, MAC DMA off).
After detaching a kernel driver, expect to re-init from cold, not warm.
`DEVOURER_SKIP_RESET=1` only helps when firmware state is still intact.
- **USB Vbus sag on bus-powered hub chains**: 5 GHz TX draws far more PA current
than 2.4 GHz. Fed through a deep bus-powered hub chain the rail can brown out
the PA. Symptom: frames submit fine (`rc` ok, 0 send-fails) but on-air power
collapses — SDR duty near the noise floor, or fully dark — *intermittently*, and
often on every plugged adapter at once, while 2.4 GHz keeps working. Recovers on
a `uhubctl` power-cycle of the hub tree (the most deeply-nested / highest-PA
adapter may need its own dedicated port cycle). **Do not mis-diagnose it** as a
per-chip dead PA, a 5 GHz code gate, a BT-coex/antenna issue, or an EFUSE
TX-power bug — every one of those was chased and refuted; it was the rail.
Defences: (1) keep a known-good control adapter and re-check it *each session* —
a sagging control silently makes the bench look like per-chip hardware death;
(2) measure TX as on-air **Mbps via SDR duty × PHY rate**, never monitor-sniffer
frame counts — a sensitive receiver decodes weak frames and masks a power
collapse; (3) don't trust a "fix validated" off a single reading on an unstable
rail. Durable fix: powered USB hub / direct root ports.
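
Defence (3) can be made mechanical. The following is a minimal sketch, not part of the repository: `reading_is_trustworthy`, `min_count` and `max_spread` are hypothetical names, and the 15 % spread threshold is an arbitrary illustrative choice rather than a value derived from the bench.

```python
from statistics import median


def reading_is_trustworthy(mbps_readings, min_count=3, max_spread=0.15):
    """Return True only if there are enough readings and they agree.

    max_spread is the allowed (max - min) / median. The default is an
    illustrative assumption, not a calibrated threshold.
    """
    if len(mbps_readings) < min_count:
        return False  # a single reading never validates a fix
    mid = median(mbps_readings)
    if mid <= 0:
        return False  # dark or near-dark rail
    spread = (max(mbps_readings) - min(mbps_readings)) / mid
    return spread <= max_spread


print(reading_is_trustworthy([52.1]))              # one reading only
print(reading_is_trustworthy([52.1, 51.8, 52.3]))  # readings agree
print(reading_is_trustworthy([32.0, 3.1, 0.4]))    # readings collapse
```

The same check applied to the control adapter at the start of each session covers defence (1).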

## TX path

26 changes: 20 additions & 6 deletions README.md
@@ -20,12 +20,21 @@ register-table layout, firmware-download plumbing, and
family; chip-specific EEPROM handling, firmware blobs, and RF tables are
layered on top.

-| Part | RF / streams | 2.4 GHz | 5 GHz UNII-1 (ch36-48) | 5 GHz UNII-2/3 (ch52+) | Notes |
-| -------------- | --------------- | ------------- | ---------------------- | ---------------------- | ------------------------------------------- |
-| **RTL8812AU** | 2T2R | TX + RX | TX + RX | TX + RX | VID/PID `0bda:8812`; reference part |
-| **RTL8811AU** | 1T1R | TX + RX | TX + RX | TX + RX | 1T1R cut of 8812 silicon; rides 8812 code path with `RFType=RF_TYPE_1T1R` selected from `REG_SYS_CFG` bit 27. Status mirrored from 8812; not separately exercised |
-| **RTL8814AU** | 4T4R, 3-SS max | TX + RX | TX + RX | TX + RX | VID/PID `0bda:8813`; 2-SS effective on USB-2 |
-| **RTL8821AU** | 1T1R AC + BT | TX + RX | TX + RX | TX + RX | OEM-rebadged as TP-Link Archer T2U Plus (`2357:0120`) etc. UNII-2/3 TX has cross-receiver asymmetry against 8812AU peers |
+Band cells show **devourer on-air TX throughput** (Mbps, HT MCS7, 20 MHz),
+measured by USRP channel-occupancy (`tests/bench_onair.py`); devourer matches
+wfb-ng on the `svpcom/rtl8812au` driver at parity (see
+[`docs/wfb-ng-tuning.md`](docs/wfb-ng-tuning.md)). The 8812AU is the fully-benchmarked
+reference. `†` = transmits on air, but the on-air rate is **USB-power-bound** on
+this bench and not reproducibly benchmarkable (5 GHz TX is current-hungry; needs a
+powered USB hub / direct root port; see _USB Vbus sag_ in Hardware gotchas); the
+bracketed figure is the best clean reading observed.
+
+| Part | RF / streams | 2.4 GHz (ch6) | UNII-1 (ch36) | UNII-2/3 (ch149) | Notes |
+| -------------- | --------------- | ------------- | ------------- | ---------------- | ------------------------------------------- |
+| **RTL8812AU** | 2T2R | 56 | 52 | 52 | VID/PID `0bda:8812`; reference part, solid on every band |
+| **RTL8811AU** | 1T1R | mirrors 8812 | mirrors 8812 | mirrors 8812 | 1T1R cut of 8812 silicon; rides the 8812 code path with `RFType=RF_TYPE_1T1R` from `REG_SYS_CFG` bit 27. Not separately benchmarked (no working unit on the bench) |
+| **RTL8814AU** | 4T4R, 3-SS max | 65 | †(32) | †(32) | VID/PID `0bda:8813`; 2-SS effective on USB-2. 2.4 GHz saturates the channel; 5 GHz reached 32 Mbps in good moments but sags otherwise on this bench: power-bound, not a chip limit |
+| **RTL8821AU** | 1T1R AC + BT | 54 | 32 | 28 | OEM-rebadged as TP-Link Archer T2U Plus (`2357:0120`). 1T1R; 5 GHz SDR-measured and reproducible here |

Successor families (`Jaguar2` / `Jaguar+` — 8812BU, 8822BU/BE, etc., and
the later `Kestrel` 11ax generation) are **out of scope**: they share
@@ -134,6 +143,11 @@ header before the TX loop:
VHT info field (bit 21). Exposes `DEVOURER_TX_VHT_MCS=N` (VHT MCS
index, 0..9 typical) and `DEVOURER_TX_VHT_NSS=N` (spatial streams).
`_LDPC` / `_STBC` / `_BW` apply to whichever (HT/VHT) mode is active.
- `DEVOURER_TX_PAYLOAD_BYTES=N` — pad the 802.11 PSDU up to `N` bytes (on-wire
`N + 40`). For throughput testing — `N=3993` is wfb-ng's max frame payload.

On-air TX throughput vs wfb-ng (SDR-verified parity; how to reproduce) is
documented in [`docs/wfb-ng-tuning.md`](docs/wfb-ng-tuning.md).

## Using the library

77 changes: 77 additions & 0 deletions docs/wfb-ng-tuning.md
@@ -0,0 +1,77 @@
# wfb-ng efficient configuration & on-air TX throughput

This documents (a) the most efficient wfb-ng configuration for the RTL8812AU and
(b) an SDR-measured on-air TX throughput comparison between **devourer**
(userspace libusb) and **wfb-ng** (kernel `svpcom/rtl8812au` driver).

## Results

On-air channel occupancy measured with a USRP B210 (`tests/sdr_duty.py`) on a
clean 5 GHz channel (ch149), 1500 B frames. `on_air_Mbps = duty × PHY_rate`:

| Config | devourer | wfb-ng (svpcom + `wfb_tx`) |
| ------ | -------- | -------------------------- |
| MCS1 / 20 MHz | 94.6 % duty → 12.3 Mbps | 94.5 % duty → 12.3 Mbps |
| MCS7 / 20 MHz | 80.1 % duty → 52.1 Mbps | 79.8 % duty → 51.9 Mbps |
| MCS7 / 40 MHz | 62.7 % duty → ~85 Mbps | — |

devourer and wfb-ng deliver the same on-air injection throughput. wfb-ng's
*useful* goodput is that figure multiplied by the FEC ratio `k/n = 8/12 ≈ 0.67`,
giving MCS1/20 ≈ 8 Mbps and MCS7/20 ≈ 35 Mbps. This is consistent with wfb-ng's
~7 Mbps default and OpenIPC's ~52 Mbps-total / 36 Mbps-video real-world figures.
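
As a worked check of that arithmetic, here is a small sketch using the devourer on-air figures from the table above. The function name `fec_goodput_mbps` is illustrative and does not exist in wfb-ng or devourer.

```python
def fec_goodput_mbps(on_air_mbps: float, fec_k: int = 8, fec_n: int = 12) -> float:
    """Useful payload rate once k-of-n FEC parity is subtracted."""
    return on_air_mbps * fec_k / fec_n


# On-air figures taken from the results table (devourer column).
for label, on_air in [("MCS1/20", 12.3), ("MCS7/20", 52.1)]:
    print(f"{label}: {fec_goodput_mbps(on_air):.1f} Mbps goodput")
```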

Two regimes are visible: at low MCS the link is **airtime-limited** (≈95 % duty,
the channel is nearly saturated); at high MCS the **host feed** becomes the limit
(duty drops to ~80 % at 20 MHz, ~63 % at 40 MHz) while absolute throughput keeps
rising. Larger frames (up to the 3993 B max payload) raise duty at high MCS by
amortising per-frame overhead.
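
The amortisation effect can be illustrated with a toy model. This is a sketch under strong assumptions: it lumps preamble, inter-frame spacing and host latency into one fixed per-frame gap, infers that gap from the measured 80.1 % duty at 1500 B, and assumes the gap does not change with frame size. The function names are hypothetical, and the value it prints for 3993 B is a model prediction, not a measurement.

```python
def airtime_us(payload_bytes: int, phy_mbps: float) -> float:
    """Payload airtime in microseconds (1 Mbps == 1 bit per microsecond)."""
    return payload_bytes * 8 / phy_mbps


def gap_from_measurement(payload_bytes: int, phy_mbps: float, duty: float) -> float:
    """Fixed per-frame idle gap implied by a measured duty cycle."""
    t = airtime_us(payload_bytes, phy_mbps)
    return t * (1 - duty) / duty


def duty_for_payload(payload_bytes: int, phy_mbps: float, gap_us: float) -> float:
    """Duty cycle predicted for a given payload if the gap stays constant."""
    t = airtime_us(payload_bytes, phy_mbps)
    return t / (t + gap_us)


PHY_MBPS = 65.0  # HT MCS7, 20 MHz, long GI
gap = gap_from_measurement(1500, PHY_MBPS, 0.801)
print(f"implied per-frame gap: {gap:.1f} us")
print(f"model duty at 3993 B: {duty_for_payload(3993, PHY_MBPS, gap):.3f}")
```

Any real gain should be confirmed with `tests/sdr_duty.py`, since the constant-gap assumption may not hold once the USB transfer size changes.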

**Bare-metal vs VM**: the same svpcom driver + injector, run bare-metal and inside
the libvirt VM via qemu-xhci USB passthrough, give identical occupancy
(80.5 % vs 80.4 %). USB passthrough adds no throughput cost here — the limit is
airtime / chip TX, not the USB transport.

## Most efficient wfb-ng config (RTL8812AU)

- **Driver: `github.com/svpcom/rtl8812au`** (module `88XXau_wfb`,
  `sudo ./dkms-install.sh`). It is the wfb-ng injection-tuned driver. Set
  `rtw_tx_pwr_idx_override` to 30–45 (≤63; higher needs active cooling). The
  in-tree rtw88 driver's monitor injection is much slower (~6 Mbps), so use svpcom
  for wfb-ng. The svpcom driver builds on modern host kernels (6.18 here) as well
  as the pinned 5.15.
- **Throughput levers** (`/etc/wifibroadcast.cfg`, or `wfb_tx -M/-B/-G/-S/-L`):
  - `mcs_index`: the primary lever (MCS1 ≈ 7 Mbps → MCS5–7 + 40 MHz ≈ 36–52 Mbps).
  - `bandwidth = 40`: roughly doubles capacity.
  - `short_gi = True`: roughly +11 %.
  - `ldpc = 1`: RTL8812AU supports it; better FEC robustness.
  - `stbc = 1`: TX diversity on dual-antenna cards.
- **Channel**: a clean 5 GHz channel (ch149/165). 2.4 GHz is congested, so
mac80211 CSMA backoff sharply lowers injection rate.
- **FEC**: `fec_k = 8`, `fec_n = 12` is the common default: 4 of every 12 packets are parity, so 33 % of airtime is FEC overhead.
- **MTU**: `radio_mtu` / `MAX_PAYLOAD_SIZE = 3993` is wfb-ng's max single-frame
payload.
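
The effect of the `mcs_index`, `bandwidth` and `short_gi` levers on the PHY rate can be sketched with the standard 802.11n single-stream rate table. The helper name `ht_phy_rate_mbps` is illustrative and is not part of wfb-ng or devourer; the rates are the nominal HT figures for one spatial stream.

```python
# 802.11n (HT) single-spatial-stream rates at long GI (800 ns), Mbps, MCS0..7.
HT_20 = [6.5, 13.0, 19.5, 26.0, 39.0, 52.0, 58.5, 65.0]
HT_40 = [13.5, 27.0, 40.5, 54.0, 81.0, 108.0, 121.5, 135.0]


def ht_phy_rate_mbps(mcs: int, bw_mhz: int = 20, short_gi: bool = False) -> float:
    """Nominal HT PHY rate for one spatial stream."""
    if not 0 <= mcs <= 7:
        raise ValueError("single-stream HT MCS must be 0..7")
    if bw_mhz not in (20, 40):
        raise ValueError("HT bandwidth must be 20 or 40 MHz")
    rate = (HT_20 if bw_mhz == 20 else HT_40)[mcs]
    # Short GI shortens the OFDM symbol from 4.0 us to 3.6 us, i.e. x10/9.
    return rate * 10 / 9 if short_gi else rate


base = ht_phy_rate_mbps(7)
print(f"MCS7/20 long GI : {base:.1f} Mbps")
print(f"MCS7/40 long GI : {ht_phy_rate_mbps(7, 40):.1f} Mbps "
      f"(x{ht_phy_rate_mbps(7, 40) / base:.2f})")
print(f"MCS7/20 short GI: {ht_phy_rate_mbps(7, short_gi=True):.1f} Mbps "
      f"(+{(10 / 9 - 1) * 100:.0f} %)")
```

These are PHY rates only; the on-air and goodput figures in the Results section are lower because of duty cycle and FEC.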

## Measuring on-air throughput

A Wi-Fi monitor sniffer tops out at roughly 2900 fps, so counting frames there
undercounts a fast transmitter. `tests/sdr_duty.py` instead measures the fraction
of time the (clean) channel's received power is above the idle noise floor. That
fraction is the transmitter's airtime occupancy (duty cycle) and has no such
ceiling: `on_air_Mbps ≈ duty × PHY_rate(MCS, BW, GI)`. Calibrate the idle noise
floor once (`--noise-db`, ≈ −62 dB here). A percentile auto-floor mis-reads once
the channel is ~saturated because the low tail becomes signal.
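
The percentile failure mode is easy to reproduce on synthetic data. The sketch below is not the code in `tests/sdr_duty.py`: `duty_cycle`, `percentile_floor` and the 6 dB detection margin are assumptions made for illustration, and the power samples are synthetic.

```python
def duty_cycle(power_db, noise_floor_db, margin_db=6.0):
    """Fraction of samples whose power exceeds the floor by margin_db."""
    threshold = noise_floor_db + margin_db
    busy = sum(1 for p in power_db if p > threshold)
    return busy / len(power_db)


def percentile_floor(power_db, pct=10):
    """Naive auto-floor: the pct-th percentile of the observed power."""
    ordered = sorted(power_db)
    return ordered[int(len(ordered) * pct / 100)]


# Synthetic near-saturated channel: 95 % signal at -30 dB, 5 % idle at -62 dB.
samples = [-30.0] * 95 + [-62.0] * 5

fixed = duty_cycle(samples, noise_floor_db=-62.0)
auto = duty_cycle(samples, noise_floor_db=percentile_floor(samples))
print(f"calibrated floor: duty = {fixed:.2f}")
print(f"percentile floor: duty = {auto:.2f}")
```

With only 5 % idle samples, the 10th percentile already lands on signal, so the auto-floor sits at the signal level and the computed duty collapses even though the channel is almost fully occupied.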

## Reproduce

```sh
# kernel side: build + load the wfb-ng driver, build wfb_tx
# (subshells keep the second clone out of the rtl8812au tree)
git clone https://github.com/svpcom/rtl8812au
(cd rtl8812au && make && sudo insmod 88XXau_wfb.ko rtw_tx_pwr_idx_override=30)
git clone https://github.com/svpcom/wfb-ng
(cd wfb-ng && make)
# devourer side: build/WiFiDriverTxDemo with DEVOURER_TX_MCS/_BW/_PAYLOAD_BYTES/_GAP_US=0
# measure (ceiling-free) while either side floods a clean 5 GHz channel:
sudo python3 tests/sdr_duty.py --freq 5745e6 --secs 4 --mcs 7 --bw 20 --noise-db -62
```

Hardware here: RTL8812AU `0bda:8812`, USRP B210 `2500:0020`, libvirt VM
`devourer-testrig` (kernel 5.15) for the passthrough comparison.
122 changes: 122 additions & 0 deletions tests/bench_onair.py
@@ -0,0 +1,122 @@
#!/usr/bin/env python3
"""On-air TX throughput per chip per band (devourer), via USRP duty-cycle.

For each plugged Jaguar chip and each band, floods with WiFiDriverTxDemo at a
fixed HT MCS/BW and measures channel occupancy with sdr_duty.py (ceiling-free).
Emits a markdown table for the README. This script measures devourer only; the
wfb-ng side of the parity comparison (svpcom driver + a raw-AF_PACKET blaster)
has to be run separately.

    sudo python3 tests/bench_onair.py            # devourer, all chips/bands
"""
from __future__ import annotations
import argparse, re, subprocess, sys, time
from pathlib import Path

HERE = Path(__file__).resolve().parent
ROOT = HERE.parent
sys.path.insert(0, str(HERE))
import regress # noqa

# chipset -> (sysfs_id, vid, pid)
CHIPS = {
    "RTL8812AU": ("9-2", "0x0bda", "0x8812"),
    "RTL8814AU": ("4-2.3.2", "0x0bda", "0x8813"),
    "RTL8821AU": ("9-1.4", "0x2357", "0x0120"),
}
BANDS = [("2.4 GHz (ch6)", 6, 2437e6), ("UNII-1 (ch36)", 36, 5180e6),
         ("UNII-2/3 (ch149)", 149, 5745e6)]
KDRIVERS = ["rtw88_8812au", "rtw88_8814au", "rtw88_8821au", "rtl88xxau_wfb"]
DUTY_RE = re.compile(r"duty=([\d.]+)%\s+noise=([-\d.]+)dB.*on_air~=([\d.]+)Mbps")


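def free_chip(sysfs: str) -> None:
    """Unbind kernel drivers so devourer (libusb) can claim the adapter.

    NOTE: `sysfs` is currently unused. Every interface bound to any driver in
    KDRIVERS is unbound, not only the adapter at `sysfs`.
    """
    for drv in KDRIVERS:
        base = Path(f"/sys/bus/usb/drivers/{drv}")
        try:
            for d in base.iterdir():
                # Bound interfaces are named like "9-2:1.0"; skip bind/unbind/etc.
                if d.name[0].isdigit():
                    subprocess.run(["tee", str(base / "unbind")],
                                   input=d.name.encode(),
                                   stdout=subprocess.DEVNULL,
                                   stderr=subprocess.DEVNULL)
        except FileNotFoundError:
            pass  # driver not loaded
    time.sleep(1)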


def sdr_duty(freq: float, mcs: int, bw: int, noise_db: float | None,
             secs: float, return_noise: bool = False):
    rate = "50e6" if bw == 40 else "25e6"
    cmd = ["python3", str(HERE / "sdr_duty.py"), "--freq", f"{freq:.0f}",
           "--rate", rate, "--secs", str(secs), "--mcs", str(mcs), "--bw", str(bw)]
    if noise_db is not None:
        cmd += ["--noise-db", str(noise_db)]
    r = subprocess.run(cmd, capture_output=True, text=True, timeout=90)
    m = DUTY_RE.search(r.stdout + r.stderr)
    if not m:
        return None
    duty, noise, mbps = float(m.group(1)), float(m.group(2)), float(m.group(3))
    return (duty, noise, mbps) if return_noise else (duty, mbps)


def devourer_flood(vid, pid, ch, mcs, bw, size):
    env = dict(__import__("os").environ,
               DEVOURER_VID=vid, DEVOURER_PID=pid, DEVOURER_CHANNEL=str(ch),
               DEVOURER_TX_HT_MCS="1", DEVOURER_TX_MCS=str(mcs),
               DEVOURER_TX_BW=str(bw), DEVOURER_TX_PAYLOAD_BYTES=str(size),
               DEVOURER_TX_GAP_US="0")
    return regress._register_local_proc(subprocess.Popen(
        [str(ROOT / "build" / "WiFiDriverTxDemo")], env=env,
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        preexec_fn=regress._child_preexec))


def main() -> int:
    ap = argparse.ArgumentParser(description=__doc__)
    ap.add_argument("--mcs", type=int, default=7)
    ap.add_argument("--bw", type=int, default=20)
    ap.add_argument("--size", type=int, default=1500)
    ap.add_argument("--secs", type=float, default=4.0)
    ap.add_argument("--noise-db", type=float, default=-62.0)
    args = ap.parse_args()
    regress._install_cleanup_handlers()
    present = {c: v for c, v in CHIPS.items()
               if Path(f"/sys/bus/usb/devices/{v[0]}").exists()}
    print(f"# chips: {', '.join(present)} | MCS{args.mcs}/{args.bw}MHz/{args.size}B\n")

    results: dict = {}
    for label, ch, freq in BANDS:
        # Calibrate this band's idle noise floor (all chips quiet) so the
        # threshold is right; a fixed floor mis-reads a noisier 2.4 GHz band.
        for sysfs, _, _ in CHIPS.values():
            free_chip(sysfs)
        cal = sdr_duty(freq, args.mcs, args.bw, None, 2.0, return_noise=True)
        floor = cal[1] if cal else args.noise_db  # cal = (duty, noise, mbps)
        print(f" [{label}] idle noise floor {floor:.1f} dB", flush=True)
        for chip, (sysfs, vid, pid) in present.items():
            print(f" {chip} {label} …", flush=True)
            d = None
            for _ in range(2):  # one retry
                free_chip(sysfs)
                proc = devourer_flood(vid, pid, ch, args.mcs, args.bw, args.size)
                time.sleep(6)
                d = sdr_duty(freq, args.mcs, args.bw, floor, args.secs)
                regress._terminate(proc)
                if d and d[0] > 5:
                    break
            results[(chip, label)] = d
            print(f" -> {d[1]:.1f} Mbps ({d[0]:.0f}% duty)" if d else " -> FAIL")

    # markdown
    print("\n| Part | " + " | ".join(l for l, _, _ in BANDS) + " |")
    print("|------|" + "|".join("------" for _ in BANDS) + "|")
    for chip in present:
        cells = []
        for label, _, _ in BANDS:
            d = results.get((chip, label))
            cells.append(f"{d[1]:.0f} Mbps" if d and d[1] > 0.5 else "—")
        print(f"| {chip} | " + " | ".join(cells) + " |")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
54 changes: 43 additions & 11 deletions tests/inject_beacon.py
@@ -90,7 +90,7 @@ def _build_radiotap_vht(*, vht_mcs: int, nss: int, ldpc: bool, stbc: bool,

 def build_beacon(rate_mbps_x2: int = 0, *, mcs=None, ldpc: bool = False,
                  stbc: int = 0, bandwidth: int = 20, vht: bool = False,
-                 vht_mcs: int = 0, nss: int = 1):
+                 vht_mcs: int = 0, nss: int = 1, size: int = 0):
     """Mgmt / probe-request frame matching txdemo's beacon_frame[]. The body
     payload doesn't matter for hit-count testing; only SA is matched.

@@ -116,6 +116,10 @@ def build_beacon(rate_mbps_x2: int = 0, *, mcs=None, ldpc: bool = False,
         )
         / b"\x00\x00\x00\x00\x00\x00\x00\x00"  # ssid IE (empty)
     )
+    # Throughput benchmark: pad the 802.11 PSDU up to `size` bytes so the
+    # kernel TX matches devourer's DEVOURER_TX_PAYLOAD_BYTES frames. Pad-up only.
+    if size and size > len(dot11_bytes):
+        dot11_bytes = dot11_bytes + b"\x00" * (size - len(dot11_bytes))
     if vht:
         rt_bytes = _build_radiotap_vht(
             vht_mcs=vht_mcs, nss=nss, ldpc=ldpc, stbc=bool(stbc),
@@ -187,24 +191,52 @@ def main():
         help="VHT spatial streams (NSS), 1..4 (default 1). Only used with "
              "--vht.",
     )
+    ap.add_argument(
+        "--size", type=int, default=0,
+        help="pad the 802.11 PSDU up to N bytes (throughput benchmark; mirrors "
+             "txdemo's DEVOURER_TX_PAYLOAD_BYTES). 0 = the small probe request.",
+    )
+    ap.add_argument(
+        "--max-rate", action="store_true",
+        help="blast as fast as the driver TX ring allows via a blocking "
+             "AF_PACKET raw socket (no per-frame sleep). send() blocks on the "
+             "ring so the rate ~= the kernel TX-completion rate. For "
+             "throughput benchmarking, not the regress.py hit-count path.",
+    )
     args = ap.parse_args()

     pkt = build_beacon(
         args.rate, mcs=args.mcs, ldpc=args.ldpc, stbc=args.stbc,
         bandwidth=args.bandwidth, vht=args.vht, vht_mcs=args.vht_mcs,
-        nss=args.vht_nss,
+        nss=args.vht_nss, size=args.size,
     )
     end = time.monotonic() + args.duration
     sent = 0
-    while time.monotonic() < end:
-        try:
-            sendp(pkt, iface=args.iface, verbose=False)
-            sent += 1
-        except OSError as e:
-            # iface went down mid-test; bail rather than spin.
-            print(f"inject_beacon: sendp failed after {sent} frames: {e}")
-            break
-        time.sleep(args.interval)
+    if args.max_rate:
+        # Blocking raw socket: bytes(pkt) = radiotap + 802.11 PSDU, which the
+        # kernel monitor iface TXes verbatim. send() blocks on the TX ring.
+        import socket
+        raw = bytes(pkt)
+        s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
+        s.bind((args.iface, 0))
+        while time.monotonic() < end:
+            try:
+                s.send(raw)
+                sent += 1
+            except OSError as e:
+                print(f"inject_beacon: send failed after {sent} frames: {e}")
+                break
+        s.close()
+    else:
+        while time.monotonic() < end:
+            try:
+                sendp(pkt, iface=args.iface, verbose=False)
+                sent += 1
+            except OSError as e:
+                # iface went down mid-test; bail rather than spin.
+                print(f"inject_beacon: sendp failed after {sent} frames: {e}")
+                break
+            time.sleep(args.interval)
     print(f"inject_beacon: sent {sent} frames on {args.iface}")

