bench: per-chip TX throughput/latency harness; replace README "TX + RX" with real numbers #108
Closed
josephnef wants to merge 1 commit into
…X" with numbers

Adds tests/bench_tput.py: a per-chip TX throughput + per-frame latency benchmark (devourer vs host kernel driver) across bands (2.4/UNII-1/UNII-2-3) and PSDU sizes (1500 / 3994 B). TX rate is measured from usbmon bulk-OUT completions at the source chip (the true frames-accepted rate; counting at a sniffer measures the sniffer's RX ceiling instead, which is a trap). Reuses regress.py for DUT discovery, kernel bind/unbind, USB power-cycle, process hygiene and log parsers.

Driver/injector support:
- txdemo: DEVOURER_TX_PAYLOAD_BYTES=N pads the 802.11 PSDU to N bytes (on-wire N+40; PKT_SIZE is 16-bit) so we can TX 1500/3994 B frames.
- inject_beacon.py: --size N (matching sized PSDU) and --max-rate (blocking AF_PACKET blaster ~= the kernel TX-completion rate).

README "Hardware landscape": the generic "TX + RX" band cells are replaced with measured devourer TX throughput (Mbps @ 1500 / 3994 B), plus a Measured throughput subsection with the kernel-driver comparison and latency.

Headline results (HT MCS7, 20 MHz, monitor injection): devourer direct-USB TX is 8-60x faster than kernel AF_PACKET monitor injection (e.g. 8812 2.4 GHz: 46 vs 0.9 Mbps); the kernel monitor path cannot inject 3994 B frames at all (AF_PACKET > MTU) while devourer hits 58-62 Mbps; throughput scales with frame size; devourer per-frame latency 17-128 us; the 8814 TX path is the family's least reliable (high variance, flagged).

RX is not tabulated: it cannot be measured cleanly on a 2-USB-bus rig (same-bus contention, flooder saturation, flaky 8814 RX); methodology + caveats in tests/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Replaces the generic "TX + RX" band cells in the README "Hardware landscape" table with measured devourer TX throughput, and adds the harness that produces it.
New:
- tests/bench_tput.py: Per-chip TX throughput + per-frame latency, devourer vs the host kernel driver, across bands (2.4 / UNII-1 / UNII-2·3) and PSDU sizes (1500 / 3994 B). Reuses regress.py for DUT discovery, kernel bind/unbind, USB power-cycle, process hygiene and log parsers. Resumable, --quick smoke, CSV + markdown output.
Method (the clean metric): TX rate = usbmon bulk-OUT completions at the source chip (true frames-accepted rate). Counting frames at a sniffer instead measures the sniffer's RX ceiling (~336 fps here), a trap that makes every transmitter look identical. devourer has no host-side TX backpressure (it pipelines URBs), so latency is taken from a separate non-saturating pass.
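As a rough illustration of the completion-counting metric (a sketch only, not the code in tests/bench_tput.py), the function below counts successful bulk-OUT completions in usbmon text-format output and converts them to frames/s and Mbps. The names tx_rate_from_usbmon and TxRate are hypothetical. It assumes one bulk-OUT URB per frame, that the input is already restricted to the source chip's bus/device, and that the usbmon timestamp does not wrap during the capture.

```python
from typing import Iterable, NamedTuple


class TxRate(NamedTuple):
    frames: int   # successful bulk-OUT completions seen
    fps: float    # completions per second
    mbps: float   # PSDU throughput in Mbit/s


def tx_rate_from_usbmon(lines: Iterable[str], psdu_bytes: int) -> TxRate:
    """Estimate TX rate from usbmon text lines (hypothetical helper).

    usbmon text fields: <urb tag> <timestamp us> <event> <address> <status> ...
    Assumes one bulk-OUT URB per frame and no timestamp wraparound.
    """
    stamps = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue
        event, address, status = fields[2], fields[3], fields[4]
        # "C" marks a completion callback, "Bo" a bulk OUT endpoint,
        # and status "0" a transfer the device accepted.
        if event == "C" and address.startswith("Bo:") and status == "0":
            stamps.append(int(fields[1]))

    if len(stamps) < 2:
        return TxRate(len(stamps), 0.0, 0.0)

    elapsed_s = (stamps[-1] - stamps[0]) / 1_000_000
    if elapsed_s <= 0:
        return TxRate(len(stamps), 0.0, 0.0)

    # N completions span N-1 inter-completion intervals.
    fps = (len(stamps) - 1) / elapsed_s
    mbps = fps * psdu_bytes * 8 / 1_000_000
    return TxRate(len(stamps), fps, mbps)
```

The same arithmetic shows why counting at a sniffer misleads: if the receiver tops out at a fixed frame rate, fps is capped at that value no matter how fast the transmitter actually is.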
Driver/injector support
- txdemo: DEVOURER_TX_PAYLOAD_BYTES=N pads the 802.11 PSDU to N bytes (on-wire N+40; PKT_SIZE is 16-bit, so 3994 fits).
- inject_beacon.py: --size N + --max-rate (blocking AF_PACKET blaster ≈ kernel TX-completion rate).
Headline results (HT MCS7, 20 MHz, monitor injection)
RX — honestly, not tabulated
RX throughput cannot be measured cleanly on a 2-USB-bus bench: same-bus TX/RX pairs (8812 + 8821 share a host controller) contend, the only reliable cross-bus flooder (8812 → 8814) saturates the receiver at full TX rate, and the 8814 RX path is itself intermittent. A clean cross-bus moderate-rate flood (8812 → 8814) does receive ~3100 frames / 12 s, confirming RX works; a capacity number needs a 3-bus rig or a calibrated SDR transmitter. Full caveats in
tests/README.md.
Test
cmake --build build; ctest green. Benchmark: sudo tests/bench_tput.py --quick, then sudo tests/bench_tput.py --directions tx. Run on RTL8812AU/8814AU/8821AU.
🤖 Generated with Claude Code