
reduce noise in flight benchmarks [tokio-threads] [# of columns in benchmarks] #10242

Merged: alamb merged 2 commits into apache:main from Rich-T-kid:rich-T-kid/remove-tokio-flight-benchmarks on Jul 2, 2026

Conversation

@Rich-T-kid

Which issue does this PR close?

Rationale for this change

See #10220 (comment).
This PR attempts to remove as much noise as possible from the flight profiles and benchmarks.

What changes are included in this PR?

  • Sets each tokio runtime builder to use new_current_thread, so benchmark tasks run on the calling thread
  • Removes the [1] column case to make the benchmark output easier to read; this cuts the number of cases by 1/3 while keeping [4, 8]
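
As a rough illustration of the second bullet, the sketch below shows how dropping the 1-column case shrinks one benchmark group's parameter grid. The `ROWS` constant and the `case_ids` helper are hypothetical names, not taken from the benchmark source; only the row counts (8192, 65536), the column counts, and the `<rows>x<cols>` ID shape come from the results posted in this PR.

```rust
/// Row counts that appear in the benchmark IDs posted in this PR.
const ROWS: [usize; 2] = [8192, 65536];

/// Hypothetical helper: builds `<rows>x<cols>` case IDs for one benchmark group.
fn case_ids(columns: &[usize]) -> Vec<String> {
    let mut ids = Vec::new();
    for rows in ROWS.iter() {
        for cols in columns.iter() {
            ids.push(format!("{}x{}", rows, cols));
        }
    }
    ids
}

fn main() {
    // Before: column counts [1, 4, 8]; after: only [4, 8].
    let before = case_ids(&[1, 4, 8]);
    let after = case_ids(&[4, 8]);
    println!("before: {} cases, after: {} cases", before.len(), after.len());
    println!("remaining IDs: {:?}", after);
}
```

Each group goes from six cases to four under these assumptions, which matches the one-third reduction described above.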

Are these changes tested?

n/a

Are there any user-facing changes?

n/a

@github-actions (bot) added the labels arrow (Changes to the arrow crate) and arrow-flight (Changes to the arrow-flight crate) on Jun 29, 2026
@Rich-T-kid changed the title from "restrict tokio noise in flight benchmarks" to "reduce noise in flight benchmarks [tokio-threads] [# of columns in benchmarks]" on Jun 29, 2026
@alamb

alamb commented Jul 1, 2026

run benchmark flight

@alamb alamb left a comment

Thank you @Rich-T-kid -- looks good to me

@alamb added the label development-process (Related to development process of arrow-rs) on Jul 1, 2026
@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4860301809-790-hgh8j 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing rich-T-kid/remove-tokio-flight-benchmarks (a830ade) to da07bce (merge-base) diff
BENCH_NAME=flight
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench flight
BENCH_FILTER=
Results will be posted here when complete


@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


group                                     main                                    rich-T-kid_remove-tokio-flight-benchmarks
-----                                     ----                                    -----------------------------------------
decode/fixed/65536x1                      1.00     49.4±0.98µs    39.5 GB/sec   
decode/fixed/65536x4                      1.00    266.6±3.39µs    29.3 GB/sec     1.01    268.2±8.59µs    29.1 GB/sec
decode/fixed/65536x8                      1.00   569.4±28.96µs    27.4 GB/sec     1.05  595.4±122.09µs    26.2 GB/sec
decode/fixed/8192x1                       1.00      8.4±0.03µs    29.0 GB/sec   
decode/fixed/8192x4                       1.00     29.5±0.40µs    33.1 GB/sec     1.01     29.9±0.24µs    32.7 GB/sec
decode/fixed/8192x8                       1.00     65.6±0.31µs    29.8 GB/sec     1.01     66.0±0.61µs    29.7 GB/sec
decode/nested/65536x1                     1.00  696.3±172.45µs     7.0 GB/sec   
decode/nested/65536x4                     1.00      3.1±0.68ms     6.4 GB/sec     1.06      3.2±0.70ms     6.0 GB/sec
decode/nested/65536x8                     1.17     17.8±1.38ms     2.2 GB/sec     1.00     15.2±1.60ms     2.6 GB/sec
decode/nested/8192x1                      1.00    83.7±20.51µs     7.3 GB/sec   
decode/nested/8192x4                      1.00   354.5±83.32µs     6.9 GB/sec     1.00   353.2±84.10µs     6.9 GB/sec
decode/nested/8192x8                      1.01  730.7±169.15µs     6.7 GB/sec     1.00  726.1±171.39µs     6.7 GB/sec
decode/variable/65536x1                   1.00  1224.2±167.79µs     7.2 GB/sec  
decode/variable/65536x4                   1.00      5.5±0.75ms     6.4 GB/sec     1.02      5.6±0.59ms     6.3 GB/sec
decode/variable/65536x8                   1.02     11.6±1.50ms     6.1 GB/sec     1.00     11.4±1.65ms     6.2 GB/sec
decode/variable/8192x1                    1.00   136.0±22.50µs     8.1 GB/sec   
decode/variable/8192x4                    1.07   630.5±79.34µs     7.0 GB/sec     1.00   588.6±92.25µs     7.5 GB/sec
decode/variable/8192x8                    1.02  1222.1±184.02µs     7.2 GB/sec    1.00  1203.9±178.40µs     7.3 GB/sec
decode_stream/dict/65536x1x4              1.00   191.3±27.53µs     5.1 GB/sec   
decode_stream/dict/65536x4x4              1.02  788.3±140.44µs     5.0 GB/sec     1.00  775.0±130.18µs     5.1 GB/sec
decode_stream/dict/65536x8x4              1.00  1628.9±230.45µs     4.8 GB/sec    1.03  1672.9±256.77µs     4.7 GB/sec
decode_stream/dict/8192x1x4               1.00     26.7±0.12µs     4.8 GB/sec   
decode_stream/dict/8192x4x4               1.00    101.7±0.53µs     5.0 GB/sec     1.00    101.9±1.82µs     5.0 GB/sec
decode_stream/dict/8192x8x4               1.00    205.8±1.02µs     5.0 GB/sec     1.00    205.7±1.20µs     5.0 GB/sec
decode_stream/fixed/65536x1x4             1.00     49.0±0.38µs    39.9 GB/sec   
decode_stream/fixed/65536x4x4             1.00    267.9±6.40µs    29.2 GB/sec     1.01   269.7±55.11µs    29.0 GB/sec
decode_stream/fixed/65536x8x4             1.00  601.4±132.39µs    26.0 GB/sec     1.03  621.1±161.91µs    25.2 GB/sec
decode_stream/fixed/8192x1x4              1.00      8.4±0.03µs    29.0 GB/sec   
decode_stream/fixed/8192x4x4              1.00     28.9±0.35µs    33.8 GB/sec     1.01     29.3±0.14µs    33.4 GB/sec
decode_stream/fixed/8192x8x4              1.01     63.1±0.58µs    31.0 GB/sec     1.00     62.7±0.62µs    31.2 GB/sec
decode_stream/nested/65536x1x4            1.00  693.2±167.47µs     7.0 GB/sec   
decode_stream/nested/65536x4x4            1.00      3.0±0.67ms     6.4 GB/sec     1.01      3.1±0.68ms     6.4 GB/sec
decode_stream/nested/65536x8x4            1.10     18.2±1.38ms     2.1 GB/sec     1.00     16.5±1.77ms     2.4 GB/sec
decode_stream/nested/8192x1x4             1.00    84.0±20.57µs     7.3 GB/sec   
decode_stream/nested/8192x4x4             1.00   353.1±83.92µs     6.9 GB/sec     1.01   356.4±84.22µs     6.9 GB/sec
decode_stream/nested/8192x8x4             1.01  734.0±166.89µs     6.7 GB/sec     1.00  728.3±170.15µs     6.7 GB/sec
decode_stream/variable/65536x1x4          1.00  1190.9±178.34µs     7.4 GB/sec  
decode_stream/variable/65536x4x4          1.05      5.8±0.62ms     6.1 GB/sec     1.00      5.5±0.72ms     6.4 GB/sec
decode_stream/variable/65536x8x4          1.86     21.4±1.99ms     3.3 GB/sec     1.00     11.5±1.39ms     6.1 GB/sec
decode_stream/variable/8192x1x4           1.00   136.9±21.75µs     8.0 GB/sec   
decode_stream/variable/8192x4x4           1.04   608.5±85.93µs     7.2 GB/sec     1.00   586.5±90.64µs     7.5 GB/sec
decode_stream/variable/8192x8x4           1.00  1221.4±174.73µs     7.2 GB/sec    1.00  1222.5±183.88µs     7.2 GB/sec
do_put_dictionary/dict/hydrate/65536x1    1.00   399.2±12.65µs   629.9 MB/sec   
do_put_dictionary/dict/hydrate/65536x4    1.20  1533.5±81.62µs   655.8 MB/sec     1.00  1278.6±28.57µs   786.5 MB/sec
do_put_dictionary/dict/hydrate/65536x8    1.53      4.1±0.47ms   488.4 MB/sec     1.00      2.7±0.07ms   747.2 MB/sec
do_put_dictionary/dict/hydrate/8192x1     1.00     95.8±2.34µs   341.0 MB/sec   
do_put_dictionary/dict/hydrate/8192x4     1.27    218.6±3.44µs   597.6 MB/sec     1.00    172.6±1.07µs   756.8 MB/sec
do_put_dictionary/dict/hydrate/8192x8     1.20    402.1±6.22µs   649.8 MB/sec     1.00    336.1±3.60µs   777.4 MB/sec
do_put_dictionary/dict/resend/65536x1     1.00    112.9±1.41µs     2.2 GB/sec   
do_put_dictionary/dict/resend/65536x4     1.26    305.3±3.77µs     3.2 GB/sec     1.00    242.7±1.23µs     4.0 GB/sec
do_put_dictionary/dict/resend/65536x8     1.17   542.0±11.01µs     3.6 GB/sec     1.00    463.9±5.52µs     4.2 GB/sec
do_put_dictionary/dict/resend/8192x1      1.00     64.6±0.92µs   505.5 MB/sec   
do_put_dictionary/dict/resend/8192x4      2.13     86.5±1.00µs  1511.0 MB/sec     1.00     40.6±0.24µs     3.1 GB/sec
do_put_dictionary/dict/resend/8192x8      1.74    119.6±2.25µs     2.1 GB/sec     1.00     68.7±0.40µs     3.7 GB/sec
encode/fixed/65536x1                      1.00     10.6±0.02µs    46.1 GB/sec   
encode/fixed/65536x4                      1.00     49.8±0.19µs    39.2 GB/sec     1.00     49.6±0.40µs    39.4 GB/sec
encode/fixed/65536x8                      1.00   1112.1±4.91µs     3.5 GB/sec     1.02  1133.0±10.97µs     3.4 GB/sec
encode/fixed/8192x1                       1.00      3.2±0.01µs    19.1 GB/sec   
encode/fixed/8192x4                       1.00      8.8±0.02µs    27.8 GB/sec     1.06      9.3±0.03µs    26.2 GB/sec
encode/fixed/8192x8                       1.14     19.1±0.03µs    25.6 GB/sec     1.00     16.7±0.05µs    29.3 GB/sec
encode/nested/65536x1                     1.00     28.7±0.11µs    42.5 GB/sec   
encode/nested/65536x4                     1.00   1448.1±8.02µs     3.4 GB/sec     1.05  1524.2±36.63µs     3.2 GB/sec
encode/nested/65536x8                     1.00      3.1±0.05ms     3.2 GB/sec     1.06      3.3±0.09ms     3.0 GB/sec
encode/nested/8192x1                      1.00      5.9±0.01µs    25.9 GB/sec   
encode/nested/8192x4                      1.02     21.1±0.06µs    29.0 GB/sec     1.00     20.8±0.07µs    29.4 GB/sec
encode/nested/8192x8                      1.05     49.0±0.31µs    24.9 GB/sec     1.00     46.5±0.16µs    26.3 GB/sec
encode/variable/65536x1                   1.00     70.0±0.48µs    31.4 GB/sec   
encode/variable/65536x4                   1.00      2.6±0.04ms     3.4 GB/sec     1.06      2.7±0.08ms     3.2 GB/sec
encode/variable/65536x8                   1.00      5.5±0.11ms     3.2 GB/sec     1.09      6.0±0.22ms     2.9 GB/sec
encode/variable/8192x1                    1.00     10.7±0.02µs    25.6 GB/sec   
encode/variable/8192x4                    1.03     26.0±0.04µs    42.2 GB/sec     1.00     25.3±0.08µs    43.4 GB/sec
encode/variable/8192x8                    1.04     83.6±0.27µs    26.3 GB/sec     1.00     80.2±0.88µs    27.4 GB/sec
roundtrip/fixed/65536x1                   1.00    318.9±4.19µs  1568.2 MB/sec   
roundtrip/fixed/65536x4                   1.00  1248.7±23.15µs  1602.0 MB/sec     1.01  1259.0±30.40µs  1588.8 MB/sec
roundtrip/fixed/65536x8                   1.00      2.3±0.05ms  1724.5 MB/sec     1.00      2.3±0.05ms  1723.5 MB/sec
roundtrip/fixed/8192x1                    1.00     97.0±0.81µs   645.2 MB/sec   
roundtrip/fixed/8192x4                    1.03    204.8±3.05µs  1222.5 MB/sec     1.00    199.3±1.85µs  1256.0 MB/sec
roundtrip/fixed/8192x8                    1.01    350.6±7.52µs  1428.2 MB/sec     1.00    346.9±5.61µs  1443.5 MB/sec
roundtrip/nested/65536x1                  1.00   918.9±44.57µs  1360.5 MB/sec   
roundtrip/nested/65536x4                  1.00      4.5±0.15ms  1113.0 MB/sec     1.00      4.5±0.16ms  1110.7 MB/sec
roundtrip/nested/65536x8                  1.00      9.5±0.47ms  1056.4 MB/sec     1.00      9.5±0.39ms  1051.3 MB/sec
roundtrip/nested/8192x1                   1.00    165.5±6.13µs   945.4 MB/sec   
roundtrip/nested/8192x4                   1.00   485.7±22.47µs  1288.5 MB/sec     1.00   483.5±22.25µs  1294.2 MB/sec
roundtrip/nested/8192x8                   1.01   962.5±42.81µs  1300.4 MB/sec     1.00   957.0±40.82µs  1307.8 MB/sec
roundtrip/variable/65536x1                1.00  1320.1±58.13µs  1704.5 MB/sec   
roundtrip/variable/65536x4                1.02      8.8±0.40ms  1022.2 MB/sec     1.00      8.6±0.41ms  1040.6 MB/sec
roundtrip/variable/65536x8                1.00     15.7±0.52ms  1149.8 MB/sec     1.09     17.1±0.70ms  1054.4 MB/sec
roundtrip/variable/8192x1                 1.00    218.4±7.12µs  1288.5 MB/sec   
roundtrip/variable/8192x4                 1.02   725.7±24.85µs  1551.2 MB/sec     1.00   711.1±23.21µs  1582.9 MB/sec
roundtrip/variable/8192x8                 1.02  1329.1±54.63µs  1694.0 MB/sec     1.00  1303.8±27.77µs  1726.7 MB/sec
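
The leading number in each column reads as a slowdown factor relative to the faster side, with 1.00 marking the faster of the two. The sketch below recomputes that factor for a few rows, assuming it is simply each side's mean time divided by the smaller mean time. The `relative` helper is a hypothetical name, and this is not taken from the comparison tool's source.

```rust
/// Slowdown of each side relative to the faster one.
/// Both inputs are mean times in the same unit.
fn relative(main: f64, branch: f64) -> (f64, f64) {
    let fastest = main.min(branch);
    (main / fastest, branch / fastest)
}

fn main() {
    // Mean times copied from the table above (µs for the first two rows, ms for the last).
    let rows = [
        ("do_put_dictionary/dict/resend/8192x4", 86.5, 40.6),
        ("do_put_dictionary/dict/resend/8192x8", 119.6, 68.7),
        ("decode_stream/variable/65536x8x4", 21.4, 11.5),
    ];
    for (name, main_t, branch_t) in rows.iter() {
        let (m, b) = relative(*main_t, *branch_t);
        println!("{}: main {:.2}, branch {:.2}", name, m, b);
    }
}
```

Because the table prints rounded times, a factor recomputed this way can differ from the printed one in the last digit for some rows.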

Resource Usage

base (merge-base)

Metric         Value
------         -----
Wall time      935.2s
Peak memory    174.1 MiB
Avg memory     63.5 MiB
CPU user       924.9s
CPU sys        149.1s
Peak spill     0 B

branch

Metric         Value
------         -----
Wall time      615.1s
Peak memory    172.2 MiB
Avg memory     57.2 MiB
CPU user       574.0s
CPU sys        92.5s
Peak spill     0 B
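
To put the two resource reports side by side, the sketch below computes the relative drop from base to branch for the timing metrics. The `percent_reduction` helper is a hypothetical name, and the inputs are the values reported above. The numbers alone do not show how much of the drop comes from the removed 1-column cases versus the runtime change.

```rust
/// Percentage by which `branch` is lower than `base`.
fn percent_reduction(base: f64, branch: f64) -> f64 {
    (base - branch) / base * 100.0
}

fn main() {
    // Values in seconds, copied from the base and branch reports above.
    println!("wall time: {:.1}% lower", percent_reduction(935.2, 615.1));
    println!("CPU user:  {:.1}% lower", percent_reduction(924.9, 574.0));
    println!("CPU sys:   {:.1}% lower", percent_reduction(149.1, 92.5));
}
```

The wall-time drop works out to roughly a third, which is in line with a third of the benchmark cases being removed.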

@alamb alamb merged commit 8c7df18 into apache:main Jul 2, 2026
13 checks passed

Labels

  • arrow: Changes to the arrow crate
  • arrow-flight: Changes to the arrow-flight crate
  • development-process: Related to development process of arrow-rs

3 participants