Specialize filter for list-like arrays (List/LargeList/FixedSizeList/Map, …) #10236

Open
Jeadie wants to merge 2 commits into apache:main from Jeadie:jeadie/filter-list-specialization

Conversation

Jeadie commented Jun 29, 2026

Which issue does this PR close?

  • No issue yet. Happy to open if changes wanted.

Rationale for this change

This PR improves the performance of FilterPredicate::filter for list-like data types, specifically List, LargeList, FixedSizeList and Map.

This optimisation rests on one idea: translate each run of retained parent rows (i.e. filter = true) into a range of child elements (contiguous by construction in the list, fixed-size-list and map layouts), then hand those ranges to the already-fast child kernels rather than copying element by element.
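For the fixed-size case the mapping from parent rows to child elements is pure arithmetic. The following is a minimal sketch on plain slices, not the PR's code: the function names `fixed_size_child_ranges` and `gather` are hypothetical, and runs are assumed to be half-open `(start, end)` pairs of parent row indices.

```rust
/// Map runs of retained parent rows to child element ranges for a
/// fixed-size list whose rows each hold `size` child elements.
/// Hypothetical helper for illustration; runs are half-open (start, end).
fn fixed_size_child_ranges(row_runs: &[(usize, usize)], size: usize) -> Vec<(usize, usize)> {
    row_runs
        .iter()
        .map(|&(start, end)| (start * size, end * size))
        .collect()
}

/// Copy each child range with a single contiguous slice copy.
fn gather<T: Clone>(child: &[T], ranges: &[(usize, usize)]) -> Vec<T> {
    let total = ranges.iter().map(|&(s, e)| e - s).sum::<usize>();
    let mut out = Vec::with_capacity(total);
    for &(s, e) in ranges {
        out.extend_from_slice(&child[s..e]);
    }
    out
}
```

With rows of size 3, the row runs `(0, 1)` and `(2, 4)` become the child ranges `(0, 3)` and `(6, 12)`, so two slice copies replace three per-row copies.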

filter is one of the most-executed kernels in Apache DataFusion, and these list/nested types now have a fast path. Several common DataFusion uses are especially affected:

  • FixedSizeList as embeddings or vectors
  • JSON or nested data (array_agg, Unnest)
  • GROUP BY operations
  • Hash and sort-merge joins that filter probe/build columns of list types

What changes are included in this PR?

  1. Specialisation within FilterPredicate::filter for DataType::FixedSizeList, DataType::Map, DataType::List and DataType::LargeList (the latter two are specialised for most, but not all, child types).
  2. Associated benchmarks in arrow/benches/filter_kernels.rs.

Changes Explained

Before

Prior to this PR, List<T> used the MutableArrayData fallback.

Example

filter:       [ T    F    T    T    F  ]
parent rows:  [row0|row1|row2|row3|row4]

child values: [a b c|d e|f g h|i j k|l m n o]

MutableArrayData walks the full child buffer, copying by range for each retained row.
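
As a rough mental model of the per-row behaviour described above, here is a simplified sketch on plain vectors. It is not the actual `MutableArrayData` implementation; `filter_list_per_row` is a hypothetical name, and offsets are modelled as `usize` rather than Arrow's `i32`/`i64`.

```rust
/// Simplified per-row copy: one slice copy and one offset push
/// for every retained parent row. Illustration only.
fn filter_list_per_row<T: Clone>(
    offsets: &[usize],
    child: &[T],
    filter: &[bool],
) -> (Vec<usize>, Vec<T>) {
    let mut new_offsets = vec![0];
    let mut new_child = Vec::new();
    for (row, &keep) in filter.iter().enumerate() {
        if keep {
            // Copy this row's child elements on their own.
            new_child.extend_from_slice(&child[offsets[row]..offsets[row + 1]]);
            new_offsets.push(new_child.len());
        }
    }
    (new_offsets, new_child)
}
```

For the example above this issues three separate copies (rows 0, 2 and 3), even though rows 2 and 3 are adjacent in the child buffer.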

After

filter:   [ T    F    T    T    F  ]
            row0 row1 row2 row3 row4

offsets:  [ 0    3    5    8   11   15 ]

predicate_row_ranges → [(0,1), (2,4)]   ← runs of kept rows

child_ranges:
  run (0,1): offsets[0]..offsets[1] = [0,  3) ← 1 parent element
  run (2,4): offsets[2]..offsets[4] = [5, 11)  ← 2 parent elements. rows 2+3 merged into ONE range

child values: [a b c|d e|f g h|i j k|l m n o]
                ╰─────╯   ╰───────────╯
                [0, 3)       [5, 11)

Rebuild new offsets from retained row lengths:
  row0: 3-0=3  → new_offsets: [0, 3]
  row2: 8-5=3  → new_offsets: [0, 3, 6]
  row3: 11-8=3 → new_offsets: [0, 3, 6, 9]

output: List([ [a,b,c], [f,g,h], [i,j,k] ])
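
The walkthrough above can be sketched on plain vectors as follows. This is an illustration under stated assumptions, not the PR's implementation: `predicate_row_ranges` borrows its name from the diagram, `filter_list_by_runs` is hypothetical, offsets are modelled as `usize`, and nulls are ignored.

```rust
/// Collapse a boolean filter into half-open runs of kept rows.
fn predicate_row_ranges(filter: &[bool]) -> Vec<(usize, usize)> {
    let mut runs = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &keep) in filter.iter().enumerate() {
        match (keep, start) {
            (true, None) => start = Some(i),
            (false, Some(s)) => {
                runs.push((s, i));
                start = None;
            }
            _ => {}
        }
    }
    if let Some(s) = start {
        runs.push((s, filter.len()));
    }
    runs
}

/// Filter a list array given as (offsets, child values):
/// one contiguous child copy per run, offsets rebuilt from row lengths.
fn filter_list_by_runs<T: Clone>(
    offsets: &[usize],
    child: &[T],
    filter: &[bool],
) -> (Vec<usize>, Vec<T>) {
    let mut new_offsets = vec![0];
    let mut new_child = Vec::new();
    for (start, end) in predicate_row_ranges(filter) {
        // Adjacent kept rows share a single child range.
        new_child.extend_from_slice(&child[offsets[start]..offsets[end]]);
        for row in start..end {
            let len = offsets[row + 1] - offsets[row];
            let last = *new_offsets.last().unwrap();
            new_offsets.push(last + len);
        }
    }
    (new_offsets, new_child)
}
```

Run on the example, the filter yields the runs `(0, 1)` and `(2, 4)`, the child copies cover `[0, 3)` and `[5, 11)`, and the rebuilt offsets are `[0, 3, 6, 9]`.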

Non-specialised child types

List<T> is not specialised for every child type T (and likewise for the other list-like types above). This PR specialises a child type only if it has a fast, vectorized filter kernel that is driven solely by the predicate's Slices and never reads filter directly. Everything else uses the well-tuned, correct MutableArrayData fallback.

| Child type | Why it stays on fallback |
| --- | --- |
| dense Union | Consecutive rows carry different type-ids and non-contiguous child offsets, no contiguous ranges to copy |
| RunEndEncoded | Its kernel (filter_run_end_array) reads predicate.filter directly. The specialization streams ranges via a Slices predicate whose filter is intentionally empty. |
| unmeasured exotics | Several exotics not benchmarked stay on fallback by default. |

Every other list child is specialized: primitives, boolean, null, Utf8/LargeUtf8/Binary/LargeBinary, Utf8View/BinaryView, FixedSizeBinary, FixedSizeList, Dictionary, Struct, sparse Union, ListView/LargeListView, and nested List/LargeList.
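
One way to picture this allowlist is an exhaustive match that names every specialised child kind and sends the rest to the fallback. The sketch below is hypothetical: `ChildKind`, `FilterPath` and `choose_path` are invented for illustration and only mirror the categories listed above, not arrow's `DataType` or the PR's actual dispatch code.

```rust
/// Hypothetical coarse classification of a list's child type.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChildKind {
    Primitive,
    Boolean,
    Null,
    Bytes,    // Utf8 / LargeUtf8 / Binary / LargeBinary
    ByteView, // Utf8View / BinaryView
    FixedSizeBinary,
    FixedSizeList,
    Dictionary,
    Struct,
    SparseUnion,
    ListView, // ListView / LargeListView
    List,     // nested List / LargeList
    DenseUnion,
    RunEndEncoded,
    Other, // anything not benchmarked
}

#[derive(Debug, PartialEq, Eq)]
enum FilterPath {
    Specialized,
    Fallback,
}

/// Allowlist: only explicitly named kinds take the specialised path.
fn choose_path(child: ChildKind) -> FilterPath {
    use ChildKind::*;
    match child {
        Primitive | Boolean | Null | Bytes | ByteView | FixedSizeBinary | FixedSizeList
        | Dictionary | Struct | SparseUnion | ListView | List => FilterPath::Specialized,
        DenseUnion | RunEndEncoded | Other => FilterPath::Fallback,
    }
}
```

Writing the match without a wildcard arm means a newly added kind fails to compile until someone decides which path it belongs on, which fits the "fallback by default until measured" stance.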

Are these changes tested?

Tests in arrow-select/src/filter.rs.

Benchmark results

>> cargo bench -p arrow --bench filter_kernels \
  --features test_utils \
  --baseline after \
  -- "filter (list|fixedsizelist|map)"
  • size = 65536
  • Cells: before → after (speedup), where
    • before = MutableArrayData fallback
    • after = specialized.
    • ⚠ marks a regression (sub-1.0).
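
To make the cell format concrete, here is a small sketch of how a cell could be derived from two timings. The helpers `speedup` and `format_cell` are hypothetical and are not part of the benchmark harness.

```rust
/// Speedup as used in the tables: before divided by after.
fn speedup(before_us: f64, after_us: f64) -> f64 {
    before_us / after_us
}

/// Render one table cell, flagging sub-1.0 results as regressions.
fn format_cell(before_us: f64, after_us: f64) -> String {
    let s = speedup(before_us, after_us);
    let flag = if s < 1.0 { " ⚠" } else { "" };
    format!("{before_us}→{after_us} µs ({s:.2}×{flag})")
}
```

For example, `format_cell(433.0, 373.0)` gives `433→373 µs (1.16×)` and `format_cell(455.0, 471.0)` gives `455→471 µs (0.97× ⚠)`.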

List<T> by child type

| Child | kept ½ | kept 1023/1024 | kept 1/1024 |
| --- | --- | --- | --- |
| Int32 | 433→373 µs (1.16×) | 128→93 µs (1.38×) | 2.92→1.12 µs (2.60×) |
| Utf8 | 627→588 µs (1.07×) | 455→471 µs (0.97× ⚠) | 3.57→1.44 µs (2.48×) |
| LargeUtf8 | 679→593 µs (1.14×) | 644→474 µs (1.36×) | 3.58→1.44 µs (2.49×) |
| Binary | 623→581 µs (1.07×) | 451→469 µs (0.96× ⚠) | 3.49→1.44 µs (2.43×) |
| LargeBinary | 656→590 µs (1.11×) | 628→470 µs (1.34×) | 3.58→1.44 µs (2.48×) |
| Utf8View | 598→315 µs (1.90×) | 566→183 µs (3.10×) | 3.16→0.94 µs (3.35×) |
| FixedSizeBinary | 464→348 µs (1.33×) | 213→118 µs (1.80×) | 2.87→1.03 µs (2.79×) |
| FixedSizeList | 540→446 µs (1.21×) | 216→134 µs (1.61×) | 3.61→1.42 µs (2.54×) |
| Dictionary | 482→370 µs (1.30×) | 126→94 µs (1.34×) | 3.36→1.37 µs (2.45×) |
| Struct | 519→371 µs (1.40×) | 128→93 µs (1.37×) | 3.69→1.12 µs (3.28×) |
| Map | 1107→998 µs (1.11×) | 780→648 µs (1.20×) | 6.32→2.89 µs (2.19×) |
| Union (sparse) | 832→663 µs (1.26×) | 324→168 µs (1.93×) | 5.41→2.30 µs (2.35×) |
| ListView | 1702→548 µs (3.11×) | 2687→128 µs (20.9×) | 5.85→1.41 µs (4.13×) |
| nested List<List> | 612→620 µs (0.99× ⚠) | 320→295 µs (1.09×) | 3.93→1.73 µs (2.27×) |

Direct kernels (new)

| Kernel | kept ½ | kept 1023/1024 | kept 1/1024 |
| --- | --- | --- | --- |
| filter FixedSizeList | 309→197 µs (1.57×) | 20.8→21.6 µs (0.96× ⚠) | 3.27→0.69 µs (4.76×) |
| filter Map | 791→648 µs (1.22×) | 286→291 µs (0.98×) | 5.01→1.88 µs (2.66×) |

Value-length sweep — List<Utf8> @ kept ½

| Value length | before → after | speedup |
| --- | --- | --- |
| 8 B | 669→591 µs | 1.13× |
| 64 B | 1241→1074 µs | 1.16× |
| 256 B | 4796→3167 µs | 1.51× |
| 10× rows (short) | 6598→6446 µs | 1.02× |

Regressions / caveats

All sub-1.0 results occur only at the dense kept 1023/1024 end (rare for selective predicates), plus the nested-list ½ tie:

  • dense: Utf8 0.97×, Binary 0.96×, filter FixedSizeList 0.96× (memcpy-bound — the fallback is already tight there).
  • nested List<List> @ ½: 0.99× (offset-dominated; ties the fallback, then wins 1.09×/2.27× at the other selectivities).

No remaining regression exceeds ~4%. Every kept 1/1024 (highly selective) case is a 2.2–4.8× win.

Are there any user-facing changes?

N/A.

…t/Map/…)

`FilterPredicate::filter` previously fell back to the generic
`MutableArrayData` path for `List`/`LargeList`/`FixedSizeList`/`Map`. This adds
specialized kernels that map each retained run of parent rows to a contiguous
range of child elements and reuse the already-vectorized per-type child filter
kernels, instead of the generic byte-copy fallback.

Child handling is selectivity-aware (work is proportional to retained runs and
elements, not the full child length) and streams ranges without an intermediate
`Vec`: byte children go straight to `FilterBytes`, nested lists recurse, and
others use a `Slices` predicate. A child-type allowlist keeps types that can't
beat the fallback (dense `Union`, `RunEndEncoded`) on `MutableArrayData`, and a
cheap selectivity guard routes dense `Map` filters to the fallback too.

Adds benchmarks for the affected types in `arrow/benches/filter_kernels.rs`.
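
The selectivity guard and the streaming of ranges described in this commit message can be pictured with the sketch below. Everything in it is an assumption made for illustration: the commit message does not state a threshold, so `DENSE_FALLBACK_RATIO` is an invented value, and `use_fallback_for_dense` and `child_ranges` are hypothetical names operating on plain `usize` offsets.

```rust
/// Hypothetical cut-off; the PR's real guard and threshold may differ.
const DENSE_FALLBACK_RATIO: f64 = 0.9;

/// Cheap guard: route very dense filters to the generic fallback.
fn use_fallback_for_dense(kept_rows: usize, total_rows: usize) -> bool {
    total_rows > 0 && (kept_rows as f64) / (total_rows as f64) >= DENSE_FALLBACK_RATIO
}

/// Lazily translate runs of kept parent rows into child ranges.
/// Nothing is collected, so work scales with the number of runs.
fn child_ranges<'a>(
    offsets: &'a [usize],
    row_runs: impl Iterator<Item = (usize, usize)> + 'a,
) -> impl Iterator<Item = (usize, usize)> + 'a {
    row_runs.map(move |(start, end)| (offsets[start], offsets[end]))
}
```

A consumer can pull ranges from `child_ranges` one at a time and feed each straight into a copy, so no intermediate list of ranges is materialised.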
github-actions bot added the arrow (Changes to the arrow crate) label Jun 29, 2026
@alamb

alamb commented Jul 2, 2026

Contributor

run benchmark filter_kernel

@adriangbot


🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4864945476-805-lk6b6 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing jeadie/filter-list-specialization (da21a5a) to 7616e10 (merge-base) diff
BENCH_NAME=filter_kernel
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench filter_kernel
BENCH_FILTER=
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot


Benchmark for this request failed.

Last 20 lines of output:

  Downloaded async-stream v0.3.6
  Downloaded cobs v0.3.0
  Downloaded ciborium-ll v0.2.2
  Downloaded ciborium-io v0.2.2
  Downloaded alloca v0.4.0
  Downloaded fnv v1.0.7
  Downloaded darling_macro v0.23.0
  Downloaded const-random v0.1.18
  Downloaded clap_lex v1.1.0
  Downloaded cfg-if v1.0.4
  Downloaded derive_arbitrary v1.4.2
  Downloaded cfg_aliases v0.2.1
  Downloaded crc v3.4.0
  Downloaded form_urlencoded v1.2.2
  Downloaded crypto-common v0.1.7
  Downloaded crunchy v0.2.4
    Blocking waiting for file lock on package cache
error: no bench target named `filter_kernel` in default-run packages

help: a target with a similar name exists: `filter_kernels`


@Jefffrey

Jefffrey commented Jul 2, 2026

Contributor

run benchmark filter_kernels

@adriangbot


🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4865815942-808-4kzlp 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing jeadie/filter-list-specialization (da21a5a) to 7616e10 (merge-base) diff
BENCH_NAME=filter_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench filter_kernels
BENCH_FILTER=
Results will be posted here when complete



@adriangbot


🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                                                         jeadie_filter-list-specialization      main
-----                                                                         ---------------------------------      ----
filter context decimal128 (kept 1/2)                                          1.00     20.5±0.09µs        ? ?/sec    1.01     20.7±0.18µs        ? ?/sec
filter context decimal128 high selectivity (kept 1023/1024)                   1.02     19.2±0.18µs        ? ?/sec    1.00     18.8±0.20µs        ? ?/sec
filter context decimal128 low selectivity (kept 1/1024)                       1.00    146.1±0.74ns        ? ?/sec    1.01    147.2±0.61ns        ? ?/sec
filter context f32 (kept 1/2)                                                 1.00     83.2±5.49µs        ? ?/sec    1.02     84.9±6.38µs        ? ?/sec
filter context f32 high selectivity (kept 1023/1024)                          1.00      5.5±0.01µs        ? ?/sec    1.00      5.5±0.01µs        ? ?/sec
filter context f32 low selectivity (kept 1/1024)                              1.00   320.9±11.00ns        ? ?/sec    1.04   333.8±14.63ns        ? ?/sec
filter context fsb with value length 20 (kept 1/2)                            1.00     71.6±5.34µs        ? ?/sec    1.18    84.3±29.81µs        ? ?/sec
filter context fsb with value length 20 high selectivity (kept 1023/1024)     1.00     71.6±5.44µs        ? ?/sec    1.18    84.4±29.80µs        ? ?/sec
filter context fsb with value length 20 low selectivity (kept 1/1024)         1.00     71.5±5.32µs        ? ?/sec    1.18    84.3±29.87µs        ? ?/sec
filter context fsb with value length 5 (kept 1/2)                             1.00     71.7±5.41µs        ? ?/sec    1.18    84.4±29.81µs        ? ?/sec
filter context fsb with value length 5 high selectivity (kept 1023/1024)      1.00     71.6±5.48µs        ? ?/sec    1.18    84.4±29.80µs        ? ?/sec
filter context fsb with value length 5 low selectivity (kept 1/1024)          1.00     71.6±5.43µs        ? ?/sec    1.18    84.3±29.88µs        ? ?/sec
filter context fsb with value length 50 (kept 1/2)                            1.00     71.6±5.34µs        ? ?/sec    1.18    84.3±29.82µs        ? ?/sec
filter context fsb with value length 50 high selectivity (kept 1023/1024)     1.00     71.6±5.45µs        ? ?/sec    1.18    84.4±29.79µs        ? ?/sec
filter context fsb with value length 50 low selectivity (kept 1/1024)         1.00     71.5±5.34µs        ? ?/sec    1.18    84.3±29.88µs        ? ?/sec
filter context i32 (kept 1/2)                                                 1.14     14.3±0.09µs        ? ?/sec    1.00     12.5±0.01µs        ? ?/sec
filter context i32 high selectivity (kept 1023/1024)                          1.00      3.7±0.00µs        ? ?/sec    1.00      3.7±0.01µs        ? ?/sec
filter context i32 low selectivity (kept 1/1024)                              1.00    140.0±1.14ns        ? ?/sec    1.06    148.7±1.66ns        ? ?/sec
filter context i32 w NULLs (kept 1/2)                                         1.00     84.0±5.54µs        ? ?/sec    1.02     85.8±6.49µs        ? ?/sec
filter context i32 w NULLs high selectivity (kept 1023/1024)                  1.00      5.5±0.01µs        ? ?/sec    1.00      5.5±0.01µs        ? ?/sec
filter context i32 w NULLs low selectivity (kept 1/1024)                      1.00   328.4±11.29ns        ? ?/sec    1.01   330.2±13.40ns        ? ?/sec
filter context mixed string view (kept 1/2)                                   1.00     92.0±5.31µs        ? ?/sec    1.00     92.0±5.53µs        ? ?/sec
filter context mixed string view high selectivity (kept 1023/1024)            1.00     21.0±0.19µs        ? ?/sec    1.01     21.3±0.17µs        ? ?/sec
filter context mixed string view low selectivity (kept 1/1024)                1.01   417.1±15.17ns        ? ?/sec    1.00   413.2±15.71ns        ? ?/sec
filter context short string view (kept 1/2)                                   1.00     91.4±5.47µs        ? ?/sec    1.00     91.5±5.48µs        ? ?/sec
filter context short string view high selectivity (kept 1023/1024)            1.06     21.2±0.11µs        ? ?/sec    1.00     20.0±0.28µs        ? ?/sec
filter context short string view low selectivity (kept 1/1024)                1.02    356.2±9.77ns        ? ?/sec    1.00   350.6±11.41ns        ? ?/sec
filter context string (kept 1/2)                                              1.00    423.8±6.51µs        ? ?/sec    1.00    421.9±5.58µs        ? ?/sec
filter context string dictionary (kept 1/2)                                   1.00     12.7±0.02µs        ? ?/sec    1.00     12.7±0.02µs        ? ?/sec
filter context string dictionary high selectivity (kept 1023/1024)            1.02      4.2±0.00µs        ? ?/sec    1.00      4.1±0.00µs        ? ?/sec
filter context string dictionary low selectivity (kept 1/1024)                1.00    509.5±1.93ns        ? ?/sec    1.01    513.9±3.43ns        ? ?/sec
filter context string dictionary w NULLs (kept 1/2)                           1.00     85.5±5.43µs        ? ?/sec    1.03     87.8±6.44µs        ? ?/sec
filter context string dictionary w NULLs high selectivity (kept 1023/1024)    1.02      6.0±0.01µs        ? ?/sec    1.00      5.9±0.01µs        ? ?/sec
filter context string dictionary w NULLs low selectivity (kept 1/1024)        1.00   712.8±10.26ns        ? ?/sec    1.00   710.8±14.35ns        ? ?/sec
filter context string high selectivity (kept 1023/1024)                       1.00    317.9±5.65µs        ? ?/sec    1.01    320.3±2.13µs        ? ?/sec
filter context string low selectivity (kept 1/1024)                           1.00   707.8±12.44ns        ? ?/sec    1.07   757.4±10.00ns        ? ?/sec
filter context u8 (kept 1/2)                                                  1.00     12.1±0.02µs        ? ?/sec    1.00     12.1±0.01µs        ? ?/sec
filter context u8 high selectivity (kept 1023/1024)                           1.01   1066.2±3.78ns        ? ?/sec    1.00   1052.1±2.98ns        ? ?/sec
filter context u8 low selectivity (kept 1/1024)                               1.01    129.2±0.95ns        ? ?/sec    1.00    127.8±0.72ns        ? ?/sec
filter context u8 w NULLs (kept 1/2)                                          1.02     85.5±6.41µs        ? ?/sec    1.00     83.4±5.22µs        ? ?/sec
filter context u8 w NULLs high selectivity (kept 1023/1024)                   1.03      2.9±0.01µs        ? ?/sec    1.00      2.8±0.01µs        ? ?/sec
filter context u8 w NULLs low selectivity (kept 1/1024)                       1.00   313.0±13.17ns        ? ?/sec    1.00   312.1±10.03ns        ? ?/sec
filter decimal128 (kept 1/2)                                                  1.01     35.4±0.08µs        ? ?/sec    1.00     35.1±0.08µs        ? ?/sec
filter decimal128 high selectivity (kept 1023/1024)                           1.00     19.1±0.08µs        ? ?/sec    1.01     19.2±0.13µs        ? ?/sec
filter decimal128 low selectivity (kept 1/1024)                               1.00   1548.9±1.24ns        ? ?/sec    1.00   1541.7±1.55ns        ? ?/sec
filter f32 (kept 1/2)                                                         1.00    104.6±0.45µs        ? ?/sec    1.03    108.2±0.44µs        ? ?/sec
filter fixedsizelist (kept 1/2)                                               1.00    200.8±3.33µs        ? ?/sec  
filter fixedsizelist high selectivity (kept 1023/1024)                        1.00     19.8±0.12µs        ? ?/sec  
filter fixedsizelist low selectivity (kept 1/1024)                            1.00    760.5±4.78ns        ? ?/sec  
filter fsb with value length 20 (kept 1/2)                                    1.00     79.9±0.09µs        ? ?/sec    1.00     79.9±0.12µs        ? ?/sec
filter fsb with value length 20 high selectivity (kept 1023/1024)             1.00     24.3±0.47µs        ? ?/sec    1.07     26.0±1.38µs        ? ?/sec
filter fsb with value length 20 low selectivity (kept 1/1024)                 1.02   1664.1±9.02ns        ? ?/sec    1.00   1623.6±1.98ns        ? ?/sec
filter fsb with value length 5 (kept 1/2)                                     1.00     79.6±0.04µs        ? ?/sec    1.00     79.4±0.07µs        ? ?/sec
filter fsb with value length 5 high selectivity (kept 1023/1024)              1.00      5.9±0.05µs        ? ?/sec    1.10      6.5±1.14µs        ? ?/sec
filter fsb with value length 5 low selectivity (kept 1/1024)                  1.04   1619.4±3.11ns        ? ?/sec    1.00   1560.3±7.96ns        ? ?/sec
filter fsb with value length 50 (kept 1/2)                                    1.00    120.6±0.51µs        ? ?/sec    1.00    120.1±0.84µs        ? ?/sec
filter fsb with value length 50 high selectivity (kept 1023/1024)             1.00     87.6±6.60µs        ? ?/sec    1.01     88.6±5.44µs        ? ?/sec
filter fsb with value length 50 low selectivity (kept 1/1024)                 1.01   1633.3±1.22ns        ? ?/sec    1.00   1611.7±2.15ns        ? ?/sec
filter i32 (kept 1/2)                                                         1.00     29.4±0.03µs        ? ?/sec    1.00     29.3±0.03µs        ? ?/sec
filter i32 high selectivity (kept 1023/1024)                                  1.00      4.8±0.06µs        ? ?/sec    1.00      4.9±0.06µs        ? ?/sec
filter i32 low selectivity (kept 1/1024)                                      1.00   1496.7±0.90ns        ? ?/sec    1.00   1496.5±3.54ns        ? ?/sec
filter list binary (kept 1/2)                                                 1.00    722.1±9.73µs        ? ?/sec  
filter list binary high selectivity (kept 1023/1024)                          1.00    791.9±0.72µs        ? ?/sec  
filter list binary low selectivity (kept 1/1024)                              1.00  1984.5±14.12ns        ? ?/sec  
filter list dict (kept 1/2)                                                   1.00   317.3±10.22µs        ? ?/sec  
filter list dict high selectivity (kept 1023/1024)                            1.00     96.3±0.27µs        ? ?/sec  
filter list dict low selectivity (kept 1/1024)                                1.00  1543.2±10.04ns        ? ?/sec  
filter list fixedsizebinary (kept 1/2)                                        1.00   376.5±10.27µs        ? ?/sec  
filter list fixedsizebinary high selectivity (kept 1023/1024)                 1.00    128.5±0.56µs        ? ?/sec  
filter list fixedsizebinary low selectivity (kept 1/1024)                     1.00   1200.3±9.36ns        ? ?/sec  
filter list fixedsizelist (kept 1/2)                                          1.00    502.3±9.33µs        ? ?/sec  
filter list fixedsizelist high selectivity (kept 1023/1024)                   1.00    138.4±1.42µs        ? ?/sec  
filter list fixedsizelist low selectivity (kept 1/1024)                       1.00   1704.8±6.68ns        ? ?/sec  
filter list i32 (kept 1/2)                                                    1.00   315.9±10.03µs        ? ?/sec  
filter list i32 high selectivity (kept 1023/1024)                             1.00     94.9±0.25µs        ? ?/sec  
filter list i32 low selectivity (kept 1/1024)                                 1.00  1058.3±10.56ns        ? ?/sec  
filter list largebinary (kept 1/2)                                            1.00    734.4±9.89µs        ? ?/sec  
filter list largebinary high selectivity (kept 1023/1024)                     1.00    800.2±4.87µs        ? ?/sec  
filter list largebinary low selectivity (kept 1/1024)                         1.00      2.0±0.03µs        ? ?/sec  
filter list largeutf8 (kept 1/2)                                              1.00    732.2±9.07µs        ? ?/sec  
filter list largeutf8 high selectivity (kept 1023/1024)                       1.00    805.6±5.18µs        ? ?/sec  
filter list largeutf8 low selectivity (kept 1/1024)                           1.00      2.0±0.04µs        ? ?/sec  
filter list listview (kept 1/2)                                               1.00   381.0±10.84µs        ? ?/sec  
filter list listview high selectivity (kept 1023/1024)                        1.00    134.5±4.34µs        ? ?/sec  
filter list listview low selectivity (kept 1/1024)                            1.00  1298.2±13.15ns        ? ?/sec  
filter list map (kept 1/2)                                                    1.00  1154.9±11.60µs        ? ?/sec  
filter list map high selectivity (kept 1023/1024)                             1.00   1478.6±3.38µs        ? ?/sec  
filter list map low selectivity (kept 1/1024)                                 1.00      3.5±0.01µs        ? ?/sec  
filter list nested (kept 1/2)                                                 1.00   653.8±10.15µs        ? ?/sec  
filter list nested high selectivity (kept 1023/1024)                          1.00    346.5±1.89µs        ? ?/sec  
filter list nested low selectivity (kept 1/1024)                              1.00  1905.9±12.79ns        ? ?/sec  
filter list struct (kept 1/2)                                                 1.00   318.3±10.80µs        ? ?/sec  
filter list struct high selectivity (kept 1023/1024)                          1.00     97.2±5.04µs        ? ?/sec  
filter list struct low selectivity (kept 1/1024)                              1.00  1239.0±11.76ns        ? ?/sec  
filter list union (kept 1/2)                                                  1.00    475.3±9.87µs        ? ?/sec  
filter list union high selectivity (kept 1023/1024)                           1.00    185.7±0.61µs        ? ?/sec  
filter list union low selectivity (kept 1/1024)                               1.00      2.3±0.02µs        ? ?/sec  
filter list utf8 (kept 1/2)                                                   1.00    722.8±8.53µs        ? ?/sec  
filter list utf8 10xrows (kept 1/2)                                           1.00      7.9±0.14ms        ? ?/sec  
filter list utf8 high selectivity (kept 1023/1024)                            1.00    791.1±0.90µs        ? ?/sec  
filter list utf8 len256 (kept 1/2)                                            1.00     14.5±0.15ms        ? ?/sec  
filter list utf8 len64 (kept 1/2)                                             1.00  1285.5±21.41µs        ? ?/sec  
filter list utf8 len8 (kept 1/2)                                              1.00    726.6±9.37µs        ? ?/sec  
filter list utf8 low selectivity (kept 1/1024)                                1.00   1900.9±8.24ns        ? ?/sec  
filter list utf8view (kept 1/2)                                               1.00   451.1±12.24µs        ? ?/sec  
filter list utf8view high selectivity (kept 1023/1024)                        1.00    215.1±1.54µs        ? ?/sec  
filter list utf8view low selectivity (kept 1/1024)                            1.00  1092.2±13.03ns        ? ?/sec  
filter map (kept 1/2)                                                         1.00    597.9±9.36µs        ? ?/sec  
filter map high selectivity (kept 1023/1024)                                  1.00    607.5±0.54µs        ? ?/sec  
filter map low selectivity (kept 1/1024)                                      1.00      2.0±0.01µs        ? ?/sec  
filter optimize (kept 1/2)                                                    1.06     29.5±0.07µs        ? ?/sec    1.00     27.7±0.11µs        ? ?/sec
filter optimize high selectivity (kept 1023/1024)                             1.04   1383.7±3.30ns        ? ?/sec    1.00   1336.3±0.68ns        ? ?/sec
filter optimize low selectivity (kept 1/1024)                                 1.00   1321.0±0.73ns        ? ?/sec    1.00   1316.0±0.73ns        ? ?/sec
filter run array (kept 1/2)                                                   1.00    280.5±2.78µs        ? ?/sec    1.05    295.6±1.75µs        ? ?/sec
filter run array high selectivity (kept 1023/1024)                            1.00    280.7±5.38µs        ? ?/sec    1.02    287.0±4.72µs        ? ?/sec
filter run array low selectivity (kept 1/1024)                                1.00    230.8±1.07µs        ? ?/sec    1.02    236.3±1.05µs        ? ?/sec
filter single record batch                                                    1.00     29.2±0.07µs        ? ?/sec    1.01     29.4±0.07µs        ? ?/sec
filter u8 (kept 1/2)                                                          1.00     29.5±0.15µs        ? ?/sec    1.04     30.6±0.02µs        ? ?/sec
filter u8 high selectivity (kept 1023/1024)                                   1.01      2.2±0.04µs        ? ?/sec    1.00      2.2±0.04µs        ? ?/sec
filter u8 low selectivity (kept 1/1024)                                       1.01  1478.3±32.66ns        ? ?/sec    1.00   1457.6±2.21ns        ? ?/sec

Resource Usage

base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 695.2s |
| Peak memory | 29.0 MiB |
| Avg memory | 17.0 MiB |
| CPU user | 691.3s |
| CPU sys | 0.1s |
| Peak spill | 0 B |

branch

| Metric | Value |
| --- | --- |
| Wall time | 1210.3s |
| Peak memory | 207.5 MiB |
| Avg memory | 46.4 MiB |
| CPU user | 1201.6s |
| CPU sys | 6.9s |
| Peak spill | 0 B |

