Benchmark prs #1529

Open
jprendes wants to merge 3 commits into hyperlight-dev:main from jprendes:benchmark-prs

@jprendes

Contributor

No description provided.

@jprendes force-pushed the benchmark-prs branch 2 times, most recently from 8ad9393 to 0e181c4 (June 12, 2026 10:43)
@jprendes added the kind/enhancement label (For PRs adding features, improving functionality, docs, tests, etc.) on Jun 12, 2026
@jprendes force-pushed the benchmark-prs branch 4 times, most recently from bc499ea to a277ac1 (June 17, 2026 14:48)
@jprendes force-pushed the benchmark-prs branch 3 times, most recently from e660953 to 9ce9ed0 (June 18, 2026 11:22)
@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ 1.81x slower → 🚀 2.41x faster)

function_call_serialization

serialize_function_call deserialize_function_call
7.91 ms (❌ 1.81x slower) 10.23 ms (❌ 1.19x slower)

guest_calls

call_with_host_function call_with_restore call
medium 23.92 µs (✅ 1.07x slower) 70.81 µs (✅ 1.13x faster) 12.77 µs (✅ 1.02x slower)
small 20.73 µs (✅ 1.11x faster) 65.39 µs (✅ 1.08x faster) 11.50 µs (✅ 1.08x faster)
default 23.58 µs (✅ 1.02x slower) 62.36 µs (✅ 1.00x slower) 12.91 µs (✅ 1.02x slower)
large 24.52 µs (✅ 1.07x slower) 89.47 µs (✅ 1.17x faster) 12.97 µs (✅ 1.02x slower)

different_thread interrupt_latency
9.82 ms (❌ 1.48x slower) 22.93 µs (❌ 1.13x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
596.90 ms (✅ 1.07x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
22.33 µs (✅ 1.68x faster) 21.02 µs (🚀 1.86x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 28.23 ms (🚀 2.31x faster) 27.82 ms (🚀 2.41x faster) 29.26 ms (🚀 2.25x faster) 39.64 ms (🚀 1.98x faster)
medium 8.62 ms (🚀 2.05x faster) 9.84 ms (🚀 1.97x faster) 9.12 ms (🚀 1.97x faster) 24.22 ms (✅ 1.27x faster)
small 2.18 ms (✅ 1.73x faster) 5.11 ms (✅ 1.03x slower) 2.43 ms (✅ 1.47x faster) 18.61 ms (❌ 1.19x slower)
default 403.76 µs (✅ 1.27x faster) 2.59 ms (❌ 1.43x slower) 449.76 µs (✅ 1.25x faster) 18.50 ms (❌ 1.59x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.72 ms (✅ 1.11x slower) 3.03 ms (❌ 1.20x slower) 4.48 ms (✅ 1.10x slower)
1MB 40.44 µs (✅ 1.02x slower) 28.08 µs (✅ 1.30x faster) 38.55 µs (✅ 1.07x slower)

snapshots

restore create
medium 15.56 µs (✅ 1.03x faster) 28.75 ms (✅ 1.21x faster)
large 36.68 µs (✅ 1.41x faster) 100.35 ms (✅ 1.40x faster)
small 10.73 µs (❌ 1.13x slower) 2.93 ms (✅ 1.20x faster)
default 9.83 µs (✅ 1.03x faster) 302.77 µs (✅ 1.28x faster)

Summary

  • Biggest gain: sandboxes/create_initialized/large — 🚀 2.41x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.81x slower
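
The two summary lines can be cross-checked against the tables. The sketch below is only an illustration of that check, not the bot's actual implementation: the tuple layout and the `signed_factor` and `summarize` helpers are hypothetical names, and the rows are a hand-copied subset of the kvm / amd figures above. It places "N x faster" at +N and "N x slower" at -N, then takes the maximum and minimum.

```python
# Rows copied by hand from the kvm / amd tables above:
# (benchmark path, factor, direction).
# The tuple layout is a hypothetical choice for this sketch, not the bot's format.
ROWS = [
    ("function_call_serialization/serialize_function_call", 1.81, "slower"),
    ("function_call_serialization/deserialize_function_call", 1.19, "slower"),
    ("sample_workloads/24K_in_8K_out_rust", 1.86, "faster"),
    ("sandboxes/create_uninitialized/large", 2.31, "faster"),
    ("sandboxes/create_initialized/large", 2.41, "faster"),
]


def signed_factor(row):
    """Place 'N x faster' at +N and 'N x slower' at -N on a single axis."""
    _, factor, direction = row
    return factor if direction == "faster" else -factor


def summarize(rows):
    """Return (biggest gain, worst regression) for a list of rows."""
    return max(rows, key=signed_factor), min(rows, key=signed_factor)


best, worst = summarize(ROWS)
print(f"Biggest gain: {best[0]} ({best[1]:.2f}x {best[2]})")
print(f"Worst regression: {worst[0]} ({worst[1]:.2f}x {worst[2]})")
```

With this ordering, a platform that has no real regressions still reports its mildest "slower" entry as the worst regression.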

kvm / intel (Linux) (❌ 1.88x slower → 🚀 7.91x faster)

function_call_serialization

serialize_function_call deserialize_function_call
7.01 ms (❌ 1.23x slower) 10.82 ms (✅ 1.15x faster)

guest_calls

call_with_host_function call_with_restore call
medium 39.32 µs (✅ 1.04x slower) 43.21 µs (✅ 1.03x slower) 20.59 µs (✅ 1.07x slower)
small 39.81 µs (✅ 1.07x slower) 37.25 µs (✅ 1.01x faster) 20.58 µs (✅ 1.10x slower)
default 36.34 µs (✅ 1.03x faster) 36.97 µs (✅ 1.02x faster) 21.09 µs (✅ 1.09x slower)
large 37.04 µs (✅ 1.05x slower) 80.05 µs (✅ 1.04x slower) 19.39 µs (✅ 1.00x slower)

different_thread interrupt_latency
10.90 ms (❌ 1.64x slower) 14.60 µs (🚀 1.85x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
653.71 ms (✅ 1.04x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
34.89 µs (🚀 3.07x faster) 34.42 µs (🚀 3.24x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 66.06 ms (🚀 2.32x faster) 70.78 ms (🚀 2.20x faster) 68.71 ms (🚀 2.24x faster) 80.86 ms (🚀 2.08x faster)
medium 18.34 ms (🚀 2.21x faster) 20.73 ms (🚀 2.07x faster) 18.33 ms (🚀 2.20x faster) 36.35 ms (✅ 1.51x faster)
small 2.30 ms (✅ 1.55x faster) 4.28 ms (✅ 1.29x faster) 2.39 ms (✅ 1.53x faster) 21.46 ms (❌ 1.24x slower)
default 408.44 µs (✅ 1.20x faster) 2.19 ms (✅ 1.05x slower) 416.68 µs (✅ 1.29x faster) 21.34 ms (❌ 1.88x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 6.08 ms (🚀 1.85x faster) 3.03 ms (✅ 1.02x faster) 4.73 ms (✅ 1.17x faster)
1MB 49.10 µs (✅ 1.04x slower) 29.03 µs (✅ 1.25x faster) 44.40 µs (✅ 1.11x slower)

snapshots

restore create
medium 15.95 µs (✅ 1.04x faster) 44.43 ms (✅ 1.68x faster)
large 67.05 µs (🚀 7.91x faster) 173.59 ms (✅ 1.70x faster)
small 11.79 µs (✅ 1.03x slower) 2.71 ms (✅ 1.46x faster)
default 11.17 µs (✅ 1.00x slower) 301.70 µs (✅ 1.27x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 7.91x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.88x slower

mshv3 / amd (Linux) (❌ 2.19x slower → 🚀 2.85x faster)

function_call_serialization

deserialize_function_call serialize_function_call
12.21 ms (❌ 1.77x slower) 8.42 ms (❌ 1.59x slower)

guest_calls

call_with_host_function call_with_restore call
default 67.83 µs (✅ 1.35x faster) 154.02 µs (✅ 1.07x faster) 39.15 µs (✅ 1.35x faster)
small 69.99 µs (✅ 1.30x faster) 154.02 µs (✅ 1.02x slower) 41.88 µs (✅ 1.33x faster)
large 61.27 µs (✅ 1.49x faster) 213.11 µs (✅ 1.09x faster) 39.94 µs (✅ 1.39x faster)
medium 69.47 µs (✅ 1.23x faster) 173.44 µs (✅ 1.04x slower) 43.01 µs (✅ 1.32x faster)

different_thread interrupt_latency
54.00 µs (✅ 1.42x faster) 45.40 µs (✅ 1.09x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
86.23 ms (❌ 1.15x slower)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
63.91 µs (✅ 1.33x faster) 60.65 µs (✅ 1.33x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 94.72 ms (❌ 1.52x slower) 10.57 ms (🚀 2.46x faster) 10.34 ms (🚀 2.49x faster) 12.61 ms (🚀 2.25x faster)
small 91.72 ms (❌ 2.19x slower) 2.38 ms (✅ 1.71x faster) 2.19 ms (✅ 1.77x faster) 3.90 ms (✅ 1.49x faster)
large 184.48 ms (❌ 1.28x slower) 33.93 ms (🚀 2.85x faster) 33.59 ms (🚀 2.81x faster) 39.90 ms (🚀 2.58x faster)
default 48.76 ms (❌ 1.41x slower) 516.20 µs (✅ 1.26x faster) 463.74 µs (✅ 1.32x faster) 1.40 ms (✅ 1.09x faster)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.30 ms (✅ 1.06x faster) 5.18 ms (✅ 1.06x slower) 4.11 ms (✅ 1.10x slower)
1MB 42.62 µs (✅ 1.03x slower) 42.48 µs (✅ 1.01x faster) 41.49 µs (✅ 1.00x slower)

snapshots

restore create
default 82.36 µs (✅ 1.02x faster) 358.92 µs (✅ 1.42x faster)
large 1.80 ms (🚀 1.86x faster) 196.62 ms (✅ 1.22x faster)
medium 94.28 µs (✅ 1.30x faster) 47.91 ms (✅ 1.20x faster)
small 85.34 µs (✅ 1.03x faster) 2.78 ms (✅ 1.59x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized_and_drop/large — 🚀 2.85x faster
  • Worst regression: sandboxes/create_initialized_and_drop/small — ❌ 2.19x slower

mshv3 / intel (Linux) (❌ 1.56x slower → 🚀 1.89x faster)

function_call_serialization

deserialize_function_call serialize_function_call
14.56 ms (✅ 1.05x slower) 11.73 ms (❌ 1.12x slower)

guest_calls

call_with_restore call call_with_host_function
medium 264.19 µs (✅ 1.07x faster) 78.33 µs (✅ 1.31x faster) 120.92 µs (✅ 1.39x faster)
default 245.26 µs (✅ 1.13x faster) 79.13 µs (✅ 1.33x faster) 121.14 µs (✅ 1.35x faster)
small 251.80 µs (✅ 1.09x faster) 79.76 µs (✅ 1.34x faster) 124.74 µs (✅ 1.38x faster)
large 348.65 µs (✅ 1.18x faster) 80.72 µs (✅ 1.28x faster) 115.04 µs (✅ 1.52x faster)

different_thread interrupt_latency
87.77 µs (✅ 1.43x faster) 70.06 µs (✅ 1.23x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
113.56 ms (✅ 1.13x faster)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
98.22 µs (✅ 1.47x faster) 93.62 µs (✅ 1.49x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
large 230.76 ms (❌ 1.49x slower) 63.71 ms (✅ 1.66x faster) 63.13 ms (✅ 1.69x faster) 70.62 ms (✅ 1.59x faster)
small 59.00 ms (❌ 1.40x slower) 2.56 ms (✅ 1.76x faster) 2.31 ms (🚀 1.89x faster) 5.07 ms (✅ 1.28x faster)
medium 80.48 ms (❌ 1.27x slower) 17.28 ms (✅ 1.63x faster) 17.35 ms (✅ 1.63x faster) 20.46 ms (✅ 1.54x faster)
default 51.96 ms (❌ 1.56x slower) 386.46 µs (✅ 1.32x faster) 355.91 µs (✅ 1.38x faster) 1.52 ms (✅ 1.10x faster)

shared_memory

copy_to_slice copy_from_slice fill
1MB 41.58 µs (✅ 1.07x slower) 70.21 µs (❌ 1.12x slower) 34.91 µs (✅ 1.08x faster)
64MB 8.69 ms (✅ 1.09x faster) 11.42 ms (✅ 1.01x slower) 7.51 ms (✅ 1.03x slower)

snapshots

create restore
medium 42.42 ms (✅ 1.29x faster) 182.54 µs (❌ 1.14x slower)
default 358.88 µs (✅ 1.27x faster) 160.07 µs (✅ 1.02x slower)
small 4.11 ms (✅ 1.45x faster) 157.98 µs (✅ 1.03x slower)
large 165.26 ms (✅ 1.27x faster) 1.80 ms (✅ 1.45x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/small — 🚀 1.89x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.56x slower

hyperv-ws2025 / amd (Windows) (❌ 4.96x slower → ✅ 1.42x faster)

function_call_serialization

deserialize_function_call serialize_function_call
11.54 ms (❌ 1.18x slower) 12.55 ms (❌ 1.46x slower)

guest_calls

call call_with_host_function call_with_restore
default 54.08 µs (❌ 1.42x slower) 92.17 µs (❌ 1.37x slower) 208.67 µs (❌ 1.64x slower)
large 59.01 µs (❌ 1.46x slower) 120.51 µs (❌ 1.78x slower) 445.99 µs (❌ 1.79x slower)
medium 54.22 µs (❌ 1.42x slower) 107.02 µs (❌ 1.60x slower) 276.33 µs (❌ 1.68x slower)
small 53.11 µs (❌ 1.39x slower) 108.09 µs (❌ 1.71x slower) 206.86 µs (❌ 1.39x slower)

different_thread interrupt_latency
84.65 µs (✅ 1.10x slower) 148.91 µs (❌ 4.96x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.92 s (❌ 1.49x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
94.42 µs (❌ 1.30x slower) 90.34 µs (❌ 1.35x slower)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 7.06 ms (❌ 1.41x slower) 6.89 ms (❌ 1.62x slower) 1.68 ms (❌ 1.37x slower) 1.90 ms (❌ 1.29x slower)
large 288.00 ms (✅ 1.17x faster) 239.45 ms (✅ 1.25x faster) 393.73 ms (✅ 1.05x slower) 315.94 ms (✅ 1.10x faster)
medium 96.81 ms (❌ 1.14x slower) 88.33 ms (❌ 1.17x slower) 78.66 ms (✅ 1.21x faster) 61.84 ms (✅ 1.42x faster)
small 19.84 ms (❌ 1.28x slower) 13.62 ms (✅ 1.01x slower) 10.73 ms (✅ 1.18x faster) 8.73 ms (✅ 1.37x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 52.17 µs (❌ 1.22x slower) 53.45 µs (❌ 1.14x slower) 43.71 µs (✅ 1.05x slower)
64MB 10.77 ms (❌ 1.29x slower) 13.19 ms (❌ 1.40x slower) 6.89 ms (❌ 1.11x slower)

snapshots

create restore
default 975.92 µs (❌ 1.34x slower) 126.94 µs (❌ 1.68x slower)
large 423.02 ms (❌ 1.12x slower) 65.22 ms (❌ 1.67x slower)
medium 89.67 ms (✅ 1.00x slower) 1.12 ms (❌ 4.23x slower)
small 11.11 ms (✅ 1.02x faster) 136.47 µs (❌ 1.99x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized/medium — ✅ 1.42x faster
  • Worst regression: guest_calls/interrupt_latency — ❌ 4.96x slower

hyperv-ws2025 / intel (Windows) (❌ 6.68x slower → ✅ 1.31x faster)

function_call_serialization

deserialize_function_call serialize_function_call
15.57 ms (✅ 1.06x slower) 12.67 ms (✅ 1.08x slower)

guest_calls

call call_with_host_function call_with_restore
default 87.22 µs (❌ 1.17x slower) 123.51 µs (✅ 1.07x slower) 260.67 µs (✅ 1.08x slower)
large 82.18 µs (✅ 1.06x faster) 135.39 µs (❌ 1.16x slower) 790.72 µs (❌ 1.51x slower)
medium 82.22 µs (✅ 1.00x faster) 122.83 µs (✅ 1.07x faster) 351.26 µs (❌ 1.19x slower)
small 85.73 µs (✅ 1.08x slower) 123.63 µs (✅ 1.01x slower) 262.66 µs (✅ 1.07x slower)

different_thread interrupt_latency
101.76 µs (✅ 1.11x faster) 161.41 µs (❌ 2.44x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
5.71 s (❌ 1.22x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
105.93 µs (✅ 1.09x faster) 107.90 µs (✅ 1.08x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 8.05 ms (✅ 1.04x slower) 6.77 ms (❌ 1.16x slower) 1.34 ms (✅ 1.05x slower) 1.03 ms (✅ 1.09x faster)
large 398.19 ms (✅ 1.01x slower) 313.32 ms (✅ 1.04x faster) 399.59 ms (❌ 1.14x slower) 292.09 ms (✅ 1.05x faster)
medium 104.07 ms (✅ 1.00x slower) 81.05 ms (✅ 1.08x faster) 100.29 ms (❌ 1.15x slower) 75.12 ms (✅ 1.05x faster)
small 21.06 ms (✅ 1.01x slower) 16.71 ms (✅ 1.04x faster) 14.14 ms (❌ 1.16x slower) 12.65 ms (❌ 1.19x slower)

shared_memory

copy_from_slice copy_to_slice fill
1MB 69.43 µs (✅ 1.04x slower) 71.10 µs (✅ 1.04x slower) 35.20 µs (✅ 1.05x slower)
64MB 14.48 ms (✅ 1.01x slower) 19.65 ms (❌ 1.21x slower) 10.98 ms (✅ 1.10x slower)

snapshots

create restore
default 759.73 µs (✅ 1.31x faster) 151.57 µs (✅ 1.00x slower)
large 428.73 ms (✅ 1.11x faster) 76.93 ms (❌ 1.34x slower)
medium 110.64 ms (✅ 1.10x faster) 6.02 ms (❌ 6.68x slower)
small 14.38 ms (✅ 1.09x faster) 169.71 µs (❌ 1.19x slower)

Summary

  • Biggest gain: snapshots/create/default — ✅ 1.31x faster
  • Worst regression: snapshots/restore/medium — ❌ 6.68x slower

@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ 1.92x slower → 🚀 2.44x faster)

function_call_serialization

serialize_function_call deserialize_function_call
8.39 ms (❌ 1.92x slower) 8.17 ms (✅ 1.05x faster)

guest_calls

call_with_host_function call_with_restore call
medium 23.61 µs (✅ 1.06x slower) 70.44 µs (✅ 1.14x faster) 12.76 µs (✅ 1.02x slower)
small 20.75 µs (✅ 1.10x faster) 65.94 µs (✅ 1.08x faster) 11.44 µs (✅ 1.10x faster)
default 24.01 µs (✅ 1.04x slower) 61.98 µs (✅ 1.01x faster) 12.75 µs (✅ 1.01x slower)
large 24.42 µs (✅ 1.06x slower) 89.09 µs (✅ 1.17x faster) 12.94 µs (✅ 1.03x slower)

different_thread interrupt_latency
9.69 ms (❌ 1.46x slower) 25.97 µs (❌ 1.28x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
598.13 ms (✅ 1.07x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
20.93 µs (🚀 1.81x faster) 22.34 µs (✅ 1.72x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 28.89 ms (🚀 2.25x faster) 27.47 ms (🚀 2.44x faster) 29.09 ms (🚀 2.27x faster) 38.79 ms (🚀 2.02x faster)
medium 8.75 ms (🚀 2.02x faster) 9.90 ms (🚀 1.96x faster) 9.02 ms (🚀 1.99x faster) 24.61 ms (✅ 1.25x faster)
small 2.28 ms (✅ 1.65x faster) 3.66 ms (✅ 1.36x faster) 2.32 ms (✅ 1.54x faster) 19.58 ms (❌ 1.25x slower)
default 403.02 µs (✅ 1.28x faster) 1.86 ms (✅ 1.03x slower) 445.02 µs (✅ 1.27x faster) 18.91 ms (❌ 1.63x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.70 ms (✅ 1.10x slower) 2.96 ms (❌ 1.18x slower) 5.50 ms (❌ 1.35x slower)
1MB 40.14 µs (✅ 1.02x slower) 28.00 µs (✅ 1.30x faster) 41.95 µs (❌ 1.11x slower)

snapshots

restore create
medium 15.99 µs (✅ 1.06x faster) 30.21 ms (✅ 1.16x faster)
large 36.55 µs (✅ 1.46x faster) 98.60 ms (✅ 1.42x faster)
small 11.40 µs (✅ 1.08x slower) 2.88 ms (✅ 1.22x faster)
default 9.43 µs (✅ 1.06x faster) 305.87 µs (✅ 1.29x faster)

Summary

  • Biggest gain: sandboxes/create_initialized/large — 🚀 2.44x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.92x slower

kvm / intel (Linux) (❌ 1.70x slower → 🚀 5.39x faster)

function_call_serialization

serialize_function_call deserialize_function_call
7.15 ms (❌ 1.25x slower) 9.96 ms (✅ 1.25x faster)

guest_calls

call_with_host_function call_with_restore call
medium 38.21 µs (✅ 1.01x slower) 39.63 µs (✅ 1.04x faster) 19.73 µs (✅ 1.01x slower)
small 34.55 µs (✅ 1.07x faster) 35.37 µs (✅ 1.06x faster) 19.86 µs (✅ 1.07x slower)
default 38.21 µs (✅ 1.03x slower) 35.90 µs (✅ 1.06x faster) 19.70 µs (✅ 1.03x slower)
large 38.51 µs (✅ 1.09x slower) 75.44 µs (✅ 1.02x faster) 18.43 µs (✅ 1.05x faster)

different_thread interrupt_latency
10.29 ms (❌ 1.55x slower) 15.21 µs (✅ 1.78x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
638.15 ms (✅ 1.07x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
30.93 µs (🚀 3.40x faster) 31.88 µs (🚀 3.47x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 75.42 ms (🚀 2.03x faster) 78.01 ms (🚀 2.00x faster) 74.83 ms (🚀 2.06x faster) 91.67 ms (🚀 1.84x faster)
medium 19.55 ms (🚀 2.08x faster) 22.36 ms (🚀 1.92x faster) 19.71 ms (🚀 2.04x faster) 35.27 ms (✅ 1.56x faster)
small 2.21 ms (✅ 1.61x faster) 4.23 ms (✅ 1.31x faster) 2.41 ms (✅ 1.52x faster) 20.69 ms (❌ 1.20x slower)
default 376.93 µs (✅ 1.30x faster) 2.05 ms (✅ 1.02x faster) 407.56 µs (✅ 1.32x faster) 19.27 ms (❌ 1.70x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 6.08 ms (🚀 1.85x faster) 3.39 ms (✅ 1.10x slower) 4.56 ms (✅ 1.22x faster)
1MB 47.28 µs (✅ 1.00x faster) 27.83 µs (✅ 1.31x faster) 44.24 µs (✅ 1.08x slower)

snapshots

restore create
medium 15.12 µs (✅ 1.11x faster) 47.89 ms (✅ 1.56x faster)
large 66.00 µs (🚀 5.39x faster) 193.81 ms (✅ 1.53x faster)
small 11.61 µs (✅ 1.03x slower) 2.72 ms (✅ 1.46x faster)
default 10.68 µs (✅ 1.02x faster) 299.04 µs (✅ 1.30x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 5.39x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.70x slower

mshv3 / amd (Linux) (❌ 1.98x slower → 🚀 2.60x faster)

function_call_serialization

deserialize_function_call serialize_function_call
13.03 ms (❌ 1.89x slower) 9.02 ms (❌ 1.71x slower)

guest_calls

call_with_host_function call_with_restore call
default 68.32 µs (✅ 1.35x faster) 133.63 µs (✅ 1.22x faster) 39.51 µs (✅ 1.34x faster)
small 66.41 µs (✅ 1.37x faster) 150.98 µs (✅ 1.00x faster) 44.17 µs (✅ 1.25x faster)
large 70.43 µs (✅ 1.29x faster) 208.55 µs (✅ 1.12x faster) 40.19 µs (✅ 1.39x faster)
medium 70.11 µs (✅ 1.28x faster) 160.10 µs (✅ 1.04x faster) 42.22 µs (✅ 1.31x faster)

different_thread interrupt_latency
57.63 µs (✅ 1.34x faster) 44.31 µs (✅ 1.12x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
86.99 ms (❌ 1.16x slower)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
57.44 µs (✅ 1.49x faster) 61.58 µs (✅ 1.31x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 106.92 ms (❌ 1.71x slower) 11.47 ms (🚀 2.27x faster) 11.05 ms (🚀 2.32x faster) 13.89 ms (🚀 2.04x faster)
small 82.80 ms (❌ 1.98x slower) 2.42 ms (✅ 1.68x faster) 2.17 ms (✅ 1.79x faster) 4.42 ms (✅ 1.31x faster)
large 198.04 ms (❌ 1.37x slower) 37.85 ms (🚀 2.56x faster) 36.37 ms (🚀 2.60x faster) 43.58 ms (🚀 2.36x faster)
default 49.51 ms (❌ 1.44x slower) 497.14 µs (✅ 1.29x faster) 459.41 µs (✅ 1.33x faster) 1.59 ms (✅ 1.04x slower)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.35 ms (✅ 1.05x faster) 5.42 ms (❌ 1.11x slower) 4.38 ms (❌ 1.17x slower)
1MB 42.31 µs (✅ 1.01x slower) 43.13 µs (✅ 1.00x faster) 41.46 µs (✅ 1.01x slower)

snapshots

restore create
default 82.39 µs (✅ 1.02x faster) 357.16 µs (✅ 1.43x faster)
large 1.62 ms (🚀 2.07x faster) 195.13 ms (✅ 1.23x faster)
medium 98.24 µs (✅ 1.18x faster) 48.58 ms (✅ 1.18x faster)
small 87.38 µs (✅ 1.00x slower) 2.90 ms (✅ 1.52x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — 🚀 2.60x faster
  • Worst regression: sandboxes/create_initialized_and_drop/small — ❌ 1.98x slower

mshv3 / intel (Linux) (❌ 2.33x slower → 🚀 1.94x faster)

function_call_serialization

deserialize_function_call serialize_function_call
14.40 ms (✅ 1.04x slower) 11.87 ms (❌ 1.13x slower)

guest_calls

call call_with_restore call_with_host_function
small 79.07 µs (✅ 1.34x faster) 243.84 µs (✅ 1.14x faster) 115.85 µs (✅ 1.43x faster)
large 70.91 µs (✅ 1.45x faster) 331.88 µs (✅ 1.24x faster) 106.94 µs (✅ 1.65x faster)
medium 77.63 µs (✅ 1.32x faster) 236.32 µs (✅ 1.21x faster) 113.70 µs (✅ 1.49x faster)
default 76.25 µs (✅ 1.39x faster) 240.87 µs (✅ 1.15x faster) 118.87 µs (✅ 1.40x faster)

interrupt_latency different_thread
61.04 µs (✅ 1.41x faster) 82.67 µs (✅ 1.54x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
114.29 ms (✅ 1.12x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
92.98 µs (✅ 1.54x faster) 94.16 µs (✅ 1.53x faster)

sandboxes

create_uninitialized create_uninitialized_and_drop create_initialized_and_drop create_initialized
medium 16.69 ms (✅ 1.69x faster) 17.05 ms (✅ 1.65x faster) 147.48 ms (❌ 2.33x slower) 20.09 ms (✅ 1.56x faster)
default 362.39 µs (✅ 1.36x faster) 371.26 µs (✅ 1.39x faster) 49.84 ms (❌ 1.50x slower) 1.46 ms (✅ 1.14x faster)
large 61.97 ms (✅ 1.73x faster) 63.88 ms (✅ 1.66x faster) 224.00 ms (❌ 1.44x slower) 69.87 ms (✅ 1.61x faster)
small 2.24 ms (🚀 1.94x faster) 2.90 ms (✅ 1.55x faster) 97.52 ms (❌ 2.31x slower) 5.08 ms (✅ 1.27x faster)

shared_memory

fill copy_from_slice copy_to_slice
64MB 7.40 ms (✅ 1.02x slower) 11.69 ms (✅ 1.03x slower) 8.95 ms (✅ 1.06x faster)
1MB 33.44 µs (✅ 1.14x faster) 63.75 µs (✅ 1.02x slower) 41.50 µs (✅ 1.07x slower)

snapshots

create restore
large 162.87 ms (✅ 1.29x faster) 1.74 ms (✅ 1.50x faster)
default 354.95 µs (✅ 1.28x faster) 153.93 µs (✅ 1.03x faster)
small 4.39 ms (✅ 1.36x faster) 158.39 µs (✅ 1.03x slower)
medium 41.74 ms (✅ 1.32x faster) 169.35 µs (✅ 1.04x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/small — 🚀 1.94x faster
  • Worst regression: sandboxes/create_initialized_and_drop/medium — ❌ 2.33x slower

hyperv-ws2025 / amd (Windows) (❌ 8.66x slower → ✅ 1.11x faster)

function_call_serialization

deserialize_function_call serialize_function_call
13.63 ms (❌ 1.39x slower) 12.67 ms (❌ 1.48x slower)

guest_calls

call call_with_host_function call_with_restore
default 48.65 µs (❌ 1.27x slower) 78.79 µs (❌ 1.23x slower) 174.66 µs (❌ 1.26x slower)
large 53.10 µs (❌ 1.29x slower) 93.00 µs (❌ 1.43x slower) 461.88 µs (❌ 1.78x slower)
medium 51.24 µs (❌ 1.40x slower) 96.78 µs (❌ 1.44x slower) 229.54 µs (❌ 1.40x slower)
small 50.33 µs (❌ 1.33x slower) 78.57 µs (❌ 1.26x slower) 187.71 µs (❌ 1.30x slower)

different_thread interrupt_latency
78.47 µs (✅ 1.04x slower) 153.79 µs (❌ 5.12x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.74 s (❌ 1.44x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
98.65 µs (❌ 1.34x slower) 82.02 µs (❌ 1.17x slower)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 8.24 ms (❌ 1.64x slower) 6.29 ms (❌ 1.48x slower) 1.39 ms (❌ 1.24x slower) 1.50 ms (✅ 1.00x faster)
large 405.55 ms (❌ 1.21x slower) 320.54 ms (✅ 1.07x slower) 388.69 ms (✅ 1.04x slower) 311.94 ms (✅ 1.11x faster)
medium 104.65 ms (❌ 1.23x slower) 107.36 ms (❌ 1.42x slower) 95.10 ms (✅ 1.00x slower) 79.46 ms (✅ 1.10x faster)
small 20.52 ms (❌ 1.32x slower) 17.29 ms (❌ 1.28x slower) 13.25 ms (✅ 1.04x slower) 15.18 ms (❌ 1.27x slower)

shared_memory

copy_from_slice copy_to_slice fill
1MB 50.56 µs (❌ 1.20x slower) 51.72 µs (❌ 1.15x slower) 44.61 µs (✅ 1.09x slower)
64MB 18.59 ms (❌ 2.23x slower) 18.27 ms (❌ 1.93x slower) 12.48 ms (❌ 2.02x slower)

snapshots

create restore
default 872.32 µs (❌ 1.17x slower) 101.89 µs (❌ 1.40x slower)
large 446.46 ms (❌ 1.18x slower) 59.04 ms (❌ 1.51x slower)
medium 120.42 ms (❌ 1.35x slower) 2.29 ms (❌ 8.66x slower)
small 15.43 ms (❌ 1.36x slower) 115.82 µs (❌ 1.74x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — ✅ 1.11x faster
  • Worst regression: snapshots/restore/medium — ❌ 8.66x slower

hyperv-ws2025 / intel (Windows) (❌ 3.32x slower → ✅ 1.24x faster)

function_call_serialization

deserialize_function_call serialize_function_call
16.46 ms (❌ 1.12x slower) 13.95 ms (❌ 1.19x slower)

guest_calls

call call_with_host_function call_with_restore
default 77.41 µs (✅ 1.04x slower) 130.89 µs (✅ 1.08x slower) 234.50 µs (✅ 1.05x faster)
large 89.57 µs (✅ 1.01x slower) 127.20 µs (❌ 1.19x slower) 784.62 µs (❌ 1.38x slower)
medium 79.85 µs (✅ 1.03x faster) 139.73 µs (✅ 1.05x slower) 353.80 µs (❌ 1.19x slower)
small 79.49 µs (✅ 1.02x faster) 134.30 µs (✅ 1.08x slower) 254.65 µs (✅ 1.06x slower)

different_thread interrupt_latency
100.42 µs (✅ 1.14x faster) 148.20 µs (❌ 2.24x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.71 s (✅ 1.01x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
97.65 µs (✅ 1.24x faster) 97.15 µs (✅ 1.23x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 7.45 ms (✅ 1.04x faster) 6.10 ms (✅ 1.04x slower) 1.60 ms (❌ 1.31x slower) 1.42 ms (❌ 1.28x slower)
large 383.32 ms (✅ 1.02x faster) 320.47 ms (✅ 1.02x faster) 408.89 ms (❌ 1.17x slower) 301.74 ms (✅ 1.02x faster)
medium 111.60 ms (✅ 1.07x slower) 84.30 ms (✅ 1.03x faster) 97.63 ms (❌ 1.12x slower) 73.95 ms (✅ 1.07x faster)
small 20.50 ms (✅ 1.01x faster) 17.24 ms (✅ 1.01x faster) 14.45 ms (❌ 1.18x slower) 10.59 ms (✅ 1.01x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 70.83 µs (❌ 1.16x slower) 82.24 µs (❌ 1.16x slower) 37.47 µs (❌ 1.15x slower)
64MB 14.75 ms (✅ 1.03x slower) 17.81 ms (✅ 1.10x slower) 11.50 ms (❌ 1.15x slower)

snapshots

create restore
default 810.67 µs (✅ 1.24x faster) 134.78 µs (✅ 1.11x faster)
large 433.71 ms (✅ 1.10x faster) 83.49 ms (❌ 1.45x slower)
medium 112.00 ms (✅ 1.09x faster) 2.99 ms (❌ 3.32x slower)
small 14.84 ms (✅ 1.06x faster) 161.35 µs (❌ 1.11x slower)

Summary

  • Biggest gain: snapshots/create/default — ✅ 1.24x faster
  • Worst regression: snapshots/restore/medium — ❌ 3.32x slower

@hyperlight-dev deleted a comment from hyperlight-gh-bot (bot) on Jun 18, 2026
@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ 1.35x slower → 🚀 2.81x faster)

function_call_serialization

serialize_function_call deserialize_function_call
5.90 ms (❌ 1.35x slower) 8.19 ms (✅ 1.05x faster)

guest_calls

call_with_host_function call_with_restore call
medium 20.06 µs (✅ 1.11x faster) 71.45 µs (✅ 1.06x faster) 11.13 µs (✅ 1.13x faster)
small 20.14 µs (✅ 1.14x faster) 66.22 µs (✅ 1.04x faster) 11.11 µs (✅ 1.13x faster)
default 20.00 µs (✅ 1.17x faster) 58.34 µs (✅ 1.07x faster) 11.11 µs (✅ 1.12x faster)
large 19.95 µs (✅ 1.14x faster) 93.99 µs (✅ 1.04x faster) 11.19 µs (✅ 1.13x faster)

different_thread interrupt_latency
6.52 ms (✅ 1.02x faster) 19.78 µs (✅ 1.03x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
551.76 ms (✅ 1.01x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
20.58 µs (🚀 1.85x faster) 20.56 µs (🚀 1.90x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 23.19 ms (🚀 2.81x faster) 25.36 ms (🚀 2.65x faster) 24.11 ms (🚀 2.74x faster) 35.71 ms (🚀 2.19x faster)
medium 7.12 ms (🚀 2.48x faster) 8.95 ms (🚀 2.17x faster) 7.40 ms (🚀 2.43x faster) 21.53 ms (✅ 1.43x faster)
small 1.85 ms (🚀 2.04x faster) 3.49 ms (✅ 1.42x faster) 2.01 ms (✅ 1.78x faster) 15.31 ms (✅ 1.02x faster)
default 383.66 µs (✅ 1.36x faster) 1.74 ms (✅ 1.04x faster) 417.82 µs (✅ 1.35x faster) 11.96 ms (✅ 1.03x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.35 ms (✅ 1.02x slower) 2.04 ms (✅ 1.16x faster) 4.02 ms (✅ 1.02x faster)
1MB 38.23 µs (✅ 1.03x faster) 27.21 µs (✅ 1.33x faster) 37.69 µs (✅ 1.02x faster)

snapshots

restore create
medium 15.48 µs (✅ 1.08x faster) 25.89 ms (✅ 1.35x faster)
large 34.91 µs (✅ 1.56x faster) 90.74 ms (✅ 1.54x faster)
small 10.97 µs (✅ 1.02x slower) 2.26 ms (✅ 1.56x faster)
default 9.80 µs (✅ 1.04x faster) 296.38 µs (✅ 1.33x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — 🚀 2.81x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.35x slower

kvm / intel (Linux) (✅ 1.01x slower → 🚀 13.07x faster)

function_call_serialization

serialize_function_call deserialize_function_call
5.47 ms (✅ 1.04x faster) 7.61 ms (✅ 1.64x faster)

guest_calls

call_with_host_function call_with_restore call
medium 29.33 µs (✅ 1.28x faster) 33.79 µs (✅ 1.26x faster) 15.54 µs (✅ 1.25x faster)
small 29.32 µs (✅ 1.28x faster) 30.87 µs (✅ 1.23x faster) 15.58 µs (✅ 1.20x faster)
default 29.40 µs (✅ 1.27x faster) 30.38 µs (✅ 1.24x faster) 15.57 µs (✅ 1.23x faster)
large 29.45 µs (✅ 1.19x faster) 65.78 µs (✅ 1.18x faster) 15.61 µs (✅ 1.23x faster)

different_thread interrupt_latency
6.70 ms (✅ 1.01x slower) 11.20 µs (🚀 2.42x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
528.62 ms (✅ 1.29x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
26.28 µs (🚀 4.03x faster) 25.96 µs (🚀 4.28x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 64.31 ms (🚀 2.38x faster) 66.21 ms (🚀 2.35x faster) 64.80 ms (🚀 2.37x faster) 75.66 ms (🚀 2.23x faster)
medium 17.28 ms (🚀 2.35x faster) 18.74 ms (🚀 2.29x faster) 17.29 ms (🚀 2.33x faster) 28.43 ms (🚀 1.93x faster)
small 1.95 ms (🚀 1.83x faster) 3.25 ms (✅ 1.71x faster) 2.01 ms (🚀 1.82x faster) 15.18 ms (✅ 1.14x faster)
default 331.08 µs (✅ 1.48x faster) 1.50 ms (✅ 1.40x faster) 347.13 µs (✅ 1.55x faster) 11.37 ms (✅ 1.00x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.74 ms (🚀 2.37x faster) 2.42 ms (✅ 1.28x faster) 4.49 ms (✅ 1.24x faster)
1MB 41.08 µs (✅ 1.15x faster) 23.29 µs (✅ 1.55x faster) 36.86 µs (✅ 1.11x faster)

snapshots

restore create
medium 12.85 µs (✅ 1.34x faster) 35.72 ms (🚀 2.09x faster)
large 48.03 µs (🚀 13.07x faster) 156.85 ms (🚀 1.89x faster)
small 9.79 µs (✅ 1.18x faster) 2.30 ms (✅ 1.72x faster)
default 9.45 µs (✅ 1.18x faster) 249.93 µs (✅ 1.54x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 13.07x faster
  • Worst regression: guest_calls/different_thread — ✅ 1.01x slower

mshv3 / amd (Linux) (❌ 1.44x slower → 🚀 4.04x faster)

function_call_serialization

deserialize_function_call serialize_function_call
9.95 ms (❌ 1.44x slower) 6.35 ms (❌ 1.20x slower)

guest_calls

call_with_host_function call_with_restore call
default 60.31 µs (✅ 1.54x faster) 147.41 µs (✅ 1.11x faster) 37.31 µs (✅ 1.43x faster)
small 61.61 µs (✅ 1.49x faster) 151.95 µs (✅ 1.00x faster) 38.50 µs (✅ 1.44x faster)
large 62.69 µs (✅ 1.45x faster) 192.72 µs (✅ 1.24x faster) 39.34 µs (✅ 1.42x faster)
medium 60.96 µs (✅ 1.46x faster) 145.43 µs (✅ 1.10x faster) 38.81 µs (✅ 1.45x faster)

different_thread interrupt_latency
49.18 µs (✅ 1.55x faster) 28.82 µs (✅ 1.72x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
72.93 ms (✅ 1.03x faster)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
57.69 µs (✅ 1.49x faster) 55.85 µs (✅ 1.45x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 45.18 ms (✅ 1.38x faster) 9.73 ms (🚀 2.67x faster) 9.32 ms (🚀 2.76x faster) 12.18 ms (🚀 2.33x faster)
small 34.58 ms (✅ 1.21x faster) 1.89 ms (🚀 2.16x faster) 1.72 ms (🚀 2.26x faster) 3.42 ms (✅ 1.69x faster)
large 79.60 ms (🚀 1.82x faster) 32.97 ms (🚀 2.94x faster) 31.04 ms (🚀 3.04x faster) 39.63 ms (🚀 2.60x faster)
default 32.80 ms (✅ 1.05x faster) 481.81 µs (✅ 1.33x faster) 429.09 µs (✅ 1.42x faster) 1.33 ms (✅ 1.15x faster)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.57 ms (✅ 1.01x faster) 4.63 ms (✅ 1.05x faster) 3.62 ms (✅ 1.03x faster)
1MB 41.82 µs (✅ 1.00x slower) 42.38 µs (✅ 1.01x faster) 41.27 µs (✅ 1.00x faster)

snapshots

restore create
default 83.88 µs (✅ 1.02x faster) 346.24 µs (✅ 1.48x faster)
large 826.97 µs (🚀 4.04x faster) 187.03 ms (✅ 1.28x faster)
medium 93.99 µs (✅ 1.31x faster) 46.82 ms (✅ 1.23x faster)
small 79.33 µs (✅ 1.13x faster) 2.32 ms (🚀 1.90x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 4.04x faster
  • Worst regression: function_call_serialization/deserialize_function_call — ❌ 1.44x slower
mshv3 / intel (Linux) (✅ **1.10x slower** → 🚀 **2.89x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
10.83 ms (✅ 1.03x slower) 14.05 ms (✅ 1.01x slower)

guest_calls

call call_with_host_function call_with_restore
medium 69.83 µs (✅ 1.47x faster) 107.20 µs (✅ 1.59x faster) 237.39 µs (✅ 1.21x faster)
small 71.00 µs (✅ 1.48x faster) 107.27 µs (✅ 1.61x faster) 225.84 µs (✅ 1.22x faster)
large 70.38 µs (✅ 1.45x faster) 107.95 µs (✅ 1.61x faster) 327.05 µs (✅ 1.25x faster)
default 70.04 µs (✅ 1.49x faster) 107.20 µs (✅ 1.57x faster) 224.42 µs (✅ 1.22x faster)

interrupt_latency different_thread
55.03 µs (✅ 1.57x faster) 80.02 µs (✅ 1.58x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
106.14 ms (✅ 1.21x faster)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
87.54 µs (✅ 1.64x faster) 85.27 µs (✅ 1.69x faster)

sandboxes

create_uninitialized create_initialized_and_drop create_uninitialized_and_drop create_initialized
large 60.02 ms (✅ 1.78x faster) 111.56 ms (✅ 1.39x faster) 62.53 ms (✅ 1.69x faster) 69.05 ms (✅ 1.63x faster)
default 349.77 µs (✅ 1.41x faster) 33.82 ms (✅ 1.02x slower) 377.95 µs (✅ 1.36x faster) 1.51 ms (✅ 1.11x faster)
medium 16.43 ms (✅ 1.72x faster) 57.32 ms (✅ 1.10x faster) 16.97 ms (✅ 1.66x faster) 20.23 ms (✅ 1.55x faster)
small 2.16 ms (🚀 2.02x faster) 41.98 ms (✅ 1.01x faster) 2.12 ms (🚀 2.12x faster) 4.86 ms (✅ 1.33x faster)

shared_memory

copy_from_slice fill copy_to_slice
1MB 61.08 µs (✅ 1.02x faster) 34.88 µs (✅ 1.10x faster) 41.03 µs (✅ 1.06x slower)
64MB 11.74 ms (✅ 1.04x slower) 7.98 ms (✅ 1.10x slower) 9.26 ms (✅ 1.02x faster)

snapshots

create restore
default 339.63 µs (✅ 1.34x faster) 144.19 µs (✅ 1.10x faster)
large 153.22 ms (✅ 1.37x faster) 903.32 µs (🚀 2.89x faster)
small 3.63 ms (✅ 1.64x faster) 142.47 µs (✅ 1.10x faster)
medium 40.15 ms (✅ 1.37x faster) 162.57 µs (✅ 1.05x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 2.89x faster
  • Worst regression: shared_memory/fill/64MB — ✅ 1.10x slower
hyperv-ws2025 / amd (Windows) (❌ *3.62x slower* → 🚀 **1.82x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
9.19 ms (✅ 1.07x faster) 7.91 ms (✅ 1.08x faster)

guest_calls

call call_with_host_function call_with_restore
default 38.51 µs (✅ 1.03x faster) 63.37 µs (✅ 1.04x faster) 141.63 µs (✅ 1.00x faster)
large 39.78 µs (✅ 1.00x slower) 62.74 µs (✅ 1.05x faster) 300.59 µs (✅ 1.03x slower)
medium 39.50 µs (✅ 1.00x slower) 62.15 µs (✅ 1.02x faster) 165.90 µs (✅ 1.03x faster)
small 39.00 µs (✅ 1.00x slower) 63.98 µs (✅ 1.02x slower) 144.40 µs (✅ 1.03x faster)

different_thread interrupt_latency
63.48 µs (✅ 1.20x faster) 108.89 µs (❌ 3.62x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
2.98 s (✅ 1.11x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
67.32 µs (✅ 1.13x faster) 61.18 µs (✅ 1.15x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 5.16 ms (✅ 1.03x slower) 4.19 ms (✅ 1.02x faster) 1.02 ms (✅ 1.21x faster) 989.19 µs (✅ 1.53x faster)
large 239.42 ms (✅ 1.40x faster) 201.44 ms (✅ 1.49x faster) 224.50 ms (✅ 1.67x faster) 190.91 ms (🚀 1.82x faster)
medium 63.48 ms (✅ 1.34x faster) 54.11 ms (✅ 1.39x faster) 55.86 ms (✅ 1.70x faster) 49.09 ms (✅ 1.79x faster)
small 12.62 ms (✅ 1.23x faster) 10.71 ms (✅ 1.26x faster) 7.59 ms (✅ 1.67x faster) 6.88 ms (✅ 1.74x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 42.30 µs (✅ 1.00x faster) 42.67 µs (✅ 1.04x faster) 41.57 µs (✅ 1.00x slower)
64MB 6.72 ms (✅ 1.24x faster) 7.43 ms (✅ 1.27x faster) 5.00 ms (✅ 1.24x faster)

snapshots

create restore
default 525.56 µs (✅ 1.38x faster) 73.14 µs (✅ 1.02x faster)
large 271.12 ms (✅ 1.40x faster) 41.58 ms (✅ 1.07x slower)
medium 66.81 ms (✅ 1.34x faster) 155.78 µs (❌ 1.33x slower)
small 8.64 ms (✅ 1.32x faster) 78.36 µs (✅ 1.04x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — 🚀 1.82x faster
  • Worst regression: guest_calls/interrupt_latency — ❌ 3.62x slower
hyperv-ws2025 / intel (Windows) (❌ *1.99x slower* → ✅ **1.38x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
14.60 ms (✅ 1.00x faster) 12.15 ms (✅ 1.04x slower)

guest_calls

call call_with_host_function call_with_restore
default 71.97 µs (✅ 1.03x faster) 110.09 µs (✅ 1.05x faster) 233.45 µs (✅ 1.04x faster)
large 73.38 µs (✅ 1.17x faster) 108.98 µs (✅ 1.02x faster) 601.14 µs (✅ 1.02x faster)
medium 73.88 µs (✅ 1.12x faster) 111.89 µs (✅ 1.18x faster) 298.61 µs (✅ 1.06x faster)
small 73.33 µs (✅ 1.09x faster) 111.44 µs (✅ 1.09x faster) 242.53 µs (✅ 1.05x faster)

different_thread interrupt_latency
81.58 µs (✅ 1.37x faster) 131.74 µs (❌ 1.99x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.63 s (✅ 1.01x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
94.07 µs (✅ 1.26x faster) 96.45 µs (✅ 1.23x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 6.64 ms (✅ 1.17x faster) 5.28 ms (✅ 1.11x faster) 1.04 ms (✅ 1.21x faster) 948.10 µs (✅ 1.18x faster)
large 326.60 ms (✅ 1.20x faster) 278.10 ms (✅ 1.17x faster) 310.56 ms (✅ 1.13x faster) 266.70 ms (✅ 1.15x faster)
medium 87.36 ms (✅ 1.19x faster) 74.84 ms (✅ 1.16x faster) 77.91 ms (✅ 1.12x faster) 67.74 ms (✅ 1.16x faster)
small 17.35 ms (✅ 1.20x faster) 14.80 ms (✅ 1.17x faster) 10.65 ms (✅ 1.15x faster) 9.35 ms (✅ 1.14x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 77.10 µs (❌ 1.17x slower) 79.39 µs (❌ 1.14x slower) 33.01 µs (✅ 1.07x faster)
64MB 13.60 ms (✅ 1.05x faster) 15.70 ms (✅ 1.03x faster) 8.08 ms (✅ 1.24x faster)

snapshots

create restore
default 737.04 µs (✅ 1.38x faster) 142.60 µs (✅ 1.06x faster)
large 407.01 ms (✅ 1.17x faster) 54.53 ms (✅ 1.05x faster)
medium 101.35 ms (✅ 1.20x faster) 357.11 µs (✅ 1.04x faster)
small 12.83 ms (✅ 1.23x faster) 150.51 µs (✅ 1.04x faster)

Summary

  • Biggest gain: snapshots/create/default — ✅ 1.38x faster
  • Worst regression: guest_calls/interrupt_latency — ❌ 1.99x slower

@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ *1.81x slower* → 🚀 **2.37x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
7.93 ms (❌ 1.81x slower) 10.43 ms (❌ 1.21x slower)

guest_calls

call_with_host_function call_with_restore call
medium 21.59 µs (✅ 1.02x faster) 69.42 µs (✅ 1.16x faster) 11.57 µs (✅ 1.10x faster)
small 23.91 µs (✅ 1.06x slower) 67.10 µs (✅ 1.06x faster) 12.89 µs (✅ 1.02x slower)
default 24.36 µs (✅ 1.05x slower) 62.67 µs (✅ 1.01x slower) 12.78 µs (✅ 1.02x slower)
large 23.89 µs (✅ 1.05x slower) 89.99 µs (✅ 1.15x faster) 13.09 µs (✅ 1.03x slower)

different_thread interrupt_latency
9.62 ms (❌ 1.45x slower) 33.57 µs (❌ 1.65x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
597.95 ms (✅ 1.07x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
22.52 µs (✅ 1.69x faster) 22.90 µs (✅ 1.72x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 28.14 ms (🚀 2.31x faster) 28.28 ms (🚀 2.37x faster) 30.18 ms (🚀 2.19x faster) 37.66 ms (🚀 2.08x faster)
medium 8.55 ms (🚀 2.06x faster) 11.12 ms (✅ 1.74x faster) 9.32 ms (🚀 1.93x faster) 23.24 ms (✅ 1.32x faster)
small 2.16 ms (✅ 1.74x faster) 3.61 ms (✅ 1.38x faster) 2.32 ms (✅ 1.54x faster) 19.03 ms (❌ 1.21x slower)
default 400.10 µs (✅ 1.29x faster) 1.87 ms (✅ 1.03x slower) 443.99 µs (✅ 1.27x faster) 17.38 ms (❌ 1.50x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.67 ms (✅ 1.10x slower) 2.73 ms (✅ 1.09x slower) 5.03 ms (❌ 1.23x slower)
1MB 39.93 µs (✅ 1.02x slower) 28.23 µs (✅ 1.27x faster) 39.70 µs (✅ 1.05x slower)

snapshots

restore create
medium 14.61 µs (✅ 1.06x faster) 25.58 ms (✅ 1.37x faster)
large 35.94 µs (✅ 1.47x faster) 102.18 ms (✅ 1.37x faster)
small 10.49 µs (✅ 1.01x slower) 2.52 ms (✅ 1.39x faster)
default 9.67 µs (✅ 1.05x faster) 306.56 µs (✅ 1.27x faster)

Summary

  • Biggest gain: sandboxes/create_initialized/large — 🚀 2.37x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.81x slower
kvm / intel (Linux) (❌ *1.92x slower* → 🚀 **8.06x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
6.42 ms (❌ 1.13x slower) 9.33 ms (✅ 1.34x faster)

guest_calls

call_with_host_function call_with_restore call
medium 37.56 µs (✅ 1.01x faster) 41.31 µs (✅ 1.01x faster) 19.60 µs (✅ 1.00x slower)
small 34.30 µs (✅ 1.09x faster) 35.21 µs (✅ 1.05x faster) 19.57 µs (✅ 1.04x slower)
default 37.20 µs (✅ 1.01x faster) 36.61 µs (✅ 1.02x faster) 19.70 µs (✅ 1.02x slower)
large 37.43 µs (✅ 1.06x slower) 77.42 µs (✅ 1.01x faster) 18.11 µs (✅ 1.06x faster)

different_thread interrupt_latency
12.68 ms (❌ 1.91x slower) 13.16 µs (🚀 2.06x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
605.29 ms (✅ 1.13x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
30.42 µs (🚀 3.47x faster) 31.59 µs (🚀 3.47x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 65.80 ms (🚀 2.33x faster) 71.09 ms (🚀 2.19x faster) 67.25 ms (🚀 2.29x faster) 80.61 ms (🚀 2.09x faster)
medium 17.75 ms (🚀 2.29x faster) 20.09 ms (🚀 2.14x faster) 17.92 ms (🚀 2.25x faster) 36.43 ms (✅ 1.51x faster)
small 2.12 ms (✅ 1.68x faster) 3.73 ms (✅ 1.49x faster) 2.20 ms (✅ 1.66x faster) 22.32 ms (❌ 1.29x slower)
default 377.60 µs (✅ 1.29x faster) 2.05 ms (✅ 1.02x faster) 396.91 µs (✅ 1.35x faster) 21.69 ms (❌ 1.92x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 5.24 ms (🚀 2.15x faster) 2.83 ms (✅ 1.09x faster) 4.24 ms (✅ 1.31x faster)
1MB 47.60 µs (✅ 1.01x slower) 26.31 µs (✅ 1.35x faster) 41.96 µs (✅ 1.03x slower)

snapshots

restore create
medium 14.96 µs (✅ 1.13x faster) 42.56 ms (✅ 1.76x faster)
large 63.89 µs (🚀 8.06x faster) 163.46 ms (🚀 1.81x faster)
small 11.27 µs (✅ 1.03x faster) 2.55 ms (✅ 1.56x faster)
default 10.92 µs (✅ 1.01x faster) 290.35 µs (✅ 1.32x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 8.06x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.92x slower
mshv3 / amd (Linux) (❌ *2.29x slower* → 🚀 **2.72x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
11.64 ms (❌ 1.68x slower) 8.25 ms (❌ 1.56x slower)

guest_calls

call_with_host_function call_with_restore call
default 69.47 µs (✅ 1.32x faster) 154.59 µs (✅ 1.06x faster) 39.35 µs (✅ 1.34x faster)
small 75.77 µs (✅ 1.20x faster) 163.98 µs (✅ 1.08x slower) 44.49 µs (✅ 1.25x faster)
large 63.26 µs (✅ 1.32x faster) 222.78 µs (✅ 1.05x faster) 40.02 µs (✅ 1.37x faster)
medium 73.45 µs (✅ 1.21x faster) 172.60 µs (✅ 1.02x slower) 42.74 µs (✅ 1.31x faster)

different_thread interrupt_latency
61.24 µs (✅ 1.22x faster) 49.93 µs (✅ 1.01x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
85.20 ms (❌ 1.14x slower)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
62.98 µs (✅ 1.36x faster) 60.67 µs (✅ 1.33x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 96.00 ms (❌ 1.54x slower) 11.15 ms (🚀 2.33x faster) 10.75 ms (🚀 2.39x faster) 12.93 ms (🚀 2.19x faster)
small 96.04 ms (❌ 2.29x slower) 2.80 ms (✅ 1.45x faster) 2.64 ms (✅ 1.47x faster) 4.24 ms (✅ 1.37x faster)
large 177.32 ms (❌ 1.23x slower) 35.57 ms (🚀 2.72x faster) 34.82 ms (🚀 2.71x faster) 40.96 ms (🚀 2.51x faster)
default 49.08 ms (❌ 1.42x slower) 518.24 µs (✅ 1.24x faster) 472.07 µs (✅ 1.29x faster) 1.57 ms (✅ 1.03x slower)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.73 ms (✅ 1.02x slower) 5.29 ms (✅ 1.09x slower) 4.00 ms (✅ 1.07x slower)
1MB 42.88 µs (✅ 1.03x slower) 42.61 µs (✅ 1.00x faster) 41.46 µs (✅ 1.01x slower)

snapshots

restore create
default 83.89 µs (✅ 1.00x faster) 364.26 µs (✅ 1.40x faster)
large 1.62 ms (🚀 2.06x faster) 192.74 ms (✅ 1.25x faster)
medium 102.21 µs (✅ 1.05x faster) 48.79 ms (✅ 1.18x faster)
small 86.94 µs (✅ 1.00x faster) 3.08 ms (✅ 1.43x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized_and_drop/large — 🚀 2.72x faster
  • Worst regression: sandboxes/create_initialized_and_drop/small — ❌ 2.29x slower
mshv3 / intel (Linux) (❌ *2.15x slower* → ✅ **1.70x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
11.58 ms (✅ 1.11x slower) 14.01 ms (✅ 1.01x slower)

guest_calls

call_with_host_function call call_with_restore
medium 115.75 µs (✅ 1.46x faster) 76.45 µs (✅ 1.35x faster) 261.89 µs (✅ 1.07x faster)
default 119.01 µs (✅ 1.42x faster) 76.75 µs (✅ 1.38x faster) 242.38 µs (✅ 1.15x faster)
large 109.07 µs (✅ 1.53x faster) 77.11 µs (✅ 1.32x faster) 349.08 µs (✅ 1.18x faster)
small 120.36 µs (✅ 1.45x faster) 74.63 µs (✅ 1.41x faster) 245.22 µs (✅ 1.13x faster)

different_thread interrupt_latency
87.15 µs (✅ 1.44x faster) 56.25 µs (✅ 1.53x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
109.76 ms (✅ 1.17x faster)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
95.61 µs (✅ 1.52x faster) 94.32 µs (✅ 1.53x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized create_uninitialized_and_drop
small 58.56 ms (❌ 1.39x slower) 4.82 ms (✅ 1.34x faster) 2.56 ms (✅ 1.70x faster) 2.93 ms (✅ 1.53x faster)
medium 136.36 ms (❌ 2.15x slower) 19.88 ms (✅ 1.58x faster) 17.29 ms (✅ 1.63x faster) 17.38 ms (✅ 1.62x faster)
large 248.72 ms (❌ 1.60x slower) 69.47 ms (✅ 1.62x faster) 63.30 ms (✅ 1.69x faster) 64.03 ms (✅ 1.65x faster)
default 47.02 ms (❌ 1.42x slower) 1.42 ms (✅ 1.17x faster) 363.36 µs (✅ 1.36x faster) 376.91 µs (✅ 1.36x faster)

shared_memory

fill copy_to_slice copy_from_slice
64MB 7.27 ms (✅ 1.00x slower) 8.14 ms (✅ 1.16x faster) 10.41 ms (✅ 1.09x faster)
1MB 35.25 µs (✅ 1.08x faster) 41.70 µs (✅ 1.09x slower) 73.74 µs (❌ 1.18x slower)

snapshots

create restore
medium 40.44 ms (✅ 1.36x faster) 178.80 µs (✅ 1.02x slower)
small 3.75 ms (✅ 1.59x faster) 158.93 µs (✅ 1.02x slower)
default 350.84 µs (✅ 1.31x faster) 161.20 µs (✅ 1.02x slower)
large 159.90 ms (✅ 1.32x faster) 3.27 ms (❌ 1.25x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized/small — ✅ 1.70x faster
  • Worst regression: sandboxes/create_initialized_and_drop/medium — ❌ 2.15x slower
hyperv-ws2025 / amd (Windows) (❌ *6.07x slower* → ✅ **1.44x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
12.11 ms (❌ 1.24x slower) 10.89 ms (❌ 1.27x slower)

guest_calls

call call_with_host_function call_with_restore
default 51.23 µs (❌ 1.30x slower) 75.36 µs (❌ 1.14x slower) 189.08 µs (❌ 1.36x slower)
large 50.14 µs (❌ 1.31x slower) 89.49 µs (❌ 1.26x slower) 470.07 µs (❌ 1.77x slower)
medium 53.66 µs (❌ 1.41x slower) 79.85 µs (❌ 1.23x slower) 229.59 µs (❌ 1.52x slower)
small 50.90 µs (❌ 1.32x slower) 97.99 µs (❌ 1.44x slower) 186.37 µs (❌ 1.32x slower)

different_thread interrupt_latency
79.48 µs (✅ 1.02x slower) 182.44 µs (❌ 6.07x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.34 s (❌ 1.32x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
91.87 µs (❌ 1.28x slower) 96.21 µs (❌ 1.25x slower)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 6.68 ms (❌ 1.33x slower) 5.65 ms (❌ 1.33x slower) 1.30 ms (✅ 1.08x slower) 1.22 ms (✅ 1.22x faster)
large 305.40 ms (✅ 1.10x faster) 243.02 ms (✅ 1.23x faster) 318.34 ms (✅ 1.18x faster) 245.23 ms (✅ 1.42x faster)
medium 91.01 ms (✅ 1.07x slower) 75.32 ms (✅ 1.00x faster) 76.26 ms (✅ 1.24x faster) 60.96 ms (✅ 1.44x faster)
small 16.81 ms (✅ 1.09x slower) 13.38 ms (✅ 1.01x faster) 10.43 ms (✅ 1.22x faster) 8.77 ms (✅ 1.37x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 49.66 µs (❌ 1.14x slower) 50.93 µs (❌ 1.22x slower) 43.32 µs (✅ 1.05x slower)
64MB 9.79 ms (❌ 1.17x slower) 12.70 ms (❌ 1.34x slower) 7.48 ms (❌ 1.21x slower)

snapshots

create restore
default 679.35 µs (✅ 1.10x faster) 93.85 µs (❌ 1.28x slower)
large 336.52 ms (✅ 1.13x faster) 65.75 ms (❌ 1.68x slower)
medium 86.46 ms (✅ 1.04x faster) 375.55 µs (❌ 4.22x slower)
small 10.75 ms (✅ 1.06x faster) 108.84 µs (❌ 1.49x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized/medium — ✅ 1.44x faster
  • Worst regression: guest_calls/interrupt_latency — ❌ 6.07x slower
hyperv-ws2025 / intel (Windows) (❌ *5.82x slower* → ✅ **1.29x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
15.79 ms (✅ 1.08x slower) 12.93 ms (✅ 1.10x slower)

guest_calls

call call_with_host_function call_with_restore
default 90.77 µs (❌ 1.23x slower) 151.28 µs (❌ 1.33x slower) 260.72 µs (✅ 1.09x slower)
large 97.07 µs (❌ 1.12x slower) 153.85 µs (❌ 1.39x slower) 794.05 µs (❌ 1.46x slower)
medium 90.89 µs (✅ 1.09x slower) 146.21 µs (❌ 1.14x slower) 368.69 µs (❌ 1.25x slower)
small 90.27 µs (❌ 1.14x slower) 148.76 µs (❌ 1.25x slower) 282.50 µs (❌ 1.17x slower)

different_thread interrupt_latency
115.18 µs (✅ 1.06x slower) 153.17 µs (❌ 2.31x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
5.19 s (✅ 1.11x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
114.38 µs (✅ 1.03x slower) 100.26 µs (✅ 1.17x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 8.89 ms (❌ 1.15x slower) 6.96 ms (❌ 1.19x slower) 1.33 ms (✅ 1.03x slower) 1.04 ms (✅ 1.07x faster)
large 394.42 ms (✅ 1.00x slower) 311.65 ms (✅ 1.05x faster) 411.48 ms (❌ 1.17x slower) 318.03 ms (✅ 1.04x slower)
medium 111.36 ms (✅ 1.07x slower) 85.13 ms (✅ 1.02x faster) 104.76 ms (❌ 1.20x slower) 75.19 ms (✅ 1.05x faster)
small 22.68 ms (✅ 1.09x slower) 17.66 ms (✅ 1.02x slower) 14.99 ms (❌ 1.23x slower) 11.30 ms (✅ 1.06x slower)

shared_memory

copy_from_slice copy_to_slice fill
1MB 67.78 µs (✅ 1.04x slower) 71.75 µs (✅ 1.02x slower) 41.64 µs (✅ 1.08x slower)
64MB 14.64 ms (✅ 1.03x slower) 19.45 ms (❌ 1.20x slower) 10.60 ms (✅ 1.06x slower)

snapshots

create restore
default 776.27 µs (✅ 1.29x faster) 148.86 µs (✅ 1.04x faster)
large 429.48 ms (✅ 1.11x faster) 77.77 ms (❌ 1.36x slower)
medium 112.07 ms (✅ 1.09x faster) 5.24 ms (❌ 5.82x slower)
small 14.15 ms (✅ 1.11x faster) 168.88 µs (❌ 1.26x slower)

Summary

  • Biggest gain: snapshots/create/default — ✅ 1.29x faster
  • Worst regression: snapshots/restore/medium — ❌ 5.82x slower

@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ *1.77x slower* → 🚀 **2.48x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
7.74 ms (❌ 1.77x slower) 10.21 ms (❌ 1.19x slower)

guest_calls

call_with_host_function call_with_restore call
medium 23.52 µs (✅ 1.06x slower) 70.89 µs (✅ 1.15x faster) 12.76 µs (✅ 1.01x slower)
small 23.48 µs (✅ 1.03x slower) 67.80 µs (✅ 1.04x faster) 12.62 µs (✅ 1.01x slower)
default 20.57 µs (✅ 1.11x faster) 62.47 µs (✅ 1.01x slower) 12.73 µs (✅ 1.01x slower)
large 24.29 µs (✅ 1.06x slower) 88.44 µs (✅ 1.18x faster) 11.50 µs (✅ 1.10x faster)

different_thread interrupt_latency
10.35 ms (❌ 1.56x slower) 22.37 µs (✅ 1.10x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
594.35 ms (✅ 1.07x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
22.44 µs (✅ 1.68x faster) 22.40 µs (✅ 1.75x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 28.18 ms (🚀 2.31x faster) 27.01 ms (🚀 2.48x faster) 28.77 ms (🚀 2.29x faster) 38.83 ms (🚀 2.02x faster)
medium 8.74 ms (🚀 2.02x faster) 9.73 ms (🚀 1.99x faster) 8.85 ms (🚀 2.03x faster) 24.04 ms (✅ 1.28x faster)
small 2.19 ms (✅ 1.72x faster) 4.16 ms (✅ 1.19x faster) 2.30 ms (✅ 1.56x faster) 19.63 ms (❌ 1.25x slower)
default 399.52 µs (✅ 1.29x faster) 2.37 ms (❌ 1.30x slower) 444.05 µs (✅ 1.27x faster) 18.98 ms (❌ 1.63x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.60 ms (✅ 1.08x slower) 2.82 ms (❌ 1.12x slower) 4.81 ms (❌ 1.18x slower)
1MB 40.00 µs (✅ 1.01x slower) 28.05 µs (✅ 1.30x faster) 41.07 µs (✅ 1.09x slower)

snapshots

restore create
medium 16.52 µs (✅ 1.03x faster) 29.31 ms (✅ 1.19x faster)
large 36.02 µs (✅ 1.51x faster) 101.77 ms (✅ 1.38x faster)
small 11.88 µs (❌ 1.12x slower) 2.87 ms (✅ 1.22x faster)
default 11.36 µs (✅ 1.10x slower) 309.44 µs (✅ 1.25x faster)

Summary

  • Biggest gain: sandboxes/create_initialized/large — 🚀 2.48x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.77x slower
kvm / intel (Linux) (❌ *2.04x slower* → 🚀 **12.49x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
5.89 ms (✅ 1.03x slower) 7.05 ms (✅ 1.77x faster)

guest_calls

call_with_host_function call_with_restore call
medium 32.61 µs (✅ 1.16x faster) 34.16 µs (✅ 1.24x faster) 17.11 µs (✅ 1.14x faster)
small 30.16 µs (✅ 1.24x faster) 31.53 µs (✅ 1.21x faster) 15.83 µs (✅ 1.17x faster)
default 32.50 µs (✅ 1.15x faster) 32.32 µs (✅ 1.17x faster) 17.10 µs (✅ 1.12x faster)
large 33.21 µs (✅ 1.06x faster) 69.87 µs (✅ 1.12x faster) 17.19 µs (✅ 1.12x faster)

different_thread interrupt_latency
10.90 ms (❌ 1.64x slower) 11.57 µs (🚀 2.34x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
549.88 ms (✅ 1.24x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
27.05 µs (🚀 3.91x faster) 27.75 µs (🚀 3.99x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 65.79 ms (🚀 2.33x faster) 66.27 ms (🚀 2.35x faster) 63.49 ms (🚀 2.42x faster) 76.65 ms (🚀 2.20x faster)
medium 17.26 ms (🚀 2.35x faster) 19.32 ms (🚀 2.23x faster) 17.39 ms (🚀 2.32x faster) 35.75 ms (✅ 1.54x faster)
small 2.01 ms (✅ 1.78x faster) 3.26 ms (✅ 1.70x faster) 2.04 ms (✅ 1.80x faster) 22.00 ms (❌ 1.27x slower)
default 331.83 µs (✅ 1.44x faster) 1.56 ms (✅ 1.34x faster) 352.22 µs (✅ 1.52x faster) 23.12 ms (❌ 2.04x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.84 ms (🚀 2.33x faster) 2.65 ms (✅ 1.17x faster) 3.86 ms (✅ 1.44x faster)
1MB 41.61 µs (✅ 1.13x faster) 23.43 µs (✅ 1.54x faster) 36.92 µs (✅ 1.10x faster)

snapshots

restore create
medium 13.08 µs (✅ 1.29x faster) 39.72 ms (🚀 1.88x faster)
large 47.91 µs (🚀 12.49x faster) 162.64 ms (🚀 1.82x faster)
small 9.73 µs (✅ 1.19x faster) 2.28 ms (✅ 1.74x faster)
default 9.56 µs (✅ 1.16x faster) 259.27 µs (✅ 1.48x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 12.49x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 2.04x slower
mshv3 / amd (Linux) (❌ *1.87x slower* → 🚀 **2.73x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
12.94 ms (❌ 1.87x slower) 8.93 ms (❌ 1.69x slower)

guest_calls

call_with_host_function call_with_restore call
default 71.73 µs (✅ 1.29x faster) 158.08 µs (✅ 1.03x faster) 44.47 µs (✅ 1.19x faster)
small 68.49 µs (✅ 1.32x faster) 156.53 µs (✅ 1.03x slower) 42.50 µs (✅ 1.31x faster)
large 66.33 µs (✅ 1.36x faster) 228.71 µs (✅ 1.02x slower) 40.94 µs (✅ 1.37x faster)
medium 72.28 µs (✅ 1.20x faster) 171.50 µs (✅ 1.01x slower) 44.31 µs (✅ 1.27x faster)

different_thread interrupt_latency
54.08 µs (✅ 1.44x faster) 48.05 µs (✅ 1.03x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
85.66 ms (❌ 1.14x slower)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
66.43 µs (✅ 1.28x faster) 65.48 µs (✅ 1.23x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 73.72 ms (❌ 1.18x slower) 11.22 ms (🚀 2.32x faster) 10.72 ms (🚀 2.40x faster) 13.94 ms (🚀 2.03x faster)
small 59.40 ms (❌ 1.42x slower) 3.09 ms (✅ 1.32x faster) 2.55 ms (✅ 1.52x faster) 4.36 ms (✅ 1.33x faster)
large 206.60 ms (❌ 1.43x slower) 36.00 ms (🚀 2.69x faster) 34.62 ms (🚀 2.73x faster) 40.25 ms (🚀 2.56x faster)
default 47.54 ms (❌ 1.38x slower) 521.83 µs (✅ 1.22x faster) 463.94 µs (✅ 1.32x faster) 1.59 ms (✅ 1.04x slower)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.45 ms (✅ 1.03x faster) 5.60 ms (❌ 1.15x slower) 4.48 ms (❌ 1.20x slower)
1MB 42.59 µs (✅ 1.03x slower) 42.50 µs (✅ 1.01x faster) 41.57 µs (✅ 1.01x slower)

snapshots

restore create
default 76.56 µs (✅ 1.09x faster) 359.66 µs (✅ 1.41x faster)
large 1.84 ms (🚀 1.82x faster) 194.27 ms (✅ 1.24x faster)
medium 104.84 µs (✅ 1.13x faster) 52.10 ms (✅ 1.10x faster)
small 85.51 µs (✅ 1.02x faster) 3.07 ms (✅ 1.44x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — 🚀 2.73x faster
  • Worst regression: function_call_serialization/deserialize_function_call — ❌ 1.87x slower
mshv3 / intel (Linux) (❌ *1.43x slower* → 🚀 **2.26x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
10.85 ms (✅ 1.04x slower) 13.54 ms (✅ 1.02x faster)

guest_calls

call_with_host_function call call_with_restore
small 113.53 µs (✅ 1.52x faster) 72.85 µs (✅ 1.44x faster) 238.20 µs (✅ 1.16x faster)
medium 114.86 µs (✅ 1.49x faster) 73.29 µs (✅ 1.40x faster) 257.00 µs (✅ 1.13x faster)
default 110.29 µs (✅ 1.54x faster) 73.36 µs (✅ 1.40x faster) 243.60 µs (✅ 1.14x faster)
large 111.67 µs (✅ 1.54x faster) 75.76 µs (✅ 1.35x faster) 349.54 µs (✅ 1.19x faster)

interrupt_latency different_thread
56.24 µs (✅ 1.53x faster) 86.62 µs (✅ 1.46x faster)

guest_functions_with_large_parameters

guest_call_with_large_parameters
110.96 ms (✅ 1.16x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
89.81 µs (✅ 1.60x faster) 91.85 µs (✅ 1.59x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
medium 16.12 ms (✅ 1.75x faster) 19.92 ms (✅ 1.58x faster) 16.97 ms (✅ 1.66x faster) 74.00 ms (❌ 1.17x slower)
default 356.09 µs (✅ 1.38x faster) 1.37 ms (✅ 1.22x faster) 363.55 µs (✅ 1.40x faster) 47.02 ms (❌ 1.42x slower)
small 1.93 ms (🚀 2.26x faster) 4.78 ms (✅ 1.36x faster) 2.06 ms (🚀 2.18x faster) 58.00 ms (❌ 1.37x slower)
large 61.99 ms (✅ 1.72x faster) 69.29 ms (✅ 1.62x faster) 60.42 ms (✅ 1.75x faster) 221.92 ms (❌ 1.43x slower)

shared_memory

fill copy_to_slice copy_from_slice
1MB 34.80 µs (✅ 1.09x faster) 41.25 µs (✅ 1.08x slower) 68.78 µs (✅ 1.10x slower)
64MB 6.48 ms (✅ 1.12x faster) 8.13 ms (✅ 1.16x faster) 10.98 ms (✅ 1.03x faster)

snapshots

restore create
small 160.74 µs (✅ 1.04x slower) 3.43 ms (✅ 1.74x faster)
default 155.89 µs (✅ 1.02x faster) 352.93 µs (✅ 1.29x faster)
medium 176.20 µs (✅ 1.01x slower) 39.31 ms (✅ 1.40x faster)
large 1.56 ms (✅ 1.67x faster) 155.95 ms (✅ 1.35x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized/small — 🚀 2.26x faster
  • Worst regression: sandboxes/create_initialized_and_drop/large — ❌ 1.43x slower
hyperv-ws2025 / amd (Windows) (❌ *7.91x slower* → ✅ **1.21x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
13.83 ms (❌ 1.41x slower) 11.56 ms (❌ 1.35x slower)

guest_calls

call call_with_host_function call_with_restore
default 73.54 µs (❌ 1.86x slower) 134.16 µs (❌ 2.15x slower) 286.50 µs (❌ 2.29x slower)
large 61.99 µs (❌ 1.84x slower) 98.84 µs (❌ 1.57x slower) 665.41 µs (❌ 2.52x slower)
medium 75.89 µs (❌ 1.92x slower) 102.43 µs (❌ 2.00x slower) 325.30 µs (❌ 2.20x slower)
small 78.54 µs (❌ 2.05x slower) 131.07 µs (❌ 2.09x slower) 264.79 µs (❌ 1.85x slower)

different_thread interrupt_latency
98.96 µs (❌ 1.45x slower) 179.26 µs (❌ 5.96x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
6.52 s (❌ 1.98x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
111.94 µs (❌ 1.66x slower) 97.39 µs (❌ 1.37x slower)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 8.70 ms (❌ 1.73x slower) 7.27 ms (❌ 1.71x slower) 1.59 ms (❌ 1.30x slower) 1.46 ms (✅ 1.04x slower)
large 380.35 ms (❌ 1.13x slower) 316.66 ms (✅ 1.06x slower) 308.29 ms (✅ 1.21x faster) 309.46 ms (✅ 1.12x faster)
medium 104.00 ms (❌ 1.23x slower) 94.00 ms (❌ 1.25x slower) 99.99 ms (✅ 1.05x slower) 79.09 ms (✅ 1.11x faster)
small 23.12 ms (❌ 1.49x slower) 20.36 ms (❌ 1.51x slower) 14.07 ms (✅ 1.11x slower) 13.98 ms (❌ 1.17x slower)

shared_memory

copy_from_slice copy_to_slice fill
1MB 54.07 µs (❌ 1.47x slower) 50.15 µs (❌ 1.18x slower) 47.43 µs (❌ 1.14x slower)
64MB 14.12 ms (❌ 1.69x slower) 17.70 ms (❌ 1.87x slower) 10.78 ms (❌ 1.74x slower)

snapshots

create restore
default 617.75 µs (✅ 1.14x faster) 123.84 µs (❌ 1.78x slower)
large 350.90 ms (✅ 1.08x faster) 58.94 ms (❌ 1.51x slower)
medium 114.06 ms (❌ 1.27x slower) 2.09 ms (❌ 7.91x slower)
small 14.90 ms (❌ 1.31x slower) 142.30 µs (❌ 2.10x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized_and_drop/large — ✅ 1.21x faster
  • Worst regression: snapshots/restore/medium — ❌ 7.91x slower
hyperv-ws2025 / intel (Windows) (❌ *10.73x slower* → ✅ **1.46x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
19.80 ms (❌ 1.35x slower) 16.92 ms (❌ 1.44x slower)

guest_calls

call call_with_host_function call_with_restore
default 76.60 µs (✅ 1.02x slower) 136.68 µs (❌ 1.21x slower) 203.37 µs (✅ 1.16x faster)
large 74.48 µs (✅ 1.13x faster) 126.10 µs (✅ 1.09x slower) 880.94 µs (❌ 2.10x slower)
medium 78.40 µs (✅ 1.07x faster) 119.44 µs (✅ 1.05x faster) 300.16 µs (✅ 1.09x slower)
small 75.73 µs (✅ 1.06x faster) 129.29 µs (❌ 1.12x slower) 228.35 µs (✅ 1.01x faster)

different_thread interrupt_latency
77.91 µs (✅ 1.46x faster) 154.04 µs (❌ 2.32x slower)

guest_functions_with_large_parameters

guest_call_with_large_parameters
5.99 s (❌ 1.28x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
97.37 µs (✅ 1.21x faster) 109.86 µs (✅ 1.06x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 8.76 ms (❌ 1.13x slower) 7.30 ms (❌ 1.25x slower) 1.62 ms (❌ 1.31x slower) 1.27 ms (❌ 1.16x slower)
large 514.43 ms (❌ 1.31x slower) 371.25 ms (❌ 1.14x slower) 520.44 ms (❌ 1.48x slower) 372.12 ms (❌ 1.21x slower)
medium 143.50 ms (❌ 1.38x slower) 101.60 ms (❌ 1.17x slower) 131.91 ms (❌ 1.51x slower) 90.68 ms (❌ 1.15x slower)
small 28.55 ms (❌ 1.38x slower) 19.75 ms (❌ 1.14x slower) 19.50 ms (❌ 1.59x slower) 12.88 ms (❌ 1.21x slower)

shared_memory

copy_from_slice copy_to_slice fill
1MB 57.90 µs (✅ 1.07x faster) 60.70 µs (✅ 1.12x faster) 29.56 µs (✅ 1.20x faster)
64MB 18.94 ms (❌ 1.33x slower) 29.54 ms (❌ 1.82x slower) 15.68 ms (❌ 1.57x slower)

snapshots

create restore
default 992.85 µs (✅ 1.03x faster) 109.68 µs (✅ 1.27x faster)
large 528.44 ms (✅ 1.11x slower) 101.05 ms (❌ 1.76x slower)
medium 140.01 ms (❌ 1.15x slower) 9.66 ms (❌ 10.73x slower)
small 17.24 ms (✅ 1.10x slower) 125.59 µs (✅ 1.01x slower)

Summary

  • Biggest gain: guest_calls/different_thread — ✅ 1.46x faster
  • Worst regression: snapshots/restore/medium — ❌ 10.73x slower
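
For readers checking the figures above, here is a minimal Rust sketch of how an "Nx faster" or "Nx slower" label and the Summary pair could be derived. It assumes the ratio is the larger timing divided by the smaller one and that the Summary picks the largest ratio in each direction; the bot's actual computation is not shown in this thread, and the names `Change`, `compare`, `label`, and `summarize` are hypothetical.

```rust
/// Direction and magnitude of a change between two timings.
/// The ratio is always >= 1.0 (assumed convention, not taken from the bot).
#[derive(Debug, PartialEq)]
enum Change {
    Faster(f64),
    Slower(f64),
}

/// Compare a baseline timing with a new timing expressed in the same unit.
fn compare(baseline: f64, current: f64) -> Change {
    if current <= baseline {
        Change::Faster(baseline / current)
    } else {
        Change::Slower(current / baseline)
    }
}

/// Render a change in the "1.46x faster" style used in the tables.
fn label(change: &Change) -> String {
    match change {
        Change::Faster(r) => format!("{r:.2}x faster"),
        Change::Slower(r) => format!("{r:.2}x slower"),
    }
}

/// Return the names of the biggest gain and the worst regression, if any.
fn summarize<'a>(entries: &[(&'a str, Change)]) -> (Option<&'a str>, Option<&'a str>) {
    let mut best: Option<(&'a str, f64)> = None;
    let mut worst: Option<(&'a str, f64)> = None;
    for (name, change) in entries {
        match change {
            Change::Faster(r) => {
                // Keep the largest speed-up seen so far.
                if best.map_or(true, |(_, b)| *r > b) {
                    best = Some((*name, *r));
                }
            }
            Change::Slower(r) => {
                // Keep the largest slow-down seen so far.
                if worst.map_or(true, |(_, w)| *r > w) {
                    worst = Some((*name, *r));
                }
            }
        }
    }
    (best.map(|(n, _)| n), worst.map(|(n, _)| n))
}
```

Under these assumptions, the pair returned by `summarize` corresponds to the two Summary bullets, and the same two ratios appear in each platform header (worst regression, then biggest gain). The sketch does not model how the ✅, ❌ and 🚀 markers are assigned.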

@jprendes jprendes marked this pull request as ready for review June 19, 2026 15:54
Copilot AI review requested due to automatic review settings June 19, 2026 15:54

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces a new hyperlight-ci workspace crate to host CI/dev helper commands (benchmark runner + report generator), and wires it into GitHub Actions so PR benchmarks run in a matrix and get aggregated into a bot-posted PR comment.

Changes:

  • Add a new hyperlight-ci binary with bench and bench-report subcommands (criterion-swarm execution + criterion-markdown rendering).
  • Add a cargo ci ... alias and update Justfile benchmark targets to use it.
  • Extend PR validation workflows to run benchmarks, upload per-matrix markdown reports, and combine them into a single artifact for hyperlight-gh-bot.
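
As a rough illustration of the aggregation step in the last bullet, the sketch below joins per-matrix reports into one comment body. It is not the PR's implementation (the overview attributes that step to the workflows); the `MatrixReport` type, the `combine_reports` function, and the choice to sort by label are all assumptions made for this example.

```rust
/// One per-matrix report, e.g. the markdown produced for "kvm / amd (Linux)".
/// The struct and its field names are hypothetical.
struct MatrixReport {
    label: String,
    body: String,
}

/// Join per-matrix reports into a single comment body. Sorting by label is an
/// assumption that keeps the output stable regardless of job completion order.
fn combine_reports(mut reports: Vec<MatrixReport>) -> String {
    reports.sort_by(|a, b| a.label.cmp(&b.label));
    let mut out = String::from("Benchmark Results\n");
    for report in &reports {
        out.push('\n');
        out.push_str(&report.label);
        out.push_str("\n\n");
        // Trim trailing whitespace so every section ends with one newline.
        out.push_str(report.body.trim_end());
        out.push('\n');
    }
    out
}

fn main() {
    let reports = vec![
        MatrixReport {
            label: "kvm / intel (Linux)".to_string(),
            body: "(report body)\n".to_string(),
        },
        MatrixReport {
            label: "kvm / amd (Linux)".to_string(),
            body: "(report body)\n\n".to_string(),
        },
    ];
    print!("{}", combine_reports(reports));
}
```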

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
|---|---|
| src/hyperlight_ci/src/main.rs | New CLI entrypoint with bench / bench-report subcommands. |
| src/hyperlight_ci/src/bench.rs | Implements the benchmark runner using criterion-swarm and output mode flags. |
| src/hyperlight_ci/src/bench_report.rs | Generates markdown reports from existing target/criterion results. |
| src/hyperlight_ci/Cargo.toml | New crate manifest and dependencies for the CI tool. |
| Justfile | Switch benchmark recipes to cargo ci bench .... |
| Cargo.toml | Adds src/hyperlight_ci to the workspace members. |
| Cargo.lock | Locks new dependencies (criterion-swarm/criterion-markdown/etc.) and adds hyperlight-ci. |
| .github/workflows/ValidatePullRequest.yml | Adds benchmark matrix job + aggregation job producing the PR comment artifact. |
| .github/workflows/dep_benchmarks.yml | Generates and uploads a benchmark.md report artifact per matrix entry. |
| .github/hyperlight-bot.yml | Configures hyperlight-gh-bot to post the aggregated benchmark comment. |
| .cargo/config.toml | Adds cargo ci alias to run hyperlight-ci. |

Comment thread src/hyperlight_ci/Cargo.toml
Comment thread src/hyperlight_ci/Cargo.toml Outdated
Comment thread src/hyperlight_ci/src/bench.rs

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.

#[arg(long, short, default_value_t = 0)]
pub jobs: usize,

/// Build output mode (comma-separated or repeated): spinner, stream, summary, none
Comment on lines +72 to +76
pub build_output: Vec<OutputModeFlags>,

/// Benchmarks output mode (comma-separated or repeated): spinner, stream, summary, none
#[arg(long, value_delimiter = ',')]
pub benchmarks_output: Vec<OutputModeFlags>,
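
The doc comments in this snippet describe flags that accept values either comma-separated or repeated. A minimal std-only sketch of that parsing behaviour follows; the `OutputMode` enum is a hypothetical stand-in for the PR's `OutputModeFlags`, and `parse_output_modes` is an illustrative helper, not code from the PR.

```rust
use std::str::FromStr;

/// The four modes named in the doc comments. This enum is a hypothetical
/// stand-in for the PR's `OutputModeFlags` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OutputMode {
    Spinner,
    Stream,
    Summary,
    None,
}

impl FromStr for OutputMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "spinner" => Ok(Self::Spinner),
            "stream" => Ok(Self::Stream),
            "summary" => Ok(Self::Summary),
            "none" => Ok(Self::None),
            other => Err(format!("unknown output mode: {other:?}")),
        }
    }
}

/// Accept both `--flag a,b` and `--flag a --flag b`: split every occurrence
/// on commas, flatten, and fail on the first unknown value.
fn parse_output_modes(occurrences: &[&str]) -> Result<Vec<OutputMode>, String> {
    occurrences
        .iter()
        .flat_map(|occurrence| occurrence.split(','))
        .map(|value| value.parse::<OutputMode>())
        .collect()
}

fn main() {
    // One comma-separated occurrence and two repeated occurrences are
    // handled by the same code path.
    println!("{:?}", parse_output_modes(&["spinner,summary"]));
    println!("{:?}", parse_output_modes(&["stream", "none"]));
}
```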
@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ *1.77x slower* → 🚀 **2.45x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
7.72 ms (❌ 1.77x slower) 10.17 ms (❌ 1.19x slower)

guest_calls

| | call_with_host_function | call_with_restore | call | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| medium | 23.76 µs (✅ 1.06x slower) | 68.85 µs (✅ 1.14x faster) | 12.84 µs (✅ 1.02x slower) | | |
| small | 20.77 µs (✅ 1.11x faster) | 66.06 µs (✅ 1.06x faster) | 11.40 µs (✅ 1.08x faster) | | |
| default | 24.17 µs (✅ 1.03x slower) | 62.81 µs (✅ 1.02x slower) | 12.90 µs (✅ 1.02x slower) | | |
| | | | | 10.02 ms (❌ 1.51x slower) | 22.74 µs (❌ 1.12x slower) |
| large | 23.60 µs (✅ 1.04x slower) | 89.08 µs (✅ 1.18x faster) | 12.87 µs (✅ 1.02x slower) | | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
592.09 ms (✅ 1.06x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
22.27 µs (✅ 1.70x faster) 22.25 µs (✅ 1.75x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 28.64 ms (🚀 2.27x faster) 27.44 ms (🚀 2.45x faster) 29.43 ms (🚀 2.24x faster) 39.09 ms (🚀 2.00x faster)
medium 8.64 ms (🚀 2.04x faster) 9.96 ms (🚀 1.95x faster) 9.03 ms (🚀 1.99x faster) 23.02 ms (✅ 1.33x faster)
small 2.23 ms (✅ 1.69x faster) 3.83 ms (✅ 1.30x faster) 2.41 ms (✅ 1.48x faster) 18.36 ms (❌ 1.17x slower)
default 408.10 µs (✅ 1.26x faster) 1.95 ms (✅ 1.07x slower) 451.93 µs (✅ 1.25x faster) 18.31 ms (❌ 1.58x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.71 ms (✅ 1.10x slower) 2.67 ms (✅ 1.06x slower) 4.37 ms (✅ 1.07x slower)
1MB 39.11 µs (✅ 1.01x faster) 28.07 µs (✅ 1.30x faster) 38.45 µs (✅ 1.03x slower)

snapshots

restore create
medium 14.50 µs (✅ 1.13x faster) 29.53 ms (✅ 1.18x faster)
large 36.51 µs (✅ 1.50x faster) 100.14 ms (✅ 1.40x faster)
small 10.95 µs (✅ 1.05x slower) 2.96 ms (✅ 1.19x faster)
default 10.25 µs (✅ 1.02x slower) 309.61 µs (✅ 1.29x faster)

Summary

  • Biggest gain: sandboxes/create_initialized/large — 🚀 2.45x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.77x slower

kvm / intel (Linux) (❌ *1.79x slower* → 🚀 **13.07x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
5.63 ms (✅ 1.01x faster) 8.47 ms (✅ 1.47x faster)

guest_calls

| | call_with_host_function | call_with_restore | call | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| medium | 32.58 µs (✅ 1.15x faster) | 34.58 µs (✅ 1.24x faster) | 17.07 µs (✅ 1.14x faster) | | |
| small | 32.95 µs (✅ 1.13x faster) | 31.90 µs (✅ 1.19x faster) | 16.17 µs (✅ 1.14x faster) | | |
| default | 30.51 µs (✅ 1.23x faster) | 30.62 µs (✅ 1.24x faster) | 17.33 µs (✅ 1.11x faster) | | |
| | | | | 10.53 ms (❌ 1.58x slower) | 11.62 µs (🚀 2.33x faster) |
| large | 30.60 µs (✅ 1.15x faster) | 83.66 µs (✅ 1.08x slower) | 17.33 µs (✅ 1.12x faster) | | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
614.38 ms (✅ 1.11x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
28.30 µs (🚀 3.76x faster) 27.74 µs (🚀 3.97x faster)

sandboxes

create_uninitialized create_initialized create_uninitialized_and_drop create_initialized_and_drop
large 62.09 ms (🚀 2.47x faster) 66.45 ms (🚀 2.35x faster) 62.46 ms (🚀 2.46x faster) 78.25 ms (🚀 2.15x faster)
medium 16.77 ms (🚀 2.42x faster) 18.98 ms (🚀 2.27x faster) 16.94 ms (🚀 2.38x faster) 34.46 ms (✅ 1.59x faster)
small 1.92 ms (🚀 1.86x faster) 3.20 ms (✅ 1.73x faster) 2.03 ms (✅ 1.80x faster) 20.33 ms (❌ 1.18x slower)
default 330.03 µs (✅ 1.48x faster) 1.56 ms (✅ 1.34x faster) 353.17 µs (✅ 1.52x faster) 20.29 ms (❌ 1.79x slower)

shared_memory

copy_from_slice fill copy_to_slice
64MB 4.95 ms (🚀 2.27x faster) 2.48 ms (✅ 1.25x faster) 3.69 ms (✅ 1.51x faster)
1MB 40.15 µs (✅ 1.16x faster) 23.50 µs (✅ 1.53x faster) 37.08 µs (✅ 1.10x faster)

snapshots

restore create
medium 13.07 µs (✅ 1.25x faster) 39.21 ms (🚀 1.91x faster)
large 48.06 µs (🚀 13.07x faster) 156.52 ms (🚀 1.89x faster)
small 11.70 µs (✅ 1.03x slower) 2.30 ms (✅ 1.72x faster)
default 9.63 µs (✅ 1.15x faster) 254.88 µs (✅ 1.51x faster)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 13.07x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.79x slower

mshv3 / amd (Linux) (❌ *2.39x slower* → 🚀 **2.69x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
16.49 ms (❌ 2.39x slower) 11.24 ms (❌ 2.13x slower)

guest_calls

| | call_with_host_function | call_with_restore | call | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 69.22 µs (✅ 1.32x faster) | 164.91 µs (✅ 1.00x faster) | 41.72 µs (✅ 1.27x faster) | | |
| small | 72.93 µs (✅ 1.25x faster) | 150.10 µs (✅ 1.00x faster) | 44.06 µs (✅ 1.26x faster) | | |
| large | 59.72 µs (✅ 1.49x faster) | 225.82 µs (✅ 1.01x slower) | 43.71 µs (✅ 1.28x faster) | | |
| | | | | 54.86 µs (✅ 1.35x faster) | 47.39 µs (✅ 1.05x faster) |
| medium | 73.32 µs (✅ 1.20x faster) | 171.68 µs (✅ 1.01x slower) | 42.81 µs (✅ 1.31x faster) | | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
93.18 ms (❌ 1.25x slower)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
63.86 µs (✅ 1.34x faster) 64.15 µs (✅ 1.26x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 60.80 ms (✅ 1.03x faster) 10.97 ms (🚀 2.37x faster) 11.45 ms (🚀 2.24x faster) 13.26 ms (🚀 2.14x faster)
small 60.92 ms (❌ 1.45x slower) 2.58 ms (✅ 1.57x faster) 2.56 ms (✅ 1.51x faster) 4.35 ms (✅ 1.33x faster)
large 176.96 ms (❌ 1.22x slower) 36.01 ms (🚀 2.69x faster) 37.36 ms (🚀 2.53x faster) 43.84 ms (🚀 2.35x faster)
default 71.98 ms (❌ 2.09x slower) 498.30 µs (✅ 1.29x faster) 463.30 µs (✅ 1.31x faster) 1.52 ms (✅ 1.00x faster)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.95 ms (✅ 1.06x slower) 6.08 ms (❌ 1.25x slower) 4.27 ms (❌ 1.14x slower)
1MB 42.70 µs (✅ 1.03x slower) 42.79 µs (✅ 1.01x slower) 41.46 µs (✅ 1.00x slower)

snapshots

restore create
default 81.77 µs (✅ 1.02x faster) 358.61 µs (✅ 1.42x faster)
large 1.65 ms (🚀 2.03x faster) 198.18 ms (✅ 1.21x faster)
medium 109.38 µs (✅ 1.01x faster) 49.49 ms (✅ 1.16x faster)
small 80.22 µs (✅ 1.10x faster) 3.22 ms (✅ 1.37x faster)

Summary

  • Biggest gain: sandboxes/create_uninitialized_and_drop/large — 🚀 2.69x faster
  • Worst regression: function_call_serialization/deserialize_function_call — ❌ 2.39x slower

mshv3 / intel (Linux) (❌ *1.71x slower* → 🚀 **2.54x faster**)

function_call_serialization

serialize_function_call deserialize_function_call
10.69 ms (✅ 1.02x slower) 13.16 ms (✅ 1.05x faster)

guest_calls

| | call_with_host_function | different_thread | call_with_restore | call | interrupt_latency |
|---|---|---|---|---|---|
| default | 74.00 µs (🚀 2.25x faster) | | 133.52 µs (🚀 2.06x faster) | 44.91 µs (🚀 2.29x faster) | |
| | | 54.16 µs (🚀 2.31x faster) | | | 37.60 µs (🚀 2.29x faster) |
| large | 75.73 µs (🚀 2.32x faster) | | 244.55 µs (✅ 1.63x faster) | 45.71 µs (🚀 2.23x faster) | |
| medium | 74.00 µs (🚀 2.30x faster) | | 148.00 µs (🚀 1.91x faster) | 45.58 µs (🚀 2.28x faster) | |
| small | 75.60 µs (🚀 2.31x faster) | | 137.31 µs (🚀 2.00x faster) | 45.38 µs (🚀 2.35x faster) | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
109.55 ms (✅ 1.17x faster)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
56.68 µs (🚀 2.54x faster) 57.98 µs (🚀 2.49x faster)

sandboxes

create_uninitialized create_uninitialized_and_drop create_initialized create_initialized_and_drop
default 355.74 µs (✅ 1.39x faster) 379.91 µs (✅ 1.35x faster) 1.43 ms (✅ 1.17x faster) 56.82 ms (❌ 1.71x slower)
medium 16.77 ms (✅ 1.68x faster) 16.67 ms (✅ 1.69x faster) 19.49 ms (✅ 1.61x faster) 86.72 ms (❌ 1.37x slower)
large 62.37 ms (✅ 1.71x faster) 62.42 ms (✅ 1.69x faster) 69.60 ms (✅ 1.62x faster) 200.24 ms (❌ 1.29x slower)
small 2.09 ms (🚀 2.08x faster) 2.22 ms (🚀 2.02x faster) 4.61 ms (✅ 1.41x faster) 54.76 ms (❌ 1.30x slower)

shared_memory

copy_from_slice copy_to_slice fill
64MB 10.58 ms (✅ 1.07x faster) 8.01 ms (✅ 1.18x faster) 6.97 ms (✅ 1.04x faster)
1MB 62.83 µs (✅ 1.01x slower) 41.51 µs (✅ 1.08x slower) 35.60 µs (✅ 1.06x faster)

snapshots

restore create
default 85.65 µs (🚀 1.84x faster) 318.66 µs (✅ 1.44x faster)
large 1.49 ms (✅ 1.75x faster) 157.94 ms (✅ 1.33x faster)
medium 102.06 µs (✅ 1.50x faster) 39.40 ms (✅ 1.39x faster)
small 81.22 µs (🚀 1.88x faster) 3.60 ms (✅ 1.66x faster)

Summary

  • Biggest gain: sample_workloads/24K_in_8K_out_c — 🚀 2.54x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ❌ 1.71x slower

hyperv-ws2025 / amd (Windows) (❌ *6.21x slower* → ✅ **1.46x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
13.64 ms (❌ 1.39x slower) 12.06 ms (❌ 1.41x slower)

guest_calls

| | call | call_with_host_function | call_with_restore | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 49.34 µs (❌ 1.25x slower) | 73.79 µs (❌ 1.13x slower) | 189.44 µs (❌ 1.28x slower) | | |
| large | 48.85 µs (❌ 1.25x slower) | 93.69 µs (❌ 1.33x slower) | 413.68 µs (❌ 1.43x slower) | | |
| medium | 50.72 µs (❌ 1.28x slower) | 95.34 µs (❌ 1.35x slower) | 205.48 µs (❌ 1.34x slower) | | |
| small | 49.92 µs (❌ 1.30x slower) | 75.14 µs (❌ 1.20x slower) | 197.91 µs (❌ 1.45x slower) | | |
| | | | | 78.10 µs (✅ 1.00x faster) | 186.70 µs (❌ 6.21x slower) |

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.38 s (❌ 1.33x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
104.94 µs (❌ 1.49x slower) 81.89 µs (✅ 1.11x slower)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 6.59 ms (❌ 1.31x slower) 5.58 ms (❌ 1.31x slower) 1.14 ms (✅ 1.01x slower) 1.34 ms (✅ 1.16x faster)
large 305.84 ms (✅ 1.10x faster) 260.34 ms (✅ 1.15x faster) 320.34 ms (✅ 1.17x faster) 237.98 ms (✅ 1.46x faster)
medium 93.91 ms (✅ 1.11x slower) 66.16 ms (✅ 1.14x faster) 90.08 ms (✅ 1.05x faster) 61.64 ms (✅ 1.42x faster)
small 16.95 ms (✅ 1.09x slower) 13.47 ms (✅ 1.00x faster) 10.69 ms (✅ 1.19x faster) 8.86 ms (✅ 1.35x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 54.34 µs (❌ 1.27x slower) 60.73 µs (❌ 1.58x slower) 43.80 µs (✅ 1.05x slower)
64MB 11.00 ms (❌ 1.32x slower) 15.02 ms (❌ 1.59x slower) 8.64 ms (❌ 1.40x slower)

snapshots

create restore
default 702.69 µs (✅ 1.08x faster) 95.18 µs (❌ 1.25x slower)
large 371.88 ms (✅ 1.02x faster) 63.18 ms (❌ 1.62x slower)
medium 96.28 ms (✅ 1.08x slower) 1.27 ms (❌ 4.79x slower)
small 11.49 ms (✅ 1.01x slower) 108.10 µs (❌ 1.51x slower)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — ✅ 1.46x faster
  • Worst regression: guest_calls/interrupt_latency — ❌ 6.21x slower

hyperv-ws2025 / intel (Windows) (❌ *10.31x slower* → ✅ **1.26x faster**)

function_call_serialization

deserialize_function_call serialize_function_call
15.05 ms (✅ 1.03x slower) 14.78 ms (❌ 1.26x slower)

guest_calls

| | call | call_with_host_function | call_with_restore | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 120.81 µs (❌ 1.63x slower) | 210.11 µs (❌ 1.83x slower) | 398.99 µs (❌ 1.76x slower) | | |
| large | 167.65 µs (❌ 1.84x slower) | 287.82 µs (❌ 2.47x slower) | 1.03 ms (❌ 2.41x slower) | | |
| medium | 129.21 µs (❌ 1.56x slower) | 231.37 µs (❌ 1.88x slower) | 577.72 µs (❌ 1.92x slower) | | |
| small | 122.15 µs (❌ 1.53x slower) | 224.01 µs (❌ 1.80x slower) | 439.51 µs (❌ 1.97x slower) | | |
| | | | | 158.13 µs (❌ 1.47x slower) | 255.26 µs (❌ 3.85x slower) |

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.99 s (✅ 1.07x slower)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
97.38 µs (✅ 1.26x faster) 102.32 µs (✅ 1.21x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 10.68 ms (❌ 1.38x slower) 7.88 ms (❌ 1.34x slower) 1.24 ms (✅ 1.02x slower) 1.00 ms (✅ 1.12x faster)
large 420.17 ms (✅ 1.07x slower) 321.18 ms (✅ 1.01x faster) 415.51 ms (❌ 1.18x slower) 303.76 ms (✅ 1.01x faster)
medium 118.88 ms (❌ 1.14x slower) 89.26 ms (✅ 1.02x slower) 102.16 ms (❌ 1.17x slower) 75.92 ms (✅ 1.04x faster)
small 26.58 ms (❌ 1.28x slower) 18.21 ms (✅ 1.05x slower) 16.35 ms (❌ 1.34x slower) 10.56 ms (✅ 1.01x faster)

shared_memory

copy_from_slice copy_to_slice fill
1MB 69.05 µs (✅ 1.00x faster) 73.56 µs (✅ 1.08x slower) 36.21 µs (✅ 1.02x slower)
64MB 14.22 ms (✅ 1.00x faster) 17.48 ms (✅ 1.08x slower) 11.04 ms (✅ 1.11x slower)

snapshots

create restore
default 840.13 µs (✅ 1.21x faster) 291.95 µs (❌ 1.98x slower)
large 447.68 ms (✅ 1.06x faster) 76.96 ms (❌ 1.34x slower)
medium 115.97 ms (✅ 1.05x faster) 9.29 ms (❌ 10.31x slower)
small 15.60 ms (✅ 1.01x faster) 335.66 µs (❌ 2.44x slower)

Summary

  • Biggest gain: sample_workloads/24K_in_8K_out_c — ✅ 1.26x faster
  • Worst regression: snapshots/restore/medium — ❌ 10.31x slower
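
Each cell in the tables above pairs a measured time with a change relative to a previous run (presumably the base branch). A minimal sketch of reading one cell back into numbers, assuming the `<value> <unit> (<emoji> <ratio>x faster|slower)` shape seen in these tables. `Cell`, `Change`, and `parse_cell` are hypothetical names, and treating `(---)` as "no baseline available" is a guess.

```rust
/// Direction and size of a change relative to the previous run.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Change {
    Faster(f64),
    Slower(f64),
}

/// A parsed table cell: the measured time in nanoseconds plus the change, if any.
#[derive(Debug, PartialEq)]
struct Cell {
    nanos: f64,
    change: Option<Change>,
}

fn parse_cell(cell: &str) -> Option<Cell> {
    // Split "7.72 ms (❌ 1.77x slower)" into the time and the parenthesised note.
    let (time, note) = cell.trim().split_once(" (")?;
    let note = note.strip_suffix(')')?;

    // Convert the time to nanoseconds; both micro-sign spellings are accepted.
    let (value, unit) = time.split_once(' ')?;
    let scale = match unit {
        "ns" => 1.0,
        "µs" | "μs" => 1e3,
        "ms" => 1e6,
        "s" => 1e9,
        _ => return None,
    };
    let nanos = value.parse::<f64>().ok()? * scale;

    // Assumption: "---" marks a benchmark with nothing to compare against.
    let change = if note == "---" {
        None
    } else {
        // Skip the leading emoji, then read "<ratio>x <direction>".
        let (_, rest) = note.split_once(' ')?;
        let (ratio, direction) = rest.split_once("x ")?;
        let ratio = ratio.parse::<f64>().ok()?;
        match direction {
            "faster" => Some(Change::Faster(ratio)),
            "slower" => Some(Change::Slower(ratio)),
            _ => return None,
        }
    };
    Some(Cell { nanos, change })
}

fn main() {
    for text in ["7.72 ms (❌ 1.77x slower)", "11.90 ns (---)"] {
        println!("{text} -> {:?}", parse_cell(text));
    }
}
```

Normalising every time to one unit makes cells comparable across tables that mix ns, µs, ms, and s.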

@jsturtevant

Benchmark Results
kvm / amd (Linux) (❌ 1.77x slower → 🚀 2.45x faster)
kvm / intel (Linux) (❌ 1.79x slower → 🚀 13.07x faster)
mshv3 / amd (Linux) (❌ 2.39x slower → 🚀 2.69x faster)
mshv3 / intel (Linux) (❌ 1.71x slower → 🚀 2.54x faster)
hyperv-ws2025 / amd (Windows) (❌ 6.21x slower → ✅ 1.46x faster)
hyperv-ws2025 / intel (Windows) (❌ 10.31x slower → ✅ 1.26x faster)

It's not clear what this summary is telling me. Are all of these faster, and are we good to go? I looked at the details for a few, and it looks like it's slow for some and fast for others.
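
One reading of the per-platform headers quoted above is that each brackets the extremes of its report: the worst regression on the left and the biggest gain on the right, matching the two bullets in each Summary section of the bot comment. A minimal sketch of that reduction, with hypothetical names (`Entry`, `describe`, `range`) and speedup defined as baseline time divided by current time, which is an assumption about how the ratios are computed:

```rust
/// Speedup of one benchmark: baseline time divided by current time.
/// Values above 1.0 are faster than the baseline, values below 1.0 slower.
struct Entry<'a> {
    name: &'a str,
    speedup: f64,
}

/// Render a speedup in the report's wording, e.g. "1.77x slower".
fn describe(speedup: f64) -> String {
    if speedup >= 1.0 {
        format!("{:.2}x faster", speedup)
    } else {
        format!("{:.2}x slower", 1.0 / speedup)
    }
}

/// Return (worst regression, biggest gain), or None for an empty report.
fn range<'e>(entries: &'e [Entry<'e>]) -> Option<(&'e Entry<'e>, &'e Entry<'e>)> {
    let worst = entries
        .iter()
        .min_by(|a, b| a.speedup.total_cmp(&b.speedup))?;
    let best = entries
        .iter()
        .max_by(|a, b| a.speedup.total_cmp(&b.speedup))?;
    Some((worst, best))
}

fn main() {
    // Three values taken from the kvm / amd report; the rest are omitted.
    let entries = [
        Entry { name: "function_call_serialization/serialize_function_call", speedup: 1.0 / 1.77 },
        Entry { name: "sandboxes/create_initialized/large", speedup: 2.45 },
        Entry { name: "snapshots/restore/default", speedup: 1.0 / 1.02 },
    ];
    if let Some((worst, best)) = range(&entries) {
        println!("worst: {} ({})", worst.name, describe(worst.speedup));
        println!("best:  {} ({})", best.name, describe(best.speedup));
    }
}
```

Under this reading the header says nothing about the typical benchmark, so a platform can show a large regression and a large gain at once, which is consistent with the mixed per-benchmark results described in the comment.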

Comment thread .github/hyperlight-bot.yml
@ludfjig

ludfjig commented Jun 19, 2026

Is this ready for review, @jprendes?

jprendes added 2 commits June 23, 2026 16:07
Introduce a new internal tooling crate (hyperlight-ci) that provides:

- bench subcommand: Runs criterion benchmarks in parallel via
  criterion-swarm. Features include:
  - Configurable parallelism (-j N, defaults to all P-cores)
  - Configurable output modes (spinner, stream, summary)
  - Support for pre-built binaries (--binary) to skip rebuilds
  - Trailing args forwarded to criterion (filter, --exact, etc.)

- bench-report subcommand: Generates markdown comparison tables from
  criterion's target/criterion/ JSON output via criterion-markdown.
  Features include:
  - Benchmark discovery via criterion-swarm
  - Optional allowlist filtering via --binary or trailing args
  - Output to stdout

This replaces ad-hoc benchmark scripting with a unified tool suitable
for both local development and CI report generation.

Signed-off-by: Jorge Prendes <jorge.prendes@gmail.com>

- Add cargo alias (`cargo ci`) for convenient hyperlight-ci invocation
- Update dep_benchmarks workflow to use `cargo ci bench` and generate
  a markdown report via `cargo ci bench-report`, posting results as
  a PR comment per hypervisor/cpu matrix entry
- Add benchmarks job to ValidatePullRequest workflow with hypervisor
  and cpu matrix, gated behind docs-only and build-guests checks
- Grant pull-requests: write permission for PR comment posting
- Simplify Justfile bench recipes to delegate to `cargo ci bench`
- Update benchmarking docs to reflect the new workflow

Signed-off-by: Jorge Prendes <jorge.prendes@gmail.com>
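
The first commit message describes `-j N` as defaulting to all P-cores, and the reviewed snippet declares `jobs` with a default of 0. A minimal sketch of how such a sentinel could be resolved, assuming that 0 means "choose automatically". The standard library has no P-core query, so this sketch falls back to all logical CPUs, and `resolve_jobs` is a hypothetical helper rather than code from the PR.

```rust
use std::thread;

/// Resolve the requested job count. Assumption: 0 means "pick a default".
/// The PR is described as defaulting to all P-cores; this sketch uses the
/// number of logical CPUs instead, because std cannot tell core types apart.
fn resolve_jobs(requested: usize) -> usize {
    if requested > 0 {
        return requested;
    }
    // available_parallelism can fail on some platforms; run serially then.
    thread::available_parallelism()
        .map(|count| count.get())
        .unwrap_or(1)
}

fn main() {
    println!("explicit -j 4 -> {} jobs", resolve_jobs(4));
    println!("default (0)   -> {} jobs", resolve_jobs(0));
}
```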
@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ *1.15x slower* → 🚀 **2.76x faster**)

alloc_fragmented

fragmented_256
11.90 ns (---)

alloc_lifo

256 1.18 µs (---)
4096 1.20 µs (---)
1500 1.20 µs (---)

alloc_single

64 11.77 ns (---)
256 11.83 ns (---)
128 11.82 ns (---)
512 12.14 ns (---)
4096 12.09 ns (---)
1024 12.07 ns (---)
1500 12.12 ns (---)

free

256 12.09 ns (---)
4096 12.17 ns (---)
1500 12.15 ns (---)

free_list_reuse

fifo_pattern lifo_pattern
19.47 ns (---) 24.47 ns (---)

function_call_serialization

serialize_function_call deserialize_function_call
4.72 ms (✅ 1.08x slower) 9.90 ms (❌ 1.15x slower)

guest_calls

| | call_with_host_function | call_with_restore | different_thread | call | interrupt_latency |
|---|---|---|---|---|---|
| small | 25.86 µs (❌ 1.13x slower) | 74.82 µs (✅ 1.08x slower) | | 14.15 µs (❌ 1.13x slower) | |
| | | | 6.86 ms (✅ 1.03x slower) | | 21.59 µs (✅ 1.06x slower) |
| large | 24.65 µs (✅ 1.09x slower) | 101.36 µs (✅ 1.05x slower) | | 12.83 µs (✅ 1.03x slower) | |
| medium | 25.03 µs (❌ 1.13x slower) | 78.40 µs (✅ 1.01x faster) | | 14.18 µs (❌ 1.13x slower) | |
| default | 25.57 µs (✅ 1.10x slower) | 66.69 µs (✅ 1.09x slower) | | 14.10 µs (❌ 1.12x slower) | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
590.94 ms (✅ 1.06x slower)

recycle_pool

alloc_dealloc_128 alloc_dealloc_1500 alloc_sg_64k alloc_dealloc_4096
7.62 ns (---) 7.63 ns (---) 258.23 ns (---) 7.65 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
23.06 µs (✅ 1.70x faster) 22.92 µs (✅ 1.66x faster)

sandboxes

create_initialized create_uninitialized create_initialized_and_drop create_uninitialized_and_drop
default 1.74 ms (✅ 1.04x faster) 416.63 µs (✅ 1.25x faster) 11.70 ms (✅ 1.01x slower) 454.61 µs (✅ 1.24x faster)
large 26.03 ms (🚀 2.58x faster) 23.56 ms (🚀 2.76x faster) 37.92 ms (🚀 2.07x faster) 24.28 ms (🚀 2.72x faster)
small 3.55 ms (✅ 1.40x faster) 2.00 ms (🚀 1.89x faster) 15.70 ms (✅ 1.00x slower) 2.13 ms (✅ 1.68x faster)
medium 9.11 ms (🚀 2.13x faster) 7.42 ms (🚀 2.38x faster) 21.14 ms (✅ 1.45x faster) 7.72 ms (🚀 2.33x faster)

segmented_payload

8192 12.92 ns (---)
65536 23.47 ns (---)
262144 61.72 ns (---)

shared_memory

copy_from_slice fill copy_to_slice
1MB 41.26 µs (✅ 1.04x slower) 27.25 µs (✅ 1.32x faster) 38.10 µs (✅ 1.00x faster)
64MB 4.37 ms (✅ 1.03x slower) 2.11 ms (✅ 1.19x faster) 3.92 ms (✅ 1.04x faster)

snapshots

create restore
large 84.52 ms (✅ 1.66x faster) 35.93 µs (✅ 1.51x faster)
small 2.67 ms (✅ 1.32x faster) 12.27 µs (❌ 1.12x slower)
medium 21.91 ms (✅ 1.59x faster) 16.43 µs (✅ 1.05x slower)
default 330.58 µs (✅ 1.19x faster) 10.60 µs (✅ 1.04x slower)

virtq_readonly_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 642.88 ns (---) 776.21 ns (---) 775.83 ns (---) 855.85 ns (---)
65536 3.44 µs (---) 4.70 µs (---) 4.73 µs (---) 3.43 µs (---)
262144 12.68 µs (---) 18.26 µs (---) 18.62 µs (---) 12.66 µs (---)

virtq_readwrite_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 945.82 ns (---) 1.15 µs (---) 1.15 µs (---) 943.13 ns (---)
65536 4.88 µs (---) 7.71 µs (---) 7.85 µs (---) 4.88 µs (---)
262144 20.42 µs (---) 32.45 µs (---) 32.37 µs (---) 20.40 µs (---)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — 🚀 2.76x faster
  • Worst regression: function_call_serialization/deserialize_function_call — ❌ 1.15x slower

kvm / intel (Linux) (✅ **1.07x slower** → 🚀 **11.85x faster**)

alloc_fragmented

fragmented_256
11.54 ns (---)

alloc_lifo

256 1.14 µs (---)
4096 1.18 µs (---)
1500 1.18 µs (---)

alloc_single

64 11.41 ns (---)
256 11.37 ns (---)
128 11.39 ns (---)
512 11.69 ns (---)
4096 11.67 ns (---)
1024 11.73 ns (---)
1500 11.67 ns (---)

free

256 11.42 ns (---)
4096 11.99 ns (---)
1500 11.81 ns (---)

free_list_reuse

fifo_pattern lifo_pattern
25.80 ns (---) 22.90 ns (---)

function_call_serialization

serialize_function_call deserialize_function_call
5.10 ms (✅ 1.12x faster) 7.96 ms (✅ 1.57x faster)

guest_calls

| | call_with_host_function | call_with_restore | different_thread | call | interrupt_latency |
|---|---|---|---|---|---|
| small | 32.68 µs (✅ 1.15x faster) | 31.45 µs (✅ 1.20x faster) | | 17.57 µs (✅ 1.07x faster) | |
| | | | 6.77 ms (✅ 1.02x slower) | | 18.38 µs (✅ 1.47x faster) |
| large | 32.75 µs (✅ 1.07x faster) | 67.97 µs (✅ 1.13x faster) | | 17.64 µs (✅ 1.09x faster) | |
| medium | 32.81 µs (✅ 1.15x faster) | 37.11 µs (✅ 1.16x faster) | | 16.82 µs (✅ 1.12x faster) | |
| default | 33.73 µs (✅ 1.10x faster) | 31.09 µs (✅ 1.21x faster) | | 17.42 µs (✅ 1.09x faster) | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
559.91 ms (✅ 1.22x faster)

recycle_pool

alloc_dealloc_128 alloc_dealloc_1500 alloc_sg_64k alloc_dealloc_4096
7.07 ns (---) 7.08 ns (---) 252.85 ns (---) 7.06 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
27.77 µs (🚀 4.00x faster) 27.87 µs (🚀 3.81x faster)

sandboxes

create_initialized create_uninitialized create_initialized_and_drop create_uninitialized_and_drop
default 1.66 ms (✅ 1.26x faster) 346.23 µs (✅ 1.41x faster) 12.15 ms (✅ 1.07x slower) 375.18 µs (✅ 1.42x faster)
large 66.50 ms (🚀 2.34x faster) 65.00 ms (🚀 2.36x faster) 76.06 ms (🚀 2.21x faster) 62.42 ms (🚀 2.47x faster)
small 3.43 ms (✅ 1.61x faster) 2.02 ms (✅ 1.77x faster) 15.28 ms (✅ 1.13x faster) 2.00 ms (🚀 1.83x faster)
medium 18.66 ms (🚀 2.30x faster) 16.87 ms (🚀 2.41x faster) 28.87 ms (🚀 1.90x faster) 16.78 ms (🚀 2.40x faster)

segmented_payload

8192 13.03 ns (---)
65536 28.95 ns (---)
262144 82.86 ns (---)

shared_memory

copy_from_slice fill copy_to_slice
1MB 41.79 µs (✅ 1.12x faster) 23.38 µs (✅ 1.53x faster) 37.29 µs (✅ 1.10x faster)
64MB 4.57 ms (🚀 2.46x faster) 2.39 ms (✅ 1.29x faster) 4.60 ms (✅ 1.21x faster)

snapshots

create restore
large 156.77 ms (🚀 1.89x faster) 50.52 µs (🚀 11.85x faster)
small 2.28 ms (✅ 1.74x faster) 9.98 µs (✅ 1.15x faster)
medium 37.17 ms (🚀 2.01x faster) 13.41 µs (✅ 1.25x faster)
default 261.86 µs (✅ 1.47x faster) 9.69 µs (✅ 1.14x faster)

virtq_readonly_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 517.54 ns (---) 633.29 ns (---) 656.02 ns (---) 511.93 ns (---)
65536 4.46 µs (---) 5.29 µs (---) 5.31 µs (---) 4.48 µs (---)
262144 16.18 µs (---) 20.30 µs (---) 20.45 µs (---) 16.17 µs (---)

virtq_readwrite_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 713.44 ns (---) 939.27 ns (---) 946.94 ns (---) 713.99 ns (---)
65536 6.10 µs (---) 9.01 µs (---) 9.13 µs (---) 6.05 µs (---)
262144 22.23 µs (---) 34.04 µs (---) 34.47 µs (---) 22.21 µs (---)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 11.85x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ✅ 1.07x slower

mshv3 / amd (Linux) (❌ *1.18x slower* → 🚀 **8.04x faster**)

alloc_fragmented

fragmented_256
15.22 ns (---)

alloc_lifo

4096 1.54 µs (---)
256 1.52 µs (---)
1500 1.54 µs (---)

alloc_single

128 15.12 ns (---)
4096 15.69 ns (---)
256 15.12 ns (---)
1500 15.69 ns (---)
1024 15.71 ns (---)
512 15.70 ns (---)
64 15.13 ns (---)

free

4096 15.55 ns (---)
256 15.39 ns (---)
1500 15.57 ns (---)

free_list_reuse

lifo_pattern fifo_pattern
30.36 ns (---) 24.29 ns (---)

function_call_serialization

deserialize_function_call serialize_function_call
6.17 ms (✅ 1.12x faster) 4.89 ms (✅ 1.08x faster)

guest_calls

| | call_with_host_function | call_with_restore | call | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 58.54 µs (✅ 1.58x faster) | 130.10 µs (✅ 1.26x faster) | 39.06 µs (✅ 1.36x faster) | | |
| small | 58.84 µs (✅ 1.55x faster) | 133.64 µs (✅ 1.18x faster) | 36.93 µs (✅ 1.46x faster) | | |
| large | 59.46 µs (✅ 1.53x faster) | 189.54 µs (✅ 1.23x faster) | 37.21 µs (✅ 1.52x faster) | | |
| | | | | 50.41 µs (✅ 1.51x faster) | 53.33 µs (✅ 1.08x slower) |
| medium | 58.70 µs (✅ 1.51x faster) | 139.91 µs (✅ 1.21x faster) | 37.13 µs (✅ 1.52x faster) | | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
73.45 ms (✅ 1.02x faster)

recycle_pool

alloc_dealloc_128 alloc_dealloc_4096 alloc_sg_64k alloc_dealloc_1500
9.79 ns (---) 9.79 ns (---) 304.35 ns (---) 9.78 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
49.65 µs (✅ 1.73x faster) 47.87 µs (✅ 1.68x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 46.74 ms (✅ 1.33x faster) 10.67 ms (🚀 2.44x faster) 10.30 ms (🚀 2.50x faster) 13.06 ms (🚀 2.17x faster)
small 36.14 ms (✅ 1.16x faster) 1.99 ms (🚀 2.05x faster) 1.80 ms (🚀 2.15x faster) 3.40 ms (✅ 1.70x faster)
large 85.96 ms (✅ 1.68x faster) 35.64 ms (🚀 2.72x faster) 34.92 ms (🚀 2.70x faster) 42.60 ms (🚀 2.42x faster)
default 33.82 ms (✅ 1.02x faster) 533.85 µs (✅ 1.20x faster) 492.30 µs (✅ 1.24x faster) 1.38 ms (✅ 1.10x faster)

segmented_payload

262144 78.43 ns (---)
65536 29.71 ns (---)
8192 16.41 ns (---)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.09 ms (✅ 1.10x faster) 4.88 ms (✅ 1.00x slower) 2.58 ms (✅ 1.45x faster)
1MB 49.00 µs (❌ 1.18x slower) 48.53 µs (❌ 1.14x slower) 34.98 µs (✅ 1.18x faster)

snapshots

restore create
default 75.46 µs (✅ 1.12x faster) 433.19 µs (✅ 1.18x faster)
large 194.12 µs (🚀 8.04x faster) 113.83 ms (🚀 2.11x faster)
medium 88.35 µs (✅ 1.36x faster) 28.29 ms (🚀 2.03x faster)
small 76.40 µs (✅ 1.18x faster) 2.56 ms (✅ 1.73x faster)

virtq_readonly_allocator_strategy

recycle_pool_segmented_fragmented recycle_pool_segmented buffer_pool_run_fragmented buffer_pool_run
262144 22.50 µs (---) 23.70 µs (---) 15.94 µs (---) 15.85 µs (---)
65536 5.77 µs (---) 5.83 µs (---) 4.37 µs (---) 4.38 µs (---)
8192 976.16 ns (---) 955.57 ns (---) 738.87 ns (---) 799.93 ns (---)

virtq_readwrite_allocator_strategy

recycle_pool_segmented_fragmented recycle_pool_segmented buffer_pool_run_fragmented buffer_pool_run
262144 39.76 µs (---) 39.43 µs (---) 25.51 µs (---) 25.53 µs (---)
65536 9.38 µs (---) 9.41 µs (---) 6.05 µs (---) 6.06 µs (---)
8192 1.48 µs (---) 1.42 µs (---) 1.19 µs (---) 1.20 µs (---)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 8.04x faster
  • Worst regression: shared_memory/copy_from_slice/1MB — ❌ 1.18x slower

mshv3 / intel (Linux) (✅ **1.11x slower** → 🚀 **2.85x faster**)

alloc_fragmented

fragmented_256
14.24 ns (---)

alloc_lifo

4096 1.44 µs (---)
256 1.40 µs (---)
1500 1.44 µs (---)

alloc_single

4096 14.23 ns (---)
256 13.93 ns (---)
512 14.38 ns (---)
64 13.94 ns (---)
1024 14.23 ns (---)
1500 14.23 ns (---)
128 13.93 ns (---)

free

4096 14.39 ns (---)
256 14.19 ns (---)
1500 14.29 ns (---)

free_list_reuse

lifo_pattern fifo_pattern
27.94 ns (---) 20.66 ns (---)

function_call_serialization

deserialize_function_call serialize_function_call
13.67 ms (✅ 1.01x faster) 10.50 ms (✅ 1.00x slower)

guest_calls

| | call | call_with_host_function | call_with_restore | interrupt_latency | different_thread |
|---|---|---|---|---|---|
| large | 74.99 µs (✅ 1.36x faster) | 111.51 µs (✅ 1.56x faster) | 350.17 µs (✅ 1.19x faster) | | |
| small | 75.90 µs (✅ 1.40x faster) | 111.83 µs (✅ 1.55x faster) | 240.59 µs (✅ 1.15x faster) | | |
| default | 74.67 µs (✅ 1.40x faster) | 112.17 µs (✅ 1.50x faster) | 244.15 µs (✅ 1.14x faster) | | |
| medium | 75.20 µs (✅ 1.35x faster) | 109.99 µs (✅ 1.54x faster) | 256.20 µs (✅ 1.14x faster) | | |
| | | | | 91.12 µs (✅ 1.06x slower) | 93.55 µs (✅ 1.31x faster) |

guest_functions_with_large_parameters

guest_call_with_large_parameters
110.00 ms (✅ 1.17x faster)

recycle_pool

alloc_sg_64k alloc_dealloc_4096 alloc_dealloc_128 alloc_dealloc_1500
284.08 ns (---) 9.48 ns (---) 9.51 ns (---) 9.56 ns (---)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
90.97 µs (✅ 1.59x faster) 93.31 µs (✅ 1.57x faster)

sandboxes

create_initialized_and_drop create_uninitialized create_uninitialized_and_drop create_initialized
large 111.56 ms (✅ 1.39x faster) 61.84 ms (✅ 1.73x faster) 62.84 ms (✅ 1.68x faster) 70.09 ms (✅ 1.61x faster)
default 32.74 ms (✅ 1.01x faster) 369.81 µs (✅ 1.33x faster) 401.55 µs (✅ 1.28x faster) 1.57 ms (✅ 1.06x faster)
medium 57.00 ms (✅ 1.11x faster) 17.51 ms (✅ 1.61x faster) 17.19 ms (✅ 1.64x faster) 20.21 ms (✅ 1.55x faster)
small 41.90 ms (✅ 1.01x faster) 3.19 ms (✅ 1.37x faster) 2.88 ms (✅ 1.56x faster) 5.04 ms (✅ 1.29x faster)

segmented_payload

262144 86.60 ns (---)
8192 15.18 ns (---)
65536 30.84 ns (---)

shared_memory

fill copy_from_slice copy_to_slice
64MB 7.24 ms (✅ 1.00x faster) 12.40 ms (✅ 1.09x slower) 8.90 ms (✅ 1.06x faster)
1MB 32.52 µs (✅ 1.17x faster) 64.96 µs (✅ 1.04x slower) 42.90 µs (✅ 1.11x slower)

snapshots

restore create
large 916.67 µs (🚀 2.85x faster) 159.45 ms (✅ 1.32x faster)
small 157.51 µs (✅ 1.00x slower) 3.85 ms (✅ 1.55x faster)
default 158.66 µs (✅ 1.01x slower) 356.40 µs (✅ 1.27x faster)
medium 173.57 µs (✅ 1.01x faster) 41.22 ms (✅ 1.33x faster)

virtq_readonly_allocator_strategy

recycle_pool_segmented_fragmented buffer_pool_run recycle_pool_segmented buffer_pool_run_fragmented
262144 21.35 µs (---) 15.86 µs (---) 21.39 µs (---) 15.86 µs (---)
8192 667.30 ns (---) 569.79 ns (---) 666.21 ns (---) 550.11 ns (---)
65536 5.50 µs (---) 4.37 µs (---) 5.62 µs (---) 4.36 µs (---)

virtq_readwrite_allocator_strategy

recycle_pool_segmented_fragmented buffer_pool_run recycle_pool_segmented buffer_pool_run_fragmented
262144 44.12 µs (---) 26.07 µs (---) 42.69 µs (---) 25.85 µs (---)
8192 1.06 µs (---) 752.00 ns (---) 1.07 µs (---) 755.58 ns (---)
65536 9.38 µs (---) 5.89 µs (---) 9.32 µs (---) 5.89 µs (---)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 2.85x faster
  • Worst regression: shared_memory/copy_to_slice/1MB — ✅ 1.11x slower

hyperv-ws2025 / amd (Windows) (❌ *1.29x slower* → ✅ **1.46x faster**)

alloc_fragmented

fragmented_256
15.41 ns (---)

alloc_lifo

1500 1.60 µs (---)
256 1.54 µs (---)
4096 1.60 µs (---)

alloc_single

1024 16.10 ns (---)
128 15.41 ns (---)
1500 15.98 ns (---)
256 15.39 ns (---)
4096 15.99 ns (---)
512 15.98 ns (---)
64 15.34 ns (---)

free

1500 16.01 ns (---)
256 15.36 ns (---)
4096 15.96 ns (---)

free_list_reuse

fifo_pattern lifo_pattern
22.35 ns (---) 31.02 ns (---)

function_call_serialization

deserialize_function_call serialize_function_call
10.58 ms (✅ 1.08x slower) 8.98 ms (✅ 1.05x slower)

guest_calls

| | call | call_with_host_function | call_with_restore | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 40.16 µs (✅ 1.00x slower) | 66.37 µs (✅ 1.01x slower) | 155.98 µs (✅ 1.10x slower) | | |
| large | 43.25 µs (✅ 1.09x slower) | 67.76 µs (✅ 1.03x slower) | 333.92 µs (✅ 1.07x slower) | | |
| medium | 42.57 µs (✅ 1.08x slower) | 68.61 µs (✅ 1.07x slower) | 190.69 µs (✅ 1.10x slower) | | |
| small | 41.07 µs (✅ 1.07x slower) | 67.41 µs (✅ 1.07x slower) | 147.17 µs (✅ 1.02x slower) | | |
| | | | | 78.14 µs (✅ 1.03x slower) | 34.03 µs (❌ 1.13x slower) |

guest_functions_with_large_parameters

guest_call_with_large_parameters
3.66 s (✅ 1.11x slower)

recycle_pool

alloc_dealloc_128 alloc_dealloc_1500 alloc_dealloc_4096 alloc_sg_64k
9.87 ns (---) 9.87 ns (---) 10.03 ns (---) 351.66 ns (---)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
63.09 µs (✅ 1.15x faster) 65.66 µs (✅ 1.12x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 5.77 ms (❌ 1.15x slower) 3.89 ms (✅ 1.09x faster) 1.03 ms (✅ 1.26x faster) 1.03 ms (✅ 1.46x faster)
large 272.55 ms (✅ 1.23x faster) 237.72 ms (✅ 1.26x faster) 313.10 ms (✅ 1.20x faster) 287.44 ms (✅ 1.21x faster)
medium 71.14 ms (✅ 1.19x faster) 61.31 ms (✅ 1.23x faster) 78.77 ms (✅ 1.20x faster) 72.76 ms (✅ 1.21x faster)
small 14.44 ms (✅ 1.07x faster) 11.47 ms (✅ 1.17x faster) 10.66 ms (✅ 1.19x faster) 9.99 ms (✅ 1.20x faster)

segmented_payload

``
262144 83.25 ns (---)
65536 30.14 ns (---)
8192 16.91 ns (---)

shared_memory

copy_from_slice copy_to_slice fill
1MB 42.85 µs (✅ 1.02x slower) 43.13 µs (✅ 1.00x faster) 41.67 µs (✅ 1.01x slower)
64MB 9.91 ms (❌ 1.19x slower) 10.33 ms (✅ 1.09x slower) 6.56 ms (✅ 1.06x slower)

snapshots

create restore
default 637.67 µs (✅ 1.17x faster) 90.97 µs (❌ 1.20x slower)
large 316.11 ms (✅ 1.20x faster) 40.35 ms (✅ 1.03x slower)
medium 74.02 ms (✅ 1.21x faster) 137.44 µs (✅ 1.01x faster)
small 9.31 ms (✅ 1.22x faster) 99.71 µs (❌ 1.29x slower)

virtq_readonly_allocator_strategy

buffer_pool_run buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented
262144 17.77 µs (---) 18.31 µs (---) 25.93 µs (---) 25.85 µs (---)
65536 4.53 µs (---) 4.82 µs (---) 5.91 µs (---) 5.96 µs (---)
8192 807.66 ns (---) 821.96 ns (---) 950.84 ns (---) 927.53 ns (---)

virtq_readwrite_allocator_strategy

buffer_pool_run buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented
262144 25.97 µs (---) 25.91 µs (---) 44.35 µs (---) 48.28 µs (---)
65536 6.60 µs (---) 6.55 µs (---) 10.75 µs (---) 10.87 µs (---)
8192 1.22 µs (---) 1.22 µs (---) 1.39 µs (---) 1.42 µs (---)

Summary

  • Biggest gain: sandboxes/create_uninitialized/default — ✅ 1.46x faster
  • Worst regression: snapshots/restore/small — ❌ 1.29x slower

@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ *1.15x slower* → 🚀 **2.76x faster**)

alloc_fragmented

fragmented_256
11.90 ns (---)

alloc_lifo

``
256 1.18 µs (---)
4096 1.20 µs (---)
1500 1.20 µs (---)

alloc_single

``
64 11.77 ns (---)
256 11.83 ns (---)
128 11.82 ns (---)
512 12.14 ns (---)
4096 12.09 ns (---)
1024 12.07 ns (---)
1500 12.12 ns (---)

free

``
256 12.09 ns (---)
4096 12.17 ns (---)
1500 12.15 ns (---)

free_list_reuse

fifo_pattern lifo_pattern
19.47 ns (---) 24.47 ns (---)

function_call_serialization

serialize_function_call deserialize_function_call
4.72 ms (✅ 1.08x slower) 9.90 ms (❌ 1.15x slower)

guest_calls

| | call_with_host_function | call_with_restore | different_thread | call | interrupt_latency |
|---|---|---|---|---|---|
| small | 25.86 µs (❌ 1.13x slower) | 74.82 µs (✅ 1.08x slower) | | 14.15 µs (❌ 1.13x slower) | |
| | | | 6.86 ms (✅ 1.03x slower) | | 21.59 µs (✅ 1.06x slower) |
| large | 24.65 µs (✅ 1.09x slower) | 101.36 µs (✅ 1.05x slower) | | 12.83 µs (✅ 1.03x slower) | |
| medium | 25.03 µs (❌ 1.13x slower) | 78.40 µs (✅ 1.01x faster) | | 14.18 µs (❌ 1.13x slower) | |
| default | 25.57 µs (✅ 1.10x slower) | 66.69 µs (✅ 1.09x slower) | | 14.10 µs (❌ 1.12x slower) | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
590.94 ms (✅ 1.06x slower)

recycle_pool

alloc_dealloc_128 alloc_dealloc_1500 alloc_sg_64k alloc_dealloc_4096
7.62 ns (---) 7.63 ns (---) 258.23 ns (---) 7.65 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
23.06 µs (✅ 1.70x faster) 22.92 µs (✅ 1.66x faster)

sandboxes

create_initialized create_uninitialized create_initialized_and_drop create_uninitialized_and_drop
default 1.74 ms (✅ 1.04x faster) 416.63 µs (✅ 1.25x faster) 11.70 ms (✅ 1.01x slower) 454.61 µs (✅ 1.24x faster)
large 26.03 ms (🚀 2.58x faster) 23.56 ms (🚀 2.76x faster) 37.92 ms (🚀 2.07x faster) 24.28 ms (🚀 2.72x faster)
small 3.55 ms (✅ 1.40x faster) 2.00 ms (🚀 1.89x faster) 15.70 ms (✅ 1.00x slower) 2.13 ms (✅ 1.68x faster)
medium 9.11 ms (🚀 2.13x faster) 7.42 ms (🚀 2.38x faster) 21.14 ms (✅ 1.45x faster) 7.72 ms (🚀 2.33x faster)

segmented_payload

``
8192 12.92 ns (---)
65536 23.47 ns (---)
262144 61.72 ns (---)

shared_memory

copy_from_slice fill copy_to_slice
1MB 41.26 µs (✅ 1.04x slower) 27.25 µs (✅ 1.32x faster) 38.10 µs (✅ 1.00x faster)
64MB 4.37 ms (✅ 1.03x slower) 2.11 ms (✅ 1.19x faster) 3.92 ms (✅ 1.04x faster)

snapshots

create restore
large 84.52 ms (✅ 1.66x faster) 35.93 µs (✅ 1.51x faster)
small 2.67 ms (✅ 1.32x faster) 12.27 µs (❌ 1.12x slower)
medium 21.91 ms (✅ 1.59x faster) 16.43 µs (✅ 1.05x slower)
default 330.58 µs (✅ 1.19x faster) 10.60 µs (✅ 1.04x slower)

virtq_readonly_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 642.88 ns (---) 776.21 ns (---) 775.83 ns (---) 855.85 ns (---)
65536 3.44 µs (---) 4.70 µs (---) 4.73 µs (---) 3.43 µs (---)
262144 12.68 µs (---) 18.26 µs (---) 18.62 µs (---) 12.66 µs (---)

virtq_readwrite_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 945.82 ns (---) 1.15 µs (---) 1.15 µs (---) 943.13 ns (---)
65536 4.88 µs (---) 7.71 µs (---) 7.85 µs (---) 4.88 µs (---)
262144 20.42 µs (---) 32.45 µs (---) 32.37 µs (---) 20.40 µs (---)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — 🚀 2.76x faster
  • Worst regression: function_call_serialization/deserialize_function_call — ❌ 1.15x slower
kvm / intel (Linux) (✅ **1.07x slower** → 🚀 **11.85x faster**)

alloc_fragmented

fragmented_256
11.54 ns (---)

alloc_lifo

``
256 1.14 µs (---)
4096 1.18 µs (---)
1500 1.18 µs (---)

alloc_single

``
64 11.41 ns (---)
256 11.37 ns (---)
128 11.39 ns (---)
512 11.69 ns (---)
4096 11.67 ns (---)
1024 11.73 ns (---)
1500 11.67 ns (---)

free

``
256 11.42 ns (---)
4096 11.99 ns (---)
1500 11.81 ns (---)

free_list_reuse

fifo_pattern lifo_pattern
25.80 ns (---) 22.90 ns (---)

function_call_serialization

serialize_function_call deserialize_function_call
5.10 ms (✅ 1.12x faster) 7.96 ms (✅ 1.57x faster)

guest_calls

| | call_with_host_function | call_with_restore | different_thread | call | interrupt_latency |
|---|---|---|---|---|---|
| small | 32.68 µs (✅ 1.15x faster) | 31.45 µs (✅ 1.20x faster) | | 17.57 µs (✅ 1.07x faster) | |
| | | | 6.77 ms (✅ 1.02x slower) | | 18.38 µs (✅ 1.47x faster) |
| large | 32.75 µs (✅ 1.07x faster) | 67.97 µs (✅ 1.13x faster) | | 17.64 µs (✅ 1.09x faster) | |
| medium | 32.81 µs (✅ 1.15x faster) | 37.11 µs (✅ 1.16x faster) | | 16.82 µs (✅ 1.12x faster) | |
| default | 33.73 µs (✅ 1.10x faster) | 31.09 µs (✅ 1.21x faster) | | 17.42 µs (✅ 1.09x faster) | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
559.91 ms (✅ 1.22x faster)

recycle_pool

alloc_dealloc_128 alloc_dealloc_1500 alloc_sg_64k alloc_dealloc_4096
7.07 ns (---) 7.08 ns (---) 252.85 ns (---) 7.06 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
27.77 µs (🚀 4.00x faster) 27.87 µs (🚀 3.81x faster)

sandboxes

create_initialized create_uninitialized create_initialized_and_drop create_uninitialized_and_drop
default 1.66 ms (✅ 1.26x faster) 346.23 µs (✅ 1.41x faster) 12.15 ms (✅ 1.07x slower) 375.18 µs (✅ 1.42x faster)
large 66.50 ms (🚀 2.34x faster) 65.00 ms (🚀 2.36x faster) 76.06 ms (🚀 2.21x faster) 62.42 ms (🚀 2.47x faster)
small 3.43 ms (✅ 1.61x faster) 2.02 ms (✅ 1.77x faster) 15.28 ms (✅ 1.13x faster) 2.00 ms (🚀 1.83x faster)
medium 18.66 ms (🚀 2.30x faster) 16.87 ms (🚀 2.41x faster) 28.87 ms (🚀 1.90x faster) 16.78 ms (🚀 2.40x faster)

segmented_payload

``
8192 13.03 ns (---)
65536 28.95 ns (---)
262144 82.86 ns (---)

shared_memory

copy_from_slice fill copy_to_slice
1MB 41.79 µs (✅ 1.12x faster) 23.38 µs (✅ 1.53x faster) 37.29 µs (✅ 1.10x faster)
64MB 4.57 ms (🚀 2.46x faster) 2.39 ms (✅ 1.29x faster) 4.60 ms (✅ 1.21x faster)

snapshots

create restore
large 156.77 ms (🚀 1.89x faster) 50.52 µs (🚀 11.85x faster)
small 2.28 ms (✅ 1.74x faster) 9.98 µs (✅ 1.15x faster)
medium 37.17 ms (🚀 2.01x faster) 13.41 µs (✅ 1.25x faster)
default 261.86 µs (✅ 1.47x faster) 9.69 µs (✅ 1.14x faster)

virtq_readonly_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 517.54 ns (---) 633.29 ns (---) 656.02 ns (---) 511.93 ns (---)
65536 4.46 µs (---) 5.29 µs (---) 5.31 µs (---) 4.48 µs (---)
262144 16.18 µs (---) 20.30 µs (---) 20.45 µs (---) 16.17 µs (---)

virtq_readwrite_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 713.44 ns (---) 939.27 ns (---) 946.94 ns (---) 713.99 ns (---)
65536 6.10 µs (---) 9.01 µs (---) 9.13 µs (---) 6.05 µs (---)
262144 22.23 µs (---) 34.04 µs (---) 34.47 µs (---) 22.21 µs (---)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 11.85x faster
  • Worst regression: sandboxes/create_initialized_and_drop/default — ✅ 1.07x slower
mshv3 / amd (Linux) (❌ *1.18x slower* → 🚀 **8.04x faster**)

alloc_fragmented

fragmented_256
15.22 ns (---)

alloc_lifo

``
4096 1.54 µs (---)
256 1.52 µs (---)
1500 1.54 µs (---)

alloc_single

``
128 15.12 ns (---)
4096 15.69 ns (---)
256 15.12 ns (---)
1500 15.69 ns (---)
1024 15.71 ns (---)
512 15.70 ns (---)
64 15.13 ns (---)

free

``
4096 15.55 ns (---)
256 15.39 ns (---)
1500 15.57 ns (---)

free_list_reuse

lifo_pattern fifo_pattern
30.36 ns (---) 24.29 ns (---)

function_call_serialization

deserialize_function_call serialize_function_call
6.17 ms (✅ 1.12x faster) 4.89 ms (✅ 1.08x faster)

guest_calls

| | call_with_host_function | call_with_restore | call | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 58.54 µs (✅ 1.58x faster) | 130.10 µs (✅ 1.26x faster) | 39.06 µs (✅ 1.36x faster) | | |
| small | 58.84 µs (✅ 1.55x faster) | 133.64 µs (✅ 1.18x faster) | 36.93 µs (✅ 1.46x faster) | | |
| large | 59.46 µs (✅ 1.53x faster) | 189.54 µs (✅ 1.23x faster) | 37.21 µs (✅ 1.52x faster) | | |
| | | | | 50.41 µs (✅ 1.51x faster) | 53.33 µs (✅ 1.08x slower) |
| medium | 58.70 µs (✅ 1.51x faster) | 139.91 µs (✅ 1.21x faster) | 37.13 µs (✅ 1.52x faster) | | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
73.45 ms (✅ 1.02x faster)

recycle_pool

alloc_dealloc_128 alloc_dealloc_4096 alloc_sg_64k alloc_dealloc_1500
9.79 ns (---) 9.79 ns (---) 304.35 ns (---) 9.78 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
49.65 µs (✅ 1.73x faster) 47.87 µs (✅ 1.68x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 46.74 ms (✅ 1.33x faster) 10.67 ms (🚀 2.44x faster) 10.30 ms (🚀 2.50x faster) 13.06 ms (🚀 2.17x faster)
small 36.14 ms (✅ 1.16x faster) 1.99 ms (🚀 2.05x faster) 1.80 ms (🚀 2.15x faster) 3.40 ms (✅ 1.70x faster)
large 85.96 ms (✅ 1.68x faster) 35.64 ms (🚀 2.72x faster) 34.92 ms (🚀 2.70x faster) 42.60 ms (🚀 2.42x faster)
default 33.82 ms (✅ 1.02x faster) 533.85 µs (✅ 1.20x faster) 492.30 µs (✅ 1.24x faster) 1.38 ms (✅ 1.10x faster)

segmented_payload

``
262144 78.43 ns (---)
65536 29.71 ns (---)
8192 16.41 ns (---)

shared_memory

copy_from_slice copy_to_slice fill
64MB 5.09 ms (✅ 1.10x faster) 4.88 ms (✅ 1.00x slower) 2.58 ms (✅ 1.45x faster)
1MB 49.00 µs (❌ 1.18x slower) 48.53 µs (❌ 1.14x slower) 34.98 µs (✅ 1.18x faster)

snapshots

restore create
default 75.46 µs (✅ 1.12x faster) 433.19 µs (✅ 1.18x faster)
large 194.12 µs (🚀 8.04x faster) 113.83 ms (🚀 2.11x faster)
medium 88.35 µs (✅ 1.36x faster) 28.29 ms (🚀 2.03x faster)
small 76.40 µs (✅ 1.18x faster) 2.56 ms (✅ 1.73x faster)

virtq_readonly_allocator_strategy

recycle_pool_segmented_fragmented recycle_pool_segmented buffer_pool_run_fragmented buffer_pool_run
262144 22.50 µs (---) 23.70 µs (---) 15.94 µs (---) 15.85 µs (---)
65536 5.77 µs (---) 5.83 µs (---) 4.37 µs (---) 4.38 µs (---)
8192 976.16 ns (---) 955.57 ns (---) 738.87 ns (---) 799.93 ns (---)

virtq_readwrite_allocator_strategy

recycle_pool_segmented_fragmented recycle_pool_segmented buffer_pool_run_fragmented buffer_pool_run
262144 39.76 µs (---) 39.43 µs (---) 25.51 µs (---) 25.53 µs (---)
65536 9.38 µs (---) 9.41 µs (---) 6.05 µs (---) 6.06 µs (---)
8192 1.48 µs (---) 1.42 µs (---) 1.19 µs (---) 1.20 µs (---)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 8.04x faster
  • Worst regression: shared_memory/copy_from_slice/1MB — ❌ 1.18x slower
mshv3 / intel (Linux) (✅ **1.11x slower** → 🚀 **2.85x faster**)

alloc_fragmented

fragmented_256
14.24 ns (---)

alloc_lifo

``
4096 1.44 µs (---)
256 1.40 µs (---)
1500 1.44 µs (---)

alloc_single

``
4096 14.23 ns (---)
256 13.93 ns (---)
512 14.38 ns (---)
64 13.94 ns (---)
1024 14.23 ns (---)
1500 14.23 ns (---)
128 13.93 ns (---)

free

``
4096 14.39 ns (---)
256 14.19 ns (---)
1500 14.29 ns (---)

free_list_reuse

lifo_pattern fifo_pattern
27.94 ns (---) 20.66 ns (---)

function_call_serialization

deserialize_function_call serialize_function_call
13.67 ms (✅ 1.01x faster) 10.50 ms (✅ 1.00x slower)

guest_calls

| | call | call_with_host_function | call_with_restore | interrupt_latency | different_thread |
|---|---|---|---|---|---|
| large | 74.99 µs (✅ 1.36x faster) | 111.51 µs (✅ 1.56x faster) | 350.17 µs (✅ 1.19x faster) | | |
| small | 75.90 µs (✅ 1.40x faster) | 111.83 µs (✅ 1.55x faster) | 240.59 µs (✅ 1.15x faster) | | |
| default | 74.67 µs (✅ 1.40x faster) | 112.17 µs (✅ 1.50x faster) | 244.15 µs (✅ 1.14x faster) | | |
| medium | 75.20 µs (✅ 1.35x faster) | 109.99 µs (✅ 1.54x faster) | 256.20 µs (✅ 1.14x faster) | | |
| | | | | 91.12 µs (✅ 1.06x slower) | 93.55 µs (✅ 1.31x faster) |

guest_functions_with_large_parameters

guest_call_with_large_parameters
110.00 ms (✅ 1.17x faster)

recycle_pool

alloc_sg_64k alloc_dealloc_4096 alloc_dealloc_128 alloc_dealloc_1500
284.08 ns (---) 9.48 ns (---) 9.51 ns (---) 9.56 ns (---)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
90.97 µs (✅ 1.59x faster) 93.31 µs (✅ 1.57x faster)

sandboxes

create_initialized_and_drop create_uninitialized create_uninitialized_and_drop create_initialized
large 111.56 ms (✅ 1.39x faster) 61.84 ms (✅ 1.73x faster) 62.84 ms (✅ 1.68x faster) 70.09 ms (✅ 1.61x faster)
default 32.74 ms (✅ 1.01x faster) 369.81 µs (✅ 1.33x faster) 401.55 µs (✅ 1.28x faster) 1.57 ms (✅ 1.06x faster)
medium 57.00 ms (✅ 1.11x faster) 17.51 ms (✅ 1.61x faster) 17.19 ms (✅ 1.64x faster) 20.21 ms (✅ 1.55x faster)
small 41.90 ms (✅ 1.01x faster) 3.19 ms (✅ 1.37x faster) 2.88 ms (✅ 1.56x faster) 5.04 ms (✅ 1.29x faster)

segmented_payload

``
262144 86.60 ns (---)
8192 15.18 ns (---)
65536 30.84 ns (---)

shared_memory

fill copy_from_slice copy_to_slice
64MB 7.24 ms (✅ 1.00x faster) 12.40 ms (✅ 1.09x slower) 8.90 ms (✅ 1.06x faster)
1MB 32.52 µs (✅ 1.17x faster) 64.96 µs (✅ 1.04x slower) 42.90 µs (✅ 1.11x slower)

snapshots

restore create
large 916.67 µs (🚀 2.85x faster) 159.45 ms (✅ 1.32x faster)
small 157.51 µs (✅ 1.00x slower) 3.85 ms (✅ 1.55x faster)
default 158.66 µs (✅ 1.01x slower) 356.40 µs (✅ 1.27x faster)
medium 173.57 µs (✅ 1.01x faster) 41.22 ms (✅ 1.33x faster)

virtq_readonly_allocator_strategy

recycle_pool_segmented_fragmented buffer_pool_run recycle_pool_segmented buffer_pool_run_fragmented
262144 21.35 µs (---) 15.86 µs (---) 21.39 µs (---) 15.86 µs (---)
8192 667.30 ns (---) 569.79 ns (---) 666.21 ns (---) 550.11 ns (---)
65536 5.50 µs (---) 4.37 µs (---) 5.62 µs (---) 4.36 µs (---)

virtq_readwrite_allocator_strategy

recycle_pool_segmented_fragmented buffer_pool_run recycle_pool_segmented buffer_pool_run_fragmented
262144 44.12 µs (---) 26.07 µs (---) 42.69 µs (---) 25.85 µs (---)
8192 1.06 µs (---) 752.00 ns (---) 1.07 µs (---) 755.58 ns (---)
65536 9.38 µs (---) 5.89 µs (---) 9.32 µs (---) 5.89 µs (---)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 2.85x faster
  • Worst regression: shared_memory/copy_to_slice/1MB — ✅ 1.11x slower
hyperv-ws2025 / amd (Windows) (❌ *1.29x slower* → ✅ **1.46x faster**)

alloc_fragmented

fragmented_256
15.41 ns (---)

alloc_lifo

``
1500 1.60 µs (---)
256 1.54 µs (---)
4096 1.60 µs (---)

alloc_single

``
1024 16.10 ns (---)
128 15.41 ns (---)
1500 15.98 ns (---)
256 15.39 ns (---)
4096 15.99 ns (---)
512 15.98 ns (---)
64 15.34 ns (---)

free

``
1500 16.01 ns (---)
256 15.36 ns (---)
4096 15.96 ns (---)

free_list_reuse

fifo_pattern lifo_pattern
22.35 ns (---) 31.02 ns (---)

function_call_serialization

deserialize_function_call serialize_function_call
10.58 ms (✅ 1.08x slower) 8.98 ms (✅ 1.05x slower)

guest_calls

| | call | call_with_host_function | call_with_restore | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 40.16 µs (✅ 1.00x slower) | 66.37 µs (✅ 1.01x slower) | 155.98 µs (✅ 1.10x slower) | | |
| large | 43.25 µs (✅ 1.09x slower) | 67.76 µs (✅ 1.03x slower) | 333.92 µs (✅ 1.07x slower) | | |
| medium | 42.57 µs (✅ 1.08x slower) | 68.61 µs (✅ 1.07x slower) | 190.69 µs (✅ 1.10x slower) | | |
| small | 41.07 µs (✅ 1.07x slower) | 67.41 µs (✅ 1.07x slower) | 147.17 µs (✅ 1.02x slower) | | |
| | | | | 78.14 µs (✅ 1.03x slower) | 34.03 µs (❌ 1.13x slower) |

guest_functions_with_large_parameters

guest_call_with_large_parameters
3.66 s (✅ 1.11x slower)

recycle_pool

alloc_dealloc_128 alloc_dealloc_1500 alloc_dealloc_4096 alloc_sg_64k
9.87 ns (---) 9.87 ns (---) 10.03 ns (---) 351.66 ns (---)

sample_workloads

24K_in_8K_out_c 24K_in_8K_out_rust
63.09 µs (✅ 1.15x faster) 65.66 µs (✅ 1.12x faster)

sandboxes

create_initialized_and_drop create_initialized create_uninitialized_and_drop create_uninitialized
default 5.77 ms (❌ 1.15x slower) 3.89 ms (✅ 1.09x faster) 1.03 ms (✅ 1.26x faster) 1.03 ms (✅ 1.46x faster)
large 272.55 ms (✅ 1.23x faster) 237.72 ms (✅ 1.26x faster) 313.10 ms (✅ 1.20x faster) 287.44 ms (✅ 1.21x faster)
medium 71.14 ms (✅ 1.19x faster) 61.31 ms (✅ 1.23x faster) 78.77 ms (✅ 1.20x faster) 72.76 ms (✅ 1.21x faster)
small 14.44 ms (✅ 1.07x faster) 11.47 ms (✅ 1.17x faster) 10.66 ms (✅ 1.19x faster) 9.99 ms (✅ 1.20x faster)

segmented_payload

``
262144 83.25 ns (---)
65536 30.14 ns (---)
8192 16.91 ns (---)

shared_memory

copy_from_slice copy_to_slice fill
1MB 42.85 µs (✅ 1.02x slower) 43.13 µs (✅ 1.00x faster) 41.67 µs (✅ 1.01x slower)
64MB 9.91 ms (❌ 1.19x slower) 10.33 ms (✅ 1.09x slower) 6.56 ms (✅ 1.06x slower)

snapshots

create restore
default 637.67 µs (✅ 1.17x faster) 90.97 µs (❌ 1.20x slower)
large 316.11 ms (✅ 1.20x faster) 40.35 ms (✅ 1.03x slower)
medium 74.02 ms (✅ 1.21x faster) 137.44 µs (✅ 1.01x faster)
small 9.31 ms (✅ 1.22x faster) 99.71 µs (❌ 1.29x slower)

virtq_readonly_allocator_strategy

buffer_pool_run buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented
262144 17.77 µs (---) 18.31 µs (---) 25.93 µs (---) 25.85 µs (---)
65536 4.53 µs (---) 4.82 µs (---) 5.91 µs (---) 5.96 µs (---)
8192 807.66 ns (---) 821.96 ns (---) 950.84 ns (---) 927.53 ns (---)

virtq_readwrite_allocator_strategy

buffer_pool_run buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented
262144 25.97 µs (---) 25.91 µs (---) 44.35 µs (---) 48.28 µs (---)
65536 6.60 µs (---) 6.55 µs (---) 10.75 µs (---) 10.87 µs (---)
8192 1.22 µs (---) 1.22 µs (---) 1.39 µs (---) 1.42 µs (---)

Summary

  • Biggest gain: sandboxes/create_uninitialized/default — ✅ 1.46x faster
  • Worst regression: snapshots/restore/small — ❌ 1.29x slower

Signed-off-by: Jorge Prendes <jorge.prendes@gmail.com>
@hyperlight-gh-bot

Benchmark Results

kvm / amd (Linux) (❌ *1.13x slower* → 🚀 **2.73x faster**)

alloc_fragmented

fragmented_256
11.85 ns (---)

alloc_lifo

``
256 1.20 µs (---)
4096 1.20 µs (---)
1500 1.21 µs (---)

alloc_single

``
64 11.77 ns (---)
256 11.78 ns (---)
128 11.77 ns (---)
512 12.07 ns (---)
4096 12.27 ns (---)
1024 12.08 ns (---)
1500 12.09 ns (---)

free

``
256 11.76 ns (---)
4096 12.13 ns (---)
1500 12.10 ns (---)

free_list_reuse

fifo_pattern lifo_pattern
20.14 ns (---) 23.75 ns (---)

function_call_serialization

serialize_function_call deserialize_function_call
4.95 ms (❌ 1.13x slower) 9.04 ms (✅ 1.05x slower)

guest_calls

| | call_with_host_function | call_with_restore | different_thread | call | interrupt_latency |
|---|---|---|---|---|---|
| small | 23.91 µs (✅ 1.01x slower) | 73.49 µs (❌ 1.11x slower) | | 14.00 µs (❌ 1.12x slower) | |
| | | | 6.73 ms (✅ 1.01x slower) | | 21.68 µs (✅ 1.07x slower) |
| large | 24.59 µs (✅ 1.08x slower) | 99.02 µs (✅ 1.06x slower) | | 13.25 µs (✅ 1.07x slower) | |
| medium | 24.59 µs (❌ 1.12x slower) | 78.97 µs (✅ 1.02x slower) | | 14.07 µs (❌ 1.12x slower) | |
| default | 25.69 µs (❌ 1.11x slower) | 68.00 µs (❌ 1.12x slower) | | 14.05 µs (✅ 1.11x slower) | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
589.06 ms (✅ 1.06x slower)

recycle_pool

alloc_dealloc_128 alloc_dealloc_1500 alloc_sg_64k alloc_dealloc_4096
7.62 ns (---) 7.62 ns (---) 256.77 ns (---) 7.69 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
23.04 µs (✅ 1.69x faster) 22.15 µs (✅ 1.73x faster)

sandboxes

create_initialized create_uninitialized create_initialized_and_drop create_uninitialized_and_drop
default 1.83 ms (✅ 1.01x slower) 418.26 µs (✅ 1.24x faster) 11.88 ms (✅ 1.02x slower) 450.25 µs (✅ 1.25x faster)
large 25.87 ms (🚀 2.59x faster) 23.85 ms (🚀 2.73x faster) 38.54 ms (🚀 2.03x faster) 24.56 ms (🚀 2.69x faster)
small 3.86 ms (✅ 1.29x faster) 2.03 ms (🚀 1.86x faster) 15.86 ms (✅ 1.01x slower) 2.21 ms (✅ 1.62x faster)
medium 9.25 ms (🚀 2.09x faster) 7.51 ms (🚀 2.35x faster) 21.13 ms (✅ 1.45x faster) 7.83 ms (🚀 2.30x faster)

segmented_payload

``
8192 12.81 ns (---)
65536 23.17 ns (---)
262144 61.05 ns (---)

shared_memory

copy_from_slice fill copy_to_slice
1MB 37.28 µs (✅ 1.06x faster) 27.33 µs (✅ 1.32x faster) 37.61 µs (✅ 1.02x faster)
64MB 4.44 ms (✅ 1.04x slower) 2.20 ms (✅ 1.14x faster) 4.20 ms (✅ 1.03x slower)

snapshots

create restore
large 84.80 ms (✅ 1.65x faster) 35.87 µs (✅ 1.53x faster)
small 2.32 ms (✅ 1.52x faster) 11.52 µs (✅ 1.09x slower)
medium 21.71 ms (✅ 1.61x faster) 15.35 µs (✅ 1.00x faster)
default 327.97 µs (✅ 1.19x faster) 10.61 µs (✅ 1.05x slower)

virtq_readonly_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 662.92 ns (---) 785.15 ns (---) 855.09 ns (---) 676.99 ns (---)
65536 3.47 µs (---) 4.66 µs (---) 4.80 µs (---) 3.46 µs (---)
262144 18.15 µs (---) 18.29 µs (---) 26.39 µs (---) 14.28 µs (---)

virtq_readwrite_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 952.47 ns (---) 1.21 µs (---) 1.19 µs (---) 945.89 ns (---)
65536 4.90 µs (---) 7.99 µs (---) 7.76 µs (---) 4.91 µs (---)
262144 21.81 µs (---) 31.76 µs (---) 31.75 µs (---) 21.86 µs (---)

Summary

  • Biggest gain: sandboxes/create_uninitialized/large — 🚀 2.73x faster
  • Worst regression: function_call_serialization/serialize_function_call — ❌ 1.13x slower
kvm / intel (Linux) (✅ **1.08x slower** → 🚀 **11.46x faster**)

alloc_fragmented

fragmented_256
13.25 ns (---)

alloc_lifo

``
256 1.31 µs (---)
4096 1.35 µs (---)
1500 1.36 µs (---)

alloc_single

``
64 12.87 ns (---)
256 12.89 ns (---)
128 12.90 ns (---)
512 13.31 ns (---)
4096 13.39 ns (---)
1024 13.36 ns (---)
1500 13.33 ns (---)

free

``
256 13.06 ns (---)
4096 13.44 ns (---)
1500 13.56 ns (---)

free_list_reuse

fifo_pattern lifo_pattern
26.16 ns (---) 26.07 ns (---)

function_call_serialization

serialize_function_call deserialize_function_call
5.82 ms (✅ 1.02x slower) 8.62 ms (✅ 1.45x faster)

guest_calls

| | call_with_host_function | call_with_restore | different_thread | call | interrupt_latency |
|---|---|---|---|---|---|
| small | 36.32 µs (✅ 1.06x faster) | 36.97 µs (✅ 1.01x faster) | | 20.08 µs (✅ 1.08x slower) | |
| | | | 6.70 ms (✅ 1.01x slower) | | 21.14 µs (✅ 1.28x faster) |
| large | 37.34 µs (✅ 1.04x slower) | 80.70 µs (✅ 1.05x slower) | | 20.35 µs (✅ 1.06x slower) | |
| medium | 37.66 µs (✅ 1.02x slower) | 40.91 µs (✅ 1.04x faster) | | 20.51 µs (✅ 1.06x slower) | |
| default | 34.74 µs (✅ 1.08x faster) | 35.96 µs (✅ 1.04x faster) | | 19.07 µs (✅ 1.01x faster) | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
605.61 ms (✅ 1.13x faster)

recycle_pool

alloc_dealloc_128 alloc_dealloc_1500 alloc_sg_64k alloc_dealloc_4096
8.08 ns (---) 8.10 ns (---) 291.63 ns (---) 8.13 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
31.48 µs (🚀 3.52x faster) 31.97 µs (🚀 3.31x faster)

sandboxes

create_initialized create_uninitialized create_initialized_and_drop create_uninitialized_and_drop
default 1.84 ms (✅ 1.14x faster) 376.96 µs (✅ 1.30x faster) 11.47 ms (✅ 1.01x slower) 417.76 µs (✅ 1.29x faster)
large 67.45 ms (🚀 2.31x faster) 65.20 ms (🚀 2.35x faster) 79.75 ms (🚀 2.11x faster) 67.12 ms (🚀 2.29x faster)
small 3.76 ms (✅ 1.47x faster) 2.22 ms (✅ 1.61x faster) 15.71 ms (✅ 1.10x faster) 2.30 ms (✅ 1.59x faster)
medium 19.45 ms (🚀 2.21x faster) 17.53 ms (🚀 2.32x faster) 31.48 ms (✅ 1.75x faster) 18.11 ms (🚀 2.22x faster)

segmented_payload

``
8192 14.83 ns (---)
65536 33.41 ns (---)
262144 94.90 ns (---)

shared_memory

copy_from_slice fill copy_to_slice
1MB 45.87 µs (✅ 1.02x faster) 26.87 µs (✅ 1.34x faster) 42.69 µs (✅ 1.04x slower)
64MB 5.50 ms (🚀 2.05x faster) 2.77 ms (✅ 1.12x faster) 4.99 ms (✅ 1.11x faster)

snapshots

create restore
large 160.24 ms (🚀 1.85x faster) 54.36 µs (🚀 11.46x faster)
small 2.44 ms (✅ 1.63x faster) 11.33 µs (✅ 1.02x faster)
medium 39.28 ms (🚀 1.90x faster) 14.87 µs (✅ 1.13x faster)
default 290.36 µs (✅ 1.32x faster) 10.84 µs (✅ 1.02x faster)

virtq_readonly_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 581.59 ns (---) 718.98 ns (---) 719.95 ns (---) 573.62 ns (---)
65536 5.03 µs (---) 6.19 µs (---) 6.01 µs (---) 5.04 µs (---)
262144 18.57 µs (---) 23.04 µs (---) 23.12 µs (---) 18.53 µs (---)

virtq_readwrite_allocator_strategy

buffer_pool_run_fragmented recycle_pool_segmented recycle_pool_segmented_fragmented buffer_pool_run
8192 817.20 ns (---) 1.05 µs (---) 1.09 µs (---) 816.07 ns (---)
65536 7.03 µs (---) 10.15 µs (---) 10.21 µs (---) 7.00 µs (---)
262144 25.25 µs (---) 38.83 µs (---) 38.20 µs (---) 25.04 µs (---)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 11.46x faster
  • Worst regression: guest_calls/call/small — ✅ 1.08x slower
mshv3 / amd (Linux) (✅ **1.11x slower** → 🚀 **4.25x faster**)

alloc_fragmented

fragmented_256
14.99 ns (---)

alloc_lifo

``
4096 1.48 µs (---)
256 1.43 µs (---)
1500 1.47 µs (---)

alloc_single

``
128 14.36 ns (---)
4096 14.61 ns (---)
256 14.27 ns (---)
1500 14.62 ns (---)
1024 14.62 ns (---)
512 14.72 ns (---)
64 14.33 ns (---)

free

``
4096 15.03 ns (---)
256 14.58 ns (---)
1500 14.94 ns (---)

free_list_reuse

lifo_pattern fifo_pattern
28.87 ns (---) 21.63 ns (---)

function_call_serialization

deserialize_function_call serialize_function_call
7.24 ms (✅ 1.05x slower) 5.40 ms (✅ 1.02x slower)

guest_calls

| | call_with_host_function | call_with_restore | call | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 64.32 µs (✅ 1.43x faster) | 132.42 µs (✅ 1.20x faster) | 42.94 µs (✅ 1.24x faster) | | |
| small | 64.34 µs (✅ 1.43x faster) | 140.86 µs (✅ 1.11x faster) | 40.46 µs (✅ 1.39x faster) | | |
| large | 62.94 µs (✅ 1.43x faster) | 213.68 µs (✅ 1.12x faster) | 40.12 µs (✅ 1.39x faster) | | |
| | | | | 59.83 µs (✅ 1.28x faster) | 48.29 µs (✅ 1.03x faster) |
| medium | 64.14 µs (✅ 1.38x faster) | 155.03 µs (✅ 1.09x faster) | 39.69 µs (✅ 1.40x faster) | | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
76.10 ms (✅ 1.02x slower)

recycle_pool

alloc_dealloc_128 alloc_dealloc_4096 alloc_sg_64k alloc_dealloc_1500
9.40 ns (---) 9.43 ns (---) 284.52 ns (---) 9.46 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
60.89 µs (✅ 1.41x faster) 60.91 µs (✅ 1.33x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_uninitialized create_initialized
medium 45.32 ms (✅ 1.38x faster) 10.05 ms (🚀 2.59x faster) 9.48 ms (🚀 2.71x faster) 12.22 ms (🚀 2.32x faster)
small 33.98 ms (✅ 1.23x faster) 1.94 ms (🚀 2.09x faster) 1.80 ms (🚀 2.15x faster) 3.15 ms (🚀 1.84x faster)
large 82.64 ms (✅ 1.75x faster) 32.65 ms (🚀 2.97x faster) 32.40 ms (🚀 2.91x faster) 40.23 ms (🚀 2.56x faster)
default 33.90 ms (✅ 1.02x faster) 468.65 µs (✅ 1.37x faster) 440.70 µs (✅ 1.38x faster) 1.39 ms (✅ 1.10x faster)

segmented_payload

``
262144 81.50 ns (---)
65536 28.88 ns (---)
8192 15.88 ns (---)

shared_memory

copy_from_slice copy_to_slice fill
64MB 6.20 ms (✅ 1.11x slower) 5.10 ms (✅ 1.05x slower) 3.73 ms (✅ 1.00x faster)
1MB 41.39 µs (✅ 1.00x faster) 42.44 µs (✅ 1.01x faster) 41.29 µs (✅ 1.00x faster)

snapshots

restore create
default 81.13 µs (✅ 1.04x faster) 356.02 µs (✅ 1.43x faster)
large 316.48 µs (🚀 4.25x faster) 167.80 ms (✅ 1.43x faster)
medium 99.83 µs (✅ 1.23x faster) 40.96 ms (✅ 1.40x faster)
small 84.01 µs (✅ 1.07x faster) 2.25 ms (🚀 1.96x faster)

virtq_readonly_allocator_strategy

recycle_pool_segmented_fragmented recycle_pool_segmented buffer_pool_run_fragmented buffer_pool_run
262144 27.89 µs (---) 26.54 µs (---) 15.89 µs (---) 17.16 µs (---)
65536 6.01 µs (---) 5.93 µs (---) 4.04 µs (---) 4.04 µs (---)
8192 927.00 ns (---) 942.56 ns (---) 708.93 ns (---) 753.64 ns (---)

virtq_readwrite_allocator_strategy

recycle_pool_segmented_fragmented recycle_pool_segmented buffer_pool_run_fragmented buffer_pool_run
262144 45.09 µs (---) 40.13 µs (---) 27.94 µs (---) 27.81 µs (---)
65536 10.39 µs (---) 9.72 µs (---) 5.69 µs (---) 5.67 µs (---)
8192 1.34 µs (---) 1.37 µs (---) 1.14 µs (---) 1.11 µs (---)

Summary

  • Biggest gain: snapshots/restore/large — 🚀 4.25x faster
  • Worst regression: shared_memory/copy_from_slice/64MB — ✅ 1.11x slower
mshv3 / intel (Linux) (❌ *1.16x slower* → 🚀 **3.13x faster**)

alloc_fragmented

fragmented_256
14.37 ns (---)

alloc_lifo

``
1500 1.43 µs (---)
256 1.41 µs (---)
4096 1.43 µs (---)

alloc_single

``
1024 14.37 ns (---)
1500 14.40 ns (---)
128 13.91 ns (---)
256 13.97 ns (---)
64 13.91 ns (---)
4096 14.34 ns (---)
512 14.39 ns (---)

free

``
1500 14.26 ns (---)
256 14.18 ns (---)
4096 14.23 ns (---)

free_list_reuse

lifo_pattern fifo_pattern
27.96 ns (---) 24.24 ns (---)

function_call_serialization

serialize_function_call deserialize_function_call
10.42 ms (✅ 1.01x faster) 13.72 ms (✅ 1.01x faster)

guest_calls

| | call_with_host_function | call_with_restore | different_thread | call | interrupt_latency |
|---|---|---|---|---|---|
| small | 70.06 µs (🚀 2.46x faster) | 130.60 µs (🚀 2.11x faster) | | 43.91 µs (🚀 2.41x faster) | |
| default | 68.97 µs (🚀 2.41x faster) | 136.43 µs (🚀 2.02x faster) | | 43.95 µs (🚀 2.37x faster) | |
| medium | 68.74 µs (🚀 2.47x faster) | 140.78 µs (🚀 2.04x faster) | | 43.53 µs (🚀 2.35x faster) | |
| | | | 58.05 µs (🚀 2.11x faster) | | 59.55 µs (✅ 1.45x faster) |
| large | 69.09 µs (🚀 2.51x faster) | 246.02 µs (✅ 1.65x faster) | | 43.97 µs (🚀 2.31x faster) | |

guest_functions_with_large_parameters

guest_call_with_large_parameters
107.12 ms (✅ 1.20x faster)

recycle_pool

alloc_sg_64k alloc_dealloc_4096 alloc_dealloc_128 alloc_dealloc_1500
293.98 ns (---) 9.49 ns (---) 9.48 ns (---) 9.56 ns (---)

sample_workloads

24K_in_8K_out_rust 24K_in_8K_out_c
56.80 µs (🚀 2.57x faster) 55.12 µs (🚀 2.64x faster)

sandboxes

create_initialized_and_drop create_uninitialized_and_drop create_initialized create_uninitialized
small 43.48 ms (✅ 1.03x slower) 2.35 ms (🚀 1.92x faster) 4.93 ms (✅ 1.31x faster) 2.55 ms (✅ 1.71x faster)
large 112.08 ms (✅ 1.39x faster) 63.22 ms (✅ 1.67x faster) 70.20 ms (✅ 1.60x faster) 64.00 ms (✅ 1.67x faster)
medium 56.84 ms (✅ 1.11x faster) 17.46 ms (✅ 1.61x faster) 19.92 ms (✅ 1.58x faster) 17.80 ms (✅ 1.59x faster)
default 33.56 ms (✅ 1.01x slower) 395.95 µs (✅ 1.30x faster) 1.54 ms (✅ 1.09x faster) 360.00 µs (✅ 1.36x faster)

segmented_payload

``
65536 30.83 ns (---)
262144 86.58 ns (---)
8192 15.42 ns (---)

shared_memory

copy_from_slice fill copy_to_slice
1MB 72.36 µs (❌ 1.16x slower) 32.53 µs (✅ 1.17x faster) 43.37 µs (❌ 1.12x slower)
64MB 11.69 ms (✅ 1.03x slower) 6.99 ms (✅ 1.04x faster) 8.77 ms (✅ 1.08x faster)

snapshots

create restore
medium 41.74 ms (✅ 1.32x faster) 93.06 µs (🚀 1.85x faster)
large 160.09 ms (✅ 1.31x faster) 833.84 µs (🚀 3.13x faster)
default 324.64 µs (✅ 1.40x faster) 83.25 µs (🚀 1.90x faster)
small 3.55 ms (✅ 1.68x faster) 80.56 µs (🚀 1.93x faster)

virtq_readonly_allocator_strategy

| | buffer_pool_run | recycle_pool_segmented_fragmented | buffer_pool_run_fragmented | recycle_pool_segmented |
|---|---|---|---|---|
| 65536 | 4.36 µs (---) | 5.57 µs (---) | 4.36 µs (---) | 5.53 µs (---) |
| 262144 | 15.93 µs (---) | 21.16 µs (---) | 16.07 µs (---) | 20.72 µs (---) |
| 8192 | 581.79 ns (---) | 675.47 ns (---) | 559.87 ns (---) | 674.40 ns (---) |

virtq_readwrite_allocator_strategy

| | buffer_pool_run | recycle_pool_segmented_fragmented | buffer_pool_run_fragmented | recycle_pool_segmented |
|---|---|---|---|---|
| 65536 | 5.99 µs (---) | 9.19 µs (---) | 5.99 µs (---) | 9.20 µs (---) |
| 262144 | 24.98 µs (---) | 42.19 µs (---) | 25.17 µs (---) | 40.21 µs (---) |
| 8192 | 742.02 ns (---) | 1.00 µs (---) | 747.60 ns (---) | 1.09 µs (---) |

Summary

  • Biggest gain: snapshots/restore/large — 🚀 3.13x faster
  • Worst regression: shared_memory/copy_from_slice/1MB — ❌ 1.16x slower

hyperv-ws2025 / amd (Windows) (✅ **1.11x slower** → 🚀 **1.95x faster**)

alloc_fragmented

fragmented_256
15.72 ns (---)

alloc_lifo

| parameter | time |
|---|---|
| 1500 | 1.60 µs (---) |
| 256 | 1.55 µs (---) |
| 4096 | 1.61 µs (---) |

alloc_single

| parameter | time |
|---|---|
| 1024 | 16.29 ns (---) |
| 128 | 15.65 ns (---) |
| 1500 | 16.29 ns (---) |
| 256 | 15.69 ns (---) |
| 4096 | 16.30 ns (---) |
| 512 | 16.32 ns (---) |
| 64 | 15.64 ns (---) |

free

| parameter | time |
|---|---|
| 1500 | 15.97 ns (---) |
| 256 | 15.35 ns (---) |
| 4096 | 15.98 ns (---) |

free_list_reuse

| fifo_pattern | lifo_pattern |
|---|---|
| 22.45 ns (---) | 30.85 ns (---) |

function_call_serialization

| deserialize_function_call | serialize_function_call |
|---|---|
| 9.17 ms (✅ 1.07x faster) | 7.80 ms (✅ 1.10x faster) |

guest_calls

| | call | call_with_host_function | call_with_restore | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 41.78 µs (✅ 1.06x slower) | 64.11 µs (✅ 1.02x faster) | 142.33 µs (✅ 1.01x slower) | | |
| large | 39.52 µs (✅ 1.00x faster) | 65.52 µs (✅ 1.00x faster) | 304.57 µs (✅ 1.03x slower) | | |
| medium | 43.79 µs (✅ 1.11x slower) | 64.00 µs (✅ 1.00x slower) | 174.39 µs (✅ 1.00x slower) | | |
| small | 43.28 µs (✅ 1.11x slower) | 64.24 µs (✅ 1.02x slower) | 148.52 µs (✅ 1.02x slower) | | |
| | | | | 73.20 µs (✅ 1.04x faster) | 29.64 µs (✅ 1.01x faster) |

guest_functions_with_large_parameters

guest_call_with_large_parameters
3.09 s (✅ 1.07x faster)

recycle_pool

| alloc_dealloc_128 | alloc_dealloc_1500 | alloc_dealloc_4096 | alloc_sg_64k |
|---|---|---|---|
| 9.91 ns (---) | 9.85 ns (---) | 9.93 ns (---) | 363.51 ns (---) |

sample_workloads

| 24K_in_8K_out_c | 24K_in_8K_out_rust |
|---|---|
| 63.68 µs (✅ 1.13x faster) | 63.93 µs (✅ 1.14x faster) |

sandboxes

| | create_initialized_and_drop | create_initialized | create_uninitialized_and_drop | create_uninitialized |
|---|---|---|---|---|
| default | 4.84 ms (✅ 1.04x faster) | 3.98 ms (✅ 1.07x faster) | 848.55 µs (✅ 1.48x faster) | 774.49 µs (🚀 1.95x faster) |
| large | 252.85 ms (✅ 1.33x faster) | 221.07 ms (✅ 1.35x faster) | 235.90 ms (✅ 1.59x faster) | 212.90 ms (✅ 1.63x faster) |
| medium | 67.20 ms (✅ 1.26x faster) | 58.46 ms (✅ 1.29x faster) | 59.82 ms (✅ 1.59x faster) | 53.30 ms (✅ 1.65x faster) |
| small | 12.96 ms (✅ 1.20x faster) | 11.17 ms (✅ 1.21x faster) | 8.01 ms (✅ 1.58x faster) | 7.40 ms (✅ 1.62x faster) |

segmented_payload

| parameter | time |
|---|---|
| 262144 | 83.26 ns (---) |
| 65536 | 30.10 ns (---) |
| 8192 | 16.99 ns (---) |

shared_memory

| | copy_from_slice | copy_to_slice | fill |
|---|---|---|---|
| 1MB | 42.10 µs (✅ 1.01x faster) | 42.64 µs (✅ 1.04x faster) | 41.39 µs (✅ 1.00x faster) |
| 64MB | 7.53 ms (✅ 1.11x faster) | 8.34 ms (✅ 1.13x faster) | 5.71 ms (✅ 1.08x faster) |

snapshots

| | create | restore |
|---|---|---|
| default | 581.14 µs (✅ 1.26x faster) | 75.10 µs (✅ 1.01x slower) |
| large | 292.12 ms (✅ 1.30x faster) | 40.84 ms (✅ 1.05x slower) |
| medium | 72.82 ms (✅ 1.23x faster) | 132.63 µs (✅ 1.00x faster) |
| small | 9.04 ms (✅ 1.26x faster) | 78.50 µs (✅ 1.01x slower) |

virtq_readonly_allocator_strategy

| | buffer_pool_run | buffer_pool_run_fragmented | recycle_pool_segmented | recycle_pool_segmented_fragmented |
|---|---|---|---|---|
| 262144 | 18.02 µs (---) | 17.81 µs (---) | 27.41 µs (---) | 26.54 µs (---) |
| 65536 | 4.24 µs (---) | 4.21 µs (---) | 6.05 µs (---) | 6.21 µs (---) |
| 8192 | 749.55 ns (---) | 735.49 ns (---) | 871.75 ns (---) | 855.53 ns (---) |

virtq_readwrite_allocator_strategy

| | buffer_pool_run | buffer_pool_run_fragmented | recycle_pool_segmented | recycle_pool_segmented_fragmented |
|---|---|---|---|---|
| 262144 | 110.78 µs (---) | 26.62 µs (---) | 44.25 µs (---) | 46.53 µs (---) |
| 65536 | 6.41 µs (---) | 6.58 µs (---) | 10.76 µs (---) | 10.65 µs (---) |
| 8192 | 1.19 µs (---) | 1.20 µs (---) | 1.42 µs (---) | 1.38 µs (---) |

Summary

  • Biggest gain: sandboxes/create_uninitialized/default — 🚀 1.95x faster
  • Worst regression: guest_calls/call/medium — ✅ 1.11x slower

hyperv-ws2025 / intel (Windows) (❌ *1.23x slower* → ✅ **1.76x faster**)

alloc_fragmented

fragmented_256
15.26 ns (---)

alloc_lifo

| parameter | time |
|---|---|
| 1500 | 1.65 µs (---) |
| 256 | 1.50 µs (---) |
| 4096 | 1.64 µs (---) |

alloc_single

| parameter | time |
|---|---|
| 1024 | 16.21 ns (---) |
| 128 | 14.95 ns (---) |
| 1500 | 16.24 ns (---) |
| 256 | 14.97 ns (---) |
| 4096 | 16.24 ns (---) |
| 512 | 16.26 ns (---) |
| 64 | 14.94 ns (---) |

free

| parameter | time |
|---|---|
| 1500 | 16.45 ns (---) |
| 256 | 15.25 ns (---) |
| 4096 | 16.45 ns (---) |

free_list_reuse

| fifo_pattern | lifo_pattern |
|---|---|
| 21.82 ns (---) | 30.15 ns (---) |

function_call_serialization

| deserialize_function_call | serialize_function_call |
|---|---|
| 14.12 ms (✅ 1.04x faster) | 12.08 ms (✅ 1.03x slower) |

guest_calls

| | call | call_with_host_function | call_with_restore | different_thread | interrupt_latency |
|---|---|---|---|---|---|
| default | 87.14 µs (❌ 1.17x slower) | 119.39 µs (✅ 1.03x slower) | 250.87 µs (✅ 1.04x slower) | | |
| large | 79.46 µs (✅ 1.04x slower) | 117.19 µs (✅ 1.03x slower) | 603.40 µs (✅ 1.02x slower) | | |
| medium | 101.37 µs (❌ 1.23x slower) | 120.79 µs (✅ 1.09x faster) | 353.05 µs (✅ 1.07x slower) | | |
| small | 94.40 µs (❌ 1.18x slower) | 120.69 µs (✅ 1.00x faster) | 277.77 µs (❌ 1.11x slower) | | |
| | | | | 113.55 µs (✅ 1.03x slower) | 47.69 µs (✅ 1.39x faster) |

guest_functions_with_large_parameters

guest_call_with_large_parameters
4.68 s (✅ 1.00x slower)

recycle_pool

| alloc_dealloc_128 | alloc_dealloc_1500 | alloc_dealloc_4096 | alloc_sg_64k |
|---|---|---|---|
| 9.98 ns (---) | 9.87 ns (---) | 9.83 ns (---) | 349.53 ns (---) |

sample_workloads

| 24K_in_8K_out_c | 24K_in_8K_out_rust |
|---|---|
| 97.70 µs (✅ 1.21x faster) | 99.15 µs (✅ 1.20x faster) |

sandboxes

| | create_initialized_and_drop | create_initialized | create_uninitialized_and_drop | create_uninitialized |
|---|---|---|---|---|
| default | 7.22 ms (✅ 1.07x faster) | 5.40 ms (✅ 1.08x faster) | 985.79 µs (✅ 1.28x faster) | 974.51 µs (✅ 1.15x faster) |
| large | 343.34 ms (✅ 1.14x faster) | 293.03 ms (✅ 1.11x faster) | 327.65 ms (✅ 1.07x faster) | 289.81 ms (✅ 1.06x faster) |
| medium | 92.77 ms (✅ 1.12x faster) | 79.84 ms (✅ 1.09x faster) | 82.45 ms (✅ 1.06x faster) | 74.00 ms (✅ 1.06x faster) |
| small | 18.73 ms (✅ 1.11x faster) | 15.48 ms (✅ 1.12x faster) | 11.08 ms (✅ 1.10x faster) | 9.95 ms (✅ 1.07x faster) |

segmented_payload

| parameter | time |
|---|---|
| 262144 | 86.03 ns (---) |
| 65536 | 30.28 ns (---) |
| 8192 | 17.05 ns (---) |

shared_memory

| | copy_from_slice | copy_to_slice | fill |
|---|---|---|---|
| 1MB | 69.63 µs (✅ 1.05x slower) | 70.56 µs (✅ 1.02x slower) | 33.00 µs (✅ 1.07x faster) |
| 64MB | 14.13 ms (✅ 1.01x faster) | 16.07 ms (✅ 1.01x faster) | 9.44 ms (✅ 1.06x faster) |

snapshots

| | create | restore |
|---|---|---|
| default | 753.07 µs (✅ 1.35x faster) | 150.03 µs (✅ 1.01x faster) |
| large | 423.52 ms (✅ 1.12x faster) | 55.02 ms (✅ 1.04x faster) |
| medium | 107.96 ms (✅ 1.13x faster) | 253.88 µs (✅ 1.76x faster) |
| small | 13.45 ms (✅ 1.17x faster) | 164.46 µs (✅ 1.04x slower) |

virtq_readonly_allocator_strategy

| | buffer_pool_run | buffer_pool_run_fragmented | recycle_pool_segmented | recycle_pool_segmented_fragmented |
|---|---|---|---|---|
| 262144 | 16.12 µs (---) | 15.94 µs (---) | 22.04 µs (---) | 21.86 µs (---) |
| 65536 | 4.80 µs (---) | 4.81 µs (---) | 5.64 µs (---) | 5.84 µs (---) |
| 8192 | 595.39 ns (---) | 640.57 ns (---) | 746.46 ns (---) | 752.56 ns (---) |

virtq_readwrite_allocator_strategy

| | buffer_pool_run | buffer_pool_run_fragmented | recycle_pool_segmented | recycle_pool_segmented_fragmented |
|---|---|---|---|---|
| 262144 | 26.08 µs (---) | 29.62 µs (---) | 49.39 µs (---) | 50.29 µs (---) |
| 65536 | 6.30 µs (---) | 6.32 µs (---) | 10.44 µs (---) | 10.67 µs (---) |
| 8192 | 935.26 ns (---) | 898.59 ns (---) | 1.23 µs (---) | 1.16 µs (---) |

Summary

  • Biggest gain: snapshots/restore/medium — ✅ 1.76x faster
  • Worst regression: guest_calls/call/medium — ❌ 1.23x slower
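
The two Summary bullets can be recomputed from the table cells. The sketch below is one possible way to do that, not the bot's actual implementation: `speedup` and `summarize` are hypothetical helper names, and the cell format is assumed to match the strings in the hyperv-ws2025 / intel tables. Cells marked `(---)` carry no comparison and are skipped.

```python
import re

# Assumed cell format: "<time> <unit> (<icon> <factor>x faster|slower)".
CELL = re.compile(r"\(\S*\s*(\d+(?:\.\d+)?)x (faster|slower)\)")

def speedup(cell):
    """Return a factor > 1 for faster, < 1 for slower, None for "(---)"."""
    m = CELL.search(cell)
    if m is None:
        return None  # no baseline to compare against
    factor = float(m.group(1))
    return factor if m.group(2) == "faster" else 1.0 / factor

def summarize(results):
    """Return (biggest_gain, worst_regression) as benchmark names."""
    scored = {name: speedup(cell) for name, cell in results.items()}
    scored = {name: s for name, s in scored.items() if s is not None}
    return max(scored, key=scored.get), min(scored, key=scored.get)

# A few cells copied from the hyperv-ws2025 / intel results.
results = {
    "snapshots/restore/medium": "253.88 µs (✅ 1.76x faster)",
    "guest_calls/call/medium": "101.37 µs (❌ 1.23x slower)",
    "guest_calls/call/default": "87.14 µs (❌ 1.17x slower)",
    "recycle_pool/alloc_sg_64k": "349.53 ns (---)",
}
best, worst = summarize(results)
print(best)   # snapshots/restore/medium
print(worst)  # guest_calls/call/medium
```

Mapping "slower" to the reciprocal puts gains and regressions on one scale, so a single `max`/`min` pass yields both bullets.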

Labels

kind/enhancement For PRs adding features, improving functionality, docs, tests, etc.

4 participants