107 changes: 56 additions & 51 deletions sql/core/benchmarks/CSVBenchmark-results.txt
@@ -2,76 +2,81 @@
Benchmark to measure CSV read/write performance
================================================================================================

-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
+OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Mac OS X 26.5
+Apple M3 Pro
 Parsing quoted values: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-One quoted string 26170 26230 94 0.0 523394.1 1.0X
+One quoted string 10877 10913 57 0.0 217531.6 1.0X

-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
+OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Mac OS X 26.5
+Apple M3 Pro
 Wide rows with 1000 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-Select 1000 columns 51860 52209 580 0.0 51859.6 1.0X
-Select 100 columns 23745 23781 43 0.0 23745.3 2.2X
-Select one column 20220 20278 56 0.0 20219.6 2.6X
-count() 3218 3308 105 0.3 3218.2 16.1X
-Select 100 columns, one bad input field 28039 28266 212 0.0 28039.4 1.8X
-Select 100 columns, corrupt record field 31122 31132 17 0.0 31122.3 1.7X
+Select 1000 columns 41330 42137 916 0.0 41330.3 1.0X
+Select 100 columns 15231 15390 189 0.1 15231.0 2.7X
+Select one column 12603 12667 61 0.1 12603.2 3.3X
+count() 2610 2630 28 0.4 2610.3 15.8X
+Select 100 columns, one bad input field 17949 18138 202 0.1 17949.1 2.3X
+Select 100 columns, corrupt record field 20239 20372 126 0.0 20239.1 2.0X

-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
+OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Mac OS X 26.5
+Apple M3 Pro
 Count a dataset with 10 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-Select 10 columns + count() 9648 9682 35 1.0 964.8 1.0X
-Select 1 column + count() 6694 6706 16 1.5 669.4 1.4X
-count() 1548 1560 19 6.5 154.8 6.2X
+Select 10 columns + count() 6079 6168 80 1.6 607.9 1.0X
+Select 1 column + count() 3674 3760 112 2.7 367.4 1.7X
+count() 870 882 17 11.5 87.0 7.0X

-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
+OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Mac OS X 26.5
+Apple M3 Pro
 Write dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-Create a dataset of timestamps 834 845 16 12.0 83.4 1.0X
-to_csv(timestamp) 5794 5808 21 1.7 579.4 0.1X
-write timestamps to files 6073 6082 11 1.6 607.3 0.1X
-Create a dataset of dates 959 968 12 10.4 95.9 0.9X
-to_csv(date) 3980 3987 6 2.5 398.0 0.2X
-write dates to files 3894 3899 5 2.6 389.4 0.2X
+Create a dataset of timestamps 712 755 38 14.0 71.2 1.0X
+to_csv(timestamp) 4106 4176 66 2.4 410.6 0.2X
+write timestamps to files 4352 4365 13 2.3 435.2 0.2X
+Create a dataset of dates 841 846 7 11.9 84.1 0.8X
+to_csv(date) 2660 2674 19 3.8 266.0 0.3X
+write dates to files 2942 3003 80 3.4 294.2 0.2X
+Create a dataset of times 771 789 28 13.0 77.1 0.9X
+to_csv(time) 3086 3130 47 3.2 308.6 0.2X
+write times to files 3271 3390 119 3.1 327.1 0.2X

-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
+OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Mac OS X 26.5
+Apple M3 Pro
 Read dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 -----------------------------------------------------------------------------------------------------------------------------------------------------
-read timestamp text from files 1180 1186 4 8.5 118.0 1.0X
-read timestamps from files 9655 9670 19 1.0 965.5 0.1X
-infer timestamps from files 19167 19244 68 0.5 1916.7 0.1X
-read date text from files 1111 1129 22 9.0 111.1 1.1X
-read date from files 9513 9521 7 1.1 951.3 0.1X
-infer date from files 19126 19159 31 0.5 1912.6 0.1X
-timestamp strings 1137 1144 7 8.8 113.7 1.0X
-parse timestamps from Dataset[String] 10759 10774 22 0.9 1075.9 0.1X
-infer timestamps from Dataset[String] 19823 19835 13 0.5 1982.3 0.1X
-date strings 1579 1583 5 6.3 157.9 0.7X
-parse dates from Dataset[String] 11033 11055 22 0.9 1103.3 0.1X
-from_csv(timestamp) 8860 8864 6 1.1 886.0 0.1X
-from_csv(date) 9649 9670 27 1.0 964.9 0.1X
-infer error timestamps from Dataset[String] with default format 11156 11157 1 0.9 1115.6 0.1X
-infer error timestamps from Dataset[String] with user-provided format 11118 11147 26 0.9 1111.8 0.1X
-infer error timestamps from Dataset[String] with legacy format 11140 11152 10 0.9 1114.0 0.1X
+read timestamp text from files 659 665 5 15.2 65.9 1.0X
+read timestamps from files 7524 7566 66 1.3 752.4 0.1X
+infer timestamps from files 14004 14125 123 0.7 1400.4 0.0X
+read date text from files 551 554 2 18.1 55.1 1.2X
+read date from files 5445 5496 47 1.8 544.5 0.1X
+infer date from files 10910 10918 8 0.9 1091.0 0.1X
+timestamp strings 762 775 18 13.1 76.2 0.9X
+parse timestamps from Dataset[String] 5936 6036 89 1.7 593.6 0.1X
+infer timestamps from Dataset[String] 10598 10664 67 0.9 1059.8 0.1X
+date strings 1205 1212 9 8.3 120.5 0.5X
+parse dates from Dataset[String] 6858 6911 49 1.5 685.8 0.1X
+from_csv(timestamp) 4824 4859 33 2.1 482.4 0.1X
+from_csv(date) 6096 6101 7 1.6 609.6 0.1X
+infer error timestamps from Dataset[String] with default format 7161 7167 7 1.4 716.1 0.1X
+infer error timestamps from Dataset[String] with user-provided format 7225 7311 136 1.4 722.5 0.1X
+infer error timestamps from Dataset[String] with legacy format 7094 7244 131 1.4 709.4 0.1X
+read time text from files 587 592 5 17.0 58.7 1.1X
+read time from files 4141 4253 109 2.4 414.1 0.2X

-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
+OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Mac OS X 26.5
+Apple M3 Pro
 Filters pushdown: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-w/o filters 4268 4277 9 0.0 42682.0 1.0X
-pushdown disabled 4250 4254 5 0.0 42501.3 1.0X
-w/ filters 863 869 5 0.1 8634.6 4.9X
+w/o filters 4222 4293 64 0.0 42222.8 1.0X
+pushdown disabled 4170 4176 9 0.0 41702.7 1.0X
+w/ filters 520 526 9 0.2 5198.6 8.1X

-OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Linux 6.14.0-1017-azure
-Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
+OpenJDK 64-Bit Server VM 17.0.18+8-LTS on Mac OS X 26.5
+Apple M3 Pro
 Interval: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-Read as Intervals 748 749 2 0.4 2493.1 1.0X
-Read Raw Strings 304 305 1 1.0 1014.7 2.5X
+Read as Intervals 423 426 4 0.7 1408.4 1.0X
+Read Raw Strings 170 172 2 1.8 565.5 2.5X
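
A note on reading the Relative column: the figures are consistent with each row's Relative being the first row's Best Time divided by that row's Best Time, so values above 1.0X are faster than the first case in the same table. The sketch below checks this against four Apple M3 Pro rows of the "Wide rows with 1000 columns" table. The formula is an assumption inferred from the numbers shown here, not taken from the benchmark source, and the helper name `relative` is hypothetical.

```python
# Best Time(ms) values copied from the Apple M3 Pro rows of the
# "Wide rows with 1000 columns" table. The first entry is the baseline.
best_ms = {
    "Select 1000 columns": 41330,
    "Select 100 columns": 15231,
    "Select one column": 12603,
    "count()": 2610,
}


def relative(baseline_ms, case_ms):
    """Assumed definition: baseline best time / case best time, to one decimal."""
    return round(baseline_ms / case_ms, 1)


baseline = best_ms["Select 1000 columns"]
relatives = {name: relative(baseline, ms) for name, ms in best_ms.items()}

for name, value in relatives.items():
    print(f"{name}: {value}X")
```

Because the ratio is always taken against the first row of the same table, Relative values cannot be compared between tables or between the Linux and macOS runs. Use Best Time or Per Row for that.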

