Performance of the Kiba ETL benchmarks is low

Open chrisseaton opened this issue 7 years ago • 3 comments

https://github.com/thbar/kiba/ and https://github.com/thbar/kiba-ruby-benchmarks

From https://github.com/oracle/truffleruby/issues/1054

chrisseaton avatar Feb 05 '18 20:02 chrisseaton

I ran the benchmark from the linked issue. It appears to be a longer-running benchmark now, and it ran ~2.6x faster on native TruffleRuby than on MRI 2.6.6:

bundle install
brew install axel
bundle exec kiba setup.etl
bundle exec kiba csv_processing.etl

truffleruby 20.2.0-dev like ruby 2.6.6, GraalVM CE Native [x86_64-darwin]

I, [2020-07-23T16:29:42.462000 #62571]  INFO -- : Running with truffleruby 20.2.0-dev-201c3006, like ruby 2.6.6, GraalVM CE Native [x86_64-darwin]
I, [2020-07-23T16:29:42.464000 #62571]  INFO -- : Opening data/extract-1000k.csv
I, [2020-07-23T16:34:43.571000 #62571]  INFO -- : Processing done (took 301.11 seconds) - 999901 rows processed

MRI 2.6.6

I, [2020-07-23T16:36:50.687523 #63185]  INFO -- : Running with ruby 2.6.6p146 
I, [2020-07-23T16:36:50.687628 #63185]  INFO -- : Opening data/extract-1000k.csv
I, [2020-07-23T16:49:58.552304 #63185]  INFO -- : Processing done (took 787.86 seconds) - 999901 rows processed
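
For reference, the ~2.6x figure can be reproduced from the two "Processing done" lines above. This is only a minimal sketch: the `elapsed_seconds` helper and the variable names are hypothetical and not part of kiba or the benchmark repository.

```ruby
# "Processing done" lines copied verbatim from the two runs above.
truffleruby_line = "I, [2020-07-23T16:34:43.571000 #62571]  INFO -- : " \
                   "Processing done (took 301.11 seconds) - 999901 rows processed"
mri_line = "I, [2020-07-23T16:49:58.552304 #63185]  INFO -- : " \
           "Processing done (took 787.86 seconds) - 999901 rows processed"

# Hypothetical helper: pull the elapsed seconds out of a "Processing done" line.
def elapsed_seconds(line)
  match = line.match(/took ([\d.]+) seconds/)
  raise ArgumentError, "no timing found in: #{line}" unless match

  Float(match[1])
end

# Speedup = MRI wall time divided by TruffleRuby wall time.
speedup = elapsed_seconds(mri_line) / elapsed_seconds(truffleruby_line)
puts format("TruffleRuby speedup over MRI: %.2fx", speedup)
```

Note that this is a single run per implementation, so the ratio is indicative rather than a statistically robust measurement.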

bjfish avatar Jul 23 '20 21:07 bjfish

I ran it with the latest TruffleRuby and got the following results:

bundle exec kiba csv_processing.etl
I, [2021-03-05T20:49:34.039841 #46849]  INFO -- : Running with truffleruby 21.1.0-dev-ffeea561, like ruby 2.7.2, GraalVM CE Native [x86_64-darwin]
I, [2021-03-05T20:49:34.041098 #46849]  INFO -- : Opening data/extract-1000k.csv
I, [2021-03-05T20:52:48.918820 #46849]  INFO -- : Processing done (took 194.88 seconds) - 999901 rows processed
4068053997 51935714 data/output.csv
bundle exec kiba csv_processing.etl       
I, [2021-03-06T00:34:20.465825 #47890]  INFO -- : Running with ruby 2.7.1p83 
/Users/novoi/.rubies/ruby-2.7.1/lib/ruby/gems/2.7.0/gems/kiba-2.0.0/lib/kiba/runner.rb:68: warning: Using the last argument as keyword parameters is deprecated; maybe ** should be added to the call
/Users/novoi/tmp/kiba-ruby-benchmarks/etl/csv_source.rb:25: warning: The called method `initialize' is defined here
I, [2021-03-06T00:34:20.465936 #47890]  INFO -- : Opening data/extract-1000k.csv
/Users/novoi/tmp/kiba-ruby-benchmarks/etl/csv_source.rb:14: warning: Using the last argument as keyword parameters is deprecated; maybe ** should be added to the call
/Users/novoi/.rubies/ruby-2.7.1/lib/ruby/2.7.0/csv.rb:508: warning: The called method `foreach' is defined here
/Users/novoi/tmp/kiba-ruby-benchmarks/etl/csv_source.rb:32: warning: Using the last argument as keyword parameters is deprecated; maybe ** should be added to the call
/Users/novoi/.rubies/ruby-2.7.1/lib/ruby/2.7.0/csv.rb:635: warning: The called method `open' is defined here
I, [2021-03-06T00:37:41.237080 #47890]  INFO -- : Processing done (took 200.77 seconds) - 999901 rows processed
4068053997 51935714 data/output.csv
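
To put the two timings above on the same scale, here is a rough sketch that converts them to rows per second. The numbers are copied from the logs; the variable names are made up for illustration, and each figure comes from a single run.

```ruby
# Row count and elapsed seconds copied from the two logs above.
rows = 999_901
runs = {
  "truffleruby 21.1.0-dev" => 194.88,
  "ruby 2.7.1p83"          => 200.77
}

# Throughput in rows per second for each implementation.
throughput = runs.transform_values { |seconds| rows / seconds }
throughput.each do |name, rows_per_second|
  puts format("%-24s %8.1f rows/s", name, rows_per_second)
end

# Ratio of MRI wall time to TruffleRuby wall time (above 1.0 means TruffleRuby is faster).
ratio = runs["ruby 2.7.1p83"] / runs["truffleruby 21.1.0-dev"]
puts format("MRI time / TruffleRuby time: %.2f", ratio)
```

On these numbers the two implementations differ by only a few percent, which is small enough that run-to-run noise could account for part of it.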

gogainda avatar Mar 06 '21 00:03 gogainda

Thanks for the update, the two timings look pretty close. It would still be worth investigating how to make it faster on TruffleRuby.

(the deleted comment above was a test comment)

eregon avatar Mar 08 '21 11:03 eregon