faker
faker copied to clipboard
Benchmarking loading JSON file vs YML file
Motivation / Background
Continues with https://github.com/faker-ruby/faker/issues/2851
This Pull Request has been created because we are comparing loading times between JSON and YML files, as suggested by @thdaraujo. It turns out that i18n already supports JSON files!
Here's a screenshot on the execution times. While loading with YML file, took 0.320253 seconds, loading the translation with JSON only took 0.019719 seconds. Meaning, it is waay faster, approximately ~%175 percent faster!
Additional information
Also, I did some testing on my machine and ran the TestEsMxLocale
test by loading the YML file and then ran it again with the JSON file, and the execution time is significantly faster when passing a JSON.
As a note, the reasoning for choosing YAML.load_file()
and JSON.load_file()
methods in the task, is because those are the methods that i18n is using.
load_yaml method: https://github.com/ruby-i18n/i18n/blob/d1762c6c09e50cb1ed32cf2b4f0f525a0531094b/lib/i18n/backend/base.rb#L253
load_json method: https://github.com/ruby-i18n/i18n/blob/d1762c6c09e50cb1ed32cf2b4f0f525a0531094b/lib/i18n/backend/base.rb#L271
I believe this path can lead us to make faker much much faster! Please, let me know what you guys think! @thdaraujo @stefannibrasil
All the best! Salo.
Checklist
Before submitting the PR make sure the following are checked:
- [X] This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
- [X] Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex:
[Fix #issue-number]
- [X] Tests are added or updated if you fix a bug, refactor something, or add a feature.
- [X] Tests and Rubocop are passing before submitting your proposed changes.
Just wanted to say that this is in my todo, will get to it soon. Thanks for the PR description!
Hey @salochara , sorry for the late reply!
This is pretty exciting, thanks for doing the investigation. It looks very promising. I'll play around with this idea and get back to you soon.
I've been thinking about how we could approach this, and the potential ways are: 1- maybe convert all yaml locales to json, and only work with json from now on (could break other apps that depend on faker locales) 2- adding a build step that does the conversion when we build the gem and get rid of yaml before packaging the gem (would need to run tests against the package to make sure it all still works)
There are pros and cons to both approaches, so I have to think about it.
I have written some details on how I made the Haskell fakedata (which uses the same yaml file) faster here: https://psibi.in/posts/2019-09-23-fakedata-perf.html IIRC, the overhead involved was in parsing the yaml file and the IO overhead of reading yaml file was negligible.
Hello! 👋 Hope you guys are doing great @thdaraujo @stefannibrasil
Let me know what are your thought on this PR, I'm looking forward to keep on contributing to faker!
Thanks for your review @thdaraujo I'm currently traveling 🌴... I'll work on this next week!
@thdaraujo Hello! 👋 I'm back.
I've addressed your comments. Let me know what you think!
Hi @thdaraujo Hope you're doing great.
Just wondering if you're thinking on moving forward with this? I believe that the time performance is quite good!
Let me know what you think. All the best! Salo.
Hey @salochara , sorry for the late reply!
Do you want to investigate how much work it would take to convert all the yaml files to json? Maybe we could have a script that converts all yaml files to json.
If we have that, I think we could potentially package the gem without the yaml files and only keep the json files, or set up i18n to only load json files. I haven't investigated this idea yet, just throwing it out there to see what you think.
Thanks!