faker icon indicating copy to clipboard operation
faker copied to clipboard

Benchmarking loading JSON file vs YML file

Open salochara opened this issue 1 year ago • 5 comments

Motivation / Background

Continues with https://github.com/faker-ruby/faker/issues/2851

This Pull Request has been created because we are comparing loading times between JSON and YML files, as suggested by @thdaraujo. It turns out that i18n already supports JSON files!

Here's a screenshot on the execution times. While loading with YML file, took 0.320253 seconds, loading the translation with JSON only took 0.019719 seconds. Meaning, it is waay faster, approximately ~%175 percent faster!

CleanShot 2024-01-25 at 17 40 27@2x

Additional information

Also, I did some testing on my machine and ran the TestEsMxLocale test by loading the YML file and then ran it again with the JSON file, and the execution time is significantly faster when passing a JSON.

As a note, the reasoning for choosing YAML.load_file() and JSON.load_file() methods in the task, is because those are the methods that i18n is using.

load_yaml method: https://github.com/ruby-i18n/i18n/blob/d1762c6c09e50cb1ed32cf2b4f0f525a0531094b/lib/i18n/backend/base.rb#L253

load_json method: https://github.com/ruby-i18n/i18n/blob/d1762c6c09e50cb1ed32cf2b4f0f525a0531094b/lib/i18n/backend/base.rb#L271

I believe this path can lead us to make faker much much faster! Please, let me know what you guys think! @thdaraujo @stefannibrasil

All the best! Salo.

Checklist

Before submitting the PR make sure the following are checked:

  • [X] This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • [X] Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • [X] Tests are added or updated if you fix a bug, refactor something, or add a feature.
  • [X] Tests and Rubocop are passing before submitting your proposed changes.

salochara avatar Jan 26 '24 00:01 salochara

Just wanted to say that this is in my todo, will get to it soon. Thanks for the PR description!

stefannibrasil avatar Feb 02 '24 15:02 stefannibrasil

Hey @salochara , sorry for the late reply!

This is pretty exciting, thanks for doing the investigation. It looks very promising. I'll play around with this idea and get back to you soon.

I've been thinking about how we could approach this, and the potential ways are: 1- maybe convert all yaml locales to json, and only work with json from now on (could break other apps that depend on faker locales) 2- adding a build step that does the conversion when we build the gem and get rid of yaml before packaging the gem (would need to run tests against the package to make sure it all still works)

There are pros and cons to both approaches, so I have to think about it.

thdaraujo avatar Feb 07 '24 01:02 thdaraujo

I have written some details on how I made the Haskell fakedata (which uses the same yaml file) faster here: https://psibi.in/posts/2019-09-23-fakedata-perf.html IIRC, the overhead involved was in parsing the yaml file and the IO overhead of reading yaml file was negligible.

psibi avatar Feb 07 '24 07:02 psibi

Hello! 👋 Hope you guys are doing great @thdaraujo @stefannibrasil

Let me know what are your thought on this PR, I'm looking forward to keep on contributing to faker!

salochara avatar Feb 19 '24 17:02 salochara

Thanks for your review @thdaraujo I'm currently traveling 🌴... I'll work on this next week!

salochara avatar Feb 22 '24 02:02 salochara

@thdaraujo Hello! 👋 I'm back.

I've addressed your comments. Let me know what you think!

salochara avatar Mar 07 '24 23:03 salochara

Hi @thdaraujo Hope you're doing great.

Just wondering if you're thinking on moving forward with this? I believe that the time performance is quite good!

Let me know what you think. All the best! Salo.

salochara avatar Mar 25 '24 23:03 salochara

Hey @salochara , sorry for the late reply!

Do you want to investigate how much work it would take to convert all the yaml files to json? Maybe we could have a script that converts all yaml files to json.

If we have that, I think we could potentially package the gem without the yaml files and only keep the json files, or set up i18n to only load json files. I haven't investigated this idea yet, just throwing it out there to see what you think.

Thanks!

thdaraujo avatar Apr 28 '24 18:04 thdaraujo