algorithmic-efficiency

Add dataset setup tests

Open priyakasimbeg opened this issue 2 years ago • 0 comments

Description

Most of the code in data_setup.py is untested. There are a few challenges for these tests:

  • datasets are very large (just under 2TB in total, I believe)
  • some of them require manual steps (e.g. getting the links after signing the user agreements; I don't think we can check in the URLs).

At a minimum, we can test the datasets that are downloaded via tfds (ogbg and wmt) and add some unit tests for the other datasets.
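For the datasets that need manual links, one possible shape for a unit test is to inject the download step and replace it with a mock, so the test needs neither the real URL nor the full dataset. The sketch below is only an illustration: `setup_dataset`, its `fetch` parameter, and the `example.invalid` URL are hypothetical and are not taken from the existing setup script.

```python
import os
import tempfile
from unittest import mock


def setup_dataset(url, data_dir, fetch):
  """Hypothetical stand-in for one setup function in the setup script.

  `fetch(url)` must return the raw bytes of the file. The real script would
  download them; a test passes in a fake instead.
  """
  os.makedirs(data_dir, exist_ok=True)
  path = os.path.join(data_dir, os.path.basename(url))
  with open(path, 'wb') as f:
    f.write(fetch(url))
  return path


def run_setup_with_fake_download():
  """Runs setup_dataset with a mocked fetch: no network, no real URL."""
  fake_fetch = mock.Mock(return_value=b'tiny fake payload')
  with tempfile.TemporaryDirectory() as tmp_dir:
    data_dir = os.path.join(tmp_dir, 'data')
    # Placeholder URL (assumption); only its basename is used by the sketch.
    path = setup_dataset('https://example.invalid/train.tar', data_dir,
                         fake_fetch)
    with open(path, 'rb') as f:
      contents = f.read()
  return os.path.basename(path), contents, fake_fetch.call_count
```

A test along these lines only checks the directory layout and file handling around the download, which is the part that can run in CI without the user agreements or the full 2TB of data.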

priyakasimbeg, Aug 14 '23 22:08