Use the full dataset for accuracy validation in the integration test

Open anhappdev opened this issue 2 years ago • 7 comments

In our integration test we have a check that validates the accuracy results of the benchmarks. Currently, the app uses a tiny subset of the full dataset, so the results may not be reliable.

We should use the full dataset instead. Since this test is run exclusively on our CI, the full dataset can be stored privately on Google Cloud.

anhappdev avatar Feb 15 '23 11:02 anhappdev

@Mostelk to check whether we can get space from what Bruno is planning to buy. 20 GiB should be a safe bet for this purpose.
Access frequency: maybe 50 times per month.
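
For a rough sense of the transfer volume these two numbers imply, here is a back-of-the-envelope sketch. It assumes the worst case where every access downloads the whole 20 GiB allocation; the function name is hypothetical and the real per-run download is likely smaller.

```python
GIB = 1024**3  # bytes per GiB


def monthly_transfer_gib(dataset_bytes: int, downloads_per_month: int) -> float:
    """Upper bound on monthly download volume, in GiB.

    Assumption: every access fetches the full dataset (no caching).
    """
    return dataset_bytes * downloads_per_month / GIB


# 20 GiB fetched 50 times per month
print(monthly_transfer_gib(20 * GIB, 50))  # 1000.0
```

Any caching on the CI side would bring the real figure well below this bound.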

freedomtan avatar Mar 07 '23 06:03 freedomtan

@Mostelk to check with Bruno on the progress.

For public data, it's pretty easy.

  • Zenodo: https://zenodo.org/
  • HuggingFace: https://huggingface.co/, provides git-lfs hosting for public datasets

The main issue: ImageNet

What we need: 8.x GiB for the full validation set.

freedomtan avatar Mar 21 '23 05:03 freedomtan

Another solution would be to use GitHub Release, provided each file is under 2 GB. The SNUSR dataset was released this way: https://github.com/mlcommons/mobile_models/releases
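
To check up front whether a dataset fits under that per-file limit, something like the sketch below could be run over a local copy. The helper names and the directory layout are hypothetical, and the limit is an assumption: the "2 GB" above is read conservatively as decimal gigabytes.

```python
from pathlib import Path

# Assumption: the "2 GB" per-file limit mentioned above, read
# conservatively as decimal gigabytes.
MAX_ASSET_BYTES = 2 * 1000**3


def oversized_assets(sizes: dict[str, int], limit: int = MAX_ASSET_BYTES) -> list[str]:
    """Return the sorted names of files that are not strictly under the limit."""
    return sorted(name for name, size in sizes.items() if size >= limit)


def scan_directory(root: str) -> dict[str, int]:
    """Map each regular file under root (relative path) to its size in bytes."""
    base = Path(root)
    return {
        str(p.relative_to(base)): p.stat().st_size
        for p in base.rglob("*")
        if p.is_file()
    }
```

Calling `oversized_assets(scan_directory("datasets/"))` (with a hypothetical `datasets/` folder) lists the files that would need to be split or re-archived before upload; an empty list means everything fits.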

anhappdev avatar Mar 21 '23 06:03 anhappdev

@freedomtan to check if we can put datasets other than ImageNet on GitHub Release.

freedomtan avatar Apr 11 '23 05:04 freedomtan

  • MS COCO: it seems only images with CC licenses were collected, and the annotations are also CC-licensed
  • ADE20K:
    • images: https://groups.csail.mit.edu/vision/datasets/ADE20K/terms/
    • annotations: BSD
  • SQuAD: CC BY-SA 4.0
  • SR: supposedly OK.

freedomtan avatar Apr 17 '23 01:04 freedomtan

Let's use GitHub Release for datasets other than ImageNet.

freedomtan avatar May 09 '23 05:05 freedomtan

Waiting until https://github.com/mlcommons/mobile_app_open/pull/707 is merged so we save time and bandwidth when downloading the datasets.

anhappdev avatar May 09 '23 12:05 anhappdev