orange3-single-cell

Single Cell Datasets: reduce the size of the data sets and speed up loading

Open BlazZupan opened this issue 6 years ago • 4 comments

This issue is related to PR https://github.com/biolab/orange3/pull/3047, which enables saving and loading of compressed pickle files. Once this is merged into Orange and released, I propose to:

  • move all the raw files to file.biolab.si/datasets/sc and provide compressed pickled versions there
  • update info files for Single Cell Datasets accordingly

This should substantially reduce the transfer and loading time of data sets. For instance, the largest data set currently included (bone marrow with AML) is 64 MB, while its pickled xz variant is only 2.4 MB.
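
To illustrate the idea, here is a minimal sketch of writing and reading an xz-compressed pickle using only the Python standard library (`pickle` and `lzma`). It is not the code from the PR above and does not use Orange's API; the helper names `save_pickle_xz` and `load_pickle_xz`, the file names, and the toy matrix are all made up for the example. The toy data is mostly zeros, roughly like a sparse expression matrix, so it compresses well; the actual ratio for a real data set will differ.

```python
import lzma
import os
import pickle
import tempfile


def save_pickle_xz(obj, path):
    """Pickle obj and write it through an xz (LZMA) compressor."""
    with lzma.open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle_xz(path):
    """Read an xz-compressed pickle written by save_pickle_xz."""
    with lzma.open(path, "rb") as f:
        return pickle.load(f)


def demo():
    # Hypothetical stand-in for an expression matrix: mostly zeros,
    # with a single non-zero value per row.
    matrix = [[0] * 2000 for _ in range(200)]
    for i in range(200):
        matrix[i][i * 7 % 2000] = i + 1

    with tempfile.TemporaryDirectory() as tmp:
        raw_path = os.path.join(tmp, "data.pkl")
        xz_path = os.path.join(tmp, "data.pkl.xz")

        # Uncompressed pickle, for size comparison only.
        with open(raw_path, "wb") as f:
            pickle.dump(matrix, f, protocol=pickle.HIGHEST_PROTOCOL)
        save_pickle_xz(matrix, xz_path)

        raw_size = os.path.getsize(raw_path)
        xz_size = os.path.getsize(xz_path)
        restored = load_pickle_xz(xz_path)

    # Sizes in bytes, plus whether the round trip preserved the data.
    return raw_size, xz_size, restored == matrix


if __name__ == "__main__":
    raw_size, xz_size, ok = demo()
    print(f"plain pickle: {raw_size} bytes, xz pickle: {xz_size} bytes, round trip ok: {ok}")
```

Note that unpickling executes code embedded in the file, so pickled data sets should only be loaded from a trusted server.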

This update will break backward compatibility.

BlazZupan Jun 03 '18 17:06