
Make example data sets public

Open armish opened this issue 6 years ago • 3 comments

Here are the ones that we are familiar with and trust the most:

$ gsutil ls gs://musc-codex/datasets/ \
     | grep "20180706\|20180614" \
     | xargs -I@ -P1 bash -c "gsutil du -sh @"

6.65 GiB    gs://musc-codex/datasets/20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5
9.88 GiB    gs://musc-codex/datasets/20180614_D22_RepB_Tcell_CD4-CD8-DAPI_5by5
9.4 GiB     gs://musc-codex/datasets/20180614_D23_RepA_Tcell_CD4-CD8-DAPI_5by5
8.81 GiB    gs://musc-codex/datasets/20180614_D23_RepB_Tcell_CD4-CD8-DAPI_5by5
5.55 GiB    gs://musc-codex/datasets/20180706-Donor22-R2-Tcell-CODEX_CD3CD4CD85BY5
5.38 GiB    gs://musc-codex/datasets/20180706-Donor23-R2-Tcell-CODEX_CD3CD4CD85BY5

These would also make testing the framework easier for everybody. The only thing we have to make sure of is that making these data sets public won't be too costly. We could go with a service like figshare, but I'm not sure how their download bandwidth scales if we need to download the data over and over again.
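To put a rough number on the cost question, here is a minimal sketch that sums the sizes listed above and multiplies by a per-GiB download price. The price is a placeholder assumption, not a quoted rate from any provider, so substitute the real figure for whichever host is chosen.

```shell
# Sizes (GiB) copied from the `gsutil du -sh` output above.
sizes="6.65 9.88 9.4 8.81 5.55 5.38"

# Hypothetical download price in USD per GiB; an assumption, not a quoted rate.
price_per_gib=0.12

# Sum the six sizes.
total_gib=$(echo "$sizes" | tr ' ' '\n' | awk '{ s += $1 } END { printf "%.2f", s }')

# Cost of downloading all six data sets once at the assumed price.
cost_per_pull=$(awk -v t="$total_gib" -v p="$price_per_gib" 'BEGIN { printf "%.2f", t * p }')

echo "total: ${total_gib} GiB"
echo "one full download at \$${price_per_gib}/GiB: \$${cost_per_pull}"
```

The six data sets add up to 45.67 GiB, so the total bill depends mostly on how often the full set gets pulled.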

@hammer: any suggestions?

armish, Jul 18 '18 20:07

Those are a bit large for Google Drive.

Nature recommends the IDR: cf. their submission guidelines.

If IDR doesn't want our images, maybe Dryad? I dunno. We should ask Anne Carpenter.

hammer, Aug 03 '18 16:08

The Cell Image Library is another alternative.

hammer, Aug 03 '18 16:08

We now have a dedicated public bucket: gs://cytokit. The data sets relevant to the manuscript have been added:

$ gsutil ls gs://cytokit/datasets/*
gs://cytokit/datasets/cellsize/:
gs://cytokit/datasets/cellsize/20181024-d38-act-20X-5by5/
gs://cytokit/datasets/cellsize/20181024-d38-unstim-20X-5by5/
gs://cytokit/datasets/cellsize/20181024-d39-act-20x-5by5/
gs://cytokit/datasets/cellsize/20181024-d39-unstim-20x-5by5/
gs://cytokit/datasets/cellsize/20181024-jurkat-20X-5by5/
gs://cytokit/datasets/cellsize/20181024-jurkat2-20X-5by5/
gs://cytokit/datasets/cellsize/20181026-pmel-act-20x-5by5/
gs://cytokit/datasets/cellsize/20181026-pmel-act-60x-1by1/
gs://cytokit/datasets/cellsize/20181026-pmel-act-60x-5b5/
gs://cytokit/datasets/cellsize/20181026-pmel-us-20x-5by5/
gs://cytokit/datasets/cellsize/20181026-pmel-us-60x-1by1/

gs://cytokit/datasets/cellular-marker/:
gs://cytokit/datasets/cellular-marker/20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5/
gs://cytokit/datasets/cellular-marker/20180614_D22_RepB_Tcell_CD4-CD8-DAPI_5by5/
gs://cytokit/datasets/cellular-marker/20180614_D23_RepA_Tcell_CD4-CD8-DAPI_5by5/
gs://cytokit/datasets/cellular-marker/20180614_D23_RepB_Tcell_CD4-CD8-DAPI_5by5/
gs://cytokit/datasets/cellular-marker/20180927-Tcell-CD3_CD4_CD8_DAPI-20X-5by5/

Will add pointers from the README before closing this issue.
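For anyone who wants to pull one of these locally, a minimal sketch follows. It assumes gsutil is installed and that the bucket is readable without extra credentials. The destination directory and the `RUN_DOWNLOAD` switch are made up for illustration; by default the script only prints the command.

```shell
# Bucket path taken from the listing above; the local directory name is arbitrary.
src="gs://cytokit/datasets/cellular-marker/20180614_D22_RepA_Tcell_CD4-CD8-DAPI_5by5"
dest="./cytokit-data"

# -m parallelizes the transfer, -r copies the directory recursively.
cmd="gsutil -m cp -r $src $dest"

if command -v gsutil >/dev/null 2>&1 && [ "${RUN_DOWNLOAD:-0}" = "1" ]; then
  mkdir -p "$dest"
  $cmd
else
  # Dry run: only show what would be executed.
  echo "$cmd"
fi
```

Set `RUN_DOWNLOAD=1` to actually start the transfer. These data sets are several GiB each, so check free disk space first.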

armish, Nov 01 '18 03:11