Improve dataset COVID-19 vs Normal

Open elcronos opened this issue 4 years ago • 6 comments

We should try to find more images of COVID-19 and normal cases, both CT scans and X-rays.

TO-DO:

  • Create and curate a dataset including images of COVID-19, and Normal cases.
  • Create folders for test, train and validation with their corresponding subfolders (normal, covid19) and split the data accordingly.
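The folder layout from the second item could be created with something like the sketch below. It is only an illustration: the function name `create_layout` and the root folder passed to it are assumptions, not anything agreed in this issue; only the split and class folder names come from the list above.

```python
from pathlib import Path

# Folder names taken from the TO-DO item above.
SPLITS = ("train", "test", "validation")
CLASSES = ("normal", "covid19")


def create_layout(root):
    """Create <root>/<split>/<class> for every split and class.

    `root` is a hypothetical dataset folder chosen by the caller.
    Returns the list of folders, in creation order.
    """
    created = []
    for split in SPLITS:
        for cls in CLASSES:
            folder = Path(root) / split / cls
            # exist_ok=True makes the call safe to repeat on an existing tree.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created
```

Calling `create_layout("dataset")` would give `dataset/train/normal`, `dataset/train/covid19`, and so on for the other two splits.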

Recommendations:

  • If you have to split the dataset, it should ideally be train: 80%, test: 10%, validation: 10%.
  • Images should be resized or cropped so that their sizes are 224x224 or 300x300.
  • Please be consistent with the type of images. Only use .jpeg or .jpg images.
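As a rough sketch of the 80/10/10 split recommended above, using only the Python standard library: the helper names `split_files` and `copy_split`, the fixed seed, and the choice to give any rounding remainder to validation are all assumptions made for illustration, not part of the recommendation.

```python
import random
import shutil
from pathlib import Path

# Only the extensions named in the recommendations are accepted.
ALLOWED = {".jpg", ".jpeg"}


def split_files(files, train=0.8, test=0.1, seed=0):
    """Shuffle deterministically and split into train/test/validation.

    Whatever is left after the train and test shares goes to validation,
    so every file lands in exactly one split.
    """
    files = sorted(files)                 # stable order before shuffling
    random.Random(seed).shuffle(files)    # seeded, so the split is repeatable
    n_train = int(len(files) * train)
    n_test = int(len(files) * test)
    return {
        "train": files[:n_train],
        "test": files[n_train:n_train + n_test],
        "validation": files[n_train + n_test:],
    }


def copy_split(source_dir, root, label, seed=0):
    """Copy the .jpg/.jpeg files in source_dir into <root>/<split>/<label>.

    `source_dir`, `root` and `label` are hypothetical caller-supplied values,
    e.g. copy_split("raw/covid19", "dataset", "covid19").
    Returns the number of files copied per split.
    """
    images = [p for p in Path(source_dir).iterdir()
              if p.is_file() and p.suffix.lower() in ALLOWED]
    parts = split_files(images, seed=seed)
    for split, members in parts.items():
        target = Path(root) / split / label
        target.mkdir(parents=True, exist_ok=True)
        for path in members:
            shutil.copy2(path, target / path.name)
    return {split: len(members) for split, members in parts.items()}
```

Splitting each class separately, as `copy_split` does, keeps the class balance roughly the same across the three folders. If several images come from the same patient, it would probably be safer to split by patient instead of by file, which this sketch does not attempt.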

Mar 21 '20 11:03 elcronos

Hi Camilo, here is the link to the database: https://stanfordmlgroup.github.io/competitions/chexpert/. Sorry, I forgot to post it earlier.

Mar 21 '20 11:03 davidcp82

https://www.ajronline.org/doi/full/10.2214/AJR.20.23034 (I'm still trying to find the image data.)

Mar 21 '20 11:03 davidcp82

Hi @elcronos, I'd love to help with this task if no one is working on it yet.

I have a startup called LinkedAI, and we help people with all kinds of data issues, mostly labeling, so we'd be more than glad to help you with this project.

You can assign me this issue to work on it.

Mar 23 '20 15:03 divait

Yes, please do

Mar 23 '20 18:03 elcronos

> Hi Camilo, here is the link to the database: https://stanfordmlgroup.github.io/competitions/chexpert/. Sorry, I forgot to post it earlier.

If you're using that dataset, remember to use the high-resolution images (439GB). The compressed version (11GB) is not of diagnostic quality.

Unfortunately, I've only been able to find one open source of COVID-19-positive CXRs/CTs: https://github.com/ieee8023/covid-chestxray-dataset The image formats, resolutions and quality are all over the place, so it will require cleaning.
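For the cleaning step, one possible approach (only a sketch) is to center-crop each image to a square, resize it to the 224x224 size recommended earlier in this issue, and re-save it as JPEG. The helper names below are made up for illustration, and `to_square_jpeg` assumes the third-party Pillow library is installed; converting to RGB is also an assumption, and a center crop can cut off parts of the lungs on wide images, so the results would still need a visual check.

```python
def center_crop_box(width, height):
    """Return (left, upper, right, lower) for the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def to_square_jpeg(src, dst, size=224):
    """Center-crop `src` to a square, resize to size x size, save as JPEG.

    Assumes Pillow is installed (pip install Pillow). It is imported here,
    inside the function, so center_crop_box stays usable without it.
    `src` and `dst` are hypothetical file paths supplied by the caller.
    """
    from PIL import Image

    with Image.open(src) as img:
        img = img.convert("RGB")  # JPEG cannot store alpha or palette modes
        img = img.crop(center_crop_box(*img.size)).resize((size, size))
        img.save(dst, format="JPEG")
```

For example, `center_crop_box(400, 300)` gives `(50, 0, 350, 300)`, i.e. 50 pixels trimmed from each side of a landscape image.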

There are some here as well (https://bit.ly/BSTICovid19_Teaching_Library), but they are not available for download. You might have to ask for permission; see https://www.bsti.org.uk/training-and-education/covid-19-bsti-imaging-database/

Here are some other CXR datasets, similar to Stanford's CheXpert:

  • NIH ChestXray: https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/36938765345
  • PadChest (requires authorization): http://bimcv.cipf.es/bimcv-projects/padchest/
  • MIMIC-CXR (requires authorization): https://physionet.org/content/mimic-cxr/

Mar 24 '20 04:03 ayhyap

I created a repository to gather chest X-ray and CT images. The goal is to build a collection that can be useful for other projects that are analyzing COVID-19 with computer vision.

https://github.com/arthurfigueiredo/covid-dataset/

Mar 25 '20 12:03 arthurfigueiredo