Results 17 comments of bganglia

If you want to use the dataset itself, you should use `xrv.datasets.COVID19_Dataset()` from the [torchxrayvision](https://github.com/mlmed/torchxrayvision) library. The scripts you are looking at are used for adding more data to the...

Ok, following the example in [combined_interface.py](https://github.com/ieee8023/covid-chestxray-dataset/blob/78543292f8b01d5e0ed1d0e15dce71949f0657bb/scripts/combined_interface.py#L27), it should work if you run this command in the scripts directory:

```
python combined_interface.py "search terms" image_output_folder/ new_metadata_filename.csv ../metadata.csv 10 internal retry
```

...

@amgsharma Use the filename column. You can join the path to the images directory with the filename from each row to obtain the path to each image.
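
As a rough sketch of that join, assuming the metadata is a CSV with a `filename` column: the `IMAGES_DIR` location, the `image_paths` helper, and the sample rows below are made up for illustration, so adjust them to your own checkout.

```python
import csv
import io
from pathlib import Path

# Hypothetical location of the images directory; change this to match your checkout.
IMAGES_DIR = Path("covid-chestxray-dataset") / "images"


def image_paths(metadata_file, images_dir=IMAGES_DIR):
    """Yield one image path per metadata row, built from the filename column."""
    for row in csv.DictReader(metadata_file):
        yield images_dir / row["filename"]


# Made-up sample rows standing in for the real metadata.csv.
sample = io.StringIO(
    "filename,finding\n"
    "example_1.jpg,COVID-19\n"
    "example_2.png,No Finding\n"
)
paths = list(image_paths(sample))
```

With the real file you would pass `open("metadata.csv", newline="")` instead of the in-memory sample.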

@JiayuanDing100 The patient has COVID-19 if the string "COVID-19" is in the "finding" column.
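
A minimal sketch of that check, assuming the metadata is a CSV with a `finding` column: the `has_covid19` helper and the sample rows are hypothetical and only there to show the substring test.

```python
import csv
import io


def has_covid19(row):
    """Return True if the string "COVID-19" appears in the row's finding column."""
    # A missing or empty finding is treated as "not COVID-19".
    return "COVID-19" in (row.get("finding") or "")


# Made-up sample rows standing in for the real metadata.csv.
sample = io.StringIO(
    "filename,finding\n"
    "example_1.jpg,COVID-19\n"
    '"example_2.jpg","COVID-19, ARDS"\n'
    "example_3.jpg,Pneumonia\n"
)
rows = list(csv.DictReader(sample))
covid_files = [row["filename"] for row in rows if has_covid19(row)]
```

The substring test also matches a row whose finding lists several conditions, which an exact equality check would skip.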

@JiayuanDing100 It depends on what kind of label you are looking for. You may want the "finding" column or the "survival" column. You can read more about the metadata [here](https://github.com/ieee8023/covid-chestxray-dataset/blob/master/CONTRIBUTING.md).

@JiayuanDing100 Check whether the "finding" column equals "COVID-19". I answered your question in #42 as well.

@JiayuanDing100 The dataloader here may also be useful for you: https://github.com/mlmed/torchxrayvision

@ines321 They are in the "images" folder, but you may prefer to use the dataloader at https://github.com/mlmed/torchxrayvision

@ieee8023 I can work on this

If images from patients who might be healthy are being compared to these images, the small figure labels (e.g. "A", "B") could also lead to data leakage.