data set repo

Open hnegi1212 opened this issue 5 years ago • 11 comments

Hi,

I understand that the code used in the book can't be shared, but could you tell us where the datasets used in the book can be found? It would be good to have all datasets used in the book in one repo; otherwise we have to google each dataset every time one is used in the book.

hnegi1212 avatar Aug 09 '20 11:08 hnegi1212

@hnegi1212 I have provided the references to all datasets when they are discussed the first time. One dataset is not publicly available, but that is it. Which dataset were you unable to find the reference to? I might also create a markdown file with links to the datasets used, if that makes things easier, but I would really like to know which one I missed the reference to.

abhishekkrthakur avatar Aug 09 '20 12:08 abhishekkrthakur

+1 for Datasets mentioned section in the repo

epogrebnyak avatar Aug 09 '20 12:08 epogrebnyak

+1 for a markdown file with links to datasets. Thanks, that answers my question.

hnegi1212 avatar Aug 09 '20 12:08 hnegi1212

+1 for dataset mention. The reference to the dataset in the Feature Engineering chapter is missing.

anandsm7 avatar Sep 19 '20 08:09 anandsm7

The siim-acr-pneumotorax-segmentation dataset in PNG format (stage 1) is not available. Where can we find it?

akagrawal2k17 avatar Nov 11 '20 23:11 akagrawal2k17

Yes! I am also struggling to find the correct siim-acr-pneumotorax-segmentation dataset. :)

ranglescabau avatar Nov 13 '20 10:11 ranglescabau

@akagrawal2k17 I found siim-acr-pneumotorax-segmentation dataset.

I don't know whether this is correct, but with a little preprocessing, my model works well using these two datasets.

I hope this helps.

tokuma09 avatar Nov 14 '20 07:11 tokuma09

Is this issue solved now?

abhishekkrthakur avatar Nov 20 '20 11:11 abhishekkrthakur

No. I am still looking for the data set and could not find it. Please help me with the dataset.

akagrawal2k17 avatar Nov 24 '20 12:11 akagrawal2k17

@akagrawal2k17: For the siim-pneumotorax image classification task, we can download the images and the train.csv file as follows:

Download train and test images at https://www.kaggle.com/abhishek/siim-png-images

Using kaggle command:

cd the_project_data_folder/
kaggle datasets download -d abhishek/siim-png-images
unzip siim-png-images.zip

Download train.csv with target label at https://www.kaggle.com/abhishek/siim-png-train-csv

kaggle datasets download -d abhishek/siim-png-train-csv
unzip siim-png-train-csv.zip
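
Once both archives are unzipped, a quick sanity check can confirm that the files landed where expected. The following is only a minimal sketch: the helper name summarize_download is hypothetical, the folder layout inside the archives is not assumed (the search is recursive), and no column names from train.csv are assumed either, so it only reports the header it finds.

```python
import csv
from pathlib import Path


def summarize_download(data_dir):
    """Count PNG files and read the header of train.csv under data_dir.

    Hypothetical helper for illustration; not part of the book or the Kaggle CLI.
    """
    root = Path(data_dir)
    # Search recursively, since the layout inside the archives is not assumed.
    png_count = sum(1 for _ in root.rglob("*.png"))
    csv_paths = sorted(root.rglob("train.csv"))
    header, n_rows = None, 0
    if csv_paths:
        with csv_paths[0].open(newline="") as f:
            reader = csv.reader(f)
            header = next(reader, None)   # first row is treated as the header
            n_rows = sum(1 for _ in reader)  # remaining rows are data rows
    return {"png_count": png_count, "csv_header": header, "csv_rows": n_rows}


if __name__ == "__main__":
    # Folder name taken from the commands above.
    print(summarize_download("the_project_data_folder"))
```

If png_count is 0 or csv_header is None, one of the downloads or unzip steps most likely did not complete.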

danhphan avatar Mar 03 '21 01:03 danhphan

Here are plenty of datasets. Haven't checked if all the datasets used in the book are here though.

CarlosBrunner avatar Aug 23 '23 04:08 CarlosBrunner