
Datasets in Chapter 15 Exercises are not found

Open uzi0espil opened this issue 5 years ago • 4 comments

The datasets mentioned in questions 9 and 10 of chapter 15 cannot be found. For SketchRNN there are four variants of the dataset, but tensorflow-datasets has only the quickdraw_bitmap dataset, which I think is more suitable for image classification than for sequence tasks. Also, the link in question 10 (https://homl.info/bach) leads to a 404 "page not found" error.

uzi0espil avatar Jan 31 '20 16:01 uzi0espil

The Bach chorales dataset is here: https://github.com/ageron/handson-ml2/tree/master/datasets/jsb_chorales

lord8266 avatar Feb 01 '20 12:02 lord8266

@lord8266 Thank you

uzi0espil avatar Feb 03 '20 09:02 uzi0espil

Is the quickdraw_bitmap dataset supposed to be 36 GB? I can't fit that on my computer.

andrewy12 avatar Feb 22 '20 03:02 andrewy12

Hi @uzi0espil ,

Thanks for your question. I'm sorry about the SketchRNN dataset issue. About a year ago there was a TFDS Pull Request that seemed ready to merge, and the discussions were looking good, so I assumed it would be merged within a matter of weeks and decided to include the dataset in the book. Unfortunately, this PR hasn't been merged yet. I should have double-checked before the book came out (I usually wrote a TODO:CHECK for myself every time I wrote about something that was supposed to be released later, but in this case I forgot, probably because it was in an exercise).

Anyway, the good news is that the dataset is available as convenient TFRecord files. I uploaded the solutions to the exercises in chapter 15, in case you want to take a look.

I'll leave this issue open until the SketchRNN dataset is available in TFDS... hopefully one day! ;-)

ageron avatar Mar 31 '20 09:03 ageron
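
For anyone curious about what the TFRecord files mentioned above look like on disk, here is a minimal sketch of a reader that uses only the Python standard library. It is not the code from the book's solutions; in practice you would normally load such files with `tf.data.TFRecordDataset`. The function name `iter_tfrecord_payloads` is made up for this illustration, and the sketch assumes uncompressed files and skips the CRC checks rather than verifying them. Each payload is typically a serialized protocol buffer (such as `tf.train.Example`), which this sketch does not decode.

```python
import struct
from typing import Iterator


def iter_tfrecord_payloads(path: str) -> Iterator[bytes]:
    """Yield the raw payload bytes of each record in a TFRecord file.

    Each record is framed as (all little-endian):
      uint64 length | uint32 CRC of length | payload | uint32 CRC of payload
    The two CRC fields are read and ignored here (assumption: the file
    is trusted and uncompressed).
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte length CRC
            if not header:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            payload = f.read(length)
            footer = f.read(4)  # payload CRC, not verified
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record body")
            yield payload
```

Counting the records in a file is then just `sum(1 for _ in iter_tfrecord_payloads(path))`, which can be a quick sanity check after downloading.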