Add some stats on the size of the datasets provided in skrub.datasets
Describe the issue linked to the documentation
I am working on an example that uses some of the datasets provided in skrub.datasets. For each of them, I currently have to download and open the data just to find out how big it is and what its general shape is.
Suggest a potential alternative/fix
Adding at least the number of rows and columns to the documentation would make it easy to tell whether a table is too large for my use case (e.g., I don't want to work with 1M rows), and it should not be too complicated to do.
@rcap107 Is this up for grabs?
Yes, it is, thanks @Neilblaze!! I think the simplest thing would be to manually check the number of rows, the number of columns, and the size on disk, and write that in the docstring. It doesn't have to be done all in one PR; if that's too much work, it can be done a few datasets at a time.
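For anyone picking this up, here is a minimal sketch of how those three numbers could be collected for one dataset. It assumes the dataset has already been downloaded and is stored locally as a CSV file with a header row. The helper names `describe_csv` and `docstring_line` are hypothetical and are not part of skrub; they only use the Python standard library.

```python
import csv
import os


def describe_csv(path):
    """Return (n_rows, n_columns, size_in_bytes) for a CSV file.

    Assumption: the first line of the file is a header row, so it is
    counted as column names and not as a data row.
    """
    size_bytes = os.path.getsize(path)
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty file -> zero columns
        n_rows = sum(1 for _ in reader)  # count the remaining data rows
    return n_rows, len(header), size_bytes


def docstring_line(name, path):
    """Format the stats as a one-line summary that could go in a docstring."""
    n_rows, n_cols, size_bytes = describe_csv(path)
    size_mb = size_bytes / 1e6
    return f"{name}: {n_rows} rows, {n_cols} columns, {size_mb:.1f} MB on disk"
```

Calling `docstring_line("my_dataset", "/path/to/my_dataset.csv")` (both arguments are placeholders) returns a single line that can be pasted into the corresponding docstring. Counting rows with the `csv` module rather than counting newlines keeps the result correct when quoted fields contain line breaks.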
@jeromedockes sure thing, I'm on it! Thanks!
Hey @Neilblaze, are you still working on this?
Closed by #1503