Matthias Feurer
That sounds like a good proposal.
Hey, I just checked https://github.com/openml/docs/pull/15/files and the links aren't working yet. Does there need to be a release of the server for them to appear?
The documentation PR https://github.com/openml/docs/pull/15 just got merged. @joaquinvanschoren, who can update the new API docs to add a link to the new description?
Interesting, this is only an issue on the Python side, not on the command line. I will investigate this further. However, the md5 hashes of the other datasets should **never** change, right?
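To make that check concrete, here is a minimal sketch of verifying a file's md5 hash, assuming a locally downloaded ARFF file and a checksum copied from the dataset description (the filename and placeholder hash are illustrative, not actual server values):

```python
import hashlib

# Hypothetical values for illustration; the real checksum would come
# from the dataset description on the server.
path = "libras_move.arff"
expected_md5 = "0123456789abcdef0123456789abcdef"  # placeholder

with open(path, "rb") as f:
    actual_md5 = hashlib.md5(f.read()).hexdigest()

# If the file on the server was never modified, these should match.
print(actual_md5 == expected_md5)
```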
Hm, I think at least dataset 299 is not stored and sent in UTF-8, although it is advertised as such:
```
wget https://www.openml.org/data/v1/download/52200/libras_move.arff
file -i libras_move.arff
libras_move.arff: text/plain; charset=iso-8859-1
```
...
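As a reference point, here is a minimal sketch of how such a file could be re-encoded to UTF-8, assuming the source encoding really is iso-8859-1 as reported by `file -i` above:

```python
# Read the file with the encoding reported by `file -i`, then write
# it back out as UTF-8. Filenames are taken from the example above.
with open("libras_move.arff", encoding="iso-8859-1") as f:
    text = f.read()

with open("libras_move_utf8.arff", "w", encoding="utf-8") as f:
    f.write(text)
```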
Hi everyone, here are a few comments from my side: > So, should I fix the encoding (and update the hash in the file table) of all dataset files with...
I totally agree that every file should be UTF-8 and that files should ideally be converted on upload. @joaquinvanschoren you pointed to the wrong Jan; you wanted to mention @janvanrijn.
I just had a look at the attributes of the dataset, and while the test obviously works, I'm wondering whether a few other attributes should also be excluded from checking,...
Is the maximum URL length documented somewhere? Also, what's the server response? If the server response is something we can parse/display nicely, is there a reason to act upon this...
This can be reproduced with:
```python
import openml

openml.datasets.list_datasets(data_id=list(range(10000)))
```
@janvanrijn we can add a helper that sends either a GET or a POST request depending on the length...
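A minimal sketch of such a helper, assuming plain `requests` and a hypothetical URL-length cutoff (the actual server limit, and whether it accepts the same parameters via POST, would need to be confirmed):

```python
import requests

# Hypothetical cutoff: servers and proxies often limit URLs to a few
# thousand characters, but the OpenML server's real limit is an assumption.
MAX_URL_LENGTH = 2000

def get_or_post(url, params):
    """Send a GET request, falling back to POST when the encoded URL
    would exceed MAX_URL_LENGTH."""
    prepared = requests.Request("GET", url, params=params).prepare()
    if len(prepared.url) <= MAX_URL_LENGTH:
        return requests.get(url, params=params)
    # Too long for a URL: send the same parameters in the request body.
    return requests.post(url, data=params)
```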