Joaquin Vanschoren
OK, but keep the original issue in mind: the Python md5 hash is somehow different from what Linux computes. Is the best solution to say 'sorry we don't handle latin-1...
Looks like you need to tell the md5 lib which encoding the file has: https://stackoverflow.com/questions/41068617/python-2-vs-3-same-inputs-different-results-md5-hash https://stackoverflow.com/questions/24766220/hashlib-md5-in-python-returning-incorrect-digests-for-some-unicode-characters Otherwise the hash may be different from the hash computed on the server. On...
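To illustrate the point about encodings, here is a minimal sketch using only Python's standard `hashlib`. It shows that the same text produces different MD5 digests depending on the encoding used to turn it into bytes, and that hashing the raw bytes sidesteps the question entirely. The sample string and the helper name `md5_of_stream` are illustrative assumptions, not part of OpenML or its Python API.

```python
import hashlib
import io

# Illustrative sample text containing a non-ASCII character (assumption).
text = "café"

# MD5 operates on bytes, so the chosen encoding changes the digest.
utf8_digest = hashlib.md5(text.encode("utf-8")).hexdigest()
latin1_digest = hashlib.md5(text.encode("latin-1")).hexdigest()


def md5_of_stream(stream, chunk_size=65536):
    """Hash a binary stream chunk by chunk, without decoding it.

    Hypothetical helper: reading raw bytes means the digest depends only
    on the stored bytes, which is what a tool like md5sum hashes.
    """
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# The same bytes always give the same digest, whatever they decode to.
raw = text.encode("latin-1")
stream_digest = md5_of_stream(io.BytesIO(raw))
```

For a file on disk, the same helper could be called on a handle opened with `open(path, "rb")`, so that no decoding step is involved.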
Btw, I fixed all files with unknown encoding on the server. I also converted most Latin-1 files, except those with string data. On Sun, 27 May 2018 at 01:28, Joaquin...
Great. Can we conclude the following? * OpenML should store all data as UTF-8 (which I think is a superset of several 8-bit encodings). That means that the REST API...
Thanks for reporting, Timothy. This is a really strange error. * The ARFF download works * Downloading the CSV via sklearn.fetch_openml(data_id=42178) and the python API also works * The CSV...
Yes exactly. If a dataset is sparse the get_csv API gives you a sparse representation. I'm not sure if it would be wise to expand to a dense representation in...
No, I don't think that exists. The (very simple) representation we use goes back to this: https://www.cs.waikato.ac.nz/ml/weka/arff.html If there is a better way to represent sparse data in CSV,...
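For readers unfamiliar with the sparse notation from the ARFF page linked above, a rough sketch of expanding one sparse row into a dense one follows. In that notation a row is written as `{index value, index value, ...}` with 0-based attribute indices, and omitted attributes are taken to be 0. The function name `expand_sparse_row` is hypothetical, and the sketch assumes values contain no commas or quoting; whether the CSV endpoint emits exactly this form is also an assumption.

```python
def expand_sparse_row(row, n_attributes, fill="0"):
    """Expand one sparse ARFF-style row into a dense list of strings.

    Hypothetical helper: handles only simple, unquoted values.
    """
    inner = row.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("expected a row of the form '{index value, ...}'")

    # Every attribute not listed in the sparse row keeps the fill value.
    dense = [fill] * n_attributes
    body = inner[1:-1].strip()
    if body:
        for pair in body.split(","):
            # Each entry is '<0-based index> <value>'.
            index, value = pair.strip().split(None, 1)
            dense[int(index)] = value
    return dense


dense_row = expand_sparse_row("{1 X, 3 Y}", n_attributes=5)
```

Expanding every row this way costs memory proportional to the number of attributes, which is one reason to hesitate before doing it by default for wide sparse datasets.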
Hi all, sorry about the hiccups, we're doing a bunch of work on the backend migrating to Kubernetes, and occasionally responses may be slower. We're definitely planning to resolve it...
Hi all, is there any progress on this issue?
Would it be hard to support multiple configuration files? Or have multiple configurations within one configuration file? I'm especially thinking of cases where you have your own local install, but...
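One possible shape for "multiple configurations within one configuration file" is an INI file with one section per configuration, read with Python's standard `configparser`. This is only a sketch of the idea: the section names, keys, URLs, and the `load_profile` helper are all hypothetical and do not describe the existing configuration format.

```python
import configparser

# Hypothetical file contents: one section per configuration (assumption).
EXAMPLE_CONFIG = """
[production]
server = https://example.org/api
apikey = PRODUCTION_KEY_PLACEHOLDER

[local]
server = http://localhost:8080/api
apikey = LOCAL_KEY_PLACEHOLDER
"""


def load_profile(config_text, profile):
    """Return the settings of one named section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(profile):
        raise KeyError(f"no configuration named {profile!r}")
    return dict(parser[profile])


local_settings = load_profile(EXAMPLE_CONFIG, "local")
```

Switching between a local install and the main server would then come down to choosing a section name, rather than swapping files.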