inference
Could you please provide a processed binary loader of the Terabyte dataset?
The Terabyte dataset is very large and hard to preprocess, but the inference task only needs the last day's data, which is relatively small and manageable. Many people, myself included, cannot work with the full dataset because of hardware limitations, so I am wondering whether you could provide a saved binary loader for us. Thanks.
This is exactly what I have been wondering. Processing the whole 1.1 TB of data just to get the last day as the test set is too expensive; I failed a few times and finally had to switch to the subsampled version. I hope the MLCommons community can help make this experiment more accessible.
@mnaumovfb Can you please take a look at the issue and comment?
We need Criteo to allow the last day of logs to be preprocessed by MLCommons and shared with the inference WG.
outdated