torchrec
Copy sparse, dense and labels from the right source when shuffling Criteo data
When dense and sparse features are located in different directories, i.e., when shuffle() is called with input_dir_labels_and_dense != input_dir_sparse, the method throws an error at L642 such as:
FileNotFoundError: [Errno 2] No such file or directory: '/data/contiguous/day_23_dense.npy'
This PR fixes the input directories so that the sparse, dense and labels files are each read from the right source. This is also in line with this remark.
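To illustrate the directory mix-up described above, here is a minimal sketch of per-file directory selection. It is not the actual torchrec code: `source_path` is a hypothetical helper, the `day_{d}_sparse.npy` and `day_{d}_labels.npy` file names are assumed by analogy with the `day_23_dense.npy` name in the error message, and the example directories are made up.

```python
import os


def source_path(day, kind, input_dir_labels_and_dense, input_dir_sparse):
    """Hypothetical helper: return the path of one per-day .npy file.

    Sparse files are looked up in input_dir_sparse; dense and labels
    files are looked up in input_dir_labels_and_dense.
    """
    if kind not in ("dense", "sparse", "labels"):
        raise ValueError(f"unknown file kind: {kind!r}")
    # Pick the directory based on the kind of file being read, instead of
    # using a single directory for all three kinds.
    base = input_dir_sparse if kind == "sparse" else input_dir_labels_and_dense
    # Assumed naming scheme: day_<day>_<kind>.npy
    return os.path.join(base, f"day_{day}_{kind}.npy")


# Example with made-up directories: the dense file for day 23 is resolved
# against the labels-and-dense directory, not the sparse one.
dense_file = source_path(23, "dense", "/data/dense_and_labels", "/data/contiguous")
sparse_file = source_path(23, "sparse", "/data/dense_and_labels", "/data/contiguous")
```

With this kind of lookup, a dense file is never searched for in the sparse directory, which is the situation that produced the FileNotFoundError above.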
cc @samiwilf @narayanan2004
Thanks @janekl. I hadn't seen this PR until yesterday and had already made the fix in a past PR, but the subsidiary changes look good and we'll merge them in.
@samiwilf has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.