
[GNN] Acc eval script change: incompatibility or corrupted file?

Open attafosu opened this issue 10 months ago • 3 comments

After the change to avoid using memmap when loading the labels (#2081), we're encountering pickling errors from numpy: we're unable to load the labels with numpy.load. The previous loading via np.memmap (https://github.com/mlcommons/inference/commit/be6ff52235b74a2f4ef85bf86fa3785045229fa8) worked in our environments without issues.

Steps to reproduce:

  1. wget -c https://igb-public.s3.us-east-2.amazonaws.com/IGBH/processed/paper/node_label_2K.npy
  2. python -c "import numpy as np; labels = np.load('node_label_2K.npy', mmap_mode=None)"
  3. Error: ValueError: Cannot load file containing pickled data when allow_pickle=False

After setting allow_pickle=True, we face another error: _pickle.UnpicklingError: invalid load key, '\x00'
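For context, both errors are consistent with a file that lacks the .npy header, for example one written as raw bytes, although the thread does not confirm that this is the cause here. A minimal sketch of that behaviour on a synthetic file follows; the file names, the helper `has_npy_header`, and the float32 dtype are assumptions for illustration, not taken from the dataset:

```python
import os
import tempfile

import numpy as np


def has_npy_header(path):
    """Return True if the file starts with the .npy magic string."""
    with open(path, "rb") as f:
        return f.read(6) == b"\x93NUMPY"


# Hypothetical file names in a scratch directory (assumption, not the real dataset).
tmpdir = tempfile.mkdtemp()
raw_path = os.path.join(tmpdir, "raw_labels.npy")
npy_path = os.path.join(tmpdir, "saved_labels.npy")

labels = np.arange(10, dtype=np.float32)  # dtype is an assumption
labels.tofile(raw_path)    # raw bytes only, no .npy header
np.save(npy_path, labels)  # proper .npy file with header

raw_has_header = has_npy_header(raw_path)
npy_has_header = has_npy_header(npy_path)

# np.load finds no .npy or zip magic, so it falls back to pickle,
# which is refused while allow_pickle=False (the default).
try:
    np.load(raw_path, mmap_mode=None)
    raw_load_error = None
except ValueError as exc:
    raw_load_error = str(exc)

# A headerless file can still be read when the dtype is known in advance.
recovered = np.fromfile(raw_path, dtype=np.float32)
```

Running the same `has_npy_header` check on the downloaded node_label_2K.npy would show whether the file is in .npy format or is a raw dump that only np.memmap or np.fromfile can read.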

Packages used: numpy==1.26.4, torch==2.1.0+cpu

Curious if there's a dependency issue on my side. cc: @arjunsuresh @nv-alicheng

Was the previous way of loading the labels incorrect, leading to incorrect labels? Or does the current modification fix a previously missed bug? If the latest change doesn't concern the correctness of the loaded labels, can we bring back the old way as the default, and switch to np.load only if --no-memmap is passed?
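A rough sketch of what that proposal could look like is below. It is not the actual eval script: the function `load_labels`, the `--label-path` argument, and the float32 dtype are hypothetical, and only the `--no-memmap` flag name comes from the suggestion above.

```python
import argparse
import os
import tempfile

import numpy as np


def load_labels(path, use_memmap=True, dtype=np.float32):
    """Load labels via np.memmap by default, or np.load when memmap is disabled."""
    if use_memmap:
        # Treats the file as raw bytes of the given dtype (assumed float32 here).
        return np.memmap(path, dtype=dtype, mode="r")
    # Requires a file in .npy format.
    return np.load(path, mmap_mode=None)


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--label-path", required=True)  # hypothetical argument
    parser.add_argument("--no-memmap", action="store_true",
                        help="load labels with np.load instead of np.memmap")
    return parser


# Usage on a synthetic raw file (stand-in for the real label file).
demo_path = os.path.join(tempfile.mkdtemp(), "demo_labels.bin")
np.arange(5, dtype=np.float32).tofile(demo_path)

args = build_parser().parse_args(["--label-path", demo_path])
demo_labels = load_labels(args.label_path, use_memmap=not args.no_memmap)
```

Keeping memmap as the default preserves the earlier behaviour, while `--no-memmap` opts in to np.load for files that are in .npy format.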

attafosu avatar Feb 19 '25 04:02 attafosu

Hi @attafosu, I can confirm that this issue exists. @ashwin @nvzhihanj can you please confirm from the Nvidia side?

arjunsuresh avatar Feb 20 '25 09:02 arjunsuresh

Not sure if this is the best fix, but it is an option to get the script working.

https://github.com/mlcommons/inference/pull/2123/files

arjunsuresh avatar Feb 20 '25 11:02 arjunsuresh

The fix looks good to me

attafosu avatar Feb 20 '25 16:02 attafosu