Diffusion_models_from_scratch
Key error with file data/Imagenet64/metadata.pkl
Hello!
I am trying to train a model myself using Imagenet64x64 as a test, on a Mac using the "mps" device.
It took me a little while to see that, after downloading Imagenet64x64, I have to:
- use "loadImagenet64.py" to generate .pkl files in an "Imagenet64" folder;
- THEN use "make_massive_tensor.py" to make a large .pt file;
- THEN use "train.py", which will call "model_trainer.py".
Apparently (tell me if I am wrong):
- "loadImagenet64.py" needs "Imagenet64_train_part1.zip" and "Imagenet64_train_part2.zip". Imagenet64x64 does not have these files. It rather has: train_data_batch_1, train_data_batch_2, train_data_batch_3... etc.
- I changed the code in "loadImagenet64.py" to make a series of img and label .pkl files within the "Imagenet64" folder.
- Then, when running "make_massive_tensor.py", I get the following errors:
Shape error with file data/Imagenet64/n.pkl
Key error with file data/Imagenet64/metadata.pkl
- I probably did something wrong in "loadImagenet64.py" with the formatting of the pickles, but I do not know where that is happening. The dict keys seem fine: loadImagenet64 seems to replace 'data', 'mean', 'labels' found in Imagenet64 with 'data', 'mean', 'labels'.
=> Where did you get "Imagenet64_train_part1.zip", OR how did you make them?
=> How should I deal with the shape and dict keys within those .pkl files?
Thank you for your help!!!
O.
After running "loadImagenet64.py", my .pkl files have:
- for label: length 128116
- for img: length 128116
"make_massive_tensor.py" cannot reshape them to (3, 64, 64) (<-- line 62 in "make_massive_tensor.py")
If that can help:
- "label" type is list
- "img" type is numpy.ndarray
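A quick way to check what a .pkl file actually contains is to load it and print the type and size of each entry. This is only a minimal sketch: `describe_pickle` is a hypothetical helper (not part of the repo), and it assumes the file holds a pickled dict.

```python
import pickle


def describe_pickle(path):
    """Return {key: (type name, shape or length)} for a pickled dict."""
    with open(path, "rb") as f:
        obj = pickle.load(f)  # assumption: the top-level object is a dict
    summary = {}
    for key, value in obj.items():
        # numpy arrays expose .shape; lists and similar only support len()
        size = getattr(value, "shape", None)
        if size is None and hasattr(value, "__len__"):
            size = len(value)
        summary[key] = (type(value).__name__, size)
    return summary
```

Calling `describe_pickle("data/Imagenet64/0.pkl")` (path taken from the error messages above) would show at a glance whether "img" holds one flattened image or a whole batch.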
Apparently (tell me if I am wrong): — "loadImagenet64.py" needs "Imagenet64_train_part1.zip" and "Imagenet64_train_part2.zip". Imagenet64x64 does not have these files. It rather has: train_data_batch_1, train_data_batch_2, train_data_batch_3... etc
The loadImagenet64.py script loads in the data directly through the zip files downloaded from the ImageNet website. So, don't unzip the archives as this script works with zipped data.
https://github.com/gmongaras/Diffusion_models_from_scratch/blob/main/data/loadImagenet64.py#L19
"img" type is numpy.ndarray
Did you download the numpy version of ImageNet? I think the scripts work off the base version rather than the numpy version of ImageNet, which may be causing the issue here.
Just a heads up: as for training on a Mac, I haven't added MPS support to the repo because in PyTorch 1.x the MPS device had all kinds of weird issues. I'm not sure if that was fixed in PyTorch 2.0. For inference, this may be fine, but for training, I think you may run into multiple issues trying to get it to work properly.
Hope this helps! Let me know if you run into any more issues.
Hello!
Thank you very much for your answer!
- For Imagenet64_train_part1.zip in your loadImagenet64.py, I have been waiting forever to get authorization to download from the ImageNet website. I am not sure they are still active. So I eventually downloaded Imagenet64 from Kaggle.com.
There are folders with the same name as your ZIP files. But zipping them does not seem to work either.
- I changed a bit of code to get the MPS device to work on Apple Silicon. In fact, I used the MPS device declaration within PyTorch on previous projects and that worked. I do not fully understand it as I am not a specialist, but it seems to be working like a charm now with PyTorch 2 on macOS 12.3+.
- I will keep you posted with my tests and whether I can make it work. I may ask for help from someone around me too.
Thank you again,
O.
"One more thing": :)
Do you know the shape of the tensors in your .pkl files for "img" and "label"?
Oh, that makes sense. So the Kaggle dataset is probably in a different format from the ImageNet dataset, which is why you're running into issues loading in the data. I haven't added support for the Kaggle dataset, but I'd imagine the process should be the same. Perhaps the Kaggle dataset uses the numpy version?
Anyways, it'd be awesome if you could get the data working with Kaggle! Looking at the .pkl data, it looks like img is a flattened numpy array. For example, 0.pkl would have a flattened size of 12288. As for label, this is just an integer value assigning the class of this image.
{'img': array([ 34, 48, 80, ..., 194, 192, 188], dtype=uint8), 'label': 572}
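To illustrate the per-image format described above, here is a minimal sketch that turns one flattened record back into a (3, 64, 64) array. `to_chw` is a hypothetical helper name, and the record is synthetic stand-in data, not a real file from the repo.

```python
import numpy as np


def to_chw(flat_img, channels=3, size=64):
    """Reshape one flattened image (12288 values) to (channels, size, size)."""
    flat_img = np.asarray(flat_img)
    expected = channels * size * size
    if flat_img.size != expected:
        # A whole batch, or a differently sized image, ends up here.
        raise ValueError(f"expected {expected} values, got {flat_img.size}")
    return flat_img.reshape(channels, size, size)


# Synthetic record with the same keys and dtype as the example above.
record = {"img": np.zeros(12288, dtype=np.uint8), "label": 572}
image = to_chw(record["img"])
print(image.shape)
```

The explicit size check gives a clearer message than the raw reshape error when "img" turns out to hold more than one image.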
Kaggle dataset uses the numpy version
Yes it does. But that is not the problem.
it looks like img is a flattened numpy array
That makes sense here too! This is not what I have: img is 128116x12288 (no idea why)
The problem is that just flattening it will not work: 128116 x 12288 = 1574289408 values cannot be reshaped into a single (3 x 64 x 64) image, which holds only 12288 values.
So I get the error:
RuntimeError: shape '[3, 64, 64]' is invalid for input of size 1574289408
I will have a closer look at it!
O.
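I think 128116x12288 is batch size by image size. So you have a batch of 128116 images, each with a flattened size of 12288, which is exactly 3x64x64. The reshape fails because it is applied to the whole batch at once instead of to one image at a time.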
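A rough sketch of how such a batch could be handled, under two assumptions: the array really is (N, 12288) with one flattened image per row, and the labels list lines up with the rows. `split_batch` is a hypothetical helper, and the labels are passed through unchanged because whether they need an offset is not known here.

```python
import numpy as np

IMG_SIZE = 3 * 64 * 64  # 12288 values per flattened image


def split_batch(data, labels):
    """Yield one {'img', 'label'} record per row of a (N, 12288) batch."""
    data = np.asarray(data)
    if data.ndim != 2 or data.shape[1] != IMG_SIZE:
        raise ValueError(f"expected shape (N, {IMG_SIZE}), got {data.shape}")
    if len(labels) != data.shape[0]:
        raise ValueError("data and labels have different lengths")
    for row, label in zip(data, labels):
        # Each row is already one flattened image.
        yield {"img": row, "label": int(label)}


# Small synthetic batch standing in for the real 128116 x 12288 array.
batch = np.zeros((4, IMG_SIZE), dtype=np.uint8)
images = batch.reshape(-1, 3, 64, 64)  # -1 lets NumPy infer the batch size
records = list(split_batch(batch, [0, 1, 2, 3]))
print(images.shape, len(records), records[0]["img"].shape)
```

Each yielded record has the same layout as the 0.pkl example earlier in the thread, so the records could be pickled one per file before running "make_massive_tensor.py".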