stylegan2-pytorch
error in converting .pkl to .pt
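I used Google Colab to convert a model trained on a custom dataset with stylegan2-ADA-PyTorch, using all default settings, and got this error message:
Traceback (most recent call last):
File "/content/stylegan2-pytorch/convert_weight.py", line 236, in
generator, discriminator, g_ema = pickle.load(f)
ModuleNotFoundError: No module named 'torch_utils'
After I ran pip install torch_utils, I got the following message:
Traceback (most recent call last):
File "/content/stylegan2-pytorch/closed_form_factorization.py", line 18, in
ckpt = torch.load(args.ckpt)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 608, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 777, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
ModuleNotFoundError: No module named 'torch_utils.persistence'
What should I do?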
convert_weight.py does not support stylegan2-ada-pytorch.
I see. Thanks!
On Thu, Aug 19, 2021 at 8:49 PM Kim Seonghyeon wrote:
convert_weight.py does not support stylegan2-ada-pytorch.
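For context on the error itself: pickle records the import path of every class it serializes, so loading fails as soon as one of those modules cannot be imported. Below is a minimal sketch using only the standard library. The names fake_torch_utils and Persistent are made-up stand-ins for the modules and classes a stylegan2-ada-pytorch checkpoint refers to; they are not part of either repository.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a module such as torch_utils, in which the
# checkpoint's classes were defined when the pickle was written.
fake = types.ModuleType("fake_torch_utils")


class Persistent:
    def __init__(self, value):
        self.value = value


# Make the class look as if it lives at fake_torch_utils.Persistent.
Persistent.__module__ = "fake_torch_utils"
Persistent.__qualname__ = "Persistent"
fake.Persistent = Persistent
sys.modules["fake_torch_utils"] = fake

payload = pickle.dumps(Persistent(42))

# Simulate loading in an environment that lacks the module.
del sys.modules["fake_torch_utils"]
try:
    pickle.loads(payload)
    load_error = None
except ModuleNotFoundError as exc:
    load_error = str(exc)

print(load_error)
```

Installing an unrelated package that happens to share the name does not help, because the unpickler needs the exact module and class that wrote the file.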
Hello! I ran:
$ python convert_weight.py --repo ~/stylegan2 stylegan2-ffhq-config-f.pkl
I downloaded the .pkl file from https://nvlabs-fi-cdn.nvidia.com/stylegan2/networks/ .
The conversion stalls on my machine without any output. Why does this happen? Do you have any ideas?
How can I get a clean .pt file in TorchScript (ScriptModule) format to use from C++? I need the kind of file written by torch.jit.save(module, "model.pt").
I used google colab to convert from a model trained from a custom dataset with stylegan2-ADA-PyTorch with all default settings and got this error message:
Traceback (most recent call last):
File "/content/stylegan2-pytorch/convert_weight.py", line 236, in
generator, discriminator, g_ema = pickle.load(f)
ModuleNotFoundError: No module named 'torch_utils'
After I pip install torch_utils, I got the following message:
Traceback (most recent call last):
File "/content/stylegan2-pytorch/closed_form_factorization.py", line 18, in
ckpt = torch.load(args.ckpt)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 608, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 777, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
ModuleNotFoundError: No module named 'torch_utils.persistence'
What should I do?
I have the same question. Have you solved it?
No, not yet. There are problems with converting the GAN from .pkl to the PyTorch file format.
Compiling the custom operations can take a long time, and it is required both by the official stylegan2 code and by this repository. I think you can try to find the code that causes the stall.
Regarding PyTorch JIT, I haven't tried it. But I think it is possible if you trace the model without the custom operators.
I use the Docker image from NVlabs and added the project to its internal structure, but the original NVlabs Docker image has no TF, cuBLAS, and so on. Do you have a Docker configuration or a dependency list to start this project correctly?
Hello, here is my stack trace. What is your Python interpreter version? I use 3.6 with CUDA 10.2. Why does this project try to find CUDA 10.0 on my machine? I use CUDA 10.2 and it is installed correctly.
stylegan2-pytorch$ python convert_weight.py --repo ../stylegan2 ./stylegan2-ffhq-config-f.pkl
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "convert_weight.py", line 14, in
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors
for some common reasons and solutions. Include the entire stack trace above this error message when asking for help.
deviceQuery$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA GeForce GTX 850M"
CUDA Driver Version / Runtime Version 11.5 / 10.2
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 4046 MBytes (4242604032 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 902 MHz (0.90 GHz)
Memory Clock rate: 1001 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.5, CUDA Runtime Version = 10.2, NumDevs = 1
My pip list:
stylegan2-pytorch$ pip list
Package Version
systemd-python 234
tensorboard 1.11.0
tensorflow 1.11.0
tensorflow-estimator 1.13.0
tensorflow-gpu 1.13.1
termcolor 1.1.0
terminado 0.8.1
tesserocr 2.4.0
testpath 0.4.2
tifffile 2020.9.3
toolz 0.10.0
torch 1.8.0
torchaudio 0.8.0
torchvision 0.9.0
tornado 5.1.1
traitlets 4.3.2
typing-extensions 3.7.4.3
Maybe the problem is here? I use CUDA 10.2, but TF 1.13-1.15 do not support it, only CUDA 10.0? https://stackoverflow.com/questions/50622525/which-tensorflow-and-cuda-version-combinations-are-compatible
@AlexTitovWork Yes, the tf 1.X binaries only support CUDA 10.0. It would be easier to use CUDA 10.0 and a PyTorch release that supports it. https://github.com/rosinality/alias-free-gan-pytorch/blob/main/Dockerfile
Thanks @rosinality for the Docker config. I built the Docker image successfully. It works well on NVIDIA-docker. But I have some questions.
1. In the Docker image I used CUDA 10.0 and installed tensorflow-gpu==1.4.0, and the image uses torch 1.7.1+cu92. But the project was tested on PyTorch 1.3.1 and CUDA 10.1/10.2.
2. Can this difference affect the execution of the .pkl converter?
3. After running the script, nothing happens. What could be the reason? Maybe it is because I am using an old architecture with CC 5.0?
I used:
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA GeForce GTX 850M"
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 4046 MBytes (4242604032 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 902 MHz (0.90 GHz)
Memory Clock rate: 1001 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
@AlexTitovWork How long has it stalled? It could take some time. As it does not show an error message, it is hard to guess what the problem could be. You can use Ctrl+C to show the stack trace.
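If pressing Ctrl+C is inconvenient (for example in a detached container), Python's standard faulthandler module can print the stack of every thread after a timeout, which serves the same purpose. This is only a sketch of that alternative: the 0.2 second timeout and the sleep standing in for the stalled conversion are illustrative, and in practice the dump_traceback_later call would go near the top of the script being debugged with a much longer timeout.

```python
import faulthandler
import tempfile
import time

# Write the dump to a file so it can be inspected afterwards;
# by default faulthandler writes to sys.stderr.
log = tempfile.TemporaryFile(mode="w+")

# Dump the stack of all threads if the program is still running after 0.2 s.
faulthandler.dump_traceback_later(0.2, file=log)

time.sleep(0.5)  # stand-in for the code that appears to hang

faulthandler.cancel_dump_traceback_later()
log.seek(0)
dump = log.read()
print(dump)
```

The dump shows which line each thread was executing when the timeout fired, which is usually enough to tell a slow extension build from a wait on a lock.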
Hello @rosinality! It stalls for a very long time, more than 8 hours. Thanks for the support.
I got the stack trace with Ctrl+C.
root@1da8e0e33d03:/home/git_repos/stylegan2-pytorch# python convert_weight.py --repo ../stylegan2 stylegan2-ffhq-config-f.pkl
^CTraceback (most recent call last):
File "convert_weight.py", line 11, in
I got a message like the one in this post: https://github.com/zhou13/neurvps/issues/1
It works: go to your .cache directory, delete the lock file for your cpp extension (it is likely under the directory ~/.cache/torch_extensions/something), and you should be able to run it again.
If you can't find your cache directory, you can run python -m pdb your_program.py and break at your .../lib/python3.X/site-packages/torch/utils/cpp_extension.py line 1179 (specifically the line containing "baton = FileBaton(os.path.join(build_directory, 'lock'))") and then print build_directory. That should be the cache directory for your programs.
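The manual search described above can be sketched as a small script. It assumes the cache lives under ~/.cache/torch_extensions (or wherever the TORCH_EXTENSIONS_DIR environment variable points) and that the lock file is literally named lock, as in the FileBaton call quoted above. find_stale_locks is a hypothetical helper name, and the script only lists candidates rather than deleting anything.

```python
import os
from pathlib import Path


def find_stale_locks(root):
    """Return every file named 'lock' below root, sorted; [] if root is missing."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("lock") if p.is_file())


# TORCH_EXTENSIONS_DIR, if set, overrides the default cache location.
cache_dir = os.environ.get("TORCH_EXTENSIONS_DIR", "~/.cache/torch_extensions")
for lock in find_stale_locks(cache_dir):
    # Review the list first; remove a file with lock.unlink() only when
    # no extension build is currently running.
    print(lock)
```

Listing before deleting is deliberate: a lock that belongs to a build still in progress should be left alone.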
@rosinality I started the model converter, but I do not have enough memory. How much memory is needed to convert stylegan2-ffhq-config-f.pkl, for example?
I have 4 GB of video RAM, but the project crashed when the memory filled up.
@rosinality We can close the issue!
@AlexTitovWork The conversion itself does not require GPU memory. The GPU memory consumption comes from sample generation. The converted .pt file should be generated anyway.
nvidia-smi reports GPU usage of 100% and more.
I will try RTX 2090...
@AlexTitovWork I think you can use CUDA_VISIBLE_DEVICES=-1 to prevent TensorFlow from using the GPU.
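One way to apply this suggestion without changing the shell environment is to launch the converter from a small wrapper. A minimal sketch: cpu_only_env is a hypothetical helper, and the command mirrors the invocation used earlier in this thread.

```python
import os
import subprocess
import sys


def cpu_only_env(base=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


cmd = [sys.executable, "convert_weight.py",
       "--repo", "../stylegan2", "stylegan2-ffhq-config-f.pkl"]

if __name__ == "__main__" and os.path.exists("convert_weight.py"):
    # Only runs when started from inside the stylegan2-pytorch checkout.
    subprocess.run(cmd, env=cpu_only_env(), check=True)
```

Note that the variable hides the GPU from every library in that process, PyTorch included, so anything in the script that expects a CUDA device may need adjusting.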
Hi Sam, did you ever manage to do the model conversion? Looking forward to hearing good news from you. Thanks.
I have been working on other projects lately and have not tried it again after rosinality said it was impossible.
Hi, has your problem been solved?
Hello! Yes, I solved all the issues. If you work with big images on a weak GPU, you can crop the image to 400x400, for example, and test your algorithm. On an RTX 3090 with 24 GB, I can convert any model. The internal structure of the model uses GPU tensors, and these take up GPU memory during the conversion process.
Were you able to finally figure out the conversion? I have been struggling with this for a while now too, so I wanted to see if you found a solution. @zjgt
To whom it may concern,
the stylegan3 documentation specifies that the code needs torch_utils and dnnlib to be accessible via PYTHONPATH. So basically, make sure the model you want to convert is loaded from the same path as the stylegan source code.
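In practice this means the directory that contains torch_utils and dnnlib has to be importable before the .pkl is unpickled. A minimal sketch, assuming the NVIDIA source tree has been cloned to the hypothetical path /content/stylegan2-ada-pytorch; add_repo_to_path and missing_modules are illustrative helper names, not part of either repository.

```python
import importlib.util
import sys
from pathlib import Path


def add_repo_to_path(repo_dir):
    """Put repo_dir at the front of sys.path so its packages can be imported."""
    repo = str(Path(repo_dir).expanduser().resolve())
    if repo not in sys.path:
        sys.path.insert(0, repo)
    return repo


def missing_modules(names=("torch_utils", "dnnlib")):
    """Return the names that Python still cannot locate."""
    return [n for n in names if importlib.util.find_spec(n) is None]


add_repo_to_path("/content/stylegan2-ada-pytorch")  # assumed clone location
print("still missing:", missing_modules())
```

This only addresses the ModuleNotFoundError. As noted earlier in the thread, convert_weight.py itself does not support stylegan2-ada-pytorch checkpoints.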