InsightFace-v2
Trained models python / pytorch version?
I tried to use train.py with the BEST_checkpoint_r18.tar as the starting checkpoint, and got the following error:
File "/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/media/data1/noamgat/InsightFace_v2/models.py", line 355, in forward
output = (one_hot * phi) + ((1.0 - one_hot) * cosine)
RuntimeError: CUDA error: device-side assert triggered
What does this mean? It might be connected to these warnings:
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.parallel.data_parallel.DataParallel' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.conv.Conv2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.batchnorm.BatchNorm2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.activation.PReLU' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.pooling.MaxPool2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.pooling.AdaptiveAvgPool2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.linear.Linear' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.activation.Sigmoid' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.dropout.Dropout' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/noamgat/hdd/miniconda3/envs/insightface/lib/python3.6/site-packages/torch/serialization.py:493: SourceChangeWarning: source code of class 'torch.nn.modules.batchnorm.BatchNorm1d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
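A note on the `device-side assert triggered` error above: CUDA reports such asserts asynchronously, so the line shown in the traceback is not necessarily the one that failed (running with the environment variable `CUDA_LAUNCH_BLOCKING=1` usually gives a more accurate location). One common cause, which is only an assumption here and not confirmed anywhere in this thread, is a class label outside `[0, num_classes)`, for example when the training data has more identities than the checkpoint's classification head. A minimal sketch of a check for that, using a hypothetical helper `find_out_of_range_labels` and made-up values:

```python
def find_out_of_range_labels(labels, num_classes):
    """Return (position, label) pairs whose label is not in [0, num_classes)."""
    return [(i, y) for i, y in enumerate(labels) if not 0 <= y < num_classes]


# Hypothetical values for illustration only: a head sized for 5 identities
# and a batch whose labels come from a larger dataset.
num_classes = 5
batch_labels = [0, 3, 4, 7, -1]

bad = find_out_of_range_labels(batch_labels, num_classes)
print(bad)  # [(3, 7), (4, -1)]
```

If a check like this comes back empty for every batch, the label-range explanation can be ruled out and the version mismatch becomes the more likely suspect.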
For reference, this is the conda env I set up, to most closely recreate what I saw in the README, requirements.txt and code:
name: insightface
channels:
  - pytorch
  - defaults
  - conda-forge
  - menpo
dependencies:
  - python=3.6.8
  - pytorch=1.3.0
  - matplotlib
  - scipy
  - tqdm
  - opencv
  - pillow
  - torchvision
  - numpy
  - scikit-image
  - imgaug
  - pip
  - tensorboard
  - pandas
  - pip:
    - torchsummary
    - git+https://github.com/Tramac/torchscope.git
Perhaps the pretrained models were trained on a different version than the one stated in the README? Has anyone been able to get train.py working with the pretrained models as the starting checkpoint?
I have.
In my case, I used PyTorch 1.4.0 and 1.5.1 successfully without any issues. I haven't tested with 1.3, though!
1.6.0 didn't work for me, however: after a couple of epochs I'd get NaNs. I guess this is because of the breaking changes introduced in 1.6.0.
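The version reports in this thread can be collected into a small lookup, so a training script can warn before starting. This is a sketch only: `THREAD_REPORTS`, `parse_version` and `thread_status` are hypothetical names, and the table reflects what posters here observed, not an official compatibility list.

```python
import re

# Observations from this thread only; causes are unconfirmed.
THREAD_REPORTS = {
    (1, 3, 0): "device-side assert reported",
    (1, 4, 0): "reported working",
    (1, 5, 1): "reported working",
    (1, 6, 0): "NaNs reported after a couple of epochs",
}


def parse_version(version):
    """Turn a string such as '1.5.1+cu101' into (1, 5, 1)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def thread_status(version):
    """Look up what this thread says about a given PyTorch version."""
    return THREAD_REPORTS.get(parse_version(version), "untested in this thread")


print(thread_status("1.5.1+cu101"))  # reported working
```

In a real script the argument would be `torch.__version__`; a plain string is used here so the sketch runs without PyTorch installed.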