
Error UnpicklingError: invalid load key, '<'. at the start of [training]

Open Rleuc opened this issue 2 years ago • 15 comments

Hello - I'm having issues with the training step on the Google Colab notebook. I've tried 2 different datasets, both created with the latest Windows version. All prior steps in the Colab run fine. I copied the dataset to Google Drive both as a zip and as unzipped files, with no difference. Runtime is set to GPU. Any help is appreciated. Thx.

INFO:root:Setting batch size to 38, learning rate to 0.0003082207001484488. (15GB GPU memory free)
INFO:root:Loading model...
INFO:root:Loaded model
INFO:root:Loading data...
2 train files, 1 test files
INFO:root:Loaded data

UnpicklingError Traceback (most recent call last)
in ()
20 iters_per_checkpoint=checkpoint_frequency,
21 iters_per_backup_checkpoint=backup_checkpoint_frequency,
---> 22 train_size=1-validation_size,
23 )

3 frames
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
775 "functionality.")
776
--> 777 magic_number = pickle_module.load(f, **pickle_load_args)
778 if magic_number != MAGIC_NUMBER:
779 raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '<'.

Rleuc avatar Feb 22 '22 18:02 Rleuc

Oh good, someone else is hitting this too. I'm getting the same error.

INFO:root:Setting batch size to 28, learning rate to 0.0002645751311064591. (15GB GPU memory free)
INFO:root:Loading model...
INFO:root:Loaded model
INFO:root:Loading data...
1690 train files, 423 test files
INFO:root:Loaded data

UnpicklingError Traceback (most recent call last)
in ()
20 iters_per_checkpoint=checkpoint_frequency,
21 iters_per_backup_checkpoint=backup_checkpoint_frequency,
---> 22 train_size=1-validation_size,
23 )

3 frames
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
775 "functionality.")
776
--> 777 magic_number = pickle_module.load(f, **pickle_load_args)
778 if magic_number != MAGIC_NUMBER:
779 raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '<'.

mellow-tnk avatar Feb 23 '22 19:02 mellow-tnk

Hi @Rleuc and @mellow-tnk. Could you please link me to a zip of your dataset so I can test (or send it to me via email if you'd like to keep it private)?

BenAAndrew avatar Feb 25 '22 14:02 BenAAndrew

@BenAAndrew

Hi Ben,

Here's a Mega download of the wav dataset. It's just the Attenborough set, limited to 500 wavs: https://mega.nz/file/xUkkDJaL#RfRvq_v-jPV3E1lWeQwm5GiXRlEmVNdv3H2-NTFtbXg

It gives the same error: "UnpicklingError: invalid load key, '<'."

mellow-tnk avatar Feb 26 '22 16:02 mellow-tnk

@BenAAndrew

I think the issue has to do with the checkpoint path. If there's no checkpoint, it defaults to None and an error occurs when it tries to load it. If you add a checkpoint, the Google Colab works fine.

To get around the error for now, run it offline first to produce a checkpoint and use that as the starting point.

mellow-tnk avatar Feb 27 '22 02:02 mellow-tnk

@BenAAndrew I sent you a link to a Google Drive file with my dataset.

@mellow-tnk I'd like to try your solution, but how exactly do you run a Colab offline?

Thanks!

Rleuc avatar Feb 27 '22 14:02 Rleuc

@Rleuc

Just install it on your computer using the installation guide: https://github.com/BenAAndrew/Voice-Cloning-App/blob/main/install.md

mellow-tnk avatar Feb 27 '22 19:02 mellow-tnk

Thanks @mellow-tnk. Unfortunately I'm working with an AMD graphics card, and local training is not an option with the CPU-only version.

@BenAAndrew I tried with the full David A. dataset and get the same error.

Rleuc avatar Feb 28 '22 14:02 Rleuc

Even the full dataset will give the same error. I think the problem is just that the checkpoint path defaults to None and the code tries to load it instead of creating a new one.

mellow-tnk avatar Feb 28 '22 16:02 mellow-tnk

@BenAAndrew Hi, would it be possible to create an empty or initial checkpoint file manually to get past this error? If so, what is the format? I just don't have a local NVIDIA card to start with. Thanks

Rleuc avatar Mar 05 '22 11:03 Rleuc

@Rleuc Uberduck is supporting this. You can go to their Discord and use their other voice cloning Colab in the meantime. It's in the #notebook room, and there's a lot of active participation and people helping each other.

mellow-tnk avatar Mar 05 '22 15:03 mellow-tnk

> @Rleuc uberduck is supporting this. you can go to their discord and use their other voice cloning colab in the meantime. It's in the #notebook room and there's a lot of active participation and people helping each other.

Can you send the link?

Zhongli0401 avatar Aug 16 '22 21:08 Zhongli0401

@steve4655 @Rleuc I have found a solution to this. Take the pretrained.pt file that gets downloaded into the notebook's files section and place it in the checkpoints folder in your Drive. Also, the transfer learning path should be set to 'None'. I also had to edit the checkpoint path in the notebook directly, as the form was not offering it in the drop-down. This should get the training started.

AdityaThakur1 avatar Aug 25 '22 15:08 AdityaThakur1

Just using 'transfer_learning_path=None' did the job for me! Thanks :)

g2zer0 avatar Aug 26 '22 01:08 g2zer0

@g2zer0 Using transfer_learning_path is very beneficial for the quality of your model. Setting the transfer learning path to None will work, but it's not recommended. I've been training a model for quite some time now without transfer learning, and although the voice itself sounds great, all it says is gibberish.

The unpickling error happens because the automatic download that the Colab performs with gdown isn't working properly. The pretrained model is too large for Google Drive's virus scan, so instead of downloading the model, gdown downloads Google Drive's HTML virus-scan warning page. That is why the invalid load key is '<': it is the first character of the HTML that torch tries to unpickle.
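To make the connection to the error message concrete, here is a minimal sketch using only the standard library. The HTML bytes are a made-up stand-in for whatever Google Drive actually returns (an assumption, not the real page); the point is only that unpickling anything starting with '<' fails in the same way as the traceback above.

```python
import pickle

# Hypothetical stand-in for the HTML page saved in place of the checkpoint.
html_bytes = b"<!DOCTYPE html><html><body>placeholder warning page</body></html>"

try:
    pickle.loads(html_bytes)
    error_message = None
except pickle.UnpicklingError as exc:
    # '<' is not a valid pickle opcode, so the very first byte is rejected.
    error_message = str(exc)

print(error_message)  # invalid load key, '<'.
```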

You can verify this by going into the Colab file explorer and downloading the failed model from /content/Voice-Cloning-App/pretrained.pt. It'll be there after running the "Install and import ML packages" section.
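If you'd rather check from a notebook cell than download the file, something like the sketch below should work. `looks_like_html` is a hypothetical helper written for this comment, not part of Voice-Cloning-App or gdown; it only peeks at the first bytes of the file. The two temporary files are made-up samples so the snippet runs anywhere.

```python
import os
import tempfile


def looks_like_html(path, probe_size=64):
    """Return True if the file's first non-whitespace byte is '<'."""
    with open(path, "rb") as f:
        head = f.read(probe_size)
    # Drop a UTF-8 byte-order mark, if present, before checking.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    return head.lstrip().startswith(b"<")


def _write_temp(data):
    """Write bytes to a temporary file and return its path (demo only)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


# Made-up samples: an HTML page and some non-HTML binary data.
html_path = _write_temp(b"<!DOCTYPE html><html>...</html>")
binary_path = _write_temp(b"\x80\x02 stand-in for binary checkpoint data")

html_result = looks_like_html(html_path)
binary_result = looks_like_html(binary_path)

os.remove(html_path)
os.remove(binary_path)
```

In the Colab you would call `looks_like_html("/content/Voice-Cloning-App/pretrained.pt")`; a result of True suggests the download produced the warning page rather than the model.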

The gdown maintainers are aware that large files can't be downloaded with gdown, and you can read about it here.

In short, download tacotron2_statedict.pt (the file the notebook saves as pretrained.pt) manually in a browser, put it somewhere in your Drive, and change transfer_learning_path to that new location.
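As a rough guard before starting training, you could sanity-check the file you point transfer_learning_path at. This is only a sketch: `resolve_transfer_learning_path` is a hypothetical helper, and the 1 MB threshold is an assumption based on the real model being far larger than an HTML warning page.

```python
import os

# Assumed threshold: an HTML warning page is a few KB, the real model is much larger.
MIN_CHECKPOINT_BYTES = 1_000_000


def resolve_transfer_learning_path(candidate, min_bytes=MIN_CHECKPOINT_BYTES):
    """Return candidate if it exists and is plausibly a real checkpoint."""
    if not os.path.isfile(candidate):
        raise FileNotFoundError(f"No file at {candidate}")
    size = os.path.getsize(candidate)
    if size < min_bytes:
        raise ValueError(
            f"{candidate} is only {size} bytes; it may be an HTML page, not a model"
        )
    return candidate
```

You would then pass the returned value as transfer_learning_path, using wherever you put the file in your Drive as `candidate`.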

colossatr0n avatar Feb 16 '23 04:02 colossatr0n

@colossatr0n ah, that explains the gibberish result :) thank you for explaining.

g2zer0 avatar Feb 16 '23 05:02 g2zer0