autovc
How to use this repo for just testing?
I just want to play with this repo; I don't want to train or build anything, just use it a few times. Any instructions on how to do that?
Just do what the README says~
0. Convert mel-spectrograms: download the pre-trained AUTOVC model and run conversion.ipynb in the same directory.
1. Mel-spectrograms to waveform: download the pre-trained WaveNet vocoder model and run vocoder.ipynb in the same directory (a sketch of this step follows below).
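For step 1, here is roughly what vocoder.ipynb does, reconstructed from memory of the repo (the checkpoint name checkpoint_step001000000_ema.pth and the build_model/wavegen helpers come from the repo's synthesis.py, and I swapped the deprecated librosa.output.write_wav for soundfile, so verify against your copy):

```python
import pickle
import torch
import soundfile as sf  # replaces the deprecated librosa.output.write_wav
from synthesis import build_model, wavegen

# results.pkl is written by conversion.ipynb (step 0)
spect_vc = pickle.load(open('results.pkl', 'rb'))

device = torch.device('cuda')
model = build_model().to(device)
checkpoint = torch.load('checkpoint_step001000000_ema.pth')
model.load_state_dict(checkpoint['state_dict'])

for name, c in spect_vc:
    print(name)
    waveform = wavegen(model, c=c)  # WaveNet synthesis; slow on long clips
    sf.write(name + '.wav', waveform, 16000)
```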
Please note the training metadata and testing metadata have different formats.
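Concretely, I believe the two layouts are as follows (inferred from make_metadata.py and conversion.ipynb; treat the shapes and paths as assumptions and check them against your copies):

```python
import pickle

# Training metadata: ./spmel/train.pkl, written by make_metadata.py.
# Each entry stores *paths* to mel-spectrograms for the data loader:
#   [speaker_id, speaker_embedding (256,), 'p225/p225_001.npy', ...]
train_meta = pickle.load(open('./spmel/train.pkl', 'rb'))
print(train_meta[0][0], train_meta[0][1].shape, train_meta[0][2])

# Testing metadata: metadata.pkl, shipped with the demo and read by
# conversion.ipynb. Each entry stores the mel-spectrogram *array* itself:
#   [speaker_id, speaker_embedding (256,), mel_spectrogram (T, 80)]
test_meta = pickle.load(open('metadata.pkl', 'rb'))
print(test_meta[0][0], test_meta[0][1].shape, test_meta[0][2].shape)
```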
And how do I run inference after that?
The important thing is to get "metadata.pkl". You can get it by running make_spect.py followed by python make_metadata.py. If you run them directly, they use the author's wavs; if you change the wavs to your own, metadata.pkl will be built from your wavs. Then read the code in conversion.ipynb and run it~
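For reference, the core of conversion.ipynb is roughly the following (reconstructed from memory of the repo's notebook; the Generator(32, 256, 512, 32) signature and the 'autovc.ckpt' checkpoint name are assumptions to verify against your copy):

```python
import pickle
from math import ceil

import numpy as np
import torch

from model_vc import Generator

def pad_seq(x, base=32):
    # pad the mel-spectrogram so its length is a multiple of 32
    len_out = int(base * ceil(float(x.shape[0]) / base))
    len_pad = len_out - x.shape[0]
    return np.pad(x, ((0, len_pad), (0, 0)), 'constant'), len_pad

device = 'cuda:0'
G = Generator(32, 256, 512, 32).eval().to(device)
g_checkpoint = torch.load('autovc.ckpt', map_location=device)
G.load_state_dict(g_checkpoint['model'])

metadata = pickle.load(open('metadata.pkl', 'rb'))
spect_vc = []
for sbmt_i in metadata:                      # source speaker
    x_org, len_pad = pad_seq(sbmt_i[2])
    uttr_org = torch.from_numpy(x_org[np.newaxis, :, :]).to(device)
    emb_org = torch.from_numpy(sbmt_i[1][np.newaxis, :]).to(device)
    for sbmt_j in metadata:                  # target speaker
        emb_trg = torch.from_numpy(sbmt_j[1][np.newaxis, :]).to(device)
        with torch.no_grad():
            _, x_identic_psnt, _ = G(uttr_org, emb_org, emb_trg)
        uttr_trg = x_identic_psnt[0, 0].cpu().numpy()
        if len_pad > 0:
            uttr_trg = uttr_trg[:-len_pad]   # strip the padding again
        spect_vc.append(('{}x{}'.format(sbmt_i[0], sbmt_j[0]), uttr_trg))

with open('results.pkl', 'wb') as handle:
    pickle.dump(spect_vc, handle)            # consumed by vocoder.ipynb
```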
python make_metadata.py does NOT generate "metadata.pkl". You can check the code.
@ruclion I have the same problem make_metadata.py does NOT generate "metadata.pkl".
> python make_metadata.py does NOT generate "metadata.pkl". You can check the code.
Hello, I ran into the same problem. Could you share how you solved it? Thank you in advance!
> I just want to play with this repo; I don't want to train or build anything, just use it a few times. Any instructions on how to do that?
Have you solved this? Could you share the solution, please? Thank you.
Same here. Do you have any solution?
If you put only one wav file into each speaker directory, this modified make_metadata.py should work:
```python
import os
import pickle
from collections import OrderedDict

import numpy as np
import torch

from model_bl import D_VECTOR

# Load the pre-trained speaker encoder
C = D_VECTOR(dim_input=80, dim_cell=768, dim_emb=256).eval().cuda()
c_checkpoint = torch.load('3000000-BL.ckpt')
new_state_dict = OrderedDict()
for key, val in c_checkpoint['model_b'].items():
    new_state_dict[key[7:]] = val  # strip the 'module.' prefix left by DataParallel
C.load_state_dict(new_state_dict)

num_uttrs = 1   # utterances per speaker; each speaker dir needs at least this many files
len_crop = 128  # unused in this simplified script

# Directory containing mel-spectrograms (one subdirectory per speaker)
rootDir = './spmel'
dirName, subdirList, _ = next(os.walk(rootDir))
print('Found directory: %s' % dirName)

speakers = []
for speaker in sorted(subdirList):
    if len(speaker) != 4:  # skip folders that are not VCTK-style speaker IDs
        continue
    print('Processing speaker: %s' % speaker)
    utterances = [speaker]
    _, _, fileList = next(os.walk(os.path.join(dirName, speaker)))
    idx_uttrs = np.random.choice(len(fileList), size=num_uttrs, replace=False)
    embs = []
    mel_specs = []
    for i in range(num_uttrs):
        tmp = np.load(os.path.join(dirName, speaker, fileList[idx_uttrs[i]]))
        melsp = torch.from_numpy(tmp).cuda().unsqueeze(0)
        with torch.no_grad():
            emb = C(melsp)
        embs.append(emb.detach().squeeze().cpu().numpy())
        mel_specs.append(melsp.squeeze(0))
    utterances.append(np.mean(embs, axis=0))  # the speaker embedding
    for mel_spec in mel_specs:
        utterances.append(mel_spec.cpu().numpy())  # the mel-spectrogram array(s)
    speakers.append(utterances)

print('number of speakers:', len(speakers))
with open('metadata_own.pkl', 'wb') as handle:
    pickle.dump(speakers, handle)
```
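To use it: run python make_spect.py first so that ./spmel contains one .npy mel-spectrogram per wav, then run the script above, and finally point conversion.ipynb at the new file, e.g.:

```python
# in conversion.ipynb, load the metadata built from your own wavs
metadata = pickle.load(open('metadata_own.pkl', 'rb'))
```

Note that the len(speaker) != 4 filter assumes VCTK-style four-character speaker IDs (e.g. p225); rename your speaker folders or adjust that check for your own data.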
How do I use my own source content wav and target style wav? Thank you.
@dragen1860 have you fixed the issue?