
How to use this repo for just testing?

sandeshnaroju opened this issue 4 years ago · 11 comments

I just want to play with this repo; I don't want to train or build anything, just use it a few times. Any instructions on how to do that?

sandeshnaroju avatar Nov 28 '20 10:11 sandeshnaroju

Just do what the README says:

0. Convert mel-spectrograms: download the pre-trained AUTOVC model and run conversion.ipynb in the same directory.

1. Mel-spectrograms to waveform: download the pre-trained WaveNet vocoder model and run vocoder.ipynb in the same directory.

Please note the training metadata and testing metadata have different formats.
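
For reference, here is a rough sketch of the difference as I understand it (the entry layout is inferred from the demo metadata.pkl and from make_metadata.py, so treat the shapes and example paths as assumptions rather than an official spec):

import pickle
import numpy as np

# Conversion ("testing") metadata, as loaded by conversion.ipynb: a list of entries,
# each roughly [speaker_id, speaker_embedding (~256-dim), mel_spectrogram (T x 80)].
with open('metadata.pkl', 'rb') as f:
    conv_meta = pickle.load(f)
for entry in conv_meta:
    print('conversion:', entry[0], np.asarray(entry[1]).shape,
          [np.asarray(m).shape for m in entry[2:]])

# Training metadata (the train.pkl that make_metadata.py writes, if I read it correctly):
# each entry is roughly [speaker_id, speaker_embedding, 'p225/p225_001.npy', ...],
# i.e. relative .npy paths instead of the mel-spectrogram arrays themselves.
with open('./spmel/train.pkl', 'rb') as f:
    train_meta = pickle.load(f)
for entry in train_meta:
    print('training:', entry[0], np.asarray(entry[1]).shape, entry[2:])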

ruclion avatar Dec 23 '20 07:12 ruclion

And how do you run inference after that?

ghost avatar Jan 05 '21 11:01 ghost

The important thing is to get "metadata.pkl". You can get it by running make_spect.py and then python make_metadata.py. If you run them directly, they use the author's wavs; if you swap in your own wavs, the resulting metadata.pkl is built from your wavs. After that, read the code in conversion.ipynb and run it~
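
In case it helps, the input layout make_spect.py expects (going by the script defaults of './wavs' in and './spmel' out, as far as I can tell) is one sub-directory per speaker; a quick check:

import os

# Expected layout before running make_spect.py (adjust the paths if your copy differs):
#   wavs/<speaker_id>/<utterance>.wav  ->  spmel/<speaker_id>/<utterance>.npy
for speaker in sorted(os.listdir('./wavs')):
    files = os.listdir(os.path.join('./wavs', speaker))
    print(speaker, sum(f.endswith('.wav') for f in files), 'wav files')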

ruclion avatar Jan 06 '21 07:01 ruclion

python make_metadata.py does NOT generate "metadata.pkl". You can check the code.
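
If I read the script correctly, its last lines dump the training list to ./spmel/train.pkl rather than metadata.pkl, roughly:

# Tail of make_metadata.py (paraphrased, may differ slightly between versions):
# `speakers` holds [speaker_id, speaker_embedding, relative .npy paths...] per speaker.
with open(os.path.join(rootDir, 'train.pkl'), 'wb') as handle:
    pickle.dump(speakers, handle)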

ghost avatar Jan 06 '21 09:01 ghost

@ruclion I have the same problem: make_metadata.py does NOT generate "metadata.pkl".

aneybaby727 avatar Feb 19 '21 06:02 aneybaby727

> @ruclion I have the same problem: make_metadata.py does NOT generate "metadata.pkl".

> python make_metadata.py does NOT generate "metadata.pkl". You can check the code.

Hello, I ran into the same problem. Could you please share how you solved it? Thank you in advance!

hongchengzhu avatar Apr 09 '21 02:04 hongchengzhu

> I just want to play with this repo; I don't want to train or build anything, just use it a few times. Any instructions on how to do that?

Have you solved this? Could you share the solution, please? Thank you.

hongchengzhu avatar Apr 09 '21 02:04 hongchengzhu

> I just want to play with this repo; I don't want to train or build anything, just use it a few times. Any instructions on how to do that?

> Have you solved this? Could you share the solution, please? Thank you.

So do I. Do you have any solution?

atravler avatar Apr 17 '21 13:04 atravler

If you put only one wav file into each speaker directory, this modified make_metadata.py should work:

import os
import pickle
from model_bl import D_VECTOR
from collections import OrderedDict
import numpy as np
import torch

# Load the pretrained speaker encoder and strip the 'module.' prefix
# (7 characters) that DataParallel adds to the checkpoint keys.
C = D_VECTOR(dim_input=80, dim_cell=768, dim_emb=256).eval().cuda()
c_checkpoint = torch.load('3000000-BL.ckpt')
new_state_dict = OrderedDict()
for key, val in c_checkpoint['model_b'].items():
    new_key = key[7:]
    new_state_dict[new_key] = val
C.load_state_dict(new_state_dict)
num_uttrs = 1
len_crop = 128

# Directory containing mel-spectrograms
rootDir = './spmel'
dirName, subdirList, _ = next(os.walk(rootDir))
print('Found directory: %s' % dirName)


speakers = []

for speaker in sorted(subdirList):
    if len(speaker) != 4:  # only keep 4-character speaker IDs (e.g. p225); skip other dirs
        continue
    print('Processing speaker: %s' % speaker)
    utterances = []
    utterances.append(speaker)
    _, _, fileList = next(os.walk(os.path.join(dirName,speaker)))
    
    idx_uttrs = np.random.choice(len(fileList), size=num_uttrs, replace=False)

    embs = []
    mel_specs = []
    for i in range(num_uttrs):

        tmp = np.load(os.path.join(dirName, speaker, fileList[idx_uttrs[i]]))
        melsp = torch.from_numpy(tmp).cuda().unsqueeze(0)
        emb = C(melsp)
        embs.append(emb.detach().squeeze().cpu().numpy())   
        mel_specs.append(melsp.squeeze(0))
        
    utterances.append(np.mean(embs, axis=0))  # this is the speaker embedding
        
    
    for mel_spec in mel_specs:
        utterances.append(mel_spec.cpu().numpy())
    speakers.append(utterances)

print("len of speaker", len(speakers))
    

with open('metadata_own.pkl', 'wb') as handle:
    pickle.dump(speakers, handle)
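
For what it's worth, a quick way to sanity-check the result (metadata_own.pkl is the file name used in the script above; conversion.ipynb loads metadata.pkl by default as far as I remember, so either rename the file or edit the path in the notebook):

import pickle
import numpy as np

# Each entry should be [speaker_id, speaker_embedding, mel_spectrogram, ...],
# with one mel-spectrogram per speaker when num_uttrs = 1.
with open('metadata_own.pkl', 'rb') as f:
    metadata = pickle.load(f)

for entry in metadata:
    speaker_id, spk_emb = entry[0], entry[1]
    mels = entry[2:]
    print(speaker_id, np.asarray(spk_emb).shape, [np.asarray(m).shape for m in mels])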

jlian2 avatar Sep 30 '21 01:09 jlian2

How do I use my own source content wav and target style wav? Thank you.

dragen1860 avatar Oct 27 '21 06:10 dragen1860

@dragen1860 have you fixed the issue?

Ha0Tang avatar Jan 17 '22 12:01 Ha0Tang