autovc
What is the format of the metadata?
I want to try another audio, so I checked the data inside, but I don't know what the second field is. The first one is the name, and the third is the mel-spectrogram.
And does this apply to Chinese audio, or do I need to retrain the model with Chinese data? Thanks!
For Chinese audio, you need to retrain the model and retune the hyperparameters.
What is the difference between the train and test metadata? I created metadata from Persian waves, but its format is not like yours. I can train the network, but I can't test it. The third section of my metadata is the path of the .npy files created by make_spect.py. Please help me; sorry, I'm confused. Thanks a lot.
The metadata is different depending on the use case. It is nothing but some sort of nested list. You can easily make your own by looking into one of these metadata files.
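For anyone else stuck here, a minimal sketch for inspecting one of these files, assuming it was pickled the same way as the metadata.pkl shipped with the repo:

```python
import pickle

import numpy as np

# Load the pickled metadata; it is a nested list with one entry per speaker.
with open('metadata.pkl', 'rb') as f:
    metadata = pickle.load(f)

for entry in metadata:
    speaker_id, embedding = entry[0], entry[1]
    print(speaker_id, np.asarray(embedding).shape)
    # What follows the first two fields depends on the use case:
    # path strings to .npy spectrograms for training, or the raw
    # mel-spectrogram arrays for testing/conversion.
    for item in entry[2:]:
        print('  ', item if isinstance(item, str) else np.asarray(item).shape)
```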
Thank you for your explanation. I can't understand what the third section is or how to generate it. What is the array that's highlighted in the picture?
Thanks for your support.
Can you print the shape of it?
Just let me know the shape.
The shape of your metadata is (4, 3), but mine is (2,).
I mean the shape of the 3rd section.
Sorry, my fault. The third section is a string containing the paths of the spectrograms, like this: 's1\p1_1.npy', 's1\p1_2.npy', 's1\p1_3.npy'
"I can't understand what is the third section and how to generate it? What is array that's highlight in picture?"
This was your original question. What is the shape of the 3rd section you were refering to?
The picture I sent was related to your metadata; the shape of the 3rd section of your metadata is (3,). I want to generate my metadata like the one in the picture. Sorry if I didn't explain well.
There are definitely more than 3 elements in your highlighted area.
I'm so sorry, my fault again. The shapes are (90, 80), (89, 80), (75, 80), and (109, 80). The metadata includes 4 speakers.
These are the spectrograms.
So what is the previous array in the second section? Can you send me the Python file? I'm so confused. Thanks again for your good support.
Again, the shape please.
Also, where did you get that metadata?
The shapes are (256,), (256,), (256,), (256,). I sent you an email, and you sent me your project.
Those are the speaker embeddings.
In that case, you already had the code to generate this. If not, you can write your own very easily. I don't keep the code because it is too simple.
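Something like the sketch below may do, under the assumption that the spectrograms from make_spect.py are on disk and you already have one (256,) embedding per speaker; the directory layout and file names here are made up for illustration:

```python
import os
import pickle

import numpy as np

# Hypothetical layout: one folder of .npy mel-spectrograms per speaker
# under spect_dir, plus a precomputed (256,) speaker embedding saved
# next to it as <speaker>_emb.npy. Adjust to your own setup.
spect_dir = './spmel'
speakers = ['s1', 's2']

metadata = []
for spk in speakers:
    emb = np.load(os.path.join(spect_dir, spk + '_emb.npy'))  # shape (256,)
    entry = [spk, emb]
    for fname in sorted(os.listdir(os.path.join(spect_dir, spk))):
        if fname.endswith('.npy'):
            # For test metadata, append the mel-spectrogram itself
            # (shape (T, 80)); for training metadata you would append
            # the relative path string instead.
            entry.append(np.load(os.path.join(spect_dir, spk, fname)))
    metadata.append(entry)

with open('metadata.pkl', 'wb') as f:
    pickle.dump(metadata, f)
```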
@mhosein4 did you understand the metadata format? Because I'm trying to run this code now, and I can see that the "metadata.pkl" file in the git is NOT the same as the metadata file that would be generated by "make_metadata.py".
in "metadata.pkl" for every singer there are:
- str for the id of the singer
- embedding
- mel-spec for the songs in the dataset
but when generating a metadata file with "make_metadata.py", it generates:
- a str for the id of the singer
- the embedding
- the names, of type string(!), of the songs in the dataset.
So I can't use it... I saw in another issue that someone said the metadata for training and testing is different, but I can't understand how and where in the code.
Thanks!
@amiteliav in case it's still relevant, you can find an end-to-end implementation in this repo/notebook: https://github.com/KnurpsBram/AutoVC_WavenetVocoder_GriffinLim_experiments/blob/master/AutoVC_WavenetVocoder_GriffinLim_experiments_17jun2020.ipynb
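For readers hitting the same mismatch, a minimal sketch of bridging the two formats by loading the referenced .npy files, assuming the fields from the third onward hold paths relative to the spectrogram root (the 'train.pkl' file name here is an assumption):

```python
import os
import pickle

import numpy as np

root = './spmel'  # directory that make_spect.py wrote the .npy files into

# Training-style metadata: [speaker_id, embedding, path, path, ...]
with open(os.path.join(root, 'train.pkl'), 'rb') as f:
    train_meta = pickle.load(f)

# Replace each path string with the actual mel-spectrogram array to get
# the test-style layout of the metadata.pkl shipped with the repo.
test_meta = []
for entry in train_meta:
    speaker_id, embedding = entry[0], entry[1]
    mels = [np.load(os.path.join(root, p)) for p in entry[2:]]
    test_meta.append([speaker_id, embedding] + mels)

with open('metadata.pkl', 'wb') as f:
    pickle.dump(test_meta, f)
```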