How to update sample_rate of ddsp_run
I am using the DDSP train_autoencoder colab notebook here: https://colab.research.google.com/github/magenta/ddsp/blob/master/ddsp/colab/demos/train_autoencoder.ipynb
I want to train the autoencoder on 48 kHz audio instead of the default 16 kHz.
To do so, I made the following changes to sections of the notebook (adding --sample_rate=48000):
!ddsp_prepare_tfrecord \
--input_audio_filepatterns=$AUDIO_FILEPATTERN \
--output_tfrecord_path=$TRAIN_TFRECORD \
--num_shards=10 \
--sample_rate=48000 \
--alsologtostderr
Save dataset statistics for timbre transfer
import os

from ddsp.colab import colab_utils
import ddsp.training

data_provider = ddsp.training.data.TFRecordProvider(TRAIN_TFRECORD_FILEPATTERN,
                                                    sample_rate=48000)
dataset = data_provider.get_dataset(shuffle=False)
PICKLE_FILE_PATH = os.path.join(SAVE_DIR, 'dataset_statistics.pkl')
and then for training:
!ddsp_run \
--mode=train \
--alsologtostderr \
--save_dir="$SAVE_DIR" \
--gin_file=models/solo_instrument.gin \
--gin_file=datasets/tfrecord.gin \
--gin_param="TFRecordProvider.file_pattern='$TRAIN_TFRECORD_FILEPATTERN'" \
--gin_param="batch_size=16" \
--gin_param="sample_rate=48000" \
--gin_param="train_util.train.num_steps=30000" \
--gin_param="train_util.train.steps_per_save=300" \
--gin_param="trainers.Trainer.checkpoints_to_keep=10"
The result is below. I cannot figure out why it doesn't work. Can someone please tell me what I am doing wrong? Thanks very much in advance!
2021-05-26 01:50:25.950745: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
WARNING:root:Argument whitelist is deprecated. Please use allowlist.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_probability/python/internal/variadic_reduce.py:115: calling function (from tensorflow.python.eager.def_function) with experimental_compile is deprecated and will be removed in a future version.
Instructions for updating:
experimental_compile is deprecated, use jit_compile instead
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_probability/python/internal/variadic_reduce.py:115: calling function (from tensorflow.python.eager.def_function) with experimental_compile is deprecated and will be removed in a future version.
Instructions for updating:
experimental_compile is deprecated, use jit_compile instead
I0526 01:50:29.118762 139803541604224 ddsp_run.py:176] Restore Dir: /content/drive/MyDrive/drum_sample_solo/SHORT-ddsp-solo-instrument
I0526 01:50:29.119161 139803541604224 ddsp_run.py:177] Save Dir: /content/drive/MyDrive/drum_sample_solo/SHORT-ddsp-solo-instrument
I0526 01:50:29.121088 139803541604224 resource_reader.py:50] system_path_file_exists:optimization/base.gin
E0526 01:50:29.121450 139803541604224 resource_reader.py:55] Path not found: optimization/base.gin
I0526 01:50:29.125206 139803541604224 resource_reader.py:50] system_path_file_exists:eval/basic.gin
E0526 01:50:29.125523 139803541604224 resource_reader.py:55] Path not found: eval/basic.gin
I0526 01:50:29.127723 139803541604224 resource_reader.py:50] system_path_file_exists:models/solo_instrument.gin
E0526 01:50:29.127967 139803541604224 resource_reader.py:55] Path not found: models/solo_instrument.gin
I0526 01:50:29.128289 139803541604224 resource_reader.py:50] system_path_file_exists:models/ae.gin
E0526 01:50:29.128505 139803541604224 resource_reader.py:55] Path not found: models/ae.gin
I0526 01:50:29.135938 139803541604224 resource_reader.py:50] system_path_file_exists:datasets/tfrecord.gin
E0526 01:50:29.136190 139803541604224 resource_reader.py:55] Path not found: datasets/tfrecord.gin
I0526 01:50:29.136540 139803541604224 resource_reader.py:50] system_path_file_exists:datasets/base.gin
E0526 01:50:29.136765 139803541604224 resource_reader.py:55] Path not found: datasets/base.gin
I0526 01:50:29.169140 139803541604224 train_util.py:78] Defaulting to MirroredStrategy
2021-05-26 01:50:29.170495: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-05-26 01:50:29.179300: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.179911: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-05-26 01:50:29.179946: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-05-26 01:50:29.182908: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-05-26 01:50:29.182982: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-05-26 01:50:29.184593: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-05-26 01:50:29.184967: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-05-26 01:50:29.186709: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.10
2021-05-26 01:50:29.187344: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-05-26 01:50:29.187547: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-05-26 01:50:29.187663: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.188345: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.188900: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-05-26 01:50:29.189253: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX512F
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-26 01:50:29.189600: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.190186: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-05-26 01:50:29.190261: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.190840: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.191472: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-05-26 01:50:29.191521: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-05-26 01:50:29.701107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-26 01:50:29.701162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-05-26 01:50:29.701171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-05-26 01:50:29.701381: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.702050: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.702700: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.703268: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-05-26 01:50:29.703314: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13787 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:04.0, compute capability: 7.0)
WARNING:tensorflow:Collective ops is not configured at program startup. Some performance features may not be enabled.
W0526 01:50:29.705162 139803541604224 mirrored_strategy.py:379] Collective ops is not configured at program startup. Some performance features may not be enabled.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0526 01:50:29.708120 139803541604224 mirrored_strategy.py:369] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
2021-05-26 01:50:30.023877: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-05-26 01:50:30.024463: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2000179999 Hz
2021-05-26 01:50:30.091103: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091209: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091287: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091416: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091468: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091860: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.092173: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.092210: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
Traceback (most recent call last):
  File "/usr/local/bin/ddsp_run", line 8, in <module>
colab_utils.save_dataset_statistics(data_provider, PICKLE_FILE_PATH, batch_size=1)
Did you ever make any progress on this? I'd be interested in this as well.
I think I managed to update all the numbers for sample_rate = 44100. For 48000 it should be a bit easier.
The first step is to set the sample rate and everything that depends on it:
--gin_param="TFRecordProvider.sample_rate=44100" \
--gin_param="Harmonic.sample_rate=44100" \
--gin_param="FilteredNoise.n_samples=176400" \
--gin_param="Harmonic.n_samples=176400" \
--gin_param="Reverb.reverb_length=132300"
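My files are 4 seconds long at 44100 Hz (4 × 44100 = 176400 samples), and I want a 3-second reverb (3 × 44100 = 132300 samples). For 48000 Hz you can probably stop here, because the default frame rate of 250 divides 48000 evenly.
At 44100 Hz, however, I then got this exception: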
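The sample counts above are just duration times sample rate. Below is a minimal sketch of that arithmetic; the helper name `derived_sizes` is made up for illustration and is not part of DDSP, and the 4-second clip and 3-second reverb are the values from this post, so adjust them to your own data.

```python
def derived_sizes(sample_rate, clip_seconds, reverb_seconds):
    """Return (n_samples, reverb_length) for the given durations.

    Hypothetical helper: it only multiplies durations by the sample rate,
    which is how the numbers in the gin_param flags above were obtained.
    """
    n_samples = int(round(sample_rate * clip_seconds))
    reverb_length = int(round(sample_rate * reverb_seconds))
    return n_samples, reverb_length


# Values used in this thread: 4 s clips, 3 s reverb.
print(derived_sizes(44100, 4, 3))  # (176400, 132300)
print(derived_sizes(48000, 4, 3))  # (192000, 144000)
```

For 48000 Hz the same flags would then presumably take 192000 and 144000 in place of 176400 and 132300.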
ValueError: For upsampling, the target the number of timesteps must be divisible by the number of input frames. (timesteps:176400, frames:1001, add_endpoint=True).
What it actually says is that it cannot upsample the loudness/f0 values (which have a default frame rate of 250, i.e. 1000 frames for 4 seconds) from 1000 frames to 176400 samples, because 176400 is not a multiple of 1000. So the frame rate needs to be a divisor of 44100, e.g. 210. That means re-creating your dataset with
ddsp_prepare_tfrecord --frame_rate=210
and then adding
--gin_param='F0LoudnessPreprocessor.time_steps=840' --gin_param="TFRecordProvider.frame_rate=210"
to the ddsp_run call.
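To pick a frame rate for another sample rate, the divisibility reasoning above can be sketched as follows. This is only an illustration of the condition described in this post (frame rate must divide the sample rate), not the check DDSP itself performs; the function names and the 100 to 300 search range are assumptions.

```python
def divisor_frame_rates(sample_rate, lowest=100, highest=300):
    """Frame rates in [lowest, highest] that divide sample_rate evenly.

    Hypothetical helper; the search range is an arbitrary choice.
    """
    return [fr for fr in range(lowest, highest + 1) if sample_rate % fr == 0]


def frame_settings(sample_rate, frame_rate, clip_seconds):
    """Return (time_steps, samples_per_frame) for a whole-second clip."""
    if sample_rate % frame_rate != 0:
        raise ValueError("frame_rate must divide sample_rate evenly")
    time_steps = frame_rate * clip_seconds   # value for F0LoudnessPreprocessor.time_steps
    n_samples = sample_rate * clip_seconds   # value for Harmonic.n_samples
    return time_steps, n_samples // time_steps


print(250 in divisor_frame_rates(48000))  # True: the default works at 48 kHz
print(250 in divisor_frame_rates(44100))  # False: hence the switch to 210
print(frame_settings(44100, 210, 4))      # (840, 210)
```

The 840 here matches the time_steps value passed to ddsp_run above for 4-second clips at a frame rate of 210.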
Wow! Thanks a lot, @nglazyrin! I am going to try your method for my 44.1 kHz audio.