matchering

Preserve song length

Open demesm opened this issue 1 year ago • 3 comments

Is there a way to preserve the song length? I've noticed it speeds up my audio, usually cutting 10-20 s, and changes the pitch quite a bit.

demesm, Aug 29 '24 16:08

Same here. Did you solve this? I am using the ComfyUI node version. It worked fine for one song, then did this consistently afterwards.

mdkberry, Aug 24 '25 22:08

I'm going to experiment with this and see what I can figure out. I just realised the behaviour depends on the reference MP3 audio file: one of mine triggers it, the other does not. Below is the command prompt output for each run, pasted as-is, so maybe there is a clue in there. I suspect the format of the MP3 is causing it.

THIS MP3 REF AUDIO FILE DIDN'T SPEED THE SONG UP:

Loading and analysis
TARGET audio length: 16571412 samples (0:05:45)
Resampling TARGET audio from 48000 Hz to 44100 Hz...
The TARGET audio sample rate and internal sample rate were different. The TARGET audio was resampled
REFERENCE audio length: 11099520 samples (0:04:11)

Matching levels
The maximum size of the analyzed piece: 661500.0 samples or 15.00 seconds
Normalizing the REFERENCE...
The REFERENCE was normalized. Final amplitude coefficient for the TARGET audio is: -0.2103 dB
Calculating mid and side channels of the TARGET...
The TARGET will be didived into 24 pieces
One piece of the TARGET has a length of 634374 samples or 14.38 seconds
Calculating RMSes of the TARGET pieces...
Extracting the loudest pieces of the TARGET audio with the RMS value more than average -13.3045 dB...
The current average RMS value in the loudest pieces is -12.6449 dB
Calculating mid and side channels of the REFERENCE...
The REFERENCE will be didived into 17 pieces
One piece of the REFERENCE has a length of 652912 samples or 14.81 seconds
Calculating RMSes of the REFERENCE pieces...
Extracting the loudest pieces of the REFERENCE audio with the RMS value more than average -17.4749 dB...
The current average RMS value in the loudest pieces is -16.8359 dB
The RMS coefficient is: -4.1910 dB
Modifying the amplitudes of the TARGET audio...
Modifying the amplitudes of the extracted loudest TARGET pieces...

Matching frequencies
Calculating the mid FIR for the matching EQ...
Calculating the side FIR for the matching EQ...
Convolving the TARGET audio with calculated FIRs...
The convolution is done in 1.37 seconds
Converting MS to LR...

Correcting levels
Applying RMS correction #1...
Calculating RMSes of the RESULT pieces...
The current average RMS value in the loudest pieces is -16.7087 dB
The RMS coefficient is: -0.1272 dB
Modifying the amplitudes of the RESULT audio...
Applying RMS correction #2...
Calculating RMSes of the RESULT pieces...
The current average RMS value in the loudest pieces is -16.8359 dB
The RMS coefficient is: 0.0000 dB
Modifying the amplitudes of the RESULT audio...
Applying RMS correction #3...
Calculating RMSes of the RESULT pieces...
The current average RMS value in the loudest pieces is -16.8359 dB
The RMS coefficient is: -0.0000 dB
Modifying the amplitudes of the RESULT audio...
Applying RMS correction #4...
Calculating RMSes of the RESULT pieces...
The current average RMS value in the loudest pieces is -16.8359 dB
The RMS coefficient is: 0.0000 dB
Modifying the amplitudes of the RESULT audio...

Final processing and saving
The amplitude of the normalized RESULT should be adjusted by -1.1010 dB
And by -0.2103 dB after applying some brickwall limiter to it
The limiter is started. Preparing the gain envelope...
The limiter is not needed!

The task is completed
Prompt executed in 14.05 seconds

THIS ONE SPED THE SONG UP:

Loading and analysis
TARGET audio length: 16571412 samples (0:05:45)
Resampling TARGET audio from 48000 Hz to 44100 Hz...
The TARGET audio sample rate and internal sample rate were different. The TARGET audio was resampled
REFERENCE audio length: 17062884 samples (0:05:55)
Resampling REFERENCE audio from 48000 Hz to 44100 Hz...
The REFERENCE audio was resampled

Matching levels
The maximum size of the analyzed piece: 661500.0 samples or 15.00 seconds
Normalizing the REFERENCE...
The REFERENCE was not changed. There is no final amplitude coefficient
Calculating mid and side channels of the TARGET...
The TARGET will be didived into 24 pieces
One piece of the TARGET has a length of 634374 samples or 14.38 seconds
Calculating RMSes of the TARGET pieces...
Extracting the loudest pieces of the TARGET audio with the RMS value more than average -13.3045 dB...
The current average RMS value in the loudest pieces is -12.6449 dB
Calculating mid and side channels of the REFERENCE...
The REFERENCE will be didived into 24 pieces
One piece of the REFERENCE has a length of 653188 samples or 14.81 seconds
Calculating RMSes of the REFERENCE pieces...
Extracting the loudest pieces of the REFERENCE audio with the RMS value more than average -17.0143 dB...
The current average RMS value in the loudest pieces is -15.3745 dB
The RMS coefficient is: -2.7296 dB
Modifying the amplitudes of the TARGET audio...
Modifying the amplitudes of the extracted loudest TARGET pieces...

Matching frequencies
Calculating the mid FIR for the matching EQ...
Calculating the side FIR for the matching EQ...
Convolving the TARGET audio with calculated FIRs...
The convolution is done in 1.35 seconds
Converting MS to LR...

Correcting levels
Applying RMS correction #1...
Calculating RMSes of the RESULT pieces...
The current average RMS value in the loudest pieces is -15.4777 dB
The RMS coefficient is: 0.1032 dB
Modifying the amplitudes of the RESULT audio...
Applying RMS correction #2...
Calculating RMSes of the RESULT pieces...
The current average RMS value in the loudest pieces is -15.3745 dB
The RMS coefficient is: 0.0000 dB
Modifying the amplitudes of the RESULT audio...
Applying RMS correction #3...
Calculating RMSes of the RESULT pieces...
The current average RMS value in the loudest pieces is -15.3745 dB
The RMS coefficient is: 0.0000 dB
Modifying the amplitudes of the RESULT audio...
Applying RMS correction #4...
Calculating RMSes of the RESULT pieces...
The current average RMS value in the loudest pieces is -15.3745 dB
The RMS coefficient is: 0.0000 dB
Modifying the amplitudes of the RESULT audio...

Final processing and saving
The amplitude of the normalized RESULT should be adjusted by -0.0187 dB
The limiter is started. Preparing the gain envelope...
The limiter is not needed!

The task is completed
Prompt executed in 24.40 seconds
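One way to read the "audio length" lines in the two logs is to check which sample rate makes the sample count agree with the printed duration. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the logged duration is the sample count divided by the file's own sample rate, truncated to whole seconds.

```python
def format_duration(samples: int, sample_rate: int) -> str:
    """Format a sample count as h:mm:ss, truncating to whole seconds (assumed)."""
    total_seconds = samples // sample_rate
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def infer_rates(samples: int, logged: str, candidates=(44100, 48000)) -> list:
    """Return the candidate sample rates consistent with a logged duration."""
    return [rate for rate in candidates if format_duration(samples, rate) == logged]


# Values copied from the logs in this thread.
target = infer_rates(16571412, "0:05:45")         # TARGET in both runs
reference_ok = infer_rates(11099520, "0:04:11")   # REFERENCE, run without speed-up
reference_bad = infer_rates(17062884, "0:05:55")  # REFERENCE, run with speed-up
```

Under that assumption, the target and the reference from the sped-up run are only consistent with 48000 Hz, while the reference from the good run is only consistent with 44100 Hz. That matches the "Resampling ... from 48000 Hz to 44100 Hz" lines in the logs.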

mdkberry, Aug 24 '25 23:08

Solved it in my case.

I used Audacity (FOSS) to convert the reference MP3 to 44.1 kHz. It was 48 kHz, and this was causing the speed-up.

My target audio was also 48 kHz, but that gets resampled. I don't think the reference audio gets resampled, given that the reference was the only thing I changed before testing again. I can confirm the fix worked for me in this case. There might be other scenarios causing the speed-up.
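For a rough sense of scale, here is a small arithmetic sketch of what happens when audio sampled at 44.1 kHz is played back as if it were 48 kHz. This is only one hypothetical explanation for the symptom. The thread does not establish where in the pipeline such a mismatch would occur, and the function name is made up for illustration.

```python
import math


def mislabel_effect(duration_s: float, actual_rate: int, playback_rate: int):
    """Return (speed factor, new duration in s, pitch shift in semitones)
    when samples recorded at actual_rate are played at playback_rate."""
    speed = playback_rate / actual_rate
    new_duration = duration_s / speed
    semitones = 12 * math.log2(speed)
    return speed, new_duration, semitones


# 345 s is the 0:05:45 target length from the logs above.
speed, new_duration, semitones = mislabel_effect(345, 44100, 48000)
```

Under that assumption the track would play about 8.8% faster, lose roughly 28 seconds, and rise in pitch by about one and a half semitones. That is the same kind of change described in the original report.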

I would also recommend normalising the reference audio (not the target audio) in Audacity. Setting it to 44.1 kHz and exporting at 320 kbps seems to give the best results for me.
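If you want to check a file's sample rate before feeding it in, a minimal sketch with Python's standard `wave` module could look like this. It only reads WAV files, so it assumes the reference has been exported as WAV first (an assumption, since the files in this thread are MP3). The function names are hypothetical, and 44100 is the internal rate named in the logs.

```python
import wave

INTERNAL_RATE = 44100  # rate named in the logs ("Resampling ... to 44100 Hz")


def wav_sample_rate(path: str) -> int:
    """Read the sample rate from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def needs_conversion(path: str, expected_rate: int = INTERNAL_RATE) -> bool:
    """True if the file's sample rate differs from the expected rate."""
    return wav_sample_rate(path) != expected_rate
```

A reference that reports 48000 here would be a candidate for converting to 44.1 kHz in Audacity, as described above.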

mdkberry, Aug 24 '25 23:08