node-taglib-sharp

MP3 duration

Open ion-dev opened this issue 1 year ago • 4 comments

Hey @benrr101 great release by the way!

I've been testing the new duration property and it's spot on as far as I can tell. Nice work.

I was wondering how it's calculated since many apps seem to read the length slightly shorter. Is it possible to write the correct length back to the VBR header so other apps read it correctly?

ion-dev · Dec 04 '24 21:12

Hey @ion-dev, thanks!

Duration is calculated in a few different ways depending on what information is available:

  • If Fraunhofer VBRI header is available: total_frames, total_bytes, and encoder delay are all available
    • duration_seconds = ((total_frames * samples_per_frame) - delay) / samples_per_second
  • If Xing VBR header is available: total_frames, total_bytes may be available. Additionally, if the LAME extension is available, encoder delay is available.
    • If total_frames is available:
      • If delay is available: duration_seconds = ((total_frames * samples_per_frame) - delay) / samples_per_second
      • If delay is not available: duration_seconds = (total_frames * samples_per_frame) / samples_per_second
    • If total_frames is not available: duration_seconds is calculated as if no VBR header is present (see below). Technically this could yield inaccurate results, but in reality I don't think I've ever seen a Xing header that doesn't define total frames. The only other way to calculate it would be to scan the entire file for frames and calculate the number of samples in the file. There is a file read accuracy enum that allows specifying something better than "average" accuracy, and I've considered using the aforementioned scanning method if "accurate" is specified. Otherwise, scanning the file is expensive!
  • If no VBR header is available: bitrate is available for the frame
    • If total_bytes can be determined (MP3 files): duration_seconds = (total_bytes * 8) / bitrate (with bitrate in bits per second)
    • If total_bytes cannot be determined easily (MPEG 1/2 container files): duration_seconds cannot be determined. File duration is usually provided by other mechanisms for these files.
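
The decision tree above can be sketched roughly as follows. This is an illustration only, not node-taglib-sharp's actual code: the `Mp3Info` shape, its field names, and `estimateDurationSeconds` are hypothetical, and a missing delay is simply treated as zero.

```typescript
// Hypothetical input shape; field names are illustrative, not the library's API.
interface Mp3Info {
  samplesPerFrame: number; // e.g. 1152 for MPEG-1 Layer III
  sampleRate: number;      // samples per second, e.g. 44100
  totalFrames?: number;    // from a VBRI or Xing header, if present
  encoderDelay?: number;   // in samples; VBRI, or Xing with the LAME extension
  totalBytes?: number;     // audio byte count, if it can be determined
  bitrate?: number;        // bits per second, taken from the frame header
}

function estimateDurationSeconds(info: Mp3Info): number | undefined {
  // VBR header with a frame count: derive duration from the sample count.
  if (info.totalFrames !== undefined) {
    const totalSamples = info.totalFrames * info.samplesPerFrame;
    const delay = info.encoderDelay ?? 0; // no delay known: subtract nothing
    return (totalSamples - delay) / info.sampleRate;
  }
  // No usable frame count: fall back to size divided by bitrate.
  if (info.totalBytes !== undefined && info.bitrate !== undefined && info.bitrate > 0) {
    return (info.totalBytes * 8) / info.bitrate;
  }
  // Neither path applies (e.g. MPEG 1/2 container files).
  return undefined;
}

// 1000 frames of 1152 samples at 48 kHz, no delay: 24 seconds.
const vbrSeconds = estimateDurationSeconds({
  samplesPerFrame: 1152,
  sampleRate: 48000,
  totalFrames: 1000,
});

// 4,000,000 bytes at 128,000 bits per second: 250 seconds.
const cbrSeconds = estimateDurationSeconds({
  samplesPerFrame: 1152,
  sampleRate: 44100,
  totalBytes: 4000000,
  bitrate: 128000,
});
```

Note that the fallback path assumes the bitrate of one frame holds for the whole file, which is why it can be off for VBR files without a header.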

As for writing duration back to the VBR header ... I don't really think that's possible. Since duration is calculated from total frames and encoder delay, writing a change to duration would require changing the total frames or encoder delay. Though I think frame count isn't super important, I don't want to touch that. Encoder delay is used by the decoder for gapless playback, so I also wouldn't feel safe tweaking that value. For node-taglib-sharp, although I want to provide as much info as possible, and be accurate, it is still just a tagging library. I don't really want to get into the business of messing with the media itself 😨

Though, I am curious what app is reading the length shorter, it's always possible my math isn't taking something into account.

benrr101 · Dec 05 '24 05:12

Great info, thanks! I actually first read the buffer and compare its duration with the node-taglib-sharp duration, and yours always seems to be correct. I think the variation in duration between apps comes down to the decoder used for reading the buffer. The difference is whether the few milliseconds of silence at the beginning exist or not. Many decoders seem to ignore them, which is fine until you move your file from one app to another (for example in DJ apps where the audio file has markers at various times throughout).

Ok another question, is it possible to only calculate the length of silence at the start? I guess that's the VBR header length?

ion-dev · Dec 05 '24 07:12

I'm a little less sure what you're asking with this one 🤔 I don't know everything about the encoder/decoder delay, but my understanding is it's basically the number of samples before/after the audio actually begins/ends. It's used to help playback apps with gapless playback. But I don't quite get how it helps with gapless playback - is it padding the encoder adds? And if so, why does adding padding help?

Ultimately, the delay is best effort, since it's less than a single frame's worth of samples (which is on the order of 20ms). Fraunhofer VBRI headers always have the delay, while Xing VBR headers only have the delay if they have the LAME extension to the header (and detecting a LAME header is not foolproof).
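
To put rough numbers on that, here is a small sketch that converts sample counts to milliseconds. It assumes MPEG-1 Layer III (1152 samples per frame) at 44.1 kHz. The 576-sample delay is an illustrative value, not one read from a real file, and `samplesToMs` is a hypothetical helper.

```typescript
// Convert a sample count to milliseconds at a given sample rate.
function samplesToMs(samples: number, sampleRate: number): number {
  return (samples / sampleRate) * 1000;
}

// One MPEG-1 Layer III frame at 44.1 kHz: roughly 26.1 ms.
const frameMs = samplesToMs(1152, 44100);

// An assumed 576-sample encoder delay at 44.1 kHz: roughly 13.1 ms.
const delayMs = samplesToMs(576, 44100);
```

An offset of that size is small, but it is enough to shift markers when one app skips the delay samples and another does not.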

Are you looking to get the total delay from those headers? Or did you want actual silence at the beginning of a track?

(And also, what DJ app are you using?)

benrr101 · Dec 11 '24 05:12

Ignore me, I think I was overcomplicating things in my head. However, I have noticed that M4A files always seem to produce the same duration as the buffer duration, whereas in Serato (where I DJ) the duration is often less. Are there any similar improvements that can be made for calculating duration in M4A/MP4 files?

ion-dev · Dec 17 '24 13:12