node-id3 Using CTOC Entry Count byte causes issue with large number of entries

When reading the CTOC frame, node-id3 uses the entry count to read the child elements. Since the entry count is a single unsigned 8-bit integer, it overflows when there are more than 255 chapters.
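
To make the overflow concrete, here is a minimal sketch of what ends up in the entry count byte when a writer truncates the chapter count to 8 bits. `storedEntryCount` is a hypothetical helper for illustration only, not part of node-id3 or of any tag writer.

```javascript
// Hypothetical helper: the value that lands in a single unsigned byte
// when a writer stores the chapter count without range checking.
// Typed arrays wrap modulo 256, which models that truncation.
function storedEntryCount(chapterCount) {
  return Uint8Array.of(chapterCount)[0];
}

console.log(storedEntryCount(255)); // 255
console.log(storedEntryCount(256)); // 0
console.log(storedEntryCount(257)); // 1
```

A reader that trusts this byte would see only one child element in a file that actually lists 257.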

This is an issue with the spec, really, since there's plenty of room in the frame to list more than 255 chapters. Different platforms handle this differently, but FFmpeg and mpv are two examples where they seem to ignore the entry count byte and just read the actual list of chapters.

Oct 12 '23 16:10 harrisi

Interesting problem, some initial thoughts I'm having:

- The length should definitely be capped at 255 so the number cannot overflow.
- Should we still write more than 255 entries?
- Can we really ignore the number?
  - If there is an unknown number of null-terminated text entries, how do you know which one is the last?
  - After the chapters, there can still be sub-frames. How do we detect that?

CTOC
...
255
FIRST CHAPTER\0		// chapter 1
SECOND CHAPTER\0	// chapter 2
THIRD CHAPTER\0		// chapter 3
TIT2\0\0\020HELLO\0	// sub-frame (title, partly written down)

This could also be read as

CTOC
...
255
FIRST CHAPTER\0		// chapter 1
SECOND CHAPTER\0	// chapter 2
THIRD CHAPTER\0		// chapter 3
TIT2\0			// chapter 4
\0			// chapter 5 (wrong, of course, since two empty strings are not allowed as CHAP IDs, but they could also be different)
\0			// chapter 6
20HELLO\0		// chapter 7

I'm not sure if there is an actual clean way to ignore the length
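
The two readings above can be reproduced with a small sketch. It assumes an ID3v2.3-style TIT2 sub-frame (4-byte ID, 4-byte size, 2 flag bytes, 1 encoding byte, text), which is more complete than the abbreviated one in the example. `readNullTerminatedStrings` is a hypothetical helper, not node-id3 code.

```javascript
// Split a buffer into null-terminated latin1 strings, stopping after
// `limit` strings if a limit is given. Returns the strings and the
// bytes that were not consumed.
function readNullTerminatedStrings(buf, limit = Infinity) {
  const strings = [];
  let offset = 0;
  while (offset < buf.length && strings.length < limit) {
    let end = buf.indexOf(0, offset);
    if (end === -1) end = buf.length; // unterminated tail
    strings.push(buf.toString("latin1", offset, end));
    offset = end + 1;
  }
  return { strings, rest: buf.subarray(Math.min(offset, buf.length)) };
}

const childIds = Buffer.from(
  "FIRST CHAPTER\0SECOND CHAPTER\0THIRD CHAPTER\0",
  "latin1"
);
// Assumed sub-frame layout: ID, size, flags, encoding byte, text.
const tit2 = Buffer.concat([
  Buffer.from("TIT2", "latin1"),
  Buffer.from([0, 0, 0, 6]), // size: encoding byte + "HELLO"
  Buffer.from([0, 0]),       // frame flags
  Buffer.from([0]),          // text encoding
  Buffer.from("HELLO", "latin1"),
]);
const body = Buffer.concat([childIds, tit2]);

// Reading 1: trust an entry count of 3, the rest is a sub-frame.
const byCount = readNullTerminatedStrings(body, 3);
console.log(byCount.strings, byCount.rest.toString("latin1", 0, 4));

// Reading 2: ignore the count and read strings until the bytes run out.
const greedy = readNullTerminatedStrings(body);
console.log(greedy.strings);
```

Without the count, the greedy reading turns the TIT2 header and payload into extra "chapters", and nothing in the bytes themselves says which reading was intended.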

Oct 12 '23 16:10 Zazama

Yeah, it's awkward. jsmediatags also gets this wrong. mpv gets it "right", but I can't find where they're handling it right now. I expect there's an issue with the TIT2 title like you mentioned.

I think instead of entry count you can get the size of the frame and read that many bytes (minus header size and whatever else).

Oct 12 '23 16:10 harrisi

> I think instead of entry count you can get the size of the frame and read that many bytes (minus header size and whatever else).

The sub-frames are included in the CTOC size, so I don't think it is possible like that. It would be interesting to see how other implementations handle this.

Oct 12 '23 16:10 Zazama

I have a file with 255 chapters and one with 257, and the CTOCs for them are:

CTOC
$00 00 06 91 // size
$00 00 // flags
toc // element id
$00 03 // element id terminator, then CTOC flags
$FF // entry count
chp0 $00
chp1 $00
// ...
chp254 $00
CHAP
// ...

and

CTOC
$00 00 06 9F // size
$00 00 // flags
toc // element id
$00 03 // element id terminator, then CTOC flags
$01 // entry count (257 wrapped to one byte)
chp0 $00
chp1 $00
// ...
chp256 $00
CHAP
// ...

As you can see, the second size is 14 bytes larger ($69F vs. $691), which is exactly the length of chp255 $00 chp256 $00, so reading by frame size seems to be an option.
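
The arithmetic can be checked with a short sketch. It assumes the child IDs are literally chp0 through chpN as in the dumps, and that the frame body holds only the element ID, the flags byte, the entry count byte and the child ID list. `ctocBodySize` is a hypothetical helper for this check.

```javascript
// Size of a CTOC frame body that contains no embedded sub-frames:
// element ID + terminator, 1 flags byte, 1 entry count byte,
// then each child ID + terminator.
function ctocBodySize(elementId, childIds) {
  const idBytes = (id) => Buffer.byteLength(id, "latin1") + 1;
  const listBytes = childIds.reduce((sum, id) => sum + idBytes(id), 0);
  return idBytes(elementId) + 1 + 1 + listBytes;
}

// Assumed naming scheme taken from the dumps: chp0, chp1, ...
const ids = (n) => Array.from({ length: n }, (_, i) => `chp${i}`);

const size255 = ctocBodySize("toc", ids(255));
const size257 = ctocBodySize("toc", ids(257));
console.log(size255.toString(16)); // 691
console.log(size257.toString(16)); // 69f
console.log(size257 - size255);    // 14
```

Under those assumptions both computed sizes match the dumps, which suggests these two files carry no sub-frames inside the CTOC frame. That is why the size alone works here and would not in the general case.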

Oct 12 '23 16:10 harrisi

Yes, but there can be other frames inside the CTOC frame, not only after it. In my example above, the size of the full TIT2 frame would be included in the CTOC frame's size, which leaves open the problem of detecting whether there are any frames inside the CTOC frame.

See here: https://mutagen-specs.readthedocs.io/en/latest/_images/CTOCFrame-1.0.png

Oct 12 '23 16:10 Zazama

Oh right, I missed the sub-frame part of your first example and was just thinking about chapters whose IDs look like frame IDs, which I'm actually not sure is allowed. If it's not, then you could find the sub-frames, right?

Either way, yeah I see what you're saying. I think realistically the problem is actually with the tag writers. FFmpeg uses an unsigned int for the entry count byte as far as I can tell. Really I think the spec implies that only 255 chapters are supported, but it doesn't specify that, and it seems popular tag writers don't honor that.

Oct 12 '23 17:10 harrisi

I think they are allowed; the spec says the IDs must be unique only with respect to the other element IDs. We also can't check whether the start of a string is a valid ID3 frame ID, because we want to keep supporting unimplemented frames, i.e. frames node-id3 does not know about. I'll have to think about it a little; maybe there is a better solution.

Oct 12 '23 17:10 Zazama

> I'll have to think about it a little, maybe there is a better solution

Yeah, me too. I still think you're actually handling this correctly on read by reading entryCount number of entries, it's just that tag writers will happily overflow and write more than that. I'll try to figure out how mpv handles this as well.
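
A minimal sketch of the count-driven read described here, plus a guard a writer could use to avoid overflowing the byte. `parseCtocBody` and `entryCountByte` are hypothetical names for illustration; this is not node-id3's actual implementation.

```javascript
// Parse a CTOC frame body: element ID, flags, entry count, then exactly
// `entryCount` child IDs. Everything after that is treated as sub-frames.
function parseCtocBody(body) {
  const idEnd = body.indexOf(0);
  if (idEnd === -1) throw new Error("CTOC element ID is not terminated");
  const elementId = body.toString("latin1", 0, idEnd);
  const flags = body.readUInt8(idEnd + 1);
  const entryCount = body.readUInt8(idEnd + 2);
  let offset = idEnd + 3;
  const childIds = [];
  for (let i = 0; i < entryCount; i++) {
    const end = body.indexOf(0, offset);
    if (end === -1) throw new Error("child element ID is not terminated");
    childIds.push(body.toString("latin1", offset, end));
    offset = end + 1;
  }
  return { elementId, flags, childIds, subFrames: body.subarray(offset) };
}

// Writer-side guard: refuse to emit a count that does not fit in a byte.
function entryCountByte(childIds) {
  if (childIds.length > 255) {
    throw new RangeError("CTOC entry count byte cannot hold more than 255 entries");
  }
  return childIds.length;
}

// A body as an overflowing writer would produce it: 257 IDs, count byte 1.
const ids = Array.from({ length: 257 }, (_, i) => `chp${i}`);
const overflowed = Buffer.concat([
  Buffer.from("toc\0", "latin1"),
  Buffer.from([0x03, Uint8Array.of(ids.length)[0]]),
  Buffer.from(ids.map((id) => id + "\0").join(""), "latin1"),
]);

const parsed = parseCtocBody(overflowed);
console.log(parsed.childIds);                          // only the first ID
console.log(parsed.subFrames.toString("latin1", 0, 4)); // remaining IDs misread as sub-frame data
```

The reader follows the frame layout, yet the remaining 256 IDs end up in the sub-frame region because the writer wrapped the count.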

Oct 12 '23 17:10 harrisi
