mutagen
Matroska tags
Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)
From [email protected] on June 16, 2009 08:28:39
This one seems... unlikely. From https://code.google.com/p/quodlibet/issues/detail?id=167 :
Ex Falso currently cannot edit mka tags. The ability to do so would be a
useful addition.
Original issue: http://code.google.com/p/mutagen/issues/detail?id=3
- Bitbucket: https://bitbucket.org/lazka/mutagen/issue/3
Original comment by Freso Fenderson (Bitbucket: Freso, GitHub: Freso):
Remember to also do docs/api/matroska.rst or something like that.
Original comment by Ben Ockmore (Bitbucket: LordSputnik, GitHub: LordSputnik):
I've created a branch called "matroska" for steps 1-4, so that code can be reviewed and shared without polluting the default branch.
Original comment by Ben Ockmore (Bitbucket: LordSputnik, GitHub: LordSputnik):
I've begun work on this.
My plan is as follows:
- Create a robust EBML parser, then profile and tune its performance.
- Create a separate Matroska-specific parser, able to read the tags stored within the Matroska EBML container.
- Create a dict-like metadata object, using native strings (utf8) as keys, and allowing bytes and unicode to be set as values. Byte data will be interpreted as the Matroska "binary" type, while unicode data will be converted to utf8 and stored.
- Write tests as I go along, and fill in any gaps at the end.
- Possibly implement support for WebM, since it is derived from Matroska.
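To make the third step of the plan concrete, here is a minimal sketch of such a dict-like object. Everything in it is hypothetical: the class name `MatroskaTagsSketch` and the `encoded()` helper are made up for illustration and are not part of mutagen or of the "matroska" branch. It assumes that text values would end up in a Matroska TagString element and byte values in a TagBinary element.

```python
from collections.abc import MutableMapping


class MatroskaTagsSketch(MutableMapping):
    """Hypothetical dict-like tag container (illustration only, not mutagen API)."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        # Keys are native strings; values are either text or raw bytes.
        if not isinstance(key, str):
            raise TypeError("tag names must be str")
        if not isinstance(value, (str, bytes)):
            raise TypeError("tag values must be str or bytes")
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def encoded(self):
        """Yield (name_bytes, kind, payload) triples ready for serialization.

        bytes values are passed through as "binary"; str values are
        encoded to UTF-8 and marked as "string".
        """
        for name, value in self._data.items():
            if isinstance(value, bytes):
                yield name.encode("utf-8"), "binary", value
            else:
                yield name.encode("utf-8"), "string", value.encode("utf-8")


tags = MatroskaTagsSketch()
tags["TITLE"] = "Café"
tags["COVER"] = b"\x89PNG"
for entry in tags.encoded():
    print(entry)
```

Deriving from `MutableMapping` gives `update()`, `items()`, `pop()` and friends for free, so only the five core methods need to be written by hand.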
Useful documents:
- EBML specification: http://ebml.sourceforge.net/specs/
- Matroska specification: http://www.matroska.org/technical/specs/index.html
- WebM specification: http://www.webmproject.org/docs/container/
Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):
From [email protected] on January 18, 2013 00:31:15
I agree. Matroska files are far more versatile and widely supported than flac or ogg in some applications (like managing both video and audio libraries with a variety of codecs, and expecting them to play on a commercial device). Because of its potential influence on the world's use of open/free technology, I would place supporting Matroska meta-tags for mutagen and Ex-Falso above providing full mp4 container support.
Here is some code from the exaile project: https://github.com/exaile/exaile/blob/master/xl/metadata/_matroska.py
Still on the list somewhere?
yes
Might be interesting: https://github.com/QBobWatson/python-ebml . It's GPLv3, though.
I'm starting to need WebM manipulation. Is there any way I'd be able to speed this along?
https://github.com/exaile/exaile/blob/master/xl/metadata/_matroska.py doesn't work
72057594037927935 1
Traceback (most recent call last):
  File "test.py", line 164, in parse
    key, type_ = self.tags[id]
KeyError: 524531317

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 287, in <module>
    parse('/home/lud4ik/work/chats/audio/1348379-1570779757.webm')
  File "test.py", line 282, in parse
    return Ebml(location, MatroskaTags).parse()
  File "test.py", line 186, in parse
    value = self.parse(tell, tell + size)
  File "test.py", line 166, in parse
    self.seek(size, 1)
  File "test.py", line 57, in seek
    self.file.seek(offset, mode)
OSError: [Errno 22] Invalid argument
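A possible reading of the two numbers above, sketched in Python. The failing key is the Matroska Cluster element ID written in decimal. If the first number on the printed line is the element size (an assumption, since the script itself is not shown), it is the all-ones value that EBML reserves for "unknown size" in an 8-byte size field. Skipping such an element with a relative seek cannot work, which would be consistent with the `OSError`.

```python
CLUSTER_ID = 0x1F43B675  # Matroska Cluster element ID

failing_key = 524531317            # from the KeyError above
printed_size = 72057594037927935   # first number printed before the traceback

# An 8-byte EBML size field carries 56 data bits; all ones means "unknown size".
is_cluster = failing_key == CLUSTER_ID
is_unknown_size = printed_size == (1 << 56) - 1

print(hex(failing_key), is_cluster, is_unknown_size)
# prints: 0x1f43b675 True True
```

Under that reading, the parser reached a Cluster of unknown size and tried to seek 2**56 - 1 bytes forward. A parser would need to treat the unknown-size marker specially instead of seeking by it.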
https://pypi.org/project/hachoir-metadata/
What is the proper stopping condition if I want to parse only the header with the metadata, not the blocks of actual (audio) data? The "Cluster" element contains the data, so should I read everything before it and stop once I reach it?
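One way to sketch the "stop at the first Cluster" approach is shown below. The function names `read_vint` and `iter_metadata_elements` are hypothetical and not taken from any of the projects linked in this thread. One caveat: as far as I know, Matroska allows the Tags element to be written after the Clusters, with a SeekHead entry pointing at it, so stopping at the first Cluster may miss tags in some files.

```python
import io

SEGMENT_ID = 0x18538067  # Matroska Segment element ID
CLUSTER_ID = 0x1F43B675  # Matroska Cluster element ID


def read_vint(stream, keep_marker):
    """Read one EBML variable-length integer; return (value, length) or None at EOF."""
    first = stream.read(1)
    if not first:
        return None
    length = 9 - first[0].bit_length()  # number of leading zero bits, plus one
    if length > 8:
        raise ValueError("invalid vint: first byte is zero")
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise EOFError("truncated vint")
    # Element IDs keep the length marker bit; sizes drop it.
    value = first[0] if keep_marker else first[0] & ((1 << (8 - length)) - 1)
    for byte in rest:
        value = (value << 8) | byte
    return value, length


def iter_metadata_elements(stream):
    """Yield (element_id, payload_offset, size) until the first Cluster is reached."""
    while True:
        id_field = read_vint(stream, keep_marker=True)
        if id_field is None:
            return
        size_field = read_vint(stream, keep_marker=False)
        if size_field is None:
            raise EOFError("element ID without a size field")
        element_id = id_field[0]
        size, size_len = size_field
        unknown_size = size == (1 << (7 * size_len)) - 1
        if element_id == CLUSTER_ID:
            return  # audio/video payload starts here
        if element_id == SEGMENT_ID:
            continue  # descend: the Segment's children follow directly
        if unknown_size:
            raise ValueError("cannot skip an element of unknown size")
        offset = stream.tell()
        yield element_id, offset, size
        stream.seek(offset + size)  # skip the payload without reading it


# Synthetic stream: empty EBML header, Segment and Cluster of unknown size.
data = bytes.fromhex(
    "1A45DFA3 80 "
    "18538067 01FFFFFFFFFFFFFF "
    "1549A966 82 0000 "
    "1F43B675 01FFFFFFFFFFFFFF "
    "DEADBEEF"
)
found = [hex(eid) for eid, _, _ in iter_metadata_elements(io.BytesIO(data))]
print(found)
```

The generator seeks to an absolute position after each `yield`, so a caller can read an element's payload in between without disturbing the walk.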
The last commit to the matroska branch was in 2014. Does anyone know the state of the implementation by @LordSputnik, whether there were major challenges, and whether it is even still compatible with how mutagen works today?
In the related ticket for Picard (link) there has been some discussion about whether to tag at the container or the stream level. As I understand the docs, in the case of mp4 there can only be container-level tags, and only the first track is considered. Would container-level tagging also be sufficient for Matroska?
This was quite some time ago, but from what I remember the parsing was trickier than I expected. Sorry I can't be more helpful!
I don't think there would be much lost if somebody were to pick this up and start from scratch.