Matroska tags

Open lazka opened this issue 10 years ago • 14 comments

Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)


From [email protected] on June 16, 2009 08:28:39

This one seems... unlikely. From https://code.google.com/p/quodlibet/issues/detail?id=167 :

Ex Falso currently cannot edit mka tags. The ability to do so would be a
useful addition.

Original issue: http://code.google.com/p/mutagen/issues/detail?id=3


  • Bitbucket: https://bitbucket.org/lazka/mutagen/issue/3

lazka avatar Jul 04 '14 14:07 lazka

Original comment by Freso Fenderson (Bitbucket: Freso, GitHub: Freso):


Remember to also do docs/api/matroska.rst or something like that.

lazka avatar Nov 25 '14 02:11 lazka

Original comment by Ben Ockmore (Bitbucket: LordSputnik, GitHub: LordSputnik):


I've created a branch called "matroska" for steps 1-4, so that code can be reviewed and shared without polluting the default branch.

lazka avatar Sep 25 '14 20:09 lazka

Original comment by Ben Ockmore (Bitbucket: LordSputnik, GitHub: LordSputnik):


I've begun work on this.

My plan is as follows:

  1. Create a robust EBML parser, then tune it for performance.
  2. Create a separate Matroska-specific parser, able to read the tags stored within the Matroska EBML container.
  3. Create a dict-like metadata object, using native strings (utf8) as keys, and allowing bytes and unicode to be set as values. Byte data will be interpreted as the Matroska "binary" type, while unicode data will be converted to utf8 and stored.
  4. Write tests as I go along, and fill in any gaps at the end.
  5. Possibly implement support for WebM, since it is derived from Matroska.

Useful documents:

  • EBML specification: http://ebml.sourceforge.net/specs/
  • Matroska specification: http://www.matroska.org/technical/specs/index.html
  • WebM specification: http://www.webmproject.org/docs/container/
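
For anyone picking up step 1 of the plan above: the core of an EBML parser is reading the variable-length integers used for element IDs and data sizes. The following is a minimal sketch based on the EBML specification, not code from the "matroska" branch; the function names are made up for illustration.

```python
import io


def _read_vint(stream, max_length):
    """Read one EBML variable-length integer.

    Returns (raw_value, length), where raw_value still contains the
    length-marker bit. Hypothetical helper, not part of mutagen.
    """
    first = stream.read(1)
    if not first:
        raise EOFError("end of stream")
    # The number of leading zero bits in the first byte gives the length.
    length, mask = 1, 0x80
    while length <= max_length and not (first[0] & mask):
        mask >>= 1
        length += 1
    if length > max_length:
        raise ValueError("invalid variable-length integer")
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise EOFError("truncated variable-length integer")
    return int.from_bytes(first + rest, "big"), length


def read_element_id(stream):
    """Element IDs are 1 to 4 bytes and keep their marker bit."""
    raw, _ = _read_vint(stream, 4)
    return raw


def read_data_size(stream):
    """Data sizes are 1 to 8 bytes with the marker bit stripped.

    Returns None when all data bits are set, which the EBML
    specification reserves for "unknown size".
    """
    raw, length = _read_vint(stream, 8)
    all_ones = (1 << (7 * length)) - 1
    value = raw & all_ones
    return None if value == all_ones else value
```

For example, feeding it the bytes `1A 45 DF A3 9F` (an EBML header ID followed by a one-byte size) yields the ID `0x1A45DFA3` and a size of 31.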

lazka avatar Sep 25 '14 20:09 lazka

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


From [email protected] on January 18, 2013 00:31:15

I agree. Matroska files are far more versatile and widely supported than FLAC or Ogg in some applications (like managing both video and audio libraries with a variety of codecs, and expecting them to play on a commercial device). Because of its potential influence on the world's use of open/free technology, I would place supporting Matroska meta-tags for mutagen and Ex Falso above providing full MP4 container support.

lazka avatar Jul 04 '14 14:07 lazka

Here is some code from the exaile project: https://github.com/exaile/exaile/blob/master/xl/metadata/_matroska.py

lazka avatar May 09 '16 17:05 lazka

Still on the list somewhere?

Moilleadoir avatar Jan 11 '18 16:01 Moilleadoir

yes

lazka avatar Jan 13 '18 09:01 lazka

Might be interesting: https://github.com/QBobWatson/python-ebml . It's GPLv3, though.

phw avatar Aug 21 '18 06:08 phw

I'm starting to need WebM manipulation. Is there any way I'd be able to speed this along?

Freso avatar Feb 18 '19 09:02 Freso

https://github.com/exaile/exaile/blob/master/xl/metadata/_matroska.py doesn't work:

72057594037927935 1
Traceback (most recent call last):
  File "test.py", line 164, in parse
    key, type_ = self.tags[id]
KeyError: 524531317

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 287, in <module>
    parse('/home/lud4ik/work/chats/audio/1348379-1570779757.webm')
  File "test.py", line 282, in parse
    return Ebml(location, MatroskaTags).parse()
  File "test.py", line 186, in parse
    value = self.parse(tell, tell + size)
  File "test.py", line 166, in parse
    self.seek(size, 1)
  File "test.py", line 57, in seek
    self.file.seek(offset, mode)
OSError: [Errno 22] Invalid argument
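
One possible reading of this traceback, offered as a guess rather than a diagnosis: 72057594037927935 equals 2**56 - 1, which is what an 8-byte EBML size field decodes to when all of its data bits are set, and the EBML specification reserves that pattern for "unknown size". 524531317 equals 0x1F43B675, the Matroska Cluster element ID. If that is right, the parser hit a Cluster of unknown size (common in live-recorded WebM), did not recognise the ID, and tried to skip 2**56 - 1 bytes. A guard along these lines would at least fail with a clearer message; the names below are hypothetical and not taken from the exaile code.

```python
CLUSTER_ID = 0x1F43B675  # Matroska Cluster element ID


def is_unknown_size(value, length):
    """True if `value`, decoded from a `length`-byte EBML size field
    (marker bit already stripped), has all of its data bits set."""
    return value == (1 << (7 * length)) - 1


def skip_element(stream, size, length):
    """Skip an element's payload, refusing to seek past an unknown size."""
    if is_unknown_size(size, length):
        raise ValueError("element has unknown size; cannot skip by seeking")
    stream.seek(size, 1)  # 1 means relative to the current position
```

With the numbers from the traceback, `is_unknown_size(72057594037927935, 8)` is true and `524531317 == CLUSTER_ID` holds.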

lud4ik avatar Oct 16 '19 15:10 lud4ik

https://pypi.org/project/hachoir-metadata/

lud4ik avatar Oct 16 '19 16:10 lud4ik

What is the proper stopping condition if I want to parse only the header with the metadata, not the blocks of actual (audio) data? The "Cluster" element contains the data, so must I read everything before it and stop once I find it?
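
A rough sketch of the "stop at the first Cluster" approach, assuming the element IDs from the Matroska specification. The function names are hypothetical, and the Segment's own size is ignored for simplicity.

```python
import io

EBML_HEADER_ID = 0x1A45DFA3
SEGMENT_ID = 0x18538067
CLUSTER_ID = 0x1F43B675


def _read_vint(stream, max_length):
    """Return (raw value including marker bit, length in bytes)."""
    first = stream.read(1)
    if not first:
        raise EOFError("end of stream")
    length, mask = 1, 0x80
    while length <= max_length and not (first[0] & mask):
        mask >>= 1
        length += 1
    if length > max_length:
        raise ValueError("invalid variable-length integer")
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise EOFError("truncated variable-length integer")
    return int.from_bytes(first + rest, "big"), length


def read_header(stream):
    """Return (element_id, size); size is None for "unknown size"."""
    element_id, _ = _read_vint(stream, 4)
    raw, length = _read_vint(stream, 8)
    all_ones = (1 << (7 * length)) - 1
    size = raw & all_ones
    return element_id, (None if size == all_ones else size)


def iter_metadata_elements(stream):
    """Yield (element_id, payload_offset, size) for each top-level
    Segment child that appears before the first Cluster."""
    element_id, size = read_header(stream)
    if element_id != EBML_HEADER_ID or size is None:
        raise ValueError("not an EBML file")
    stream.seek(size, io.SEEK_CUR)  # skip the EBML header payload
    element_id, _ = read_header(stream)
    if element_id != SEGMENT_ID:
        raise ValueError("no Segment element found")
    while True:
        try:
            element_id, size = read_header(stream)
        except EOFError:
            return
        # Stop at the media data, or at anything that cannot be skipped.
        if element_id == CLUSTER_ID or size is None:
            return
        payload_offset = stream.tell()
        yield element_id, payload_offset, size
        stream.seek(payload_offset + size)
```

One caveat: as far as I understand the Matroska specification, the Tags element may also be written after the Clusters, in which case the SeekHead element (if present) gives its position. Stopping at the first Cluster therefore finds the tags only when the muxer placed them at the front.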

lud4ik avatar Nov 12 '19 08:11 lud4ik

The last commit to the matroska branch was in 2014. Does anyone know the state of the implementation by @LordSputnik, whether there were major challenges, or whether it is even still compatible with how mutagen works today?

In the related ticket for Picard (link) there has been some discussion about whether to tag on the container or the stream level. As I understand the docs, in the case of MP4 there can only be container-level tags, and only the first track is considered. Would container-level tagging also be sufficient for Matroska?

ffe4 avatar May 27 '21 09:05 ffe4

This was quite some time ago, but from what I remember the parsing was trickier than I expected. Sorry I can't be more helpful!

I don't think there would be much lost if somebody were to pick this up and start from scratch.

LordSputnik avatar May 27 '21 10:05 LordSputnik