
Matroska Date parsing is broken for dates older than 2001-01-01

Open jdayx opened this issue 5 years ago • 4 comments

Hello, this problem is easy to reproduce. The following returns the wrong date and also breaks MKV statistics (Statistics Tags Issue):

$ mkvmerge --engage no_variable_data empty.mpeg -o epoch0.mkv
$ mediainfo epoch0.mkv | grep 2010-02
Encoded date                             : UTC 2010-02-22 21:41:29
Statistics Tags Issue                    : no_variable_data 1970-01-01 00:00:00 / no_variable_data 2010-02-22 21:41:29

The problem is caused by the Date field being treated as unsigned. The spec has:

Date - signed 8 octets integer in nanoseconds with 0 indicating the precise beginning of the millennium (at 2001-01-01T00:00:00,000000000 UTC)
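To make the arithmetic concrete, here is a minimal sketch of the conversion (not MediaInfoLib code; all names are hypothetical). It compares the signed and unsigned interpretations of the raw value for 1970-01-01T00:00:00 UTC. The final 32-bit truncation is an assumption about what happens somewhere downstream; it is included only because it reproduces the 2010-02-22 21:41:29 value reported above (Unix time 1266874889).

```cpp
#include <cstdint>

// Seconds between 1970-01-01 and 2001-01-01 (the Matroska date epoch).
constexpr std::int64_t kMatroskaEpochOffset = 978307200;

// Raw 8-byte DateUTC value for Unix epoch 0, i.e. a negative number of
// nanoseconds relative to 2001-01-01, stored in two's complement.
constexpr std::uint64_t kEpoch0Raw =
    static_cast<std::uint64_t>(-kMatroskaEpochOffset * 1000000000);

// Interpretation required by the spec: signed nanoseconds.
// (The unsigned-to-signed cast assumes two's complement, which is
// guaranteed since C++20 and universal in practice before that.)
constexpr std::int64_t unix_seconds_signed(std::uint64_t raw)
{
    return static_cast<std::int64_t>(raw) / 1000000000 + kMatroskaEpochOffset;
}

// Buggy interpretation: the same bits treated as unsigned nanoseconds.
constexpr std::uint64_t unix_seconds_unsigned(std::uint64_t raw)
{
    return raw / 1000000000 + static_cast<std::uint64_t>(kMatroskaEpochOffset);
}

// Signed: exactly Unix epoch 0.
static_assert(unix_seconds_signed(kEpoch0Raw) == 0, "signed gives 1970-01-01");
// Unsigned: a timestamp far in the future.
static_assert(unix_seconds_unsigned(kEpoch0Raw) == 18446744073ULL, "unsigned");
// Assumed truncation to 32 bits: 1266874889 = 2010-02-22 21:41:29 UTC.
static_assert(static_cast<std::uint32_t>(unix_seconds_unsigned(kEpoch0Raw))
                  == 1266874889u, "matches the reported date");
```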

I wrote a fix (but probably not using the right functions):

diff --git a/Source/MediaInfo/File__Analyze.h b/Source/MediaInfo/File__Analyze.h
index 5732e8ab..39725d92 100644
--- a/Source/MediaInfo/File__Analyze.h
+++ b/Source/MediaInfo/File__Analyze.h
@@ -576,6 +576,7 @@ public :
     void Get_B6   (int64u  &Info, const char* Name);
     void Get_B7   (int64u  &Info, const char* Name);
     void Get_B8   (int64u  &Info, const char* Name);
+    void Get_B8S  (int64s  &Info, const char* Name);
     void Get_B16  (int128u &Info, const char* Name);
     void Get_BF2  (float32 &Info, const char* Name);
     void Get_BF4  (float32 &Info, const char* Name);
diff --git a/Source/MediaInfo/File__Analyze_Buffer.cpp b/Source/MediaInfo/File__Analyze_Buffer.cpp
index 9d590384..0d75cb98 100644
--- a/Source/MediaInfo/File__Analyze_Buffer.cpp
+++ b/Source/MediaInfo/File__Analyze_Buffer.cpp
@@ -223,6 +223,14 @@ void File__Analyze::Get_B8(int64u &Info, const char* Name)
     Element_Offset+=8;
 }
 
+void File__Analyze::Get_B8S(int64s &Info, const char* Name)
+{
+    INTEGRITY_SIZE_ATLEAST_INT(8);
+    Info=BigEndian2int64s(Buffer+Buffer_Offset+(size_t)Element_Offset);
+    if (Trace_Activated) Param(Name, Info);
+    Element_Offset+=8;
+}
+
 //---------------------------------------------------------------------------
 void File__Analyze::Get_B16(int128u &Info, const char* Name)
 {
diff --git a/Source/MediaInfo/Multiple/File_Mk.cpp b/Source/MediaInfo/Multiple/File_Mk.cpp
index 2b1ac81b..13c178a4 100644
--- a/Source/MediaInfo/Multiple/File_Mk.cpp
+++ b/Source/MediaInfo/Multiple/File_Mk.cpp
@@ -2945,8 +2945,8 @@ void File_Mk::Segment_Info()
 void File_Mk::Segment_Info_DateUTC()
 {
     //Parsing
-    int64u Data;
-    Get_B8(Data,                                                "Data"); Element_Info1(Data/1000000000+978307200); //From Beginning of the millenium, in nanoseconds
+    int64s Data;
+    Get_B8S(Data,                                                "Data"); Element_Info1(Data/1000000000+978307200); //From Beginning of the millenium, in nanoseconds
 
     FILLING_BEGIN();
         if (Segment_Info_Count>1)
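
For readers without the library at hand, the signed read that Get_B8S relies on can be sketched in portable C++ as follows. This is an illustration only: read_be_int64 is a hypothetical helper, not the BigEndian2int64s implementation used in the patch, and the final cast assumes two's complement (guaranteed since C++20).

```cpp
#include <cstdint>

// Hypothetical helper: assemble 8 big-endian bytes into a signed 64-bit value.
constexpr std::int64_t read_be_int64(const unsigned char* p)
{
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];              // most significant byte comes first
    return static_cast<std::int64_t>(v);  // reinterpret the bits as signed
}
```

With this reading, eight 0xFF bytes decode to -1, that is, one nanosecond before 2001-01-01T00:00:00 UTC, rather than to the largest unsigned 64-bit value.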

jdayx avatar Mar 18 '20 00:03 jdayx

Yes, the fix would look something like that. I have another patch that adds the same kind of signed-integer support; I'll merge it, then fix Segment_Info_DateUTC the way you suggest.

I'm not rejecting the idea of fixing this, but who has a Matroska file with such a date? Matroska has only existed since 2002...

Would you mind sharing a sample file with such a date for my non-regression tests?

JeromeMartinez avatar Mar 18 '20 20:03 JeromeMartinez

As shown in my message, any file generated with mkvmerge --engage no_variable_data will have its date set to epoch 0. If you search the Web for 2010-02-22 21:41:29, you will see about 130k matching pages so a lot of people are apparently using it.

jdayx avatar Mar 18 '20 21:03 jdayx

Hello, is it possible to fix this please? I can make a merge request if you need one, just tell me where you want the function defined.

jdayx avatar May 27 '20 01:05 jdayx

I see what to do; I just need some time for testing. It is on my (busy) to-do list.

JeromeMartinez avatar May 27 '20 10:05 JeromeMartinez

I think I am seeing the same/similar issue, except that it is now happening on current date too. I have attached a sample MKV file that shows the issue. If the date is removed, the file displays info correctly. If it is set to 2023-11-10 or later, I get the "Statistics Tags Issue" line in the MediaInfo output, followed by 4 FromStats lines. If it is set to 2023-11-09 or earlier, it displays normally.

I guess there will be a lot of people using those dates from now on.

date_issue.zip

edit: Whoops, sorry just noticed this thread was posted in MediaInfoLib issues, I am actually using MediaInfo CLI v23.10, but I guess the same fix would apply to both anyway, so I'll leave this post here for now.

jupester avatar Nov 10 '23 01:11 jupester

Fixed.

JeromeMartinez avatar Nov 17 '23 12:11 JeromeMartinez

@jupester

I think I am seeing the same/similar issue, except that it is now happening on current date too,

This seems to be a different issue. It looks like you edited the encoding date manually, so the encoding date and the stats dates no longer match. They should be the same; otherwise the stats tags may be stale (for example, after a remux that did not update them), which is why we show the stats separately here.

JeromeMartinez avatar Nov 17 '23 12:11 JeromeMartinez