
Unable to capture precise publication time for YouTube video metadata

Open Esparon1 opened this issue 10 months ago • 9 comments

🐛 Describe the bug

Embedchain currently captures the publish date of YouTube videos but omits the precise publication time. The metadata stored for each video holds the date in the format YYYY-MM-DD 00:00:00, where the time always defaults to 00:00:00, indicating that the time component is not being processed or stored. A date without a time is a problem because of time zones: the same moment can fall on different calendar dates depending on where the viewer is.

Example of the issue: For a video published at 3 PM on April 15, 2024, Embedchain stores the publish date as 2024-04-15 00:00:00. It should instead be stored as a full ISO 8601 timestamp (format example: 2024-04-15T14:30:00Z). This format allows the publication time to be used globally without confusion about time zone differences.
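To illustrate why a date-only value is ambiguous across time zones, here is a minimal sketch using only Python's standard library. The sample instant and the fixed UTC+9 offset are illustrative assumptions, not values taken from Embedchain, and `to_iso8601_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def to_iso8601_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string ending in 'Z'."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical publication instant: 23:30 UTC on April 15, 2024.
published = datetime(2024, 4, 15, 23, 30, tzinfo=timezone.utc)

# The same instant seen from a fixed UTC+9 offset falls on the next calendar day.
utc_plus_9 = timezone(timedelta(hours=9))
local_date = published.astimezone(utc_plus_9).date()

print(published.date())           # calendar date in UTC
print(local_date)                 # calendar date at UTC+9
print(to_iso8601_utc(published))  # full timestamp, unambiguous everywhere
```

Storing only the date forces every consumer to guess which of the two calendar days was meant; storing the full UTC timestamp lets each consumer convert to its own zone.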

Esparon1 avatar Apr 20 '24 11:04 Esparon1

Can I work on this?

jsjeon-um avatar Apr 22 '24 21:04 jsjeon-um

@Dev-Khant Has the migration from pytube to yt-dlp been merged? If so, yt-dlp returns metadata such as the publication time, so this would be a very easy fix. I could do it.

MoizKhuzema avatar Jun 13 '24 17:06 MoizKhuzema

No, sorry, it has not been merged yet. But please feel free to open a PR for this using yt-dlp.
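As a rough sketch of what such a PR might do: yt-dlp's extracted info dictionary documents a `timestamp` field (Unix time) and an `upload_date` field (`YYYYMMDD`). The helper below, whose name `publish_time_from_info` is hypothetical, prefers the precise timestamp and falls back to the date alone. Whether `timestamp` is populated for a given video is not guaranteed, and the sample dictionaries are made up for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


def publish_time_from_info(info: Mapping[str, Any]) -> Optional[str]:
    """Return an ISO 8601 publication time from a yt-dlp style info dict.

    Prefers the precise Unix 'timestamp'; falls back to the date-only
    'upload_date' (YYYYMMDD); returns None if neither is present.
    """
    timestamp = info.get("timestamp")
    if timestamp is not None:
        moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")

    upload_date = info.get("upload_date")
    if upload_date:
        # Date only: no time component is invented here.
        return datetime.strptime(upload_date, "%Y%m%d").date().isoformat()

    return None


# With yt-dlp installed, the info dict would come from something like:
#   import yt_dlp
#   with yt_dlp.YoutubeDL({"skip_download": True}) as ydl:
#       info = ydl.extract_info(url, download=False)
# The dictionaries below are hypothetical stand-ins for that result.
precise = publish_time_from_info({"timestamp": 1713191400, "upload_date": "20240415"})
date_only = publish_time_from_info({"upload_date": "20240415"})
missing = publish_time_from_info({})

print(precise)
print(date_only)
print(missing)
```

Returning the bare date in the fallback case, rather than padding it with 00:00:00, keeps a missing time distinguishable from a real midnight publication.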

Dev-Khant avatar Jun 14 '24 06:06 Dev-Khant

@Dev-Khant, I was writing the script for the yt-dlp migration and realized that although the YouTube data loader does return metadata for videos, the LLM does not seem to have access to that metadata. It only seems to have access to the video context.

I ran these 4 queries:
1- What is the publication date of the video?
2- Do you have any information regarding this video's metadata?
3- What is the context you were provided with?
4- What is the name of the YouTube channel and the length of the video?

Here are the results: [image: Capture]

I was wondering if this is by design or an issue?

MoizKhuzema avatar Jun 19 '24 17:06 MoizKhuzema

@MoizKhuzema If you pass citations=True to app.query(), the call also returns all the metadata for that video, including publish date, author, title, etc.

So it's by design that you get metadata separately.
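A minimal sketch of consuming that metadata, assuming `app.query(..., citations=True)` returns the answer together with a list of `(context, metadata)` pairs. The helper name `metadata_by_url`, the metadata keys, and the sample values below are hypothetical and may differ from what a given Embedchain version stores.

```python
from typing import Any, Dict, List, Tuple

Citation = Tuple[str, Dict[str, Any]]


def metadata_by_url(citations: List[Citation]) -> Dict[str, Dict[str, Any]]:
    """Group citation metadata by source URL, keeping the first entry per URL."""
    result: Dict[str, Dict[str, Any]] = {}
    for _context, metadata in citations:
        url = metadata.get("url", "<unknown>")
        result.setdefault(url, metadata)
    return result


# With Embedchain installed and configured, the citations would come from:
#   answer, citations = app.query("When was this video published?", citations=True)
# The list below is a made-up stand-in with hypothetical keys and values.
sample_citations: List[Citation] = [
    ("first transcript chunk", {"url": "https://example.com/v1", "publish_date": "2024-04-15 00:00:00"}),
    ("second transcript chunk", {"url": "https://example.com/v1", "publish_date": "2024-04-15 00:00:00"}),
]

grouped = metadata_by_url(sample_citations)
for url, metadata in grouped.items():
    print(url, metadata.get("publish_date"))
```

Because the metadata travels alongside the answer rather than inside the prompt context, questions about the publish date are answered by reading it from the citations, not by asking the LLM.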

Dev-Khant avatar Jun 21 '24 11:06 Dev-Khant

Understood, thanks.

MoizKhuzema avatar Jun 21 '24 11:06 MoizKhuzema

@Dev-Khant Is this issue open for contribution?

02shanks avatar Aug 18 '24 07:08 02shanks

@MoizKhuzema Please let us know if you are working on this or else @02shanks would like to pick this up. Thanks.

Dev-Khant avatar Aug 19 '24 04:08 Dev-Khant

I would like to pick this up as I see it's still open.

shivani-developer avatar Sep 06 '24 08:09 shivani-developer