crawl4ai icon indicating copy to clipboard operation
crawl4ai copied to clipboard

Add Timestamp to result.links

Open 1933211129 opened this issue 1 year ago • 1 comments

Hi @unclecode ,

Hope you're having a great weekend! Thank you so much for your outstanding contributions to the project—they are truly appreciated.

I was wondering, regarding 'result.links,' could the result also include the time when the link was published or last updated?

No rush at all—looking forward to hearing from you whenever it's convenient. Wishing you a wonderful weekend!

1933211129 avatar Dec 07 '24 08:12 1933211129

@1933211129 Thank you for your kind words and your thoughtful suggestion. I’m glad you’re enjoying using the project!

Your idea to include the publication or last-updated time for links is indeed valuable. It’s now on my backlog. Implementing such a feature can be a bit tricky since we’d need a reliable source of timestamp information, sometimes it’s available through structured data (like JSON-LD or microdata), meta tags, or schema.org properties, but not all pages provide this data consistently. Ensuring accuracy across a variety of websites may require some careful consideration and heuristics.

I’ll look into approaches for extracting these timestamps more reliably and, once I find a suitable method, I’ll work on integrating it. Thanks again.

unclecode avatar Dec 09 '24 12:12 unclecode