archiveweb.page

Feature Request: capture social media posts (e.g. tweets) as separate pages

Open nvanderperren opened this issue 2 years ago • 0 comments

I was wondering whether it's possible to detect a tweet, Facebook post, or Instagram post as a page. I believe it would improve the findability of content in archived social media accounts. In my experience, full-text search in posts/tweets isn't possible right now unless you've crawled every post as a separate page. Every tweet or Facebook post has its own URL, so it's not clear to me why it isn't detected as a page.

Some screenshots:

Just one page. Can't do a text search. (Screenshot: Schermafbeelding 2023-05-15 om 18 09 09)

I first scraped the URLs of all the posts and then used browsertrix-crawler to crawl each post separately. Full-text search works across all posts. (Screenshot: Schermafbeelding 2023-05-15 om 18 08 50)
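For illustration, the first step of that workaround (turning a pile of scraped links into one seed URL per post) could look roughly like the sketch below. It is not taken from my actual script: the function names `post_permalinks` and `write_seed_file` are hypothetical, and the permalink shape `https://twitter.com/<user>/status/<id>` is an assumption that only covers tweets. The resulting file can then be handed to browsertrix-crawler as a seed list; the exact option name for that depends on the crawler version, so check its documentation.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed permalink shape for a single tweet: /<user>/status/<numeric id>
TWEET_PATH = re.compile(r"^/[A-Za-z0-9_]{1,15}/status/\d+$")

# Hostnames treated as Twitter (assumption; extend as needed)
TWITTER_HOSTS = {"twitter.com", "www.twitter.com", "mobile.twitter.com"}


def post_permalinks(urls):
    """Return unique tweet permalinks in first-seen order.

    Query strings, fragments, and trailing slashes are dropped so that
    the same post scraped twice yields a single seed URL.
    """
    seen = set()
    result = []
    for url in urls:
        parts = urlsplit(url.strip())
        if parts.netloc.lower() not in TWITTER_HOSTS:
            continue  # not a Twitter link
        path = parts.path.rstrip("/")
        if not TWEET_PATH.match(path):
            continue  # profile page, search page, etc., not a single post
        clean = urlunsplit(("https", "twitter.com", path, "", ""))
        if clean not in seen:
            seen.add(clean)
            result.append(clean)
    return result


def write_seed_file(urls, path):
    """Write one URL per line, the usual shape of a crawler seed list."""
    with open(path, "w", encoding="utf-8") as fh:
        for url in urls:
            fh.write(url + "\n")
```

Calling `write_seed_file(post_permalinks(scraped_links), "seeds.txt")` gives a file with one post per line. Crawling each of those URLs as its own page is what made every post show up separately and become searchable in my second screenshot.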

nvanderperren, May 15 '23 16:05