
Add ArchiveBox to `Web Content Extracting` section

Open • pirate opened this issue 1 year ago • 1 comment

What is this Python project?

ArchiveBox is an internet archiving and web content extraction tool. It supports extracting these content types and more:

  • raw HTML, and HTML after JS executes in headless Chrome
  • screenshots and PDFs
  • embedded audio, video, subtitles (using yt-dlp)
  • article text and comments
  • git repositories
  • and lots more...

What's the difference between this Python project and similar ones?

See here:

  • https://github.com/ArchiveBox/ArchiveBox#comparison-to-other-projects
  • https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community#other-archivebox-alternatives

--

Anyone who agrees with this pull request could submit an Approve review to it.

pirate • May 04 '24 08:05

How can I delete my comment? I am a new user on GitHub, sorry.

geraldhingpit • May 07 '24 12:05