cc-mrjob
commoncrawl
Upgrade to use Python 3, fixes #11
Open
sebastian-nagel
opened this issue 4 years ago
• 0 comments
Based on an updated version of the warc library that supports Python 3, see https://github.com/commoncrawl/warc/
Optionally, the warc library can be replaced by a compatibility wrapper around warcio or fastwarc.
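To illustrate what a compatibility wrapper of this kind could look like, here is a minimal sketch for the warcio case. It is not the code from this change: `CompatRecord` and `iter_records` are hypothetical names, and the attributes exposed (`record[name]`, `record.url`, `record.payload`) are an assumption about what existing jobs written against the warc library rely on. Only `ArchiveIterator`, `rec_headers.get_header()` and `content_stream()` are warcio calls.

```python
class CompatRecord:
    """Hypothetical adapter: wraps a warcio record and exposes the
    dict-style access that jobs written for the warc library use."""

    def __init__(self, record):
        self._record = record

    def __getitem__(self, name):
        # warcio stores WARC headers in rec_headers; get_header() returns
        # None for a missing header. Raising KeyError is this sketch's choice.
        value = self._record.rec_headers.get_header(name)
        if value is None:
            raise KeyError(name)
        return value

    @property
    def url(self):
        # Target URI of the record, or None if the header is absent
        return self._record.rec_headers.get_header('WARC-Target-URI')

    @property
    def payload(self):
        # File-like object; call .read() to get the record content as bytes
        return self._record.content_stream()


def iter_records(stream):
    """Yield wrapped records from an open WARC file object."""
    # Imported lazily so the adapter class has no hard dependency on warcio
    from warcio.archiveiterator import ArchiveIterator
    for record in ArchiveIterator(stream):
        yield CompatRecord(record)
```

Payload contents can differ between libraries (for example in how HTTP headers are handled), so a wrapper like this should not be assumed to be a drop-in replacement without testing.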
Note:
needs more testing on Hadoop and EMR
will move the main branch to "main" together with this upgrade
Sep 30 '21 15:09