Can ArchiveSpark read and process binary payloads in WARC files?

Open: aysunakarsu opened this issue 4 years ago • 1 comment

I am running ArchiveSpark from Docker. The enrich function does not add any payload when peekJson is called. The payloads in my WARC files are binary. Could that be the problem? If so, is there a way to make ArchiveSpark work with WARC files that have binary payloads? Thanks

aysunakarsu, Jun 27 '20 12:06

Better late than never. You can use the access method and work with the input stream.
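For the input stream part, here is a minimal sketch of draining a binary stream into a byte array without decoding it as text. It only covers the plain `java.io.InputStream` side; how the stream is obtained from ArchiveSpark's access method is not shown, since the exact signature is not given in this thread. The class name `BinaryPayload`, the helper `readAllBytes`, and the sample bytes are hypothetical and only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BinaryPayload {
    // Hypothetical helper: copy every byte of the stream into memory.
    // No charset decoding happens, so binary content is preserved as-is.
    public static byte[] readAllBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream a record would provide (made-up sample data).
        byte[] sample = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x00, (byte) 0xFF};
        try (InputStream in = new ByteArrayInputStream(sample)) {
            byte[] payload = readAllBytes(in);
            System.out.println(payload.length + " bytes read");
        }
    }
}
```

Reading into a byte array rather than a string avoids corrupting bytes that are not valid text. For very large payloads, processing the stream in chunks instead of buffering everything may be preferable.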

parismic, Nov 27 '20 16:11