Alex Osborne
## Describe the bug

When pywb rewrites `eval` it wraps it in a function call. Unfortunately this breaks code which declares function-scoped variables using `var` and then accesses them outside...
The [beandoc.py](https://github.com/internetarchive/heritrix3/blob/master/docs/_ext/beandoc.py) script used to generate the [bean reference](https://heritrix.readthedocs.io/en/latest/bean-reference.html) manual is currently unaware of properties inherited from superclasses. To improve completeness it should resolve and parse parent classes too.
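One way to picture the superclass resolution described above is the minimal Python sketch below. It is not taken from beandoc.py: the `BeanClass` structure, the `all_properties` helper, and the class and property names are all hypothetical, and it assumes each class has already been parsed into a name, an optional parent name, and a mapping of its own properties.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BeanClass:
    """Hypothetical parsed class: its name, parent name and own properties."""
    name: str
    parent: Optional[str] = None
    properties: Dict[str, str] = field(default_factory=dict)


def all_properties(name: str, classes: Dict[str, BeanClass]) -> Dict[str, str]:
    """Collect the properties of a class plus those inherited from its ancestors.

    Walks the parent chain until it reaches a class with no parent, a parent
    that was never parsed, or a cycle. Subclass definitions override
    superclass definitions of the same property.
    """
    chain = []
    seen = set()
    current = name
    while current is not None and current in classes and current not in seen:
        seen.add(current)
        chain.append(classes[current])
        current = classes[current].parent

    merged: Dict[str, str] = {}
    # Apply the root ancestor first so that subclasses win on conflicts.
    for cls in reversed(chain):
        merged.update(cls.properties)
    return merged


# Hypothetical example data, not real Heritrix classes.
classes = {
    "BaseProcessor": BeanClass("BaseProcessor", None, {"enabled": "boolean"}),
    "ExampleExtractor": BeanClass(
        "ExampleExtractor", "BaseProcessor", {"maxSizeToParse": "long"}
    ),
}
inherited = all_properties("ExampleExtractor", classes)
```

Applying the root class first and the subclass last means an overridden property is documented with the subclass's definition, which is probably what a reader of the reference expects.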
For now use `lein keygen` to make an RSA key, or just use scp with: `lein pom`, then `scp pom.xml yourlib.jars [email protected]:`. I'm going to look into whether using Ganymed SSH...
ExtractorPDF is using a very old version of itext from 2006. Updating it to the newer version of itext used in contrib would require distribution of core Heritrix to come...
Heritrix currently records DNS records as text/dns response records. This seems incorrect according to the WARC spec as response records should include the network protocol information and the [DNS wire...
If a seed URL returns a redirect with any of the status codes `303 See Other`, `307 Temporary Redirect` or `308 Permanent Redirect`, then the redirect field is not populated...
Having a subdirectory for crawl/capture artifacts (configuration files, logs, reports etc) would be useful for the use case of storing or transporting an entire crawl job in a form that's...
It would be nice to have an option to output in CDXJ format. Pywb's cdx-indexer uses the command-line option "-j, --cdxj" for that so it'd be nice if we support...
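For context, a CDXJ line consists of a SURT-style URL key, a 14-digit timestamp, and a JSON object holding the remaining fields. The Python sketch below only illustrates that shape: the `simple_surt` and `cdxj_line` helpers are hypothetical, the key generation is heavily simplified compared with real SURT canonicalization, and the field names in the example are assumptions rather than a complete or authoritative set.

```python
import json
from urllib.parse import urlsplit


def simple_surt(url: str) -> str:
    """Build a simplified SURT-style key: reversed host labels, ')' and the path.

    Real canonicalization applies further normalization that is omitted here.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    key = ",".join(reversed(host.split("."))) + ")" + (parts.path or "/")
    if parts.query:
        key += "?" + parts.query
    return key


def cdxj_line(url: str, timestamp: str, fields: dict) -> str:
    """Format one index line as '<key> <timestamp> <json>'."""
    record = {"url": url, **fields}
    return f"{simple_surt(url)} {timestamp} {json.dumps(record)}"


# Illustrative values only.
line = cdxj_line(
    "http://example.com/index.html",
    "20240101000000",
    {"status": "200"},
)
```

Keeping the variable fields in a JSON object is what lets a CDXJ index gain new fields without changing the column layout.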
It'd be nice to make use of the Java 11 versions of inflate() and deflate() so that buffers that aren't array-backed can be used. One option would be to produce...
`jwarc filter resource | jwarc exec file`

`jwarc filter image | jwarc exec montage`