Andy Jackson

Results: 167 comments by Andy Jackson

Looking deeper inside, [we find](https://github.com/iipc/openwayback/blob/6d64391226db88fbf4fc7ef44a9b2bbfbb3166da/wayback-core/src/main/java/org/archive/wayback/resourcestore/indexer/HTTPRecordAnnotater.java#L137):

``` java
// Now the sticky part: If it looks like an HTML document, look for
// robot meta tags:
if(isHTML(mimeType)) {
    String fileContext =...
```

@ikreymer's [openwayback-sample-overlay](https://github.com/iipc/openwayback-sample-overlay) goes some way to addressing this, I think.

Java has a system for extensible URL protocol support; we could consider using it if necessary, or check whether these hooks have already been written by others.

- http://stackoverflow.com/questions/26363573/registering-and-using-a-custom-java-net-url-protocol...
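For illustration only, here is a minimal sketch of what such a hook can look like using the standard `java.net.URLStreamHandler` extension point. The `demo:` scheme, the `DemoHandler` class, and the `read` helper are all hypothetical names invented for this example; they are not part of OpenWayback. The handler is passed directly to the `URL` constructor rather than registered JVM-wide, and newer JDKs mark that constructor as deprecated.

``` java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

class CustomProtocolSketch {

    /** Hypothetical handler for an invented "demo:" scheme. */
    static class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL target) {
            return new URLConnection(target) {
                @Override
                public void connect() {
                    // Nothing to open here; just mark the connection as established.
                    connected = true;
                }

                @Override
                public InputStream getInputStream() throws IOException {
                    connect();
                    // A real handler would fetch the resource; this one echoes the path.
                    String body = "resource at " + getURL().getPath();
                    return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    /** Opens a URL with the hypothetical handler attached and returns its body as text. */
    static String read(String spec) throws IOException {
        // Supplying the handler here avoids the one-shot, JVM-wide
        // URL.setURLStreamHandlerFactory registration.
        URL url = new URL(null, spec, new DemoHandler());
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read("demo:/example/path"));
    }
}
```

Attaching the handler per URL keeps the sketch side-effect free; a JVM-wide registration would make the scheme available to any code that constructs a `URL` from a string, at the cost of being settable only once per JVM.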

IIRC from the Paris meeting, the issue was about how to manage and roll through large indexes with daily changes. But perhaps we should close this if no-one is clamouring...

Note that any refactoring should be done on a fork first. Apart from anything else, SCAPE deliverables may be referencing individual resources in this data set, and so changing the...

Note feedback thus far:

- https://twitter.com/beet_keeper/status/509843753901629440
- https://twitter.com/bitsgalore/status/510004805335797760

Yeah, I mean, it's log4j 1 not 2, but that's not great either. Perhaps the whole tools section should just be deleted? Is anyone using any of it? I'm unlikely...

I note these links appear to redirect to http://www2.girona.cat/ca (i.e. with a www2. instead of a www.) -- is it possible that didn't fall into the scope of the crawl?

Related to #29 in terms of UI-level integration? Or would this integration happen within OpenWayback?