Mat Kelly

Results 434 issues of Mat Kelly

The function `getURIsAndDatetimesInCDXJ()` in replay.py iterates through every line in a list of lines to extract the `datetime`, `mime`, and HTTP `status` from a CDXJ file to be used by...

enhancement
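
For context on the per-line extraction described in the issue above, here is a minimal sketch of pulling the datetime, MIME type, and status out of CDXJ lines. It is not the code of `getURIsAndDatetimesInCDXJ()`. The function names `parse_cdxj_line` and `parse_cdxj` are hypothetical, and the line layout (`<key> <datetime> <JSON>`) and the JSON field names `mime_type` and `status_code` are assumptions about the index format.

```python
import json


def parse_cdxj_line(line):
    """Split one CDXJ record into (key, datetime, mime, status)."""
    # Assumed layout: "<surt-key> <14-digit-datetime> <json-object>"
    key, datetime, payload = line.strip().split(" ", 2)
    fields = json.loads(payload)
    # "mime_type" and "status_code" are assumed field names
    return key, datetime, fields.get("mime_type"), fields.get("status_code")


def parse_cdxj(lines):
    """Parse every record line, visiting each line once (linear time)."""
    records = []
    for line in lines:
        # Skip blank lines and metadata lines such as "!context ..."
        if not line.strip() or line.startswith("!"):
            continue
        records.append(parse_cdxj_line(line))
    return records
```

Because every line is visited, the cost grows linearly with the size of the index, which is presumably the concern the issue raises.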

When developing ipwb, I have been using a cycle of `pip install`ing the source then testing. This gets tedious, repetitive, and incurs additional iterative temporal cost. It would be useful...

enhancement
ipwb indexer
ipwb replay
Meta

This implementation is heavily inspired by @ibnesayeed's [MementoMap implementation](https://github.com/oduwsdl/mementomap). In the future we _might_ align this with @ibnesayeed's [upcoming API](https://github.com/ibnesayeed/binsearch) but for now, this drastically speeds up replay and thus...
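
The truncated summary above does not spell out the technique, but the linked `binsearch` repository name suggests binary search over a sorted index. As a rough illustration only, assuming the CDXJ record lines are held in memory in sorted order and each begins with its lookup key followed by a space, a lookup could be sketched as follows. The function name `find_records` is hypothetical and this is not the ipwb or MementoMap code.

```python
from bisect import bisect_left


def find_records(sorted_lines, surt_key):
    """Return all lines whose first field equals surt_key.

    Assumes sorted_lines is sorted in plain string order and that each
    line has the form "<key> <datetime> <json>".
    """
    prefix = surt_key + " "
    # Binary search for the first line that is >= the key prefix
    start = bisect_left(sorted_lines, prefix)
    matches = []
    # Lines sharing a prefix are contiguous in a sorted list
    for line in sorted_lines[start:]:
        if not line.startswith(prefix):
            break
        matches.append(line)
    return matches
```

The search step takes logarithmic rather than linear time in the number of lines, which is the kind of saving that would speed up replay on large indexes.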

This is a meta-ticket. @ibnesayeed has crafted automation of the release and testing process through GitHub Actions, inclusive of generating the release notes. These release notes, among other things, contain...

Meta

Related to #165 and an idea I am still hashing (no pun) out. CDXJ output for `ipwb index -e ipwb/samples/warcs/5mementos.warc` then entering the key `goMonarchs` produces: CDXJ output !context ["http://tools.ietf.org/html/rfc7089"]...

enhancement
Privacy/Security/Encryption

The [WARC 1.1 spec](https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.1/) allows for more precise datetimes. These should be supported in the replay system. Does any tool exist that will generate these yet? If not, some sample...

Memento 🕣
i/o
pending

This occurs in WARCs with WARC Response records containing HTTP responses with a 204 No Content status code. Sample provided in `samples/warcs/HTTP204.warc`. When `pushToIPFS()` in the indexer calls `pushBytesToIPFS()`, which...

bug
External project dependence
ipwb indexer

Missing images both from the page itself as well as the reconstructive logo. WARC created with local webrecorder--built, run, and recorded using Docker and the webrecorder web interface: [temp-20180822005001.warc.gz](https://github.com/oduwsdl/ipwb/files/2308565/temp-20180822005001.warc.gz) ipwb...

bug
ipwb replay
serviceWorker ⚙

A CDXJ may be specified to the replay system by a URI or IPFS hash per the README. Testing this requires the content first being referred to in the remote...

ipwb replay
testing
testing-travisci-multinoderequirement