Andy Jackson

180 comments by Andy Jackson

As one of the people involved in setting up this GitHub repository, I can at least give some context. The 'real' (authoritative) WARC specification is the ISO standard. However, the...

Also hitting this when trying to query Parquet files hosted on GitHub Pages. It [looks like](https://github.com/duckdb/duckdb-wasm/issues/1932#issuecomment-3043954005) there is a [new config option in duckdb-wasm](https://github.com/duckdb/duckdb-wasm/pull/2060) that should offer a workaround (`forceFullHTTPReads`)...

Perhaps there are advantages to wrapping the zstd dictionary in a WARC record because it means metadata about the dictionary can be added? For example, could a unique ID (or the...

Hm, just realised there are [quite a few tinyletter links dotted about](https://github.com/search?q=repo%3Athe-turing-way%2Fthe-turing-way%20tinyletter&type=code). I don't mind fixing more, although perhaps older pages associated with past events should be left alone?

No worries! Glad to help out. I've just search/replaced `tinyletter.com/TuringWay` with `buttondown.com/turingway` across all files that do not have a year in their name or path (i.e. excluding those that...
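For anyone wanting to repeat this kind of year-aware search/replace, here is a rough Python sketch. It is not the command that was actually run. The `replace_links` and `has_year` helpers, the year pattern (four digits starting with 19 or 20), and the dry-run default are all assumptions made for illustration.

```python
import re
from pathlib import Path

OLD = b"tinyletter.com/TuringWay"
NEW = b"buttondown.com/turingway"

# Assumption: a "year" is a standalone four-digit run starting with 19 or 20.
YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def has_year(rel_path: Path) -> bool:
    """Return True if any part of the relative path contains a year."""
    return bool(YEAR.search(rel_path.as_posix()))


def replace_links(root: Path, dry_run: bool = True) -> list[Path]:
    """Replace OLD with NEW in files under root whose path has no year.

    Returns the relative paths of files that contain OLD. Files are only
    rewritten when dry_run is False.
    """
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or ".git" in path.parts:
            continue
        rel = path.relative_to(root)
        if has_year(rel):
            continue  # leave pages tied to past events alone
        data = path.read_bytes()  # bytes keep line endings untouched
        if OLD not in data:
            continue
        changed.append(rel)
        if not dry_run:
            path.write_bytes(data.replace(OLD, NEW))
    return changed
```

Calling `replace_links(Path("."))` first lists what would change, which makes it easier to spot dated files before passing `dry_run=False`.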

Ah, oops, there were some 2019 files. Hang on...

Ahem. Sorry about that. Should have known better than to trust search/replace! Should be good now, I think.

@aleesteele thanks, that's very kind. Glad to help out a little bit!

Hmm, not sure how best to proceed with this. Having a list of focussed/high-yield journals seems reasonable, but as we can see, with digital preservation being so interdisciplinary in nature,...

I'm also seeing the same issue. The `path` of the relation is being recorded instead of the `slug`, i.e. `/_index` gets appended to it. Using `{{title}}` instead is more...