Mat Kelly

Results 844 comments of Mat Kelly

:wave: @ibnesayeed I included that as a self-reference/reminder of the source as well as a good demonstration of the proposed element's features.

The further precision is not present in the link above. What's the BnF link? Also see https://github.com/iipc/warc-specifications/pull/21.

First line `WARC/1.1` causes an exception in the iterator we currently reuse from pywb to quickly invalidate the WARC and not proceed with processing.

Per the WARC/1.1 spec and https://github.com/iipc/warc-specifications/pull/21, date strings like `2014-01` are legal but currently break the indexer with:
```py
Traceback (most recent call last):
  File "/usr/local/bin/ipwb", line 11, in <module>
    sys.exit(main())...
```

Added a sample (variableSizedDates) WARC that I believe conforms to the 1.1 standard with variable length datetime strings.

Encountered this again in testing, current master (73f136f48334ed9ca09c50413a2a9f0f51d251b0):
```py
% ipwb index samples/warcs/variableSizedDates.warc
Traceback (most recent call last):eSizedDates.warc: 1/5
  File "/usr/local/bin/ipwb", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/ipwb/__main__.py", line 19, in...
```

Given the rationale for conversion is from ISO8601 to 14-digit datetime, some options:

1. Assume undefined aspects of the datetime, e.g., 2014-01 to 20140101000000
2. Adapt to allow for fuzziness....
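Option 1 could be sketched roughly as below. This is only an illustration under stated assumptions: the helper name `pad_to_14_digits` is hypothetical and not part of ipwb, the input is assumed to be a UTC value in one of the reduced-precision forms (no numeric timezone offset), and missing month/day default to `01` while missing time fields default to `00`.

```python
import re


def pad_to_14_digits(iso_datetime):
    """Pad a possibly reduced-precision ISO8601 string to YYYYMMDDhhmmss.

    Hypothetical helper for illustration; assumes a UTC value with an
    optional trailing 'Z' and no numeric timezone offset.
    """
    # Drop fractional seconds so their digits are not mistaken for date parts.
    without_fraction = re.sub(r'\.\d+', '', iso_datetime)
    # Keep only the digits: '2014-01' becomes '201401'.
    digits = re.sub(r'\D', '', without_fraction)[:14]
    # Defaults for whatever is missing: month 01, day 01, time 00:00:00.
    defaults = '00000101000000'
    return digits + defaults[len(digits):]


print(pad_to_14_digits('2014-01'))
```

The obvious trade-off is that the padded value claims more precision than the source record had, which is what option 2 would try to avoid.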

The key here is ISO8601 with "as much precision as is accurately known." I cannot locate a module to accomplish this but a series of tests (e.g., regex) with the...
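As a rough idea of what such a series of regex tests might look like, here is a sketch that reports how much precision a date string carries. The names `PRECISION_PATTERNS` and `detect_precision` are made up for this example, and the set of accepted forms is my assumption (year, month, day, minute, second, fractional second, all UTC with a trailing `Z` on the time forms); it should be checked against the spec before being relied on.

```python
import re

# Assumed set of date forms, ordered from least to most precise.
PRECISION_PATTERNS = [
    ('year', r'\d{4}'),
    ('month', r'\d{4}-\d{2}'),
    ('day', r'\d{4}-\d{2}-\d{2}'),
    ('minute', r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}Z'),
    ('second', r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z'),
    ('fraction', r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z'),
]


def detect_precision(value):
    """Return the name of the first pattern that matches the whole string."""
    for name, pattern in PRECISION_PATTERNS:
        if re.fullmatch(pattern, value):
            return name
    # None signals a string that fits none of the assumed forms.
    return None


for sample in ('2014', '2014-01', '2014-02-10T00:00:01.000000002Z'):
    print(sample, detect_precision(sample))
```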

9cd23ba addresses some of this but I have yet to match the fraction-of-a-second example in that WARC:
```py
>>> import datetime
>>> datetime.datetime.strptime('%Y-%m-%dT%H:%M:%S.%fZ','2014-02-10T00:00:01.000000002Z')
ValueError: time data '%Y-%m-%dT%H:%M:%S.%fZ' does not match format '2014-02-10T00:00:01.000000002Z'...
```

The parameters above are backward; the format string should be second. This works:
```
>>> datetime.datetime.strptime('2014-02-10T00:00:01.000000Z','%Y-%m-%dT%H:%M:%S.%fZ')
datetime.datetime(2014, 2, 10, 0, 0, 1)
```
Note, however, that %f reads six 0-padded digits....
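One possible workaround for the nine-digit fraction, sketched here as an assumption rather than what ipwb does: trim the fraction to six digits before handing the string to `strptime`, since `%f` accepts at most six. The function name `parse_warc_datetime` is hypothetical, and the sketch only handles the full `...Z` forms with or without a fraction.

```python
import datetime
import re


def parse_warc_datetime(value):
    """Parse a second-or-finer UTC datetime, discarding sub-microsecond digits.

    Hypothetical helper for illustration only.
    """
    # Keep the first six fractional digits and drop the rest,
    # because %f in strptime accepts one to six digits.
    trimmed = re.sub(r'(\.\d{6})\d+', r'\1', value)
    if '.' in trimmed:
        fmt = '%Y-%m-%dT%H:%M:%S.%fZ'
    else:
        fmt = '%Y-%m-%dT%H:%M:%SZ'
    return datetime.datetime.strptime(trimmed, fmt)


print(parse_warc_datetime('2014-02-10T00:00:01.000000002Z'))
```

Truncating loses the nanosecond part, which does not matter if the target is a 14-digit datetime with one-second resolution.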