warcit

Revisit records don't seem to adhere to the --fixed-dt timestamp

Open Shrinks99 opened this issue 1 year ago • 0 comments

All other records in the created WARC file seem to adhere to the --fixed-dt flag when the user sets it. Revisit records, which warcit creates automatically based on the directory structure, are the only ones that seem to ignore it.

This is possibly because revisit records use a different method of deriving warc_date than other records do. See https://github.com/webrecorder/warcit/blob/d94ecd791c43a27b186dba81d5c118c23f1647c9/warcit/warcit.py#L547-L554 vs https://github.com/webrecorder/warcit/blob/d94ecd791c43a27b186dba81d5c118c23f1647c9/warcit/warcit.py#L495-L501
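If the two code paths really do derive the date differently, one possible remedy is to route both through a single helper. The sketch below only illustrates that idea and is not warcit's actual code: `resolve_warc_date`, `build_headers`, and their parameters are hypothetical names, and falling back to the file modification time when no fixed date is given is an assumption.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def resolve_warc_date(fixed_dt: Optional[datetime], file_mtime: float) -> str:
    """Pick the WARC-Date for a record (hypothetical helper, not warcit's API).

    Prefer the user-supplied fixed datetime; otherwise fall back to the
    file's modification time (assumed default behaviour).
    """
    if fixed_dt is not None:
        dt = fixed_dt
    else:
        dt = datetime.fromtimestamp(file_mtime, tz=timezone.utc)
    # Treat naive datetimes as UTC so the output is unambiguous.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_headers(record_type: str, fixed_dt: Optional[datetime],
                  file_mtime: float) -> Dict[str, str]:
    """Build minimal WARC headers; every record type shares one date source."""
    return {
        "WARC-Type": record_type,
        "WARC-Date": resolve_warc_date(fixed_dt, file_mtime),
    }
```

With a shared helper like this, a `revisit` record and a `resource` record built from the same inputs cannot end up with different dates.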

Screenshot

This issue only appears to affect revisit records as shown below.

Screenshot 2023-11-02 at 11 02 23 PM

The timestamp in the current URL shows the date the WARC was created instead of the --fixed-dt date. The HTML file shows the correct date: the time at which these website files would have been seen, according to the user of warcit.

Screenshot 2023-11-02 at 11 07 58 PM
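The same check can be done without a replay tool by listing each record's type and WARC-Date and flagging any that differ from the expected fixed date. This is a rough sketch under assumptions: `find_mismatched_dates` and `iter_warc_dates` are hypothetical helper names, the expected date is assumed to be given in the same string format as the WARC-Date header, and reading the file relies on the warcio library being installed.

```python
from typing import Iterable, Iterator, List, Tuple


def find_mismatched_dates(records: Iterable[Tuple[str, str]],
                          expected_date: str) -> List[Tuple[str, str]]:
    """Return (record type, WARC-Date) pairs whose date differs from expected_date."""
    return [(rec_type, date) for rec_type, date in records if date != expected_date]


def iter_warc_dates(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (record type, WARC-Date) for each record in a WARC file."""
    # Imported lazily so the comparison helper works without warcio installed.
    from warcio.archiveiterator import ArchiveIterator

    with open(path, "rb") as stream:
        for record in ArchiveIterator(stream):
            yield record.rec_type, record.rec_headers.get_header("WARC-Date")
```

For example, `find_mismatched_dates(iter_warc_dates("example.warc.gz"), "2020-01-02T03:04:05Z")` would list the records that ignored the fixed date; the file name and date here are placeholders.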

Shrinks99, Nov 03 '23 03:11