augur icon indicating copy to clipboard operation
augur copied to clipboard

Refactor dates and times in the Commits table

Open jberkus opened this issue 3 years ago • 5 comments

(1) The dates in augur_data.commits have been stored in character varying fields. This forces coversion to and from date types every time they're used, and could lead to code errors:

Fields in question:

cmt_author_date              │ character varying(10)
cmt_committer_date           │ character varying

(2) Of the other datetimes in the table, half are timestamp with time zone, and half are timestamp without time zone. These are certainly already leading to query errors:

cmt_date_attempted           │ timestamp(0) without time zone
 cmt_ght_committed_at         │ timestamp(0) without time zone
cmt_committer_timestamp      │ timestamp(0) with time zone
cmt_author_timestamp         │ timestamp(0) with time zone
data_collection_date         │ timestamp(0) without time zone

Was there a reason this was done, or was it completely a design error?

jberkus avatar Nov 13 '21 01:11 jberkus

We refactored brianwarner/facade into Augur for commits (with permission). It came that way. It can be fixed. We've changed other things and kept it updated. I'll ask Brian the origins of the choice. I suspect it emerges from git version inconsistencies that were more of an issue in the before times and less of an issue today, since all that data comes from reading the git log.

sgoggins avatar Nov 13 '21 01:11 sgoggins

The ones with time zone are additions based on an earlier collaboration. We did that. Adding a column seemed less risky to me than changing one. It allowed for a long period of sustained validation. It's toenail fungus we can clean up now.

sgoggins avatar Nov 13 '21 01:11 sgoggins

We should be able to just change the type for data_collection_date, though, no? I mean, we know that's going to be in the system timezone.

jberkus avatar Nov 13 '21 01:11 jberkus

@jberkus : The data_collection_date is just when we got the data. Are you thinking of using that as some kind of offset?

sgoggins avatar Dec 09 '21 20:12 sgoggins

Yes, but it's TIMESTAMP WITHOUT TIME ZONE. Nothing in the system should be WITHOUT TIME ZONE, and that one seems like an easy fix -- which TZ it is doesn't require guesswork.

jberkus avatar Dec 09 '21 23:12 jberkus