runs: make key
- [x] Verify proposed key
- [x] Implement constraints for proposed key
- [ ] Implement whitespace-normalization constraint for parameters column
Proposed key
The proposed key is
(model_id, site_id, start_time, finish_time, parameter_list, ensemble_id)
The runs table is currently empty, so this can be implemented now on ebi-forecast.
Constraints
ALTER TABLE runs ALTER COLUMN model_id ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN site_id ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN start_time ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN finish_time ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN parameter_list ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN ensemble_id ADD NOT NULL;
ALTER TABLE runs ADD CONSTRAINT unique_time_interval_per_model_site_parameter_list_and_ensemble_combination UNIQUE (model_id, site_id, start_time, finish_time, parameter_list, ensemble_id);
In addition, the key will be strengthened by adding a whitespace-normalization constraint to the paramater_list column.
@dlebauer , @robkooper I just read a note pertaining to this in the "Constraints for BETYdb" document: "finished_at will not be available when record is created" (I assume finished_at really means finish_time.)
So either this has to initially be set, say, to start_time, or we can't have a NOT NULL constraint on this. In the latter case the proposed key really isn't a key (though, it seems likely that finish_time could be dropped from the key without violating uniqueness).
Ah, now I see that the table includes start_time, finish_time, start_date, finish_date, started_at, and finished_at. Which of these should be part of the key?
In the constraints table, it says to add the following comments:
COMMENT ON COLUMN "public"."runs"."finish_time" IS 'end of time period being simulated';
COMMENT ON COLUMN "public"."runs"."finished_at" IS 'system time when run ends; can be null when record is created';
COMMENT ON COLUMN "public"."runs"."start_time" IS 'beginning of time period being simulated';
COMMENT ON COLUMN "public"."runs"."started_at" IS 'system time when run ends';
so the *_time information will be available when the run starts. Only the *_at information will not be available. I think null would be fine, but we could use now() and interpret started_at == finished_at as not finished.
I assume the comment on runs.started_at should be:
COMMENT ON COLUMN "public"."runs"."started_at" IS 'system time when run begins';
This key has been implemented. But enforcing whitespace normalization on parameter_list has not been implemented. On ebi_production,
select count(*) from runs where is_whitespace_normalized(parameter_list);
returns 467261 and
select count(*) from runs where not is_whitespace_normalized(parameter_list);
returns 138.
If we don't care to bother enforcing whitespace normalization of parameter_list, please close this. But see issues #245 and #257 for other runs constraints issues on timestamp columns.