bety icon indicating copy to clipboard operation
bety copied to clipboard

runs: make key

Open gsrohde opened this issue 10 years ago • 5 comments

  • [x] Verify proposed key
  • [x] Implement constraints for proposed key
  • [ ] Implement whitespace-normalization constraint for parameters column

Proposed key

The proposed key is

(model_id, site_id, start_time, finish_time, parameter_list, ensemble_id)

The runs table is currently empty, so this can be implemented now on ebi-forecast.

Constraints

ALTER TABLE runs ALTER COLUMN model_id ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN site_id ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN start_time ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN finish_time ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN parameter_list ADD NOT NULL;
ALTER TABLE runs ALTER COLUMN ensemble_id ADD NOT NULL;
ALTER TABLE runs ADD CONSTRAINT unique_time_interval_per_model_site_parameter_list_and_ensemble_combination UNIQUE (model_id, site_id, start_time, finish_time, parameter_list, ensemble_id);

In addition, the key will be strengthened by adding a whitespace-normalization constraint to the paramater_list column.

gsrohde avatar Jan 29 '15 22:01 gsrohde

@dlebauer , @robkooper I just read a note pertaining to this in the "Constraints for BETYdb" document: "finished_at will not be available when record is created" (I assume finished_at really means finish_time.)

So either this has to initially be set, say, to start_time, or we can't have a NOT NULL constraint on this. In the latter case the proposed key really isn't a key (though, it seems likely that finish_time could be dropped from the key without violating uniqueness).

gsrohde avatar Feb 06 '15 19:02 gsrohde

Ah, now I see that the table includes start_time, finish_time, start_date, finish_date, started_at, and finished_at. Which of these should be part of the key?

gsrohde avatar Feb 06 '15 19:02 gsrohde

In the constraints table, it says to add the following comments:

COMMENT ON COLUMN "public"."runs"."finish_time" IS 'end of time period being simulated';
COMMENT ON COLUMN "public"."runs"."finished_at" IS 'system time when run ends; can be null when record is created';
COMMENT ON COLUMN "public"."runs"."start_time" IS 'beginning of time period being simulated';
COMMENT ON COLUMN "public"."runs"."started_at" IS 'system time when run ends';

so the *_time information will be available when the run starts. Only the *_at information will not be available. I think null would be fine, but we could use now() and interpret started_at == finished_at as not finished.

dlebauer avatar Feb 06 '15 19:02 dlebauer

I assume the comment on runs.started_at should be:

COMMENT ON COLUMN "public"."runs"."started_at" IS 'system time when run begins';

gsrohde avatar Mar 10 '15 16:03 gsrohde

This key has been implemented. But enforcing whitespace normalization on parameter_list has not been implemented. On ebi_production,

select count(*) from runs where is_whitespace_normalized(parameter_list);

returns 467261 and

select count(*) from runs where not is_whitespace_normalized(parameter_list);

returns 138.

If we don't care to bother enforcing whitespace normalization of parameter_list, please close this. But see issues #245 and #257 for other runs constraints issues on timestamp columns.

gsrohde avatar Aug 03 '16 17:08 gsrohde