bety icon indicating copy to clipboard operation
bety copied to clipboard

inputs value constraints: needed cleanup and decisions

Open gsrohde opened this issue 10 years ago • 2 comments

notes

  • [x] Add not NULL constraint to migration

start_date, end_date

  • [ ] Decide if should be not NULL
  • [ ] Decide if there should be other constraints (see below)
  • [ ] Fix violations
  • [ ] Add constraints to migration

name

  • [x] Add not NULL and is_whitespace_normalized constraints to migration

access_level

nullness
  • [x] Decide if should be not NULL
    • should be not null

If so:

  • [x] Fix violations
    • change null to 1 (conservatively assuming private)
  • [x] Add constraint to migration
range
  • [x] Decide whether to require access_level IN (1, 2, 3, 4)
    • yes

If so:

  • [x] Fix violations
  • [x] Make domain and change column type to use it

raw

  • [x] Decide if should be not NULL
    • should be either 0 or 1; null by default (assuming raw)

If so:

  • [x] Fix violations
    • ~~convert t --> 1, f --0 (please confirm @robkooper) to be consistent with other boolean fields~~ I left this as a boolean for now. Why change to a type that requires much more storage when boolean yields the same amount of information?
    • convert NULL -> 1 (assume raw) -- I CHANGED THESE TO TRUE INSTEAD.
  • [x] Add constraint to migration

format_id

  • [ ] Get rid of NULLs (2 currently)
  • [ ] Add non null constraint to migration

Details

start_date, end_date

SELECT start_date, end_date FROM inputs WHERE start_date IS NULL OR end_date IS NULL;

shows 18 rows with null start_date and 19 with null end_date.

SELECT start_date, end_date FROM inputs WHERE start_date >= end_date;

shows 1 row where start_date > end_date and 2 where they are equal.

CHECK (start_date < end_date) seem like a reasonable constraint (unless we want to attach special significance to them being equal or to end_date being before start_date).

SELECT start_date, end_date FROM inputs WHERE start_date > NOW() OR end_date > NOW();

shows 3 end_date values way in the future. We would think that CHECK (end_date < NOW()) is a reasonable constraint.

access_level

SELECT COUNT(*), access_level FROM inputs GROUP BY access_level ORDER BY access_level;

shows that there are 2 rows with an access_level of -1 and 192 rows where access_level is NULL.

It's not clear what access_level does since BETYdb Rails app users with only acces_level 4 can view and edit all inputs.

  • A: access_level defines who should / should not access files, independent of rails app
  • @robkooper shouldn't this be handled in dbfiles (or preferably at filesystem level)?

raw

SELECT id, name FROM inputs WHERE raw IS NULL;

shows 8 rows where raw is NULL.

gsrohde avatar Feb 16 '15 14:02 gsrohde

@dlebauer You said the other day to make raw default to false. Looking back over these comments, I see that I fixed violations of raw NOT NULL constraint by changing NULL to TRUE. Should FALSE still be the default? Also, please see pending issues with start_date, end_date, and format_id--if you want to table these issues, I'll leave related constraints out of the migration.

gsrohde avatar May 20 '15 19:05 gsrohde

Depends on definition of "raw" and specific datasets. I still think default should be null, and perhaps existing files evaluated on a per file basis

Start date / end date do not apply to some inputs. Perhaps these should be allowed to be NULL

dlebauer avatar May 20 '15 19:05 dlebauer