mapvizieR
what constraints should we enforce on the roster object?
although there are some student 'facts' that remain (or are generally considered to remain) immutable over time (DOB, gender, etc.), we don't want to create a data structure to normalize that data. it's not the use case we're trying to solve for.
the roster object needs to have at least one row per student, per term. if you have demographic data, it should reflect that student's demographics at that moment in time (i.e., what was the student's lunch status in 2008?)
note: you need at least one row per student/year/season, but you are not limited to one row per student/year/season. for some roster data (like courses), it's sensible to store these in long form.
task: create a check that each student in the cdf has a roster entry for the relevant year/terms. TBD whether that check should fail or simply warn (probably warn).
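a rough sketch of what that coverage check could look like, written in plain python just to pin down the logic (this is not mapvizieR code). the field names `studentid`, `year` and `season` and both function names are assumptions made up for illustration, and it warns rather than fails, per the "probably warn" note above.

```python
import warnings

# Hypothetical key fields; the real cdf/roster column names may differ.
KEY_FIELDS = ("studentid", "year", "season")


def missing_roster_entries(cdf_rows, roster_rows):
    """Return sorted student/year/season keys found in the cdf but not in the roster."""
    roster_keys = {tuple(r[f] for f in KEY_FIELDS) for r in roster_rows}
    cdf_keys = {tuple(r[f] for f in KEY_FIELDS) for r in cdf_rows}
    return sorted(cdf_keys - roster_keys)


def check_roster_coverage(cdf_rows, roster_rows):
    """Warn (rather than fail) when cdf rows have no matching roster entry."""
    missing = missing_roster_entries(cdf_rows, roster_rows)
    if missing:
        warnings.warn(
            f"{len(missing)} student/year/season combination(s) in the cdf "
            "have no roster entry"
        )
    return missing


# Toy data: S1 has two roster rows for the same term (long form is allowed),
# S2 has a test record but no roster row at all.
cdf = [
    {"studentid": "S1", "year": 2013, "season": "Fall"},
    {"studentid": "S2", "year": 2013, "season": "Fall"},
]
roster = [
    {"studentid": "S1", "year": 2013, "season": "Fall", "course": "Math"},
    {"studentid": "S1", "year": 2013, "season": "Fall", "course": "Reading"},
]

missing = check_roster_coverage(cdf, roster)
```

comparing sets of keys means duplicate roster rows per student/year/season don't matter, which fits the "at least one row" rule.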
adding some thoughts here:
- look for weird patterns of student / grade enrollments. does a student get 'promoted' or 'demoted' during the year? would be useful to raise that so the user can investigate.
- look for changes in student first name / last name where the student has the same id. we obviously want to allow students to have different names over time if their legal name changes, but changes in name where the id stays constant often signal a data problem/mismatch.
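the two checks above could be sketched roughly like this, again in plain python to show the logic only. `studentid`, `year`, `season` and `grade` are assumed field names and both function names are hypothetical. names are compared after trimming whitespace and lowercasing, which is one possible choice and not a settled rule.

```python
from collections import defaultdict


def grade_changes_within_year(roster_rows):
    """Map (studentid, year) to the sorted grades seen, when more than one grade appears."""
    grades = defaultdict(set)
    for row in roster_rows:
        grades[(row["studentid"], row["year"])].add(row["grade"])
    return {key: sorted(seen) for key, seen in grades.items() if len(seen) > 1}


def name_changes_for_same_id(roster_rows):
    """Map studentid to the sorted (first, last) name pairs seen, when more than one appears."""
    names = defaultdict(set)
    for row in roster_rows:
        # Assumption: ignore case and surrounding whitespace when comparing names.
        first = row["studentfirstname"].strip().lower()
        last = row["studentlastname"].strip().lower()
        names[row["studentid"]].add((first, last))
    return {sid: sorted(seen) for sid, seen in names.items() if len(seen) > 1}


# Toy roster: S1 moves from grade 3 to grade 4 mid-year, S2's first name changes.
roster = [
    {"studentid": "S1", "year": 2013, "season": "Fall", "grade": 3,
     "studentfirstname": "Lee", "studentlastname": "Park"},
    {"studentid": "S1", "year": 2013, "season": "Spring", "grade": 4,
     "studentfirstname": "Lee", "studentlastname": "Park"},
    {"studentid": "S2", "year": 2013, "season": "Fall", "grade": 5,
     "studentfirstname": "Ana", "studentlastname": "Diaz"},
    {"studentid": "S2", "year": 2013, "season": "Spring", "grade": 5,
     "studentfirstname": "Anna", "studentlastname": "Diaz"},
]

grade_flags = grade_changes_within_year(roster)
name_flags = name_changes_for_same_id(roster)
```

both functions only return the suspicious cases and don't raise, so the user can investigate; legitimate name changes would show up here too and need a human look.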
additional thoughts:
- roster object now has to have studentfirstname and studentlastname, so that we can identify students on scatter plots, `haid_plot`, etc.
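a minimal sketch of enforcing that requirement, in plain python for illustration only. `studentfirstname` and `studentlastname` come from the note above; `studentid` and the function name are assumptions.

```python
# studentid is an assumed field name; the two name fields are from the note above.
REQUIRED_ROSTER_FIELDS = ("studentid", "studentfirstname", "studentlastname")


def missing_roster_fields(roster_rows, required=REQUIRED_ROSTER_FIELDS):
    """Return the sorted required fields that are absent or empty in at least one row."""
    missing = set()
    for row in roster_rows:
        for field in required:
            if row.get(field) in (None, ""):
                missing.add(field)
    return sorted(missing)


# Toy roster: the second row has no last name.
roster = [
    {"studentid": "S1", "studentfirstname": "Lee", "studentlastname": "Park"},
    {"studentid": "S2", "studentfirstname": "Ana"},
]

problems = missing_roster_fields(roster)
```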