SORMAS-Project
SORMAS-Project copied to clipboard
Create a script to update and migrate script_occupation type values
Problem Description
SORMAS Version 1.75 moves the occupation type values into the customizable enum value table. Previously, original values have also been misused for other values by translating them into something totally different.
This has been a huge problem because users will have chosen values based on an entirely different meaning and have been filling the database with values that do not fit that meaning ever since.
In order to correct the false data present on German systems and also make the values consistent across health departments, a script is necessary to correct for the erroneous translations and values.
Proposed Solution
Create an SQL script that does the following:
- [ ] Insert the following values into the customizable enum table that maybe already exist if they were used before:
Value | english caption | german caption |
---|---|---|
FARMER | Farmer | Selbstständige*r Landwirt*in bzw. Genossenschaftsbauer/bäuerin |
RELIGIOUS_LEADER | Religious leader | Geistliche*r |
HOUSEWIFE | Housewife | Hausfrau/-mann |
PUPIL_STUDENT | Pupil / student | Ausbildung |
WORKING_WITH_ANIMALS | Working with animals | Arbeiten mit Tieren |
- [ ] Insert the following new values into the customizable enum table
Value | english caption | german caption |
---|---|---|
CIVIL_SERVANT | Civil servant | Beamt*in, Richter*in, Berufssoldat*in |
EMPLOYEE | Employee | Angestellte*r |
WORKER | Worker | Arbeiter*in |
ACADEMIC | Academic | Akademiker*in in freiem Beruf |
ASSISTING_FAMILY_MEMBER | Assisting family member | Mithelfende*r Angehörige*r |
NOT_SPECIFIED | Not specified | Keine Angabe |
-
[ ] Migrate the occupation types in the person table to conform to the correct translation and meaning
-
[ ] Delete the values from the customizable enum table that were previously miss used and thusly never selected according to their meaning. It should only be necessary to remove them from the table since all ofcustomizableenumvalue the entries were replaced by step 3
-
[ ] Provide the script as part of the release notes
-
[ ] Add a comment into enum.properties right above the occupationtype entries that says: "Temporarily needed for migration of old systems". This does not have to go into a hotfix branch.
Updated by @MartinWahnschaffe based on HZIOrga-OccupationTypesScript.pdf
Possible Alternatives
Additional Information
Written by @kwa20
@SahaLinaPrueger I have extracted the todos from the document and restructured them a bit.
@markusmann-vg The script can be created and executed independently, so the hotfix does not have to wait for it.
Script to be reviewed by @MartinWahnschaffe:
The following script is only relevant for systems that use German as system language:
Update: This is now outdated. Final script is further down.
-- Step 2: Insert new values used in Germany into database that have not existed or were misused before INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'CIVIL_SERVANT', 'Beamtin, Richterin, Berufssoldatin', '[{"languageCode":"en","value":"Civil servant"},{"languageCode":"de-DE","value":"Beamtin, Richterin, Berufssoldatin"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'EMPLOYEE', 'Angestellter', '[{"languageCode":"en","value":"Employee"},{"languageCode":"de-DE","value":"Angestellter"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'WORKER', 'Arbeiterin', '[{"languageCode":"en","value":"Worker"},{"languageCode":"de-DE","value":"Arbeiterin"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'ACADEMIC', 'Akademikerin in freiem Beruf', '[{"languageCode":"en","value":"Academic"},{"languageCode":"de-DE","value":"Akademikerin in freiem Beruf"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'ASSISTING_FAMILY_MEMBER', 'Mithelfender Angehöriger', '[{"languageCode":"en","value":"Assisting family member"},{"languageCode":"de-DE","value":"Mithelfender Angehöriger"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'NOT_SPECIFIED', 'Keine Angabe', '[{"languageCode":"en","value":"Not specified"},{"languageCode":"de-DE","value":"Keine Angabe"}]');
-- Step 3: Migrate occupation type entries in person table to conform to the corrected translations and meanings UPDATE person SET occupationtype = 'CIVIL_SERVANT' WHERE occupationtype = 'CHILD'; UPDATE person SET occupationtype = 'EMPLOYEE' WHERE occupationtype = 'HUNTER_MEAT_TRADER'; UPDATE person SET occupationtype = 'WORKER' WHERE occupationtype = 'MINER'; UPDATE person SET occupationtype = 'ACADEMIC' WHERE occupationtype = 'BUSINESSMAN_WOMAN'; UPDATE person SET occupationtype = 'BUSINESSMAN_WOMAN' WHERE occupationtype = 'BUTCHER'; UPDATE person SET occupationtype = 'ASSISTING_FAMILY_MEMBER' WHERE occupationtype = 'TRANSPORTER'; UPDATE person SET occupationtype = 'NOT_SPECIFIED' WHERE occupationtype = 'TRADITIONAL_SPIRITUAL_HEALER';
-- Step 4: Delete occupation types that were previously misused and are no longer needed DELETE FROM customizableenumvalue WHERE datatype = 'OCCUPATION_TYPE' AND (value = 'CHILD' OR value = 'HUNTER_MEAT_TRADER' OR value = 'MINER' OR value = 'BUTCHER' OR value = 'BUSINESSMAN_WOMAN' OR value = 'TRANSPORTER' OR value = 'TRADITIONAL_SPIRITUAL_HEALER');
</details>
Afterwards a server restart is needed!
@MateStrysewske Script looks correct and worked as expected on my local system.
@MartinWahnschaffe Which release notes should it be part of?
I think you can put it into the notes of the upcoming release 1.76 release
@MartinWahnschaffe @MateStrysewske I think this should go into 1.75 because that's the version where the occupation types where moved to the customizable enums
Hm, then we should mention it in the existing release notes and also in the upcoming notes of the hotfix.
I run a test on https://survnet-import-test.sormas.netzlink.com/
For the test I have created different test cases on version 1.74.3. The instance was raised to 1.75.4 and the script was executed. After that, for example, the case with the occupation type "child" is displayed as follows:
customizable_enum_values.csv
In the column "Caption" there are German or English entries mixed.
The entries in the column "translation" are partly missing
Drop-down menu
It looks to me like the menus are unchanged.
@MateStrysewske @MartinWahnschaffe Is there a dependency on the ticket https://github.com/hzi-braunschweig/SORMAS-Project/issues/5015 ? Do the two solutions need to be better aligned or does the test need to be adjusted?
Hi @KernB. Yes there is a dependency on #5015, but since it's part of 1.75, updating to this version and then executing this script should work fine.
Where did you take those screenshots? I just had a look at https://survnet-import-test.sormas.netzlink.com/ and there the drop-down looks like this:
So we definitely have a mistake in the script, because the caption should always be in the system language of the server. In this case we should have provided the script with German captions. I will take care of updating this.
Besides that the entries on my screenshot looks fine to me. What I see on the system doesn't match your customizable_enum_values.csv, though.
- Based on the csv WORKING_WITH_ANIMALS was added. Other entries like FARMER and RELIGIOUS_LEADER were already there. In the drop-down of my screenshot all those values seem to be new.
- CIVIL_SERVANT, EMPLOYEE, etc. were added
- CHILD, HUNTER_MEAT_EATER, etc. were removed
I have updated the script above to contain the captions in the right language.
When doing the migration (step 3):
UPDATE person SET occupationtype = 'BUSINESSMAN_WOMAN' WHERE occupationtype = 'BUTCHER';
UPDATE person SET occupationtype = 'ACADEMIC' WHERE occupationtype = 'BUSINESSMAN_WOMAN';
Should the order of these lines not be reversed? Otherwise, every current BUTCHER
value will turn into ACADEMIC
...
Value | German Caption |
---|---|
CHILD | Beamtin, Richterin, Berufssoldat*in |
HUNTER_MEAT_TRADER | Angestellte*r |
MINER | Arbeiter*in |
BUTCHER | Selbstständige*r |
BUSINESSMAN_WOMAN | Akademiker*in in freiem Beruf |
TRANSPORTER | Mithelfender Angehöriger |
TRADITIONAL_SPIRITUAL_HEALER | Keine Angabe |
@MartinWahnschaffe @FredrikSchaeferVitagroup @KernB I think the problem is that I missed an additional value for "Selbstständige*r". We should be fine by just adding SELF_EMPLOYED
and then migrating BUTCHER
to that value. In this way, BUSINESSMAN_WOMAN
and BUTCHER
are disentangled.
VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'SELF_EMPLOYED', 'Self-employed', '[{"languageCode":"de-DE","value":"Selbstständige*r"}]');
UPDATE person SET occupationtype = 'SELF_EMPLOYED' WHERE occupationtype = 'BUTCHER';
@kwa20 Thanks, that makes a lot more sense.
@KernB This is the updated script:
-- Step 1: Insert values used in Germany into database if they have not been used in this particular database yet
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'FARMER', 'Selbstständige*r Landwirt*in bzw. Genossenschaftsbauer/bäuerin', '[{"languageCode":"en","value":"Farmer"},{"languageCode":"de-DE","value":"Selbstständige*r Landwirt*in bzw. Genossenschaftsbauer/bäuerin"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'FARMER');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'RELIGIOUS_LEADER', 'Geistliche*r', '[{"languageCode":"en","value":"Religious leader"},{"languageCode":"de-DE","value":"Geistliche*r"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'RELIGIOUS_LEADER');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'HOUSEWIFE', 'Hausfrau/-mann', '[{"languageCode":"en","value":"Housewife"},{"languageCode":"de-DE","value":"Hausfrau/-mann"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'HOUSEWIFE');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'PUPIL_STUDENT', 'Ausbildung', '[{"languageCode":"en","value":"Pupil / student"},{"languageCode":"de-DE","value":"Ausbildung"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'PUPIL_STUDENT');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'WORKING_WITH_ANIMALS', 'Arbeiten mit Tieren', '[{"languageCode":"en","value":"Working with animals"},{"languageCode":"de-DE","value":"Arbeiten mit Tieren"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'WORKING_WITH_ANIMALS');
-- Step 2: Insert new values used in Germany into database that have not existed or were misused before
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
(nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'CIVIL_SERVANT', 'Beamt*in, Richter*in, Berufssoldat*in', '[{"languageCode":"en","value":"Civil servant"},{"languageCode":"de-DE","value":"Beamt*in, Richter*in, Berufssoldat*in"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
(nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'EMPLOYEE', 'Angestellte*r', '[{"languageCode":"en","value":"Employee"},{"languageCode":"de-DE","value":"Angestellte*r"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
(nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'WORKER', 'Arbeiter*in', '[{"languageCode":"en","value":"Worker"},{"languageCode":"de-DE","value":"Arbeiter*in"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
(nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'SELF_EMPLOYED', 'Selbstständige*r', '[{"languageCode":"en","value":"Self-employed"},{"languageCode":"de-DE","value":"Selbstständige*r"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
(nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'ACADEMIC', 'Akademiker*in in freiem Beruf', '[{"languageCode":"en","value":"Academic"},{"languageCode":"de-DE","value":"Akademiker*in in freiem Beruf"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
(nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'ASSISTING_FAMILY_MEMBER', 'Mithelfende*r Angehörige*r', '[{"languageCode":"en","value":"Assisting family member"},{"languageCode":"de-DE","value":"Mithelfende*r Angehörige*r"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
(nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'NOT_SPECIFIED', 'Keine Angabe', '[{"languageCode":"en","value":"Not specified"},{"languageCode":"de-DE","value":"Keine Angabe"}]');
-- Step 3: Migrate occupation type entries in person table to conform to the corrected translations and meanings
UPDATE person SET occupationtype = 'CIVIL_SERVANT' WHERE occupationtype = 'CHILD';
UPDATE person SET occupationtype = 'EMPLOYEE' WHERE occupationtype = 'HUNTER_MEAT_TRADER';
UPDATE person SET occupationtype = 'WORKER' WHERE occupationtype = 'MINER';
UPDATE person SET occupationtype = 'ACADEMIC' WHERE occupationtype = 'BUSINESSMAN_WOMAN';
UPDATE person SET occupationtype = 'SELF_EMPLOYED' WHERE occupationtype = 'BUTCHER';
UPDATE person SET occupationtype = 'ASSISTING_FAMILY_MEMBER' WHERE occupationtype = 'TRANSPORTER';
UPDATE person SET occupationtype = 'NOT_SPECIFIED' WHERE occupationtype = 'TRADITIONAL_SPIRITUAL_HEALER';
-- Step 4: Delete occupation types that were previously misused and are no longer needed
DELETE FROM customizableenumvalue WHERE datatype = 'OCCUPATION_TYPE' AND (value = 'CHILD' OR value = 'HUNTER_MEAT_TRADER' OR value = 'MINER' OR value = 'BUTCHER' OR value = 'BUSINESSMAN_WOMAN' OR value = 'TRANSPORTER' OR value = 'TRADITIONAL_SPIRITUAL_HEALER');
The server needs to be restarted.
@MartinWahnschaffe and @kwa20 one question about this: If the user setting is set to German, will all instances then have the same values displayed on UI after the script was executed?
Background of the question: On our test-instances the values are not identical (could be because the script was not executed on all instances, i am not sure about this) and if the case/contact is shared via S2S and one value is selected, that does not exist on the other instance -> this breaks the whole sync of S2S
There is a very minimal set of default occupation types (HEALTHCARE_WORKER and three others). All other occupation types are only present if they had been assigned to a person before the migration to the new customizable occupation types was done. See https://github.com/hzi-braunschweig/SORMAS-Project/issues/5015#issuecomment-1180078511
With the script that was created in this issue the list of occupation types that will definitely be on the systems is a lot bigger, but there can still be occupation types present on one system and not present on another.
This will definitely cause problems for SORMAS-2-SORMAS that we need to take care of. From my point of view it would be best to create an issue that is about discussing how to deal with customizable enum values when syncing data:
A. Reject the data B. Accept the data and automatically create an entry in the customizable enums tables - this value would then also be available for other data objects C. Make the whole customizable enum logic able to deal with values that are not defined in the database.
Besides that I'm wondering whether we should define a full set of occupation types for Germany and reset any different occupation types assigned to persons or migrate those.
@MateStrysewske