SORMAS-Project icon indicating copy to clipboard operation
SORMAS-Project copied to clipboard

Create a script to update and migrate script_occupation type values

Open SahaLinaPrueger opened this issue 2 years ago • 2 comments

Problem Description

SORMAS Version 1.75 moves the occupation type values into the customizable enum value table. Previously, original values have also been misused for other values by translating them into something totally different.

This has been a huge problem because users will have chosen values based on an entirely different meaning and have been filling the database with values that do not fit that meaning ever since.

In order to correct the false data present on German systems and also make the values consistent across health departments, a script is necessary to correct for the erroneous translations and values.

Proposed Solution

Create an SQL script that does the following:

  • [ ] Insert the following values into the customizable enum table that maybe already exist if they were used before:
Value english caption german caption
FARMER Farmer Selbstständige*r Landwirt*in bzw. Genossenschaftsbauer/bäuerin
RELIGIOUS_LEADER Religious leader Geistliche*r
HOUSEWIFE Housewife Hausfrau/-mann
PUPIL_STUDENT Pupil / student Ausbildung
WORKING_WITH_ANIMALS Working with animals Arbeiten mit Tieren
  • [ ] Insert the following new values into the customizable enum table
Value english caption german caption
CIVIL_SERVANT Civil servant Beamt*in, Richter*in, Berufssoldat*in
EMPLOYEE Employee Angestellte*r
WORKER Worker Arbeiter*in
ACADEMIC Academic Akademiker*in in freiem Beruf
ASSISTING_FAMILY_MEMBER Assisting family member Mithelfende*r Angehörige*r
NOT_SPECIFIED Not specified Keine Angabe
  • [ ] Migrate the occupation types in the person table to conform to the correct translation and meaning grafik

  • [ ] Delete the values from the customizable enum table that were previously miss used and thusly never selected according to their meaning. It should only be necessary to remove them from the table since all ofcustomizableenumvalue the entries were replaced by step 3 grafik

  • [ ] Provide the script as part of the release notes

  • [ ] Add a comment into enum.properties right above the occupationtype entries that says: "Temporarily needed for migration of old systems". This does not have to go into a hotfix branch.

Updated by @MartinWahnschaffe based on HZIOrga-OccupationTypesScript.pdf

Possible Alternatives

Additional Information

Written by @kwa20

SahaLinaPrueger avatar Sep 16 '22 08:09 SahaLinaPrueger

@SahaLinaPrueger I have extracted the todos from the document and restructured them a bit.

@markusmann-vg The script can be created and executed independently, so the hotfix does not have to wait for it.

MartinWahnschaffe avatar Sep 19 '22 07:09 MartinWahnschaffe

Script to be reviewed by @MartinWahnschaffe:

The following script is only relevant for systems that use German as system language:

Update: This is now outdated. Final script is further down.

``` -- Step 1: Insert values used in Germany into database if they have not been used in this particular database yet INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'FARMER', 'Selbstständige*r Landwirt*in bzw. Genossenschaftsbauer/bäuerin', '[{"languageCode":"en","value":"Farmer"},{"languageCode":"de-DE","value":"Selbstständige*r Landwirt*in bzw. Genossenschaftsbauer/bäuerin"}]' WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'FARMER'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'RELIGIOUS_LEADER', 'Geistliche*r', '[{"languageCode":"en","value":"Religious leader"},{"languageCode":"de-DE","value":"Geistliche*r"}]' WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'RELIGIOUS_LEADER'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'HOUSEWIFE', 'Hausfrau/-mann', '[{"languageCode":"en","value":"Housewife"},{"languageCode":"de-DE","value":"Hausfrau/-mann"}]' WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'HOUSEWIFE'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'PUPIL_STUDENT', 'Ausbildung', '[{"languageCode":"en","value":"Pupil / student"},{"languageCode":"de-DE","value":"Ausbildung"}]' WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'PUPIL_STUDENT'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'WORKING_WITH_ANIMALS', 'Arbeiten mit Tieren', '[{"languageCode":"en","value":"Working with animals"},{"languageCode":"de-DE","value":"Arbeiten mit Tieren"}]' WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'WORKING_WITH_ANIMALS');

-- Step 2: Insert new values used in Germany into database that have not existed or were misused before INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'CIVIL_SERVANT', 'Beamtin, Richterin, Berufssoldatin', '[{"languageCode":"en","value":"Civil servant"},{"languageCode":"de-DE","value":"Beamtin, Richterin, Berufssoldatin"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'EMPLOYEE', 'Angestellter', '[{"languageCode":"en","value":"Employee"},{"languageCode":"de-DE","value":"Angestellter"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'WORKER', 'Arbeiterin', '[{"languageCode":"en","value":"Worker"},{"languageCode":"de-DE","value":"Arbeiterin"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'ACADEMIC', 'Akademikerin in freiem Beruf', '[{"languageCode":"en","value":"Academic"},{"languageCode":"de-DE","value":"Akademikerin in freiem Beruf"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'ASSISTING_FAMILY_MEMBER', 'Mithelfender Angehöriger', '[{"languageCode":"en","value":"Assisting family member"},{"languageCode":"de-DE","value":"Mithelfender Angehöriger"}]'); INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations) VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'NOT_SPECIFIED', 'Keine Angabe', '[{"languageCode":"en","value":"Not specified"},{"languageCode":"de-DE","value":"Keine Angabe"}]');

-- Step 3: Migrate occupation type entries in person table to conform to the corrected translations and meanings UPDATE person SET occupationtype = 'CIVIL_SERVANT' WHERE occupationtype = 'CHILD'; UPDATE person SET occupationtype = 'EMPLOYEE' WHERE occupationtype = 'HUNTER_MEAT_TRADER'; UPDATE person SET occupationtype = 'WORKER' WHERE occupationtype = 'MINER'; UPDATE person SET occupationtype = 'ACADEMIC' WHERE occupationtype = 'BUSINESSMAN_WOMAN'; UPDATE person SET occupationtype = 'BUSINESSMAN_WOMAN' WHERE occupationtype = 'BUTCHER'; UPDATE person SET occupationtype = 'ASSISTING_FAMILY_MEMBER' WHERE occupationtype = 'TRANSPORTER'; UPDATE person SET occupationtype = 'NOT_SPECIFIED' WHERE occupationtype = 'TRADITIONAL_SPIRITUAL_HEALER';

-- Step 4: Delete occupation types that were previously misused and are no longer needed DELETE FROM customizableenumvalue WHERE datatype = 'OCCUPATION_TYPE' AND (value = 'CHILD' OR value = 'HUNTER_MEAT_TRADER' OR value = 'MINER' OR value = 'BUTCHER' OR value = 'BUSINESSMAN_WOMAN' OR value = 'TRANSPORTER' OR value = 'TRADITIONAL_SPIRITUAL_HEALER');

</details>
Afterwards a server restart is needed!

MateStrysewske avatar Sep 21 '22 12:09 MateStrysewske

@MateStrysewske Script looks correct and worked as expected on my local system.

MartinWahnschaffe avatar Sep 26 '22 07:09 MartinWahnschaffe

@MartinWahnschaffe Which release notes should it be part of?

MateStrysewske avatar Sep 26 '22 10:09 MateStrysewske

I think you can put it into the notes of the upcoming release 1.76 release

MartinWahnschaffe avatar Sep 27 '22 07:09 MartinWahnschaffe

@MartinWahnschaffe @MateStrysewske I think this should go into 1.75 because that's the version where the occupation types where moved to the customizable enums

kwa20 avatar Sep 27 '22 07:09 kwa20

Hm, then we should mention it in the existing release notes and also in the upcoming notes of the hotfix.

MartinWahnschaffe avatar Sep 27 '22 08:09 MartinWahnschaffe

I run a test on https://survnet-import-test.sormas.netzlink.com/ For the test I have created different test cases on version 1.74.3. The instance was raised to 1.75.4 and the script was executed. After that, for example, the case with the occupation type "child" is displayed as follows: image

customizable_enum_values.csv In the column "Caption" there are German or English entries mixed. The entries in the column "translation" are partly missing image

Drop-down menu It looks to me like the menus are unchanged. image

@MateStrysewske @MartinWahnschaffe Is there a dependency on the ticket https://github.com/hzi-braunschweig/SORMAS-Project/issues/5015 ? Do the two solutions need to be better aligned or does the test need to be adjusted?

KernB avatar Oct 12 '22 15:10 KernB

Hi @KernB. Yes there is a dependency on #5015, but since it's part of 1.75, updating to this version and then executing this script should work fine.

Where did you take those screenshots? I just had a look at https://survnet-import-test.sormas.netzlink.com/ and there the drop-down looks like this:

grafik grafik

So we definitely have a mistake in the script, because the caption should always be in the system language of the server. In this case we should have provided the script with German captions. I will take care of updating this.

Besides that the entries on my screenshot looks fine to me. What I see on the system doesn't match your customizable_enum_values.csv, though.

  1. Based on the csv WORKING_WITH_ANIMALS was added. Other entries like FARMER and RELIGIOUS_LEADER were already there. In the drop-down of my screenshot all those values seem to be new.
  2. CIVIL_SERVANT, EMPLOYEE, etc. were added
  3. CHILD, HUNTER_MEAT_EATER, etc. were removed

MartinWahnschaffe avatar Oct 13 '22 07:10 MartinWahnschaffe

I have updated the script above to contain the captions in the right language.

MartinWahnschaffe avatar Oct 13 '22 08:10 MartinWahnschaffe

When doing the migration (step 3):

UPDATE person SET occupationtype = 'BUSINESSMAN_WOMAN' WHERE occupationtype = 'BUTCHER';
UPDATE person SET occupationtype = 'ACADEMIC' WHERE occupationtype = 'BUSINESSMAN_WOMAN';

Should the order of these lines not be reversed? Otherwise, every current BUTCHER value will turn into ACADEMIC...

FredrikSchaefer avatar Oct 13 '22 10:10 FredrikSchaefer

Value German Caption
CHILD Beamtin, Richterin, Berufssoldat*in
HUNTER_MEAT_TRADER Angestellte*r
MINER Arbeiter*in
BUTCHER Selbstständige*r
BUSINESSMAN_WOMAN Akademiker*in in freiem Beruf
TRANSPORTER Mithelfender Angehöriger
TRADITIONAL_SPIRITUAL_HEALER Keine Angabe

@MartinWahnschaffe @FredrikSchaeferVitagroup @KernB I think the problem is that I missed an additional value for "Selbstständige*r". We should be fine by just adding SELF_EMPLOYED and then migrating BUTCHER to that value. In this way, BUSINESSMAN_WOMAN and BUTCHER are disentangled.

VALUES (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'SELF_EMPLOYED', 'Self-employed', '[{"languageCode":"de-DE","value":"Selbstständige*r"}]');

UPDATE person SET occupationtype = 'SELF_EMPLOYED' WHERE occupationtype = 'BUTCHER';

kwa20 avatar Oct 14 '22 15:10 kwa20

@kwa20 Thanks, that makes a lot more sense.

@KernB This is the updated script:

-- Step 1: Insert values used in Germany into database if they have not been used in this particular database yet
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'FARMER', 'Selbstständige*r Landwirt*in bzw. Genossenschaftsbauer/bäuerin', '[{"languageCode":"en","value":"Farmer"},{"languageCode":"de-DE","value":"Selbstständige*r Landwirt*in bzw. Genossenschaftsbauer/bäuerin"}]' 
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'FARMER');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'RELIGIOUS_LEADER', 'Geistliche*r', '[{"languageCode":"en","value":"Religious leader"},{"languageCode":"de-DE","value":"Geistliche*r"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'RELIGIOUS_LEADER');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'HOUSEWIFE', 'Hausfrau/-mann', '[{"languageCode":"en","value":"Housewife"},{"languageCode":"de-DE","value":"Hausfrau/-mann"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'HOUSEWIFE');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'PUPIL_STUDENT', 'Ausbildung', '[{"languageCode":"en","value":"Pupil / student"},{"languageCode":"de-DE","value":"Ausbildung"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'PUPIL_STUDENT');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
SELECT nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'WORKING_WITH_ANIMALS', 'Arbeiten mit Tieren', '[{"languageCode":"en","value":"Working with animals"},{"languageCode":"de-DE","value":"Arbeiten mit Tieren"}]'
WHERE NOT EXISTS (SELECT value FROM customizableenumvalue WHERE value = 'WORKING_WITH_ANIMALS');

-- Step 2: Insert new values used in Germany into database that have not existed or were misused before
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
  (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'CIVIL_SERVANT', 'Beamt*in, Richter*in, Berufssoldat*in', '[{"languageCode":"en","value":"Civil servant"},{"languageCode":"de-DE","value":"Beamt*in, Richter*in, Berufssoldat*in"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
  (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'EMPLOYEE', 'Angestellte*r', '[{"languageCode":"en","value":"Employee"},{"languageCode":"de-DE","value":"Angestellte*r"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
  (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'WORKER', 'Arbeiter*in', '[{"languageCode":"en","value":"Worker"},{"languageCode":"de-DE","value":"Arbeiter*in"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
  (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'SELF_EMPLOYED', 'Selbstständige*r', '[{"languageCode":"en","value":"Self-employed"},{"languageCode":"de-DE","value":"Selbstständige*r"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
  (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'ACADEMIC', 'Akademiker*in in freiem Beruf', '[{"languageCode":"en","value":"Academic"},{"languageCode":"de-DE","value":"Akademiker*in in freiem Beruf"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
  (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'ASSISTING_FAMILY_MEMBER', 'Mithelfende*r Angehörige*r', '[{"languageCode":"en","value":"Assisting family member"},{"languageCode":"de-DE","value":"Mithelfende*r Angehörige*r"}]');
INSERT INTO customizableenumvalue (id, uuid, creationdate, changedate, datatype, value, caption, translations)
VALUES
  (nextval('entity_seq'), generate_base32_uuid(), now(), now(), 'OCCUPATION_TYPE', 'NOT_SPECIFIED', 'Keine Angabe', '[{"languageCode":"en","value":"Not specified"},{"languageCode":"de-DE","value":"Keine Angabe"}]');

-- Step 3: Migrate occupation type entries in person table to conform to the corrected translations and meanings
UPDATE person SET occupationtype = 'CIVIL_SERVANT' WHERE occupationtype = 'CHILD';
UPDATE person SET occupationtype = 'EMPLOYEE' WHERE occupationtype = 'HUNTER_MEAT_TRADER';
UPDATE person SET occupationtype = 'WORKER' WHERE occupationtype = 'MINER';
UPDATE person SET occupationtype = 'ACADEMIC' WHERE occupationtype = 'BUSINESSMAN_WOMAN';
UPDATE person SET occupationtype = 'SELF_EMPLOYED' WHERE occupationtype = 'BUTCHER';
UPDATE person SET occupationtype = 'ASSISTING_FAMILY_MEMBER' WHERE occupationtype = 'TRANSPORTER';
UPDATE person SET occupationtype = 'NOT_SPECIFIED' WHERE occupationtype = 'TRADITIONAL_SPIRITUAL_HEALER';

-- Step 4: Delete occupation types that were previously misused and are no longer needed
DELETE FROM customizableenumvalue WHERE datatype = 'OCCUPATION_TYPE' AND (value = 'CHILD' OR value = 'HUNTER_MEAT_TRADER' OR value = 'MINER' OR value = 'BUTCHER' OR value = 'BUSINESSMAN_WOMAN' OR value = 'TRANSPORTER' OR value = 'TRADITIONAL_SPIRITUAL_HEALER');

The server needs to be restarted.

MartinWahnschaffe avatar Oct 17 '22 07:10 MartinWahnschaffe

@MartinWahnschaffe and @kwa20 one question about this: If the user setting is set to German, will all instances then have the same values displayed on UI after the script was executed?

Background of the question: On our test-instances the values are not identical (could be because the script was not executed on all instances, i am not sure about this) and if the case/contact is shared via S2S and one value is selected, that does not exist on the other instance -> this breaks the whole sync of S2S

SahaLinaPrueger avatar Oct 25 '22 17:10 SahaLinaPrueger

There is a very minimal set of default occupation types (HEALTHCARE_WORKER and three others). All other occupation types are only present if they had been assigned to a person before the migration to the new customizable occupation types was done. See https://github.com/hzi-braunschweig/SORMAS-Project/issues/5015#issuecomment-1180078511

With the script that was created in this issue the list of occupation types that will definitely be on the systems is a lot bigger, but there can still be occupation types present on one system and not present on another.

This will definitely cause problems for SORMAS-2-SORMAS that we need to take care of. From my point of view it would be best to create an issue that is about discussing how to deal with customizable enum values when syncing data:

A. Reject the data B. Accept the data and automatically create an entry in the customizable enums tables - this value would then also be available for other data objects C. Make the whole customizable enum logic able to deal with values that are not defined in the database.

Besides that I'm wondering whether we should define a full set of occupation types for Germany and reset any different occupation types assigned to persons or migrate those.

@MateStrysewske

MartinWahnschaffe avatar Oct 27 '22 11:10 MartinWahnschaffe