
Arches V7 Internationalization - UTF-8 import/export issues

Open YuvalShafriri opened this issue 3 years ago • 0 comments

Arches Version/Branch: dev/7.1.x, Stable/7.1.1

Describe the bug and how to reproduce it:

There are several issues with importing and exporting Hebrew/Arabic (UTF-8) text:

  1. When exporting resource-model.json, the downloaded file is always named "export", without the .json file extension, and the Hebrew/Arabic characters inside the file are written as `\uXXXX` escape sequences. For example, see "name" in this excerpt:

          "instructions": {
              "he": ""
          },
          "is_editable": false,
          "name": {
              "he": "\u05de\u05d9\u05e7\u05d5\u05dd"
          },
          "nodegroup_id": "8669f4e0-43cb-11ed-aa20-21159131b3d3",
    
  2. The same happens when exporting the RDM thesauri as XML: the Hebrew/Arabic characters in the exported XML are displayed encoded.

  3. CLI import/export of business data: I have not rechecked this now, but importing UTF-8 encoded business data will probably raise an error and import nothing. I also don't know how this behaves with the new ETL feature.
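To illustrate item 1, here is a minimal sketch using only Python's standard `json` module. It is not Arches code, and the variable names are made up for the example. It shows that `json.dumps` escapes non-ASCII characters by default (`ensure_ascii=True`) and that passing `ensure_ascii=False` keeps the Hebrew text readable. Both forms are valid JSON and decode to the same value, so the escaping mainly hurts readability and hand editing.

```python
import json

# The escaped value copied from the exported resource model excerpt above.
escaped = '{"name": {"he": "\\u05de\\u05d9\\u05e7\\u05d5\\u05dd"}}'

# Parsing turns the \uXXXX sequences back into real Hebrew characters.
data = json.loads(escaped)
hebrew_name = data["name"]["he"]

# Default behaviour: ensure_ascii=True, so non-ASCII characters are escaped.
default_dump = json.dumps(data)

# With ensure_ascii=False the characters are written as-is.
readable_dump = json.dumps(data, ensure_ascii=False)

# Both dumps describe the same data; only the text representation differs.
same_content = json.loads(default_dump) == json.loads(readable_dump)
```

Whether the Arches exporter builds its output with `json.dumps` in this way is an assumption here; the sketch only demonstrates the standard-library behaviour that matches the symptom.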

All of the import/export issues here stem from how UTF-8 encoding is handled in the Python 3 code. I checked this in the past on Arches 5 and 6. Small adaptations to a few Python files in Arches could fix it without affecting existing functionality.
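As a rough idea of what such adaptations could look like, the sketch below uses only the standard library. The helper names (`write_json_utf8`, `read_json_utf8`, `label_to_xml`) and the `prefLabel` element are hypothetical and are not taken from the Arches code base, which may use different libraries for its exports. The idea is to open files with an explicit `encoding="utf-8"`, disable ASCII escaping for JSON, and ask the XML serializer for Unicode output instead of its ASCII default.

```python
import json
import xml.etree.ElementTree as ET


def write_json_utf8(data, path):
    """Write JSON with readable non-ASCII text, encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=4)


def read_json_utf8(path):
    """Read a JSON file as UTF-8, regardless of the platform default."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def label_to_xml(text, readable=True):
    """Serialize a single (hypothetical) label element to an XML string.

    With readable=False the ElementTree default encoding (us-ascii) is used,
    which replaces non-ASCII characters with numeric character references.
    """
    elem = ET.Element("prefLabel")
    elem.text = text
    if readable:
        return ET.tostring(elem, encoding="unicode")
    return ET.tostring(elem).decode("ascii")
```

Relying on an explicit encoding matters mostly on systems whose default encoding is not UTF-8, where `open()` without an `encoding` argument can fail on Hebrew or Arabic input.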

YuvalShafriri, Oct 12 '22 10:10