Korean is not included in metadata

Open rlatjgml38 opened this issue 2 years ago • 2 comments

Describe the bug

When I try to enter Korean text in a description, an error occurs. I think I need UTF-8 encoding to enter Korean. What should I do?

To Reproduce

  1. Create the JSON file (test3.json):
[
{
    "auditHeader": null,
    "entityType": "dataset",
    "entityUrn": "urn:li:dataset:(urn:li:dataPlatform:oracle,한글테스트2,PROD)",
    "entityKeyAspect": null,
    "changeType": "UPSERT",
    "aspectName": "datasetProperties",
    "aspect": {
        "value": "{\"description\": \"한글 설명\"}",
        "contentType": "application/json"
    },
    "systemMetadata": null
}
]
  2. Create the YAML file (file_to_datahub.yml):
source:
        type: "file"
        config:
                filename: "./test3.json"
sink:
        type: "datahub-rest"
        config:
                server: "http://localhost:8080"
reporting:
        - type: "datahub"
          config:
                  datahub_api:
                          server: "http://localhost:8080"
  3. Run datahub ingest -c file_to_datahub.yml
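One way to rule out the file's own encoding as the cause is to generate test3.json from Python with an explicit UTF-8 encoding, or with every non-ASCII character escaped as `\uXXXX` so the file is pure ASCII. The sketch below uses only the standard library. The helper names (`build_record`, `write_records`, `read_description`) are hypothetical, and it is an assumption, not something confirmed in this thread, that the error comes from the file encoding rather than from the server or UI.

```python
import json
from pathlib import Path


def build_record(urn: str, description: str) -> dict:
    """Build one record shaped like the entry in test3.json (hypothetical helper)."""
    return {
        "auditHeader": None,
        "entityType": "dataset",
        "entityUrn": urn,
        "entityKeyAspect": None,
        "changeType": "UPSERT",
        "aspectName": "datasetProperties",
        "aspect": {
            # The aspect value is itself a JSON document stored as a string.
            "value": json.dumps({"description": description}, ensure_ascii=False),
            "contentType": "application/json",
        },
        "systemMetadata": None,
    }


def write_records(path: str, records: list, ascii_only: bool = False) -> None:
    """Write the records as UTF-8 JSON.

    With ascii_only=True, non-ASCII characters are written as \\uXXXX escapes,
    so the file decodes identically under any ASCII-compatible encoding.
    """
    text = json.dumps(records, indent=4, ensure_ascii=ascii_only)
    Path(path).write_text(text, encoding="utf-8")


def read_description(path: str) -> str:
    """Read the file back and return the description of the first record."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return json.loads(records[0]["aspect"]["value"])["description"]
```

Calling `write_records("test3.json", [build_record(urn, "한글 설명")], ascii_only=True)` and then `read_description("test3.json")` should return the Korean text unchanged. If ingestion still fails with the ASCII-only file, the file encoding is probably not the cause.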

Screenshots

[three images not included]

Desktop (please complete the following information):

  • OS: Ubuntu 18.04
  • Datahub Version: 0.8.32

Additional context

I succeeded in putting Korean in the URN when there was no Korean in the description. [image not included] However, Korean is not accepted in other metadata (e.g. description, column name, subtype's typeName, etc.). I also tried installing v0.8.36 and v0.8.39, but I couldn't use them: a red error window appeared saying "Oops, an error occurred".

rlatjgml38, Jul 04 '22 01:07

#5306

rlatjgml38, Jul 05 '22 00:07

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions[bot], Aug 29 '22 02:08

Hi, is this still an issue? From our side it looks like Python ingestion does support non-ASCII (UTF-8 encoded) characters.

aditya-radhakrishnan, Nov 15 '22 19:11