😲 Emoji characters in keys & values in v4 are lost/corrupted

Open elasticdotventures opened this issue 4 years ago • 9 comments

$ yq -V
yq version 4.7.0

yq v4 handles emoji inconsistently and unpredictably. (This did not occur with the earlier yq 2.x versions, which had other limitations.)

yq should (imho) pass UTF-8/emoji through unmodified. yq handles the Chinese (Mandarin) characters correctly, but emoji are so expressive and universal that it would be nice to be able to use them too.

For example, take an emojifile.yaml with these contents:

---
"bash.🔨/init.10级.🥾.b00t.sh": ""
"bash.🔨/init.20级.🐧.linux.sh": ""
"bash.🔨/init.22级.🐙.git.sh": ""
"bash.🔨/init.30级.🐳.层.docker.sh": ""
"bash.🔨/init.32级.💠.层.hashicorp.sh": ""
"bash.🔨/init.40级.🐍.语.python.sh": ""
"bash.🔨/init.40级.🚀.语.node.sh": ""
"bash.🔨/init.42级.🦄.语.typescript.sh": ""
"bash.🔨/init.43级.🥷.语.vue.sh": ""
"bash.🔨/init.44级.☕.语.java.sh": ""
"bash.🔨/init.44级.🏇.语.go.sh": ""
"bash.🔨/init.50级.👾.云☁️.gcp.sh": ""
"bash.🔨/init.50级.🤖.云☁️.azure.sh": ""
"bash.🔨/init.50级.🦉.云☁️.aws.sh": ""
"bash.🔨/init.60级.🎙️💙.应用.vscode.sh": ""
"bash.🔨/init.70级.☎️.msg.sh": ""
"bash.🔨/init.70级.🎬.video.sh": ""
"bash.🔨/init.70级.📱.mobile.sh": ""
"bash.🔨/init.70级.🕹️.gamesim.sh": ""
"bash.🔨/init.70级.🤑.ecommerce.sh": ""
"bash.🔨/init.70级.🥯.crypto.sh": ""
"bash.🔨/init.70级.🧠.ai.sh": ""
"bash.🔨/init.80级.🐱‍💻.esp32.sh": ""

then

$ cat emojifile.yaml | yq eval

produces (on my Ubuntu system):

"bash.\/init.10级.\.b00t.sh": ""
"bash.\/init.20级.\.linux.sh": ""
"bash.\/init.22级.\.git.sh": ""
"bash.\/init.30级.\.层.docker.sh": ""
"bash.\/init.32级.\.层.hashicorp.sh": ""
"bash.\/init.40级.\.语.python.sh": ""
"bash.\/init.40级.\.语.node.sh": ""
"bash.\/init.42级.\.语.typescript.sh": ""
"bash.\/init.43级.\.语.vue.sh": ""
"bash.\/init.44级.☕.语.java.sh": ""
"bash.\/init.44级.\.语.go.sh": ""
"bash.\/init.50级.\.云☁️.gcp.sh": ""
"bash.\/init.50级.\.云☁️.azure.sh": ""
"bash.\/init.50级.\.云☁️.aws.sh": ""
"bash.\/init.60级.\️\.应用.vscode.sh": ""
"bash.\/init.70级.☎️.msg.sh": ""
"bash.\/init.70级.\.video.sh": ""
"bash.\/init.70级.\.mobile.sh": ""
"bash.\/init.70级.\️.gamesim.sh": ""
"bash.\/init.70级.\.ecommerce.sh": ""
"bash.\/init.70级.\.crypto.sh": ""
"bash.\/init.70级.\.ai.sh": ""
"bash.\/init.80级.\‍\.esp32.sh": ""

This is for b00t framework.

elasticdotventures avatar May 13 '21 01:05 elasticdotventures

$ cat emojifile.yaml | yq eval -M

"bash.\U0001F528/init.10级.\U0001F97E.b00t.sh": ""
"bash.\U0001F528/init.20级.\U0001F427.linux.sh": ""
"bash.\U0001F528/init.22级.\U0001F419.git.sh": ""
"bash.\U0001F528/init.30级.\U0001F433.层.docker.sh": ""
"bash.\U0001F528/init.32级.\U0001F4A0.层.hashicorp.sh": ""
"bash.\U0001F528/init.40级.\U0001F40D.语.python.sh": ""
"bash.\U0001F528/init.40级.\U0001F680.语.node.sh": ""
"bash.\U0001F528/init.42级.\U0001F984.语.typescript.sh": ""
"bash.\U0001F528/init.43级.\U0001F977.语.vue.sh": ""
"bash.\U0001F528/init.44级.☕.语.java.sh": ""
"bash.\U0001F528/init.44级.\U0001F3C7.语.go.sh": ""
"bash.\U0001F528/init.50级.\U0001F47E.云☁️.gcp.sh": ""
"bash.\U0001F528/init.50级.\U0001F916.云☁️.azure.sh": ""
"bash.\U0001F528/init.50级.\U0001F989.云☁️.aws.sh": ""
"bash.\U0001F528/init.60级.\U0001F399️\U0001F499.应用.vscode.sh": ""
"bash.\U0001F528/init.70级.☎️.msg.sh": ""
"bash.\U0001F528/init.70级.\U0001F3AC.video.sh": ""
"bash.\U0001F528/init.70级.\U0001F4F1.mobile.sh": ""
"bash.\U0001F528/init.70级.\U0001F579️.gamesim.sh": ""
"bash.\U0001F528/init.70级.\U0001F911.ecommerce.sh": ""
"bash.\U0001F528/init.70级.\U0001F96F.crypto.sh": ""
"bash.\U0001F528/init.70级.\U0001F9E0.ai.sh": ""
"bash.\U0001F528/init.80级.\U0001F431‍\U0001F4BB.esp32.sh": ""

elasticdotventures avatar May 13 '21 02:05 elasticdotventures
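
One pattern worth noting in the -M output above: every character that comes back escaped is written as \U0001xxxx, i.e. a code point above U+FFFF, while ☕, ☎, ☁ and the Chinese characters pass through. This is only an observation from the output, not a confirmed diagnosis. Code points above U+FFFF take 4 bytes in UTF-8 and the others take 3, so a rough way to check which group a character falls into is to count its bytes. A minimal sketch, assuming the script is saved as UTF-8 (the helper name utf8_len is made up for illustration):

```shell
# Count the UTF-8 bytes of a string (assumes this script is saved as UTF-8).
# utf8_len is a hypothetical helper, not part of yq.
utf8_len() {
  # wc -c counts raw bytes; $(( )) strips any padding wc adds.
  echo $(( $(printf '%s' "$1" | LC_ALL=C wc -c) ))
}

hammer_len=$(utf8_len '🔨')   # U+1F528, escaped in the output above
level_len=$(utf8_len '级')    # passed through unchanged
coffee_len=$(utf8_len '☕')   # U+2615, passed through unchanged

echo "hammer=$hammer_len level=$level_len coffee=$coffee_len"
```

On a UTF-8 system this should report 4 bytes for 🔨 and 3 bytes for 级 and ☕, which matches the split between escaped and untouched characters in the output.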

However, JSON output (-j) appears to work:

$ cat emojifile.yaml | yq eval -j
{
  "bash.🔨/init.10级.🥾.b00t.sh": "",
  "bash.🔨/init.20级.🐧.linux.sh": "",
  "bash.🔨/init.22级.🐙.git.sh": "",
  "bash.🔨/init.30级.🐳.层.docker.sh": "",
  "bash.🔨/init.32级.💠.层.hashicorp.sh": "",
  "bash.🔨/init.40级.🐍.语.python.sh": "",
  "bash.🔨/init.40级.🚀.语.node.sh": "",
  "bash.🔨/init.42级.🦄.语.typescript.sh": "",
  "bash.🔨/init.43级.🥷.语.vue.sh": "",
  "bash.🔨/init.44级.☕.语.java.sh": "",
  "bash.🔨/init.44级.🏇.语.go.sh": "",
  "bash.🔨/init.50级.👾.云☁️.gcp.sh": "",
  "bash.🔨/init.50级.🤖.云☁️.azure.sh": "",
  "bash.🔨/init.50级.🦉.云☁️.aws.sh": "",
  "bash.🔨/init.60级.🎙️💙.应用.vscode.sh": "",
  "bash.🔨/init.70级.☎️.msg.sh": "",
  "bash.🔨/init.70级.🎬.video.sh": "",
  "bash.🔨/init.70级.📱.mobile.sh": "",
  "bash.🔨/init.70级.🕹️.gamesim.sh": "",
  "bash.🔨/init.70级.🤑.ecommerce.sh": "",
  "bash.🔨/init.70级.🥯.crypto.sh": "",
  "bash.🔨/init.70级.🧠.ai.sh": "",
  "bash.🔨/init.80级.🐱‍💻.esp32.sh": ""
}

elasticdotventures avatar May 13 '21 02:05 elasticdotventures

Just confirmed the same behavior on yq 4.8.0.

elasticdotventures avatar May 13 '21 02:05 elasticdotventures

Just confirmed that the "other" yq project works properly with Emoji. https://github.com/kislyuk/yq

When I said "earlier" versions worked, that was incorrect.
I didn't realize I'd switched repos.

elasticdotventures avatar May 13 '21 02:05 elasticdotventures

I dug into this a little, and as far as I can tell it's an issue with go-yaml, the underlying YAML parser :(

https://github.com/go-yaml/yaml/issues/279

Not sure if I'll be able to work around it

mikefarah avatar May 13 '21 10:05 mikefarah

Raised a new issue here: https://github.com/go-yaml/yaml/issues/737

mikefarah avatar May 14 '21 00:05 mikefarah

Note that '-j' works because the issue is in the YAML encoder; the JSON encoder is unaffected.

mikefarah avatar May 14 '21 00:05 mikefarah

If you are working in a shell, you could use this command to strip all non-ASCII characters:

tr -cd '\11\12\15\40-\176' < 1.yml  > new.yml

zhangguanzhang avatar Aug 17 '22 16:08 zhangguanzhang
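
For anyone trying the tr command above: it keeps only tab, newline, carriage return and printable ASCII (octal 40 through 176) and deletes every other byte. That means it removes the emoji and the Chinese characters too, rather than preserving them. A small sketch on one line taken from the example file (the variable names are just for illustration; LC_ALL=C is added here as an assumption so that tr works on raw bytes regardless of locale):

```shell
# One key from emojifile.yaml, used as sample input.
sample='"bash.🔨/init.10级.🥾.b00t.sh": ""'

# Keep tab (\11), LF (\12), CR (\15) and printable ASCII (\40-\176); delete the rest.
filtered=$(printf '%s\n' "$sample" | LC_ALL=C tr -cd '\11\12\15\40-\176')

printf '%s\n' "$filtered"
# prints: "bash./init.10..b00t.sh": ""
```

So this is a lossy workaround: the keys no longer match the original file names, and two keys that differ only in their emoji would collide.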