Git diff file path escaping

Open kemchenj opened this issue 1 year ago • 2 comments

First, I want to thank you all for making this fantastic tool. It works very well and saves our team a lot of time.

We recently ran into some issues while working with non-ASCII file paths. I did some investigation and think it is worth filing an issue to report it.

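Here is the thing: Git commands that output paths (e.g. ls-files, diff) quote "unusual" characters in a path by wrapping the path in double quotes and escaping those characters with backslashes, in the same way C escapes control characters. With the default core.quotePath setting, bytes of non-ASCII characters count as unusual and are written as octal escapes.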
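To make the quoting concrete, here is a rough Ruby sketch of what Git does to a path under the default core.quotePath setting. The helper name git_quote_path and the C_ESCAPES table are hypothetical and written only for illustration; they are not part of Git, Danger, or ruby-git, and the sketch only approximates Git's behaviour.

```ruby
# Hypothetical helper, for illustration only: approximates how Git quotes
# a path when core.quotePath is true (the default).
C_ESCAPES = {
  0x07 => '\a', 0x08 => '\b', 0x09 => '\t', 0x0a => '\n',
  0x0b => '\v', 0x0c => '\f', 0x0d => '\r',
  0x22 => '\"', 0x5c => '\\\\'
}.freeze

def git_quote_path(path)
  special = false
  body = path.bytes.map do |byte|
    if C_ESCAPES.key?(byte)
      # Control characters with a C shorthand, plus the double quote and backslash.
      special = true
      C_ESCAPES[byte]
    elsif byte < 0x20 || byte >= 0x7f
      # Remaining control bytes and every non-ASCII byte become a 3-digit octal escape.
      special = true
      format('\\%03o', byte)
    else
      byte.chr
    end
  end.join
  # The path is wrapped in double quotes only if something had to be escaped.
  special ? %("#{body}") : path
end

puts git_quote_path("文件.md")   # prints "\346\226\207\344\273\266.md" (including the quotes)
puts git_quote_path("README.md") # prints README.md
```

Each Chinese character is three bytes in UTF-8, which is why every character turns into three octal escapes.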

Currently, Danger handles this properly in APIs like git.added_files by using ruby-git, which unescapes the path internally.

But Danger has a separate implementation for extracting information from diff files in /lib/danger/request_sources/github/github.rb#L37, which does not handle escaped paths correctly. GitHub inline comments are affected by this.

I think we could reuse some code from ruby-git and parse the diff file into a more structured Ruby class before using it.
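As a rough idea of what that could look like, here is a minimal sketch that parses the "diff --git" header lines into a small struct with unescaped paths. All names here (DiffFile, git_unquote_path, parse_diff_headers) are hypothetical and made up for this example; this is not the ruby-git implementation or Danger's current code, and it assumes the unescaped bytes are UTF-8. Unquoted paths that contain spaces are ambiguous in this header format and are not handled.

```ruby
# Hypothetical sketch; none of these names exist in Danger or ruby-git.
DiffFile = Struct.new(:old_path, :new_path)

SIMPLE_ESCAPES = {
  'a' => "\a", 'b' => "\b", 't' => "\t", 'n' => "\n",
  'v' => "\v", 'f' => "\f", 'r' => "\r", '"' => '"', '\\' => '\\'
}.freeze

# One header token: either a double-quoted (escaped) path or a run of non-spaces.
TOKEN = /"(?:[^"\\]|\\.)*"|\S+/
HEADER = /\Adiff --git (#{TOKEN}) (#{TOKEN})\z/

# Undo Git's C-style quoting for one token. Unquoted tokens are returned as-is.
def git_unquote_path(token)
  return token unless token.length >= 2 && token.start_with?('"') && token.end_with?('"')

  # Work on raw bytes so octal escapes can be reassembled into multi-byte characters.
  inner = token[1..-2].b
  decoded = inner.gsub(/\\([0-3][0-7]{2}|[abtnvfr"\\])/) do
    esc = Regexp.last_match(1)
    esc.length == 3 ? esc.to_i(8).chr : SIMPLE_ESCAPES.fetch(esc)
  end
  # Assumption: paths are UTF-8, as in the GitHub diff shown in this thread.
  decoded.force_encoding(Encoding::UTF_8)
end

# Collect one DiffFile per "diff --git" header line in the diff text.
def parse_diff_headers(diff)
  diff.each_line.each_with_object([]) do |line, files|
    match = HEADER.match(line.chomp)
    next unless match

    old_path = git_unquote_path(match[1]).delete_prefix('a/')
    new_path = git_unquote_path(match[2]).delete_prefix('b/')
    files << DiffFile.new(old_path, new_path)
  end
end

diff = <<~'DIFF'
  diff --git "a/\346\226\207\344\273\266.md" "b/\346\226\207\344\273\266.md"
  new file mode 100644
  index 0000000..e69de29
DIFF

parse_diff_headers(diff).each { |file| puts file.new_path }
# prints 文件.md
```

With paths unescaped up front, anything that matches file names against the diff (such as inline comments) would compare against the real path instead of the quoted form.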

kemchenj avatar Jun 03 '22 18:06 kemchenj

@kemchenj

I think we could reuse some code from ruby-git and parse the diff file into a more structured Ruby class before using it.

I agree with you.

By the way, if you set the LANG environment variable, for example LANG=en_US.UTF-8 bundle exec danger (or whatever matches your file system's encoding), does the problem still reproduce?

manicmaniac avatar Nov 12 '22 09:11 manicmaniac

@manicmaniac

The "weird encoding" diff file is actually fetched from GitHub. Here is the link to an example pull request, and its diff file:

diff --git "a/\346\226\207 \344\273\266 \345\244\271/\346\226\207\344\273\2662.md" "b/\346\226\207 \344\273\266 \345\244\271/\346\226\207\344\273\2662.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\346\226\207\344\273\266.md" "b/\346\226\207\344\273\266.md"
new file mode 100644
index 0000000..e69de29

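The "\346\226\207\344\273\266.md" above is actually 文件.md encoded in the "Git way": each \NNN is the octal value of one UTF-8 byte. Likewise, "\346\226\207 \344\273\266 \345\244\271/\346\226\207\344\273\2662.md" is 文 件 夹/文件2.md.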

I have tried adding headers like Accept: application/vnd.github.v3.diff;charset=utf-8 or Accept-Charset: utf-8 to the request, but the response stays the same.

kemchenj avatar Nov 13 '22 10:11 kemchenj