
emoji file name crashes `ncdu`

Open Atemu opened this issue 1 year ago • 1 comments

Repro:

$ touch 🧡
$ ./find.sh . -maxdepth 1 | ./find2flat.py - | ./unflatten.py - | ncdu -f - -0
thread 3733331 panic: attempt to unwrap error
Unwind information for `:0x1066156` was not available, trace may be incomplete

Aborted (core dumped)

When you diff this against what `ncdu -o` exports for the same directory, ncdu's JSON contains the emoji as a raw Unicode code point, while both find2flat and unflatten convert it to \ud83e\udde1. This is indeed what makes ncdu crash: manually editing the ncdu-export-generated JSON to change the escape back to the raw code point avoids the crash.

Interestingly, other unicode codepoints such as ä (\u00e4) do work.
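
A likely explanation for the difference, sketched below: ä lies inside the Basic Multilingual Plane and fits in a single \uXXXX escape, while 🧡 (U+1F9E1) lies outside it and has to be written as a UTF-16 surrogate pair. That the surrogate pair is what ncdu trips over is an assumption based on the two cases above, not something confirmed in ncdu's code. The helper name `utf16_units` is made up for illustration.

```python
import json


def utf16_units(ch):
    """Return the UTF-16 code units that a JSON \\u escape needs for one character."""
    cp = ord(ch)
    if cp < 0x10000:
        # Inside the Basic Multilingual Plane: a single code unit is enough.
        return [cp]
    # Outside the BMP: split into a high and a low surrogate.
    cp -= 0x10000
    return [0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]


for ch in ("ä", "🧡"):
    escaped = "".join(f"\\u{unit:04x}" for unit in utf16_units(ch))
    # The hand-built escape matches what json.dumps() emits by default.
    print(ch, escaped, json.dumps(ch))
```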

This might actually be a bug in ncdu, though ncdu would never trigger it on its own, of course, since its own export puts the raw code points into the JSON.

Atemu avatar Sep 26 '24 12:09 Atemu

I figured out that json.dumps() escapes non-ASCII characters by default and that you need to turn this off by passing ensure_ascii=False. ~~I could not figure out how to apply this to unflatten.py yet.~~

Edit: I did figure it out, I just messed up in the implementation and python didn't scream at me... O.o
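
For reference, a minimal sketch of the difference, using only the standard `json` module. The `entry` dictionary is a made-up record and not the structure ncdu-export actually emits.

```python
import json

# Hypothetical record; the real export format differs.
entry = {"name": "🧡"}

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(entry)

# With ensure_ascii=False the character is written as-is.
raw = json.dumps(entry, ensure_ascii=False)

print(escaped)
print(raw)

# Both forms decode to the same data in a conforming JSON parser.
assert json.loads(escaped) == json.loads(raw)
```

When writing the raw form to a file rather than stdout, the file should be opened with an explicit UTF-8 encoding so that the output does not depend on the locale.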

Atemu avatar Sep 26 '24 12:09 Atemu

Hi, thanks for the report and sorry for the late response. I missed the notification among the flood of notifications from issues I'm subscribed to. I've improved a mail-filtering rule, so I should react faster from now on.

I closed PR #6 because I made some other changes to the code. Please check if the current code (v0.8.0) works for you.

I've also created a ticket on ncdu's Forgejo.

wodny avatar Nov 02 '24 19:11 wodny

Yup, works for me :)

Thanks!

Atemu avatar Nov 02 '24 21:11 Atemu

BTW, yorhel already committed changes to ncdu. I built it, and now ASCII-escaped input not only works but is also displayed the same way as UTF-8 input.

wodny avatar Nov 03 '24 10:11 wodny