can't rename file in Cyrillic

Open paulcarroty opened this issue 1 year ago • 8 comments

Tested using cw and cW, got something weird: 123

Version: 0.13

paulcarroty avatar May 14 '24 09:05 paulcarroty

OS? Original name is in Cyrillic and looks correct? What's the output of locale in a shell?

xaizek avatar May 14 '24 09:05 xaizek

name is in Cyrillic and looks correct?

Original file can be in Latin/Cyr, the same weird characters appear after typing in Cyr.

$ uname -o
GNU/Linux
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

paulcarroty avatar May 14 '24 12:05 paulcarroty

locale looks fine. What's the filesystem and is it mounted with some locale/encoding-related options? You could also try checking whether the behaviour differs in something like /tmp.

I also assume that you're starting vifm from a shell with that locale output, otherwise it could be an issue of environment setup by, say, your DE when spawning a terminal (e.g., if locale is set in shell's config instead of on login or system-wide).
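One way to rule out an environment mismatch is to ask the C library directly from a small program started in the same terminal. The sketch below is only an illustration: `codeset_of` is a hypothetical helper name, and it relies on the POSIX `setlocale()` and `nl_langinfo(CODESET)` calls.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: switches the process to `locale_name` ("" means
 * "take it from the environment") and returns the character encoding the
 * C library will then use for multibyte conversions, or NULL if the
 * requested locale is not available. */
const char *codeset_of(const char *locale_name)
{
    if (setlocale(LC_ALL, locale_name) == NULL) {
        return NULL;
    }
    return nl_langinfo(CODESET);
}
```

Printing `codeset_of("")` from a program launched the same way as Vifm should give `UTF-8` on glibc when the environment really is en_US.UTF-8; a NULL result or a different codeset would point at the environment rather than at Vifm.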

xaizek avatar May 14 '24 13:05 xaizek

/dev/sda2 on / type ext4 (rw,relatime,stripe=256)
/dev/sda4 on /home type ext4 (rw,nodev,relatime,discard,stripe=256)

you're starting vifm from a shell with that locale output

Correct.

You could also try checking whether the behaviour differs in something like /tmp

Same issue. Also tested on Xterm and Ubuntu 24.04 LTS.

paulcarroty avatar May 14 '24 13:05 paulcarroty

I'm running out of things that could affect the outcome. There is a conversion from wide to narrow encoding because input is processed as Unicode, but is then converted to local encoding, which is why locale could make a difference. One thing you could try is renaming a file in Vifm to абв (chosen because its code points are consecutive) and posting how ls prints that file after the rename. This might show how the character values get transformed, because ls now uses escapes for weird byte sequences.
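As a rough illustration of that wide-to-narrow step (this is not Vifm's actual code, and `to_locale_bytes` is a made-up name), the standard `wcstombs()` call converts a wide string into whatever encoding the current locale uses, so the same input can succeed or fail depending on the locale:

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not taken from Vifm: converts a wide string to the
 * multibyte encoding of the current locale.  Returns the number of bytes
 * written (excluding the terminator) or (size_t)-1 if some character has
 * no representation in that encoding. */
size_t to_locale_bytes(const wchar_t *wide, char *out, size_t out_size)
{
    if (out_size == 0) {
        return (size_t)-1;
    }

    /* Leave room for the terminator, which wcstombs() omits when the
     * output fills the buffer exactly. */
    size_t n = wcstombs(out, wide, out_size - 1);
    if (n == (size_t)-1) {
        out[0] = '\0';
        return n;
    }

    out[n] = '\0';
    return n;
}
```

A program that never calls `setlocale()` runs in the "C" locale, where Cyrillic cannot be represented and the conversion fails. After `setlocale(LC_ALL, "")` under en_US.UTF-8, абв should instead come out as the six bytes d0 b0 d0 b1 d0 b2.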

I'll also need to check in a VM if I get the same on 24.04 LTS. What's the other distribution where you see the issue?

xaizek avatar May 14 '24 15:05 xaizek

абв in terminal appeared as : abc

printf '%s\n'  | od -t x1 -a
0000000  ee  80  80  ee  80  80  ee  80  80  0a
          n nul nul   n nul nul   n nul nul  nl
0000012
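For reference, the dump above can be decoded by hand. The `-a` row is misleading because od ignores the high bit there (0xee is shown as n, 0x80 as nul); the hex row is what matters. A minimal decoder sketch (`utf8_first_code_point` is a hypothetical helper that skips overlong and surrogate checks for brevity) shows which code point ee 80 80 stands for:

```c
#include <stddef.h>

/* Illustrative helper: decodes the first UTF-8 sequence in `s` (at most
 * `len` bytes) and returns its code point, or -1 for truncated or
 * malformed input.  Overlong forms and surrogates are not rejected. */
long utf8_first_code_point(const unsigned char *s, size_t len)
{
    size_t need;
    long value;

    if (len == 0) {
        return -1;
    }

    /* The lead byte tells how many bytes the sequence has. */
    if (s[0] < 0x80) {
        need = 1; value = s[0];
    } else if ((s[0] & 0xE0) == 0xC0) {
        need = 2; value = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        need = 3; value = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        need = 4; value = s[0] & 0x07;
    } else {
        return -1;
    }

    if (need > len) {
        return -1;
    }

    /* Each continuation byte contributes its low six bits. */
    for (size_t i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return -1;
        }
        value = (value << 6) | (s[i] & 0x3F);
    }
    return value;
}
```

Fed the bytes ee 80 80, this returns 0xE000, the first code point of the Unicode Private Use Area (U+E000 to U+F8FF), so all three typed letters were stored as the same non-character value. A correctly stored а would be d0 b0, which decodes to 0x430.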

paulcarroty avatar May 14 '24 18:05 paulcarroty

Thanks, that is not what I expected.

xaizek avatar May 14 '24 21:05 xaizek

So far I wasn't able to reproduce this inside docker or a VM with Ubuntu 24.04 LTS.

xaizek avatar May 19 '24 11:05 xaizek

OK, tested on Linux Mint and it works. Does Vifm need any extra build dependency for non-EN language support? This is my distro .spec file.

paulcarroty avatar May 29 '24 18:05 paulcarroty

No, if locales work, the rest should work. I just tried a Clear Linux Docker container and do see the issue there. Will need to do some tests to see what might be the reason behind it.

xaizek avatar May 29 '24 21:05 xaizek

wget_wch() from ncurses doesn't return non-Latin input on Clear Linux. The input result looks like the Ctrl+Space combination, which is internally represented as 0xe000, the character code you get. Don't know why this happens on Clear Linux though, https://github.com/clearlinux-pkgs/ncurses/blob/main/ncurses.spec looks fine. I made a test program:

Test program

// 1. Save as test.c
// 2. Compile:
//      gcc -Wall -g -o test test.c -lcursesw
// 3. Run:
//      ./test
// 4. Type in something
// 5. Quit by pressing Escape key or Ctrl+C

#define _XOPEN_SOURCE_EXTENDED

#include <curses.h>

#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

static WINDOW *win;

static void init(void)
{
    /* MB_CUR_MAX has type size_t, so print it with %zu. */
    printf("MB_CUR_MAX: %zu\n", (size_t)MB_CUR_MAX);
    setlocale(LC_ALL, "");
    printf("MB_CUR_MAX: %zu\n", (size_t)MB_CUR_MAX);

    initscr();
    win = subwin(stdscr, 1, getmaxx(stdscr), 0, 0);
    refresh();
}

static void setup(void)
{
    noecho();
    wtimeout(win, 1000);
}

static void loop(void)
{
    wint_t c = 1;
    int result;

    result = wget_wch(win, &c);
    while (c != 27) {
        if (result == OK) {
            if (iswprint(c)) {
                wchar_t buf[] = { c, 0 };
                waddwstr(win, buf);
            } else {
                wprintw(win, "[0x%04x]", c);
            }
            wrefresh(win);
        }

        c = 1;
        result = wget_wch(win, &c);
    }
}

static void clean(void)
{
    delwin(win);
    endwin();
}

int main(void)
{
    init();
    setup();
    loop();
    clean();
    return 0;
}

It exhibits the same behaviour as Vifm, which makes the ncurses build on Clear Linux a likely cause of the issue. An alternative is that the program has the same issue as Vifm which I just don't see. However, if I build ncurses 6.4 manually like here: https://github.com/vifm/vifm/blob/fc7c369cde1c7a2083523a75f0f30fc070e8db5c/pkgs/AppImage/genappimage.sh#L40-L49

and link that test program or Vifm to it, I see no issues, so it seems to be related to packaging of ncurses. It's possible there is something that Vifm could do in this regard, but at the moment I don't know what that could be.

ncurses-6.4-20230708.tgz isn't the same as ncurses-6.4.tar.gz, maybe that's just a buggy version (the URL for downloading doesn't even work, because that was a development version).

xaizek avatar Jun 01 '24 12:06 xaizek

Wow, thanks for digging. Seems like it's related to strictly non-English characters: tested français got franais.

paulcarroty avatar Jun 01 '24 17:06 paulcarroty