
Unicode characters not correctly printed with cow image

Open: Nevon opened this issue 9 years ago • 4 comments

If I use cowsay with an image that has been converted to a cowfile and a message containing multibyte UTF-8 characters, the characters are printed incorrectly:

➜ cowsay -f cows/Eevee.cow -- "こんにちは世界"
 _______________________
< こんにちは世界 >
 -----------------------
           \
            \
             \
              \
            ▄▄▄
▄▄▄▄ ▄▄▄▄ ▄ ▄
  ▄    ▄▄▄▄▄▄▄▄
▀▄   ▄▄▄▄      ▄
  ▀ ▄  ▄▄ ▄▄   ▄
  ▄▄    ▄▄▄▄▄▄▄▄▄
    ▄▄▄▄▄    ▄ ▄▀
   ▀▀  ▀▄▄▄▄▄▄▀
        ▀▄▄▀▀
(Screenshot of the terminal output, 2016-10-04 at 14:49:04)

If I remove the line binmode STDOUT, ":utf8"; from the cowfile, the message is output correctly, but Perl prints a warning instead:

➜ cowsay -f cows/Pikachu.cow -- "こんにちは世界"
 _______________________
< こんにちは世界 >
 -----------------------
Wide character in print at /usr/local/bin/cowsay line 71.
           \
            \
             \
              \
  ▄▄          ▄▄
▄▄ ▄▄▄▄▄▄     ▄▄▄
▀▄▄  ▄▄  ▄▄▄ ▀▄▄ ▄
  ▀▄▄  ▄▄▄▄ ▄▄▄  ▄▄▄
   ▄▄ ▄▄▀ ▄ ▄▄▄▄  ▄▄▄
   ▀▄ ▄▄▄  ▄▄▄▄ ▄▄▄▄▀
    ▄▄▄▄▄▄  ▄▄ ▄▄▄▄▄▀
     ▀▄▄  ▄▄▄▄  ▄
      ▀▄    ▄▄▄▀
        ▀▄▄▀
(Screenshot of the terminal output, 2016-10-04 at 14:48:08)

This is using, for example, this file:

binmode STDOUT, ":utf8";
$the_cow =<<EOC;
           $thoughts
            $thoughts
             $thoughts
              $thoughts
\e[49m           \e[48;5;236m \e[38;5;179m\N{U+2584}\N{U+2584}\e[49m\e[38;5;236m\N{U+2584}
\N{U+2584}\e[48;5;236m\e[38;5;230m\N{U+2584}\e[49m\e[38;5;236m\N{U+2584}\N{U+2584}\e[48;5;236m \e[38;5;185m\N{U+2584}\N{U+2584}\e[49m\e[38;5;236m\N{U+2584}\N{U+2584} \N{U+2584}\e[48;5;236m \e[48;5;179m\e[38;5;137m\N{U+2584}\e[48;5;240m \e[48;5;179m \e[48;5;236m \e[49m
\e[48;5;236m \e[48;5;230m \e[38;5;185m\N{U+2584}\e[48;5;137m \e[48;5;239m \e[48;5;179m \e[48;5;240m \e[48;5;185m\e[38;5;240m\N{U+2584}\e[48;5;137m\e[38;5;179m\N{U+2584}\e[48;5;236m\e[38;5;240m\N{U+2584}\e[38;5;185m\N{U+2584}\e[48;5;185m\e[38;5;179m\N{U+2584}\e[48;5;239m\e[38;5;185m\N{U+2584}\e[38;5;179m\N{U+2584}\e[48;5;240m\e[38;5;239m\N{U+2584}\e[48;5;236m \e[49m
\e[38;5;236m\N{U+2580}\e[48;5;185m\N{U+2584}\e[48;5;137m   \e[48;5;239m\e[38;5;240m\N{U+2584}\e[48;5;179m\e[38;5;239m\N{U+2584}\e[48;5;240m\e[38;5;179m\N{U+2584}\e[38;5;137m\N{U+2584}\e[48;5;179m      \e[48;5;240m\e[38;5;231m\N{U+2584}\e[48;5;236m \e[49m
  \e[38;5;236m\N{U+2580}\e[48;5;236m \e[48;5;240m\e[38;5;137m\N{U+2584}\e[48;5;137m \e[48;5;240m \e[48;5;239m\e[38;5;185m\N{U+2584}\e[48;5;179m\e[38;5;240m\N{U+2584} \e[48;5;240m\e[38;5;236m\N{U+2584}\e[48;5;137m\e[38;5;231m\N{U+2584}\e[48;5;179m   \e[48;5;236m\e[38;5;179m\N{U+2584} \e[49m
  \e[38;5;236m\N{U+2584}\e[48;5;236m\e[38;5;137m\N{U+2584}\e[48;5;137m \e[48;5;240m \e[48;5;185m  \e[48;5;240m\e[38;5;230m\N{U+2584}\e[48;5;137m\e[38;5;240m\N{U+2584}\e[48;5;236m\e[38;5;137m\N{U+2584}\e[48;5;240m\N{U+2584}\e[48;5;179m\N{U+2584}\e[38;5;240m\N{U+2584}\e[48;5;173m\e[38;5;95m\N{U+2584}\e[48;5;236m\e[38;5;230m\N{U+2584}\e[49m\e[38;5;236m\N{U+2584}
  \e[48;5;236m \e[48;5;137m \e[38;5;240m\N{U+2584}\e[48;5;240m\e[38;5;236m\N{U+2584}\e[48;5;239m\N{U+2584}\e[48;5;185m\e[38;5;240m\N{U+2584}\e[48;5;230m\e[38;5;185m\N{U+2584} \e[48;5;185m \e[48;5;230m  \N{U+2584} \e[38;5;236m\N{U+2584}\e[49m\N{U+2580}
   \N{U+2580}\N{U+2580}  \N{U+2580}\e[48;5;137m\N{U+2584}\e[48;5;95m\e[38;5;137m\N{U+2584}\e[48;5;230m\e[38;5;239m\N{U+2584}\e[48;5;185m\N{U+2584}\e[38;5;137m\N{U+2584}\e[48;5;239m\e[38;5;236m\N{U+2584}\e[49m\N{U+2580}
        \N{U+2580}\e[48;5;137m\N{U+2584}\e[48;5;179m\N{U+2584}\e[49m\N{U+2580}\N{U+2580}\e[39m

EOC

Nevon, Oct 04 '16 12:10

Why do you use -- when invoking pokemonsay?

possatti, Oct 06 '16 11:10

Usage: cowsay [-OPTIONS [-MORE_OPTIONS]] [--] [PROGRAM_ARG1 ...]

It separates the options from the arguments, so that you can do, for example:

➜  cowsay -- -f
 ____
< -f >
 ----
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
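
The same convention can be tried out with the shell's own `getopts` builtin. This is a minimal sketch: `parse` and its single `-f` option are hypothetical names chosen for illustration, not cowsay's real option handling.

```shell
# Hypothetical parser: accepts -f <cowfile> and treats everything after
# "--" as the message, even if it looks like an option.
parse() {
  OPTIND=1                 # reset so the function can be called repeatedly
  cowfile=default
  while getopts "f:" opt; do
    case "$opt" in
      f) cowfile=$OPTARG ;;
      *) return 1 ;;
    esac
  done
  shift $((OPTIND - 1))    # drop the parsed options, including "--"
  printf 'cowfile=%s message=%s\n' "$cowfile" "$*"
}

parse -- -f              # prints: cowfile=default message=-f
parse -f tux -- hello    # prints: cowfile=tux message=hello
```

Without the `--`, the first call would fail because `-f` would be parsed as an option that is missing its argument.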

Nevon, Oct 06 '16 12:10

Hmm, interesting. I am using version 3.03, but neither the man page nor the --help output mentions the --:

Usage: cowsay [-bdgpstwy] [-h] [-e eyes] [-f cowfile] [-l] [-n] [-T tongue] [-W wrapcolumn] [message]

Thanks for letting me know. o/

possatti, Oct 16 '16 03:10

I'm not quite sure how to fix this. Getting cowsay to do proper Unicode output was a pain, but I thought that binmode line fixed it. (I am not a Perl expert.) What's the output of locale on your machine?

@possatti Yeah, that -- thing is a common convention in programs that do GNU-style option parsing. It looks like cowsay gets this behaviour from Perl's Getopt::Std.

rossy, Nov 03 '16 13:11