webssh
Terminal problems ($TERM, utf-8)
On the server (connected directly through ssh using xterm as terminal):
$ locale
LANG=en_US.UTF-8
$ echo $TERM
xterm-256color
$ echo -e '\xe2\x82\xac'
€
Running webssh-1.4.5 like so:
$ wssh --port=7681
Now after connecting through the browser to the same server:
$ echo $TERM
xterm
$ echo -e '\xe2\x82\xac'
€
Why not xterm-256color and why is the encoding broken?
- About the terminal type: that's because webssh creates a pseudo tty with the hardcoded terminal type xterm for every ssh connection.
- About the encoding problem: probably your browser doesn't use UTF-8 as the decoding type. You can check the browser console to see what encoding it uses.
- Why?
- Yes, Chromium supports UTF-8. It's the most used browser in the world and the majority of websites use UTF-8.
- Because xterm-256color is less commonly supported than xterm.
- You may take a look at the https://github.com/huashengdun/webssh#browser-console section, which describes how to deal with encoding.
- Why not make it configurable? xterm.js supports xterm-256color.
- In my web browser I can see this in the log:
The deault encoding of your server is ANSI_X3.4-1968
This makes no sense since on the server the default locale is configured correctly in /etc/locale.conf:
LANG=en_US.UTF-8
which is loaded by /etc/profile.d/locale.sh.
With every other client this works correctly and after login I get:
$ locale
LANG=en_US.UTF-8
So it looks like the default encoding detection does not work or does something non-standard that is not compatible.
What kind of server are you using?
What is the output of the command locale charmap?
GNU/Linux, kernel version 5.2
$ locale charmap
UTF-8
That is weird.
webssh uses the command locale charmap to detect the default encoding of the server it connects to.
If the output of this command is UTF-8, then the log in your browser console should look like
The deault encoding of your server is UTF-8
Terminal type is configurable now. You can pass a terminal type via url.
http://localhost:8888/?term=xterm-256color
> 1. Why not make it configurable? xterm.js supports `xterm-256color`.
> 2. In my web browser I can see this in the log: `The deault encoding of your server is ANSI_X3.4-1968`
> This makes no sense since on the server the default locale is configured correctly in /etc/locale.conf:
> LANG=en_US.UTF-8
> which is loaded by /etc/profile.d/locale.sh.
> With every other client this works correctly and after login I get:
> $ locale
> LANG=en_US.UTF-8
> So it looks like the default encoding detection does not work or does something non-standard that is not compatible.
Here is my locale:
$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
I guess probably your locale is not configured correctly.
No, it's the same on my system, but above I just pasted the first line.
The full output:
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
$ locale -a
C
en_US.utf8
POSIX
$ locale -m
ANSI_X3.110-1983
ANSI_X3.4-1968
ARMSCII-8
ASMO_449
BIG5
BIG5-HKSCS
BRF
BS_4730
BS_VIEWDATA
CP10007
CP1125
<snip>
T.61-8BIT
TCVN5712-1
TIS-620
TSCII
UTF-8
VIDEOTEX-SUPPL
VISCII
WIN-SAMI-2
WINDOWS-31J
$ locale -c charmap
LC_CTYPE
UTF-8
$ locale charmap
UTF-8
If the log in your browser console is The deault encoding of your server is ANSI_X3.4-1968, then the output of the command locale charmap on your server should be ANSI_X3.4-1968.
But it is UTF-8. I have even started webssh on the same user as in all the commands above.
Can you show me the whole log in your browser console when you connect to this server?
I've added some debug messages to handler.py and it looks like the environment is not loaded correctly.
At the time locale charmap is executed env returns very few environment variables and LANG is missing.
The problem is that locale charmap is sent as a direct command through SSH, and on my system this means the executing shell is non-interactive and doesn't read /etc/profile.
Even if LANG was set, this way of detecting the encoding is wrong anyway, as it requires knowing the charset to decode the answer... the answer that you need to know for decoding in the first place.
This is why terminals let the user configure the encoding, which is the correct way to do it, with the default being UTF-8 on pretty much any modern terminal.
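The missing-environment effect described above can be reproduced locally without any SSH connection. The following is a minimal sketch, not webssh's code: `charmap_with_env` is a hypothetical helper that runs `locale charmap` with exactly the environment it is given, which roughly imitates a non-login, non-interactive shell that never read /etc/profile. It assumes a `locale` binary is on the PATH and returns None otherwise.

```python
import os
import shutil
import subprocess


def charmap_with_env(env):
    """Run `locale charmap` with exactly the given environment.

    Returns the reported charmap name, or None if the `locale` binary is
    not available or the command fails. (Hypothetical helper.)
    """
    locale_bin = shutil.which("locale")
    if locale_bin is None:
        return None
    result = subprocess.run(
        [locale_bin, "charmap"],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode != 0:
        return None
    # The charmap name itself consists of ASCII characters only.
    return result.stdout.decode("ascii", errors="replace").strip()


if __name__ == "__main__":
    path = os.environ.get("PATH", "")
    # Stripped environment: no LANG, no LC_* variables.
    print("without LANG:", charmap_with_env({"PATH": path}))
    # Same command with LANG supplied explicitly (the locale must be installed).
    print("with LANG:   ", charmap_with_env({"PATH": path, "LANG": "en_US.UTF-8"}))
```

On a system like the one in this thread, the first call would be expected to report ANSI_X3.4-1968 and the second UTF-8, but the exact output depends on the libc and on which locales are installed.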
Have you tested it on other systems?
Here is a related issue, https://github.com/huashengdun/webssh/issues/21.
Also can you run this command python -c "import sys; print(sys.stdout.encoding)" on your special server?
Tested on ubuntu 19 with latest kernel 5.3, the default encoding detection works.
There's no reason to test on other systems, as I've pointed out what's going on.
I dug a bit deeper though: on Debian, bash is patched to detect that it runs non-interactively under ssh and therefore executes bashrc (which doesn't happen in a "normally" compiled bash and doesn't necessarily set LANG anyway). In addition, Debian "injects" the system's LANG into ssh shells through PAM, regardless of whether they're (non-)interactive or (non-)login shells.
Neither is necessarily true on systems that are not Debian or Debian-related.
Also, as I've explained, the way you try to detect the encoding is wrong anyway. You get an encoded response that contains the encoding needed to decode it in the first place. Since you don't know the encoding you just fall back to UTF-8 anyway, which is behavior that will break on non-UTF-8 systems.
--
Why don't any of these problems happen with normal ssh? Because ssh and sshd, if configured that way (and are again by default on Debian), will send the client's environment variables (like LANG) to the server which accepts them. See SendEnv/AcceptEnv in ssh(d)_config.
But that again is not a given on all systems, and not always desired anyway.
In this case, the client has to set the LANG for the command itself like so: LANG=en_US.UTF-8 command.
This is also how you can properly query for available locales: LANG=C locale -a
because now you'll get an answer that is encoded in a known encoding: ASCII in this case. ANSI_X3.4-1968 to be precise.
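A minimal sketch of the "set LANG for the command itself" approach, assuming the client builds the remote command line as a string. `with_lang` is a hypothetical helper, not part of webssh or any SSH library; the resulting string would be handed to whatever function executes the remote command.

```python
import shlex


def with_lang(command, lang="C"):
    """Prefix a shell command with an explicit LANG assignment.

    Hypothetical helper: the value is quoted so it is safe to embed
    in a POSIX shell command line.
    """
    return "LANG={} {}".format(shlex.quote(lang), command)


print(with_lang("locale -a"))                      # LANG=C locale -a
print(with_lang("locale charmap", "en_US.UTF-8"))  # LANG=en_US.UTF-8 locale charmap
```

Note that if LC_ALL is already set in the remote environment it takes precedence over LANG, so this only pins the encoding when LC_ALL is unset.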
> You get an encoded response that contains the encoding needed to decode it in the first place. Since you don't know the encoding you just fall back to UTF-8 anyway, which is behavior that will break on non-UTF-8 systems.
This is because I know the output of locale charmap only contains ASCII characters.
For ASCII characters, encoding with any ASCII-compatible encoding (UTF-8, Latin-1, GBK, ...) produces the same bytes.
And decoding the resulting bytes with any of those encodings produces the same string.
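The ASCII argument can be checked directly. This sketch is an illustration only, using an arbitrary selection of ASCII-compatible codecs: it decodes the bytes that locale charmap would print with each of them, and also shows where the argument stops holding, because UTF-16 is not ASCII-compatible.

```python
def decodes_identically(raw, encodings):
    """Return True if `raw` decodes to the same text under every encoding."""
    return len({raw.decode(name) for name in encodings}) == 1


# Bytes as `locale charmap` would print them on a UTF-8 system.
raw = "UTF-8\n".encode("ascii")

# An arbitrary selection of ASCII-compatible codecs.
ascii_compatible = ["ascii", "utf-8", "latin-1", "cp1252", "gbk", "koi8-r"]

print(decodes_identically(raw, ascii_compatible))  # True

# UTF-16 is not ASCII-compatible: the same text encodes to different bytes.
print("UTF-8\n".encode("utf-16") == raw)  # False
```

So falling back to UTF-8 for this one reply is safe as long as the server's encoding is ASCII-compatible, which covers the common cases but not every possible charset.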
Also, can you tell me what kind of system (which flavour and which edition) you use?
Just tested on CentOS 7, the default encoding detection also works.
So far I have tested two Linux flavours (Debian and Redhat) and both work.
Well, I'm researching the same problem (SSH Shell Encoding) which brought me here (well, actually, Google brought me here, but anyway).
Based on the information that I grabbed from this issue, I think maybe you can try to run locale charmap within xtermjs console rather than directly on server(?). At least xtermjs console is interactive so you should be able to get the correct result there.
But as @xnoreq has suggested, that's NOT how it should be done. Maybe you need to provide a method to allow user to configure the encoding by themselves. I know I will be doing that after reading all comments here, so yeah, I recommend it :)
Also ....
Here is a related issue, #21.
I don't think these two are related. Issue #21 is caused by an unsupported encoding label.
The TextDecoder only supports encoding from this list, and en_IN is not on the list.
You cannot simply feed the output from a SSH command directly to a JavaScript function and expect everything will work just right. Maybe do a mapping?
Hope it helps :)
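The "validate or map the label before using it" idea can be sketched as follows. This is Python rather than the browser-side JavaScript discussed here, so `codecs.lookup` stands in for the TextDecoder label check, and `resolve_encoding` is a hypothetical helper, not webssh code. An unknown label such as en_IN falls back to a default instead of being passed on blindly.

```python
import codecs


def resolve_encoding(label, default="utf-8"):
    """Return the canonical codec name for `label`, or `default` if unknown.

    Hypothetical helper: codecs.lookup raises LookupError for labels that
    are not registered codecs or aliases.
    """
    try:
        return codecs.lookup(label.strip()).name
    except LookupError:
        return default


print(resolve_encoding("UTF-8"))           # utf-8
print(resolve_encoding("ANSI_X3.4-1968"))  # ascii
print(resolve_encoding("en_IN"))           # utf-8 (fallback: not a codec name)
```

The set of labels Python accepts differs from the set TextDecoder accepts, so a real mapping for the browser would need its own table.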
Maybe you need to provide a method to allow user to configure the encoding by themselves.
Already provided, you can configure an encoding in your url.
http://localhost:8888/#encoding=gbk
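For reference, the two URL forms mentioned in this thread carry their options in different parts of the URL: term is in the query string and encoding is in the fragment. The sketch below only illustrates how such URLs decompose; it is not how webssh itself reads them, and `url_options` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def url_options(url):
    """Collect key=value options from both the query string and the fragment."""
    parts = urlsplit(url)
    options = {}
    for section in (parts.query, parts.fragment):
        for key, values in parse_qs(section).items():
            options[key] = values[-1]  # the last occurrence wins
    return options


print(url_options("http://localhost:8888/?term=xterm-256color"))
# {'term': 'xterm-256color'}
print(url_options("http://localhost:8888/#encoding=gbk"))
# {'encoding': 'gbk'}
```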
Well, I'm researching the same problem (SSH Shell Encoding) which brought me here
I just searched Google for "SSH Shell Encoding" and I don't see any related results.
Can you show me some links which are related to this issue?
Also, can you tell me what kind of server (flavour and edition) you hit the same problem on?
Oh, the keyword was 'ssh encoding "locale charmap"'.
I was trying to figure out whether or not it's a good idea to send locale programmatically to the server in order to detect its encoding, and found out it isn't. Just here to share my findings, sorry if I bothered you.
OK, so which discussion tells you that running the command locale charmap is not a good way to detect the encoding?
Actually I never expected the command locale charmap to work on all platforms.
At least I have tested it on Linux systems of the Debian and Redhat flavours and it works on all of them.
Can you tell us which server you hit this problem on?
Then everyone can test it.
The TextDecoder only supports encoding from this list, and en_IN is not on the list. You cannot simply feed the output from a SSH command directly to a JavaScript function and expect everything will work just right. Maybe do a mapping?
new TextDecoder('en_IN')
This line of code will throw if the encoding label is not a valid one. It seems you didn't even read my JavaScript code, so how could you comment like this?
Oh, sorry just deleted your comment by accident. Here is your comment copied from my email.
First, let me clarify this: I'm not a user of your software. I'm researching this topic, not your software. I came here because of that Google search, and I've confirmed what I expected, so I thought maybe I should share some of my findings as well.
The thing is this: based on the small portion of the SSH specs I have read, as far as I can tell, unlike Telnet, it does not provide any method for the two parties to negotiate charset encoding. To me, that implies the user has to set up the encoding themselves before the connection is made.
Hope this could resolve some confusion created by me :)
Oh, sorry just deleted your comment by accident.
No problem :)
Like I said before,
I never expected the command locale charmap to work on all platforms.
At least I have tested on Linux systems (Debian and Redhat) and they all work.
But thanks for your suggestion.
Also please provide me with the links of your findings and the links of the small portion of the SSH specs you have read.
Created a simple Python script to get the default encoding of your ssh server, for anyone who meets this problem in the future. https://gist.github.com/huashengdun/0af95bdafdce46a6ecbfc628dcd07c29
1. Make sure the locale of your server is configured properly. https://help.ubuntu.com/community/Locale
2. Run this script https://gist.github.com/huashengdun/0af95bdafdce46a6ecbfc628dcd07c29 on your local computer to fetch the default encoding of your server.
3. Log in to your server, then run the command locale charmap.
4. Compare the results of step 2 and step 3.
If these two results are different, please report the information (flavour and edition) of your server here.