LibLine+LibVT: Internal functions to erase spaces and to search character. Also ASCII control characters.

Open ronak69 opened this issue 1 year ago • 3 comments

These are two of the low-hanging (as I once thought; they turned out to be not so low-hanging) internal functions that were yet to be implemented in LibLine.


LibLine: Add internal function to erase consecutive spaces under cursor

LibVT: Ability to generate each of the 32 ASCII control characters

Edit: This is an old commit that has now been updated with new changes and commit message.

    The 128 (7-bit) ASCII codes are divided into 4 groups (selected by the
    2 most significant bits) of 32 codes (indexed by the 5 least significant
    bits) each.

    The first group (bit pattern: 00x'xxxx) consists of the "non-printable"
    characters, which are also the "control" characters. These control
    characters don't have a dedicated key on keyboards; they are supposed to
    be invoked with the "control" key.

    Pressing the control key in combination with another character (which is
    in one of the other three groups) sets the 2 most significant bits of
    that character code to zero (by AND-ing with 0x1f), thus generating a
    control character.

    This also means that there are three ways (corresponding to the three
    groups of printable characters) to produce a control character.

    For example, (for the US-English keyboard layout,) these three key
    combinations will produce the ASCII code for ESCAPE:

        Ctrl+[  Ctrl+Shift+[  and  Ctrl+;

    Previously, the code could generate only 27 control characters: ^A to
    ^Z, and ^\. Now it is possible to generate the remaining 5 (^@, ^[, ^],
    ^^ and ^_), each with three key combinations. :^)

So, this is the controversial one.

On my Linux box, Ctrl+2 generates ^@, but with this commit you will have to press Ctrl+Shift+2 on Serenity to generate ^@ (because Ctrl+2 will generate ^R instead). The same goes for Ctrl+/ -> ^_ and many others.

I don't know exactly what is going on there. That behaviour looks to be dependent on the keymap; otherwise, how would you know that Shift+2 is @?

So is it all right to have three ways to generate a control character, and maybe to deviate from Linux? Is there a standard for this?
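
To make the masking rule from the commit message concrete, here is a minimal sketch. `ctrl_mask` is a hypothetical helper written for illustration, not the actual LibVT code; it only assumes that Ctrl keeps the 5 least significant bits of the ASCII code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the LibVT implementation): model Ctrl+<key> by
// keeping only the 5 least significant bits of the 7-bit ASCII code.
constexpr std::uint8_t ctrl_mask(char c)
{
    return static_cast<std::uint8_t>(c) & 0x1f;
}

// One key from each of the three other groups yields ESCAPE (0x1b).
static_assert(ctrl_mask(';') == 0x1b, "group 01x'xxxx");
static_assert(ctrl_mask('[') == 0x1b, "group 10x'xxxx");
static_assert(ctrl_mask('{') == 0x1b, "group 11x'xxxx (Shift+[ on a US layout)");

// Under pure masking, '2' (0x32) becomes ^R (0x12); only '@' (0x40),
// i.e. Shift+2 on a US layout, becomes ^@ (0x00).
static_assert(ctrl_mask('2') == 0x12, "Ctrl+2 -> ^R");
static_assert(ctrl_mask('@') == 0x00, "Ctrl+Shift+2 -> ^@");
```

The last two assertions show the deviation from Linux: with masking alone, Ctrl+2 lands on ^R rather than ^@.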

LibLine: Add internal functions to search character forwards & backwards

This could use Utf8View::decode_leading_byte() and Utf8View::decode_continuation_byte() for an early abort, but they are private members. Then again, I don't know how you would even input corrupt (illegal in UTF-8) bytes via the keyboard such that the function fails.
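
For reference, the kind of early-abort check meant here can be sketched without touching Utf8View's private members. The two helpers below are hypothetical names for illustration and do not reproduce the AK implementation; they only check the structural bit patterns of UTF-8 and do not reject overlong or out-of-range encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of bytes a UTF-8 sequence should have,
// judged by its leading byte alone. Returns 0 if the byte cannot start
// a sequence (for example, a stray continuation byte).
constexpr std::size_t expected_utf8_length(std::uint8_t lead)
{
    return (lead & 0x80) == 0x00 ? 1  // 0xxxxxxx
        : (lead & 0xe0) == 0xc0  ? 2  // 110xxxxx
        : (lead & 0xf0) == 0xe0  ? 3  // 1110xxxx
        : (lead & 0xf8) == 0xf0  ? 4  // 11110xxx
                                 : 0; // not a valid leading byte
}

// Hypothetical helper: continuation bytes always look like 10xxxxxx.
constexpr bool is_utf8_continuation(std::uint8_t byte)
{
    return (byte & 0xc0) == 0x80;
}

static_assert(expected_utf8_length(0x41) == 1, "'A' is a single byte");
static_assert(expected_utf8_length(0xe2) == 3, "0xE2 starts a 3-byte sequence");
static_assert(expected_utf8_length(0x80) == 0, "continuation byte cannot lead");
static_assert(is_utf8_continuation(0x82), "0x82 is a continuation byte");
```

A reader loop could abort as soon as `expected_utf8_length` returns 0 or a following byte fails `is_utf8_continuation`, instead of waiting for the full sequence.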

ronak69, Jan 29 '24 19:01

https://github.com/SerenityOS/serenity/blob/69964e10f46166ccafd0959d796a5476b9d6f516/Userland/Libraries/LibLine/InternalFunctions.cpp#L19

This can/should also use 0x1f. That way ctrl('a') and ctrl('A') will be the same.
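
A small sketch of why the mask width matters. The 0x3f mask below is an assumption about what a wider mask would look like and is not quoted from the linked line; `ctrl_wide` and `ctrl_narrow` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed wider mask (keeps 6 bits); illustrative only.
constexpr std::uint32_t ctrl_wide(char c)
{
    return static_cast<unsigned char>(c) & 0x3f;
}

// Narrower mask suggested in the comment (keeps 5 bits).
constexpr std::uint32_t ctrl_narrow(char c)
{
    return static_cast<unsigned char>(c) & 0x1f;
}

// With 6 bits kept, the case bit (0x20) survives: 'a' -> 0x21, 'A' -> 0x01.
static_assert(ctrl_wide('a') == 0x21, "lowercase keeps bit 5");
static_assert(ctrl_wide('A') == 0x01, "uppercase maps to ^A");

// With 5 bits kept, both cases collapse onto ^A (0x01).
static_assert(ctrl_narrow('a') == ctrl_narrow('A'), "case-insensitive");
static_assert(ctrl_narrow('a') == 0x01, "both map to ^A");
```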

ronak69, Jan 30 '24 17:01

> I don't know exactly what is going on there. That behaviour looks to be dependent on the keymap; otherwise, how would you know that Shift+2 is @?
>
> So is it all right to have three ways to generate a control character, and maybe to deviate from Linux? Is there a standard for this?

The translation for Ctrl+2 through Ctrl+8 is just hardcoded for historical reasons. AFAIK there's no standard, but it is the de facto behaviour almost everywhere (e.g. in libX11, Linux, OpenBSD), so I think it would be reasonable to be compatible.
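
As an illustration of what such hardcoded aliases can look like, here is a sketch modelled on the X11-style convention as I understand it. The function name is hypothetical and the exact table is an assumption that should be checked against the sources mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical translation of Ctrl+<key> with digit aliases (C++14 or later).
// The table is an assumption modelled on the X11-style convention.
constexpr std::uint8_t ctrl_with_aliases(char c)
{
    auto u = static_cast<unsigned char>(c);
    if ((u >= '@' && u < 0x7f) || u == ' ')
        return u & 0x1f;       // plain masking for '@'..'~' and space
    if (u == '2')
        return 0x00;           // ^@
    if (u >= '3' && u <= '7')
        return u - '3' + 0x1b; // ^[ ^\ ^] ^^ ^_
    if (u == '8')
        return 0x7f;           // DEL
    if (u == '/')
        return 0x1f;           // ^_
    return u;                  // everything else is left unchanged
}

static_assert(ctrl_with_aliases('2') == 0x00, "Ctrl+2 -> ^@");
static_assert(ctrl_with_aliases('3') == 0x1b, "Ctrl+3 -> ^[");
static_assert(ctrl_with_aliases('7') == 0x1f, "Ctrl+7 -> ^_");
static_assert(ctrl_with_aliases('8') == 0x7f, "Ctrl+8 -> DEL");
static_assert(ctrl_with_aliases('/') == 0x1f, "Ctrl+/ -> ^_");
static_assert(ctrl_with_aliases('[') == 0x1b, "Ctrl+[ -> ^[");
```

Note that in this scheme the digits are special-cased rather than masked, so Ctrl+2 yields ^@ regardless of what Shift+2 produces on the active keymap.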

summaryInfo, Jan 31 '24 13:01

> (e.g. in libX11, Linux, OpenBSD)

Thanks for these great links!

Before, I thought that Ctrl+2 was ^@ because Shift+2 is @. But it seems that those aliases are just for convenience.

> So I think it would be reasonable to be compatible.

Yeah, I think so too now. I matched X11. (OpenBSD left out {, |, } and ~ for some reason, and Linux has a few aliases that neither X11 nor OpenBSD has, like Ctrl+' -> ^G and Ctrl+/ -> DEL instead of ^_.)

ronak69, Feb 01 '24 11:02