Bug: UTF-8 codepoint sequence is cut in half

Open • iacore opened this issue 3 years ago • 1 comment

I don't know how to describe this, but the bug is here: https://github.com/vrld/suit/blob/17677826030a7270b474c5717af43834d583094c/theme.lua#L136-L139

This bug crashes the application when the first codepoint of the IME candidate text is outside the ASCII range.

Here's a minimal reproducible main.lua:

function hex_dump (str)
    local len = string.len( str )
    local dump = ""
    local hex = ""
    local asc = ""
    
    for i = 1, len do
        if 1 == i % 8 then
            dump = dump .. hex .. asc .. "\n"
            hex = string.format( "%04x: ", i - 1 )
            asc = ""
        end
        
        local ord = string.byte( str, i )
        hex = hex .. string.format( "%02x ", ord )
        if ord >= 32 and ord <= 126 then
            asc = asc .. string.char( ord )
        else
            asc = asc .. "."
        end
    end

    
    -- pad the last row so the ASCII column lines up (no padding when the row is full)
    return dump .. hex
            .. string.rep( "   ", (8 - len % 8) % 8 ) .. asc
end

function fromhex(a)
    local result = ""
    for i,x in ipairs(a) do
        result = result .. string.char(x)
    end
    return result
end

font = love.graphics.getFont( )
utf8 = require "utf8"
local ct = {
    text = fromhex({
        0xe5,0x87,0xb9
    }),
    start = 0,
}
print(ct.text)
local ss = ct.text:sub(1, utf8.offset(ct.text, ct.start))
print("ct.text:")
print(hex_dump(ct.text))
print("ss:")
print(hex_dump(ss))
local ws = font:getWidth(ss) -- crashes here: ss is not valid UTF-8

iacore • Jan 30 '22 12:01

I don't understand what the code does, but here's what's wrong.

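utf8.offset(s, 0) always returns 1 (the start of the first codepoint in s), so s:sub(1, utf8.offset(s, 0)) keeps only the first byte of s. When the first codepoint is multi-byte, that single byte is not valid UTF-8.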
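
A minimal sketch of one possible fix, not necessarily what suit should adopt. It assumes that `start` is a 0-based count of codepoints before the cursor, which the issue does not confirm. The helper name `utf8_prefix` is hypothetical and is not part of suit or LÖVE. The idea is to end the prefix one byte before the start of codepoint `start + 1`, so a multi-byte sequence is never split.

```lua
local utf8 = require("utf8") -- Lua 5.3+ standard library; LÖVE ships a compatible module

-- Hypothetical helper (not part of suit): returns the first `nchars`
-- codepoints of `s`. Assumes `nchars` is a 0-based codepoint count.
local function utf8_prefix(s, nchars)
    -- utf8.offset(s, n) is the byte index where the n-th codepoint starts,
    -- so the prefix ends one byte before codepoint nchars + 1.
    local next_start = utf8.offset(s, nchars + 1)
    if not next_start then
        return s -- nchars is past the end of the string: keep everything
    end
    return s:sub(1, next_start - 1)
end

local text = string.char(0xe5, 0x87, 0xb9) -- one 3-byte codepoint, as in the repro

-- What the reported code effectively does when start == 0:
local broken = text:sub(1, utf8.offset(text, 0))
print(#broken)                -- 1 (only the lead byte 0xe5 is kept)

-- The sketch cuts on codepoint boundaries instead:
print(#utf8_prefix(text, 0))  -- 0 (empty prefix)
print(#utf8_prefix(text, 1))  -- 3 (the whole codepoint)
```

With `start = 0` the prefix is the empty string, which is safe to pass to `font:getWidth`.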

iacore • Jan 30 '22 12:01