wasm4

Feature: handle tracef's `%c` as unicode code point

Open zetanumbers opened this issue 3 years ago • 6 comments
Previously, `%c` converted its argument to a UTF-16 char code, so Unicode characters outside the 16-bit range were truncated. Now it's possible to pass any Unicode code point.
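To illustrate the difference, here is a minimal Rust sketch, not the runtime's actual code. Both function names are hypothetical; `from_char_code` only mimics a 16-bit char code conversion by masking the low 16 bits, and unlike a UTF-16 string it cannot represent a lone surrogate, so it returns `None` in that case.

```rust
/// Mimics a 16-bit "char code" conversion: only the low 16 bits survive.
/// Returns None if the truncated value falls in the surrogate range.
fn from_char_code(value: u32) -> Option<char> {
    char::from_u32(value & 0xFFFF)
}

/// Treats the value as a full Unicode code point.
/// Returns None for surrogates and for values above U+10FFFF.
fn from_code_point(value: u32) -> Option<char> {
    char::from_u32(value)
}
```

For U+1F600 the first function yields U+F600 (a private-use character), while the second yields the intended emoji. For values that fit in 16 bits, such as U+00E9, both agree.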

zetanumbers avatar Feb 27 '22 23:02 zetanumbers

We should probably match the same behavior as C's printf, which I think truncates to 8 bits for %c.

For me this program:

printf("Hello %c\n", 12345678);

Prints Hello N.
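The result follows from keeping only the low 8 bits: 12345678 is 0xBC614E, and 0x4E is the ASCII code for `N`. A small Rust sketch of that truncation (the function name is hypothetical):

```rust
/// Keeps only the low 8 bits of the argument, similar to how C's `%c`
/// converts its `int` argument to `unsigned char` before printing it.
fn truncate_to_byte(value: u32) -> char {
    // `as u8` discards everything above the low byte.
    char::from(value as u8)
}
```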

aduros avatar Mar 01 '22 11:03 aduros

> We should probably match the same behavior as C's printf, which I think truncates to 8 bits for %c.

But why? It's not like we are trying to implement libc. With this PR we would be able to pass Rust's `char`, for example.

zetanumbers avatar Mar 01 '22 13:03 zetanumbers

Btw if we truncate, should we truncate to 7 bits for ASCII, or truncate to 8 bits and allow some UTF-16 char codes? Aren't non-ASCII characters for printf OS dependent?

zetanumbers avatar Mar 01 '22 13:03 zetanumbers

Could we truncate to 8 bits? libc printf semantics aren't perfect, but at least they're well-defined and we don't need to document our own special handling of certain features.

For printing unicode characters, isn't it possible to use %s instead of %c? Or just format the string directly in Rust.

aduros avatar Mar 02 '22 14:03 aduros

> Could we truncate to 8 bits? libc printf semantics aren't perfect, but at least they're well-defined and we don't need to document our own special handling of certain features.

Until then, and even if we do truncate to 8 bits, could we at least handle non-ASCII chars as Unicode code points instead of UTF-16 char codes?

zetanumbers avatar Mar 04 '22 08:03 zetanumbers

> For printing unicode characters, isn't it possible to use %s instead of %c? Or just format the string directly in Rust.

The current `%s` implementation only works on null-terminated ASCII strings.

https://github.com/aduros/wasm4/blob/main/runtimes/web/src/runtime.ts#L272
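To show why a decoder that is limited to ASCII matters here, the following Rust sketch contrasts two ways of reading a null-terminated string from memory. It assumes, for illustration only, that the ASCII-only path maps each byte to one character; both function names are hypothetical and this is not the code at the link above.

```rust
/// Reads bytes up to the first NUL and maps each byte to one character.
/// This is correct for ASCII but garbles multi-byte UTF-8 sequences.
fn decode_byte_per_char(memory: &[u8]) -> String {
    memory
        .iter()
        .take_while(|&&b| b != 0)
        .map(|&b| char::from(b))
        .collect()
}

/// Reads bytes up to the first NUL and decodes them as UTF-8.
/// Invalid sequences become U+FFFD instead of causing an error.
fn decode_utf8(memory: &[u8]) -> String {
    let end = memory.iter().position(|&b| b == 0).unwrap_or(memory.len());
    String::from_utf8_lossy(&memory[..end]).into_owned()
}
```

With plain ASCII input the two functions agree. With the UTF-8 bytes of `é` (0xC3 0xA9), the byte-per-character version produces two unrelated characters, while the UTF-8 version recovers `é`.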

To do the equivalent of tracef manually in Rust, you would:

  1. Create an empty string;
  2. Append substrings, numbers, etc. to this string one by one. As it grows, the String reallocates to increase its capacity;
  3. Flush the whole string onto a single line via traceUtf8;
  4. Deallocate the string.
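The four steps above can be sketched as follows. This is a host-side illustration under stated assumptions: `trace_utf8` is a hypothetical stand-in that records the line instead of calling the runtime's `traceUtf8`, `trace_position` is an invented example, and a real cartridge would likely use `alloc` rather than `std`.

```rust
use std::fmt::Write;

/// Hypothetical stand-in for the runtime's `traceUtf8` import, so that the
/// sketch runs on a host machine. It just records the line it is given.
fn trace_utf8(log: &mut Vec<String>, line: &str) {
    log.push(line.to_owned());
}

fn trace_position(log: &mut Vec<String>, name: &str, x: i32, y: i32) {
    // 1. Create an empty string (nothing is allocated yet).
    let mut line = String::new();

    // 2. Append pieces; the String reallocates as it outgrows its capacity.
    line.push_str(name);
    write!(line, " at ({}, {})", x, y).expect("writing to a String cannot fail");

    // 3. Flush the whole line at once.
    trace_utf8(log, &line);

    // 4. `line` is dropped here, which deallocates its buffer.
}
```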

This pulls some runtime support (~7 KiB with all code optimizations enabled) into the binary. It could be smaller (only ~2 KiB) if there were a way to flush the line in parts, requiring no allocations.
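A rough sketch of what flushing in parts could look like on the Rust side, assuming the runtime offered some call that appends a fragment to the current trace line. No such call is described in this thread, so `flush_part`, `PartSink`, and `trace_in_parts` are all hypothetical names. The idea is to implement `fmt::Write` so that each formatted fragment is forwarded immediately instead of being collected in a `String`.

```rust
use std::fmt::{self, Write};

/// Forwards every formatted fragment to `flush_part` as soon as it is
/// produced, so no intermediate buffer is needed. `flush_part` is a
/// hypothetical stand-in for a runtime call that appends to the trace line.
struct PartSink<F: FnMut(&str)> {
    flush_part: F,
}

impl<F: FnMut(&str)> Write for PartSink<F> {
    fn write_str(&mut self, part: &str) -> fmt::Result {
        (self.flush_part)(part);
        Ok(())
    }
}

/// Formats a message piece by piece without building a String first.
fn trace_in_parts<F: FnMut(&str)>(flush_part: F, name: &str, x: i32) -> fmt::Result {
    let mut sink = PartSink { flush_part };
    write!(sink, "{} at x={}", name, x)
}
```

On a host machine the closure can simply collect the fragments to check the result, for example `trace_in_parts(|p| out.push_str(p), "player", 7)` with `out` being a `String`. In a cartridge the closure would instead call whatever partial-flush function the runtime exposed.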

zetanumbers avatar Mar 04 '22 09:03 zetanumbers