[Feature Request] Use unsigned bytes for String's buffer
Review Mojo's priorities
- [X] I have read the roadmap and priorities and I believe this request falls within the priorities.
What is your request?
Right now, String's data is stored as DynamicVector[Int8], but it should likely be DynamicVector[UInt8].
What is your motivation for this change?
Signed bytes tend to make users think they are working with numbers rather than raw data. It has also become increasingly common to represent raw bytes as a vector of unsigned bytes (such as Uint8Array in JavaScript or []const u8 in Zig).
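To illustrate the point above, here is a minimal sketch in Python (standing in for Mojo, whose string internals may change) of how the same UTF-8 bytes read under a signed versus an unsigned interpretation. The helper names `as_signed` and `as_unsigned` are hypothetical and exist only for this example.

```python
from array import array


def as_signed(data: bytes) -> list[int]:
    """Interpret each byte as a signed 8-bit value (like Int8)."""
    # Type code "b" is a signed char, so bytes >= 0x80 become negative.
    return array("b", data).tolist()


def as_unsigned(data: bytes) -> list[int]:
    """Interpret each byte as an unsigned 8-bit value (like UInt8)."""
    # Iterating over bytes yields integers in the range 0..255.
    return list(data)


encoded = "é".encode("utf-8")  # two bytes: 0xC3 0xA9

print(as_signed(encoded))    # [-61, -87]
print(as_unsigned(encoded))  # [195, 169]
```

For pure ASCII text the two views agree, since every byte is below 128. They only diverge for multi-byte UTF-8 sequences, which is where negative values tend to be surprising.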
Makes sense, although users should not be working directly with the bytes within a string :) Also, we try to match C semantics here, which uses char * for strings.
Btw, there is a plan to perform optimizations on strings (e.g. small string optimization), so you should never depend on String's layout.
+1, the current implementation needs to be improved a bunch.