rbx-dom icon indicating copy to clipboard operation
rbx-dom copied to clipboard

Improved error messages for instance names

Open filiptibell opened this issue 2 years ago • 2 comments

Currently, rbx-dom throws a pretty nondescript error when an instance has a name containing invalid utf-8:

[TRACE rbx_binary::deserializer::state] Known prop, canonical name C0 and type CFrame
[TRACE rbx_binary::chunk] Chunk "PROP" (compressed: 24, len: 37, reserved: 0)
[TRACE rbx_binary::deserializer::state] PROP chunk (RotateP.C1, instance type 59, prop type 59
[TRACE rbx_binary::deserializer::state] Known prop, canonical name C1 and type CFrame
[TRACE rbx_binary::chunk] Chunk "PROP" (compressed: 20, len: 18, reserved: 0)
[TRACE rbx_binary::deserializer::state] PROP chunk (RotateP.Enabled, instance type 59, prop type 59
[TRACE rbx_binary::deserializer::state] Known prop, canonical name Enabled and type Bool
[TRACE rbx_binary::chunk] Chunk "PROP" (compressed: 30, len: 50, reserved: 0)
[TRACE rbx_binary::chunk] Chunk "PROP" (compressed: 69, len: 103, reserved: 0)
[TRACE rbx_binary::deserializer::state] PROP chunk (RotateP.Name, instance type 59, prop type 59

stream did not contain valid UTF-8

This particular instance is more common in Roblox than one might think and is known as a "virus" instance, one created by a malicious script or plugin to be intentionally annoying or disruptive. Developers that get this error generally don't know that they have these, and the instances are also generally harmless, but make deserialization in rbx-dom fail with the above error message.

It would be nice to throw a better error here (and maybe in other cases for invalid utf-8), maybe even mentioning the exact instance and where it is so developers can deal with it accordingly.

filiptibell avatar Jul 06 '23 07:07 filiptibell

Looked into this more, and it's going to be difficult (or impossible) to give much useful info (e.g. the instance's full name) in these errors because the DOM is necessarily incomplete in the middle of a deserialization operation.

Maybe instead of more specific errors, we could just do a lossy conversion for non UTF-8 Variant::String properties? The main drawback is just the fact the original data would be altered, and there may be legitimate uses for non UTF-8 string properties!

Alternatively, we could start reading Variant::String props into a Vec<u8> instead of a String. This would be more correct, but it would be a breaking change and makes string properties more complicated to work with for consumers.

d��������������ng…i got owned…

kennethloeffler avatar Jul 07 '23 23:07 kennethloeffler

Alternatively, we could start reading Variant::String props into a Vec instead of a String. This would be more correct, but it would be a breaking change and makes string properties more complicated to work with for consumers.

This actually sounds great for Lune as a consumer since we could, in many cases, skip the entire conversion step between dom string -> rust string -> lua (c) string. We could also unify some code we have for differences in handling with normal strings / binary strings. I can't really speak for Rojo or any other tools (are there any other than Remodel/Lune?) though.

filiptibell avatar Jul 08 '23 08:07 filiptibell