hematite_nbt icon indicating copy to clipboard operation
hematite_nbt copied to clipboard

Some tags are parsed with incorrect types

Open ToKiNoBug opened this issue 1 year ago • 4 comments

I found that short, int and long tags will not always be parsed correctly, but an minimal integer type that can represent its value. For example, int 10 will be parsed into byte, while int -2147483648 will be parsed into int. Besides, a list of int will be parsed into int array.

I don't understand why, but I think tags are not parsed with incorrect types. I added a test to check how tags are parsed, and that's the result:

Type of "byte" is TAG_Byte
Type of "short" is TAG_Byte, expected TAG_Short
Type of "int" is TAG_Byte, expected TAG_Int
Type of "long" is TAG_Byte, expected TAG_Long
Type of "float" is TAG_Float
Type of "double" is TAG_Float, expected TAG_Double
Type of "string" is TAG_String
Type of "byte array" is TAG_ByteArray
Type of "int array" is TAG_ByteArray, expected TAG_IntArray
Type of "long array" is TAG_ByteArray, expected TAG_LongArray
Type of "compound" is TAG_Compound
Type of "list of byte" is TAG_ByteArray, expected TAG_List
Type of "list of int" is TAG_ByteArray, expected TAG_List
Type of "list of long" is TAG_ByteArray, expected TAG_List

I'm sorry to say that 9 types of tags are parsed with incorrect type.

This test can be found at my fork of this repo. The nbt file is at https://github.com/ToKiNoBug/hematite_nbt/blob/fix-type-parsing/tests/types.nbt.

Here is a copy of the test code:


#[test]
fn test_types() {
    let file = File::open("tests/types.nbt").unwrap();
    let nbt:Result<nbt::Map<String,nbt::Value>,nbt::Error>=nbt::from_gzip_reader(file);
    let nbt=nbt.unwrap();

    let type_lut=[
        ("byte","TAG_Byte"),
        ("short","TAG_Short"),
        ("int","TAG_Int"),
        ("long","TAG_Long"),
        ("float","TAG_Float"),
        ("double","TAG_Double"),
        ("string","TAG_String"),
        ("byte array","TAG_ByteArray"),
        ("int array","TAG_IntArray"),
        ("long array","TAG_LongArray"),
        ("compound","TAG_Compound"),
        ("list of byte","TAG_List"),
        ("list of int","TAG_List"),
        ("list of long","TAG_List"),
    ];

    let mut mismatch_counter=0;
    for (key,expected_type) in type_lut {
        let tag:&nbt::Value=nbt.get(key).unwrap();
        let mut correct=true;
        if tag.tag_name()!=expected_type {
            mismatch_counter += 1;
            correct=false;
        }
        if correct {
            println!("Type of \"{}\" is {}",key,tag.tag_name());
        }
        else {
            eprintln!("Type of \"{}\" is {}, expected {}",key,tag.tag_name(),expected_type);
        }
    }

    if mismatch_counter>0 {
        panic!("{} types mismatched",mismatch_counter);
    }
}

ToKiNoBug avatar Dec 24 '23 11:12 ToKiNoBug

I'm willing to fix it, but I don't know how this happens even though I've read the code.

ToKiNoBug avatar Dec 24 '23 11:12 ToKiNoBug

This looks to be caused by the same underlying issue as #45. For now, it might be best to avoid using nbt::Value when deserializing with serde, and instead have structures with concrete types, or alternatively use the nbt::Blob::from_*reader and nbt::Value::from_*reader api which would avoid serde entirely.

StackDoubleFlow avatar Mar 16 '24 19:03 StackDoubleFlow

Did some digging around and from what I see the issue appears right about here. The code linked eventually calls InnerDecoder::deserialize_any which then executes this line: https://github.com/PistonDevelopers/hematite_nbt/blob/ce60b817f31b20125644c12fbf13f981809d5324/src/de.rs#L290

And while the code is deserializing data for Map, it reads corrently the data types. Just after the code linked in serde executes, for some reason it performs conversion.

Also to note: while the deserialization on map contents is being performed, serde stores data (keys and values) as such

(
    serde::__private::de::content::Content::String("WanderingTraderSpawnChance"),
    serde::__private::de::content::Content::I32(75)
)

mrghosti3 avatar May 28 '24 22:05 mrghosti3

Also might add the output of derive macros (done with cargo-expand):

mod value {
    use crate::Map;
    use std::fmt;
    use std::io;
    use byteorder::{BigEndian, ReadBytesExt, WriteBytesExt};
    use error::{Error, Result};
    use raw;
    /// Values which can be represented in the Named Binary Tag format.
    #[serde(untagged)]
    pub enum Value {
        Byte(i8),
        Short(i16),
        Int(i32),
        Long(i64),
        Float(f32),
        Double(f64),
        ByteArray(Vec<i8>),
        String(String),
        List(Vec<Value>),
        Compound(Map<String, Value>),
        #[serde(serialize_with = "crate::i32_array")]
        IntArray(Vec<i32>),
        #[serde(serialize_with = "crate::i64_array")]
        LongArray(Vec<i64>),
    }
    #[doc(hidden)]
    #[allow(non_upper_case_globals, unused_attributes, unused_qualifications)]
    const _: () = {
        #[allow(unused_extern_crates, clippy::useless_attribute)]
        extern crate serde as _serde;
        #[automatically_derived]
        impl<'de> _serde::Deserialize<'de> for Value {
            fn deserialize<__D>(
                __deserializer: __D,
            ) -> _serde::__private::Result<Self, __D::Error>
            where
                __D: _serde::Deserializer<'de>,
            {
                let __content = <_serde::__private::de::Content as _serde::Deserialize>::deserialize(
                    __deserializer,
                )?;
                let __deserializer = _serde::__private::de::ContentRefDeserializer::<
                    __D::Error,
                >::new(&__content);
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <i8 as _serde::Deserialize>::deserialize(__deserializer),
                    Value::Byte,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <i16 as _serde::Deserialize>::deserialize(__deserializer),
                    Value::Short,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <i32 as _serde::Deserialize>::deserialize(__deserializer),
                    Value::Int,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <i64 as _serde::Deserialize>::deserialize(__deserializer),
                    Value::Long,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <f32 as _serde::Deserialize>::deserialize(__deserializer),
                    Value::Float,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <f64 as _serde::Deserialize>::deserialize(__deserializer),
                    Value::Double,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <Vec<i8> as _serde::Deserialize>::deserialize(__deserializer),
                    Value::ByteArray,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <String as _serde::Deserialize>::deserialize(__deserializer),
                    Value::String,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <Vec<Value> as _serde::Deserialize>::deserialize(__deserializer),
                    Value::List,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <Map<
                        String,
                        Value,
                    > as _serde::Deserialize>::deserialize(__deserializer),
                    Value::Compound,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <Vec<i32> as _serde::Deserialize>::deserialize(__deserializer),
                    Value::IntArray,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                if let _serde::__private::Ok(__ok) = _serde::__private::Result::map(
                    <Vec<i64> as _serde::Deserialize>::deserialize(__deserializer),
                    Value::LongArray,
                ) {
                    return _serde::__private::Ok(__ok);
                }
                _serde::__private::Err(
                    _serde::de::Error::custom(
                        "data did not match any variant of untagged enum Value",
                    ),
                )
            }
        }
    };
    #[doc(hidden)]
    #[allow(non_upper_case_globals, unused_attributes, unused_qualifications)]
    const _: () = {
        #[allow(unused_extern_crates, clippy::useless_attribute)]
        extern crate serde as _serde;
        #[automatically_derived]
        impl _serde::Serialize for Value {
            fn serialize<__S>(
                &self,
                __serializer: __S,
            ) -> _serde::__private::Result<__S::Ok, __S::Error>
            where
                __S: _serde::Serializer,
            {
                match *self {
                    Value::Byte(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::Short(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::Int(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::Long(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::Float(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::Double(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::ByteArray(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::String(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::List(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::Compound(ref __field0) => {
                        _serde::Serialize::serialize(__field0, __serializer)
                    }
                    Value::IntArray(ref __field0) => {
                        _serde::Serialize::serialize(
                            {
                                #[doc(hidden)]
                                struct __SerializeWith<'__a> {
                                    values: (&'__a Vec<i32>,),
                                    phantom: _serde::__private::PhantomData<Value>,
                                }
                                impl<'__a> _serde::Serialize for __SerializeWith<'__a> {
                                    fn serialize<__S>(
                                        &self,
                                        __s: __S,
                                    ) -> _serde::__private::Result<__S::Ok, __S::Error>
                                    where
                                        __S: _serde::Serializer,
                                    {
                                        crate::i32_array(self.values.0, __s)
                                    }
                                }
                                &__SerializeWith {
                                    values: (__field0,),
                                    phantom: _serde::__private::PhantomData::<Value>,
                                }
                            },
                            __serializer,
                        )
                    }
                    Value::LongArray(ref __field0) => {
                        _serde::Serialize::serialize(
                            {
                                #[doc(hidden)]
                                struct __SerializeWith<'__a> {
                                    values: (&'__a Vec<i64>,),
                                    phantom: _serde::__private::PhantomData<Value>,
                                }
                                impl<'__a> _serde::Serialize for __SerializeWith<'__a> {
                                    fn serialize<__S>(
                                        &self,
                                        __s: __S,
                                    ) -> _serde::__private::Result<__S::Ok, __S::Error>
                                    where
                                        __S: _serde::Serializer,
                                    {
                                        crate::i64_array(self.values.0, __s)
                                    }
                                }
                                &__SerializeWith {
                                    values: (__field0,),
                                    phantom: _serde::__private::PhantomData::<Value>,
                                }
                            },
                            __serializer,
                        )
                    }
                }
            }
        }
    };
    #[automatically_derived]
    impl ::core::clone::Clone for Value {
        #[inline]
        fn clone(&self) -> Value {
            match self {
                Value::Byte(__self_0) => {
                    Value::Byte(::core::clone::Clone::clone(__self_0))
                }
                Value::Short(__self_0) => {
                    Value::Short(::core::clone::Clone::clone(__self_0))
                }
                Value::Int(__self_0) => Value::Int(::core::clone::Clone::clone(__self_0)),
                Value::Long(__self_0) => {
                    Value::Long(::core::clone::Clone::clone(__self_0))
                }
                Value::Float(__self_0) => {
                    Value::Float(::core::clone::Clone::clone(__self_0))
                }
                Value::Double(__self_0) => {
                    Value::Double(::core::clone::Clone::clone(__self_0))
                }
                Value::ByteArray(__self_0) => {
                    Value::ByteArray(::core::clone::Clone::clone(__self_0))
                }
                Value::String(__self_0) => {
                    Value::String(::core::clone::Clone::clone(__self_0))
                }
                Value::List(__self_0) => {
                    Value::List(::core::clone::Clone::clone(__self_0))
                }
                Value::Compound(__self_0) => {
                    Value::Compound(::core::clone::Clone::clone(__self_0))
                }
                Value::IntArray(__self_0) => {
                    Value::IntArray(::core::clone::Clone::clone(__self_0))
                }
                Value::LongArray(__self_0) => {
                    Value::LongArray(::core::clone::Clone::clone(__self_0))
                }
            }
        }
    }
    #[automatically_derived]
    impl ::core::fmt::Debug for Value {
        #[inline]
        fn fmt(&self, f: &mut ::core::fmt::Formatter) -> ::core::fmt::Result {
            match self {
                Value::Byte(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "Byte",
                        &__self_0,
                    )
                }
                Value::Short(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "Short",
                        &__self_0,
                    )
                }
                Value::Int(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "Int",
                        &__self_0,
                    )
                }
                Value::Long(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "Long",
                        &__self_0,
                    )
                }
                Value::Float(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "Float",
                        &__self_0,
                    )
                }
                Value::Double(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "Double",
                        &__self_0,
                    )
                }
                Value::ByteArray(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "ByteArray",
                        &__self_0,
                    )
                }
                Value::String(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "String",
                        &__self_0,
                    )
                }
                Value::List(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "List",
                        &__self_0,
                    )
                }
                Value::Compound(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "Compound",
                        &__self_0,
                    )
                }
                Value::IntArray(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "IntArray",
                        &__self_0,
                    )
                }
                Value::LongArray(__self_0) => {
                    ::core::fmt::Formatter::debug_tuple_field1_finish(
                        f,
                        "LongArray",
                        &__self_0,
                    )
                }
            }
        }
    }
    #[automatically_derived]
    impl ::core::marker::StructuralPartialEq for Value {}
    #[automatically_derived]
    impl ::core::cmp::PartialEq for Value {
        #[inline]
        fn eq(&self, other: &Value) -> bool {
            let __self_tag = ::core::intrinsics::discriminant_value(self);
            let __arg1_tag = ::core::intrinsics::discriminant_value(other);
            __self_tag == __arg1_tag
                && match (self, other) {
                    (Value::Byte(__self_0), Value::Byte(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::Short(__self_0), Value::Short(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::Int(__self_0), Value::Int(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::Long(__self_0), Value::Long(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::Float(__self_0), Value::Float(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::Double(__self_0), Value::Double(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::ByteArray(__self_0), Value::ByteArray(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::String(__self_0), Value::String(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::List(__self_0), Value::List(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::Compound(__self_0), Value::Compound(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::IntArray(__self_0), Value::IntArray(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    (Value::LongArray(__self_0), Value::LongArray(__arg1_0)) => {
                        *__self_0 == *__arg1_0
                    }
                    _ => unsafe { ::core::intrinsics::unreachable() }
                }
        }
    }
    impl Value {
        // Nothing new here
    }

    // Various impls
}

mrghosti3 avatar May 28 '24 22:05 mrghosti3