jomini Differentiate quoted vs unquoted at high level

This PR added a distinction between quoted and unquoted values in tapes: https://github.com/rakaly/jomini/pull/55

Can I distinguish between them in a custom serde Deserializer? Basically, I'm asking if the following code is possible:

#[derive(Debug, Deserialize)]
struct GameData {
    a: QuotedString,
    b: UnquotedString,
}

let data = r#"
    a = "@test"
    b = @test
"#;

Dec 13 '23 06:12 rlidwka

Yup, you can write that code today. The string values themselves will be indistinguishable, but the types give you everything you need to distinguish them. You do need to flesh out the deserialization implementation for QuotedString and UnquotedString.

Dec 13 '23 14:12 nickbabcock

You do need to flesh out the deserialization implementation for QuotedString and UnquotedString.

What do I need to ask from jomini deserializer to distinguish them in my custom Deserialize impl?

This is my deserialization right now:

#[derive(Debug)]
pub enum Test {
    Quoted(String),
    Unquoted(String),
}

impl<'de> serde::Deserialize<'de> for Test {
    fn deserialize<D: serde::Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        struct TVisitor;
        impl<'de> serde::de::Visitor<'de> for TVisitor {
            type Value = Test;

            fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
                formatter.write_str("???")
            }

            fn visit_str<E: serde::de::Error>(self, v: &str) -> Result<Self::Value, E> {
                Ok(Test::Quoted(v.to_owned()))
            }
        }

        deserializer.deserialize_str(TVisitor)
        //deserializer.deserialize_newtype_struct("_maybe_deserializer_hint_here", TVisitor)
    }
}

#[derive(Debug, serde::Deserialize)]
#[allow(unused)]
struct GameData {
    a: Test,
    b: Test,
}

fn main() {
    let data = r#"
        a = "test1"
        b = test2
    "#;
    let game_data: GameData = jomini::text::de::from_utf8_slice(data.as_bytes()).unwrap();
    println!("{:?}", game_data);
}

What do I need to change to actually distinguish those types?

If I ask for str, it's going to give me that string without any markers. There's no deserialize_unquoted_str or visit_unquoted_str in serde (there can't be, since serde is generic).

I suspect there could be some trickery that involves passing a type hint as a struct name (similar to how Property/Operator is implemented right now). But I haven't found one.

Dec 13 '23 22:12 rlidwka

In the first version you know GameData::b should have quotes based on the type.
In the second version it's a different problem, as they now have the same type.

I'm not sure which use case you desire. Let me know more specifics about what you are trying to accomplish and maybe I can help.

Dec 14 '23 02:12 nickbabcock

In the first version you know GameData::b should have quotes based on the type

In the second version it's a different problem, as they now have the same type.

In the second version the distinction between types is moved into enum Test, and in my Rust code I would differentiate based on the enum variant (Test::Quoted or Test::Unquoted). That is, if I manage to deserialize it correctly, which is what I'm asking about.

I'm not sure which use case you desire. Let me know more specifics about what you are trying to accomplish and maybe I can help.

I'm doing variable expansion. It's a custom thing, but similar to the one in HOI4, so let's take that syntax as an example:

# so user can define a variable (here or elsewhere)
@text = "hello world"

# and later she can use that variable:
tooltip = @text
# which would expand into this after some postprocessing:
tooltip = "hello world" 

# but user can write this:
tooltip = "@text"
# which should probably be parsed as a literal text, and not expanded

To do this, I'll need to distinguish between quoted and unquoted strings (to expand one, but not the other).

I can already do that at the Tape level, but I would prefer to check whether the string is quoted in the serde Deserializer. I couldn't find a way to do that.
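
For illustration, here is a minimal sketch of the expansion step itself, assuming the values have already been tagged as quoted or unquoted by some earlier stage. It reuses the shape of the Test enum from above; the expand function and the vars map are hypothetical names, not part of jomini, and producing such tagged values from the serde Deserializer is exactly the open question in this thread.

```rust
use std::collections::HashMap;

/// A string value tagged with how it was written in the source file.
/// Mirrors the `Test` enum from earlier in the thread.
#[derive(Debug, Clone, PartialEq)]
enum Test {
    Quoted(String),
    Unquoted(String),
}

/// Hypothetical helper (not jomini API): expand an unquoted `@name`
/// reference using `vars`, and leave every other value untouched.
fn expand(value: &Test, vars: &HashMap<String, String>) -> Test {
    match value {
        // Only unquoted values that start with '@' and name a known
        // variable are replaced by that variable's text.
        Test::Unquoted(s) => match s.strip_prefix('@').and_then(|name| vars.get(name)) {
            Some(replacement) => Test::Quoted(replacement.clone()),
            None => value.clone(),
        },
        // Quoted values are treated as literal text, even if they
        // happen to look like a variable reference.
        Test::Quoted(_) => value.clone(),
    }
}
```

With text mapped to "hello world", expand turns Test::Unquoted("@text") into Test::Quoted("hello world") and returns Test::Quoted("@text") unchanged. Unknown variables are passed through here; reporting them as errors would be an equally reasonable choice.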

Dec 14 '23 04:12 rlidwka

Ah I see, that is a good use case.

I may need to think more about this. I'm not sure what the best way to solve it is. On one hand, I'm fine with high-level deserialization entailing some level of lossiness. On the other, it'd be ideal to have a workaround.

I'm not sure if you've considered this, but you may be able to work around this issue by creating a preprocessing stage at a lower level that returns text output with all variables expanded (afaik they're defined in the file where they are used). Then you can run a deserializer over this output. I can help explore this solution and make sure everything is available if you want to go down this route.
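
As a rough sketch of what such a preprocessing stage might look like, the following uses only the standard library and works line by line rather than on jomini's tape, so it is a deliberately naive stand-in. It assumes one `key = value` pair per line, that variables are defined before they are used, and that definitions can be dropped from the output; expand_variables is a hypothetical name, not a jomini function.

```rust
use std::collections::HashMap;

/// Hypothetical, naive line-based preprocessor (not jomini API).
/// Collects `@name = value` definitions and replaces unquoted `@name`
/// values on later lines with the defined value.
fn expand_variables(input: &str) -> String {
    let mut vars: HashMap<String, String> = HashMap::new();
    let mut out = String::new();

    for line in input.lines() {
        // Comment lines are copied through untouched.
        let is_comment = line.trim_start().starts_with('#');
        if !is_comment {
            if let Some((key, value)) = line.split_once('=') {
                let (key, value) = (key.trim(), value.trim());
                if let Some(name) = key.strip_prefix('@') {
                    // A definition: remember it and drop it from the output.
                    vars.insert(name.to_string(), value.to_string());
                    continue;
                }
                // An unquoted `@name` value starts with '@' and is replaced.
                // A quoted "@name" starts with '"', so it is left alone.
                if let Some(replacement) = value.strip_prefix('@').and_then(|n| vars.get(n)) {
                    out.push_str(&format!("{} = {}\n", key, replacement));
                    continue;
                }
            }
        }
        out.push_str(line);
        out.push('\n');
    }
    out
}
```

The returned text could then be passed to jomini::text::de::from_utf8_slice as in the example earlier in the thread. A real implementation would more likely walk the tape, since a line-based pass like this one does not handle nested objects, arrays, or several pairs on one line.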

Dec 14 '23 14:12 nickbabcock

If you think it is a good use case and there are currently no solutions, perhaps I can figure out how the deserializers work and send a pull request. I already have some ideas...

Yes, I've considered workarounds, but those are very limiting.

Dec 14 '23 15:12 rlidwka
