Serializing CDATA
My apologies if this is an inappropriate place to raise this issue.
I have the following structure that I am trying to serialize:
use serde::Serialize;

/// `Description` field for `Tag`.
#[derive(Serialize)]
pub(super) struct Description {
    #[serde(rename = "$value")]
    value: String,
}

impl Description {
    /// Create a new `Description`.
    pub(super) fn new(input: &str) -> Self {
        let value = format!("<![CDATA[{}]]>", input);
        Self { value }
    }
}
As you can see, I'm wrapping the value in CDATA, as required by a file format that I'll be writing and have no control over. This structure is actually just part of a larger structure:
#[derive(Serialize)]
#[serde(rename_all = "PascalCase")]
pub(crate) struct Tag {
    name: String,
    tag_type: TagType,
    data_type: DataType,
    dimensions: Option<usize>,
    radix: Radix,
    constant: Constant,
    external_access: ExternalAccess,
    description: Description,
}

impl Tag {
    /// Return an XML `String` representation of `Tag`.
    pub(crate) fn to_xml(&self) -> Result<String, DeError> {
        to_string(self)
    }
}
Which is serializing perfectly (except for the description field):
<Tag Name="test_dint" TagType="Base" DataType="DINT" Radix="Decimal" Constant="false" ExternalAccess="Read Only"><Description>&lt;![CDATA[Test DINT]]&gt;</Description></Tag>
It looks like the CDATA portion is being escaped during serialization:
<Description>&lt;![CDATA[Test DINT]]&gt;</Description>
Is there anything I can do to unescape it ?
Actually, the serializer worked as expected. You should not write <![CDATA[]]> manually as part of your data, because it is part of the format. I suppose that the serializer currently never uses CDATA when serializing strings. That can be improved.
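To illustrate why the markers end up escaped: a text serializer treats the whole string as character data, so every markup-significant character in it is replaced by an entity. The following is only a minimal sketch of that behavior; `escape_text` is a hypothetical stand-in, not quick-xml's actual escaping routine.

```rust
/// Hypothetical, simplified text escaping (NOT quick-xml's real implementation).
/// Each markup-significant character is replaced by its predefined XML entity.
fn escape_text(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            other => out.push(other),
        }
    }
    out
}
```

Under this sketch, `escape_text("<![CDATA[Test DINT]]>")` yields `&lt;![CDATA[Test DINT]]&gt;`: hand-written markers are treated as data rather than markup.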
Because serde's design limits how options can be passed from a structure to the serializer, a possible solution is a serializer setting that controls how strings are written:
- text always
- CDATA always
- a function to select Text/CDATA representation
- an automatic choice based on the content, as XmlBeans, for example, does. That choice could be implemented as the default format selection function from the previous point
PR is welcome!
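A content-based selection function of the kind listed above could look roughly like the sketch below. The names `Repr` and `choose_repr` and both thresholds are assumptions made up for illustration; they are not part of quick-xml and are not taken from any existing implementation.

```rust
/// Hypothetical representation choice for a string value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Repr {
    Text,
    CData,
}

/// Assumed thresholds, chosen only for illustration.
const MIN_LEN: usize = 32;
const MIN_SPECIAL: usize = 5;

/// Pick CDATA only when the value is long AND would need many escapes;
/// otherwise plain escaped text is shorter and more readable.
fn choose_repr(value: &str) -> Repr {
    let special = value
        .chars()
        .filter(|c| matches!(c, '<' | '>' | '&'))
        .count();
    if value.len() > MIN_LEN && special > MIN_SPECIAL {
        Repr::CData
    } else {
        Repr::Text
    }
}
```

With these assumed thresholds, a short value such as "Test DINT" stays plain text, while a long value full of `<`, `>` and `&` is written as CDATA. The "text always" and "CDATA always" options are then just constant functions of the same shape.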
Any update on this issue?
I plan to make a PR that reworks the serializer soon, but I have left CDATA support for potential contributors. I left TODOs where the corresponding code should be added.
I'm willing to work on this if you can point me in the right direction.
For this specific feature, the place where you should choose how to write a string is https://github.com/tafia/quick-xml/blob/fb079b6714d7238d5180aaa098c5f9b02dbcc7da/src/se/content.rs#L64-L76
You can start by adding tests, I think, in dedicated cdata modules placed just after these modules:
https://github.com/tafia/quick-xml/blob/fb079b6714d7238d5180aaa098c5f9b02dbcc7da/src/se/content.rs#L605-L606
https://github.com/tafia/quick-xml/blob/fb079b6714d7238d5180aaa098c5f9b02dbcc7da/src/se/content.rs#L794-L795
You are free to add additional tests if you think that they would be valuable.
After that, you can continue by adding an enum

pub enum TextFormat {
    Text,
    CData,
}

and a field of this type in the serializer, then writing Text or CDATA depending on the value of that field. If you wish, you could also add an Auto variant (and make it the default). An algorithm for choosing one format or the other can be borrowed from something like the one here.
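As a rough sketch of that suggestion, the snippet below keeps the format in a field and branches on it when writing a string. `SketchSerializer`, its `output` buffer and `write_str` are hypothetical names for illustration only; the real serializer in `src/se/content.rs` is structured differently. The CDATA branch also splits any `]]>` in the data across two sections, since that sequence cannot appear inside a single CDATA section.

```rust
/// How string values are written (the enum suggested above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TextFormat {
    Text,
    CData,
}

/// Hypothetical stand-in for the serializer; only the relevant parts are shown.
pub struct SketchSerializer {
    pub output: String,
    pub text_format: TextFormat,
}

impl SketchSerializer {
    /// Write a string value as escaped text or as CDATA, depending on the field.
    pub fn write_str(&mut self, value: &str) {
        match self.text_format {
            TextFormat::Text => {
                // Simplified escaping of markup-significant characters.
                for ch in value.chars() {
                    match ch {
                        '&' => self.output.push_str("&amp;"),
                        '<' => self.output.push_str("&lt;"),
                        '>' => self.output.push_str("&gt;"),
                        other => self.output.push(other),
                    }
                }
            }
            TextFormat::CData => {
                // `]]>` would end the section early, so close the section
                // after `]]` and reopen a new one before `>`.
                self.output.push_str("<![CDATA[");
                self.output.push_str(&value.replace("]]>", "]]]]><![CDATA[>"));
                self.output.push_str("]]>");
            }
        }
    }
}
```

With `text_format` set to `TextFormat::CData`, writing "Test DINT" appends `<![CDATA[Test DINT]]>` to the buffer, which is the output the original question was after. An Auto variant would add a third arm that picks one of the two branches based on the content.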
Has there been any progress on supporting CDATA? Does anyone know if there is a workaround, maybe custom-serializing strings into CDATA using serde?
@Mingun Maybe it would be better to define a new special value $cdata, which deserializes exactly like $text but always serializes as CDATA?
Edit: I dug into the code, and the answer is no; it would be more complex.
How about a solution that allows raw strings without escaped characters? Then at least @keithmss' workaround would work.