libipld
dag-cbor: add ability to validate input without fully deserializing
To implement https://github.com/filecoin-project/FIPs/pull/483 securely in the FVM, we need to validate that entry values are well-formed DAG-CBOR payloads. All we need is syntactic validation, without incurring any serde costs or overhead.
Based on the comment at https://github.com/filecoin-project/FIPs/pull/483/files#r1013446172, it sounds like more than syntactic validation is needed, e.g. key ordering. I guess using the Serde code path wouldn't add much overhead if the deserialized values are dropped (I assume something like that is possible, I haven't tried).
If a custom validation function is added, it should be benchmarked against a Serde version, to see whether the performance justifies having separate code that potentially introduces bugs.
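To make the key-ordering point concrete, here is a minimal sketch of what such a check could look like. It assumes the DAG-CBOR rule that map keys are sorted by length first and then bytewise, and that the keys have already been extracted as byte slices borrowed from the input. The names `dag_cbor_key_cmp` and `keys_are_canonical` are hypothetical and not part of libipld or serde_ipld_dagcbor.

```rust
use std::cmp::Ordering;

/// Compare two map keys in DAG-CBOR order: shorter keys sort first,
/// and keys of equal length are compared bytewise.
/// (Hypothetical helper, not an existing libipld API.)
fn dag_cbor_key_cmp(a: &[u8], b: &[u8]) -> Ordering {
    a.len().cmp(&b.len()).then_with(|| a.cmp(b))
}

/// Returns true if `keys` are strictly increasing in DAG-CBOR order.
/// Requiring strict ordering also rejects duplicate keys.
/// The keys are only borrowed, so nothing is copied or allocated.
fn keys_are_canonical(keys: &[&[u8]]) -> bool {
    keys.windows(2)
        .all(|pair| dag_cbor_key_cmp(pair[0], pair[1]) == Ordering::Less)
}
```

With this sketch, the keys `a`, `b`, `aa` pass in that order, while `aa` followed by `b` is rejected because the longer key comes first.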
@vmx How would a serde path work here? This is a dynamic data structure whose schema we don't know.
@raulk: You can deserialize to an Ipld enum. See https://github.com/ipld/serde_ipld_dagcbor/blob/ea9b594421a47ac431627781a65d641ff54a3f2b/tests/de.rs#L88-L98 for a full example.
Yes I know, but validating the whole input would imply deserialising into ipld::Ipld, which uses owned data, so it's not zero-copy syntactical validation?
Here's a PR: https://github.com/ipld/libipld/pull/159 I'm working on removing the recursion. I have some ideas.
> Yes I know, but validating the whole input would imply deserialising into ipld::Ipld, which uses owned data, so it's not zero-copy syntactical validation?
Correct. Though I think zero-copy should be possible with Serde, it's just not implemented in serde_ipld_dagcbor.
@Stebalien hinted that deserializing into IgnoredAny may do the trick here.
Unfortunately, we've found that this doesn't quite work, as upstream (cbor4ii) doesn't validate minimality.
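For anyone following along, "minimality" here means that the length or integer argument in a CBOR head must use the shortest possible encoding. Below is a rough, std-only sketch of that check for major types 0 to 6. The function name `read_minimal_head` is hypothetical. This is not how cbor4ii or serde_ipld_dagcbor are implemented, just an illustration of the rule.

```rust
/// Decode a CBOR head for major types 0..=6 and reject non-minimal
/// encodings. Returns (major type, argument, head length in bytes).
/// (Hypothetical helper; floats and simple values are out of scope.)
fn read_minimal_head(buf: &[u8]) -> Option<(u8, u64, usize)> {
    let first = *buf.first()?;
    let major = first >> 5; // top three bits
    let info = first & 0x1f; // low five bits
    if major == 7 {
        return None; // floats and simple values need separate handling
    }
    // Number of following bytes, and the smallest value that
    // actually needs that many bytes.
    let (extra, min): (usize, u64) = match info {
        0..=23 => return Some((major, u64::from(info), 1)),
        24 => (1, 24),
        25 => (2, 0x100),
        26 => (4, 0x1_0000),
        27 => (8, 0x1_0000_0000),
        _ => return None, // 28..=30 are reserved, 31 is indefinite length
    };
    // Borrow the argument bytes; a truncated input yields None.
    let bytes = buf.get(1..1 + extra)?;
    let value = bytes
        .iter()
        .fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    if value < min {
        return None; // a shorter encoding exists, so this is not minimal
    }
    Some((major, value, 1 + extra))
}
```

For example, the bytes `0x18 0x18` (the integer 24) are accepted, while `0x18 0x17` is rejected because 23 fits directly in the initial byte.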
> as upstream (cbor4ii) doesn't validate minimality.

We already kind of patch upstream in serde_ipld_dagcbor; could it be integrated there? I'd certainly be interested in pushing as much as possible upstream.