Add serialization support to FstAddOn

Open ebraraktas opened this issue 2 years ago • 0 comments

Description

This PR adds support for serializing and deserializing an FstAddOn to and from binary data. This makes it possible to load a lookahead FST (a special MatcherFst) from binary data.

Changes

  • Added a SerializeBinary requirement to the SerializableFst trait. This makes it possible to read/write an FST from/to a binary data slice instead of a path.
  • Added SerializeBinary for IntInterval, VectorIntervalStore, IntervalSet and LabelReachableData. These implementations are ported from the corresponding OpenFst implementations so that they are compatible with files created by OpenFst. Label is assumed to be 32-bit, because OpenFst uses int for Label.
  • Added an Fst<W> requirement for the fst field of FstAddOn. I think it is better to state this requirement on the struct definition itself.
  • Implemented SerializeBinary for FstAddOn with AddOnPair ((Option<Arc<AO1>>, Option<Arc<AO2>>)). For now, this seems to be the only variant of FstAddOn used in the project. A more generic implementation may be added later.
    • An fst_type field is added to FstAddOn to implement this; OpenFst has a type name field in AddOnImpl as well. I have given the names {i,o}label_lookahead to the FstAddOn variables defined in matcher_fst.rs. The names are taken from OpenFst, and they seem compatible with the binary output of OpenFst.
  • Finally, added a new constructor to MatcherFst that creates it from an already computed (or read) FstAddOn.
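
To illustrate the kind of binary round trip the interval types above need, here is a minimal, self-contained sketch using only the standard library. It is not the code in this PR: `Interval`, `IntervalStore` and the helper functions are hypothetical stand-ins, and the byte layout (little-endian fields, a 64-bit length prefix, a trailing 32-bit count) is an assumption made for illustration that has not been checked against OpenFst output or against rustfst's SerializeBinary trait.

```rust
use std::io::{self, Read, Write};

/// Hypothetical stand-in for an interval over 32-bit labels.
#[derive(Debug, Clone, PartialEq)]
struct Interval {
    begin: i32,
    end: i32,
}

/// Hypothetical stand-in for an interval store: intervals plus a count field.
#[derive(Debug, Clone, PartialEq)]
struct IntervalStore {
    intervals: Vec<Interval>,
    count: i32,
}

fn write_i32<W: Write>(w: &mut W, v: i32) -> io::Result<()> {
    w.write_all(&v.to_le_bytes())
}

fn read_i32<R: Read>(r: &mut R) -> io::Result<i32> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?;
    Ok(i32::from_le_bytes(buf))
}

fn write_i64<W: Write>(w: &mut W, v: i64) -> io::Result<()> {
    w.write_all(&v.to_le_bytes())
}

fn read_i64<R: Read>(r: &mut R) -> io::Result<i64> {
    let mut buf = [0u8; 8];
    r.read_exact(&mut buf)?;
    Ok(i64::from_le_bytes(buf))
}

impl Interval {
    fn write<W: Write>(&self, w: &mut W) -> io::Result<()> {
        write_i32(w, self.begin)?;
        write_i32(w, self.end)
    }

    fn read<R: Read>(r: &mut R) -> io::Result<Self> {
        let begin = read_i32(r)?;
        let end = read_i32(r)?;
        Ok(Interval { begin, end })
    }
}

impl IntervalStore {
    /// Assumed layout: i64 length, then each interval, then an i32 count.
    fn write<W: Write>(&self, w: &mut W) -> io::Result<()> {
        write_i64(w, self.intervals.len() as i64)?;
        for interval in &self.intervals {
            interval.write(w)?;
        }
        write_i32(w, self.count)
    }

    fn read<R: Read>(r: &mut R) -> io::Result<Self> {
        let len = read_i64(r)?;
        if len < 0 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "negative length"));
        }
        // Push one by one instead of pre-allocating from an untrusted length.
        let mut intervals = Vec::new();
        for _ in 0..len {
            intervals.push(Interval::read(r)?);
        }
        let count = read_i32(r)?;
        Ok(IntervalStore { intervals, count })
    }
}

fn main() -> io::Result<()> {
    let store = IntervalStore {
        intervals: vec![Interval { begin: 1, end: 4 }, Interval { begin: 7, end: 9 }],
        count: 5,
    };

    // Serialize into an in-memory buffer rather than a file path.
    let mut bytes = Vec::new();
    store.write(&mut bytes)?;

    // Deserialize from a byte slice; `&[u8]` implements `Read`.
    let mut slice = bytes.as_slice();
    let decoded = IntervalStore::read(&mut slice)?;
    assert_eq!(store, decoded);
    Ok(())
}
```

Reading from any `Read` and writing to any `Write` is what lets the same code serve both file paths and in-memory slices, which is the point of the first change in the list above.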

Status

  • These changes do not break any API currently available.
  • The project's current tests are not affected by these changes, and they pass as expected.
  • I have tested this with an olabel_lookahead FST file created with OpenFst. I have not added a proper test yet. I think it may be necessary to add one.

ebraraktas, Oct 30 '21 10:10