crfs-rs

Decouple inference state from model

Open xd009642 opened this issue 3 years ago • 1 comments

When looking at the code, I saw that the Viterbi state is held inside the context and mutated. This prevents the tagger from running multiple calls to tag concurrently, which limits performance for users who want to tag multiple elements in parallel. Instead, they need to wrap the tagger in a mutex or clone it along with all of its data (including the mutable state).

An alternative design that would be more multi-threading friendly is to split the fields that mutate into a new struct, something like ViterbiState, and change the viterbi implementation to fn viterbi(&self, state: &mut ViterbiState). The tag function in the tagger would then become fn tag(&self, xseq: &[T]); it creates a ViterbiState and passes it into the call to viterbi. This would also remove or simplify a lot of the reset code.
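To make the proposal concrete, here is a minimal sketch of the split. It is not the crfs-rs code: Model, Tagger, the flat score layout, and the tag_concurrently helper are all hypothetical, and tag takes precomputed per-item emission scores instead of xseq: &[T] to keep the example short. It assumes num_labels is at least 1 and that every emission row has exactly num_labels entries.

```rust
use std::thread;

/// Read-only parameters (hypothetical, heavily simplified model).
struct Model {
    num_labels: usize,
    /// trans[i * num_labels + j] is the score of moving from label i to label j.
    trans: Vec<f64>,
}

/// Per-call scratch buffers, split out of the shared context.
struct ViterbiState {
    score: Vec<f64>,     // num_items x num_labels, row-major
    backptr: Vec<usize>, // num_items x num_labels, row-major
}

impl ViterbiState {
    fn new(num_items: usize, num_labels: usize) -> Self {
        ViterbiState {
            score: vec![0.0; num_items * num_labels],
            backptr: vec![0; num_items * num_labels],
        }
    }
}

struct Tagger {
    model: Model,
}

impl Tagger {
    /// Only `state` is mutated; `self` stays shared and immutable.
    fn viterbi(&self, emit: &[Vec<f64>], state: &mut ViterbiState) -> Vec<usize> {
        let l = self.model.num_labels;
        let n = emit.len();
        if n == 0 {
            return Vec::new();
        }
        // The first item has no predecessor, so its score is the emission score.
        state.score[..l].copy_from_slice(&emit[0]);
        for t in 1..n {
            for j in 0..l {
                let (mut best, mut arg) = (f64::NEG_INFINITY, 0);
                for i in 0..l {
                    let s = state.score[(t - 1) * l + i] + self.model.trans[i * l + j];
                    if s > best {
                        best = s;
                        arg = i;
                    }
                }
                state.score[t * l + j] = best + emit[t][j];
                state.backptr[t * l + j] = arg;
            }
        }
        // Pick the best final label, then follow the back pointers.
        let last = (n - 1) * l;
        let mut cur = (0..l)
            .max_by(|&a, &b| state.score[last + a].total_cmp(&state.score[last + b]))
            .unwrap();
        let mut path = vec![0; n];
        path[n - 1] = cur;
        for t in (1..n).rev() {
            cur = state.backptr[t * l + cur];
            path[t - 1] = cur;
        }
        path
    }

    /// Takes &self and creates a fresh state per call, so there is nothing to reset.
    fn tag(&self, emit: &[Vec<f64>]) -> Vec<usize> {
        let mut state = ViterbiState::new(emit.len(), self.model.num_labels);
        self.viterbi(emit, &mut state)
    }
}

/// Tags each sequence on its own scoped thread, sharing one tagger by reference.
fn tag_concurrently(tagger: &Tagger, seqs: &[Vec<Vec<f64>>]) -> Vec<Vec<usize>> {
    thread::scope(|s| {
        let handles: Vec<_> = seqs
            .iter()
            .map(|seq| s.spawn(move || tagger.tag(seq)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Because tag only needs &self, a single Tagger can be shared across threads without a mutex or a clone. Allocating a ViterbiState per call is the simplest option; callers who care about allocations could keep one state per thread and pass it to viterbi directly.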

xd009642, Jul 22 '22 07:07

Thanks for posting this. The code was a naive port of crfsuite, so I'm sure a lot can be optimized.

messense, Jul 22 '22 10:07