swift-models
[WordSeg] Alphabet should only contain characters from the training set
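Currently, the WordSeg dataset builds its Alphabet instance from the characters of all of its datasets, but only the training set should contribute characters.
Fixing this potentially involves architectural changes to CharacterSequence, since characters outside the training set would no longer be in the Alphabet. Alternatively, we might be able to keep CharacterSequence as it is and handle those failures differently.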
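
One possible way to "handle failures differently" is sketched below: build the alphabet from the training lines only, and map any character not seen in training to a reserved unknown entry instead of failing. This is only an illustration under stated assumptions. `SketchAlphabet`, its `unknown` marker, and `encode(_:)` are hypothetical names invented for this example; they are not the existing Alphabet or CharacterSequence API in swift-models.

```swift
// Hypothetical stand-in for Alphabet; NOT the swift-models API.
struct SketchAlphabet {
    // Assumed reserved entry for characters that never appear in training.
    static let unknown = "<unk>"

    private(set) var indices: [String: Int]

    /// Builds the alphabet from training lines only.
    init(trainingLines: [String]) {
        indices = [SketchAlphabet.unknown: 0]
        for line in trainingLines {
            for character in line {
                let key = String(character)
                if indices[key] == nil {
                    let next = indices.count
                    indices[key] = next
                }
            }
        }
    }

    var count: Int { indices.count }

    /// Encodes a line, mapping characters unseen in training to the unknown index
    /// rather than reporting a failure.
    func encode(_ line: String) -> [Int] {
        let unknownIndex = indices[SketchAlphabet.unknown]!
        return line.map { indices[String($0)] ?? unknownIndex }
    }
}

let training = ["abc", "cab"]
let validation = ["abd"]  // "d" does not occur in the training lines

let alphabet = SketchAlphabet(trainingLines: training)
let encoded = validation.map { alphabet.encode($0) }

print(alphabet.count)  // 4: "<unk>", "a", "b", "c"
print(encoded)         // [[1, 2, 0]]
```

With this approach the validation and test sets never influence the alphabet size. The trade-off is that all unseen characters become indistinguishable from each other.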