swift-models

[WordSeg] Use `Character` instead of `String` in `Alphabet`

Open texasmichelle opened this issue 5 years ago • 0 comments

Currently, `Alphabet`'s dictionary is keyed by `String` rather than `Character` so that it can support tokens longer than one character. Keying by `Character` instead would work if we replaced the multi-character markers `</s>`, `</w>`, and `<pad>` with special Unicode characters or an enum.
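A minimal sketch of the Unicode-character option, for illustration only. The type name `CharacterAlphabet`, its members, and the choice of Private Use Area code points are all assumptions made here, not the existing `Alphabet` API in swift-models.

```swift
// Hypothetical sketch; not the actual swift-models `Alphabet` type.
struct CharacterAlphabet {
  // Assumed stand-ins for "</s>", "</w>", and "<pad>". They come from the
  // Unicode Private Use Area (U+E000...), so ordinary text should not contain them.
  static let eos: Character = "\u{E000}"
  static let eow: Character = "\u{E001}"
  static let pad: Character = "\u{E002}"

  // Keyed by Character rather than String.
  private(set) var dictionary: [Character: Int32] = [:]

  init<S: Sequence>(_ letters: S) where S.Element == Character {
    // Markers get the first ids, followed by the letters in order of appearance.
    for symbol in [Self.eos, Self.eow, Self.pad] + Array(letters) {
      if dictionary[symbol] == nil {
        dictionary[symbol] = Int32(dictionary.count)
      }
    }
  }

  // Returns nil if any character is missing from the alphabet.
  func encode(_ word: String) -> [Int32]? {
    var ids: [Int32] = []
    for character in word {
      guard let id = dictionary[character] else { return nil }
      ids.append(id)
    }
    return ids
  }
}

let alphabet = CharacterAlphabet("abc")
```

With this layout, encoding a word looks up each `Character` directly, without building a one-character `String` per lookup. An enum with cases for the markers plus an associated `Character` case would be the other option mentioned above; it trades the reserved code points for an explicit type.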

Because `Alphabet` is used in so many places in the WordSeg model, making it more efficient is potentially worthwhile.

texasmichelle, Jun 10 '20 19:06