feature overlap with pragmatic_segmenter?

Open • maia opened this issue 9 years ago • 1 comment

Currently there is some overlap between pragmatic_tokenizer and pragmatic_segmenter; both handle abbreviations, for example. Should rules and constants that are shared between the two gems (especially language-specific ones) be extracted into a sub-gem? Or is there too little shared code to justify this?

And/or: should constant arrays and hashes be moved from Ruby source into .yml files? Perhaps the app would then load them only once, even if both gems use them?
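To make the load-once idea concrete, here is a minimal sketch. It assumes a hypothetical shared module (`SharedLanguageData`) and a hypothetical `en.yml` data file; neither exists in either gem today, and the names are only for illustration.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical shared module; neither gem ships this.
module SharedLanguageData
  @cache = {}
  @load_count = Hash.new(0)

  class << self
    # Exposed only so the demo can show how often a file was parsed.
    attr_reader :load_count

    # Parse the YAML file for a language on first request, then reuse
    # the same frozen object for every later caller in the process.
    def abbreviations(language, dir:)
      @cache[[dir, language]] ||= begin
        @load_count[language] += 1
        YAML.safe_load(File.read(File.join(dir, "#{language}.yml"))).freeze
      end
    end
  end
end

# Demo data: a hypothetical en.yml holding two abbreviations.
DATA_DIR = Dir.mktmpdir
File.write(File.join(DATA_DIR, "en.yml"), ["dr", "mr"].to_yaml)

# Two independent callers (standing in for the two gems) request the same data.
from_tokenizer = SharedLanguageData.abbreviations("en", dir: DATA_DIR)
from_segmenter = SharedLanguageData.abbreviations("en", dir: DATA_DIR)
```

Both callers get the very same object, so the file is parsed once and held in memory once. Note that plain Ruby constants in a file pulled in with `require` are also evaluated only once per process, so the saving here would come from the two gems sharing one copy of the data rather than from the YAML format itself.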

maia, Jan 13 '16 16:01

I'd definitely be open to this if it reduced memory usage, improved speed, or made the gems easier to maintain. It is not high on my priority list right now, but I would of course be open to pull requests.

diasks2, Jan 13 '16 23:01