pragmatic_segmenter

return String instead of PragmaticSegmenter::Text

Open maia opened this issue 8 years ago • 0 comments

Currently pragmatic_segmenter returns instances of PragmaticSegmenter::Text, which is a subclass of String. Because pragmatic_tokenizer checks whether text.class == String, and because the returned segments have a different class than the text initially passed in, I suggest returning plain strings instead of instances of a subclass that is only used internally.
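A minimal sketch of why the strict class check fails for a String subclass. `DemoText` and `strict_string?` are hypothetical stand-ins made up for illustration; they are not the actual PragmaticSegmenter::Text class or pragmatic_tokenizer code.

```ruby
# Hypothetical stand-in for an internal String subclass.
class DemoText < String; end

# Hypothetical helper mirroring a strict `text.class == String` check.
def strict_string?(text)
  text.class == String
end

segment = DemoText.new("Hello world.")

segment.is_a?(String)          # true: a subclass instance is still a String
strict_string?(segment)        # false: the exact class is DemoText
strict_string?("Hello world.") # true: a plain String passes
```

A caller that uses `is_a?(String)` would accept both kinds of object, but a caller comparing classes directly will not.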

I wonder if there is a smarter approach than calling #to_s when returning the result, as that would unnecessarily duplicate the strings in memory. Maybe, instead of subclassing String, extend the String class with a module that provides the single method being used? (And pick a method name that has no chance of unexpectedly clashing with anyone's code if they also decide to extend the String class.)
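A rough sketch of the two options, under assumptions: `DemoText`, `DemoSegmenterHelpers`, and `pragmatic_segmenter_demo_squish` are hypothetical names invented for illustration, and the helper body is a placeholder rather than the method the gem actually uses.

```ruby
# Hypothetical stand-in for an internal String subclass.
class DemoText < String; end

# Option 1: convert on return. For a subclass instance, #to_s
# allocates a new plain String object instead of returning self.
wrapped = DemoText.new("Hello   world.")
plain = wrapped.to_s
plain.class           # String
plain.equal?(wrapped) # false: a different object
plain == wrapped      # true: same content

# Option 2: no subclass. Mix a module into String and use a
# deliberately specific method name to avoid clashing with other code.
module DemoSegmenterHelpers
  # Placeholder behavior: collapse runs of whitespace.
  def pragmatic_segmenter_demo_squish
    gsub(/\s+/, " ").strip
  end
end

String.include(DemoSegmenterHelpers)

result = "Hello   world. ".pragmatic_segmenter_demo_squish
result.class # String, so a strict class check passes
```

With the second option the segments stay plain String objects throughout, so no conversion is needed on return; the trade-off is that the helper becomes visible on every String in the process.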

maia, Mar 12 '18 20:03