pragmatic_segmenter
Golden rule for telephone numbers with letters?
I stumbled upon the following case where the (otherwise wonderful) PragmaticSegmenter trips up: a sentence containing a telephone number with letter characters, such as 800.ACME.NOW, is split after "800.". A spec that reproduces the problem:
it "Telephone number with letters" do
  sentence = "If you have questions, call ACME Enterprises at 800.ACME.NOW (800.123.4567) or visit our website at: ACME-Enterprises.com."
  ps = PragmaticSegmenter::Segmenter.new(text: sentence, language: "en")
  # The whole string is one sentence, so exactly one segment is expected.
  expect(ps.segment).to eq([sentence])
end
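Until there is a rule for this, one possible stopgap is to hide the periods inside such numbers before segmenting and restore them afterwards. The sketch below is only an illustration: the helper names, the `VANITY_PHONE` pattern and the private-use placeholder character are all my own assumptions, not part of PragmaticSegmenter, and I have not checked how the gem itself treats the placeholder.

```ruby
# Hypothetical helpers, not part of PragmaticSegmenter.
# Assumed pattern: three digits, then two dot-separated groups of
# 3-4 uppercase letters or digits (e.g. 800.ACME.NOW, 800.123.4567).
VANITY_PHONE = /\b\d{3}\.[A-Z0-9]{3,4}\.[A-Z0-9]{3,4}\b/

# Assumed placeholder: a private-use code point unlikely to occur in input.
PHONE_DOT = "\uE000"

# Replace the periods inside matching phone numbers with the placeholder.
def mask_vanity_phones(text)
  text.gsub(VANITY_PHONE) { |number| number.tr(".", PHONE_DOT) }
end

# Turn every placeholder back into a period.
def unmask_vanity_phones(text)
  text.tr(PHONE_DOT, ".")
end

# Mask the text, hand it to the given block (any segmenter that returns
# an array of strings), then unmask each returned segment.
def segment_with_phone_guard(text)
  segments = yield mask_vanity_phones(text)
  segments.map { |segment| unmask_vanity_phones(segment) }
end

# Intended use with the gem (untested assumption):
#   segment_with_phone_guard(sentence) do |masked|
#     PragmaticSegmenter::Segmenter.new(text: masked, language: "en").segment
#   end
```

This only sidesteps the problem on the caller's side; a proper fix would still need a rule inside the segmenter itself.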