pragmatic_segmenter
Instructions for using on the command line
Would it be possible to get instructions for using this on the command line, in a pipe? For example:
$ cat ~/corpora/languages/tatar/wikipedia/wiki.txt | ruby pragmatic_segmenter.rb
This gives no output...
This is kind of what I'd like, but probably someone who actually knows Ruby can make a better/more robust version. :)
$ cat segment.rb
require 'pragmatic_segmenter'
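# Language code comes from the first argument; defaults to English.
lang = ARGV[0] || "en"
# Read standard input line by line and print one sentence per output line.
STDIN.each_line do |line|
  ps = PragmaticSegmenter::Segmenter.new(text: line, language: lang)
  # Call segment once and iterate over its result.
  ps.segment.each do |sentence|
    puts sentence
  end
end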