pragmatic_segmenter

Instructions for using on the command line

Open: ftyers opened this issue 6 years ago • 1 comment

Would it be possible to get instructions for using this on the command line in a pipe? For example:

$ cat ~/corpora/languages/tatar/wikipedia/wiki.txt |  ruby pragmatic_segmenter.rb 

This gives no output...

ftyers, Jul 14 '18 11:07

This is kind of what I'd like, but probably someone who actually knows Ruby can make a better/more robust version. :)

$ cat segment.rb 

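# Usage: cat input.txt | ruby segment.rb [language_code]
require 'pragmatic_segmenter'

# The language code is the first argument; it defaults to English.
lang = ARGV[0] || "en"

# Segment each input line on its own and print one sentence per line.
STDIN.each_line do |line|
  ps = PragmaticSegmenter::Segmenter.new(text: line, language: lang)
  ps.segment.each do |sentence|
    puts sentence
  end
end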
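
One limitation of the script above is that it segments each line separately, so a sentence that wraps across two lines gets split at the line break. A possible variant, only a sketch, joins the lines of each paragraph before segmenting. It assumes paragraphs in the input are separated by blank lines, and the helper names `each_paragraph` and `segment_stream` are made up for illustration (they are not part of pragmatic_segmenter).

```ruby
# Yield each blank-line-separated paragraph of io as a single string.
# Assumption: the corpus separates paragraphs with blank lines.
def each_paragraph(io)
  buffer = []
  io.each_line do |line|
    if line.strip.empty?
      # A blank line closes the current paragraph, if there is one.
      yield buffer.join(' ') unless buffer.empty?
      buffer = []
    else
      buffer << line.strip
    end
  end
  # Flush the last paragraph when the input does not end with a blank line.
  yield buffer.join(' ') unless buffer.empty?
end

# Segment every paragraph read from input and write one sentence per line.
def segment_stream(input, output, language)
  # Loaded here so the paragraph helper can be used without the gem.
  require 'pragmatic_segmenter'
  each_paragraph(input) do |paragraph|
    segmenter = PragmaticSegmenter::Segmenter.new(text: paragraph, language: language)
    output.puts(segmenter.segment)
  end
end

# Example call from a script: segment_stream($stdin, $stdout, ARGV[0] || 'en')
```

If the corpus really has one paragraph per line, the simpler line-by-line script is enough and this grouping step is not needed.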

ftyers, Jul 14 '18 11:07