
Tesseract 3.02 does not return the list of languages, and pdfocr.rb stops execution with an error.

Open unclehook opened this issue 11 years ago • 1 comments

--list-langs is no longer a parameter of tesseract; passing it just returns the usage message. To run pdfocr.rb I had to comment out the following lines.

From line 253 to 276:

if checklang
  langlist = []
  if usecuneiform
    begin
      langlist = `cuneiform -l`.split("\n")[-1].split(":")[-1].delete(".").split(" ")
    rescue
      puts "Unable to list supported languages from cuneiform"
    end
  end
  if usetesseract
    begin
      langlist = `tesseract --list-langs 2>&1`.split("\n")[1..-1]
    rescue
      puts "Unable to list supported languages from tesseract"
    end
  end
  if langlist and not langlist.empty?()
    if not langlist.include?(language)
      puts "Language #{language} is not supported or not installed. Please choose from"
      puts langlist.join(' ')
      exit
    end
  end
end
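
A less drastic alternative to commenting the block out might be to treat unrecognized output as "no list available", so the existing empty-list guard skips the check. The sketch below is only an illustration: parse_tesseract_langs and language_ok? are hypothetical helpers that are not part of pdfocr.rb, and the sketch assumes that a tesseract build supporting --list-langs prints a header line starting with "List of available languages" before the language codes.

```ruby
# Hypothetical helper, not part of pdfocr.rb.
# Extracts language codes from the text printed by `tesseract --list-langs`.
# Assumption: a supporting build prints a header line that starts with
# "List of available languages"; anything else (for example a usage
# message) is treated as "no list available" and yields [].
def parse_tesseract_langs(output)
  lines = output.to_s.split("\n").map(&:strip).reject(&:empty?)
  header = lines.index { |l| l =~ /\AList of available languages/i }
  return [] if header.nil?
  # Keep only lines that look like a bare language code, e.g. "eng".
  lines[(header + 1)..-1].select { |l| l =~ /\A[\w\/-]+\z/ }
end

# Hypothetical helper: mirrors the original guard, which only rejects a
# language when a non-empty list was obtained and the language is missing.
def language_ok?(language, langlist)
  langlist.empty? || langlist.include?(language)
end

# Possible use inside the script (untested against pdfocr.rb itself):
#   langlist = parse_tesseract_langs(`tesseract --list-langs 2>&1`)
#   exit unless language_ok?(language, langlist)
```

With this approach a usage message produces an empty list, so the script would continue instead of rejecting every language.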

unclehook, Oct 28 '13 14:10

I had the same problem and this just made it work! Thank you

andrecerda, Nov 07 '13 19:11