pdfocr
Tesseract 3.02 does not return a list of languages, and pdfocr.rb stops with an error.
--list-langs is no longer a parameter of tesseract, so the call returns the usage message instead. To run pdfocr.rb I had to comment out the following lines.
From line 253 to 276:
if checklang
  langlist = []
  if usecuneiform
    begin
      langlist = `cuneiform -l`.split("\n")[-1].split(":")[-1].delete(".").split(" ")
    rescue
      puts "Unable to list supported languages from cuneiform"
    end
  end
  if usetesseract
    begin
      langlist = `tesseract --list-langs 2>&1`.split("\n")[1..-1]
    rescue
      puts "Unable to list supported languages from tesseract"
    end
  end
  if langlist and not langlist.empty?()
    if not langlist.include?(language)
      puts "Language #{language} is not supported or not installed. Please choose from"
      puts langlist.join(' ')
      exit
    end
  end
end
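A possible alternative to commenting the whole block out, sketched here as an untested idea rather than a fix from the pdfocr author: filter the tesseract output so that only lines that look like a bare language code are kept. A usage message would then produce an empty list, and the existing `langlist.empty?()` test would skip the check. The helper name `parse_lang_list` is hypothetical, and the assumption that language codes are single tokens without whitespace may not hold for every tesseract build.

```ruby
# Hypothetical helper, not part of pdfocr.rb.
# Keeps only lines that look like a bare language code (e.g. "eng", "chi_sim")
# and drops anything containing whitespace, such as a header or usage text.
# Assumption: language codes consist of letters, digits, "_", "-" or "/".
def parse_lang_list(output)
  output.to_s.split("\n").map { |line| line.strip }.select do |line|
    line =~ /\A[\w\/-]+\z/
  end
end

# Intended use inside the usetesseract branch (sketch only):
#   langlist = parse_lang_list(`tesseract --list-langs 2>&1`)
# An empty result means the languages could not be determined, so the
# "if langlist and not langlist.empty?()" test skips the check.
```

With this in place the script would still warn about an unknown language when tesseract does report a list, and would carry on silently when it does not.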
I had the same problem and this just made it work! Thank you