rtesseract
RTesseract::ConversionError in Ruby on Rails app
- Gems installed fine; tesseract and ImageMagick are already installed on the server.
- Running the tesseract command manually in a terminal works.
- Running the application locally in an OS X environment works.
uploaded_io = params[:picture]
File.open(Rails.root.join('public', 'uploads', uploaded_io.original_filename), 'wb') do |file|
  file.write(uploaded_io.read)
end
dl = RTesseract.new(Rails.root.join('public', 'uploads', uploaded_io.original_filename).to_s)
@string = dl.to_s
Only once I've deployed to my development server does it all break, returning the error. The files are being copied across to the public/uploads folder correctly and are readable, as tested by running the tesseract command outside of Ruby on the same file.
The result when running the app is RTesseract::ConversionError on the dl.to_s call.
Unsure what I'm missing.
Bump on this, experiencing the same problem.
Yeap,
I'm trying to use the gem in Rails. When running OCR from the console, Rails gives me RTesseract::ConversionError: No such file or directory @ rb_sysopen - /tmp/1451631781.39245151432.txt from /.rbenv/versions/2.2.4/lib/ruby/gems/2.2.0/gems/rtesseract-1.3.2/lib/rtesseract.rb:192:in `convert'
Can you help me?
Regards,
ustuntas
sudo apt-get install tesseract-ocr
@ustuntas it helped me
Hey, I already installed tesseract-ocr but the error is still the same.
This did not help me.
@ustuntas Have you installed all the prerequisites? That means ImageMagick (sudo apt-get install libmagickwand-dev imagemagick on Ubuntu) and one of the RMagick, mini_magick or quick_magick gems.
Try running tesseract on the console with a TIF image.
Hi dannylo,
I am getting this error as well, and have tesseract and ImageMagick installed (I can use both in my terminal). This is what I get in my logs:
RTesseract::ConversionError: No such file or directory @ rb_sysopen - /tmp/1451631781.39245151432.txt
This happens when I call the to_s method. Seems like it is creating the txt file but not saving it? I looked in the /tmp folder and confirmed that it is not there.
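One way to see why the .txt file never appears is to run the tesseract binary directly and look at its stderr. The command-line form `tesseract <image> <outputbase>` writes its result to `<outputbase>.txt`, so if tesseract exits with an error there is nothing to read afterwards. The following is only a rough sketch of that check, not how rtesseract itself is implemented; the helper name `run_tesseract_verbose` and its return hash are made up for illustration.

```ruby
require 'open3'
require 'tmpdir'

# Hypothetical helper (not part of rtesseract): runs the tesseract CLI on an
# image, writing to a temporary output base, and returns the exit status,
# stderr and the recognised text (nil when no .txt file was produced).
def run_tesseract_verbose(image_path, command: 'tesseract')
  Dir.mktmpdir do |dir|
    output_base = File.join(dir, 'ocr')
    # Passing the arguments separately avoids going through a shell.
    _stdout, stderr, status = Open3.capture3(command, image_path.to_s, output_base)
    text_file = "#{output_base}.txt"
    {
      exit_ok: status.success?,
      stderr: stderr.strip,
      text: File.exist?(text_file) ? File.read(text_file) : nil
    }
  end
rescue Errno::ENOENT => e
  # Raised when the tesseract executable itself cannot be found.
  { exit_ok: false, stderr: e.message, text: nil }
end
```

Calling `run_tesseract_verbose('/path/to/image.png')` from the same process that raises the error (for example the Rails console on the server) should show whatever tesseract is complaining about in the `:stderr` entry.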
Hi gang—I was able to fix this problem on my machine. It turns out that my installation of tesseract did not include training files. So when rtesseract was invoking the tesseract code, it was silently failing.
ayerie:POETRY simon$ tesseract numbers.png stdout
Error opening data file /opt/local/share/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
The solution is to grab a copy of the training data from googlecode, and put it where tesseract is by default looking for it.
ayerie:POETRY simon$ wget https://tesseract-ocr.googlecode.com/files/eng.traineddata.gz
--2016-03-29 10:02:50-- https://tesseract-ocr.googlecode.com/files/eng.traineddata.gz
Resolving tesseract-ocr.googlecode.com (tesseract-ocr.googlecode.com)... 74.125.69.82, 2607:f8b0:4001:c08::52
Connecting to tesseract-ocr.googlecode.com (tesseract-ocr.googlecode.com)|74.125.69.82|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 742852 (725K) [application/x-gzip]
Saving to: ‘eng.traineddata.gz’
eng.traineddata.gz 100%[==================================================================================>] 725.44K 831KB/s in 0.9s
2016-03-29 10:02:51 (831 KB/s) - ‘eng.traineddata.gz’ saved [742852/742852]
ayerie:POETRY simon$ gunzip eng.traineddata.gz
ayerie:POETRY simon$ sudo mv -v eng.traineddata /opt/local/share/tessdata/
Password:
eng.traineddata -> /opt/local/share/tessdata/eng.traineddata
Once I did that, rtesseract worked just fine. I hope this helps,
Simon
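A quick check along the same lines can be done from Ruby before calling rtesseract. This is a minimal sketch that only looks for `<lang>.traineddata` files in a directory you pass in; the helper names are hypothetical, and the tessdata location (here /opt/local/share/tessdata, taken from the error message above) will differ between installations.

```ruby
# Hypothetical helper: lists the language codes that have a
# <lang>.traineddata file directly inside the given tessdata directory.
def installed_languages(tessdata_dir)
  Dir.glob(File.join(tessdata_dir, '*.traineddata'))
     .map { |path| File.basename(path, '.traineddata') }
     .sort
end

# Hypothetical helper: true when the requested language file is present.
def language_installed?(tessdata_dir, lang = 'eng')
  installed_languages(tessdata_dir).include?(lang)
end
```

For example, `language_installed?('/opt/local/share/tessdata')` returning false would point to the missing training file described above.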
I'm getting the same issue as @jvalentine, and my base installation of Tesseract already works (e.g. tesseract test.jpg stdout), and the training data is in the correct spot. Any updates on this?
Hello, do you use RMagick or MiniMagick as the processor? Do you have the ImageMagick dev libs installed?
- Ubuntu systems: sudo apt-get install libmagickwand-dev
- RHEL systems: yum install ImageMagick-devel
- Mac: brew install imagemagick
If all prerequisites are working, please send me the inspected error (the full exception details).
Actually, I think when I was installing libmagickwand-dev yesterday it didn't work correctly. Tried again today and it works. Thanks!
Hello everybody, I was having the same problem using Rails. When I used a rake task to call tesseract it worked perfectly, but when running Rails under Apache, for some reason Apache could not find the tesseract command in its PATH.
The solution was really simple for me. I just added the full path for the command:
RTesseract.new(temp_file, command: "/usr/local/bin/tesseract").to_s.strip
That solved my problem. Hope it can help others.
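If you are not sure where the binary lives, or want to see what the web server process can actually find, something like the following could help. It is only a sketch: the `which` helper is hypothetical (it is not part of rtesseract), and the extra directories in the usage note are guesses that you would adjust for your server.

```ruby
# Hypothetical helper: returns the absolute path of the first executable
# called `name` in the given search path, or nil if none is found.
# `search_path` uses the same separator-delimited format as ENV['PATH'].
def which(name, search_path = ENV.fetch('PATH', ''))
  search_path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, name)
    return File.expand_path(candidate) if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end
```

Logging `which('tesseract')` from inside the Rails app shows whether the process sees the binary at all. If it returns nil, a wider search such as `which('tesseract', "#{ENV['PATH']}:/usr/local/bin:/opt/local/bin")` may turn up a path that can then be passed via the `command:` option used above.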
Hi all, Had a similar problem to this. My solution was to open the file with MiniMagick before processing it. My file was stored at a URL but this would likely work with a local file too.
def extract_text
  image = MiniMagick::Image.open(self.file_url)
  image = RTesseract.new(image)
  image.to_s
end
bump having the same issue. Tried all the solutions above to no avail
test = RTesseract.new(img, :processor => "mini_magick", :lang => "eng", command: "/usr/local/bin/tesseract")
test.to_s
still gives me RTesseract::ConversionError: No such file or directory @ rb_sysopen - /var/folders/ms/ml9k4bbn1bx8d8ccz8lrtq0m0000gp/T/1477343324.75567581123.txt
this is the .png image i'm testing with
What fixed it for me was to also install it through brew:
brew install tesseract
I think this issue may be caused by using irb: irb released the file.
I have tried everything but nothing works. Does anyone have a solution for this? I have the same conversion error.
On macOS I had the same issue. To resolve it, I added the absolute path of the tessdata directory as an option to RTesseract.new.
Find your tessdata directory:
find / -type d -name "tessdata" # from cli
`find / -type d -name "tessdata" -print -quit` # in ruby code, find and return first result
image = RTesseract.new(path, {
  :processor => 'mini_magick',
  :tessdata_dir => '/usr/local/Cellar/tesseract/3.05.02/share/tessdata'
})
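Since the Cellar path changes with every tesseract version, it may be nicer to look the directory up instead of hard-coding it. Below is a rough pure-Ruby alternative to shelling out to `find /`. The helper name and the list of default roots are assumptions for illustration, so adjust them to your system; note also that `Dir.glob` does not descend into symlinked directories.

```ruby
# Guessed install prefixes; these are assumptions, not an exhaustive list.
DEFAULT_TESSDATA_ROOTS = [
  '/usr/share',
  '/usr/local/share',
  '/usr/local/Cellar',
  '/opt/local/share',
  '/opt/homebrew/share'
].freeze

# Hypothetical helper: returns every "tessdata" directory under the given
# roots that contains at least one *.traineddata file.
def find_tessdata_dirs(roots = DEFAULT_TESSDATA_ROOTS)
  roots.flat_map { |root| Dir.glob(File.join(root, '**', 'tessdata')) }
       .select { |dir| File.directory?(dir) }
       .reject { |dir| Dir.glob(File.join(dir, '*.traineddata')).empty? }
       .uniq
end
```

The first result, if there is one, could then be passed as the `:tessdata_dir` option shown above, e.g. `RTesseract.new(path, :tessdata_dir => find_tessdata_dirs.first)`.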
For the Heroku-22 stack I was able to get it working just by adding the buildpack
https://github.com/pathwaysmedical/heroku-buildpack-tesseract
before my Ruby buildpack.
I was downloading the file temporarily rather than referencing a URL, so my job code looked like:
file = record.file.download
file_path = file.path
ocr_text = RTesseract.new(file_path).to_s