tesseract.js
Is it possible to obtain the Thresholded Image from tesseract?
I would like to have access to the thresholded image created by tesseract. (See function: GetThresholdedImage)
This feature is exposed by tesserocr, a Python wrapper library for tesseract, as a function: https://github.com/sirfz/tesserocr/blob/master/tesserocr.pyx#L1737 There is also a parameter, "tessedit_write_images", that writes the thresholded image to the file "tessinput.tif".
I am using Windows 10 and Node.js with some TIFF files. Setting the parameter "tessedit_write_images" to "T" or "1" does nothing.
This library does not currently include an interface for retrieving the thresholded image. It's theoretically possible to do so by using low-level functions to read the contents of the wasm filesystem and/or memory, but that is not something we currently support with high-level functions.
Feel free to thumbs up this issue if you're reading this and would use this feature if added. It can probably be added in a future release if there's enough demand.
@diogoalmiro This feature has been added in the development branch for version 4 and will be included in that release. That branch is functional at present if you would like to try it out, and is described in more detail in #662. An example has also been included to demonstrate usage.
Closing as this was added in Version 4.
@Balearica I see the example is for the browser, but can you retrieve the created "threshold" images with Node as well? Thanks
@GitMurf Yes, you can also do this on Node. You should be able to adapt the browser example fairly easily.
@GitMurf I added an example for retrieving processed images using Node, which can be found here.
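For readers who cannot follow the link, here is a minimal sketch of the approach on Node. It is not the linked example itself. It assumes tesseract.js v5 or later, where `createWorker('eng')` loads the language (on v4 you would likely need explicit `loadLanguage`/`initialize` calls after `createWorker()`), and it assumes the processed image comes back as a base64 data URL string in `data.imageBinary` when requested through the third (`output`) argument of `recognize`. The helper names `dataUrlToBuffer` and `saveThresholdedImage` are made up for illustration.

```javascript
const fs = require('fs');

// Convert a "data:image/png;base64,..." string into raw bytes.
function dataUrlToBuffer(dataUrl) {
  const base64 = dataUrl.slice(dataUrl.indexOf(',') + 1);
  return Buffer.from(base64, 'base64');
}

// Hypothetical helper: run OCR on `imagePath` and write the thresholded
// (binarized) image to `outPath`. Returns the recognized text.
async function saveThresholdedImage(imagePath, outPath) {
  // Loaded lazily so the helper above can be used without tesseract.js.
  const { createWorker } = require('tesseract.js');
  const worker = await createWorker('eng'); // assumes v5+ signature
  try {
    const { data } = await worker.recognize(
      imagePath,
      {},                                // recognition options (none here)
      { text: true, imageBinary: true }, // request the binarized image
    );
    // Assumption: imageBinary is a data URL string, as in the browser.
    fs.writeFileSync(outPath, dataUrlToBuffer(data.imageBinary));
    return data.text;
  } finally {
    await worker.terminate();
  }
}
```

A call such as `saveThresholdedImage('input.png', 'binary.png')` (file names are placeholders) should leave the thresholded image on disk. Requesting `imageGrey` or `imageColor` in the same `output` object should work the same way, under the same assumptions.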
@Balearica thanks a ton! This is exactly what I needed :) Much appreciated!