pdf2json
Is auto-merging of broken text blocks active in the latest stable version (3.0.5)?
I'm running into broken text blocks with the latest version of the library.
The PDF looks like this
and this is an example of what I'm getting:
{
  "x": 13.214,
  "y": 5.237,
  "w": 2.542,
  "oc": "#00416e",
  "sw": 0.44271875,
  "A": "left",
  "R": [
    { "T": "QUA", "S": -1, "TS": [ 0, 13.72, 1, 0 ] }
  ]
},
{
  "x": 14.743,
  "y": 5.237,
  "w": 4.115,
  "oc": "#00416e",
  "sw": 0.44271875,
  "A": "left",
  "R": [
    { "T": "DRO%20RS", "S": 9, "TS": [ 0, 12, 1, 0 ] }
  ]
},
Is there any way to activate the auto-merging step when using the library in a Node.js environment? If so, is there any way to tune it? Thank you!
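As a fallback, adjacent runs could be merged in post-processing. The following is only a minimal sketch, assuming text items shaped like the sample output above. `mergeAdjacentText`, `yTolerance`, and `maxDx` are hypothetical names and thresholds, not part of pdf2json, and the values would need tuning per document. It compares start positions only, because it is not clear that `w` is expressed in the same units as `x`.

```javascript
// Hypothetical post-processing helper; NOT part of pdf2json.
// Merges text items that share (nearly) the same y and start close together in x.
function mergeAdjacentText(texts, { yTolerance = 0.05, maxDx = 2 } = {}) {
  // Sort by line (y), then by position on the line (x), without mutating the input.
  const sorted = [...texts].sort((a, b) => a.y - b.y || a.x - b.x);
  const merged = [];
  for (const item of sorted) {
    // T values are URI-encoded in the sample output ("DRO%20RS").
    const text = item.R.map((run) => decodeURIComponent(run.T)).join("");
    const prev = merged[merged.length - 1];
    const sameLine =
      prev !== undefined && Math.abs(prev.y - item.y) <= yTolerance;
    if (sameLine && item.x - prev.lastX <= maxDx) {
      // Close enough to the previous fragment: treat it as a continuation.
      prev.text += text;
      prev.lastX = item.x;
    } else {
      merged.push({ x: item.x, y: item.y, lastX: item.x, text });
    }
  }
  // Drop the internal bookkeeping field before returning.
  return merged.map(({ x, y, text }) => ({ x, y, text }));
}

// The two fragments from the sample output (style fields omitted).
const sample = [
  { x: 13.214, y: 5.237, w: 2.542, R: [{ T: "QUA", S: -1, TS: [0, 13.72, 1, 0] }] },
  { x: 14.743, y: 5.237, w: 4.115, R: [{ T: "DRO%20RS", S: 9, TS: [0, 12, 1, 0] }] },
];
const result = mergeAdjacentText(sample);
console.log(result);
```

With the default thresholds, the two fragments end up in a single item. This does not answer whether the library's own merging step is active; it only works around the symptom, and it will wrongly join separate columns if `maxDx` is set too high.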