Parsr
NameError and infinite loop in the python client (PIP package)
Summary
Function send_document throws NameError: name 'file' is not defined when wait_till_finished=True and silent=False.
When wait_till_finished=True and silent=True it falls into an infinite loop instead.
I can see that this bug has probably already been fixed on the master branch; however, the current PIP package is not up to date with these changes.
Steps To Reproduce
- pip install parsr-client
- In Python shell:
>>> from parsr_client import ParsrClient
>>> parsr = ParsrClient("localhost:3001")
>>> request = parsr.send_document("someDocument.pdf",
...     config_path="config.json", document_name="someDocument",
...     wait_till_finished=True, silent=False)
> Polling server for the job <job-id>...
>> Progress percentage: 0
>> Job done!
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<timed exec> in <module>
/opt/anaconda/miniconda3/envs/docspy37/lib/python3.8/site-packages/parsr_client/parsr_client.py in send_document(self, file_path, config_path, server, document_name, revision, wait_till_finished, refresh_period, save_request_id, silent)
148 print('>> Job done!')
149 return {
--> 150 'file': file_path,
151 'config': config,
152 'status_code': r.status_code,
NameError: name 'file' is not defined
- In Python shell:
>>> from parsr_client import ParsrClient
>>> parsr = ParsrClient("localhost:3001")
>>> request = parsr.send_document("someDocument.pdf",
...     config_path="config.json", document_name="someDocument",
...     wait_till_finished=True, silent=True)
> Polling server for the job <job-id>...
# infinite loop, not finishing after document has been processed by server
Expected behavior
Works without error or infinite looping.
Thanks @jvalls-axa
@mkosturek : You're right, there has been an update on the function send_document on the python client but the changes are still in the develop branch - they will be merged into the master branch upon the very next minor release.
Here is the current signature of the function:
https://github.com/axa-group/Parsr/blob/69e6b9bf33f1cc43d5a87d428cedf1132ccc48e8/clients/python-client/parsr_client/parsr_client.py#L73-L83
TLDR: The file argument has been renamed to file_path to avoid shadowing file, which is a built-in name in Python 2 (it is not a reserved keyword).
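For anyone hitting the NameError with the released package, here is a minimal sketch of how a rename like this can produce it. The two functions are hypothetical simplifications, not the actual client code; the assumption is that a reference to the old name file was left behind after the parameter was renamed. In Python 3 there is no built-in called file, so the stale name fails at call time.

```python
def send_document_stale(file_path):
    # Hypothetical: the parameter is now file_path, but the body still
    # refers to the old name `file`, which does not exist in Python 3.
    return {"file": file}


def send_document_renamed(file_path):
    # Hypothetical: every reference uses the new parameter name.
    return {"file": file_path}


try:
    send_document_stale("someDocument.pdf")
    message = None
except NameError as err:
    message = str(err)

print(message)
print(send_document_renamed("someDocument.pdf"))
```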
Thanks for pointing that out; from here on in, we'll try to keep the python client on PIP up to date with the master branch, and not the develop branch.
My 2 cents : the parsr service I'm using (via docker) was stuck in an infinite loop when using the parsr API (POST /document).
Fixed when downgrading to v0.12.
Thanks @marcpicaud. Could you open an issue with more details?
Hi @marcpicaud
Could you please try with 0.12.1?
I checked it and it seems that everything works as expected...
With Docker image v0.12.2 and client v3.2.2, I get stuck here:
[2020-06-19T17:09:34] INFO (parsr-api/6 on c122c2d7f93b): Running module: ReadingOrderDetectionModule, Options: {"minVerticalGapWidth":20,"minColumnWidthInPagePercent":15}
Looks like an infinite loop to me.
It works fine with Docker image v0.12 and client v3.1.
I think the fix that's been applied solves the NameError when silent=False, but I don't think it solves the infinite loop when silent=True.
It looks like the problem is the indenting at 140-148:
https://github.com/axa-group/Parsr/blob/f4410d79154ee184fe4e4ed8c556ddb5fbecfa92/clients/python-client/parsr_client/parsr_client.py#L140-L148
The update to server_status_response is part of the if not silent block, so if silent=True the status is never updated.
As a side effect of the indenting, if you do set silent=False, "Job done!" gets printed on every iteration, even if the job isn't done.
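Roughly, the dedented version could look like the sketch below. This is not the client's real code: fetch_status is a hypothetical stand-in for the status request, and the response shape (a dict with "done" and "progress" keys) is an assumption made only for illustration. The point is that the status refresh sits outside the if not silent block, and "Job done!" is printed once after the loop.

```python
import time


def wait_for_job(fetch_status, refresh_period=0.0, silent=False):
    """Poll fetch_status() until it reports the job as finished.

    fetch_status is a hypothetical callable returning a dict such as
    {"done": bool, "progress": int}; the real server response may differ.
    Returns the number of refreshes performed inside the loop.
    """
    polls = 0
    status = fetch_status()
    while not status["done"]:
        if not silent:
            print(">> Progress percentage: {}".format(status["progress"]))
        time.sleep(refresh_period)
        # Refresh on every iteration, regardless of `silent`, so the
        # loop can terminate when silent=True as well.
        status = fetch_status()
        polls += 1
    if not silent:
        # Printed once, only after the loop has actually finished.
        print(">> Job done!")
    return polls


def make_fake_server(responses):
    """Return a callable that yields canned status responses in order."""
    iterator = iter(responses)
    return lambda: next(iterator)


canned = [
    {"done": False, "progress": 0},
    {"done": False, "progress": 50},
    {"done": True, "progress": 100},
]
silent_polls = wait_for_job(make_fake_server(canned), silent=True)
verbose_polls = wait_for_job(make_fake_server(canned), silent=False)
```

With the canned responses, both the silent and the verbose call return after the same number of refreshes, which is the behavior the current indentation prevents.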
I'd be happy to open a pull request, if tweaking this sounds like the right solution?
I finally solved the "infinite loop" problem that occurs after sending a document by downgrading parsr-client to 3.1.0.