NameError and infinite loop in the python client (PIP package)

Open mkosturek opened this issue 5 years ago • 7 comments

Summary

Function send_document throws NameError: name 'file' is not defined when wait_till_finished=True and silent=False. When wait_till_finished=True and silent=True, it falls into an infinite loop instead.

I can see that this bug has probably already been fixed on the master branch; however, the current PIP package is not up to date with these changes.

Steps To Reproduce

  1. pip install parsr-client
  2. In Python shell:
>>> from parsr_client import ParsrClient
>>> parsr = ParsrClient("localhost:3001")
>>> request = parsr.send_document("someDocument.pdf", 
    config_path="config.json", document_name="someDocument", 
    wait_till_finished=True, silent=False)
> Polling server for the job <job-id>...
>> Progress percentage: 0
>> Job done!
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<timed exec> in <module>

/opt/anaconda/miniconda3/envs/docspy37/lib/python3.8/site-packages/parsr_client/parsr_client.py in send_document(self, file_path, config_path, server, document_name, revision, wait_till_finished, refresh_period, save_request_id, silent)
    148                     print('>> Job done!')
    149             return {
--> 150                 'file': file_path,
    151                 'config': config,
    152                 'status_code': r.status_code,

NameError: name 'file' is not defined
  3. In Python shell:
>>> from parsr_client import ParsrClient
>>> parsr = ParsrClient("localhost:3001")
>>> request = parsr.send_document("someDocument.pdf", 
    config_path="config.json", document_name="someDocument", 
    wait_till_finished=True, silent=True)
> Polling server for the job <job-id>...
# infinite loop, not finishing after document has been processed by server

Expected behavior

Works without error and without infinite looping.

mkosturek avatar May 29 '20 10:05 mkosturek

Thanks @jvalls-axa

@mkosturek: You're right, the send_document function in the Python client has been updated, but the changes are still in the develop branch; they will be merged into the master branch with the next minor release. Here is the current signature of the function: https://github.com/axa-group/Parsr/blob/69e6b9bf33f1cc43d5a87d428cedf1132ccc48e8/clients/python-client/parsr_client/parsr_client.py#L73-L83 TL;DR: The file argument has been renamed to file_path to avoid shadowing file, a built-in name in Python 2.
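As an illustration only, here is a minimal sketch of how a partially applied rename like this can produce the reported NameError on Python 3. The function `send_document_sketch` and its return keys are hypothetical and are not the actual parsr_client code; the assumption is simply that one reference to the old name `file` was left behind.

```python
def send_document_sketch(file_path, silent=False):
    """Hypothetical stand-in for illustration; NOT the real parsr_client code."""
    if not silent:
        print('>> Job done!')
    return {
        'file': file_path,
        # Assumed leftover from before the rename: the parameter used to be
        # called `file`. Python 3 has no built-in `file`, so this lookup fails.
        'name': file,
    }


try:
    send_document_sketch("someDocument.pdf")
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)
```

On Python 2 the same stale reference would silently resolve to the built-in `file` type instead of raising, which is one reason this kind of leftover can go unnoticed.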

Thanks for pointing that out; from here on in, we'll try to keep the python client on PIP up to date with the master branch, and not the develop branch.

royjohal avatar May 31 '20 23:05 royjohal

My 2 cents : the parsr service I'm using (via docker) was stuck in an infinite loop when using the parsr API (POST /document).

Fixed when downgrading to v0.12.

marcpicaud avatar Jun 01 '20 23:06 marcpicaud

Thanks @marcpicaud. Could you open an issue with more details?

royjohal avatar Jun 01 '20 23:06 royjohal

Hi @marcpicaud

Could you please try with 0.12.1?

I checked it and it seems that everything works as expected.

jvalls-axa avatar Jun 03 '20 13:06 jvalls-axa

With Docker image v0.12.2 and client v3.2.2, I get stuck here:

[2020-06-19T17:09:34] INFO  (parsr-api/6 on c122c2d7f93b): Running module: ReadingOrderDetectionModule, Options: {"minVerticalGapWidth":20,"minColumnWidthInPagePercent":15}

Looks like an infinite loop to me.

It works fine with Docker image v0.12 and client v3.1.

jfilter avatar Jun 19 '20 17:06 jfilter

I think the fix that's been applied solves the NameError when silent=False, but I don't think it solves the infinite loop when silent=True.

It looks like the problem is the indenting at 140-148:

https://github.com/axa-group/Parsr/blob/f4410d79154ee184fe4e4ed8c556ddb5fbecfa92/clients/python-client/parsr_client/parsr_client.py#L140-L148

The update to server_status_response is part of the if not silent block, so if silent=True the status is never updated.

As a side effect of the indenting, if you do set silent=False, "Job done!" gets printed on every iteration, even if the job isn't done.
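To make the indentation point concrete, here is a minimal sketch of a polling loop with the status refresh moved out of the `if not silent` block. It assumes a loop of roughly this shape; `poll_until_done`, `fetch_status`, and the `done` / `progress` keys are hypothetical names for illustration, not the parsr_client API.

```python
import time


def poll_until_done(fetch_status, refresh_period=0.0, silent=False):
    """Poll until the (hypothetical) status dict reports done=True.

    `fetch_status` is an assumed callable standing in for the request that
    asks the server for the job status.
    """
    status = fetch_status()
    while not status['done']:
        if not silent:
            # Only the printing depends on `silent`.
            print('>> Progress percentage: {}'.format(status['progress']))
        time.sleep(refresh_period)
        # Refreshed on every iteration, whether or not anything is printed.
        status = fetch_status()
    if not silent:
        # Printed once, after the loop has actually finished.
        print('>> Job done!')
    return status


# Simulated server responses, so the sketch runs without a Parsr server.
fake_responses = iter([
    {'done': False, 'progress': 0},
    {'done': False, 'progress': 50},
    {'done': True, 'progress': 100},
])
final_status = poll_until_done(lambda: next(fake_responses), silent=True)
print(final_status)
```

With the refresh outside the `silent` check, the loop terminates for both values of `silent`, and "Job done!" is only printed after the last status is received.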

I'd be happy to open a pull request, if tweaking this sounds like the right solution?

MrAlecJohnson avatar Apr 14 '21 11:04 MrAlecJohnson

I finally solved the "infinite loop" problem after sending a document by downgrading parsr-client to 3.1.0.

Aofei-Chang avatar May 22 '23 08:05 Aofei-Chang