Gcloud is not working

Open suryacaprice opened this issue 6 years ago • 16 comments

Waiting for the operation to finish.
Traceback (most recent call last):
  File "/home/caprice/anaconda3/bin/invoice2data", line 11, in <module>
    sys.exit(main())
  File "/home/caprice/anaconda3/lib/python3.6/site-packages/invoice2data/main.py", line 166, in main
    res = extract_data(f.name, templates=templates, input_module=input_module)
  File "/home/caprice/anaconda3/lib/python3.6/site-packages/invoice2data/main.py", line 90, in extract_data
    extracted_str = input_module.to_text(invoicefile).decode('utf-8')
  File "/home/caprice/anaconda3/lib/python3.6/site-packages/invoice2data/input/gvision.py", line 79, in to_text
    json_string = result_blob.download_as_string()
AttributeError: 'NoneType' object has no attribute 'download_as_string'

suryacaprice avatar Dec 03 '18 11:12 suryacaprice

Google Vision needs a lot of setup. You need:

  • API key
  • bucket for results

There are no instructions for this as of now, but it should be clear from the source code. Did you do all the setup tasks correctly before encountering this error?

m3nu avatar Dec 03 '18 12:12 m3nu

Hi, I have done all the configuration: I created the bucket, and the API is mapped to the project with full access.

suryacaprice avatar Dec 03 '18 12:12 suryacaprice

I don't think the gcloud config is the problem here. This line shows the error:

json_string = result_blob.download_as_string()
AttributeError: 'NoneType' object has no attribute 'download_as_string'

suryacaprice avatar Dec 03 '18 12:12 suryacaprice

Then I'd check if you have a result in your bucket because this line just reads the result.

If your configuration is wrong there won't be a result in the bucket and this specific line will fail.

m3nu avatar Dec 03 '18 12:12 m3nu
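
To illustrate m3nu's point: the traceback is consistent with the blob lookup returning None because no result file exists under the expected name. Below is a minimal sketch of a guard, assuming a bucket object whose get_blob(name) returns None for a missing object (as Bucket.get_blob in google-cloud-storage does). The names read_result, InMemoryBucket and InMemoryBlob are hypothetical and not part of invoice2data; the in-memory classes only stand in for a real bucket so the sketch runs without credentials.

```python
class InMemoryBlob:
    """Stand-in for a storage blob, for demonstration only."""

    def __init__(self, data):
        self._data = data

    def download_as_string(self):
        return self._data


class InMemoryBucket:
    """Stand-in for a storage bucket; get_blob returns None if missing."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def get_blob(self, blob_name):
        data = self._objects.get(blob_name)
        return None if data is None else InMemoryBlob(data)


def read_result(bucket, blob_name):
    """Return the OCR result bytes, or fail with a readable error."""
    blob = bucket.get_blob(blob_name)
    if blob is None:
        # Without this check, the next call raises
        # AttributeError: 'NoneType' object has no attribute 'download_as_string'
        raise FileNotFoundError(
            f"No result named {blob_name!r} in the bucket; "
            "check the bucket name, credentials, and output file name."
        )
    return blob.download_as_string()
```

With a check like this, a missing result shows up as an explicit "no result in the bucket" error instead of an AttributeError on None.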

Let me check the configuration again.

suryacaprice avatar Dec 03 '18 12:12 suryacaprice

Does this PDF have multiple pages? If so, there is a problem in gvision.py: the result file name is hardcoded to output-1-to-1.json, but for a PDF with more than one page Google Vision names the JSON file output-1-to-[no. of pages].json. I made a code change that worked for me; please find the code below.

image

ananthnagan avatar Jun 17 '19 11:06 ananthnagan
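
The code in the screenshot is not recoverable here. As a rough sketch of the idea only: the result name can be built from the page count instead of being hardcoded. The pattern output-1-to-N.json is taken from the gvision.py line quoted in a traceback later in this thread. The function name result_blob_name is hypothetical, and the sketch assumes all pages end up in a single output file.

```python
def result_blob_name(result_basename, num_pages):
    """Build the result file name for a PDF with `num_pages` pages.

    Assumes every page is written to one JSON file named
    output-1-to-<num_pages>.json under `result_basename`.
    """
    if num_pages < 1:
        raise ValueError("num_pages must be at least 1")
    return f"{result_basename}/output-1-to-{num_pages}.json"
```

For a single-page PDF this yields the same name as the hardcoded one, so existing single-page behavior would be unchanged.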

So I have made a code change and it worked for me; please find the code below.

You should make a pull request for your fix. Else your improvement will never make it into the official repo and you need to maintain the change during every update.

m3nu avatar Jun 17 '19 11:06 m3nu

Created the pull request.

ananthnagan avatar Jun 17 '19 12:06 ananthnagan

Hi, I am trying to make gvision work. I have my Google credentials JSON and am trying to figure out how to properly connect the bucket. I see that it is a default argument, but none of the API calls refer to a bucket. Where can I specify my bucket? Thanks

EtienneBerube avatar Jul 02 '19 18:07 EtienneBerube

Hey guys, I was running into the same issue and tried ananthnagan's fix. It works for one multi-page PDF but fails for another one. I can't really figure out what the issue would be. Any ideas?

Traceback (most recent call last):
  File "/usr/local/bin/invoice2data", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/invoice2data/main.py", line 201, in main
    res = extract_data(f.name, templates=templates, input_module=input_module)
  File "/usr/local/lib/python3.6/dist-packages/invoice2data/main.py", line 82, in extract_data
    extracted_str = input_module.to_text(invoicefile).decode('utf-8')
  File "/usr/local/lib/python3.6/dist-packages/invoice2data/input/gvision.py", line 35, in to_text
    result_blob_name = result_blob_basename + '/output-1-to-'+str(PdfFileReader(open(path, "rb")).getNumPages())+'.json'
  File "/usr/local/lib/python3.6/dist-packages/PyPDF2/pdf.py", line 1084, in __init__
    self.read(stream)
  File "/usr/local/lib/python3.6/dist-packages/PyPDF2/pdf.py", line 1697, in read
    line = self.readNextEndLine(stream)
  File "/usr/local/lib/python3.6/dist-packages/PyPDF2/pdf.py", line 1938, in readNextEndLine
    x = stream.read(1)

Venerit avatar Jul 02 '19 20:07 Venerit
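
The traceback above stops inside PyPDF2 and does not show the final exception, so the cause is unclear from the thread. One possible workaround, offered as an assumption and not something proposed by the participants, is to avoid counting pages at all and instead look up whatever result files exist under the result prefix. The sketch assumes a bucket object with list_blobs(prefix=...) yielding objects that have a name attribute (as in google-cloud-storage). The names find_result_blobs and InMemoryBucket are hypothetical; the in-memory bucket is a stand-in so the sketch runs without credentials.

```python
import re
from types import SimpleNamespace

# Matches names such as "<basename>/output-1-to-3.json".
_RESULT_RE = re.compile(r"/output-(\d+)-to-(\d+)\.json$")


class InMemoryBucket:
    """Stand-in for a storage bucket, for demonstration only."""

    def __init__(self, names):
        self._names = list(names)

    def list_blobs(self, prefix=None):
        prefix = prefix or ""
        return [SimpleNamespace(name=n) for n in self._names if n.startswith(prefix)]


def find_result_blobs(bucket, result_basename):
    """Return result blobs under `result_basename`, ordered by first page."""
    matches = []
    for blob in bucket.list_blobs(prefix=result_basename + "/"):
        found = _RESULT_RE.search(blob.name)
        if found:
            # Sort numerically so page 11 comes after page 2.
            matches.append((int(found.group(1)), blob))
    matches.sort(key=lambda pair: pair[0])
    return [blob for _, blob in matches]
```

An empty list then means no result was written, which separates a failed or misconfigured OCR run from a PDF that PyPDF2 cannot read.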

Hi, I am trying to make gvision work. I have my Google credentials JSON and am trying to figure out how to properly connect the bucket. I see that it is a default argument, but none of the API calls refer to a bucket. Where can I specify my bucket? Thanks

It's at the top of gvision.py; there you can give your bucket name.

image

ananthnagan avatar Jul 03 '19 06:07 ananthnagan

Right. When you integrate the lib in your own script, you can pass your bucket as an optional keyword argument, as shown by @ananthnagan above.

m3nu avatar Jul 03 '19 07:07 m3nu

This might work for a local solution, but if the code runs in a Docker container that runs pip install, the changes would be overwritten. @ananthnagan seems to get the bucket from the environment variables, which would be a good alternative. Would a PR for this be justified?

EtienneBerube avatar Jul 03 '19 14:07 EtienneBerube
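
A small sketch of how the two suggestions could be combined: an explicit keyword argument wins, then an environment variable, then a built-in default. The environment variable name GVISION_BUCKET, the default value and the function resolve_bucket_name are all hypothetical placeholders, not names used by invoice2data.

```python
import os

ENV_VAR = "GVISION_BUCKET"          # hypothetical variable name
DEFAULT_BUCKET = "example-bucket"   # hypothetical placeholder default


def resolve_bucket_name(bucket_name=None, environ=None):
    """Pick the bucket: explicit argument, then environment, then default."""
    environ = os.environ if environ is None else environ
    if bucket_name:
        return bucket_name
    from_env = environ.get(ENV_VAR)
    if from_env:
        return from_env
    return DEFAULT_BUCKET
```

Reading the value from the environment means a Docker image can be configured at run time without editing the installed package, which is the concern raised above.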

A PR has been created for @ananthnagan's fix: https://github.com/invoice-x/invoice2data/pull/241

EtienneBerube avatar Jul 03 '19 18:07 EtienneBerube

I'm starting to look into gvision. As there are no instructions, can someone point me to the steps needed to make it work? @rmilecki, have you looked into the gvision input module?

bosd avatar Oct 24 '22 15:10 bosd

@bosd: I have zero experience with OCR inputs

rmilecki avatar Jan 25 '23 10:01 rmilecki