
Problem getting drawings from USPTO

Open amotl opened this issue 2 years ago • 4 comments

Just discovered those in the log files.

WARNING  [patzilla.access.epo.ops.api             ][MainThread] No image information for document=US2022110447A1
INFO     [patzilla.access.uspto.image             ][MainThread] USPTO: Fetching first drawing of "US20220110447A1"
INFO     [patzilla.access.uspto.image             ][MainThread] USPTO: Searching for TIFF document "US20220110447A1" at "http://aiw1.uspto.gov/.aiw?Docid=20220110447&idkey=NONE"
ERROR    [patzilla.access.uspto.image             ][MainThread] We failed to open url "http://aiw1.uspto.gov/.aiw?Docid=20220110447&idkey=NONE". reason=[Errno -2] Name or service not known, code=None
WARNING  [patzilla.access.uspto.image             ][MainThread] No content in main document page 'US20220110447A1' (url: http://aiw1.uspto.gov/.aiw?Docid=20220110447&idkey=NONE)
INFO     [patzilla.access.uspto.image             ][MainThread] USPTO: Searching for TIFF document "US20220110447A1" at "http://aiw2.uspto.gov/.aiw?Docid=20220110447&idkey=NONE"
ERROR    [patzilla.access.uspto.image             ][MainThread] We failed to open url "http://aiw2.uspto.gov/.aiw?Docid=20220110447&idkey=NONE". reason=[Errno -2] Name or service not known, code=None
WARNING  [patzilla.access.uspto.image             ][MainThread] No content in main document page 'US20220110447A1' (url: http://aiw2.uspto.gov/.aiw?Docid=20220110447&idkey=NONE)

It looks like all of patimg1.uspto.gov, patimg2.uspto.gov, aiw1.uspto.gov and aiw2.uspto.gov have been decommissioned.

amotl avatar Apr 14 '22 15:04 amotl

It seems that USPTO no longer provides TIFF images, only PDF: https://patft.uspto.gov/netahtml/PTO/help/images.htm (see "Notices")

Instead of loading a TIFF from http://aiw1.uspto.gov/.aiw?Docid=20220110447&idkey=NONE, you will probably have to load a PDF from https://pdfaiw.uspto.gov/47/2022/04/011/1.pdf.

Apparently, for a Docid "abcd0efghij" the URL to access a PDF of the n-th page is: https://pdfaiw.uspto.gov/ij/abcd/gh/0ef/n.pdf
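
To make the mapping concrete, here is a minimal Python sketch of that rule. The function name `application_pdf_url` and the input validation are my own assumptions and not taken from PatZilla. It only builds the URL string and does no network access.

```python
def application_pdf_url(docid: str, page: int = 1) -> str:
    """Build the direct PDF URL for an 11-digit application Docid.

    Hypothetical helper: it follows the pattern observed in this thread,
    "abcd0efghij" -> https://pdfaiw.uspto.gov/ij/abcd/gh/0ef/n.pdf
    """
    if len(docid) != 11 or not (docid.isascii() and docid.isdigit()):
        raise ValueError(f"Expected an 11-digit Docid, got {docid!r}")

    year = docid[0:4]    # "abcd"
    middle = docid[4:7]  # "0ef"
    upper = docid[7:9]   # "gh"
    last = docid[9:11]   # "ij"

    return f"https://pdfaiw.uspto.gov/{last}/{year}/{upper}/{middle}/{page}.pdf"


# The sample from this thread: page 1 of Docid 20220110447.
print(application_pdf_url("20220110447", page=1))
```

For the sample Docid, this yields the same URL as quoted above, https://pdfaiw.uspto.gov/47/2022/04/011/1.pdf.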

aghster avatar Apr 14 '22 16:04 aghster

Hi @aghster,

thanks. Nice to see you again. You are absolutely right. Currently, I am trying to figure out if I can trust the observation that "Drawings" are always on section 2 / page 2. Can you spot any contradicting samples?

With kind regards, Andreas.

Summary

I am picking two arbitrary samples here. The application is fairly new.

Application

Previous URL: http://aiw1.uspto.gov/.aiw?Docid=20220110447&idkey=NONE
New URL: https://pdfaiw.uspto.gov/.aiw?docid=20220110447&SectionNum=2
Direct access: https://pdfaiw.uspto.gov/47/2022/04/011/2.pdf

Publication

Previous URL: http://patimg1.uspto.gov/.piw?Docid=05123456&idkey=NONE
New URL: https://pdfpiw.uspto.gov/.piw?docid=05123456&SectionNum=2
Direct access: https://pdfpiw.uspto.gov/56/234/051/2.pdf
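
For the publication sample, the URL layout can be sketched the same way. This is a rough Python sketch and only an assumption inferred from the single sample above (8-digit Docid "abcdefgh" -> /gh/def/abc/n.pdf). The name `publication_urls` is hypothetical and not part of PatZilla. It builds strings only and does not contact the server.

```python
def publication_urls(docid: str, section: int = 2) -> dict:
    """Build viewer and direct PDF URLs for an 8-digit publication Docid.

    Hypothetical helper; the path layout is inferred from one sample only:
    05123456 -> https://pdfpiw.uspto.gov/56/234/051/2.pdf
    """
    if len(docid) != 8 or not (docid.isascii() and docid.isdigit()):
        raise ValueError(f"Expected an 8-digit Docid, got {docid!r}")

    first = docid[0:3]   # "051" in the sample
    middle = docid[3:6]  # "234" in the sample
    last = docid[6:8]    # "56" in the sample

    base = "https://pdfpiw.uspto.gov"
    return {
        "viewer": f"{base}/.piw?docid={docid}&SectionNum={section}",
        "direct": f"{base}/{last}/{middle}/{first}/{section}.pdf",
    }


urls = publication_urls("05123456", section=2)
print(urls["viewer"])
print(urls["direct"])
```

Whether this layout holds for other publication numbers would still need to be checked against more samples.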

amotl avatar Apr 14 '22 18:04 amotl

I found a contradicting example. Within the document US10194689B2, the drawings in section 2 only start on page 5.

amotl avatar Apr 14 '22 23:04 amotl

Hi again.

a49eeae34d has a fix for this issue, and b0d8825cb covers it with corresponding test cases. Both are part of #49. Thank you again!

With kind regards, Andreas.

amotl avatar Apr 15 '22 22:04 amotl

Dear @aghster,

USPTO PatFT and AppFT servers have been decommissioned recently, see #61.

With kind regards, Andreas.

amotl avatar Nov 28 '22 02:11 amotl