uspto-patent-data-parser issues

Results 8 uspto-patent-data-parser issues

Only the first line of claim text is read in

When looking for claim data, only the first line of claim data is ingested. Claims can contain many lines of text. An example: ```xml 1. An imaging lens system including,...

softwaregravy
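One possible way to pick up every line of a claim is to walk all nested text nodes instead of reading only the first `.text` value. The sketch below uses the standard library's `xml.etree.ElementTree` and its `itertext()` method. The sample XML and the helper name `full_claim_text` are made up for illustration; they are not taken from the parser and are not the claim quoted in the issue.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample claim with nested <claim-text> elements (not real patent data).
SAMPLE_CLAIM = """<claim id="CLM-00001" num="00001">
<claim-text>1. A widget comprising:
<claim-text>a first part; and</claim-text>
<claim-text>a second part.</claim-text>
</claim-text>
</claim>"""


def full_claim_text(claim_element):
    """Join every text fragment under the claim, skipping whitespace-only pieces."""
    pieces = (piece.strip() for piece in claim_element.itertext())
    return " ".join(piece for piece in pieces if piece)


claim = ET.fromstring(SAMPLE_CLAIM)

# Reading only `.text` stops at the first nested child element.
first_line_only = claim.find("claim-text").text.strip()

# Walking all text nodes recovers the nested lines as well.
all_lines = full_claim_text(claim)

print(first_line_only)
print(all_lines)
```

Whether the fragments should be joined with spaces or newlines depends on how the claims are consumed downstream; a space is used here only to keep the example short.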

Doc-number in application-reference overwrites doc-number in publication-reference

When parsing the bibliographical information, we just insert the keys ```python invention_title = root_tree.find(invention_title_path) document_data = {} if publication_info != None: publication_reference_info = {element.tag: element.text for element in list(publication_info)} document_data...

softwaregravy
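The overwrite described above happens when two sections share a tag name such as `doc-number` and both are flattened into the same dictionary. A minimal sketch of one way around it is to prefix each key with the name of the reference section it came from. The sample XML layout, the document numbers, and the helper `collect_reference` are assumptions for illustration, not the parser's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliographic fragment; the numbers are invented.
SAMPLE = """<us-bibliographic-data-grant>
<publication-reference><document-id>
<country>US</country><doc-number>10000000</doc-number><kind>B2</kind>
</document-id></publication-reference>
<application-reference><document-id>
<country>US</country><doc-number>15000000</doc-number>
</document-id></application-reference>
</us-bibliographic-data-grant>"""


def collect_reference(root, reference_tag):
    """Return the document-id fields of one reference section, keyed by section name."""
    document_id = root.find(f"{reference_tag}/document-id")
    if document_id is None:
        return {}
    # Prefixing with the section name keeps same-named tags from colliding.
    return {f"{reference_tag}.{child.tag}": child.text for child in document_id}


root = ET.fromstring(SAMPLE)
document_data = {}
for tag in ("publication-reference", "application-reference"):
    document_data.update(collect_reference(root, tag))

print(document_data["publication-reference.doc-number"])
print(document_data["application-reference.doc-number"])
```

A nested dictionary per section would work equally well; the prefixed keys are used here only because they keep the result flat.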

Suggestion about func "read_and_parse_txt_from_disk"

Moreover, I suggest changing this function as follows, because I ran into an encoding problem: ``` def read_and_parse_txt_from_disk(path_to_file,data_items): try: with open(path_to_file,'r',encoding='utf-8') as f: txt = f.read() except: with open(path_to_file,'r',encoding='latin1')...

GengYuIsland
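The fallback idea in this suggestion can also be written so that only decoding failures trigger the second attempt, rather than a bare `except` that would also hide errors such as a missing file. This is a sketch under that assumption; `read_text_with_fallback` is a hypothetical helper, not a function in the parser, and it covers only the file-reading step, not the parsing that follows.

```python
def read_text_with_fallback(path_to_file, encodings=("utf-8", "latin-1")):
    """Try each encoding in order and return the first successful decode."""
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for encoding in encodings:
        try:
            with open(path_to_file, "r", encoding=encoding) as f:
                return f.read()
        except UnicodeDecodeError as error:
            # Only decoding problems move on to the next encoding;
            # a missing file or permission error still propagates.
            last_error = error
    raise last_error
```

Because `latin-1` maps every byte value to a character, it never raises a decoding error, so it works as a last resort but may silently produce wrong characters when the file is in some other encoding.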

Only the first line of claim text is read in

Doc-number in application-reference overwrites doc-number in publication-reference

Suggestion about func "read_and_parse_txt_from_disk"

Parse 1998 data error

No summary parsing logic for xml4 format.

Convert doc-number to patent number?

How the USPTO data is updated.

Update parse_xml_v4_file.py
