PatCit
Making Patent Citations Uncool Again
Bumps [lxml](https://github.com/lxml/lxml) from 4.5.2 to 4.9.1. Changelog Sourced from lxml's changelog. 4.9.1 (2022-07-01) Bugs fixed A crash was resolved when using iterwalk() (or canonicalize()) after parsing certain incorrect input. Note...
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.10 to 1.26.5. Release notes Sourced from urllib3's releases. 1.26.5 :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap Fixed...
I am trying to unnest the front-page npl data on BigQuery so I can merge it into PATSTAT data in Stata. However, both the npl_publn_id and the cited_by variables are...
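A minimal sketch of the unnesting step, done locally in Python on rows already exported from BigQuery. It assumes (this is not confirmed by the issue text) that `cited_by` is a repeated field holding a list of citing-patent identifiers per `npl_publn_id`. The helper name `unnest` and the sample values are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator


def unnest(rows: Iterable[Dict[str, Any]], repeated_field: str) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per element of `repeated_field` (hypothetical helper)."""
    for row in rows:
        # Keep rows whose repeated field is empty or missing, with a None value,
        # so that no npl_publn_id is silently dropped before the merge.
        values = row.get(repeated_field) or [None]
        for value in values:
            flat = {k: v for k, v in row.items() if k != repeated_field}
            flat[repeated_field] = value
            yield flat


# Made-up rows; the real schema may differ (assumption).
sample_rows = [
    {"npl_publn_id": 1, "cited_by": ["US-1-A", "EP-2-B1"]},
    {"npl_publn_id": 2, "cited_by": []},
]

flat_rows = list(unnest(sample_rows, "cited_by"))
for flat_row in flat_rows:
    print(flat_row)
```

The flat rows have one scalar value per column, so they could be written to CSV and merged on `npl_publn_id` in a tool that does not support nested fields.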
From this page: https://cverluise.github.io/PatCit/ to this page: https://cverluise.github.io/notebook
A given title (in `title_j`, `title_m`) can appear in different forms in the database. This might be due to typos (e.g. Ibm Tchnical Disclosure Bulletin), abbreviations (Ibm Tdb), parsing error...
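One possible way to group typo-level variants of a title is greedy clustering on a string-similarity ratio. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the function names and the 0.9 threshold are assumptions chosen for illustration, not something taken from PatCit. Note that this approach only catches near-identical strings: an abbreviation such as "Ibm Tdb" stays in its own group and would need a separate lookup table.

```python
from difflib import SequenceMatcher
from typing import List


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(title.lower().split())


def cluster_titles(titles: List[str], threshold: float = 0.9) -> List[List[str]]:
    """Greedily assign each title to the first cluster whose first member is similar enough."""
    clusters: List[List[str]] = []
    for title in titles:
        key = normalize(title)
        for cluster in clusters:
            ratio = SequenceMatcher(None, key, normalize(cluster[0])).ratio()
            if ratio >= threshold:
                cluster.append(title)
                break
        else:
            # No existing cluster matched: start a new one.
            clusters.append([title])
    return clusters


clusters = cluster_titles(
    [
        "Ibm Technical Disclosure Bulletin",
        "Ibm Tchnical Disclosure Bulletin",  # typo variant from the issue text
        "Ibm Tdb",  # abbreviation: too dissimilar to match by ratio
    ]
)
print(clusters)
```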
Around 0.8% of the NPL publications in the beta dataset have "Pages" as `title_j`. ## How to reproduce the behaviour ```sql SELECT * FROM ( SELECT * FROM `npl-parsing.patcit.beta` WHERE...
Around 10% of the npl_publn in the beta version have neither `title_j` nor `title_m` nor `title_main_a`. Most of the time, parts of these elements are wrongly parsed into `title_main_m`. ##...
## Issue

There are dead links in the `target` field (e.g. Http://Edrm.Net/002/Wp-Content/Uploads/2009/09/Edrm-Legaltech.Pdf).

## How to reproduce the behaviour

```sql
SELECT npl_publn_id, target
FROM `npl-parsing.patcit.beta`
WHERE npl_publn_id=260140
```

|npl_publn_id|target|
|---|----|
...
## Motivation

1. There is a 1:n relation between `title_j` and `title_abbrev_j`. E.g.

| title_j | count_distinct | title_abbrev_j |
|---|---|---|
| Inflammation Research | 3 | Inflamm. res.,Inflamm. Res.,Inflamm...
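To illustrate the 1:n relation, here is a minimal sketch that collapses the abbreviations observed for each `title_j` to the most frequent one. Picking the majority form is an assumption made for this example, not necessarily the approach the issue proposes. The function name and the occurrence counts in the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def canonical_abbreviations(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each title_j to its most frequently observed title_abbrev_j."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for title, abbrev in pairs:
        counts[title][abbrev] += 1
    return {title: counter.most_common(1)[0][0] for title, counter in counts.items()}


# Made-up occurrences using two of the variants quoted above.
observed = [
    ("Inflammation Research", "Inflamm. Res."),
    ("Inflammation Research", "Inflamm. res."),
    ("Inflammation Research", "Inflamm. Res."),
]

canonical = canonical_abbreviations(observed)
print(canonical)
```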
Technical bulletins (e.g. Ibm Technical Disclosure Bulletin) and conferences (e.g. various IEEE conferences) are frequent in NPL citations. Many of them are not covered by Crossref, meaning that they are...