
WARNING:root:WARNING: Unable to get integer from ""

JamesParrott opened this issue 1 year ago • 2 comments

🔬 How To Reproduce

Steps to reproduce the behavior:

>pipx install github-dependents-info
>github-dependents-info --repo nvuillam/npm-groovy-lint

produces:

WARNING:root:WARNING: Unable to get integer from ""
WARNING:root:WARNING: Unable to get integer from ""
Total: 24
Public: 24 (1625 stars)
Private: -24

Environment

Windows 11, pipx 1.2.0, CPython 3.11.4

📈 Expected behavior

No warnings during normal operation, as described in the readme.

JamesParrott • Oct 08 '24 15:10

It happens without pipx too, in a normal venv.

It is due to an intentional warning, which I personally find irritating and pointless. If GitHub's API returns an empty string instead of "0", then, in my opinion, empty strings and every other value the API is likely to return should be handled gracefully by applications built on that API.

Please change this warning to a debug message:

https://github.com/nvuillam/github-dependents-info/blob/69ce4e67f022e67a669634d3e5a7d6c3f8d33538/github_dependents_info/gh_dependents_info.py#L482

There is no value in showing it to all users during typical execution, and it detracts from the UX.
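For illustration, here is a minimal sketch of the kind of graceful handling I mean, written around a hypothetical helper (the helper name and the debug-level logging are my suggestion, not the project's current code):

import logging

def parse_count(text: str) -> int:
    # Strip whitespace and thousands separators before converting.
    cleaned = text.strip().replace(",", "")
    if not cleaned:
        # An empty string from GitHub just means no number was shown;
        # default to 0 and log quietly instead of warning every user.
        logging.debug("Empty dependents count, defaulting to 0")
        return 0
    try:
        return int(cleaned)
    except ValueError:
        logging.debug("Non-integer dependents count %r, defaulting to 0", cleaned)
        return 0

With something like this, the two warnings in the repro above would disappear from normal runs while still being recoverable when debug logging is enabled.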

JamesParrott • Oct 08 '24 16:10

I stumbled across this while looking for the reason why the total number of dependents extracted is much smaller than the number reported on the GitHub page. The problem is that, when the pages are accessed unauthenticated, GitHub renders a header containing the same icon the tool uses to identify the bit of HTML that holds the number we want: it appears in the "Product" menu, under the "Code Search" entry.

I should also mention that there is a PR, https://github.com/nvuillam/github-dependents-info/pull/607, that seeks to address the issue but does not appear to do so successfully, although it makes the warning message disappear. It seems to read a number, just the wrong one, at least for the repo I am interested in.

What works for me is to change line https://github.com/nvuillam/github-dependents-info/blob/69ce4e67f022e67a669634d3e5a7d6c3f8d33538/github_dependents_info/gh_dependents_info.py#L78 like so:

svg_items = soup.find_all("svg", {"class": "octicon-code-square"})
svg_item = svg_items[2]  # take the third matching SVG rather than the first

This simply takes the 3rd occurrence of the SVG, which works today but may well break again tomorrow. Perhaps someone has an idea how we can identify the correct section more robustly?
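One direction that might be more robust, sketched below under the assumption that GitHub's global header (where the "Product" menu lives) sits outside the page's main content element, is to scope the search before looking for the icon:

# Sketch only: restrict the search to the main content area so the identical
# icon in the global "Product" menu is never matched. Whether that menu really
# sits outside <main> would need to be checked against the live page.
main_content = soup.find("main") or soup
svg_items = main_content.find_all("svg", {"class": "octicon-code-square"})
svg_item = svg_items[0] if svg_items else None

This avoids hard-coding an index, though it still depends on GitHub's markup staying roughly the same.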

Edit: turns out the index workaround does not solve my original problem, which is that the tool scrapes only a fraction of the dependents I expected. Hence, my journey continues...

alexvoss • Jan 17 '25 16:01