Unexpected behavior of `start_index` in `search_citedby` results in an empty generator
Describe the bug
When using the `search_citedby` function from the `scholarly` package with a non-zero `start_index`, the returned generator is expected to skip the specified number of articles and return the remaining articles that cite the given `PUBLICATION_ID`. However, when `start_index` is set to any value greater than 0, the generator unexpectedly contains 0 items, even though the corresponding Google Scholar URI (`/scholar?hl=en&cites=16837829726140559426&as_ylo=2023&as_yhi=2023&as_vis=0&as_sdt=0,33&start=270`, obtained from `cited_by._url`) shows the correct page of results when opened directly in a browser.
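To double-check that the URI really carries the requested offset and year range, its query string can be decoded with the standard library. This is only a small sketch for inspecting the URI quoted above; it does not contact Google Scholar, and the variable names are mine.

```python
from urllib.parse import parse_qs, urlsplit

# URI copied from this report (the value of cited_by._url).
uri = (
    "/scholar?hl=en&cites=16837829726140559426"
    "&as_ylo=2023&as_yhi=2023&as_vis=0&as_sdt=0,33&start=270"
)

# parse_qs maps each query key to a list of string values.
params = parse_qs(urlsplit(uri).query)

print(params["cites"])                      # publication whose citations are listed
print(params["start"])                      # offset requested through start_index
print(params["as_ylo"], params["as_yhi"])   # year range (year_low, year_high)
```

The `start` value matches the `start_index` passed to `search_citedby`, so the URL itself looks correctly constructed.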
To Reproduce
from scholarly import scholarly
PUBLICATION_ID = 16837829726140559426
# This should skip the first 270 articles and return the rest from the year 2023
cited_by = scholarly.search_citedby(PUBLICATION_ID, start_index=270, year_low=2023, year_high=2023)
total_results = cited_by._get_total_results()
print(f"Total results: {total_results}") # 0
When I run this code snippet with `start_index` equal to 0, it prints `Total results: 3340`, but with any positive value for `start_index` it prints `Total results: 0`.
Expected behavior
The expected behavior is that the generator skips the first `start_index` articles and returns the remaining articles that cite the `PUBLICATION_ID` from the year 2023. With 3340 total results and `start_index=270`, the code snippet should print `Total results: 3070`.
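As a possible stopgap, the skipping could be done on the client side by requesting results from index 0 and discarding the first items with `itertools.islice`. The sketch below uses a stand-in generator (`fake_citedby`, hypothetical) instead of a real `scholarly.search_citedby(...)` call so that it runs offline; I have not verified it against Google Scholar, and it does not fix the underlying bug.

```python
from itertools import islice


def skip_first(results, n):
    """Return an iterator over `results` with the first `n` items dropped."""
    return islice(results, n, None)


def fake_citedby(total):
    """Hypothetical stand-in for the iterator returned by search_citedby."""
    for rank in range(total):
        yield {"rank": rank}


# Same numbers as in this report: 3340 results, skip the first 270.
remaining = list(skip_first(fake_citedby(3340), 270))
print(len(remaining))        # 3340 - 270 = 3070
print(remaining[0]["rank"])  # first kept item is the one at index 270
```

Note that this approach still iterates over the skipped results, so it would not save any requests compared with a working `start_index`.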
Screenshots
Screenshots are not applicable as this is a code execution issue, but the maintainers can reproduce the issue using the provided code snippet.
Desktop:
- Proxy service: None
- python version: 3.10.11, 3.10.12
- OS: Windows 11, Ubuntu 22.04
- scholarly version: 1.7.11
Do you plan on contributing? Your response below will clarify whether the maintainers can expect you to fix the bug you reported.
- No, I am not able to contribute a bugfix at this time.