arxivscraper

A Python module to scrape arXiv.org for a date range and category

Results 8 arxivscraper issues

Hi, first of all, thanks for the contribution to the community! Perhaps a silly question: is there any specific reason why the title, abstract, and authors are all lowercased by...

question

```
import arxivscraper

print('k', k, from_day, until_day, filters)
# Output: k ComputerScience 2022-04-25 2022-04-25 {'categories': ['cs', 'eess'], 'abstract': ['healthcare', 'medical', 'hospital']}
scraper = arxivscraper.Scraper(category=k, date_from=from_day, date_until=until_day, filters=filters)
tmp = scraper.scrape()
print(tmp)
```

question

I copied the following URL from the output of the program. The URL looks for records between the dates 2019-01-01 and 2019-05-10. URL: http://export.arxiv.org/oai2?verb=ListRecords&from=2019-01-01&until=2019-05-10&metadataPrefix=arXiv&set=cs But a lot of the records I got lie...

enhancement
question
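
For readers unfamiliar with the query format, the URL quoted in the issue above can be reproduced with the standard library. This is only a sketch: `build_list_records_url` and `BASE_URL` are hypothetical names introduced here for illustration, not part of arxivscraper, and the parameter names are taken directly from the quoted URL.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the URL quoted in the issue.
BASE_URL = "http://export.arxiv.org/oai2"


def build_list_records_url(date_from, date_until, category):
    """Assemble a ListRecords query (hypothetical helper, not the library's API)."""
    params = {
        "verb": "ListRecords",
        "from": date_from,        # lower bound of the date range
        "until": date_until,      # upper bound of the date range
        "metadataPrefix": "arXiv",
        "set": category,          # e.g. "cs"
    }
    # urlencode keeps the insertion order of the dict.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_list_records_url("2019-01-01", "2019-05-10", "cs")
print(url)
```

Note that in OAI-PMH the `from` and `until` arguments select on a record's datestamp, which is not necessarily the paper's original submission date.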

Given the following code and the master version of the library (commit 2a0e00f81549f74c5e94f4ba0002dc89a7c1a14c)

```
import arxivscraper.arxivscraper as ax
scraper = ax.Scraper(category='stat', date_from='2010-01-01', date_until='2012-02-01', t=10,
                     filters={'categories': ['physics:hep-ex', 'physics:astro-ph'], 'abstract': ['deep learning']})
output =...
```

question

Hi, I have created a branch for further customized filtering function for scraped records. Please consider merging it to the main branch. Thanks!

Added the possibility for filters to use AND behavior

enhancement
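
To illustrate the difference between OR and AND filter behavior mentioned above, here is a minimal sketch. It assumes each record is a plain dict of string fields and each filter key maps to a list of keywords; the function `matches` and its `mode` parameter are hypothetical and do not reflect how arxivscraper or the proposed change actually implements filtering.

```python
def matches(record, filters, mode="or"):
    """Return True if `record` satisfies `filters` (illustrative only).

    A filter key is satisfied when any of its keywords occurs in the
    corresponding record field (case-insensitive substring match).
    mode="or":  at least one filter key must be satisfied.
    mode="and": every filter key must be satisfied.
    """
    per_key = [
        any(word.lower() in str(record.get(key, "")).lower() for word in words)
        for key, words in filters.items()
    ]
    combine = all if mode == "and" else any
    return combine(per_key)


# Made-up record and filters, for demonstration only.
record = {"categories": "cs.lg", "abstract": "a study of hospital readmission"}
filters = {"categories": ["eess"], "abstract": ["healthcare", "medical", "hospital"]}

print(matches(record, filters, mode="or"))   # True: the abstract filter matches
print(matches(record, filters, mode="and"))  # False: the categories filter does not match
```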

I noticed that when I try to scrape the last year of papers in any one category, I get incomplete results even though the overall query does not time out...

bug

If people wish to scrape papers from arXiv that are by a specific author, it might be a really good idea to add a filter criterion for ORCIDs where someone...

enhancement