Fixing skipped tests
Describe the bug
We have a few issues with the tests, causing the builds to fail.
To Reproduce
Taking a look at the build logs, we see the following errors:
Ubuntu
======================================================================
FAIL: test_download_mandates_csv (test_module.TestScholarly)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/runner/work/scholarly/scholarly/test_module.py", line 668, in test_download_mandates_csv
self.assertEqual(policy[agency_index], agency_policy[agency])
AssertionError: '82%' != ''
- 82%
+
======================================================================
FAIL: test_related_articles_from_author (test_module.TestScholarly)
Test that we obtain related articles to an article from an author
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/runner/work/scholarly/scholarly/test_module.py", line 578, in test_related_articles_from_author
self.assertEqual(pub[key], same_article[key])
AssertionError: 71856 != 71882
----------------------------------------------------------------------
Mac
FAIL: test_download_mandates_csv (test_module.TestScholarly)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/runner/work/scholarly/scholarly/test_module.py", line 668, in test_download_mandates_csv
self.assertEqual(policy[agency_index], agency_policy[agency])
AssertionError: '82%' != ''
- 82%
+
======================================================================
FAIL: test_related_articles_from_author (test_module.TestScholarly)
Test that we obtain related articles to an article from an author
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/runner/work/scholarly/scholarly/test_module.py", line 578, in test_related_articles_from_author
self.assertEqual(pub[key], same_article[key])
AssertionError: 71856 != 71882
======================================================================
FAIL: test_related_articles_from_publication (test_module.TestScholarly)
Test that we obtain related articles to an article from a search
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/runner/work/scholarly/scholarly/test_module.py", line 605, in test_related_articles_from_publication
self.assertEqual(related_article['bib']['title'], 'Large Magellanic Cloud Cepheid standards provide '
AssertionError: 'Planck 2015 results-xiii. cosmological parameters' != 'Large Magellanic Cloud Cepheid standards [110 chars]ΛCDM'
- Planck 2015 results-xiii. cosmological parameters
+ Large Magellanic Cloud Cepheid standards provide a 1% foundation for the determination of the Hubble constant and stronger evidence for physics beyond ΛCDM
Windows
======================================================================
ERROR: test_download_mandates_csv (test_module.TestScholarly)
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\a\scholarly\scholarly\test_module.py", line 667, in test_download_mandates_csv
agency_index = funder.index(agency)
ValueError: 'US National Science Foundation' is not in list
======================================================================
ERROR: test_search_author_single_author (test_module.TestScholarly)
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\a\scholarly\scholarly\test_module.py", line 305, in test_search_author_single_author
scholarly.pprint(author)
File "D:\a\scholarly\scholarly\scholarly\_scholarly.py", line 437, in pprint
print(pprint.pformat(to_print))
File "c:\hostedtoolcache\windows\python\3.8.10\x64\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u015f' in position 6681: character maps to <undefined>
======================================================================
FAIL: test_related_articles_from_author (test_module.TestScholarly)
Test that we obtain related articles to an article from an author
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\a\scholarly\scholarly\test_module.py", line 578, in test_related_articles_from_author
self.assertEqual(pub[key], same_article[key])
AssertionError: 71856 != 71882
----------------------------------------------------------------------
It is not clear what causes these failures, especially the ones associated with the mandates calls.
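The `UnicodeEncodeError` in the Windows log is one failure that can be illustrated without a runner. The sketch below simulates a `cp1252` output stream, which is an assumption about what the runner's console uses, based on the `cp1252.py` frame in the traceback. The sample string is hypothetical; only the character `'\u015f'` comes from the log.

```python
import io

# 'ş' (U+015F) is the character named in the Windows traceback.
# The surrounding letters are a hypothetical sample, not data from the test.
text = "Ay\u015fe"

# Simulate a console stream that encodes with cp1252, as in the traceback.
strict_stream = io.TextIOWrapper(io.BytesIO(), encoding="cp1252")
try:
    strict_stream.write(text)
    strict_stream.flush()
    failed = False
except UnicodeEncodeError:
    # cp1252 has no mapping for U+015F, so strict encoding raises.
    failed = True

# With errors="backslashreplace" the character is escaped instead of raising.
buffer = io.BytesIO()
lenient_stream = io.TextIOWrapper(
    buffer, encoding="cp1252", errors="backslashreplace"
)
lenient_stream.write(text)
lenient_stream.flush()
escaped = buffer.getvalue()

print(failed)
print(escaped)
```

If this is indeed the cause, two possible workarounds are setting `PYTHONIOENCODING=utf-8` in the workflow environment or calling `sys.stdout.reconfigure(errors="backslashreplace")` before printing. Neither is confirmed here as the fix that was actually applied.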
I've seen these and was never able to reproduce them locally. They seem specific to GitHub Actions, which makes them extremely difficult to find and fix.
#424 fixes all of the actual bugs that are reported here. Two unsolved issues are:
- On Windows runners, writing and reading files appears problematic, which causes an issue with download_mandates_csv. Resolving it by skipping the test on Windows only.
- The paper title mismatches are likely due to old caches on the machines. There's no way for us to force cache clearing, and GitHub automatically clears caches after 7 days of not being used. https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy. I suggest we reduce the frequency of the cron jobs to once a week so that we are not impacted by this caching.
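Skipping a single test on Windows only, as described in the first item, can be done with `unittest.skipIf`. This is a minimal sketch, not necessarily how #424 implements it: the class name `TestMandates` and the placeholder test body are hypothetical stand-ins for the real `TestScholarly` test.

```python
import sys
import unittest

# True on Windows runners, False on Ubuntu and macOS runners.
ON_WINDOWS = sys.platform.startswith("win")


class TestMandates(unittest.TestCase):  # hypothetical stand-in for TestScholarly
    @unittest.skipIf(ON_WINDOWS, "CSV write/read is unreliable on Windows runners")
    def test_download_mandates_csv(self):
        # Placeholder body; the real test downloads and checks the mandates CSV.
        self.assertTrue(True)
```

The test still runs on the Ubuntu and Mac runners, and it is reported as skipped rather than failed on Windows, so the skip stays visible in the build output.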
Repurposing this issue to fix the tests we currently skip so we don't have to skip them.
The tests that passed during the merge are failing again. I propose we use CircleCI in addition to GitHub Actions for now to see if we can reproduce the failures and debug them over SSH, which is not possible in GitHub Actions. https://circleci.com/docs2/2.0/ssh-access-jobs
ScraperAPI increased the price to $50 per month and I canceled my subscription. Perhaps this is the reason for the failures.
No, I'm pretty sure that's not the reason. The tests are failing because the fetched values are not what they are expected to be. This appears to be restricted to GitHub Actions, since I can't reproduce it locally. But it's good to know that the tests are otherwise robust with freeproxies.
I think the ScraperAPI key still works, but only for a limited number of queries.
Btw, perhaps the issue appears due to proxies caching the requests for longer than they should?
> ScraperAPI increased the price to $50 per month and I canceled my subscription.
To make matters worse, ScraperAPI is charging 25 credits per request for pages in Google domain. https://www.scraperapi.com/documentation/#curl-CreditsAndRequests
This means we can only make 40 requests for Google Scholar per month with the free plan.
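The arithmetic behind the 40-request figure, as a small sketch. The free plan's monthly allowance of 1000 credits is an assumption inferred from 40 × 25; it is not stated in this thread.

```python
# Assumption: the free plan grants 1000 credits per month
# (inferred from 40 requests x 25 credits; not stated in the thread).
FREE_PLAN_CREDITS = 1000

# From the ScraperAPI documentation linked above: 25 credits per Google request.
CREDITS_PER_GOOGLE_REQUEST = 25

requests_per_month = FREE_PLAN_CREDITS // CREDITS_PER_GOOGLE_REQUEST
print(requests_per_month)
```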