DeferringList is not deferring
As I understand it, the purpose of the DeferringList class is to defer the extra request needed to fetch additional data until the caller (in courtlistener) has run the DupChecker and decided, using the hash of the content fetched from item["download_url"], that the record is not a duplicate (lines 291-299 in scrapers/management/commands/cl_scrape_opinions.py).
However, the DeferringList currently executes its requests inside Site().parse(), because assigning and cleaning the attributes (L138) iterates over its items. This happens before the DupChecker is even instantiated.
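To illustrate the mechanism, here is a minimal sketch. It does not use juriscraper's actual code: `LazyList`, `fake_fetch` and the cleaning step are hypothetical stand-ins I am assuming for illustration. It shows that a list which fetches on item access stays idle after construction, but fires every request as soon as something iterates over it, which is what a cleaning pass does.

```python
class LazyList:
    """Hypothetical stand-in for a deferring list: fetches an item only on access."""

    def __init__(self, seed, fetcher):
        self._seed = list(seed)
        self._fetcher = fetcher
        self._cache = {}

    def __len__(self):
        return len(self._seed)

    def __getitem__(self, index):
        # The fetcher (a network request in the real case) runs on first access.
        if index not in self._cache:
            self._cache[index] = self._fetcher(self._seed[index])
        return self._cache[index]


calls = []


def fake_fetch(url):
    # Records each call instead of making a real request.
    calls.append(url)
    return f" content of {url} "


lazy = LazyList(["url-1", "url-2", "url-3"], fake_fetch)
calls_after_construction = len(calls)  # nothing fetched yet

# A cleaning pass that touches every element forces every fetch.
cleaned = [value.strip() for value in lazy]
calls_after_cleaning = len(calls)

print(calls_after_construction, calls_after_cleaning)
```

In this sketch no fetch happens at construction, but all three happen during the cleaning pass, so any duplicate check that runs afterwards no longer saves a request.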
I have written a test case that shows this. It can be run in juriscraper with `python -m unittest -v tests.network.test_DeferringList`.
Code for it can be found in this branch
https://github.com/freelawproject/juriscraper/compare/main...grossir:juriscraper:deferring_list_not_deferring?expand=1