congress
[bugfix] Votes task does not cache index pages
Fixing the caching problem from issue #206
Hey. Like I said in the other issue, --force --fast is probably the right default. I think your PR just gives it the behavior of --force alone, which will re-download everything every time. And if we make --force (part of) the default, then it should support a --cache flag to turn it off.
Heya,
The changes force fetching the vote_ids for the House and Senate, but do not force re-downloading all the votes. I may be mistaken, but I was under the impression that --force downloads everything from scratch all over again (correct me if I have misunderstood).
Caching is preserved for the votes themselves: when I run ./run votes --session=2017 --congress=115 --log=info, the log shows the individual votes still being read from the cache, with statements like:
[h216-115.2017] Fetching...
Cached: (cache/115/votes/2017/h216/h216.xml, http://clerk.house.gov/evs/2017/roll216.xml)
[h216-115.2017] Writing to disk...
[h216-115.2017] Updated
(This is the same logging as for cached downloads prior to the commit.) Explicitly using the --force flag, by contrast, produces log statements of the form:
[h216-115.2017] Fetching...
Downloading: http://clerk.house.gov/evs/2017/roll216.xml
GET - http://clerk.house.gov/evs/2017/roll216.xml
[h216-115.2017] Writing to disk...
[h216-115.2017] Updated
Further, it seems the caching for the votes on motions depends on the options passed into utils.process_set(to_fetch, vote_info.fetch_vote, options) on line 59, which this PR leaves unchanged. That would indicate the changes are not the same as making --force the default.
I may very well be mistaken (so please correct me if I am wrong), but it does not appear that caching has been invalidated for anything other than (1) the House index page, (2) the paged listing of votes for the House, and (3) the Senate index page.
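To make the distinction concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes options are passed around as a plain dict with a "force" key; the helper name fetch_options and the is_index_page argument are hypothetical and only illustrate forcing a fresh download for index pages while leaving the caller's options (and so the per-vote caching) untouched.

```python
def fetch_options(options, is_index_page):
    """Return the options to use for a single download.

    Hypothetical helper: index pages always bypass the cache, while
    individual vote files keep whatever the caller asked for.
    """
    # Copy so the caller's dict (used later for the per-vote fetches)
    # is never modified.
    page_options = dict(options)
    if is_index_page:
        page_options["force"] = True
    return page_options


if __name__ == "__main__":
    opts = {"force": False, "log": "info"}
    print(fetch_options(opts, is_index_page=True))   # used for index pages
    print(fetch_options(opts, is_index_page=False))  # used for vote files
```

Because the original dict is copied rather than mutated, whatever is later handed to the per-vote fetch still carries the user's own force setting.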
Ahhh, that sounds right.
In the past it's been helpful to have a completely off-line mode for testing, though. So I'd still like us to be able to turn force completely off with --cache, rather than removing that functionality.
And as I said, --force --fast is probably the best default, which is different. --force --fast will pick up new votes and will re-download recent votes in case they have been changed (which happens often).
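A rough sketch of how those three flags could combine, under stated assumptions: the function name ignore_cached_copy, the dict keys, and the recent_days window are all hypothetical, and the actual cutoff the scraper uses for "recent" is not stated in this thread.

```python
from datetime import date, timedelta


def ignore_cached_copy(vote_date, today, options, recent_days=3):
    """Decide whether an already-cached vote should be downloaded again.

    Hypothetical logic:
      cache          -> never re-download (fully off-line mode)
      force          -> always re-download
      force + fast   -> re-download only votes from the last recent_days days
    """
    if options.get("cache"):
        return False
    if not options.get("force"):
        return False
    if options.get("fast"):
        # recent_days is an assumed window, not a value from the project.
        return today - vote_date <= timedelta(days=recent_days)
    return True


if __name__ == "__main__":
    today = date(2017, 4, 20)
    default = {"force": True, "fast": True}
    print(ignore_cached_copy(date(2017, 4, 19), today, default))
    print(ignore_cached_copy(date(2017, 1, 5), today, default))
    print(ignore_cached_copy(date(2017, 4, 19), today, {"cache": True}))
```

Votes that are not in the cache at all would still need to be fetched in any on-line mode; this function only covers the decision for files that are already cached.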