[bugfix] Votes task does not cache index pages

Open pqnelson opened this issue 7 years ago • 3 comments

Fixing the caching problem from issue #206

pqnelson avatar Apr 05 '17 14:04 pqnelson

Hey. Like I said in the other issue, --force --fast is probably the right default. I think your PR just gives it the behavior of --force alone, which will re-download everything every time. And if we make --force (part of) the default, then it should support a --cache flag to turn it off.

JoshData avatar Apr 05 '17 14:04 JoshData

Heya,

The changes force fetching the vote_ids for the House and Senate, but will not force re-downloading all the votes. I may be mistaken, but I was under the impression that --force re-downloads everything from scratch (correct me if I have misunderstood).

The caching is preserved for the votes. At least when I run ./run votes --session=2017 --congress=115 --log=info, the log shows that the cached votes are still used, as evidenced by statements like:

[h216-115.2017] Fetching...
Cached: (cache/115/votes/2017/h216/h216.xml, http://clerk.house.gov/evs/2017/roll216.xml)
[h216-115.2017] Writing to disk...
[h216-115.2017] Updated

(This is the same logging as for cached downloads prior to the commit.) By contrast, explicitly using the --force flag produces logging statements of the form:

[h216-115.2017] Fetching...
Downloading: http://clerk.house.gov/evs/2017/roll216.xml
GET - http://clerk.house.gov/evs/2017/roll216.xml
[h216-115.2017] Writing to disk...
[h216-115.2017] Updated

Further, the caching for the votes on motions seems to depend on the options passed into utils.process_set(to_fetch, vote_info.fetch_vote, options) on line 59. This PR leaves that call unchanged, which would indicate that the changes are not the same as making --force the default.

I may very well be mistaken (so please correct me if I am wrong), but it does not appear that caching has been invalidated for anything other than (1) the House index page, (2) the paged listing of votes for the House, and (3) the Senate index page.
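
To make the distinction concrete, here is a minimal sketch of the behavior described above. It is not the code from the PR: the function name use_cache and the page-kind labels are hypothetical and chosen only for illustration.

```python
# Hypothetical labels for the three listing pages named above;
# these identifiers do not come from the repository.
ALWAYS_REFRESH = {"house_index", "house_vote_listing", "senate_index"}


def use_cache(page_kind, options):
    """Return True if a cached copy may be used for this kind of page."""
    if options.get("force", False):
        # --force: everything is downloaded again.
        return False
    if page_kind in ALWAYS_REFRESH:
        # Index and listing pages are always fetched fresh,
        # so that newly published votes are discovered.
        return False
    # Individual vote files keep honoring the cache.
    return True
```

Under these assumptions, use_cache("vote", {}) stays True while use_cache("house_index", {}) is False, which is the difference from making --force the default.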

pqnelson avatar Apr 05 '17 18:04 pqnelson

Ahhh, that sounds right.

In the past it's been helpful to have a completely off-line mode for testing, though. So I'd still like us to be able to turn force completely off with --cache, rather than removing that functionality.

And as I said, --force --fast is probably the best default, which is different. --force --fast will pick up new votes and will re-download recent votes in case they have been changed (which happens often).
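
Roughly, the defaults I have in mind could be sketched like this. It is only an illustration under stated assumptions: should_download and RECENT_DAYS are hypothetical names, the three-day window is a guess at what "recent" means, and --cache is the proposed flag rather than an existing one.

```python
from datetime import date, timedelta

# Assumption: how far back a vote counts as "recent" under --fast.
RECENT_DAYS = 3


def should_download(vote_date, is_cached, options, today=None):
    """Decide whether a single vote file should be downloaded."""
    today = today or date.today()
    if not is_cached:
        # New votes have no cached copy, so they are always fetched.
        return True
    if options.get("cache", False):
        # Proposed --cache flag: reuse every file already on disk.
        return False
    # Proposed defaults: behave as if --force --fast were given.
    force = options.get("force", True)
    fast = options.get("fast", True)
    if not force:
        return False
    if fast:
        # Only re-download votes recent enough to still be changing.
        return today - vote_date <= timedelta(days=RECENT_DAYS)
    # --force without --fast: re-download everything.
    return True
```

With today fixed at 2017-04-05, a cached vote from 2017-04-04 would be re-downloaded by default, while a cached vote from 2017-01-01 would be left alone unless fast is turned off.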

JoshData avatar Apr 05 '17 22:04 JoshData