Cache fails to update due to missing ETAG header field

Open dduenker opened this issue 4 years ago • 6 comments

Hi, thanks for this library. :)

I am currently implementing a web scraper that uses it, and I have come across the following issue: the web server I am contacting does not seem to provide an ETAG header (https://www.koblenz.de/rathaus/verwaltung/pressemeldungen/). The initial caching worked and I was happy. But then the content of the website was updated and my scraper never got the new content, because cached_path seems to think that the cached content is still up to date.

Can you confirm this behavior, or might this be something I did wrong?

Edit: I am using version 0.5.1

dduenker avatar Apr 06 '21 20:04 dduenker

Hi @dduenker, this is expected given that the server doesn't provide an ETAG header. That said, you could use the freshness_lifetime setting to ensure the cache is periodically updated. Would that work for your use case?
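For readers wondering what a freshness lifetime buys you here, below is a minimal sketch of the idea using only the standard library. It is not the crate's code: `CacheEntry` and `is_fresh` are hypothetical names, and the rule that an entry with no configured lifetime is never fresh by age alone is an assumption of this sketch.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical stand-in for a cached resource's metadata
/// (not the crate's real type).
struct CacheEntry {
    /// When the resource was downloaded.
    created: SystemTime,
}

/// Returns true while the entry is younger than `freshness_lifetime`.
/// With no lifetime configured, age alone never makes the entry fresh,
/// so the caller has to fall back to another check (e.g. the ETAG).
fn is_fresh(entry: &CacheEntry, freshness_lifetime: Option<Duration>, now: SystemTime) -> bool {
    match freshness_lifetime {
        Some(lifetime) => match now.duration_since(entry.created) {
            Ok(age) => age < lifetime,
            // `now` is earlier than `created` (clock moved backwards):
            // treat the entry as stale to be safe.
            Err(_) => false,
        },
        None => false,
    }
}
```

With a one-hour lifetime, an entry checked one minute after download would be reused, while one checked two hours later would be treated as stale and fetched again, even if the server never sends an ETAG.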

epwalsh avatar Apr 06 '21 20:04 epwalsh

Thank you, I will try that. Is there any reason not to use the last-modified and cache-control headers?

dduenker avatar Apr 07 '21 16:04 dduenker

> Is there any reason not to use the last-modified and cache-control headers?

No reason, that's a great idea actually. It wouldn't be hard to implement: we could just treat either of those as the ETAG when the ETAG isn't present. Would you be interested in making a PR for that?
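A rough sketch of that fallback, assuming the response headers have already been collected into a map with lowercase names. The function names `pick_validator` and `needs_refresh` are made up for illustration and are not part of the crate. Only Last-Modified is used as the fallback here, since Cache-Control describes how long a response may be reused rather than which version it is.

```rust
use std::collections::HashMap;

/// Picks a value to compare between downloads: the ETAG if the server
/// sent a non-empty one, otherwise Last-Modified.
/// Assumes header names were lowercased when the map was built.
fn pick_validator(headers: &HashMap<String, String>) -> Option<String> {
    ["etag", "last-modified"].iter().find_map(|name| {
        headers
            .get(*name)
            .map(|value| value.trim())
            .filter(|value| !value.is_empty())
            .map(str::to_string)
    })
}

/// Decides whether the cached copy should be downloaded again.
/// Assumption of this sketch: if either side has no validator,
/// refresh rather than trust the cache forever.
fn needs_refresh(cached: Option<&str>, current: Option<&str>) -> bool {
    match (cached, current) {
        (Some(old), Some(new)) => old != new,
        _ => true,
    }
}
```

Treating a missing validator as "refresh" trades extra downloads for correctness; the opposite choice would reproduce the behavior reported in this issue.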

epwalsh avatar Apr 07 '21 16:04 epwalsh

Yes, I would be interested in doing this. :)

dduenker avatar Apr 07 '21 17:04 dduenker

Great! Let me know if you have any questions.

epwalsh avatar Apr 07 '21 17:04 epwalsh

FYI it is possible to store an expiration time in the meta JSON file: https://github.com/epwalsh/rust-cached-path/blob/master/src/meta.rs#L20

epwalsh avatar Apr 07 '21 20:04 epwalsh
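
One way a stored expiration time could tie in with the Cache-Control idea from earlier in the thread is to derive it from the `max-age` directive at download time. The sketch below is only an illustration: the `Meta` struct and its field names are assumptions made for this example, not the definitions in the linked `meta.rs`, and timestamps are plain Unix seconds.

```rust
/// Hypothetical metadata record; the field names are illustrative only.
struct Meta {
    /// Download time, in Unix seconds.
    creation_time: f64,
    /// Time after which the cached copy should be refreshed, if known.
    expires: Option<f64>,
}

/// Extracts the `max-age` value (in seconds) from a Cache-Control header.
fn max_age_secs(cache_control: &str) -> Option<u64> {
    cache_control.split(',').find_map(|directive| {
        let (name, value) = directive.trim().split_once('=')?;
        if name.trim().eq_ignore_ascii_case("max-age") {
            value.trim().trim_matches('"').parse().ok()
        } else {
            None
        }
    })
}

impl Meta {
    /// Builds the record, setting `expires` only when `max-age` is present.
    fn new(creation_time: f64, cache_control: Option<&str>) -> Self {
        let expires = cache_control
            .and_then(max_age_secs)
            .map(|secs| creation_time + secs as f64);
        Meta { creation_time, expires }
    }

    /// True once `now` has reached the stored expiration time.
    /// Without a stored time this returns false, leaving the decision
    /// to whatever other check the caller uses.
    fn is_expired(&self, now: f64) -> bool {
        match self.expires {
            Some(deadline) => now >= deadline,
            None => false,
        }
    }
}
```

For a resource downloaded at time 1000 with `Cache-Control: max-age=60`, this would record an expiration time of 1060 and report the entry as expired from that moment on.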