Added useCache parameter to JSoup's Connection. Fix #336

Open Skylion007 opened this issue 11 years ago • 5 comments

Adds a useCache parameter to Connection to allow users to specify whether they want to use the system's URL cache. The URL cache can sometimes cause issues with Java Web Start and applets, so being able to set it explicitly is useful. Unless told otherwise, this sets useCache to false, as is recommended.

Issue #336 was caused by the system default value being set to true.
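
For illustration, here is a minimal sketch of what such a parameter might map to underneath, assuming it is simply forwarded to the standard `java.net.URLConnection.setUseCaches(boolean)`. The names `UseCacheSketch` and `open` are hypothetical and are not taken from jsoup or from this PR's diff. Nothing is sent over the network, because `openConnection()` only creates the connection object.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper, not jsoup code: shows the per-connection cache flag.
final class UseCacheSketch {

    // Creates (but does not connect) a URLConnection with an explicit cache setting.
    static URLConnection open(String address, boolean useCache) {
        try {
            URLConnection conn = URI.create(address).toURL().openConnection();
            // Override whatever default the runtime started with.
            conn.setUseCaches(useCache);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection conn = open("http://example.com/", false);
        System.out.println("useCaches = " + conn.getUseCaches());
    }
}
```

Setting the flag on every connection, rather than relying on the default, is what makes the behavior independent of how the JVM was launched.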

Skylion007 avatar Dec 30 '13 17:12 Skylion007

Thanks @Skylion007.

In practise, what does this actually do to HTTP requests? Are there actual local HTTP caches? From my (quick) reading, looks like this might set a Pragma: no-cache header.

How did you identify this was the cause of #336?

When is it ever going to be helpful to turn it on?

jhy avatar Oct 02 '14 05:10 jhy

@jhy I ran into a similar problem to #336 when I tried to launch an app in Java Web Start. Basically, I found this value is set to true by default when launched from Java Web Start (I assume to conserve resources). After searching Stack Overflow I discovered that the useCache variable was the root of the problem. Initially I made a workaround where I changed the default value in a very hacky way, but then decided to do it this way so it's always false unless otherwise specified. To be honest, having this feature enabled by default causes more headaches than benefits.

Unfortunately, the Java VM's URL cache is really awful and locks up really easily when you throw multi-threading into the mix. The useCache flag tells Java to cache URL data. You might want to turn it on in very specific cases. (Let's say you want to load a specific web page only once on startup, for instance, and reference it again later.) If you try to reload the web page without setting it back to false, though, you'll get an error. To be honest, I personally have never enabled the feature, but power to the end user, you know?

I actually found an odd use case for setting it to true with a little bit of googling: https://stackoverflow.com/questions/2065868/need-help-with-strange-classgetresource-issue/4741663#4741663

In conclusion, I don't think it does anything to the HTTP headers because in normal operation, it's set to false by default. Setting it to true makes the JVM do some funky things before the URL request is made.
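
To make the "default" being discussed here concrete, a hedged sketch using the standard `URLConnection.getDefaultUseCaches()` and `setDefaultUseCaches(boolean)` pair, which affects connections created afterwards in the same JVM. `DefaultUseCachesSketch` and its methods are hypothetical names. The sketch makes no claim about which value a particular launcher starts with; it only reads the value and changes it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper: inspects and changes the JVM-wide cache default.
final class DefaultUseCachesSketch {

    // Creates an unconnected URLConnection; no request is sent.
    private static URLConnection probe(String address) {
        try {
            return URI.create(address).toURL().openConnection();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sets the process-wide default and returns the previous value.
    static boolean setGlobalDefault(String address, boolean useCaches) {
        URLConnection conn = probe(address);
        boolean previous = conn.getDefaultUseCaches();
        // Instance method, but it changes a setting shared by future connections.
        conn.setDefaultUseCaches(useCaches);
        return previous;
    }

    // Reports the cache flag a freshly created connection starts with.
    static boolean newConnectionUsesCaches(String address) {
        return probe(address).getUseCaches();
    }

    public static void main(String[] args) {
        String address = "http://example.com/";
        System.out.println("initial default: " + newConnectionUsesCaches(address));
        boolean previous = setGlobalDefault(address, false);
        System.out.println("after override: " + newConnectionUsesCaches(address));
        setGlobalDefault(address, previous); // restore the original setting
    }
}
```

Changing the global default affects every library in the process, which is one reason a per-connection setting is the less invasive option.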

Skylion007 avatar Oct 08 '14 04:10 Skylion007

Thanks for the detail @Skylion007. So the verdict is that we should always force set it to "false", so that it overrides the bad default of Web Start enabling it, and doesn't change behavior for other non-Web Start users?

jhy avatar Oct 08 '14 06:10 jhy

@jhy Exactly. Additionally, if the user ever encounters a use case where they want to use the cache, this gives them the option to enable it.

Skylion007 avatar Oct 08 '14 14:10 Skylion007

I've fixed the merge conflicts if that is what was preventing you from merging this pull request.

Skylion007 avatar Aug 09 '15 14:08 Skylion007

(Closing old PRs; thanks for your interest and apologies for the late review. I don't plan to merge this, but if you believe there is still an issue, please leave a comment.)

jhy avatar Dec 30 '23 01:12 jhy