presto-hive-apache
presto-hive-apache copied to clipboard
Switch AVRO InstanceCache to use Caffeine cache
Switch AVRO InstanceCache to use Caffeine cache
The InstanceCache was made time based cache to workaround google/guava#2408 as a part of 1d2f31e. The time-based cache did not work for all workloads, especially with the hardcoded 1 MINUTE ttl value. In addition the default concurrencyLevel of 4 was not good enough for the cache to work out-of-the-box. So switch the InstanceCache to use size-based Caffeine (https://github.com/ben-manes/caffeine) cache which has the fix for the original guava bug google/guava#2408.
Also in terms of concurrencyLevel switching to Caffeine helps. From https://github.com/ben-manes/caffeine/wiki/Benchmarks: "Caffeine and ConcurrentLinkedHashMap size their internal structures based on the number of CPUs"
CC: @anusudarsan
cc @wagnermarkd
Can you give details of how the 1 minute TTL was failing? @findepi , IIRC the TTL configuration was the only one that wouldn't have the GC issue from #28. Tuning that TTL is then the only option and I'd rather switch to a different cache that works for everyone than expose the TTL as a tunable value.
Can you give details of how the 1 minute TTL was failing?
@anusudarsan Do you remember the exact problem that was solved with this change?
We tried to get the information from the client, but did not get much details. We wanted the timeout and concurrencyLevel be configurable, to try different timeouts for different use cases, and having it hard-coded to 1 minute did not help anyway. So the better solution was to fix the cache itself to handle all workloads.