media-core icon indicating copy to clipboard operation
media-core copied to clipboard

use off-heap cache

Open hamsterksu opened this issue 8 years ago • 13 comments

Hi @hrosa

i have refactored cache. Now i use off-heap cache and string key instead of URL. These changes should solve #283

URL class has trouble with hascode method: it executes getHostAddress which is synchronized

https://docs.oracle.com/javase/7/docs/api/java/net/URLStreamHandler.html#getHostAddress(java.net.URL)

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/net/URLStreamHandler.java#428

hamsterksu avatar Jan 16 '17 14:01 hamsterksu

Thank you @hamsterksu for another important contribution :) I will allocate some time this week to review. Thanks!

hrosa avatar Jan 17 '17 14:01 hrosa

thanks @hrosa i will do more tests before commit in future :(

i don't want to store custom object in cache, because it should be serializable also LazyLoader will be muitabe - so we need to have workaround with listener to put object to cache again when data is loaded.

hamsterksu avatar Jan 25 '17 09:01 hamsterksu

@hamsterksu any news on this one?

hrosa avatar Feb 23 '17 14:02 hrosa

hi @hrosa i can't find solution, i have removed "bytes" from "in memory" object - https://github.com/dataart-telco/mediaserver/commit/23dba6890834fa3a613c4ad601822e5845083acb

but old generation heap is still growing

hamsterksu avatar Feb 27 '17 15:02 hamsterksu

@hrosa i don't see a solution for this. Because this behavior is predictable. when we have 300 simultaneous calls we will download 300 audio files and create 300 ByteArrayInputStream. so app keeps 300 byte arrays in memory for 1 min. seems they will be moved to old gen. Am i right?

hamsterksu avatar Feb 28 '17 14:02 hamsterksu

@hamsterksu im a bit confused from your latests comments.. either is growing or is predictable? do you mean its growing until reaching cache maxsize?
why do we use old gean heap space if cache is set as offheap?

jaimecasero avatar Mar 01 '17 13:03 jaimecasero

Hi @hamsterksu That seems to be the case, the "downloader" is keeping the entire audio file in memory before passing it to the cache. The problem will get worse as file size increases (either by length or codec quality).

The only workaround that comes to mind is to use DirectByteBuffers. Needless to say, this will be very costly and makes me wonder about the usefulness of the cache in this scenario.

The other solution would be to stream the contents of the file directly into the ehcache's entry, but you were against it IIRC.

hrosa avatar Mar 01 '17 17:03 hrosa

Hi @hamsterksu

I've been thinking about this. In cases where no caching is used, we've seen that streaming seems to be doing good for RMS in terms of GC. Streaming allows a timely resource consumption naturally at IO level.

On the other hand, this caching implementation is not. Because it holds the entire payload in-memory before actually placing it into the cache, the data gets promoted to old gen. The problem gets worse the bigger the file (either by length and/or quality).

In last analysis, seems the problem here is the lack of support for stream-like behaviour in ehcache. I'm not very knowledgable with EhCache. Please correct me if this assumption is wrong.

The workaround I had in mind would be to keep a pool of DirectByteBuffers to store the payload before passing to the cache. But this solution is heavy and hard to maintain. And actually makes me wonder about the usefulness of the off-heap cache...

In case we don't find a proper workaround to justify the off-heap cache, then maybe the answer will be to cache the audio tracks locally in the filesystem. We can make use of streaming capabilities to reduce GC overhead and still have the tracks nearby for quick read operations.

What do you think about this?

hrosa avatar Mar 03 '17 00:03 hrosa

hi @hrosa,

i have updated implementation. I use ByteBuffer to store bytes of audio file. also i have replaced ehcache with guava cache. the first result of heap usage is pretty good.

but i have got some limitation of guava cache implementation. i can't predict how many files it can store

Guava implements cache with segments for concurrent access. Entity will be evicted if segment is full even if other segments are not.

Segments count can be managed: https://google.github.io/guava/releases/16.0/api/docs/com/google/common/cache/CacheBuilder.html#concurrencyLevel(int)

Their recommendation Ideally, you should choose a value to accommodate as many threads as will ever concurrently modify the table

I suppose our numbers of threads is number of players. am i right?

Also i want to add functionality to specify url pattern for cached audio. in this case we will have possibility to use cache for some static media and don't use it for dynamical content. (it was the main idea of cache feature)

WDYT?

one more comment: i see another issue with cache implementation - we need to download whole file before playing.

Thanks,

hamsterksu avatar Mar 14 '17 23:03 hamsterksu

@hrosa one additional issue - seems cache was removed from guava in the latest release. they recommended to use https://github.com/ben-manes/caffeine but seems it's for java 8.

which version of java do you recommend for mediaserver?

hamsterksu avatar Mar 15 '17 09:03 hamsterksu

Guava hasn't removed or deprecated their cache and it would take a minimum of 2 years from deprecation to removal.

If you require pre-JDK8, then another alternative is Cassandra's off-heap cache. However, it is also segmented.

ben-manes avatar Mar 15 '17 16:03 ben-manes

anyway we don't use "off-heap" cache from any library anymore. we use common cache + dircet ByteBuffer. but result will be off-heap cache

hamsterksu avatar Mar 15 '17 16:03 hamsterksu

Tks @hamsterksu

I will try to find some time in upcoming week to review your work.

Unfortunately, we are currently stuck on Java 7 for now. But I plan to move on to version 8 soon. Caffeine looks very interesting.

hrosa avatar Mar 17 '17 19:03 hrosa