trie4j
trie4j copied to clipboard
use bytebuffer abstraction instead of byte[]
Hey, I would love it if trie4j could use ByteBuffer instead of byte[]. This could be useful for avoiding heap space for large static tries, by using DirectByteBuffer, MappedByteBuffer etc..
I only looked at the code briefly, and found that it could be quite simple to modify BytesSuccinctBitVector.java to use ByteBuffer. Wanted to know if there are more places I should be looking at?
WDYT?
Thanks!
I'm not sure, but using ByteBuffer may cause serious performance decrease because of copy operation from native memory to heap. But, anyway, you can try and measure performance. I recommend creating ByteBufferSuccinctBitVector class by copy and modify BytesSuccinctBitVector and use it in TailLOUDSTrie class such like:
new TailLOUDSTrie(
firstTrie,
new LOUDSBvTree(new ByteBufferSuccinctBitVector(firstTrie.nodeSize() * 2)),
new ConcatTailArrayBuilder(firstTrie.size() * 4));
There might be some performance decrease, but in some cases (like mine, where things are really huge) eliminating extra GC could prove to be beneficial in the big picture.. In any case this assumption has to be measured.
When Ill get to this, should I bother with the specialized BitVectors as well (such as BytesRank0OnlySuccinctBitVector)?
I see.
FYI, according to the URL below, using Unsafe class and DirectBuffer class seem to provide faster access to natively allocated memory like this:
Field field = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
field.setAccessible(true);
unsafe = (sun.misc.Unsafe) field.get(null);
sun.nio.ch.DirectBuffer buffer=(sun.nio.ch.DirectBuffer)ByteBuffer.allocateDirect(size);
unsafe.getInt(byteBuffer.address(), offset);
http://stackoverflow.com/questions/11174231/compare-direct-and-non-direct-bytebuffer-get-put-operations
I don't think you must consider about bit vector variations because these classes provides slightly low performance improvements (But if you need it, you can implement it :).
Thanks :) Initial testing with AllTries.java and jawiki-20150805-all-titles-in-ns0.gz gives:
with 1564854 words and 12611354 chars in wikipedia titles.
warming up... running all tries 10 times.
TailLOUDSTrie(LOUDSBvTree(BytesSuccinctBitVector),ConcatTailArrayBuilder)
TailLOUDSTrie(LOUDSBvTree(ByteBufferSuccinctBitVector),ConcatTailArrayBuilder)
1 2 3 4 5 6 7 8 9 10 ... done.
executiong each process 10 times.
TailLOUDSTrie(LOUDSBvTree(BytesSuccinctBitVector),ConcatTailArrayBuilder), 265, 721, 38914280
TailLOUDSTrie(LOUDSBvTree(ByteBufferSuccinctBitVector),ConcatTailArrayBuilder), 247, 754, 38592176
So there is a little delay for a little improvement in memory consumption. Looking further with memory analyzer, it appears that TailLOUDSTrie.labels is actually taking the majority of space. Ill make a BBTailLOUDSTrie with similar optimizations and recheck.
That's so nice!!! I can't wait your pull-req (Do you plan to do that?)
Ahh.. Yes, indeed, the tail array is a memory eater. Though I want to improve that, currently no idea.
Thanks! Yes, I plan on sending a pull request, still some more things to check though.
Here is another attempt, this time adding a BBTailLOUDSTrie:
TailLOUDSTrie(LOUDSBvTree(BytesSuccinctBitVector),ConcatTailArrayBuilder), 226, 655, 39325432
BBTailLOUDSTrie(LOUDSBvTree(ByteBufferSuccinctBitVector),ConcatTailArrayBuilder), 244, 751, 35004480
We again see some slight degradation in speed but this time a more noticeable improvement in memory consumption! I am sure we could propagate these changes to more places and reduce memory usage even further..
Do you have any papers you were basing your implementation on that I could read? I would love improve my understanding of this project.
That's great! I'm looking forward to your pull req :)
I read Japanese book http://www.amazon.co.jp/gp/product/4774149934/ (Technologies to support Japanese input method editors) and also gather information from web (all Japanese):
- http://nanika.osonae.com/DArray/dary.html
- http://d.hatena.ne.jp/echizen_tm/20110811/1313083180
- http://rn.hatenablog.com/archive/category/Java?page=2
I think base implementation is similar to original papers because the authors of documents above refer those, but I add following some little optimizations:
- use index to speed up bitvector access (BytesSuccinctBitVector.indexCache0)
- use Character to code map to make DoubleArray dense (DoubleArray.charToCode)
and so on. But I will not be surprised if other implementation has those kind of optimization because those are very simple.