dione icon indicating copy to clipboard operation
dione copied to clipboard

No need for caching in sorted-iterator

Open uzadude opened this issue 2 years ago • 7 comments

Summary

In #71 we changed the block order. while checking the full iteration on Object Store (S3), we saw it is amazingly slow. What happened is that it switched to Random IO mode: .. 15:42:50,084 INFO fs.s3a.S3AInputStream - Switching to Random IO seek policy

Detailed Description

After debugging it looks like it happened because we try to seek back compared to the "natural" readahead buffer.

Luckily, now that we changed the block order we can just read the file sequentially without hopping at all. So just removing the seeking part works!

How was it tested?

regular unit tests

uzadude avatar Jan 04 '23 14:01 uzadude

What... How is that possible? When you reach a leaf block, you have to go back and seek elsewhere, no?

shay1bz avatar Jan 04 '23 16:01 shay1bz

I was also surprised that it just worked, but what happens is just that the parent blocks are saved in memory, and every time you need to read the next block from the file it is always the "correct" block as we saved it pre-order.

uzadude avatar Jan 04 '23 18:01 uzadude

Wow

shay1bz avatar Jan 05 '23 07:01 shay1bz

Are the parents explicitly cached by us? Do we make sure to uncache them? Just making sure there is no memory leak or something

shay1bz avatar Jan 05 '23 07:01 shay1bz

good point. I actually thought about that and forgot to check. but it looks like we're good. the sorted-iterator only has one pointer to Node next and the Node object only has one pointer to parent. So we will maximum log(n) nodes in memory.

BTW, I also need to verify this "Random IO" in the "search" iterator (get()).

uzadude avatar Jan 05 '23 07:01 uzadude

ok.. so after looking at get(key) , I saw that we can hop backwards there if we were calling get() multiple times. So I added assertion to allow only bigger keys, and added functionality to reuse the current state.

uzadude avatar Jan 12 '23 09:01 uzadude

Is this ready for release? Is it good for all three of our versions?

eyala avatar May 01 '23 11:05 eyala