arctic icon indicating copy to clipboard operation
arctic copied to clipboard

Optimize Chunkstore mongo read query

Open BaiBaiHi opened this issue 4 years ago • 17 comments

Optimize chunkstore read query by joining metadata collection instead of querying for metadata on each individual date which accounted for about 40% of the processing time. Profile results below are run on data with 365 chunks.

Original Profile image

New Profile image

BaiBaiHi avatar Apr 02 '20 18:04 BaiBaiHi

Looks like tests are failing since it's trying to build with mongodb version 3.4.10. Syntax for the $lookup/$let pattern is only available after mongodb version 3.6...

BaiBaiHi avatar Apr 03 '20 17:04 BaiBaiHi

@BaiBaiHi there are other things deprecated that are breaking the build as well. I'm working on fixing them. @jamesblackburn whats the min mongodb version we need to support? 3.4 is quite old, can we bump it to 3.6 or 4.0+?

bmoscon avatar Apr 03 '20 18:04 bmoscon

Yep I think we can bump to 4.0.x (CC @rob256 )

jamesblackburn avatar Apr 06 '20 10:04 jamesblackburn

Yep I think we can bump to 4.0.x (CC @rob256 )

Yep, let's go to the latest 4.0 (4.0.17).

rob256 avatar Apr 06 '20 14:04 rob256

I've already updated the version of mongo on my branch that addresses a lot of the deprecations and what not, so once thats merged in, this branch can be rebased off that

bmoscon avatar Apr 06 '20 14:04 bmoscon

@BaiBaiHi can you rebase your fork off mainline? I updated many things to fix the build, so assuming after that the tests pass on this PR we can get it merged

bmoscon avatar Apr 07 '20 14:04 bmoscon

@bmoscon Looks like the Python 3 build is failing for tests/integration/tickstore/test_ts_write.py: test_ts_write_pandas due to column order.

The change I made should have no effect on tickstore and it looks like it's only an issue in the python 3 build.

BaiBaiHi avatar Apr 07 '20 16:04 BaiBaiHi

@BaiBaiHi sure but can you fix the style issues?

arctic/chunkstore/chunkstore.py:278:55: W291 trailing whitespace arctic/chunkstore/chunkstore.py:282:27: E261 at least two spaces before inline comment arctic/chunkstore/chunkstore.py:282:134: W291 trailing whitespace

bmoscon avatar Apr 07 '20 16:04 bmoscon

Yeah of course. My bad. Just set up my IDE on this machine. Didn't realize that my auto-formatter/checker wasn't turned on.

BaiBaiHi avatar Apr 07 '20 16:04 BaiBaiHi

@jamesblackburn it seems reasonable and it passes all the tests. I'm not sure how it will perform if actually need to spill out onto disk

bmoscon avatar Apr 07 '20 18:04 bmoscon

@BaiBaiHi @jamesblackburn what do you want to do with this?

bmoscon avatar May 02 '20 15:05 bmoscon

@jamesblackburn - just another ping - I think this seems reasonable from my testing

bmoscon avatar Sep 04 '20 13:09 bmoscon

Should we merge this?

shashank88 avatar Apr 09 '21 16:04 shashank88

If it's good to go we should merge this. If not, are there outstanding issues that I could help with? Would love to have consistent performance.

crazy25000 avatar Apr 14 '21 17:04 crazy25000

Can we hold off merging

You are correct in the issue diagnostics but the root cause of this is a missing index for the query. I have a fix which is simply adding the correct index and should be more efficient. Let me put out a PR and you can test locally? Let me know if this works!

Tom

TomTaylorLondon avatar Apr 15 '21 00:04 TomTaylorLondon

See https://github.com/man-group/arctic/pull/902

TomTaylorLondon avatar Apr 15 '21 00:04 TomTaylorLondon

@BaiBaiHi this is an exciting PR. Do you want to carry on and test against fix #902?

vietlq avatar Apr 10 '22 20:04 vietlq