semantic-kernel icon indicating copy to clipboard operation
semantic-kernel copied to clipboard

Rewriting CosmosDB memory connector and memroy connector unit tests

Open JohnMasen opened this issue 1 year ago • 9 comments

Hi guys, I've improved the CosmosDB memory connector, before creating the PR I'm re-wrting the unit tests for memory connectors.

Improves for CosmosDB connector:

  1. Use serverside UDF for cosin simularity calculation, greatly reduce the network bandwith
  2. improve parition key implementation
  3. improve DoesCollectionExistAsync() and GetCollectionsAsync().

Improves to unit tests

  1. Create a base class targets to test IMemoryStore
  2. Imherit CosmosDBTester from the base class

This ensures every IMemoryStore implementation can has a consistent test cases and make unit test much more easier. The question here is I don't now if my work is duplicated with your guys? If you're already working on this please let me know. I will change my effort to other area.

Thanks

JohnMasen avatar Apr 15 '23 16:04 JohnMasen

Use serverside UDF for cosin simularity calculation, greatly reduce the network bandwith

This is interesting and would be a major improvement over what we have today (which was mostly intended as a stopgap)

@awharrison-28 would you mind commenting on this? If it's not being actively worked on could we take @JohnMasen up on the offer of help here?

craigomatic avatar Apr 17 '23 01:04 craigomatic

@JohnMasen there is currently no duplication of your work on our end. This is so appreciated, thank you for your efforts!

awharrison-28 avatar Apr 17 '23 16:04 awharrison-28

That's great, just made all unit tests passed. The basic features are all complete. Code optimization and cleanup in progress. Will create a PR after that. The latest change can be found at https://github.com/JohnMasen/semantic-kernel

JohnMasen avatar Apr 18 '23 16:04 JohnMasen

@JohnMasen thanks again for doing this, can't wait to see the PR!

evchaki avatar Apr 18 '23 20:04 evchaki

Just created a PR for this #536

JohnMasen avatar Apr 19 '23 16:04 JohnMasen

@JohnMasen we can't wait to get your PR in. Let us know if you have any questions.

evchaki avatar May 16 '23 20:05 evchaki

@evchaki the PR 536 was stucked, seems the integration test keep failing due to cosmosDB emulator was not installed on the test host. I've asked help in the PR but seems not caught any attention. you can find the lastest CosmosDB code at https://github.com/JohnMasen/semantic-kernel

JohnMasen avatar May 18 '23 04:05 JohnMasen

@JohnMasen I apologize for the delayed response here. We will be deprecating CosmosMemoryStore very soon due to it's expense and inefficiencies. We would really like to have your IMemoryStore unit test abstraction though. Feel free to keep your own implementation of CosmosMemoryStore though.

awharrison-28 avatar May 23 '23 22:05 awharrison-28

@awharrison-28 Thanks for your response. you can get my latest code from my reporsitory https://github.com/JohnMasen/semantic-kernel . Due to lack of experience in Github PR, I can't solve the integratinon test soon. Please feel free to get & merge it to your main branch with your own account, all I want is solve the cosmosDB efficient issue.

By the way, I'm thinking a new 2 layer memory system, it separates the search and storage into 2 different parts. this allows us to save the memory data to DBMS and still have fast in-memory embedding search feature.

JohnMasen avatar May 24 '23 01:05 JohnMasen

Closing this issue since the COSMOS DB connector was deprecated.

matthewbolanos avatar Jun 27 '23 16:06 matthewbolanos