orleans
orleans copied to clipboard
Initial/Migration CosmosDB Providers
This PR brings OrleansContrib CosmosDB providers to the main repo.
It also implement the features mentioned in #7726 and some other pending PRs on the original repo among with several QoL improvements.
Closes: #7726
Microsoft Reviewers: Open in CodeFlow
cc @mumby0168
cc @mumby0168
Hi thanks for tagging me in this, I've been away for the last few weeks, I get back this weekend. I'd love to take a look at this, when I am back :)
Thanks, @mumby0168! Apologies for the slow response, @galvesribeiro - I've been distracted. I hope to review it soon. cc @bradygaster
No problem. For some reason the service container for the emulator doesn't seems to be running on some of the test jobs. I've ran it against a real Cosmos Account and it just work.
This pull request has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 7 days. It will be closed if no further activity occurs within 7 days of this comment.
cc @galvesribeiro I added some comments
Feedback addressed @ReubenBond. Sorry for taking so long.
Persistence and Reminder tests are not passing on my machine, targeting a real cosmosdb endpoint. Is there any specific config to do @galvesribeiro ?
There shouldn't be any differences or special config. They all pass on my machine either on the emulator or targeting a real account @benjaminpetit. I'll pull your changes in and try it again when I get home Tonight.
@benjaminpetit anything remaining here? If we decide to merge this for 7.0, perhaps we should add a beta tag to it until we have sufficient confidence
It seems like there is something with the setup on GH for the tests, will look at it. It run fine on the local emulator (tried on a windows VM over the weekend) or real account in Azure. Sorry I forgot to report back.
I believe the issue is currently that the certificate bundle needs to be split into multiple files (one per cert).
rehash: warning: skipping ca-certificates.crt,it does not contain exactly one certificate or CRL rehash: warning: skipping emulatorcert.pem,it does not contain exactly one certificate or CRL
The failure now is more basic... it can't even connect to the emulator to retrieve the cert.
Note that I disabled all other GH Actions jobs in this PR so that we can try to fix the CosmosDB one. We should revert that before merging.
The emulator sometimes takes a very long time to start. I don't think we need to split the cert, my guess it that sometimes the emulator isn't really started, and it downloads some garbage.
We had run in the past were the SSL part was fine.
Doesn't that error message indicate that the cert needs to be split? It says it's skipping the import
rehash: warning: skipping ca-certificates.crt,it does not contain exactly one certificate or CRL rehash: warning: skipping emulatorcert.pem,it does not contain exactly one certificate or CRL
On previous run we didn't had this warning, and we didn't had SSL errors during the test. Unless I am mistaken...
Maybe we can make the job loop until the emulator is available, since even 90s doesn't seem to cut it
Rebased, refactored, and tried to fix tests. I'm testing against a real-world instance of Cosmos DB and the tests are very flaky - too flaky to commit right now. I'm wondering what we can do to improve that.
I updated the options to reflect the style we've been using for other providers: one config method for each constructor/factory overload on the underlying storage client (CosmosClient
in this case).
I am also wrapping the unserializable CosmosException
in a new WrappedException
type.
I also removed the System.Text.Json serializer as a default: if users want it, they can use it, but I opted to use the Cosmos SDK's default instead.
This storage provider doesn't currently support pluggable serialization. I wonder if it should... I think that the IGrainStorageSerializer
interface would need to be updated to have some way to know that can output a JSON string instead of always binary. cc @benjaminpetit for fyi/thoughts.
TODO:
- [ ] Add Cosmos DB account to internal test infrastructure and make sure that tests work there
- [x] ~Decide what to do with
IGrainStorageSerializer
for JSON. We could take what it gives and Base64 encode it, but that largely defeats the purpose of using a document DB and we should figure this out for other JSON-compatible stores or stores where text is preferred.~ When a database offers its own serialization system, I believe we should delegate to that and prevent having multiple.
@galvesribeiro aside from the GitHub Actions emulator issues, is there anything remaining on this PR which comes to mind?
Nope. We sorted out everything and it was recently rebased on main.
What's left on this PR?
@scalablecory we are trying to get the tests passing using the Cosmos DB emulator. It has been very flaky for us, however. Eg:
We want to avoid merging things which we cannot test in CI. We've added CI tests for everything which we can, ideally using containers or emulators (Azure Event Hubs being an exception: it runs against a real instance, in private CI)
@scalablecory we are trying to get the tests passing using the Cosmos DB emulator. It has been very flaky for us, however. Eg:
We want to avoid merging things which we cannot test in CI. We've added CI tests for everything which we can, ideally using containers or emulators (Azure Event Hubs being an exception: it runs against a real instance, in private CI)
For the record, we've already tried Linux emulator container and now running on a Windows agent using the PowerShell startup. It seems really flaky...
We will disable GitHub Actions CI tests and instead we will run test via the private CI against a live instance of Cosmos DB. I've asked the Cosmos DB SDK team what their preferred package names would be. I will also give the PR a final once-over before merging.
I gave the PR a few once-overs and renamed everything in accordance with Cosmos' team's preferences. I believe this is ready to merge now.