aspire icon indicating copy to clipboard operation
aspire copied to clipboard

Add WithPartitionCount and WithDataVolume to Azure.Hosting.Azure.CosmosDB

Open hansmbakker opened this issue 1 year ago • 9 comments

I found #3295 and added the code to this PR.

Fixes https://github.com/dotnet/aspire/issues/5136 Fixes https://github.com/dotnet/aspire/issues/3295

Microsoft Reviewers: Open in CodeFlow

hansmbakker avatar Aug 02 '24 13:08 hansmbakker

@martincostello can you please help me with this PR? I don't see the relation between my change and the build error, and I cannot see its details.

hansmbakker avatar Aug 09 '24 10:08 hansmbakker

Sorry I don't know - it just looks like a flaky test.

martincostello avatar Aug 09 '24 11:08 martincostello

Ok, is it possible for you to rerun the failed builds? Or in general - what would you suggest as a next step for me?

hansmbakker avatar Aug 09 '24 12:08 hansmbakker

I don't have permissions to re-run any builds. If you close the PR and re-open it, AzDO should re-run the build.

martincostello avatar Aug 09 '24 12:08 martincostello

@martincostello all checks are green now 👍🏼

hansmbakker avatar Aug 09 '24 13:08 hansmbakker

Do you think you could also add public api tests, we are going through many of such PRs and that would help to anticipate these, e.g. https://github.com/dotnet/aspire/pull/5149

And is it possible to add functional testing for these features? e.g. https://github.com/dotnet/aspire/blob/main/tests/Aspire.Hosting.PostgreSQL.Tests/PostgresFunctionalTests.cs#L69

sebastienros avatar Aug 09 '24 19:08 sebastienros

@sebastienros some issues:

  • TestDistributedApplicationBuilder.CreateWithTestContainerRegistry(testOutputHelper) results in an error with netaspireci.azurecr.io/cosmosdb/linux/azure-cosmos-emulator:latest not found. TestDistributedApplicationBuilder.Create(options => { }, testOutputHelper) is a workaround
  • The cosmosdb emulator is extremely slow to start (around 2 minutes I guess) and will be a pain for all people / CI using this test as I guess it is not possible to keep it alive across tests (https://github.com/dotnet/aspire/issues/923 is still far away).

Is it an idea to add your request as a separate issue independent from this PR?

PS: Is there a possibility for you to request internally to improve the cosmosdb emulator docker? Beside the slow start, there is a major issue with the cosmosdb emulator image that causes flaky integration tests in Azure Pipelines / GitHub runners. Depending on the allocated CI agent, it does not work properly. There is not much feedback from the maintaining team on it.

hansmbakker avatar Aug 12 '24 07:08 hansmbakker

@hansmbakker we have a new testing API to wait for some text to be logged per resource. Here is an example for cosmosdb: https://github.com/dotnet/aspire/pull/5251/files/42d137900d1ec727e6ba09dcecf83ce9d8995a6b#diff-5051cd0c129b32d78bdb46214c5ec1f3db3b004e77ea64fe95791a1afc0ed520R144

Note that this PR will be required as I had to add support for predicates instead of just texts, specifically for cosmosdb since the "Ready text" is used before as part of other logs.

As for the netaspireci.azurecr.io issue I will upload the image. In the meantime your suggestion with the Create call is fine.

For the other potential issue we haven't experienced it yet on the CI machines we use.

sebastienros avatar Aug 12 '24 16:08 sebastienros

As for the netaspireci.azurecr.io issue I will upload the image. In the meantime your suggestion with the Create call is fine.

Errata: we won't use netaspireci.azurecr.io since the image is on mcr.microsoft.com, so Create is the way to go. We do that in a few other places already. I also think that a new overload could be added to skip the first empty lambda.

sebastienros avatar Aug 12 '24 16:08 sebastienros

@sebastienros that looks useful but in my experience, just waiting for that text Started to be logged by the CosmosDB container is not enough.

I noticed that between the moment it logged Started and the moment that it starts returning valid responses to requests is still quite some time.

Other than that, it doesn't fix the point that the emulator takes about 2 minutes to start combined with the point that the Aspire tests won't reuse the container instance. In my opinion, that will result in very time-expensive tests.

hansmbakker avatar Aug 13 '24 07:08 hansmbakker

cc @eerhardt

davidfowl avatar Sep 07 '24 08:09 davidfowl

I would love to have this in Aspire 9 @eerhardt @sebastienros.

cc @radical to see if he has any advice WRT time to boot the container for testing.

davidfowl avatar Sep 08 '24 03:09 davidfowl

cosmos emulator starts up faster with:

                var cosmos = AppBuilder
                                .AddAzureCosmosDB("cosmos")
                                .RunAsEmulator(resource => resource.WithEnvironment("AZURE_COSMOS_EMULATOR_PARTITION_COUNT", "2"));

I'm adding this to the integration tests.

radical avatar Sep 10 '24 01:09 radical

We should change our default too

davidfowl avatar Sep 10 '24 02:09 davidfowl

@radical thank you for the feedback! I'll try adding some tests, but will have to timebox it.

@davidfowl would it be possible for you to reach out internally regarding the reliability issues of the CosmosDB emulator image? As mentioned above, it is not only the slow startup of this image but rather the reliability issues that make working with this image a pain - e.g. when using it in integration tests (with TestContainers on Azure Pipelines) it can lead to flaky tests.

Some issues my team ran into are:

  • https://github.com/Azure/azure-cosmos-db-emulator-docker/issues/87
  • https://github.com/Azure/azure-cosmos-db-emulator-docker/issues/45

Note that there are more reliability issues for other people:

  • https://github.com/Azure/azure-cosmos-db-emulator-docker/issues/100
  • https://github.com/Azure/azure-cosmos-db-emulator-docker/issues/104

hansmbakker avatar Sep 10 '24 07:09 hansmbakker

/azp run

davidfowl avatar Sep 16 '24 08:09 davidfowl

Pull request contains merge conflicts.

azure-pipelines[bot] avatar Sep 16 '24 08:09 azure-pipelines[bot]

@radical @sebastienros I updated the PR according to the feedback I got and would like to hear what you think.

Also I see there are issues issues with running multiple consecutive cosmosdb emulator instances and I don't have more time to spend on this unfortunately. Would any of you be able to have a look?

One option could be preventing teardown/spinup of the cosmosdb emulator in-between tests. Not great but a consequence of using the cosmosdb emulator in tests.

cc @davidfowl

hansmbakker avatar Sep 16 '24 11:09 hansmbakker

I'd like to make sure we get this in for Aspire 9.

davidfowl avatar Sep 22 '24 05:09 davidfowl

@davidfowl that's great! I'd also be happy to have this upstream - now I'm just using it as code in my own project.

Could you please be more specific as to who you're asking for a next step?

I'm happy to contribute what I've done so far, but the cosmosdb emulator is not great to be used in automated tests, and unfortunately I cannot commit an unpredictable amount of time to this.

I asked for a hand on this, but nobody stepped in yet. The branch is open to commits for maintainers of Aspire, so if this is important to have in Aspire 9, they are not blocked.

hansmbakker avatar Sep 22 '24 11:09 hansmbakker

First, resolve the merge conflicts. Second, @eerhardt I think we should split out the azure resource tests, but I'm not sure it should be done as part of this PR.

@hansmbakker add the test here:

https://github.com/dotnet/aspire/blob/main/tests/Aspire.Hosting.Azure.Tests/AzureCosmosDBEmulatorFunctionalTests.cs

Delete the extra project.

If this test is unreliable (which it might be because the cosmos emulator usually is), we'll disable the test (but keep the code for it).

davidfowl avatar Sep 22 '24 23:09 davidfowl

@davidfowl it seems ready to me 👍🏼 who can complete this?

hansmbakker avatar Sep 24 '24 10:09 hansmbakker

Thanks @hansmbakker !

davidfowl avatar Sep 24 '24 15:09 davidfowl