feat(azure_logs_ingestion sink): Initial `azure_logs_ingestion` sink
Summary
The current azure_monitor_logs sink uses the Data Collector API, which has been deprecated and will be removed in September 2026.
This sink uses the replacement Logs Ingestion API.
While I did consider making this a drop-in replacement for the existing sink, users need to make numerous breaking infrastructure changes, including:
- Creating new Data Collection Endpoint and Data Collection Rule resources
- Moving from a workspace-based secret key to an OAuth credential (App Registration, Managed Identity, etc.)
- (optionally) Re-configuring logs to use the built-in tables, instead of `_CL` custom tables.
Change Type
- [ ] Bug fix
- [X] New feature
- [ ] Non-functional (chore, refactoring, docs)
- [ ] Performance
Is this a breaking change?
- [ ] Yes
- [X] No
How did you test this PR?
- Following the Tutorial steps, create a Log Analytics workspace, App Registration, Data Collection Endpoint, and Data Collection Rule
- Set the `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, and `AZURE_CLIENT_SECRET` environment variables from the App Registration
- Use the following `vector.yaml`:
sources:
  stdin:
    type: stdin
sinks:
  azure:
    type: azure_logs_ingestion
    inputs:
      - stdin
    endpoint: https://dce-e42z.westus2-1.ingest.monitor.azure.com
    dcr_immutable_id: dcr-00000000000000000000000000000000
    stream_name: Custom-vector_CL
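For readers unfamiliar with the Logs Ingestion API, the three options above map onto a single REST URL. The following is a minimal sketch, assuming the URL layout documented by Microsoft for that API; the helper name `ingestion_url` and the `API_VERSION` value are illustrative assumptions, and this is not necessarily how the sink in this PR builds its requests.

```rust
/// Hypothetical helper (not taken from this PR): combines the `endpoint`,
/// `dcr_immutable_id`, and `stream_name` options into a request URL.
fn ingestion_url(endpoint: &str, dcr_immutable_id: &str, stream_name: &str) -> String {
    // Assumption: the API version string; check the Azure documentation
    // for the value that is current when you read this.
    const API_VERSION: &str = "2023-01-01";

    format!(
        "{}/dataCollectionRules/{}/streams/{}?api-version={}",
        // Tolerate a trailing slash on the configured endpoint.
        endpoint.trim_end_matches('/'),
        dcr_immutable_id,
        stream_name,
        API_VERSION
    )
}
```

Events would then be sent as a JSON array in a POST to this URL, authenticated with an OAuth bearer token rather than the workspace shared key.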
Does this PR include user facing changes?
- [X] Yes. Please add a changelog fragment based on our guidelines.
- [ ] No. A maintainer will apply the "no-changelog" label to this PR.
References
- Closes: https://github.com/vectordotdev/vector/issues/20978
- Mentioned in: https://github.com/vectordotdev/vector/issues/20625 - while this PR doesn't resolve this issue for `azure_blob`, by using the Azure Identity crate, this sink supports passwordless credentials.
We will need some documentation files. See an example here (all files under `website`). Note that `base/` is generated by `make generate-component-docs`.
Apologies, I thought this page was auto-generated as well - added now.
Is the intention to completely replace the `azure_monitor_logs` sink? If that's the case maybe we can mark the existing one as deprecated in favor of this new sink.
Good point, added 🙂
Hi @jlaundry, we received this report https://github.com/vectordotdev/vector/issues/23036 and we will be reverting to older azure_* crate versions. Does this affect your PR?
> Hi @jlaundry, we received this report #23036 and we will be reverting to older azure_* crate versions. Does this affect your PR?
From memory, there is a minor refactor required in https://github.com/jlaundry/vector/blob/02562be6447af36404d8b5668434e317c87a45b2/src/sinks/azure_logs_ingestion/config.rs#L139 to change it back to azure_identity::create_default_credential()?;, and possibly subsequent type changes... but reverting to 0.17 or 0.19 won't fundamentally block this PR, as thankfully I was using the raw REST API 🙂
Probably easiest if you roll back the package first, and then I'll rebase, retest, and push.
FYI - Reverted deps https://github.com/vectordotdev/vector/pull/23039
FYI, I started preparing the rebase, but I've seen that the Azure Rust team have recently decided to change how authentication works with the azure_identity SDK: https://github.com/Azure/azure-sdk-for-rust/issues/2283
Depending on what they decide, we may need to explicitly configure credentials, either via the vector config file or environment variables. The azure_blob sink will need a similar change (unless using connection_string config).
So instead of releasing an initial sink, and then requiring another config/environment change, I'll wait until there's stability in the SDK before moving forward with this PR.
(I'm not giving up!!!)
Hi @jlaundry,
Thanks a lot for your initiative! In the company I work in we need this sink exactly.
I went over your PR and I haven't seen explicit auth fields (like connection_string for AzureBlobStorage or shared_key for AzureMonitorLogs). Are they implicit in some way? Or are client secrets not supported for authentication?
I'm asking because our use case is running vector on an AWS machine and connecting to multiple Azure sinks, so if only AAD or ENV_VAR based authentication is currently supported, we wouldn't be able to use it.
Would you consider adding an explicit option to use client secrets per sink? I'd be happy to contribute to that.
By the way, I see that the discussion here is closed. Does it mean you will go forward with merging your PR?
Thanks a lot! Joel
> So instead of releasing an initial sink, and then requiring another config/environment change, I'll wait until there's stability in the SDK before moving forward with this PR.
Makes sense. I feel that these crates are unfortunately a bit unstable and each version update is risky. Thank you for your interest in contributing 👍
Hello @jlaundry, @pront. Is the above-mentioned blocking point still applicable? I don't have a lot of experience in Rust, but I would be interested in working on it.
Hi all, currently the azure_* crates are not in a good state. Coincidentally, @thomasqueirozb was looking at this issue today. He will comment on this PR if we have a solution.
For those playing along at home (hi @yoelk @Renizmy), a summary of the current issues and why we're blocked:
- Vector's `azure_blob` sink currently uses the `azure_storage` and `azure_storage_blobs` crates, which are deprecated/legacy/EOL. The last version released was 0.21.0, which aligns to the 0.21.0 `azure_core` and `azure_identity` crates.
- The proposed replacement `azure_storage_blob` crate is a ground-up re-implementation, currently in its infancy; Microsoft have a big scary warning that there are bugs, and this crate must not be used in production.
- In addition, the `azure_core` and `azure_identity` 0.22.0 crates changed the Traits of various components, and refactored the project structure, making the updated crates incompatible with the last `azure_storage` out of the box.
- I've seen other projects in the same boat do things like compatibility shims to use `azure_storage` 0.21.0 with more recent `azure_core` (which is what @thomasqueirozb is working on in https://github.com/vectordotdev/vector/pull/23351)... which works, but adds technical debt to each project. The `azure_storage` crate will need to be removed eventually.
- But, the bigger issue: for those unfamiliar with Azure workloads, there are a multitude of different ways to get credentials, depending on the deployment type and usage requirements (Managed Identities, Workload Identities, Azure CLI, certificate files... all the way down to good old OAuth Client ID & Secret). Usually, these are abstracted by the language's SDK through the `DefaultAzureCredential` class.
  - The Go SDK and Python SDK documentation have better descriptions and examples of how this works if you're interested
- Starting with `azure_identity` 0.22.0, the Microsoft team decided to make `DefaultAzureCredential` different for the Rust SDK, and only use development credentials, for unspecified security reasons. While a `ChainedTokenCredential` was proposed (again, similar to the pattern established in the Go/Python/.NET/JavaScript SDKs), this was also removed.
- The net result is that Rust projects that upgrade to `azure_identity` >= 0.22.0 will need to explicitly add configuration for the user to specify what credential type they're intending to use, and then implement a switch to instantiate the appropriate Credential - otherwise, current production deployments that use Managed/Workload Identities or `AZURE_*` environment variables will just stop working.
- ... and this morning, I see that Microsoft are considering re-designing the identity library, using cross-compiled .NET code (🫤), so more turbulence is on the horizon...
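To illustrate the "explicit configuration plus a switch" idea from the list above, here is a rough sketch using only the standard library. Everything in it is hypothetical: the `CredentialKind` enum, the option strings, and the `describe` function are placeholders for whatever config field and `azure_identity` constructors a real implementation would end up using.

```rust
use std::str::FromStr;

/// Hypothetical set of credential types a user could name in the sink config.
#[derive(Debug, Clone, PartialEq, Eq)]
enum CredentialKind {
    ClientSecret,
    ManagedIdentity,
    WorkloadIdentity,
    AzureCli,
}

impl FromStr for CredentialKind {
    type Err = String;

    /// Parses the (assumed) config value into a credential kind.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "client_secret" => Ok(Self::ClientSecret),
            "managed_identity" => Ok(Self::ManagedIdentity),
            "workload_identity" => Ok(Self::WorkloadIdentity),
            "azure_cli" => Ok(Self::AzureCli),
            other => Err(format!("unknown credential kind: {other}")),
        }
    }
}

/// Stand-in for the "switch" that would instantiate a real credential.
/// It only returns a description, so no Azure SDK calls are assumed here.
fn describe(kind: &CredentialKind) -> &'static str {
    match kind {
        CredentialKind::ClientSecret => "OAuth client ID and secret",
        CredentialKind::ManagedIdentity => "managed identity on the host",
        CredentialKind::WorkloadIdentity => "federated workload identity",
        CredentialKind::AzureCli => "token from the az CLI (development only)",
    }
}
```

With this shape, an unrecognised value fails at config load time instead of silently falling back to a development credential.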
What I think this means for this PR, and my thoughts/opinions for the wider project:
- Upgrading `azure_identity` in #23351 will break current users of the `azure_blob` sink unless they are using a `connection_string`. Given that this is on the critical path to update Vector to use `http` 1.x, this is probably still worth doing - but existing users will need to migrate their config to use a connection string.
- Once #23351 has been merged, I can then restart development on this PR, and I'll spend some time designing some reusable config options for selecting the appropriate Credential.
- Finally, once the `azure_storage_blob` crate reaches production stability, that's probably the point to migrate the `azure_blob` sink, and as part of that introduce the various identity config options.
Also note: The existing azure_monitor_logs sink is unaffected by all this drama, because it (only) uses a shared key credential. However, the upstream API is still going to be deprecated September 2026.
> - Upgrading `azure_identity` in #23351 will break current users of the `azure_blob` sink unless they are using a `connection_string`. Given that this is on the critical path to update Vector to use `http` 1.x, this is probably still worth doing - but existing users will need to migrate their config to use a connection string.
Hi @jlaundry, I missed the context on this one. How does this break existing users? E.g. assume we keep the connection_string and go ahead with that PR. The old configs will still load. Are you saying that it will break in production? Making connection_string mandatory has other benefits so we will probably go ahead with making it mandatory.
> Hi @jlaundry, I missed the context on this one. How does this break existing users? Making `connection_string` mandatory has other benefits so we will probably go ahead with making it mandatory.
@pront the current documented behavior of the azure_blob sink is that if the storage_account is specified, it will attempt to load credentials for the account in the following ways, in order:
- read from environment variables (more information)
- looks for a Managed Identity
- uses the az CLI tool to get an access token (more information)
This is based on the azure_identity <= 0.21.0 behavior of DefaultAzureCredential. Once upgraded, and unless we create our own Credential wrapper struct that re-implements the old behavior, this will change to:
- uses the az CLI tool to get an access token (more information)
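The "in order" behavior described above is a first-success chain. As a sketch only, using standard-library types and made-up names (`TokenSource`, `Fixed`, and `first_available` are not part of `azure_identity`), re-implementing the old ordering in a wrapper could look roughly like this:

```rust
/// Hypothetical abstraction over one way of obtaining a token.
trait TokenSource {
    fn name(&self) -> &'static str;
    /// Returns `None` when this source is not available in the environment.
    fn get_token(&self) -> Option<String>;
}

/// Test double standing in for a real credential (environment variables,
/// Managed Identity, az CLI); it just returns a fixed result.
struct Fixed {
    name: &'static str,
    token: Option<&'static str>,
}

impl TokenSource for Fixed {
    fn name(&self) -> &'static str {
        self.name
    }

    fn get_token(&self) -> Option<String> {
        self.token.map(String::from)
    }
}

/// Walks the chain in order and returns the first source that yields a
/// token, along with that source's name.
fn first_available(chain: &[Box<dyn TokenSource>]) -> Option<(&'static str, String)> {
    chain
        .iter()
        .find_map(|source| source.get_token().map(|token| (source.name(), token)))
}
```

With a chain of environment, Managed Identity, then az CLI, a host that only has a Managed Identity resolves to that source; with the post-upgrade behavior the chain would effectively contain the az CLI entry alone.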
Based on past experience with customers, I expect 60-70% of production users are using connection_string, and among the remaining it's evenly split between environment variables and Managed Identities - but there's no real way to know for sure until it breaks. I don't think anyone has a valid use case for using an az CLI identity outside development environments.
So yes, I support removing the storage_account field, and forcing everyone to use connection_string until we have a pattern for using environment variables and Managed Identities.
FYI @jlaundry: this https://github.com/vectordotdev/vector/pull/23351 was merged
Very nice work. Looking forward to seeing this upstream. Right now I am using Logstash with the microsoft-sentinel-log-analytics-logstash-output-plugin plugin as a temporary solution.
Hi.
Can we expect this to be included in the next release of Vector?
> Hi.
> Can we expect this to be included in the next release of Vector?
Hi @sb1-nicolai, this depends on @jlaundry and the community. The Vector team is not actively working on this PR.
While I am still keen to finish this feature in time for the older API deprecation, I don't want to load the project with an unmanageable/unsupported mess. Unfortunately, not much has changed since I wrote https://github.com/vectordotdev/vector/pull/22912#issuecomment-3064019563
The azure_storage_blob crate still doesn't have feature parity with azure_storage_blobs, and the Microsoft Rust team aren't making their roadmap clear. Other projects are continuing to vendor the old azure_storage_blobs crate with shims.
And on the azure_identity side, it's also unclear if they're moving ahead with their (horrible, IMHO) plan to wedge .NET cross-compiled code in, and it appears the product group are resisting adopting the AZURE_TOKEN_CREDENTIALS convention that they've established with the other languages.
@sb1-nicolai if Azure Log Ingestion is important to you and your team, may I please suggest you reach out to your Microsoft CSAM, and ask them to escalate to the product group.
Otherwise... Vector has fantastic support for other cloud logging platforms... 😉
Hi @jlaundry
Following the discussion about AZURE_TOKEN_CREDENTIALS and per-sink authentication:
I see there's interest in using AZURE_TOKEN_CREDENTIALS to control credential chain behavior, but I'm struggling to understand how this would be compatible with per-sink authentication
For example:
- If we set `AZURE_TOKEN_CREDENTIALS="prod"`, ALL sinks will use the prod chain (Environment → WorkloadIdentity → ManagedIdentity)
- If we set `AZURE_TOKEN_CREDENTIALS="ManagedIdentityCredential"`, ALL sinks will use only ManagedIdentity
- I don't see a way to have Sink A use one credential type and Sink B use another
This creates an incompatibility with @yoelk's requirement for different authentication per sink
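To make the question concrete, one possible shape (purely an assumption, not something proposed in this PR) would be to treat the environment variable as a process-wide default that an optional per-sink setting can override. The function name `effective_credential` and the idea of a per-sink option are both hypothetical:

```rust
/// Hypothetical resolution rule: a per-sink setting wins when present,
/// otherwise the process-wide value (for example, the contents of
/// `AZURE_TOKEN_CREDENTIALS`) is used. `None` means nothing was configured.
fn effective_credential<'a>(
    per_sink: Option<&'a str>,
    global: Option<&'a str>,
) -> Option<&'a str> {
    per_sink.or(global)
}
```

Under that rule, Sink A could set its own credential type while Sink B inherits `"prod"` from the environment; whether that is acceptable is exactly the open question.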
Thoughts?