enhancement(kubernetes_logs source): Wire up acknowledgement support via existing file ack support
Summary
Mirror changes made to file source as part of implementing acknowledgements support there over to kubernetes_logs source.
Change Type
- [ ] Bug fix
- [x] New feature
- [ ] Non-functional (chore, refactoring, docs)
- [ ] Performance
Is this a breaking change?
- [ ] Yes
- [x] No
How did you test this PR?
Given that the underlying feature work for implementing acknowledgements in the file source is well-tested, my plan is to deploy a custom build of vector with this change to a dev environment and simply smoke test to confirm that, when acknowledgements are enabled via config, we see evidence that log events are not considered processed until they are fully delivered at the end of the sink chain (eg by counting unique log event ids at the end of the test after bouncing some of the intermediate hops in chain during the test).
Does this PR include user facing changes?
- [x] Yes. Please add a changelog fragment based on our guidelines.
- [ ] No. A maintainer will apply the "no-changelog" label to this PR.
Checklist
- [x] Please read our Vector contributor resources.
make check-allis a good command to run locally. This check is defined here. Some of these checks might not be relevant to your PR. For Rust changes, at the very least you should run:cargo fmt --allcargo clippy --workspace --all-targets -- -D warningscargo nextest run --workspace(alternatively, you can runcargo test --all)
- [ ] If this PR introduces changes Vector dependencies (modifies
Cargo.lock), please rundd-rust-license-tool writeto regenerate the license inventory and commit the changes (if any). More details here.
References
- Closes: https://github.com/vectordotdev/vector/issues/7462
- Ref https://github.com/vectordotdev/vector/issues/7336 and https://github.com/vectordotdev/vector/issues/9856 for global tracking of ACK support across sources
- Ref https://github.com/vectordotdev/vector/issues/7459 for ACK support in file source, which this PR depends on (implemented in https://github.com/vectordotdev/vector/pull/7816)
- Ref https://github.com/vectordotdev/vector/issues/9856 (which was closed as a dupe of 7462)
Hi @ganelo and thank you for this PR. The make check-component-docs check failed. Can you take a look?
Hi @ganelo and thank you for this PR. The
make check-component-docscheck failed. Can you take a look?
Hi @pront - this seems to be broken on master, as far as I can tell. In any case, nothing I've touched should have broken the ruby script. Not sure how best to proceed.
Hi @ganelo and thank you for this PR. The
make check-component-docscheck failed. Can you take a look?Hi @pront - this seems to be broken on master, as far as I can tell. In any case, nothing I've touched should have broken the ruby script. Not sure how best to proceed.
Can you share the failure? The script requires Ruby 3, which often trips people up.
Hi @pront - this seems to be broken on master, as far as I can tell.
I verified that it is not broken master.
In any case, nothing I've touched should have broken the ruby script. Not sure how best to proceed.
This PR adds configuration fields and we rely on source code to generate docs. Hence, docs need to regenerated. As @jszwedko pointed out we have a dependency on Ruby 3. My version locally:
❯ ruby -v
ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [arm64-darwin22]
Ah okay, thanks for clarifying! Will try again tomorrow with the correct version of Ruby to get to the bottom of the failure.
On Thu, Feb 6, 2025, 18:37 Pavlos Rontidis @.***> wrote:
Hi @pront https://github.com/pront - this seems to be broken on master, as far as I can tell.
I verified that it is not broken master.
In any case, nothing I've touched should have broken the ruby script. Not sure how best to proceed.
This PR adds configuration fields and we rely on source code to generate docs. Hence, docs need to regenerated. As @jszwedko https://github.com/jszwedko pointed out we have a dependency on Ruby 3. My version locally:
❯ ruby -v ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [arm64-darwin22]
— Reply to this email directly, view it on GitHub https://github.com/vectordotdev/vector/pull/22366#issuecomment-2641355696, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB4OKWXU3ELUEEVUJQVLFDD2OPW3BAVCNFSM6AAAAABWPOJTRWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMNBRGM2TKNRZGY . You are receiving this because you were mentioned.Message ID: @.***>
Looks better now 👍 Is it possible to add a test to cover the acks feature? You can draw inspiration from other sources e.g. the file source:
https://github.com/vectordotdev/vector/blob/b90715bd452e03f4ca151114f5475daca3d99832/src/sources/file.rs#L2375-L2376
Looks better now 👍 Is it possible to add a test to cover the acks feature? You can draw inspiration from other sources e.g. the
filesource:https://github.com/vectordotdev/vector/blob/b90715bd452e03f4ca151114f5475daca3d99832/src/sources/file.rs#L2375-L2376
To confirm - there aren't any existing tests that actually exercise the kubernetes_logs source, right? I poked around and didn't find much that did anything beyond some niche unit tests. Certainly nothing to mirror the run_file_source fn of file.rs.
I'm not utterly opposed to writing one from more or less whole cloth, but it will be somewhat slow going as I'm a relative newcomer to Rust.
@pront after a lot of off-and-on effort, finally managed to get some working tests added - note that the current tests require permissions to write to /var/log since that's where the kube crate expects kubernetes logs to be. if that's a problem, I'm not sure how to get around that requirement
@pront after a lot of off-and-on effort, finally managed to get some working tests added - note that the current tests require permissions to write to /var/log since that's where the kube crate expects kubernetes logs to be. if that's a problem, I'm not sure how to get around that requirement
Maybe I don't know the implementation well enough, but could we use let data_dir = tempdir().unwrap(); and set include_paths_glob_patterns to that?
@pront after a lot of off-and-on effort, finally managed to get some working tests added - note that the current tests require permissions to write to /var/log since that's where the kube crate expects kubernetes logs to be. if that's a problem, I'm not sure how to get around that requirement
Maybe I don't know the implementation well enough, but could we use
let data_dir = tempdir().unwrap();and setinclude_paths_glob_patternsto that?
I'm afraid not - data_dir only controls where vector writes its checkpoint files, the location that vector will look at to find the actual input kubernetes log files for the source is hardcoded here:
https://github.com/vectordotdev/vector/blob/689a65d51a2f8034f220afcf5e65e1b924019687/src/sources/kubernetes_logs/path_helpers.rs#L10
@pront after a lot of off-and-on effort, finally managed to get some working tests added - note that the current tests require permissions to write to /var/log since that's where the kube crate expects kubernetes logs to be. if that's a problem, I'm not sure how to get around that requirement
Maybe I don't know the implementation well enough, but could we use
let data_dir = tempdir().unwrap();and setinclude_paths_glob_patternsto that?I'm afraid not - data_dir only controls where vector writes its checkpoint files, the location that vector will look at to find the actual input kubernetes log files for the source is hardcoded here:
https://github.com/vectordotdev/vector/blob/689a65d51a2f8034f220afcf5e65e1b924019687/src/sources/kubernetes_logs/path_helpers.rs#L10
I see, is there anything preventing us from refactoring this? Tests requiring permissions to /var/log isn't great.
A hackier alternative is to provide different values based on #[cfg(test)] and #[cfg(not(test))].
@pront after a lot of off-and-on effort, finally managed to get some working tests added - note that the current tests require permissions to write to /var/log since that's where the kube crate expects kubernetes logs to be. if that's a problem, I'm not sure how to get around that requirement
Maybe I don't know the implementation well enough, but could we use
let data_dir = tempdir().unwrap();and setinclude_paths_glob_patternsto that?I'm afraid not - data_dir only controls where vector writes its checkpoint files, the location that vector will look at to find the actual input kubernetes log files for the source is hardcoded here: https://github.com/vectordotdev/vector/blob/689a65d51a2f8034f220afcf5e65e1b924019687/src/sources/kubernetes_logs/path_helpers.rs#L10
I see, is there anything preventing us from refactoring this? Tests requiring permissions to
/var/logisn't great.A hackier alternative is to provide different values based on
#[cfg(test)]and#[cfg(not(test))].
That's fair - pushed up a change which plumbs through a temp dir for testing.
Hey @pront - want to make sure you saw the latest changes here :)
@pront - I hesitate to close this out after so much iteration, but I do want to be up front that we've struggled to enable this in our environment. The behavior that we've been observing is bursts of full throughput interspersed with long periods of absolutely no events coming through. From some reading of the original PR that added support for file-based acknowledgements (https://github.com/vectordotdev/vector/pull/7816), I attribute this to perf concerns with the underlying implementation of acknowledgements for the file source, which obviously this PR just extends to the kubernetes_logs source.
So, bottom line is that we will probably not use this feature, but since presumably folks were comfortable merging https://github.com/vectordotdev/vector/pull/7816 despite the perf concerns, I think this PR is in a similar boat. I leave it to you and/or other Vector devs as to whether to proceed with this PR as written or not.
@pront - I hesitate to close this out after so much iteration, but I do want to be up front that we've struggled to enable this in our environment. The behavior that we've been observing is bursts of full throughput interspersed with long periods of absolutely no events coming through. From some reading of the original PR that added support for file-based acknowledgements (#7816), I attribute this to perf concerns with the underlying implementation of acknowledgements for the file source, which obviously this PR just extends to the kubernetes_logs source.
So, bottom line is that we will probably not use this feature, but since presumably folks were comfortable merging #7816 despite the perf concerns, I think this PR is in a similar boat. I leave it to you and/or other Vector devs as to whether to proceed with this PR as written or not.
Hi @ganelo, thank you for your efforts on this PR. The Vector team appreciates it. If nothing else, there's good test enhancement and code refactoring in this. Ideally we want those. Also, we still go ahead with it but we should add a note in the changelog and in the K8s logs source docs that this is in beta and mention the performance degradation you observed.