kubernetes_logs source wrongly collects messages from the beginning of the file
A note for the community
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
The kubernetes_logs source is configured to tail the end of the log file to prevent it from collecting old messages (read_from: end). But when testing with an isolated service that it has never seen before, it collects messages from the beginning of the file. To ensure Vector treated the log file as "new", we changed data_dir to a new empty directory.
After deploying Vector, we expected to see messages starting from the moment Vector started. Instead, we see messages going back more than a day.
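For reference, the test setup looked roughly like this (a sketch only; the fresh directory path is illustrative and the actual deployment mechanism may differ):
# Start Vector against a brand-new, empty data_dir so no prior checkpoints
# exist for the log file (the path below is hypothetical).
mkdir -p /var/lib/vector-agent-fresh
ls -A /var/lib/vector-agent-fresh   # confirm the directory is empty
# then set data_dir: /var/lib/vector-agent-fresh in the config and redeploy Vector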
Configuration
data_dir: /var/lib/vector-agent
sources:
  container_logs_source:
    type: kubernetes_logs
    auto_partial_merge: true
    # We retain cached metadata for twice the default because the pod can
    # be recreated.
    delay_deletion_ms: 120000
    extra_field_selector: "metadata.namespace=cme-internal-apps"
    exclude_paths_glob_patterns:
      - "**/*.gz"
      - "**/*.tmp"
      - "**/*.log.*"
    # This is 25% of the default because we know that some tenants trigger rapid
    # rolls. Ignore files older than 5 minutes...but pull new events when
    # written.
    glob_minimum_cooldown_ms: 15000
    ignore_older_secs: 300
    # Read the default 32k per line (Vector skips longer messages) and read
    # twice the default bytes to get more of the message before a roll.
    # We still truncate to 10k later.
    max_line_bytes: 32768
    max_read_bytes: 4096
    read_from: end
    self_node_name: "${VECTOR_SELF_NODE_NAME}"
    timezone: local
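As an additional sanity check, one could inspect which offsets Vector persisted after it started against the fresh data_dir. This is only a sketch and assumes kubernetes_logs persists checkpoints the way Vector's file-based sources do (a checkpoints.json under a per-source subdirectory of data_dir); the exact layout may vary by version:
# Illustrative only: look for persisted file offsets under the data_dir.
find /var/lib/vector-agent -name checkpoints.json -exec cat {} \;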
Version
0.28.1
Debug Output
No response
Example Data
No response
Additional Context
Vector logs that it discovered the log file:
2023-06-01T16:08:14.435585Z INFO source{component_kind="source" component_id=container_logs_source component_type=kubernetes_logs component_name=container_logs_source}:file_server: vector::internal_events::file::source: Found new file to watch. file=/var/log/pods/cme-internal-apps_echo-server-cccfc9489-c49mf_26e09362-3a73-4af5-9aae-a9aca984c14f/echo-server/0.log
We see that the first message in the echo-server log file is from yesterday:
head -1 0.log
2023-05-30T12:42:55.189666057-04:00 stdout F 10.156.154.140:43268 | GET /ping
And we see that this message, and many hundreds more, were also collected:
2023-05-30T12:42:55.189 10.156.154.140:43268 | GET /ping
The log file is a regular file on the filesystem:
stat 0.log
File: ‘0.log’
Size: 2725064 Blocks: 9344 IO Block: 4096 regular file
Device: 10300h/66304d Inode: 67108944 Links: 1
Access: (0640/-rw-r-----) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2023-06-01 13:18:05.442032816 -0400
Modify: 2023-06-01 13:18:33.892942790 -0400
Change: 2023-06-01 13:18:33.892942790 -0400
Birth: -
References
No response
We're experiencing the same issue, even on the latest version, 0.30.0.
I also experienced the same issue with 0.26.0.
I'm also experiencing the same issue with the latest version, 0.34.1.
Yes, I also experienced this issue in 0.34.1.
Still the same on 0.36.0.
Running into this on 0.39.0. It's making it fairly treacherous to swap out our old pipeline components for Vector in busy clusters due to huge initial spikes in resource utilization.