opentelemetry-collector-contrib
[filelogreceiver] CPU consumption increases (roughly) linearly with number of files watched
Component(s)
pkg/stanza, receiver/filelog
What happened?
Description
The CPU consumption of the filelog receiver increases (roughly) linearly as the number of files to be watched increases.
Steps to Reproduce
- Download otelcol-contrib (any recent release; tested on Linux AMD64 with versions from 0.77 to 0.86)
- Create 100 small files of about 1000 bytes each with random content, e.g.
  i=1; while [ $i -le 100 ]; do dd if=/dev/urandom bs=1000 count=1 | base64 > /tmp/logs/$i.log; i=$(($i+1)); done
- Run the OTel Collector with a simple pipeline of filelogreceiver -> loggingexporter and send the 100 files through the pipeline
- Observe the CPU utilization of the otelcol-contrib process. On my test system (AMD Zen3 laptop CPU), watching 100 files uses ~5% of a single core.
- Repeat from the file-creation step with an increased number of files to be watched.
- Observe the CPU utilization increase. On my test system, watching 1000 files requires ~110% of one core.
Expected Result
Good question.
I think that, ultimately, the expected result is that the default configuration of the filelog receiver protects the administrator from putting an unexpected burden on the host CPU.
Actual Result
With a default configuration, the filelog receiver can cause a significant increase in CPU utilization.
Collector version
v0.86.0
Environment information
Environment
OS: Arch Linux x86_64; Compiler (if manually compiled): N/A, using binaries from GitHub for testing.
OpenTelemetry Collector configuration
receivers:
  filelog/manyfiles:
    include:
      - /tmp/logs/*
exporters:
  logging:
    # verbosity: detailed ## if visual verification of pipeline is required
service:
  pipelines:
    logs/manyfiles:
      receivers:
        - filelog/manyfiles
      processors: []
      exporters:
        - logging
Log output
./otelcol-contrib --config minimal.yaml
2023-10-03T18:51:22.983+1100 info service@v0.86.0/telemetry.go:84 Setting up own telemetry...
2023-10-03T18:51:22.983+1100 info service@v0.86.0/telemetry.go:201 Serving Prometheus metrics {"address": ":8888", "level": "Basic"}
2023-10-03T18:51:22.983+1100 info exporter@v0.86.0/exporter.go:275 Deprecated component. Will be removed in future releases. {"kind": "exporter", "data_type": "logs", "name": "logging"}
2023-10-03T18:51:22.985+1100 info service@v0.86.0/service.go:138 Starting otelcol-contrib... {"Version": "0.86.0", "NumCPU": 16}
2023-10-03T18:51:22.985+1100 info extensions/extensions.go:31 Starting extensions...
2023-10-03T18:51:22.985+1100 info adapter/receiver.go:45 Starting stanza receiver {"kind": "receiver", "name": "filelog/manyfiles", "data_type": "logs"}
2023-10-03T18:51:22.988+1100 info service@v0.86.0/service.go:161 Everything is ready. Begin running and processing data.
2023-10-03T18:51:23.195+1100 info fileconsumer/file.go:196 Started watching file from end. To read preexisting logs, configure the argument 'start_at' to 'beginning' {"kind": "receiver", "name": "filelog/manyfiles", "data_type": "logs", "component": "fileconsumer", "path": "/tmp/logs/1.log"}
2023-10-03T18:51:23.258+1100 info fileconsumer/file.go:194 Started watching file {"kind": "receiver", "name": "filelog/manyfiles", "data_type": "logs", "component": "fileconsumer", "path": "/tmp/logs/560.log"}
2023-10-03T18:51:23.307+1100 info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "logging", "resource logs": 1, "log records": 100}
2023-10-03T18:51:23.307+1100 info LogsExporter {"kind": "exporter", "data_type": "logs", "name": "logging", "resource logs": 1, "log records": 100}
Additional context
A quick look at pprof shows that most of the CPU time is spent on GC.
(pprof) top10
Showing nodes accounting for 6420ms, 58.95% of 10890ms total
Dropped 280 nodes (cum <= 54.45ms)
Showing top 10 nodes out of 173
flat flat% sum% cum cum%
1700ms 15.61% 15.61% 1700ms 15.61% runtime/internal/syscall.Syscall6
1220ms 11.20% 26.81% 1220ms 11.20% runtime.memclrNoHeapPointers
960ms 8.82% 35.63% 2360ms 21.67% runtime.scanobject
950ms 8.72% 44.35% 950ms 8.72% memeqbody
360ms 3.31% 47.66% 540ms 4.96% runtime.findObject
300ms 2.75% 50.41% 2780ms 25.53% runtime.gcDrain
300ms 2.75% 53.17% 490ms 4.50% runtime.greyobject
220ms 2.02% 55.19% 220ms 2.02% runtime.procyield
210ms 1.93% 57.12% 210ms 1.93% runtime.futex
200ms 1.84% 58.95% 200ms 1.84% encoding/base64.(*Encoding).Encode
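For anyone who wants to reproduce a profile like the one above, a minimal sketch, assuming your otelcol-contrib build includes the pprof extension (the official contrib releases do, as far as I know), is to add the extension to the test config:

extensions:
  pprof:
    endpoint: localhost:1777  # 1777 is, to my knowledge, the extension's default port

service:
  extensions: [pprof]

and then capture a CPU profile with go tool pprof "http://localhost:1777/debug/pprof/profile?seconds=30" while the files are being watched.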
Pinging code owners:
- filelogreceiver: @djaglowski
- pkg/stanza: @djaglowski
- receiver/filelog: @djaglowski
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Performance improvements are always welcome.
I'll see how far I can get with just some flamegraph analysis.
Thanks @cwegener. Feel free to share any insights you discover here.
My first instinct is that allocations are escaping to the heap. I'll have a look at which allocations those might be and whether the code can be changed so that the compiler uses stack-based allocation instead.
Looking at this again after some related discussions, I think I can explain what is going on here.
Basically, we are remembering N files (roughly 3x what we find in a poll interval). Then, for each of the M files we find in a poll interval, we open it, read a fingerprint from it, and compare it directly against the N files we remember (unless we find a match, in which case we stop short).
There is clearly an opportunity for improvement here, and I think recent refactoring has put us closer to a trie-based solution, which would give us much more efficient cross-referencing of files. Let's keep this open until we can make a substantial gain on this front.
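To make the cost concrete, here is a simplified Go sketch of the per-poll matching described above. It is illustrative only, not the actual pkg/stanza code; the type and function names are stand-ins, although a StartsWith check does show up in the profiles further down.

package main

import (
	"bytes"
	"fmt"
)

// fingerprint is a simplified stand-in for the fileconsumer fingerprint:
// the first bytes read from a file, used to recognize files across polls.
type fingerprint struct {
	firstBytes []byte
}

// startsWith reports whether fp begins with the (possibly shorter) old
// fingerprint, mirroring the role of Fingerprint.StartsWith in the profiles.
func (fp fingerprint) startsWith(old fingerprint) bool {
	n := len(old.firstBytes)
	return n <= len(fp.firstBytes) && bytes.Equal(fp.firstBytes[:n], old.firstBytes)
}

// matchKnown shows where the cost comes from: each of the M files found in a
// poll is compared against up to N remembered fingerprints, so every poll does
// on the order of M*N byte comparisons, on top of the M open/read syscalls
// needed to produce the new fingerprints in the first place.
func matchKnown(found, remembered []fingerprint) (matched int) {
	for _, fp := range found { // M files found this poll
		for _, old := range remembered { // up to N remembered files
			if fp.startsWith(old) {
				matched++
				break // stop short once a match is found
			}
		}
	}
	return matched
}

func main() {
	// Tiny synthetic example: one of the two "found" files is already known.
	remembered := []fingerprint{{firstBytes: []byte("alpha")}, {firstBytes: []byte("beta")}}
	found := []fingerprint{{firstBytes: []byte("alpha log line ...")}, {firstBytes: []byte("gamma log line ...")}}
	fmt.Println(matchKnown(found, remembered)) // 1
}

With N remembered files and M files found per poll, every poll interval performs on the order of M*N fingerprint comparisons plus M open/read syscalls, which lines up with the Syscall6 and memeqbody samples at the top of the profiles.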
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
- receiver/filelog: @djaglowski
- pkg/stanza: @djaglowski
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Hello. Just a small comment/feedback: in our environments this bug is a blocker, so we had to drop the filelog receiver and switch to another (non-OpenTelemetry) solution for a specific case.
Thank you for the additional feedback @afoninsky. We will take a serious look at this before moving the component to GA.
The most impactful mitigation I've found is to lengthen the poll_interval to 1s.
With a value of 200ms, my understanding is that the collector checks all files 5 times per second. With 1000+ files, that is on the order of 5000 open/read/compare operations per second, which creates significant overhead.
With poll_interval: 200ms
(pprof) top10
Showing nodes accounting for 17500ms, 47.44% of 36890ms total
Dropped 530 nodes (cum <= 184.45ms)
Showing top 10 nodes out of 250
flat flat% sum% cum cum%
4250ms 11.52% 11.52% 4250ms 11.52% runtime/internal/syscall.Syscall6
3030ms 8.21% 19.73% 3030ms 8.21% memeqbody
2330ms 6.32% 26.05% 3020ms 8.19% encoding/json.appendCompact
2010ms 5.45% 31.50% 2010ms 5.45% runtime.memclrNoHeapPointers
1400ms 3.80% 35.29% 4620ms 12.52% runtime.scanobject
1150ms 3.12% 38.41% 4260ms 11.55% github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/fileset.(*Fileset[go.shape.*uint8]).Match
960ms 2.60% 41.01% 1270ms 3.44% runtime.typePointers.next
890ms 2.41% 43.43% 1920ms 5.20% runtime.pcvalue
860ms 2.33% 45.76% 860ms 2.33% runtime.futex
620ms 1.68% 47.44% 620ms 1.68% encoding/json.stateInString
With poll_interval: 1s
(pprof) top10
Showing nodes accounting for 3650ms, 46.09% of 7920ms total
Dropped 238 nodes (cum <= 39.60ms)
Showing top 10 nodes out of 269
flat flat% sum% cum cum%
830ms 10.48% 10.48% 830ms 10.48% runtime/internal/syscall.Syscall6
790ms 9.97% 20.45% 790ms 9.97% memeqbody
460ms 5.81% 26.26% 580ms 7.32% encoding/json.appendCompact
310ms 3.91% 30.18% 930ms 11.74% runtime.scanobject
280ms 3.54% 33.71% 280ms 3.54% runtime.memclrNoHeapPointers
260ms 3.28% 36.99% 1050ms 13.26% github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/fileset.(*Fileset[go.shape.*uint8]).Match
230ms 2.90% 39.90% 450ms 5.68% runtime.pcvalue
170ms 2.15% 42.05% 930ms 11.74% github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/fingerprint.Fingerprint.StartsWith (inline)
160ms 2.02% 44.07% 160ms 2.02% runtime.futex
160ms 2.02% 46.09% 160ms 2.02% runtime.step
The next mitigation will be to reduce max_concurrent_files, because files are read concurrently; a config sketch combining both mitigations is below.
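For reference, a minimal sketch of the receiver config with both mitigations applied (the values are illustrative starting points, not recommendations):

receivers:
  filelog/manyfiles:
    include:
      - /tmp/logs/*
    poll_interval: 1s
    max_concurrent_files: 512

Roughly speaking, both settings trade latency and throughput for CPU: a longer poll_interval means new log lines are picked up less often, and a lower max_concurrent_files means a large set of files takes more batches to work through.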
Hope this helps other readers mitigate the issue.