
[filelogreceiver] CPU consumption increases (roughly) linearly with number of files watched

Open cwegener opened this issue 2 years ago • 12 comments

Component(s)

pkg/stanza, receiver/filelog

What happened?

Description

The CPU consumption of the filelog receiver increases (roughly) linearly as the number of files to be watched increases.

Steps to Reproduce

  1. Download otelcol-contrib (any recent release; tested on Linux AMD64 with versions from 0.77 to 0.86)
  2. Create 100 small files, each with roughly 1 KB of random content, e.g. mkdir -p /tmp/logs; i=1; while [ $i -le 100 ]; do dd if=/dev/urandom bs=1000 count=1 | base64 > /tmp/logs/$i.log; i=$(($i+1)); done
  3. Run OTel Collector with a simple pipeline of filelogreceiver -> loggingexporter and send the 100 files through the pipeline
  4. Observe the CPU utilization of the otelcol-contrib process. On my test system (AMD Zen 3 laptop CPU), watching 100 files uses ~5% of a single core.
  5. Repeat from step 2, increasing the number of files to be watched.
  6. Observe the increase in CPU utilization. On my test system, watching 1000 files requires ~110% of one core.

Expected Result

Good question.

I think that, ultimately, the expected result is that the default configuration of the filelog receiver protects the administrator from putting an unexpected burden on the host CPU.

Actual Result

With the default configuration, the filelog receiver can cause a significant increase in CPU utilization.

Collector version

v0.86.0

Environment information

Environment

OS: Arch Linux x86_64. Compiler (if manually compiled): N/A; using release binaries from GitHub for testing.

OpenTelemetry Collector configuration

receivers:
  filelog/manyfiles:
    include:
      - /tmp/logs/*

exporters:
  logging:
    # verbosity: detailed ## if visual verification of pipeline is required

service:
  pipelines:
    logs/manyfiles:
      receivers:
        - filelog/manyfiles
      processors: []
      exporters:
        - logging

Log output

./otelcol-contrib --config minimal.yaml
2023-10-03T18:51:22.983+1100    info    [email protected]/telemetry.go:84 Setting up own telemetry...
2023-10-03T18:51:22.983+1100    info    [email protected]/telemetry.go:201        Serving Prometheus metrics      {"address": ":8888", "level": "Basic"}
2023-10-03T18:51:22.983+1100    info    [email protected]/exporter.go:275        Deprecated component. Will be removed in future releases.       {"kind": "exporter", "data_type": "logs", "name": "logging"}
2023-10-03T18:51:22.985+1100    info    [email protected]/service.go:138  Starting otelcol-contrib...     {"Version": "0.86.0", "NumCPU": 16}
2023-10-03T18:51:22.985+1100    info    extensions/extensions.go:31     Starting extensions...
2023-10-03T18:51:22.985+1100    info    adapter/receiver.go:45  Starting stanza receiver        {"kind": "receiver", "name": "filelog/manyfiles", "data_type": "logs"}
2023-10-03T18:51:22.988+1100    info    [email protected]/service.go:161  Everything is ready. Begin running and processing data.
2023-10-03T18:51:23.195+1100    info    fileconsumer/file.go:196        Started watching file from end. To read preexisting logs, configure the argument 'start_at' to 'beginning'      {"kind": "receiver", "name": "filelog/manyfiles", "data_type": "logs", "component": "fileconsumer", "path": "/tmp/logs/1.log"}
2023-10-03T18:51:23.258+1100    info    fileconsumer/file.go:194        Started watching file   {"kind": "receiver", "name": "filelog/manyfiles", "data_type": "logs", "component": "fileconsumer", "path": "/tmp/logs/560.log"}
2023-10-03T18:51:23.307+1100    info    LogsExporter    {"kind": "exporter", "data_type": "logs", "name": "logging", "resource logs": 1, "log records": 100}
2023-10-03T18:51:23.307+1100    info    LogsExporter    {"kind": "exporter", "data_type": "logs", "name": "logging", "resource logs": 1, "log records": 100}

Additional context

A quick look at pprof shows that most of the CPU time is spent on GC.

(pprof) top10 
Showing nodes accounting for 6420ms, 58.95% of 10890ms total
Dropped 280 nodes (cum <= 54.45ms)
Showing top 10 nodes out of 173
      flat  flat%   sum%        cum   cum%
    1700ms 15.61% 15.61%     1700ms 15.61%  runtime/internal/syscall.Syscall6
    1220ms 11.20% 26.81%     1220ms 11.20%  runtime.memclrNoHeapPointers
     960ms  8.82% 35.63%     2360ms 21.67%  runtime.scanobject
     950ms  8.72% 44.35%      950ms  8.72%  memeqbody
     360ms  3.31% 47.66%      540ms  4.96%  runtime.findObject
     300ms  2.75% 50.41%     2780ms 25.53%  runtime.gcDrain
     300ms  2.75% 53.17%      490ms  4.50%  runtime.greyobject
     220ms  2.02% 55.19%      220ms  2.02%  runtime.procyield
     210ms  1.93% 57.12%      210ms  1.93%  runtime.futex
     200ms  1.84% 58.95%      200ms  1.84%  encoding/base64.(*Encoding).Encode
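
For reference, a CPU profile like the one above can be collected by enabling the collector's pprof extension and pointing go tool pprof at it. A minimal sketch, assuming the default endpoint of that extension (this is not necessarily how the profile above was captured):

extensions:
  pprof:
    endpoint: localhost:1777   # default listen address of the pprof extension

service:
  extensions:
    - pprof
  # keep the pipelines section as configured above

# Capture ~30s of CPU profile while the collector is under load:
#   go tool pprof http://localhost:1777/debug/pprof/profile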

cwegener avatar Oct 03 '23 07:10 cwegener

Pinging code owners:

  • filelogreceiver: @djaglowski
  • pkg/stanza: @djaglowski
  • receiver/filelog: @djaglowski

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Oct 03 '23 07:10 github-actions[bot]

Performance improvements are always welcome.

djaglowski avatar Oct 03 '23 13:10 djaglowski

Pinging code owners for receiver/filelog: @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Oct 03 '23 15:10 github-actions[bot]

Pinging code owners for pkg/stanza: @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Oct 03 '23 15:10 github-actions[bot]

> Performance improvements are always welcome.

I'll see how far I can get with just some flamegraph analysis.

cwegener avatar Oct 03 '23 20:10 cwegener

Thanks @cwegener. Feel free to share any insights you discover here.

djaglowski avatar Oct 03 '23 22:10 djaglowski

> Thanks @cwegener. Feel free to share any insights you discover here.

My first instinct is that allocations are escaping to the heap. I'll have a look at which allocations those might be and whether the code can be changed so that the compiler uses stack allocation instead.

cwegener avatar Oct 04 '23 05:10 cwegener

Looking at this again after some related discussions, I think I can explain what is going on here.

Basically, we remember N files (roughly 3x the number we find in a poll interval). Then, for each of the M files we find in a poll interval, we open it, read its fingerprint, and compare it directly against the N files we remember (unless we find a match, in which case we stop short).

There is clearly opportunity for improvement here and I think recent refactoring has put us closer to a trie-based solution which would give us much more efficient cross-referencing of files. Let's keep this open until we can make a substantial gain on this front.

djaglowski avatar Oct 31 '23 12:10 djaglowski

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • receiver/filelog: @djaglowski
  • pkg/stanza: @djaglowski

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Jan 01 '24 03:01 github-actions[bot]

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • receiver/filelog: @djaglowski
  • pkg/stanza: @djaglowski

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Apr 15 '24 04:04 github-actions[bot]

Hello. Just a small comment/feedback: in our environments this bug is a blocker, so we had to drop the filelog receiver and switch to another (non-OpenTelemetry) solution for a specific case.

afoninsky avatar Jun 17 '24 07:06 afoninsky

Thank you for the additional feedback @afoninsky. We will take a serious look at this before moving the component to GA.

djaglowski avatar Jun 28 '24 16:06 djaglowski

The most impactful mitigation I've found is to lengthen poll_interval to 1s.

With the default value of 200ms, my understanding is that the receiver checks all matched files five times per second. With 1000+ files, this creates significant overhead.

With poll_interval: 200ms

(pprof) top10
Showing nodes accounting for 17500ms, 47.44% of 36890ms total
Dropped 530 nodes (cum <= 184.45ms)
Showing top 10 nodes out of 250
      flat  flat%   sum%        cum   cum%
    4250ms 11.52% 11.52%     4250ms 11.52%  runtime/internal/syscall.Syscall6
    3030ms  8.21% 19.73%     3030ms  8.21%  memeqbody
    2330ms  6.32% 26.05%     3020ms  8.19%  encoding/json.appendCompact
    2010ms  5.45% 31.50%     2010ms  5.45%  runtime.memclrNoHeapPointers
    1400ms  3.80% 35.29%     4620ms 12.52%  runtime.scanobject
    1150ms  3.12% 38.41%     4260ms 11.55%  github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/fileset.(*Fileset[go.shape.*uint8]).Match
     960ms  2.60% 41.01%     1270ms  3.44%  runtime.typePointers.next
     890ms  2.41% 43.43%     1920ms  5.20%  runtime.pcvalue
     860ms  2.33% 45.76%      860ms  2.33%  runtime.futex
     620ms  1.68% 47.44%      620ms  1.68%  encoding/json.stateInString

With poll_interval: 1s

(pprof) top10
Showing nodes accounting for 3650ms, 46.09% of 7920ms total
Dropped 238 nodes (cum <= 39.60ms)
Showing top 10 nodes out of 269
      flat  flat%   sum%        cum   cum%
     830ms 10.48% 10.48%      830ms 10.48%  runtime/internal/syscall.Syscall6
     790ms  9.97% 20.45%      790ms  9.97%  memeqbody
     460ms  5.81% 26.26%      580ms  7.32%  encoding/json.appendCompact
     310ms  3.91% 30.18%      930ms 11.74%  runtime.scanobject
     280ms  3.54% 33.71%      280ms  3.54%  runtime.memclrNoHeapPointers
     260ms  3.28% 36.99%     1050ms 13.26%  github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/fileset.(*Fileset[go.shape.*uint8]).Match
     230ms  2.90% 39.90%      450ms  5.68%  runtime.pcvalue
     170ms  2.15% 42.05%      930ms 11.74%  github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/fingerprint.Fingerprint.StartsWith (inline)
     160ms  2.02% 44.07%      160ms  2.02%  runtime.futex
     160ms  2.02% 46.09%      160ms  2.02%  runtime.step

The next mitigation will be to reduce max_concurrent_files, since each file is read concurrently.
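
For reference, a sketch of how both mitigations could look in the receiver configuration from this issue. The values are illustrative; to my knowledge poll_interval defaults to 200ms and max_concurrent_files defaults to 1024:

receivers:
  filelog/manyfiles:
    include:
      - /tmp/logs/*
    poll_interval: 1s            # default 200ms; a longer interval means fewer scans per second
    max_concurrent_files: 128    # default 1024; limits how many files are read in parallel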

Hope this helps readers here mitigate the issue.

r0mdau avatar Nov 03 '24 00:11 r0mdau