amqp091-go

100% CPU usage

Open pwlb opened this issue 3 years ago • 5 comments

Hello

package main

import (
	"context"
	"github.com/rabbitmq/amqp091-go"
	"sync"
	"time"
)

var wg sync.WaitGroup

func amqp(ctx context.Context) {
	defer func() {
		wg.Done()
	}()
	c, err := amqp091.Dial("amqp://guest:[email protected]:5672/")
	if err != nil {
		panic("connection error")
	}
	defer c.Close()

	<-ctx.Done()
}

const n = 16

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	wg.Add(n)
	for i := 0; i < n; i++ {
		go amqp(ctx)
	}

	cancel()
	wg.Wait()

	time.Sleep(time.Hour)
}

The code above causes 100% CPU usage after a short while. The problem does not always occur, so you may have to run it several times to reproduce it. Here is the strace summary captured after the problem occurred:

strace: Process 3167635 attached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100,00    4,571892     4571892         1         1 futex
  0,00    0,000010           5         2           rt_sigprocmask
  0,00    0,000002           1         2           getpid
  0,00    0,000001           0         2           gettid
  0,00    0,000001           0         2           tgkill
  0,00    0,000000           0         1           rt_sigaction
------ ----------- ----------- --------- --------- ----------------
100.00    4,571906                    10         1 total

pwlb, Jul 15 '22 21:07

Hi, we could use a bit more information:

  • How did you discover this issue?
  • Does it affect real-world use of this library? This is a pretty artificial scenario.
  • Do you have time to investigate it yourself?

lukebakken, Jul 16 '22 19:07

While developing an application that has long sleeps in it, I encountered this problem. I eliminated all the external libs one by one, until the connection to rabbit alone remained, and the problem continued.

I wanted to write the simplest and shortest possible code to reproduce the problem, and it turned out that just opening and closing the connection was enough. You can add opening a channel, publishing a message and closing the channel, and the problem will still occur, so I would not call it an "artificial scenario".

Unfortunately, I do not have time to look for the cause myself.

pwlb, Jul 20 '22 23:07

> so I would not call it an "artificial scenario"

Now that we have an explanation it's not so artificial 😉

lukebakken, Jul 21 '22 14:07

https://stackoverflow.com/questions/57717635/golang-for-select-blows-up-cpu

lukebakken, Jul 21 '22 14:07
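For readers who do not follow the link: the question title points at a general Go pitfall, where a for/select loop keeps waking up because one of its cases is always ready. One common way this happens is selecting on a channel that has already been closed. The sketch below illustrates that pitfall in isolation; it uses only the standard library, all names in it are made up for illustration, and it is not claimed to be the actual code path inside amqp091-go.

```go
package main

import (
	"fmt"
	"time"
)

// spinUnchecked counts how often the receive case fires within d when the
// loop never checks whether the channel has been closed. A receive on a
// closed channel returns immediately, so this loop never blocks.
func spinUnchecked(ch <-chan struct{}, d time.Duration) int {
	deadline := time.After(d)
	n := 0
	for {
		select {
		case <-ch:
			n++ // fires on every iteration once ch is closed
		case <-deadline:
			return n
		}
	}
}

// spinChecked does the same, but disables the case once the channel is
// closed by setting the local variable to nil (a nil channel blocks forever).
func spinChecked(ch <-chan struct{}, d time.Duration) int {
	deadline := time.After(d)
	n := 0
	for {
		select {
		case _, ok := <-ch:
			n++
			if !ok {
				ch = nil // stop selecting on the closed channel
			}
		case <-deadline:
			return n
		}
	}
}

// demoSpin runs both loops against an already closed channel.
func demoSpin() (unchecked, checked int) {
	ch := make(chan struct{})
	close(ch)
	d := 20 * time.Millisecond
	return spinUnchecked(ch, d), spinChecked(ch, d)
}

func main() {
	u, c := demoSpin()
	fmt.Println("unchecked wakeups:", u)
	fmt.Println("checked wakeups:", c)
}
```

The unchecked loop burns a full core for as long as it runs, while the checked loop wakes once and then sleeps until the deadline.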

The problem stems from the goroutine used for a connection exiting before the connection is established. We don't have code (yet) to take into account scenarios where a connection is abruptly closed.

A workaround is to always give the connection time to establish before attempting to cancel it: https://github.com/lukebakken/amqp091-go_gh-103/blob/main/main.go#L18-L19

lukebakken, Jul 21 '22 15:07
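One possible shape for the workaround described above, as a minimal sketch rather than the code in the linked file: wait until every goroutine's dial call has returned, allow a short grace period, and only then cancel. The names `dialFunc`, `runWorkers`, `fakeConn`, `fakeDial` and the grace period are assumptions made for this illustration. The fake dialer stands in for something like `func() (io.Closer, error) { return amqp091.Dial(url) }`, so the sketch runs without a broker.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"sync"
	"time"
)

// dialFunc is a hypothetical stand-in for a wrapped amqp091.Dial call.
type dialFunc func() (io.Closer, error)

// runWorkers starts n goroutines that each open a connection and hold it
// until cancellation. It cancels only after every dial has returned and a
// grace period has passed, then reports how many connections closed cleanly.
func runWorkers(n int, dial dialFunc, grace time.Duration) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var ready, done sync.WaitGroup
	var mu sync.Mutex
	closed := 0

	ready.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			c, err := dial()
			ready.Done() // the dial attempt has finished, successfully or not
			if err != nil {
				return
			}
			<-ctx.Done()
			if c.Close() == nil {
				mu.Lock()
				closed++
				mu.Unlock()
			}
		}()
	}

	ready.Wait()      // no goroutine is still in the middle of dialing
	time.Sleep(grace) // assumed settling time before cancelling
	cancel()
	done.Wait()
	return closed
}

// fakeConn and fakeDial simulate a connection that takes a moment to open.
type fakeConn struct{}

func (fakeConn) Close() error { return nil }

func fakeDial() (io.Closer, error) {
	time.Sleep(5 * time.Millisecond)
	return fakeConn{}, nil
}

// demo runs the sketch with 16 workers, matching n in the report above.
func demo() int {
	return runWorkers(16, fakeDial, 10*time.Millisecond)
}

func main() {
	fmt.Println("connections closed:", demo())
}
```

Waiting on an explicit readiness signal avoids guessing how long a dial takes; the extra grace period is a guess and may be unnecessary or too short in a real deployment.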