toxiproxy icon indicating copy to clipboard operation
toxiproxy copied to clipboard

Hangs indefinitely while removing toxic

Open tekjar opened this issue 6 years ago • 3 comments

package main

import (
	"time"

	toxiproxy "github.com/Shopify/toxiproxy/client"
	"github.com/sirupsen/logrus"
)

var log = logrus.New()

func init() {
	log.SetLevel(logrus.DebugLevel)
}

type Toxic struct {
	client *toxiproxy.Client
	proxy  *toxiproxy.Proxy
}

func (toxic *Toxic) Clean() {
	log.Debugln("Cleaning Toxics ......")

	toxic.proxy.RemoveToxic("bandwidth_down")
	log.Debugln("Done cleaning ......")
}

func (toxic *Toxic) HalfOpenConnection() {
	toxic.Clean()
	log.Debugln("Applying Halfopen Connection Toxic")

	toxic.proxy.AddToxic("bandwidth_down", "bandwidth", "downstream", 1.0, toxiproxy.Attributes{
		"rate": 0,
	})
}

func main() {
	var err error

	// connect to toxiproxy server
	var client = toxiproxy.NewClient("localhost:8474")
	proxy, err := client.CreateProxy("toxicbroker", "127.0.0.1:9883", "127.0.0.1:1883")

	if err != nil {
		log.Fatalln("Couldn't create proxy client", err)
	}

	toxic := Toxic{client, proxy}

	var clean1 = time.After(1 * time.Second)
	var halfopen <-chan time.Time

	var toxicTime = 20 * time.Second

	for {
		select {
		case <-clean1:
			halfopen = time.After(toxicTime)
			toxic.Clean()
		case <-halfopen:
			clean1 = time.After(toxicTime)
			toxic.HalfOpenConnection()
		}
	}
}

The Clean methods blocks indefinitely. This is happening on both 2.1.2 and 2.1.3

tekjar avatar Apr 02 '18 08:04 tekjar

This seems to be due to the bandwidth rate being set to 0. Right now zero / negative are undefined behavior for most toxics. This should probably be tested and documented in toxiproxy. Some of the toxics can be destructive to data too when removed, which shouldn't be the case.

I believe this was one of the issues I was trying to address in my Bidirectional toxics branch. Maybe I'll spend some time updating it.

xthexder avatar Apr 02 '18 09:04 xthexder

@xthexder Thanks. I can help with testing your branch. Is there any other way to simulate half open connections?

tekjar avatar Apr 02 '18 09:04 tekjar

Hi. Any update on this?

tekjar avatar Dec 14 '18 16:12 tekjar

Hey folks! We ran frequently into this bug when using shopify/toxiproxy:2.1.4 with Testcontainers. Was this by any chance fixed with ghcr.io/shopify/toxiproxy:2.5.0? Some of the 2.5.0 changes sound at least promising regarding this bug (Add TimeoutHandler for the HTTP API server, On interrupting Bandwidth toxic, use non-blocking writes etc.).

Donnerbart avatar Oct 11 '22 14:10 Donnerbart

@Donnerbart It should be fixed in latest version by #436 .

I am going to close this issue, pls raise a new issue in case you have still problem.

miry avatar Oct 12 '22 10:10 miry

@miry Thanks for the clarification about the deadlock.

@miry @xthexder Is it still unsupported to use the bandwidth proxy with rate 0? This is what the official TestContainers container does, see https://github.com/testcontainers/testcontainers-java/blob/main/modules/toxiproxy/src/main/java/org/testcontainers/containers/ToxiproxyContainer.java#L175

We had big issues with hanging tests and created a workaround using the timeout proxy with 0. That one works fine for us, but we're not 100% sure if it's a valid replacement/use-case. If so we could provide a fix for the Testcontainers code.

Donnerbart avatar Oct 12 '22 14:10 Donnerbart

I think the timeout/non-blocking write changes have resolved the deadlocks in the bandwidth toxic. There's still some edge-case behavior to worry about if you plan on pausing a connection with bandwidth 0. Any time the bandwidth toxic is interrupted it may let some data through (Can happen when another toxic is added or removed, or the bandwidth is updated). Also if for some reason buffers get full and write timeouts start happening, the stream will be corrupted due to dropped data (This should log in the toxiproxy server at least).

If you don't care about resuming the connection after pausing, then the timeout toxic will be the better option. It will not buffer any data, and will immediately drop it, since the connection is expected to be closed after.

I'm not an active user of Toxiproxy these days, so I haven't really tested this scenario with the latest version.

xthexder avatar Oct 12 '22 16:10 xthexder