toxiproxy
toxiproxy copied to clipboard
Hangs indefinitely while removing toxic
package main
import (
"time"
toxiproxy "github.com/Shopify/toxiproxy/client"
"github.com/sirupsen/logrus"
)
var log = logrus.New()
func init() {
log.SetLevel(logrus.DebugLevel)
}
type Toxic struct {
client *toxiproxy.Client
proxy *toxiproxy.Proxy
}
func (toxic *Toxic) Clean() {
log.Debugln("Cleaning Toxics ......")
toxic.proxy.RemoveToxic("bandwidth_down")
log.Debugln("Done cleaning ......")
}
func (toxic *Toxic) HalfOpenConnection() {
toxic.Clean()
log.Debugln("Applying Halfopen Connection Toxic")
toxic.proxy.AddToxic("bandwidth_down", "bandwidth", "downstream", 1.0, toxiproxy.Attributes{
"rate": 0,
})
}
func main() {
var err error
// connect to toxiproxy server
var client = toxiproxy.NewClient("localhost:8474")
proxy, err := client.CreateProxy("toxicbroker", "127.0.0.1:9883", "127.0.0.1:1883")
if err != nil {
log.Fatalln("Couldn't create proxy client", err)
}
toxic := Toxic{client, proxy}
var clean1 = time.After(1 * time.Second)
var halfopen <-chan time.Time
var toxicTime = 20 * time.Second
for {
select {
case <-clean1:
halfopen = time.After(toxicTime)
toxic.Clean()
case <-halfopen:
clean1 = time.After(toxicTime)
toxic.HalfOpenConnection()
}
}
}
The Clean
methods blocks indefinitely. This is happening on both 2.1.2 and 2.1.3
This seems to be due to the bandwidth rate being set to 0. Right now zero / negative are undefined behavior for most toxics. This should probably be tested and documented in toxiproxy. Some of the toxics can be destructive to data too when removed, which shouldn't be the case.
I believe this was one of the issues I was trying to address in my Bidirectional toxics branch. Maybe I'll spend some time updating it.
@xthexder Thanks. I can help with testing your branch. Is there any other way to simulate half open connections?
Hi. Any update on this?
Hey folks! We ran frequently into this bug when using shopify/toxiproxy:2.1.4
with Testcontainers. Was this by any chance fixed with ghcr.io/shopify/toxiproxy:2.5.0
? Some of the 2.5.0 changes sound at least promising regarding this bug (Add TimeoutHandler for the HTTP API server
, On interrupting Bandwidth toxic, use non-blocking writes
etc.).
@Donnerbart It should be fixed in latest version by #436 .
I am going to close this issue, pls raise a new issue in case you have still problem.
@miry Thanks for the clarification about the deadlock.
@miry @xthexder Is it still unsupported to use the bandwidth proxy with rate 0
? This is what the official TestContainers container does, see https://github.com/testcontainers/testcontainers-java/blob/main/modules/toxiproxy/src/main/java/org/testcontainers/containers/ToxiproxyContainer.java#L175
We had big issues with hanging tests and created a workaround using the timeout proxy with 0
. That one works fine for us, but we're not 100% sure if it's a valid replacement/use-case. If so we could provide a fix for the Testcontainers code.
I think the timeout/non-blocking write changes have resolved the deadlocks in the bandwidth toxic. There's still some edge-case behavior to worry about if you plan on pausing a connection with bandwidth 0. Any time the bandwidth toxic is interrupted it may let some data through (Can happen when another toxic is added or removed, or the bandwidth is updated). Also if for some reason buffers get full and write timeouts start happening, the stream will be corrupted due to dropped data (This should log in the toxiproxy server at least).
If you don't care about resuming the connection after pausing, then the timeout toxic will be the better option. It will not buffer any data, and will immediately drop it, since the connection is expected to be closed after.
I'm not an active user of Toxiproxy these days, so I haven't really tested this scenario with the latest version.