Bug: Race Condition
hi we used your library in one of our company projects. at first, we had this problem --> a few of our CCR packets were broken. for example, PDP-Address becomes 100.67.0.227 or 0.0.0.0 or 0.64.0.0 we started debugging and after a while, we tried go -race command and we got this DATA RACE:
==================
WARNING: DATA RACE
Write at 0x00c0002bae20 by goroutine 16:
runtime.slicecopy()
/usr/local/go/src/runtime/slice.go:246 +0x0
bufio.(*Reader).Read()
/usr/local/go/src/bufio/bufio.go:238 +0x2ac
io.ReadAtLeast()
/usr/local/go/src/io/io.go:314 +0x9c
io.ReadFull()
/usr/local/go/src/io/io.go:333 +0x6a4
github.com/fiorix/go-diameter/v4/diam.(*Message).readBody()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/message.go:121 +0x6b4
github.com/fiorix/go-diameter/v4/diam.ReadMessage()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/message.go:74 +0x1fd
github.com/fiorix/go-diameter/v4/diam.(*conn).readMessage()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:170 +0x36b
github.com/fiorix/go-diameter/v4/diam.(*conn).serve()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:197 +0x112
Previous read at 0x00c0002bae26 by goroutine 23:
main.isValid()
/home/amirhossein/Documents/MyProgramms/GoExamples/tools/diameter/server/diameter.go:51 +0x84
main.handleCCR()
/home/amirhossein/Documents/MyProgramms/GoExamples/tools/diameter/server/diameter.go:20 +0x176
github.com/fiorix/go-diameter/v4/diam.HandlerFunc.ServeDIAM()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:349 +0x51
github.com/fiorix/go-diameter/v4/diam.Handler.ServeDIAM-fm()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:31 +0x64
github.com/fiorix/go-diameter/v4/diam/sm.handshakeOK.ServeDIAM()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/sm/sm.go:179 +0xab
github.com/fiorix/go-diameter/v4/diam.(*ServeMux).serve()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:486 +0x457
github.com/fiorix/go-diameter/v4/diam.(*ServeMux).ServeDIAM()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:461 +0x3c6
github.com/fiorix/go-diameter/v4/diam/sm.(*StateMachine).ServeDIAM()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/sm/sm.go:116 +0x6e
github.com/fiorix/go-diameter/v4/diam.serverHandler.ServeDIAM()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:582 +0x101
github.com/fiorix/go-diameter/v4/diam.(*conn).serve()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:213 +0x133
Goroutine 16 (running) created at:
github.com/fiorix/go-diameter/v4/diam.(*Server).Serve()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:643 +0x5d2
github.com/fiorix/go-diameter/v4/diam.(*Server).ListenAndServe()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:603 +0x125
github.com/fiorix/go-diameter/v4/diam.ListenAndServeNetwork()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:657 +0x164
github.com/fiorix/go-diameter/v4/diam.ListenAndServe()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:668 +0x179
main.main()
/home/amirhossein/Documents/MyProgramms/GoExamples/tools/diameter/server/main.go:34 +0x224
Goroutine 23 (running) created at:
github.com/fiorix/go-diameter/v4/diam.(*Server).Serve()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:643 +0x5d2
github.com/fiorix/go-diameter/v4/diam.(*Server).ListenAndServe()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:603 +0x125
github.com/fiorix/go-diameter/v4/diam.ListenAndServeNetwork()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:657 +0x164
github.com/fiorix/go-diameter/v4/diam.ListenAndServe()
/root/go/pkg/mod/github.com/fiorix/go-diameter/[email protected]/diam/server.go:668 +0x179
main.main()
/home/amirhossein/Documents/MyProgramms/GoExamples/tools/diameter/server/main.go:34 +0x224
==================
We decided to divide the diam server and client into another repository (you can check it in this repository) but we still have the same problem. then we decided to debug this go-diameter library. in all of our tests and scenarios, reading a message in handler has race condition too so in my opinion there is not any problem with the unmarshal function. finally, we figure out where the bug is and how to fix it but our solution wasn't a perfect one:
// message.go line 66
func ReadMessage(reader io.Reader, dictionary *dict.Parser) (*Message, error) {
// remove pool and allocate memory
// buf := newReaderBuffer()
// defer putReaderBuffer(buf)
buf := new(bytes.Buffer)
m := &Message{dictionary: dictionary}
cmd, stream, err := m.readHeader(reader, buf)
if err != nil {
return nil, err
}
m.stream = stream
if err = m.readBody(reader, buf, cmd, stream); err != nil {
return nil, err
}
return m, nil
}
We thought the problem is from sync.pool but we were not sure so we changed more codes and refactored the whole algorithm of reading from conn. the result was the same and the problem is not from sync.pool. you can check this library_changes then open files or apply the patch. there are some changes and lots of them are because of change reading form conn algorithm but the important part is in message.go line 162, this solution fixes the problem but we still don't know why. my opinion is that we have a serious problem with bytes.buffer and []byte or maybe with golang garbage collector. please check this issue and tell us your opinion. Do you think as same as us that the problem is from this library? we have to be sure about this first then we can fix this bug and send a pull request.
I debugged this by my self and the problem is now solved, this is a simple simulated condition:
package main
import (
"bytes"
"net"
"os"
"os/signal"
)
type sampleStruct struct {
byteArray0 []byte
byteArray1 []byte
ip net.IP
}
func main() {
message := []byte{0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2}
ss := &sampleStruct{}
unmarshal(message, ss)
// reading
go func() {
for {
//if ss.ip.Equal(net.IP{10, 10, 10, 10}) {
if bytes.Equal(ss.ip, []byte{1, 1, 1}) {
}
}
}()
// writing
go func() {
for i := 0; i <= 255; i++ {
message[11] = byte(i)
}
}()
s := make(chan os.Signal, 1)
signal.Notify(s, os.Interrupt)
<-s
}
func unmarshal(data []byte, ss *sampleStruct) {
ss.byteArray0 = data[:4]
ss.byteArray1 = data[4:9]
ss.ip = data[9:]
//c := make([]byte, len(data))
//copy(c,data[9:])
//ss.ip = c
}
# save this in main.go and run by this command
go run -race main.go