melody icon indicating copy to clipboard operation
melody copied to clipboard

Fix concurrent panic

Open lesismal opened this issue 3 years ago • 10 comments

if two goroutines exec these two lines nearly the same time: https://github.com/olahol/melody/blob/master/session.go#L59 https://github.com/olahol/melody/blob/master/session.go#L24

the two goroutines will get the same opening state, and if close(s.output) exec first https://github.com/olahol/melody/blob/master/session.go#L63

case s.output <- message will panic: https://github.com/olahol/melody/blob/master/session.go#L30

try this example to reproduce this bug:

package main

import (
	"log"
	"sync"
	"time"

	"github.com/gin-gonic/gin"
	"github.com/gorilla/websocket"
	"github.com/olahol/melody"
)

func client(wg *sync.WaitGroup) {
	defer wg.Done()
	c, _, err := websocket.DefaultDialer.Dial("ws://localhost:5000/ws", nil)
	if err != nil {
		log.Fatal("dial:", err)
	}
	defer c.Close()

	text := "test"
	err = c.WriteMessage(websocket.TextMessage, []byte(text))
	if err != nil {
		log.Printf("write: %v", err)
	}

	// try to trigger that melody Session at the server side getting the same opening state before close(s.input)
	time.AfterFunc(time.Second*2, func() {
		c.Close()
	})

	for {
		_, _, err := c.ReadMessage()
		if err != nil {
			return
		}
	}
}

func main() {
	r := gin.Default()
	m := melody.New()

	r.GET("/ws", func(c *gin.Context) {
		m.HandleRequest(c.Writer, c.Request)
	})

	m.HandleMessage(func(s *melody.Session, msg []byte) {
		go func() {
			// try to trigger that getting the same opening state before s.input<-
			for s.Write(msg) == nil {
			}
		}()
	})

	go func() {
		for {
			wg := &sync.WaitGroup{}
			for i := 0; i < 20; i++ {
				wg.Add(1)
				go client(wg)
			}
			wg.Wait()
		}
	}()

	r.Run("localhost:5000")
}

then wait for enough time, we will get:

panic: send on closed channel

goroutine 2105 [running]:
github.com/olahol/melody.(*Session).writeMessage(0xc000483500, 0xc000ba55f0)
        somedir/src/github.com/olahol/melody/session.go:30 +0x6c
github.com/olahol/melody.(*Session).Write(0xc000483500, 0xc000714400, 0x4, 0x200, 0x0, 0x0)
        somedir/src/github.com/olahol/melody/session.go:149 +0x90
main.main.func2.1(0xc000483500, 0xc000714400, 0x4, 0x200)
        somedir/src/github.com/olahol/melody/examples/chat/main.go:50 +0x50
created by main.main.func2
        somedir/src/github.com/olahol/melody/examples/chat/main.go:48 +0x65

lesismal avatar May 19 '21 19:05 lesismal

Why remove the pointer of sync.RWMutex. Does this not go wrong in some cases, such as when using a shallow copy?

iwctwbai avatar Mar 09 '22 09:03 iwctwbai

I only wanted to pr this commit: https://github.com/lesismal/melody/commit/e21886f3a7b601a5509274fb14dcc66d859304a8 but when I committed some new on my branch, it was auto appended to this pr by github, I just reverted the latest and leave only https://github.com/lesismal/melody/commit/e21886f3a7b601a5509274fb14dcc66d859304a8 here.

lesismal avatar Mar 09 '22 11:03 lesismal

Why remove the pointer of sync.RWMutex. Does this not go wrong in some cases, such as when using a shallow copy?

It's not removing the pointer of sync.RWMutex.

Plz see the 1st floor, the close() doesn't guard the process of changing the conn's state and closing the output chan, some other goroutine may push data to the output chan after the chan has been closed.

And run the testing code, you will reproduce the panic.

lesismal avatar Mar 09 '22 11:03 lesismal

@lesismal The problem you are referring to does exist and you fixed it.

But I mean here https://github.com/olahol/melody/commit/c116958ce5fce3fc54e6e1f282353d294df0d547#diff-3d4a6b7d54b5107bc5694b7a06383f9ec4c68f9abe43ba0942f5d970f3a8ccacL21

iwctwbai avatar Mar 09 '22 12:03 iwctwbai

I only wanted to pr this commit: lesismal@e21886f but when I committed some new on my branch, it was auto appended to this pr by github, I just reverted the latest and leave only lesismal@e21886f here.

@iwctwbai here

lesismal avatar Mar 09 '22 12:03 lesismal

Yes, I just happened to see your code and was a little confused @lesismal

iwctwbai avatar Mar 09 '22 12:03 iwctwbai

@iwctwbai After I have made this pr, I saw this: https://github.com/olahol/melody/issues/58 , so I think this pr would not be reviewed and merged for a long time. After that I make new commit on my branch and didn't come back to see this pr again and didn't find that it was auto appended to this pr until you leave a message:joy:, thank you for your attention and review!

Hope everything is fine with olahol.

lesismal avatar Mar 09 '22 12:03 lesismal

Hope everything is fine with olahol.

iwctwbai avatar Mar 09 '22 15:03 iwctwbai

@lesismal Hello, I have an question about your commit content is not close output channel. it's no problem of send data to no read channel(output)?

my soluation is defer a recover func in writeMessage. however, this is not the best soluation.

z9905080 avatar Apr 10 '22 05:04 z9905080

Because writeMessage use select with default, then it won't block:

select {
case s.output <- message:
default:
	s.melody.errorHandler(s, errors.New("session message buffer is full"))
}

The chan will be gc even it is not closed.

So, it's ok.

lesismal avatar Apr 10 '22 06:04 lesismal

Thank you for the fix :+1: Sorry for ghosting for so long

olahol avatar Sep 14 '22 13:09 olahol

Really happy to see u again, welcome back!

lesismal avatar Sep 14 '22 14:09 lesismal

Thank you for the kind words and the help fixing this old library :slightly_smiling_face:

olahol avatar Sep 14 '22 14:09 olahol