truechain-consensus-core icon indicating copy to clipboard operation
truechain-consensus-core copied to clipboard

fix some issue that some var use without lock

Open iamyh opened this issue 7 years ago • 10 comments

解决了以下的问题:

  1. 一些变量读写并没有加锁保护
  2. 一些地方出现error,应该return而没有return
  3. 建议用同一的方法打日志,而不是fmt.Printf和自定义方法MyPrint混用(可能还存在其他地方)

一些建议:

  1. node那个struct由于有好几个属性变量,但是node只有一个lock,是否可以按照成员属性变量对于有一个锁,进而提高效率?
  2. 建议对属性变量的修改,涉及到锁保护的,都封装为函数,尽可能很简短的函数。函数里面统一加上mu.lock()和defer mu.Unlock(),否则非常容易出现死锁,尤其在node.go那个文件,里面有些函数比较长的情况。

iamyh avatar May 03 '18 02:05 iamyh

let me take a look. Could you please use English for future communication? thanks.

arcolife avatar May 04 '18 20:05 arcolife

ok.

iamyh avatar May 05 '18 06:05 iamyh

oh just checked, once in about 5 times, for i in {1..5}; do go build engine.go && ./engine ; done ..this would still error out with:

...
<snip>
...

goroutine 1535 [semacquire]:
sync.runtime_SemacquireMutex(0xc42021447c, 0xc420374500)
        /usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc420214478)
        /usr/local/go/src/sync/mutex.go:134 +0x108
_/workspace/truechain-consensus-core/pbft-core.(*Node).incCommDict(0xc420214400, 0xc420899380, 0x80)
        /workspace/truechain-consensus-core/pbft-core/node.go:690 +0x3a
_/workspace/truechain-consensus-core/pbft-core.(*Node).processCommit(0xc420214400, 0x1, 0x1b, 0x0, 0x2, 0xc420899380, 0x80, 0xffffffffffffffff, 0xc420899400, 0x80, ...)
        /workspace/truechain-consensus-core/pbft-core/node.go:838 +0xa3
_/workspace/truechain-consensus-core/pbft-core.(*Node).ProxyProcessCommit(0xc420214400, 0x1, 0x1b, 0x0, 0x2, 0xc420899380, 0x80, 0xffffffffffffffff, 0xc420899400, 0x80, ...)
        /workspace/truechain-consensus-core/pbft-core/node.go:456 +0x156
reflect.Value.call(0xc42012cc60, 0xc420248010, 0x13, 0x138a9ed, 0x4, 0xc420239f18, 0x3, 0x3, 0xc4202f3e00, 0x0, ...)
        /usr/local/go/src/reflect/value.go:447 +0x969
reflect.Value.Call(0xc42012cc60, 0xc420248010, 0x13, 0xc42077af18, 0x3, 0x3, 0x0, 0x0, 0x0)
        /usr/local/go/src/reflect/value.go:308 +0xa4
net/rpc.(*service).call(0xc420242040, 0xc420244000, 0xc4200182e0, 0xc4200182f0, 0xc42024a100, 0xc4202b1b60, 0x1334280, 0xc4208803c0, 0x199, 0x12f6400, ...)
        /usr/local/go/src/net/rpc/server.go:384 +0x14e
created by net/rpc.(*Server).ServeCodec
        /usr/local/go/src/net/rpc/server.go:480 +0x43a

i'll take a stab at it! https://github.com/truechain/truechain-consensus-core/blob/devel/pbft-core/node.go#L504-L507

arcolife avatar May 15 '18 12:05 arcolife

which branch is it? i can not match the code.In my env,i just run go build engine.go && ./engine, it is ok.

iamyh avatar May 18 '18 07:05 iamyh

@iamyh yea but did you run for i in {1..5}; do go build engine.go && ./engine ; done and see if it fails even once? I've checked out your PR locally and ran this.

arcolife avatar May 23 '18 08:05 arcolife

included in https://github.com/truechain/truechain-consensus-core/commit/50cfb07f880c04dd42d1e06146e9ed7a5870380b while merging devel.

Please, let's rebase and continue the discusison

arcolife avatar May 24 '18 01:05 arcolife

@arcolife hey,when i run go build engine.go and for i in {1..5}; do ./engine>> log 2>&1 ; with master branch,it will also come out 'goroutine stack error'.

when i search log file with keyword goroutine, it show 'concurrent map writes'. it means there is so much bug in node.go,such as

func (nd *Node) newClientRequest(req Request, clientId int) { // TODO: change to single arg and single reply
	if v, ok := nd.active[req.Dig]; ok {

should be locked like:

var v ActiveItem
var ok bool
nd.mu.Lock()
v, ok = nd.active[req.Dig]
nd.mu.Unlock()

there are so many code without lock.please check it carefully.because node only has one lock mu,it will result in performance down,that is why i advise use muti lock and readwrite lock。

iamyh avatar May 24 '18 03:05 iamyh

@iamyh , hi , could you please try with sync.Map to replace the nd.active map. https://golang.org/pkg/sync.

please make sure your golang version is above 1.9.

hixichen avatar Jun 03 '18 07:06 hixichen

@hixichen my go version is go version go1.9.2 darwin/amd64

iamyh avatar Jun 03 '18 23:06 iamyh

@iamyh Hi could you please join the invite from https://gitter.im/truechain-net/engg-foss-global thanks

arcolife avatar Jun 05 '18 12:06 arcolife