Consistent FSM errors during multi-peer docker-compose startup
Starting multiple peers using docker-compose consistently yields FSM errors. I'm not sure whether these are serious, since the peers usually continue to function. This run produced more errors than usual.
[bcbrock@arlab178]$ dcup dc_nosec_classic_4.yml
Creating obc_vp0_1
Creating obc_vp1_1
Creating obc_vp3_1
Creating obc_vp2_1
[bcbrock@arlab178]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ccac5a65b753 openchain-peer "sh -c 'obc-peer peer" 3 seconds ago Up 2 seconds obc_vp2_1
d3ef7cffa6af openchain-peer "sh -c 'obc-peer peer" 3 seconds ago Up 3 seconds obc_vp3_1
d97917954d73 openchain-peer "sh -c 'obc-peer peer" 4 seconds ago Up 3 seconds obc_vp1_1
b7c25e33acbf openchain-peer "sh -c 'obc-peer peer" 4 seconds ago Up 4 seconds obc_vp0_1
[bcbrock@arlab178]$ grep ERRO /home/bcbrock/vp*_logs/stderr
[bcbrock@arlab178]$ grep ERRO /home/bcbrock/vp*_logs/stderr
/home/bcbrock/vp1_logs/stderr:18:58:28.310 [peer] handleChat -> ERRO 054 Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp3" 172.17.0.3:30303 VALIDATOR }
/home/bcbrock/vp1_logs/stderr:18:58:28.672 [peer] handleChat -> ERRO 05c Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp2" 172.17.0.4:30303 VALIDATOR }
/home/bcbrock/vp2_logs/stderr:18:58:28.672 [peer] handleChat -> ERRO 057 Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp3" 172.17.0.3:30303 VALIDATOR }
/home/bcbrock/vp2_logs/stderr:18:58:28.672 [peer] handleChat -> ERRO 05c Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp1" 172.17.0.2:30303 VALIDATOR }
/home/bcbrock/vp3_logs/stderr:18:58:28.310 [peer] handleChat -> ERRO 053 Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp1" 172.17.0.2:30303 VALIDATOR }
/home/bcbrock/vp3_logs/stderr:18:58:28.672 [peer] handleChat -> ERRO 05c Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp2" 172.17.0.4:30303 VALIDATOR }
The run script, docker-compose files, and full stderr logs are attached in the ZIP logs.zip
I am running in a non-Vagrant environment on an x86 server. Note that each peer logs to a like-named file; e.g., the first error above shows peer 1 (vp1) reporting an error concerning peer 3 (vp3).
I'm getting similar errors:
vp2_1 | 12:44:48.003 [chaincode] NewChaincodeSupport -> INFO 006 Chaincode support using peerAddress: 172.17.0.4:30303
vp2_1 | 12:44:48.004 [rest] StartOpenchainRESTServer -> INFO 007 Initializing the REST service...
vp2_1 | 12:44:48.004 [main] serve -> INFO 008 Starting peer with id=name:"vp2" , network id=dev, address=172.17.0.4:30303, discovery.rootnode=vp0:30303, validator=true
vp2_1 | 12:44:48.004 [genesis] func1 -> INFO 009 Creating genesis block.
vp2_1 | 12:44:48.005 [genesis] func1 -> INFO 00a No genesis block chaincodes defined.
vp2_1 | 12:44:55.660 [peer] handleChat -> ERRO 00b Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp3" 172.17.0.5:30303 VALIDATOR }
vp0_1 | 12:44:47.137 [main] serve -> INFO 001 Security enabled status: false
vp0_1 | 12:44:47.137 [main] serve -> INFO 002 Privacy enabled status: false
vp0_1 | 12:44:47.138 [eventhub_producer] start -> INFO 003 event processor started
vp0_1 | 12:44:47.317 [buckettree] initConfig -> INFO 004 configs passed during initialization = map[string]interface {}{"numBuckets":10009, "maxGroupingAtEachLevel":10}
vp0_1 | 12:44:47.317 [buckettree] initConfig -> INFO 005 Initializing bucket tree state implemetation with configurations &{maxGroupingAtEachLevel:10 lowestLevel:5 levelToNumBucketsMap:map[2:11 1:2 0:1 5:10009 4:1001 3:101] hashFunc:0x2aa320a7650}
vp0_1 | 12:44:47.318 [chaincode] NewChaincodeSupport -> INFO 006 Chaincode support using peerAddress: 172.17.0.3:30303
vp0_1 | 12:44:47.319 [rest] StartOpenchainRESTServer -> INFO 007 Initializing the REST service...
vp0_1 | 12:44:47.319 [main] serve -> INFO 008 Starting peer with id=name:"vp0" , network id=dev, address=172.17.0.3:30303, discovery.rootnode=, validator=true
vp0_1 | 12:44:47.319 [genesis] func1 -> INFO 009 Creating genesis block.
vp0_1 | 12:44:47.319 [genesis] func1 -> INFO 00a No genesis block chaincodes defined.
vp0_1 | 12:44:47.877 [peer] handleChat -> ERRO 00b Error during Chat, stopping handler: connection error: desc = "transport is closing"
vp0_1 | 12:44:48.567 [peer] handleChat -> ERRO 00c Error during Chat, stopping handler: connection error: desc = "transport is closing"
obcca_1 | INFO: 2016/05/10 12:44:46 CA Server (0.1)
Looks like my peers are active though. Any ideas? I'm new to this.
@muralisrini or @jeffgarratt - I figure you are the most familiar with the FSM. Could you take a look?
@bcbrock @srderson @muralisrini @tdietter This is expected behavior when the system is started quickly in this way. It is simply the nodes attempting to connect to each other and establish the p2p network during the discovery process. In this case, the duplicate handler simply indicates that an attempt was made to open 2 channels between the same peers, which is not allowed. It should be innocuous to the function of the network.
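To make that concrete, here is a minimal, self-contained Go sketch of the mechanism; the names (PeerID, registry, RegisterHandler) are illustrative assumptions, not the actual fabric code. When two peers dial each other at roughly the same time during discovery, the first registration for a given remote peer wins and the second is rejected, which is what surfaces as the "Duplicate Handler" error in the logs:

```go
package main

import (
	"fmt"
	"sync"
)

// PeerID identifies a remote peer; a hypothetical stand-in for the real
// peer identity type.
type PeerID struct {
	Name    string
	Address string
}

// registry tracks one open chat channel per remote peer.
type registry struct {
	mu       sync.Mutex
	handlers map[PeerID]bool
}

// RegisterHandler succeeds for the first channel to a given peer and
// rejects any second, concurrent attempt -- the "Duplicate Handler" case.
func (r *registry) RegisterHandler(id PeerID) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.handlers[id] {
		return fmt.Errorf("Error registering Handler: Duplicate Handler error: {name:%q %s VALIDATOR}", id.Name, id.Address)
	}
	r.handlers[id] = true
	return nil
}

func main() {
	r := &registry{handlers: make(map[PeerID]bool)}
	vp3 := PeerID{Name: "vp3", Address: "172.17.0.3:30303"}

	// vp1 dials vp3 while vp3 is simultaneously dialing vp1: the first
	// registration wins, the second is rejected as a duplicate.
	fmt.Println(r.RegisterHandler(vp3)) // <nil>
	fmt.Println(r.RegisterHandler(vp3)) // Duplicate Handler error
}
```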
@jeffgarratt The problem is that it is unsettling to see ERRO in the logs; it naturally triggers an issue to be opened (e.g., #1468). If this is really innocuous, can the level be reduced to INFO with a less-scary log message?
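To make the request concrete, something along these lines is what I mean; ErrDuplicateHandler and logChatError are hypothetical names for illustration only, not real fabric identifiers:

```go
package main

import (
	"errors"
	"log"
)

// ErrDuplicateHandler is an assumed sentinel for the benign duplicate-
// registration case; the real code may not expose one.
var ErrDuplicateHandler = errors.New("duplicate handler")

// logChatError logs the expected duplicate case at INFO with a calmer
// message and reserves ERRO for genuinely unexpected failures.
func logChatError(peerName string, err error) {
	if errors.Is(err, ErrDuplicateHandler) {
		log.Printf("INFO: already connected to %s during discovery, ignoring redundant channel", peerName)
		return
	}
	log.Printf("ERRO: Error handling message from %s: %v", peerName, err)
}

func main() {
	logChatError("vp3", ErrDuplicateHandler)                // expected at startup -> INFO
	logChatError("vp2", errors.New("transport is closing")) // still ERRO
}
```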
see also #1418
Met exactly the same issue, and I agree with @bcbrock that the FSM error should be reduced to INFO level if it is expected...
I'm facing the same issue. I tried starting the instances one by one, 30s apart, but I'm still getting the error. Any ideas for a workaround?