
with empty `nodes.json` discovery only finds 4 connectable peers max, all other discovered peers have status failure

Open c0deright opened this issue 4 years ago • 21 comments

  • Parity Ethereum version: v2.5.10-stable / v2.6.5-beta / v2.2.9-stable
  • Operating system: Linux
  • Installation: binary provided by GitHub
  • Fully synchronized: yes
  • Network: ethereum
  • Restarted: yes

After upgrading from v2.2.9 to v2.6.5-stable on three different machines, everything went smoothly. Parity was configured with

[network]
min_peers = 10
max_peers = 15

and all three nodes had ~10 peers.

Today I stopped all three parity instances, deleted the files .local/share/io.parity.ethereum/chains/ethereum/network/nodes.json and /home/bitcoin/.local/share/io.parity.ethereum/network/key, and restarted parity on all three instances.

Since then, all 3 parity nodes have only 4 active peers, although their .local/share/io.parity.ethereum/chains/ethereum/network/nodes.json lists 30 peers.

Edit: 4 of these 30 peers listed in nodes.json have status "success", the other 26 have status "failure".
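The success/failure split can be checked directly with grep, as later comments in this thread do. A minimal sketch, assuming the `last_contact` layout of Parity's peer cache (the sample file below is made up; point the commands at your real nodes.json):

```shell
# Made-up nodes.json excerpt, one entry per line, just for illustration;
# a real file lives at .local/share/io.parity.ethereum/chains/ethereum/network/nodes.json
cat > /tmp/nodes-sample.json <<'EOF'
{"url":"enode://aaa@144.217.139.5:30303","last_contact":{"success":1574890000}},
{"url":"enode://bbb@203.0.113.7:30303","last_contact":{"failure":1574890000}},
{"url":"enode://ccc@203.0.113.8:30303","last_contact":{"failure":1574890000}},
EOF

# Count entries per status (one JSON object per line in this sketch)
grep -c '"success"' /tmp/nodes-sample.json   # -> 1
grep -c '"failure"' /tmp/nodes-sample.json   # -> 2
```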

All three of my nodes connect to the same 4 peers:

vm1

netstat -antp|grep ESTAB| grep parity
tcp        0      0 10.0.48.101:45796       144.217.139.5:30303     ESTABLISHED 15886/parity-binary
tcp        0      0 10.0.48.101:34206       139.99.160.213:30303    ESTABLISHED 15886/parity-binary
tcp        0      0 10.0.48.101:50084       139.99.51.203:30303     ESTABLISHED 15886/parity-binary
tcp        0      0 10.0.48.101:47736       193.70.55.37:30303      ESTABLISHED 15886/parity-binary

vm2

netstat -antp|grep ESTAB| grep parity
tcp        0      0 10.0.49.147:35002       144.217.139.5:30303     ESTABLISHED 18226/parity-binary
tcp        0      0 10.0.49.147:35298       139.99.51.203:30303     ESTABLISHED 18226/parity-binary
tcp        0      0 10.0.49.147:43032       193.70.55.37:30303      ESTABLISHED 18226/parity-binary
tcp        0      0 10.0.49.147:51090       139.99.160.213:30303    ESTABLISHED 18226/parity-binary

vm3

netstat -antp|grep ESTAB| grep parity
tcp        0      0 10.0.49.140:52896       144.217.139.5:30303     ESTABLISHED 3939/parity-binary
tcp        0      0 10.0.49.140:42378       193.70.55.37:30303      ESTABLISHED 3939/parity-binary
tcp        0      0 10.0.49.140:54112       139.99.51.203:30303     ESTABLISHED 3939/parity-binary
tcp        0      0 10.0.49.140:32954       139.99.160.213:30303    ESTABLISHED 3939/parity-binary

I downgraded parity from v2.6.5-beta to v2.5.10-stable and then to v2.2.9-stable, and tested both with the existing nodes.json (30 peers listed) and with nodes.json deleted. The result is the same: only 4 connections, to the peers mentioned above, although configured for more ~~and more listed in nodes.json~~ (edited)

Could someone provide a nodes.json file from a well-connected peer?

I even stopped parity, edited nodes.json to remove the 4 peers mentioned above, saved it, and restarted parity. Same result. ~~It looks like parity is NOT using peers from nodes.json.~~ (edited)

c0deright avatar Nov 27 '19 21:11 c0deright

I just now noticed that only the 4 peers my nodes connect to have "last_contact": "success" in nodes.json; all others have "failure".

Trying telnet against the IPs and ports listed with "failure" gives me telnet: Unable to connect to remote host: Connection refused.
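The telnet check can be scripted for a whole list of peers. A rough sketch, using bash's /dev/tcp (the host and port below are placeholders, not real peers; substitute addresses from your nodes.json):

```shell
# Probe a TCP host:port and report whether a connection can be opened,
# mirroring the manual telnet check. Requires bash for /dev/tcp.
probe() {
  if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 reachable"
  else
    echo "$1:$2 unreachable"
  fi
}

# Placeholder target: port 1 on localhost is almost certainly closed,
# so this should print "127.0.0.1:1 unreachable"
probe 127.0.0.1 1
```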

Is there some sort of attack running against the ethereum network?

c0deright avatar Nov 27 '19 21:11 c0deright

nodes.json from one of my VMs.

c0deright avatar Nov 27 '19 21:11 c0deright

@c0deright this sounds like a discovery issue if you are trying with a really old version and still seeing the same results - are these vms all colocated?

jam10o-new avatar Nov 27 '19 21:11 jam10o-new

@joshua-mir Please see my comment above with my nodes.json file. Discovery looks like it's working but all but 4 peers are non-functional.

My VMs in question all run on AWS.

Could you provide the nodes.json file from a well connected node?

c0deright avatar Nov 27 '19 21:11 c0deright

I edited nodes.json again and only left the 4 working nodes in. Then started parity. Waited some minutes. 4/10 peers. Stopped parity. Looked in nodes.json again. The 4 working nodes had "success" and were listed on top, then 26 nodes followed with "last_contact" "failure", like in nodes.json in my comment above.

I think discovery is working, but it mostly discovers non-connectable peers. Very strange.

c0deright avatar Nov 27 '19 21:11 c0deright

Here is a nodes.json file from 10 mins of operation right now :neutral_face:

Can you double check your security group settings?

jam10o-new avatar Nov 27 '19 21:11 jam10o-new

I haven't changed anything in the security groups in ages. The VMs in question have no restrictions on outgoing traffic. They go out over a NAT gateway, so my VMs have no open ports for inbound connections: my VMs can initiate connections to others, but others can't initiate connections to my parity nodes.

With your nodes.json it's working right now. Will report back in a couple of hours.

c0deright avatar Nov 27 '19 21:11 c0deright

right, I still find it strange that you would be seeing a change in behavior now..

jam10o-new avatar Nov 27 '19 21:11 jam10o-new

Right now one has 8 peers, the other two have 9 peers. I have a feeling that over time the peer count will only decrease but never increase. But we'll see. Will report back in ~10 hours.

c0deright avatar Nov 27 '19 22:11 c0deright

Thanks for your quick help, @joshua-mir

What are the DNS names parity tries to connect to when it has an empty nodes.json?

c0deright avatar Nov 27 '19 22:11 c0deright

shouldn't be DNS names - it grabs the bootnodes defined in the chainspec for the chain you are using
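For reference, the chainspec's built-in bootnodes can also be overridden from config.toml if you need to point discovery somewhere specific. A sketch (the enode URL below is a placeholder, not a real node):

```toml
# config.toml sketch: override the chainspec's built-in bootnodes.
# The enode URL is a placeholder only.
[network]
bootnodes = ["enode://<node-id>@203.0.113.10:30303"]
```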

jam10o-new avatar Nov 27 '19 22:11 jam10o-new

@joshua-mir Since starting my 3 VMs with your nodes.json file they have been running with 7-10 peers constantly.

I still find the distribution between working peers and failing peers appalling:

vm1

% grep -c success $NODES_JSON
11
% grep -c failure $NODES_JSON
1012

vm2

% grep -c success $NODES_JSON
11
% grep -c failure $NODES_JSON
1013

vm3

% grep -c success $NODES_JSON
9
% grep -c failure $NODES_JSON
1013
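The handful of working entries can be pulled out of nodes.json with grep. A sketch on made-up sample data (real files hold one JSON object per peer with a full enode URL; point the commands at your real file):

```shell
# Made-up nodes.json excerpt for illustration; substitute your real file.
cat > /tmp/nodes-sample.json <<'EOF'
{"url":"enode://aaa@144.217.139.5:30303","last_contact":{"success":1}},
{"url":"enode://bbb@203.0.113.7:30303","last_contact":{"failure":1}},
{"url":"enode://ccc@193.70.55.37:30303","last_contact":{"success":1}},
EOF

# Keep only lines whose last_contact was a success, then extract the
# enode URLs - one per line, e.g. usable as a reserved-peers list.
grep '"success"' /tmp/nodes-sample.json | grep -o 'enode://[^"]*' > /tmp/good-peers.txt
cat /tmp/good-peers.txt
```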

A low peer count by itself is no problem, but in the last couple of days we had issues with outgoing TXs not being visible on the blockchain for several hours. The raw TX could always be broadcast via etherscan.org, but e.g. yesterday evening we sent out a TX that was not visible on the blockchain for ~3 hours.

I just now changed min/max to 15/15, and my VMs now have 13-15 peers.

Will test later with empty nodes.json again to see if discovery works fine.

c0deright avatar Nov 29 '19 10:11 c0deright

I actually have an identical issue:

I newly set up a machine using 2.5.10 on Ubuntu Linux with the binary provided on GitHub, syncing from scratch with tracing enabled. I am only able to connect to one peer, and after a while it falls to 0.

With your nodes.json (@joshua-mir) it actually works (8-10 peers). Thanks for that. (But that can't be a solution either)

edit: Ubuntu is running within the Windows Subsystem for Linux; however, using the provided Windows .exe leads to the same result.

UliGall avatar Dec 04 '19 08:12 UliGall

agree, there's clearly something up with connections being accepted by peers after discovery.

jam10o-new avatar Dec 04 '19 09:12 jam10o-new

have the same issue using v2.5.11

or2008 avatar Dec 10 '19 07:12 or2008

@c0deright can you give an update on your current situation? Did you manage to find a solution/workaround?

On my machine I have at most 4 peers. I tried reinstalling parity and syncing the chain again from scratch; that didn't help. I also tried removing nodes.json, with no luck either.


or2008 avatar Dec 13 '19 08:12 or2008

@or2008

My workaround was to use the nodes.json provided by @joshua-mir and to never delete it again after that.

You can try with my file in the meantime: nodes.json.txt
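A related workaround that may be less fragile than hand-seeding nodes.json: pin known-good peers via Parity's reserved-peers mechanism. A config sketch (the path is an example; the referenced file holds one enode URL per line):

```toml
# config.toml sketch (example path). The referenced file should contain
# one enode URL per line; parity will try to keep these peers connected.
[network]
min_peers = 10
max_peers = 15
reserved_peers = "/home/user/reserved-peers.txt"
```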

Could we please remove the unconfirmed flag since this issue clearly affects several people?

c0deright avatar Dec 13 '19 11:12 c0deright

@c0deright Thanks, I replaced the old nodes.json with yours, but when I start parity again, it overwrites this file with bad nodes (only 4 nodes with success); any idea why?

or2008 avatar Dec 13 '19 14:12 or2008

Just chiming in that I've been experiencing this, and using Joshua's nodes.txt I actually got a reasonable number of peers (was stuck on 1-2, now 30+). I had only seen this on mainnet, not on goerli with the same setup, if that's helpful for someone investigating.

area avatar Dec 16 '19 12:12 area

Had the same issue. Was warp syncing a new system and it was stuck with 2 peers. I had finished the snapshot sync and ancient blocks were over 5000000. Current blocks would sync but ancient blocks would not. Tried deleting nodes.json. That didn't work. Using nodes.json from @joshua-mir unblocked things and now peers are back up to 24 and ancient blocks are syncing. This is v2.5.13-stable.

Ancient block syncing previously stopped at block #5551719. Adding additional logging showed this:

2020-01-14 11:48:43 IO Worker #1 TRACE sync OldBlocks: No useful subchain heads received, expected hash 0x609f0cb4ed49d7fa9a667d3190eb63339da019bd66964e0c0520009db54cf525
2020-01-14 11:48:43 IO Worker #1 TRACE sync OldBlocks: Expected some useful headers for downloading OldBlocks. Try a different peer

strickon avatar Jan 14 '20 17:01 strickon

Any news on this subject? I am having the same issue here (only 2 connected nodes, out of a max of 50).

bfgasparin avatar Jun 01 '20 13:06 bfgasparin