Dispersy walker has fewer candidates than expected (2)
(This issue is a duplicate of #38, which was closed after a bug that caused the trackers to stop responding was fixed.)
We would expect more candidates sooner. Especially the AllChannelCommunity takes much longer to obtain a good number of candidates than we would expect.
Strangely enough the walk success rate is relatively high (around 90%). This contradicts the lack of available candidates.
This behavior must be either solved or explained. Please investigate.
The behavior of Tribler improved on my home Ubuntu box. But the search community is still not bootstrapping.
UDP connectable peers seem to run fine, see http://jenkins.tribler.org/jenkins/job/Test_tribler_devel/73/
When behind a NAT, a lot of walk messages get lost. Do the logs indicate an incoming message from the IPv4 addresses listed in the screenshots below? These should be connectable, but there are frequent "walk_fail" problems. This test was conducted between 12:00 and 13:00 on Saturday 22 June.
Another IPv4 address (tethering with 3G) gives the same problems.
There are 81 entries between 12:24 and 13:00. With the exception of one entry, they are all for the BarterCommunity. Later in the log are entries for the other communities as well.
I have a natted virtualbox that seems to have similar low connectability issues. I'll continue to test from there.
The screenshots above were repeated with the current branch. The Dispersy AllChannel community works. However, the Search community fails to work.
The "walk_fail" shows my computer cannot connect to: asmat.das2.ewi.tudelft.nl, kayapo.das2.ewi.tudelft.nl, superpeer9.das2.ewi.tudelft.nl, and om.cs.vu.nl.
With logging we can determine if either the tracker, the client or both are at fault.
I found a possible explanation for this bug. The CommunityStatisctics class was using yield_iter_categories, which filtered out all introduced candidates. Pull request #77 seems to fix it. Since applying this change, I have never seen fewer than 17 candidates in the allchannel or searchcommunity. Usually, both hover around the 20 mark.
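To make the miscount concrete, here is a minimal sketch of how leaving out introduced candidates lowers the reported number. This is not Dispersy's actual code: `count_candidates` and the plain string categories are hypothetical, and it only assumes that every candidate falls into one of walk, stumble, intro, or none.

```python
from collections import Counter

# Hypothetical category names, following the ones used in this thread.
ACTIVE_CATEGORIES = ("walk", "stumble", "intro")


def count_candidates(categories, include_intro=True):
    """Return (total, per_category) for an iterable of category strings.

    With include_intro=False, introduced candidates are dropped from the
    count, which mimics the under-reporting described above.
    """
    wanted = set(ACTIVE_CATEGORIES)
    if not include_intro:
        wanted.discard("intro")
    per_category = Counter(c for c in categories if c in wanted)
    return sum(per_category.values()), dict(per_category)
```

With a community that mostly holds introduced candidates, the two counts differ a lot, even though the walker itself is behaving the same in both cases.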
Only the "timeout_adjustment" property in candidates.py still seems to influence the number of candidates reported. During startup, I regularly see behavior similar to:
8 candidates, 7 candidates, 9 candidates
which I feel is caused by this timeout_adjustment property.
Last week, the new test_overlay.py script was still reporting drops in candidates to as low as 4 at times. This was using the fixed community.dispersy_yield_candidates(), i.e. the one returning walk, stumble, and intro candidates.
#77 does make the problem less 'severe' as the GUI will now include intro candidates in the count as well. But the problem isn't solved yet.
As for the timeout_adjustment property, this should cause a candidate that we walk towards to get category 'none' until the intro response is received. As this is not immediately clear, I suggest we define the exact behavior we want and clean this up with https://github.com/Tribler/dispersy/issues/68.
Could you explain to me why a candidate to which we have just sent an introduction-request should not be in the walk category? To me, it makes sense to prevent it from being walked to again using is_eligable_for_walk, but removing it from the walk category does not.
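A rough sketch of the separation I have in mind, under stated assumptions: the class below is hypothetical and is not Dispersy's Candidate, the method `eligible_for_walk` only plays the role of the existing eligibility check, and the lifetime and cooldown values are placeholders rather than Dispersy's constants. The category depends only on the last response received, while the eligibility check depends only on the last request sent.

```python
class SketchCandidate:
    """Hypothetical candidate that keeps category and walk eligibility apart."""

    def __init__(self, walk_lifetime=60.0, walk_cooldown=30.0):
        self.walk_lifetime = walk_lifetime  # assumed: seconds a response keeps us in 'walk'
        self.walk_cooldown = walk_cooldown  # assumed: seconds before we may walk again
        self.last_request = None   # when we last sent an introduction-request
        self.last_response = None  # when we last received an introduction-response

    def walk(self, now):
        # Sending a request does not touch the category.
        self.last_request = now

    def walk_response(self, now):
        self.last_response = now

    def get_category(self, now):
        # Category is decided by the most recent response only.
        if self.last_response is not None and now - self.last_response < self.walk_lifetime:
            return "walk"
        return "none"

    def eligible_for_walk(self, now):
        # Re-walking is blocked by the most recent request only.
        return self.last_request is None or now - self.last_request >= self.walk_cooldown
```

In this sketch a candidate that we walk to again stays in the walk category while the new request is outstanding, but it is not picked as a walk target again until the cooldown has passed.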
I agree with you, I'm guessing this was easier to implement at the time. As I said, we should properly define these cases, implement, and verify with unit tests.