
Random initial routing

Open toad opened this issue 8 years ago • 15 comments

It looks like this should actually improve performance, at least for inserts. It should also improve anonymity. It should be tested; any performance effects will be visible on a local deployment, although they will be clearer once the network has upgraded. It should also be simulated; I may be in a position to do that soon.

toad avatar Apr 11 '16 17:04 toad

Why does initial random routing improve performance?

(or rather: how?)

ArneBab avatar Apr 11 '16 22:04 ArneBab

Local requests are randomly distributed across the keyspace. But our peers are narrowly focused on our specialisation, especially for a darknet node, a bit less so for opennet. We have very few "long links", by design. This is a capacity problem! Assuming that we route correctly, we get far more bandwidth by routing randomly for the first few hops. Think about it in terms of number of escape routes available. Far more links to the wider network are available 2 hops from the originator (at random) than on the originator.

For inserts there's no cost, because we have plenty of HTL, have to go the full HTL, and this doesn't change that.

Also, it improves anonymity. And I expect it will improve data persistence slightly, although at the cost of request spamming (e.g. chat clients) being squashed a bit less often.
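The scheme described above could be sketched roughly as follows. This is a hypothetical illustration, not fred's actual code: `MAX_HTL`, `RANDOM_HOPS`, and representing peers as bare keyspace locations are all assumptions made for the sketch.

```python
# Hypothetical sketch of random initial routing (not fred's actual code).
# MAX_HTL, RANDOM_HOPS and the peer representation are assumptions.
import random

MAX_HTL = 18      # Freenet's default maximum hops-to-live
RANDOM_HOPS = 2   # assumed number of random initial hops

def keyspace_distance(a, b):
    """Circular distance between two locations in [0.0, 1.0)."""
    d = abs(a - b)
    return min(d, 1.0 - d)

def choose_next_peer(peer_locations, key_location, htl, rng=random):
    """Route randomly while HTL is near max, then greedily by distance."""
    if htl > MAX_HTL - RANDOM_HOPS:
        return rng.choice(peer_locations)
    return min(peer_locations,
               key=lambda loc: keyspace_distance(loc, key_location))
```

Below `MAX_HTL - RANDOM_HOPS` this degenerates to plain greedy routing; at full HTL any peer may be chosen, which is the "more escape routes" effect described above.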

toad avatar Apr 11 '16 22:04 toad

Ah, yes: for slow connections there’s just a single node which can take requests outside the ±0.001 location range. The concept sounds good — thank you! (I only glanced over the code, though: not a real review.)

ArneBab avatar Apr 11 '16 22:04 ArneBab

Very cool! This sounds like an encouraging approach / improvement. This is too late to make it into 1473 but it's certainly a candidate (along with the rest of the simulator stuff that hasn't been reviewed yet) for 1474.

I don't understand

although at the cost of request spamming (e.g. chat clients) being squashed a bit less often.

What do you mean by squashed? Starved for resources? Chat clients not getting starved sounds like a good thing.

Thynix avatar Apr 12 '16 10:04 Thynix

RecentlyFailed: If a node receives too many requests for the same key and HTL, it sends a RecentlyFailed to terminate the request. If it then finds the key, it will offer it to the peers who've asked for it. All within a threshold time period, of course. This is a good thing because it reduces the load caused by polling-based chat apps etc. And it only marginally reduces the probability of actually finding the data - certainly for popular data the offers (ULPRs) will work fast when the data is inserted.

This is sabotaged to some degree by this code because we try more nodes.

I guess per-node failure tables are probably closer to this... I don't remember the details right now. The point is that more of the requests for popular keys will survive for more hops. This may be a good thing for rarer keys (searching more of the network), but it's not such a good thing for popular ones (although RecentlyFailed will kick in fairly soon anyway, so maybe it's not a big deal).
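The RecentlyFailed behaviour described above can be modelled as a toy: a node remembers recent requests per (key, HTL), rejects repeats within a time window, and keeps a list of requesters to offer the data to (ULPR-style) if it turns up later. The names, the (key, htl) table shape and the 10-minute window are assumptions for illustration, not fred's actual failure-table API.

```python
# Toy model of RecentlyFailed / ULPR offers (assumed shapes, not fred's API).
import time

WINDOW_SECONDS = 600  # assumed threshold period

class FailureTable:
    def __init__(self):
        self.recent = {}       # (key, htl) -> timestamp of last request
        self.subscribers = {}  # key -> peers to offer the data to later

    def on_request(self, key, htl, peer, now=None):
        """Return "RecentlyFailed" for repeats within the window, else "Route"."""
        now = time.time() if now is None else now
        self.subscribers.setdefault(key, set()).add(peer)
        last = self.recent.get((key, htl))
        self.recent[(key, htl)] = now
        if last is not None and now - last < WINDOW_SECONDS:
            return "RecentlyFailed"   # terminate the repeated request early
        return "Route"                # forward the request as usual

    def on_data_found(self, key):
        """The data turned up: offer it to everyone who recently asked."""
        return self.subscribers.pop(key, set())
```

This is why polling chat clients get "squashed": their repeated requests for the same key die quickly, but they still receive the data via the offer list when it is inserted.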

toad avatar Apr 12 '16 11:04 toad

Note that the request part reduces throughput at the level of the whole network, because on average it takes an extra hop to find the data. However, if we assume that most nodes aren't requesting most of the time, e.g. there are lots of nodes that don't constantly have full queues, then this is fine. There is an argument for using this only for inserts, or for routing randomly only while HTL is at its maximum, for both inserts and requests.

toad avatar Apr 12 '16 17:04 toad

Matthew Toseland writes:

However if we assume that most nodes aren't requesting most of the time

This is an unrealistic assumption. If there is interesting content on Freenet, nodes will be requesting approximately all the time, at least on the bulk queue.

ArneBab avatar Apr 13 '16 18:04 ArneBab

I disagree. We have thousands of nodes, but I don't think we have thousands of active filesharers.

toad avatar Apr 13 '16 19:04 toad

What about WebOfTrust? Bookmarks? Do those not provide a baseline of constant activity?

On Wed, Apr 13, 2016, 3:27 PM Matthew Toseland [email protected] wrote:

I disagree. We have thousands of nodes, but I don't think we have thousands of active filesharers.


Thynix avatar Apr 13 '16 19:04 Thynix

Bookmarks cause a fairly small amount of traffic, especially as it's cached nearby. As I've explained, polling keys doesn't result in fetching them constantly, only periodically. WoT is only used by a minority of users, or we'd see a lot more traffic on the boards. And anyway it polls SSKs, it doesn't fetch all that many CHKs? I dunno, you'd have to look at the detail, it's possible it always fetches a CHK.

toad avatar Apr 13 '16 20:04 toad

@toad if we do not have them now, we will have them once Freenet gets more users. That’s the result I got from every filesharing system I contributed to over the past 15 years: The default state is fully loaded. The only way around that would be to design services which spawn special purpose nodes which are only used for low-volume operation and where the users never see the filesharing aspect.

In Sone there are about 80 IDs who were active in the past month. That’s then at least 80x80 SSK downloads, where some of the SSKs surely redirect to a CHK due to size.

WoT however has 14k identities: http://www.draketo.de/english/freenet/social-graph-snapshot
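The polling arithmetic above works out as follows. A back-of-envelope sketch under the assumed model that every active identity polls every active identity's SSK once per cycle; "at least" because some fetches redirect to CHKs on top of this.

```python
# Back-of-envelope for the polling load described above (assumed model:
# each active Sone identity polls every active identity's SSK per cycle).
active_sone_ids = 80
ssk_fetches_per_cycle = active_sone_ids * active_sone_ids  # 80 x 80 = 6400

wot_identities = 14_000  # total identities in the snapshot linked above
```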

ArneBab avatar Apr 26 '16 19:04 ArneBab

Freenet can deal with Sone/WoT-type traffic very efficiently, because it ends up cached and propagated. I agree it's a scalability problem though. In general, filesharing systems are "start when you want the file, shut down when you're done, we'll borrow some of your bandwidth during that period". That's what BitTorrent does, and it sucks. It's not what Freenet does.

Anyway, I'd be perfectly happy to have this initially deployed only for inserts. What impact it has on requests is an open question: They will take longer, but have more capacity on the first hop. But for inserts the performance impact should be essentially negligible, and very likely positive due to having more routes available.

toad avatar Apr 29 '16 14:04 toad

@toad what people tend to do is “queue up tons of files and be happy whenever one completes”. This does not mean that random initial routing would necessarily be a bad idea for requests. It just means that any argument which relies on the assumption that the network isn’t fully loaded is invalid.

For inserts I fully agree.

ArneBab avatar Apr 29 '16 15:04 ArneBab

I'm not convinced that this would actually work. I'd be glad to be proven wrong by simulations, though.

Local requests are randomly distributed across the keyspace. But our peers are narrowly focused on our specialisation, especially for a darknet node, a bit less so for opennet. We have very few "long links", by design. This is a capacity problem!

So far I agree.

Assuming that we route correctly, we get far more bandwidth by routing randomly for the first few hops.

Random routing for the first few hops does not increase the expected available bandwidth, except for nodes whose own bandwidth limits are below average. What initial random routing does, for say N ~= 2 steps, is to make what was initially our problem our neighbour's neighbour's problem (N steps away from us), but with HTL reduced by N. That node will now have to deal with a request randomly distributed across the keyspace, just like we do now without random initial routing, but with HTL = 18 - N.

Think about it in terms of number of escape routes available. Far more links to the wider network are available 2 hops from the originator (at random) than on the originator.

There are many more escape routes available, but we choose only one at random. Down the line, after initial routing, the expected number of links to the wider network (for any random key) is the same as the local expectation.
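This argument can be checked with a toy model. All parameters and the uniform-link-count model are assumptions: if every node has the same long-link statistics, the node reached after the random phase has the same expected number of escape routes as the originator, and the only net effect is the HTL spent getting there.

```python
# Toy check (assumed model, not a network simulation): long-link counts
# are identical across nodes, so the node reached after the random phase
# has the same expected link count; only the remaining HTL differs.
import random

MAX_HTL = 18

def links_after_random_phase(link_counts, n_random_hops, trials=1000, seed=42):
    """Mean link count at the node reached after the random hops,
    plus the HTL left for the greedy phase."""
    rng = random.Random(seed)
    mean = sum(link_counts[rng.randrange(len(link_counts))]
               for _ in range(trials)) / trials
    return mean, MAX_HTL - n_random_hops

# With identical link counts everywhere, the expectation is unchanged
# while the remaining HTL shrinks:
mean_links, htl_left = links_after_random_phase([8] * 100, n_random_hops=2)
```

Under non-uniform link counts the mean would still converge to the network-wide average, which is the "except for below-average nodes" caveat above.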

bertm avatar Jun 19 '16 21:06 bertm

@ArneBab @bertm should we close this PR?

hernic avatar May 06 '24 11:05 hernic