peerstore: next steps for datastore-backed peerstore
Following libp2p/go-libp2p-peerstore#34, we're in a pretty good state vs. baseline. Performance benchcmp here: https://github.com/libp2p/go-libp2p-peerstore/issues/31#issuecomment-421433125
Summing up my thoughts here on how to move forward.
- We should find a way to reduce the cost of `Peers()`. IPFS doesn't call it from critical paths (only CLI commands), but as libp2p powers more projects, we should not make assumptions about how users will be calling us.
  - With the current db layout, `PeersWithAddrs()` traverses the entire KV store building a set of unique peers.
  - We obviously want to avoid doing that. Options:
    1. Jamming lists of multiaddrs into peer-addresses entries, to reduce the total key count.
       - Not feasible because TTL management needs to be fine-grained (down to the individual multiaddr).
    2. Controlling the iterator seek behaviour. The `go-datastore` iterator doesn't allow us to seek arbitrarily, but badger does. Instead of ingesting all keys and comparing them, we want to perform a "fold" operation, taking advantage of the fact that badger is an ordered LSM tree. When we encounter a new peer, we skip over all keys until `ValidForPrefix` is no longer `true`. That means we've encountered another peer, and we use that as the new prefix to seek against. (There's a sketch of this iteration after the list.)
       - Unfortunately, this only works for ordered key-value stores.
    3. Storing an index of peers, e.g. in a `/peeridx` keyspace. Then we can just iterate through a small keyspace instead of the entire database. (See the index sketch after the list.)
       - This is my fave, but given that TTL expires keys at the DB level without us coming to know, keeping those indices aligned with the true existence of keys for each peer is non-trivial. I have several ideas here.
- Regarding option 3 above, find a way to store the earliest and tardiest expirations.
  - The tardiest coincides with when a peer is gone from the DB (if an explicit delete doesn't occur earlier, e.g. setting the TTL to 0), i.e. it can help us manage the peer index.
  - The earliest coincides with when the cache entry for a peer should be invalidated. This way, we don't need to calculate this timestamp every time we load a peer from the DB into the cache.
  - We could use merge operators for this, but they are too specific to certain implementations, and also eventually consistent. I guess we need strict consistency.
  - Tracking these time bounds would be easy if it weren't for explicit deletes (see the bounds-tracking sketch after the list):
    - On every insert or update, you can update the bounds if the newly added entry changes them.
    - Deletes become non-deterministic. If the deleted entries define the bounds (i.e. you are deleting the earliest or the tardiest entry to expire), you need to iterate through all of the peer's keys to recalculate those bounds.
    - I'd argue it's a worthwhile tradeoff, though, as it saves time on gets, which are probably far more frequent than deletes.
- Use the `ds.Key` functionality to create key hierarchies, e.g. `/peer:<id>/addr:<hash or index>` => `multiaddr`. (A short example follows the list.)
- Migrate the KeyBook and the PeerMetadata substores under appropriate hierarchical namespaces, as per the above.
- Find a way to remove the multiaddr hash from the peer's key, maybe replacing it with some kind of sequence number.
- Spin off the pstoreds package into a top-level module: `go-libp2p-peerstore-ds`.
- EDIT: Consider moving `PeerInfo` to somewhere more common.
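
For option 2, here is a minimal sketch of the fold-style iteration over badger, assuming a hypothetical `/peers/addrs/<peer>/<addr hash>` key layout; the `uniquePeers` helper and the layout are illustrative, not the existing pstoreds code.

```go
package sketch

import (
	"bytes"

	badger "github.com/dgraph-io/badger"
)

// uniquePeers folds over an ordered keyspace, collecting the distinct <peer>
// segments under prefix without walking every key of every peer: as soon as a
// peer is seen, we seek past the rest of its keys.
func uniquePeers(db *badger.DB, prefix []byte) ([]string, error) {
	var peers []string
	err := db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.PrefetchValues = false // key-only iteration
		it := txn.NewIterator(opts)
		defer it.Close()

		for it.Seek(prefix); it.ValidForPrefix(prefix); {
			key := it.Item().KeyCopy(nil)

			// Extract the <peer> segment following the prefix; we assume the
			// peer segment itself contains no '/' (e.g. base32/base58 IDs).
			rest := key[len(prefix):]
			end := bytes.IndexByte(rest, '/')
			if end < 0 {
				end = len(rest)
			}
			peer := rest[:end]
			peers = append(peers, string(peer))

			// Seek just past all keys of the form <prefix><peer>/<addr hash>:
			// 0xff sorts after any byte used in the hash segment.
			next := make([]byte, 0, len(prefix)+len(peer)+2)
			next = append(next, prefix...)
			next = append(next, peer...)
			next = append(next, '/', 0xff)
			it.Seek(next)
		}
		return nil
	})
	return peers, err
}
```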
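And a rough shape for option 3: a `/peeridx` namespace that `Peers()` can scan cheaply. The `addPeerToIndex`/`peersWithAddrs` names and the context-aware go-datastore signatures are assumptions for illustration; keeping the index honest against TTL-expired keys is deliberately left out, since that is the open problem.

```go
package sketch

import (
	"context"

	ds "github.com/ipfs/go-datastore"
	"github.com/ipfs/go-datastore/query"
)

// indexKey returns the /peeridx entry for a peer; the value could later carry
// the peer's tardiest expiration (see the next point).
func indexKey(p string) ds.Key {
	return ds.NewKey("/peeridx").ChildString(p)
}

// addPeerToIndex records a peer in the index when we first store an address for it.
func addPeerToIndex(ctx context.Context, d ds.Datastore, p string) error {
	return d.Put(ctx, indexKey(p), []byte{})
}

// peersWithAddrs scans only the small /peeridx keyspace instead of the whole DB.
func peersWithAddrs(ctx context.Context, d ds.Datastore) ([]string, error) {
	res, err := d.Query(ctx, query.Query{Prefix: "/peeridx", KeysOnly: true})
	if err != nil {
		return nil, err
	}
	defer res.Close()

	var peers []string
	for r := range res.Next() {
		if r.Error != nil {
			return nil, r.Error
		}
		// The last component of /peeridx/<peer> is the peer ID.
		peers = append(peers, ds.NewKey(r.Key).BaseNamespace())
	}
	return peers, nil
}
```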
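For the expiration bounds, a self-contained sketch of the bookkeeping: inserts and refreshes keep the earliest/tardiest bounds in O(1), and only deletes that remove a bound-defining entry pay for a rescan. All names and the in-memory representation are hypothetical.

```go
package sketch

import "time"

// addrBounds tracks per-address expirations for a single peer, plus the derived
// earliest/tardiest bounds.
type addrBounds struct {
	expiries map[string]time.Time // multiaddr (string form) -> absolute expiration
	earliest time.Time            // when the cached entry for this peer goes stale
	tardiest time.Time            // when the last key for this peer leaves the DB
}

func newAddrBounds() *addrBounds {
	return &addrBounds{expiries: make(map[string]time.Time)}
}

// upsert records or refreshes an address expiration. A refresh is treated as a
// delete followed by an insert so that a shrinking bound can't go stale.
func (b *addrBounds) upsert(addr string, exp time.Time) {
	if _, ok := b.expiries[addr]; ok {
		b.remove(addr)
	}
	b.expiries[addr] = exp
	if b.earliest.IsZero() || exp.Before(b.earliest) {
		b.earliest = exp
	}
	if exp.After(b.tardiest) {
		b.tardiest = exp
	}
}

// remove deletes an address. If it defined one of the bounds, we rescan the
// peer's remaining entries (the non-deterministic case noted above).
func (b *addrBounds) remove(addr string) {
	exp, ok := b.expiries[addr]
	if !ok {
		return
	}
	delete(b.expiries, addr)
	if !exp.Equal(b.earliest) && !exp.Equal(b.tardiest) {
		return // bounds untouched: O(1) delete
	}
	b.earliest, b.tardiest = time.Time{}, time.Time{}
	for _, e := range b.expiries {
		if b.earliest.IsZero() || e.Before(b.earliest) {
			b.earliest = e
		}
		if e.After(b.tardiest) {
			b.tardiest = e
		}
	}
}
```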
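Finally, a tiny example of the `ds.Key` helpers building the `/peer:<id>/addr:<hash or index>` hierarchy; the concrete IDs are placeholders.

```go
package main

import (
	"fmt"

	ds "github.com/ipfs/go-datastore"
)

func main() {
	// Hypothetical peer ID and addr hash, just to show the key shape.
	pid, addrHash := "QmPeer", "abc123"

	// /peer:<id>/addr:<hash or index> => multiaddr bytes
	key := ds.KeyWithNamespaces([]string{"peer:" + pid, "addr:" + addrHash})

	fmt.Println(key)          // /peer:QmPeer/addr:abc123
	fmt.Println(key.Parent()) // /peer:QmPeer
	fmt.Println(key.Type())   // addr (the "type" of the last namespace)
	fmt.Println(key.Name())   // abc123
}
```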
Tagged a few people just to make you aware of the future direction I'd like to pursue with the peerstore, following all the work that has been done already.
> Not feasible because TTL management needs to be fine-grained (down to the individual multiaddr).

We can also manage this using a GC system. That is, store large records and periodically sweep.

Note: We also don't need to use TTLs if you can think of a better metric/system.
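A loose sketch of that GC idea, purely to illustrate the shape: one larger record per peer with absolute expirations, swept on a timer instead of relying on datastore TTLs. The types and the interval are made up for the example.

```go
package sketch

import (
	"sync"
	"time"
)

type addrEntry struct {
	Addr    string    // multiaddr in string form
	Expires time.Time // absolute expiration instead of a datastore TTL
}

type addrRecords struct {
	mu      sync.Mutex
	records map[string][]addrEntry // peer ID -> addresses
}

// sweep drops expired addresses and removes peers with no live addresses left.
func (r *addrRecords) sweep(now time.Time) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for p, entries := range r.records {
		live := entries[:0]
		for _, e := range entries {
			if e.Expires.After(now) {
				live = append(live, e)
			}
		}
		if len(live) == 0 {
			delete(r.records, p)
			continue
		}
		r.records[p] = live
	}
}

// runGC sweeps on a fixed interval until stop is closed.
func (r *addrRecords) runGC(interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case now := <-t.C:
			r.sweep(now)
		case <-stop:
			return
		}
	}
}
```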
i think this is a definite case for indexing, as you suggest. a note: we can also open multiple badger data stores instead of namespacing keys if it suits. it’s designed to have that be an alternative to multiple column families.
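If it helps to picture it, opening separate badger-backed datastores per keyspace might look roughly like this; the paths are placeholders and the `NewDatastore(path, options)` call reflects my reading of the current go-ds-badger API, so treat it as a sketch rather than a recommendation.

```go
package main

import (
	"log"

	badgerds "github.com/ipfs/go-ds-badger"
)

func main() {
	// One store per concern, e.g. the address book vs. the peer index, rather
	// than prefixes like /peers/addrs and /peeridx inside the same DB.
	addrs, err := badgerds.NewDatastore("/tmp/pstore-addrs", &badgerds.DefaultOptions)
	if err != nil {
		log.Fatal(err)
	}
	defer addrs.Close()

	index, err := badgerds.NewDatastore("/tmp/pstore-peeridx", &badgerds.DefaultOptions)
	if err != nil {
		log.Fatal(err)
	}
	defer index.Close()

	// Both satisfy the go-datastore interfaces, so the pstoreds code on top can
	// stay datastore-agnostic.
}
```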
@raulk What is the status of this as of today? I'd like to help out with this by picking up some of the chunks if possible.