Routing/HTTP router: `routing provide` and fast-provide do not provide to HTTP router (kubo 0.38 and 0.39)
Checklist
- [x] This is a bug report, not a question. Ask questions on discuss.ipfs.tech.
- [x] I have searched on the issue tracker for my bug.
- [x] I am running the latest kubo version or have an issue updating.
Installation method
dist.ipfs.tech or ipfs-update
Version
`ipfs version 0.39.0`
Config
{
"API": {
"HTTPHeaders": null
},
"Addresses": {
"API": "/ip4/127.0.0.1/tcp/0",
"Announce": [],
"AppendAnnounce": [],
"Gateway": "/ip4/127.0.0.1/tcp/0",
"NoAnnounce": [],
"Swarm": [
"/ip4/0.0.0.0/tcp/0"
]
},
"AutoNAT": {},
"Autorelay": {},
"Bootstrap": [],
"DNS": {
"Resolvers": {}
},
"Datastore": {},
"Discovery": {
"MDNS": {
"Enabled": false
}
},
"Experimental": {},
"Gateway": {},
"Identity": {},
"Internal": {},
"Ipns": {},
"Migration": {},
"Mounts": {},
"Peering": {},
"Pinning": {},
"Plugins": {},
"Profiles": {},
"Provider": {},
"Pubsub": {},
"Reprovider": {},
"Routing": {
"Methods": {
"find-peers": {
"RouterName": "HttpRouterNotSupported"
},
"find-providers": {
"RouterName": "HttpRoutersParallel"
},
"get-ipns": {
"RouterName": "HttpRouterNotSupported"
},
"provide": {
"RouterName": "HttpRoutersParallel"
},
"put-ipns": {
"RouterName": "HttpRouterNotSupported"
}
},
"Routers": {
"HttpRouter1": {
"Parameters": {
"Endpoint": "http://127.0.0.1:19575"
},
"Type": "http"
},
"HttpRouterNotSupported": {
"Parameters": {
"Endpoint": "http://kubohttprouternotsupported"
},
"Type": "http"
},
"HttpRoutersParallel": {
"Parameters": {
"Routers": [
{
"IgnoreErrors": true,
"RouterName": "HttpRouter1",
"Timeout": "10s"
}
]
},
"Type": "parallel"
}
},
"Type": "custom"
},
"Swarm": {}
}
Description
When Kubo is configured to use an HTTP router, routing provide (and the implicit provide triggered by --fast-provide-wait) does not send any provider records to the HTTP router on Kubo 0.38.x and 0.39.0. On Kubo 0.37.0, a plain ipfs add still sends PUTs to the HTTP router containing the CID.
What I was doing: Starting a fresh repo, configuring Routing.* to point at a local mock HTTP router, starting the daemon, then adding a file with ipfs add (with --fast-provide-wait when available on 0.39) and calling ipfs routing provide --recursive --verbose on the added CID and its raw-multihash form. The mock HTTP router records all requests.
What happened: On Kubo 0.38.x and 0.39.0, the HTTP router receives zero requests (no PUT /routing/v1/providers). On Kubo 0.37.0, the add triggers PUTs that include the CID.
Error messages: None. On 0.38/0.39, routing provide returns 200 without reporting any failure.
Steps to reproduce (self-contained repo):
- Clone repo: https://github.com/Rinse12/Reproduce-not-providing-on-0.39
- `npm install` (uses `kubo` npm package)
- Run `npm run repro`
  - Starts a mock HTTP router on 127.0.0.1:19575
  - Inits a fresh Kubo repo, sets Routing.* to the mock router, randomizes ports, disables MDNS
  - Starts daemon, `ipfs add --fast-provide-wait` (or falls back to plain add), then `ipfs routing provide --recursive --verbose` (CID + raw multihash)
  - Prints every request the mock router received and checks for the added CID in v0, v1, and raw (0x55) forms
- Observe output:
  - 0.39.0: `Router requests: []` (no provider PUTs)
  - 0.38.0: PUTs arrive but contain unrelated keys, not the added CID
  - 0.37.0: PUTs contain the added CID after add; explicit routing/provide fails with “no connected peers.”
Expected: provider records for the added CID (dag-pb and raw multihash) should be sent to the configured HTTP router.
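For readers who do not want to clone the repro repo, the mock router step above can be approximated with a few lines of Python. This is a minimal sketch, not the code from the linked repository: the helper names (`make_mock_router`, `start_in_background`, `saw_cid`) are hypothetical, the response body is an empty JSON placeholder rather than the real delegated-routing response schema, and matching the CID by substring in the request body is an assumption about how the payload is encoded.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_mock_router(host="127.0.0.1", port=19575):
    """Return (server, recorded); `recorded` collects every request the server sees."""
    recorded = []  # one {"method", "path", "body"} dict per request

    class Handler(BaseHTTPRequestHandler):
        def _record(self):
            length = int(self.headers.get("Content-Length") or 0)
            body = self.rfile.read(length).decode("utf-8", "replace") if length else ""
            recorded.append({"method": self.command, "path": self.path, "body": body})

        def _reply(self, status):
            data = b"{}"  # placeholder body; the real response schema is not modeled
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def do_PUT(self):
            self._record()
            self._reply(200)

        def do_GET(self):
            self._record()
            self._reply(404)  # this mock never has records to return

        def log_message(self, *args):
            pass  # keep stdout clean; requests are kept in `recorded` instead

    return ThreadingHTTPServer((host, port), Handler), recorded


def start_in_background(server):
    """Serve requests on a daemon thread so the caller can keep working."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread


def saw_cid(recorded, cid_forms):
    """True if any PUT to /routing/v1/providers mentions one of the given CID strings."""
    return any(
        r["method"] == "PUT"
        and r["path"].startswith("/routing/v1/providers")
        and any(form in r["body"] for form in cid_forms)
        for r in recorded
    )


if __name__ == "__main__":
    server, recorded = make_mock_router()
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        print("Router requests:", json.dumps(recorded, indent=2))
```

Run it, point `Routing.Routers.HttpRouter1.Parameters.Endpoint` at it, repeat the add and provide steps, then stop it with Ctrl-C to print what arrived. On an affected version the printed list would be expected to stay empty.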
Triage: Will investigate
I also observed this behaviour with Kubo 0.39.
I have the sense that this is an undocumented, and potentially unintended, regression introduced in 0.38.
According to https://github.com/ipfs/kubo/blob/master/docs/config.md#provide:
> While designed to support multiple routing systems in the future, the current default configuration only supports providing to the Amino DHT.
It's a bit ambiguous as to whether the DHT is only supported with the default config, or more broadly.
Triage notes
- This issue is not about `Provide.*`, it is about using a non-standard custom routing config, where the user explicitly sets `Routing.Type=custom` and then crafts a composite router by hand in `Routing.Routers`
  - Note: the custom config via `Routing.Routers` is marked as experimental in docs, and not enabled by default, which impacts prioritization
  - Historical background/cross-linking: the brittleness of the API described in IPIP-526 was already flagged back in Protocol Labs times (e.g. https://github.com/ipni/index-provider/issues/403), noting the API is not documented and not tested, yet nobody stepped up to make it a real thing.
Open questions
- Technical: unsure how complex a fix would be. Added this to my TODO to evaluate complexity this week
  - I'll comment later with my findings
- General Politics: should we fix a broken/limited API described in IPIP-526 that lacked tests (so it was not detected when it stopped working)?
  - If we fix this, more people will use this API, and we need not only good tests, but also specs (we now have IPIP-526 so maybe that is enough, as long as we keep flagging it as experimental/deprecated)
  - 👉️ Long term, replacement is needed. Feedback on IPIP-526 would be appreciated (why it works for your use case, or why it does not, and what is missing)
  - Side Note
    - the way announcements are made signs every CID, which does not scale for big datasets. If we embrace this without any safeguards, we will be triaging bugs about "kubo provide being slow" and "kubo eats up all my CPU" as a natural progression here.
      - We could work around this by printing a warning when delegated HTTP publishing is enabled via `Routing.Type=custom`, with a message linking to IPIP-526 and noting the experimental character of the API + have a flag that allows the user to disable signatures (if they are enabled right now) in trusted environments, like we do for pubsub for testing.
Ok, so TLDR on technical side, found potential reason for regression:
- When `Provide.DHT.SweepEnabled=true` (the default since v0.39), the `SweepingProviderOpt` function is used. It checks for DHT availability. For custom HTTP-only routing (`Routing.Type="custom"` with only HTTP routers), there is no DHT client. This causes `impl == nil` → `NoopProvider{}` is returned.
- A potential fix is to detect when no DHT is available but HTTP routing is configured for provide, and fall back to `LegacyProvider` instead of `NoopProvider`
  - A: Change `OnlineProviders()` to detect HTTP-only routing and use `LegacyProviderOpt` in that case
  - B: In `SweepingProviderOpt`, when DHT is missing, fall back to `LegacyProvider` instead of `NoopProvider`
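The selection logic described above can be modeled in a few lines. The following is a simplified Python model for discussion only, not Kubo's Go code: the function names and the `ProviderKind` enum are hypothetical, and the real `SweepingProviderOpt` involves more state than two booleans. It contrasts the reported behavior with option B.

```python
from enum import Enum


class ProviderKind(Enum):
    SWEEPING = "sweeping"  # DHT-specific sweep provider
    LEGACY = "legacy"      # older provider that also announces via HTTP routers
    NOOP = "noop"          # announces nothing


def select_provider_reported(sweep_enabled: bool, has_dht: bool) -> ProviderKind:
    """Model of the behavior described in this issue."""
    if not sweep_enabled:
        return ProviderKind.LEGACY
    if not has_dht:
        # HTTP-only custom routing ends up here, so nothing is provided.
        return ProviderKind.NOOP
    return ProviderKind.SWEEPING


def select_provider_option_b(sweep_enabled: bool, has_dht: bool) -> ProviderKind:
    """Model of option B: fall back to the legacy provider when no DHT exists."""
    if sweep_enabled and has_dht:
        return ProviderKind.SWEEPING
    return ProviderKind.LEGACY


if __name__ == "__main__":
    for sweep in (True, False):
        for dht in (True, False):
            print(
                f"sweep={sweep} dht={dht}:",
                select_provider_reported(sweep, dht).value,
                "->",
                select_provider_option_b(sweep, dht).value,
            )
```

In this model the two functions differ in exactly one case, sweep enabled with no DHT client, which is the HTTP-only custom routing setup from the report.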
@Rinse12 @2color Until this is fixed, users with HTTP-only custom routing can try to work around the issue by disabling sweep mode, which is a DHT-specific optimization anyway (the Legacy provider, afaik, has not changed since 0.37, so it should still work):
$ ipfs config Routing.Type
custom
$ ipfs config --json Provide.DHT.SweepEnabled false
Mind that Routing.Type=custom has no meaningful test coverage and should not be used in production, only for testing and research. https://github.com/ipfs/kubo/pull/11111 updates the docs to make this very clear.
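To check whether a given config falls into the affected case, a small script can walk the `Routing` section the same way the description above does by hand. This is a rough sketch under stated assumptions: the function names are hypothetical, `SweepEnabled` is assumed to default to true when unset (per the note above about v0.39), and only the `parallel` and `sequential` composite router types are followed.

```python
import json
import sys

COMPOSITE_TYPES = {"parallel", "sequential"}


def leaf_router_types(routing, name, seen=None):
    """Collect the Type of every non-composite router reachable from `name`."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against routers that reference each other
        return set()
    seen.add(name)
    router = (routing.get("Routers") or {}).get(name)
    if router is None:
        return set()
    if router.get("Type") in COMPOSITE_TYPES:
        types = set()
        for child in (router.get("Parameters") or {}).get("Routers") or []:
            types |= leaf_router_types(routing, child.get("RouterName"), seen)
        return types
    return {router.get("Type")}


def likely_affected(config):
    """True for custom routing whose provide path has HTTP routers, no DHT, and sweep on."""
    routing = config.get("Routing") or {}
    if routing.get("Type") != "custom":
        return False
    provide = ((routing.get("Methods") or {}).get("provide") or {}).get("RouterName")
    if not provide:
        return False
    types = leaf_router_types(routing, provide)
    dht_cfg = (config.get("Provide") or {}).get("DHT") or {}
    sweep = dht_cfg.get("SweepEnabled", True)  # assumption: enabled when unset
    return bool(sweep) and "http" in types and "dht" not in types


if __name__ == "__main__":
    # Usage: ipfs config show > config.json && python check.py config.json
    with open(sys.argv[1]) as fh:
        cfg = json.load(fh)
    print("likely affected" if likely_affected(cfg) else "not affected by this check")
```

For the config posted in this issue the provide method resolves to a single `http` router, so the check reports it as likely affected until `Provide.DHT.SweepEnabled` is set to false.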
Turns out the fix was not that complex (use the Legacy provider when no DHT is present). Added a regression test as well, see:
- https://github.com/ipfs/kubo/pull/11112
- https://github.com/ipfs/kubo/pull/11111
I can confirm that setting `ipfs config --json Provide.DHT.SweepEnabled false` fixes this.
Thanks.
Same for me. Thanks for your attention to this problem