
Gateway Handler Extraction

Open whyrusleeping opened this issue 2 years ago • 10 comments

Checklist

  • [X] My issue is specific & actionable.
  • [X] I am not suggesting a protocol enhancement.
  • [X] I have searched on the issue tracker for my issue.

Description

In building rainbow I found that it was very difficult to just use the corehttp package as is, for a number of reasons, so to get things functional I copy-pasted the code out and macheted things into place. I would really like to not have two separate copies of that code, so I'm opening this issue to track what work is needed to unify them (and generally make the gateway code reusable externally).

  • [ ] Make the gateway handler not depend on the coreapi interface
    • The coreapi interface is really difficult to satisfy, or even stub out. Its implementation is highly coupled and very complicated. In reality, the gateway only needs three methods on the whole thing: ResolvePath, Get, and Name().Resolve.
    • refactoring the corehttp code to take an interface with the bare minimum of methods needed is really critical here
  • [ ] extract the corehttp package to a separate repo, so others don't have to import all of go-ipfs
    • This will also help tease out other ways in which this package is coupled to the rest of the codebase (especially the 'assets' module, that imports the entire 'core' module)
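
To make the first checklist item concrete, here is a rough sketch of what a handler written against a minimal interface could look like. Everything in it is an assumption for illustration: the `GatewayAPI` name, the method signatures (plain strings instead of the real path and node types), the `stubAPI` type, and the `fetch` helper are hypothetical and are not the actual corehttp or coreapi definitions. The only thing taken from the issue is the set of three operations.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// GatewayAPI is a hypothetical narrow interface with only the three
// operations the gateway is said to need. Signatures are assumptions.
type GatewayAPI interface {
	ResolvePath(ctx context.Context, path string) (string, error) // path -> content key
	Get(ctx context.Context, key string) (io.ReadCloser, error)   // content key -> bytes
	ResolveName(ctx context.Context, name string) (string, error) // stands in for Name().Resolve
}

var errNotFound = errors.New("not found")

// NewHandler builds an http.Handler that depends only on GatewayAPI.
func NewHandler(api GatewayAPI) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx, p := r.Context(), r.URL.Path
		if strings.HasPrefix(p, "/ipns/") {
			// Resolve the mutable name first, then re-attach the remainder.
			name, rest, _ := strings.Cut(strings.TrimPrefix(p, "/ipns/"), "/")
			target, err := api.ResolveName(ctx, name)
			if err != nil {
				http.Error(w, err.Error(), http.StatusNotFound)
				return
			}
			p = target
			if rest != "" {
				p += "/" + rest
			}
		}
		key, err := api.ResolvePath(ctx, p)
		if err != nil {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		body, err := api.Get(ctx, key)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		defer body.Close()
		io.Copy(w, body)
	})
}

// stubAPI is an in-memory stand-in: three small maps satisfy the interface.
type stubAPI struct {
	names, paths, data map[string]string
}

func lookup(m map[string]string, k string) (string, error) {
	if v, ok := m[k]; ok {
		return v, nil
	}
	return "", errNotFound
}

func (s stubAPI) ResolveName(_ context.Context, n string) (string, error) { return lookup(s.names, n) }
func (s stubAPI) ResolvePath(_ context.Context, p string) (string, error) { return lookup(s.paths, p) }
func (s stubAPI) Get(_ context.Context, key string) (io.ReadCloser, error) {
	d, err := lookup(s.data, key)
	if err != nil {
		return nil, err
	}
	return io.NopCloser(strings.NewReader(d)), nil
}

// newStubAPI returns a stub with made-up names, paths, and content keys.
func newStubAPI() stubAPI {
	return stubAPI{
		names: map[string]string{"example.test": "/ipfs/cid-root"},
		paths: map[string]string{"/ipfs/cid-root/hello.txt": "cid-hello"},
		data:  map[string]string{"cid-hello": "hello gateway"},
	}
}

// fetch runs one in-memory GET against h and returns status code and body.
func fetch(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := NewHandler(newStubAPI())
	for _, p := range []string{"/ipfs/cid-root/hello.txt", "/ipns/example.test/hello.txt", "/ipfs/missing"} {
		code, body := fetch(h, p)
		fmt.Printf("%s -> %d %q\n", p, code, strings.TrimSpace(body))
	}
}
```

The point of the sketch is that a caller such as rainbow would only have to provide three methods, and a test stub fits in a few lines, instead of having to satisfy or fake the whole coreapi surface.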

whyrusleeping avatar Oct 21 '21 21:10 whyrusleeping

The stewards have been talking a lot lately about ways to consolidate repos, because the overhead is slowing us down a lot.

Is the issue with "importing all of go-ipfs" due to its transitive dependencies? If so, would Go 1.17's mod graph pruning help with that? Are there other issues with keeping it in go-ipfs?

guseggert avatar Oct 21 '21 22:10 guseggert

This is more of a meta point, but I was thinking about this yesterday, and in a lot of ways the value of the repos/packages that have been built across the ipfs/libp2p/ipld/multiformats orgs is the ability to mix and match and build applications that are specific to the needs of the developer. Interfacing with a go-ipfs node is the wrong way to build applications, and in a lot of ways go-ipfs is a glorified tech demo (I in no way mean this disrespectfully). go-ipfs should become purely a config file, a bundle of other packages, and a cli app to interact with it. As such, anything unique and of value in terms of functionality should be extracted from go-ipfs and put into its own repo.

lanzafame avatar Oct 22 '21 00:10 lanzafame

@guseggert While I do really want the code extracted to a separate repo, I would be happy if we just managed to get it decoupled enough that it could be. I would like to iterate on the gateway code pretty fast, but first things first is getting it into a place where we can even think about doing that.

whyrusleeping avatar Oct 22 '21 01:10 whyrusleeping

I 100% agree that mixing and matching is valuable. I'm wondering if we can keep that benefit, while lowering the burdensome cost of maintenance and constantly propagating changes across dozens of repos in order to get a feature out the door. Maybe the answer is no, but I think it's worth considering.

If we wanted to experiment with some new combination of components, what specific issues would we run into if they were all in one repo and we just consumed it as a library? (And the go-ipfs binary could also consume in the same way, like you're talking about @lanzafame ).

I think to your point @whyrusleeping, maybe we can refactor the gateway handler to be reusable, and reuse it elsewhere, without necessarily having to pay the cost of another repo.

guseggert avatar Oct 22 '21 01:10 guseggert

in a lot of ways the value of the repos/packages that have been built across the ipfs/libp2p/ipld/multiformats orgs is the ability to mix and match and build applications that are specific to the needs of the developer

As such anything unique and of value in terms of functionality should be extracted from go-ipfs and put into its own repo.

These points seem quite unrelated to each other as indicated by the "repos/packages". You are able to import sub packages within repos and have things work just fine without needing to create yet another repo.

As it stands the ipfs and libp2p orgs have literally hundreds of repos and some brief messing around with go mod shows go-ipfs itself depends on nearly 150 transitively. I think the number of PRs into go-ipfs which are really multiple PRs across various libraries would be an understandable measure of the complexity introduced by this repo sprawl.

I'd like to move in a direction where packages get split out into their own modules once the demand for independent development and velocity is large enough that you'd really want independent versioning. You can have multiple go modules in the same repo, although I'm not sure how well that works at the moment so it might be reasonable to say we want different go modules in different repos.

but first things first is getting it into a place where we can even think about doing that.

Yes 👍. It's generally easier to develop against fuller APIs than narrower ones when you have an implementation that does so much (e.g. *IpfsNode), so at that point it's arguable how much effort should be expended to create the narrowest possible interface.

Once there's demand for the package to operate independently, as there is now for the http gateway functionality, then it's much easier to justify the effort to rework the package to have more manageable dependencies and a narrower API.

aschmahmann avatar Oct 22 '21 01:10 aschmahmann

On the meta-level, fundamentally, I have no issue with a single repo. But the single repo should not and cannot be the go-ipfs repo. Go-ipfs is an application with a significant amount of baggage, both technical and in how people perceive what it is. Any single repo model that contains modules for mixing and matching needs to be separate and imported by go-ipfs.

As it stands the ipfs and libp2p orgs have literally hundreds of repos and some brief messing around with go mod shows go-ipfs itself depends on nearly 150 transitively. I think the number of PRs into go-ipfs which are really multiple PRs across various libraries would be an understandable measure of the complexity introduced by this repo sprawl.

This speaks more to the tech-demo aspect of go-ipfs than it does to the repo sprawl.

On the technical-level, multiple go modules in a single repo have a greater management complexity, see https://golang.org/doc/modules/managing-source#multiple-module-source, each module version requiring a git tag prefixed with the module name.

lanzafame avatar Oct 22 '21 01:10 lanzafame

Ideally go-ipfs would be limited to the CLI implementation and mechanics of standing up a daemon. All the other functionality would be in a separate library repo. It's an overused term, but the porcelain/plumbing analogy fits here.

iand avatar Jun 28 '22 11:06 iand

Implemented the minimal needed changes in https://github.com/ipfs/go-ipfs/pull/9070

iand avatar Jun 30 '22 15:06 iand

Recent gateway extraction efforts (Why's rainbow, Ian's PR, Will's gateway-prime) were discussed during Kubo standup today. Below is a short summary so we are all in sync:

  • Kubo maintainers are supportive: we want to decouple and extract gateway code to be useful outside Kubo.
  • Given current prioritization and limited resources, we will refine the Gateway interface while the code remains in the ipfs/kubo repo
    • This allows us to benefit from existing end-to-end tests in test/sharness/*gateway*.sh, and save everyone time by avoiding code duplication/divergence across repos
  • https://github.com/ipfs/kubo/pull/9070 proposed by @iand is the first step in that direction (we can merge small changes and iterate on it over time)
  • After we have the interface fleshed out and decoupled from Kubo internals, the code extraction will be part of the bigger "libipfs" effort, which is on Kubo's long-term roadmap (cc @guseggert @BigLep)

lidel avatar Jul 26 '22 17:07 lidel

The "libipfs" effort is tracked here: https://github.com/ipfs/kubo/issues/8543

BigLep avatar Jul 26 '22 18:07 BigLep

2023-01-24 standup: for consolidating to go-libipfs, we want to move in https://github.com/ipfs/kubo/tree/master/core/corehttp (and some other directories)

BigLep avatar Jan 24 '23 17:01 BigLep

The gateway has been successfully moved to go-libipfs/gateway and we have added some examples of how to use it. The gateway code has also been adapted so that it no longer requires the full Core API, only a small subset of specific functions. I'm closing this as the main topic has been completed.

We also want to move more gateway sharness tests to go-libipfs, and that is being tracked on https://github.com/ipfs/go-libipfs/issues/146.

hacdias avatar Feb 14 '23 11:02 hacdias