cargo icon indicating copy to clipboard operation
cargo copied to clipboard

Allow patching dependencies with patch files

Open hetmankp opened this issue 7 years ago • 22 comments

Summary

This proposes to add an ability for Cargo to patch dependencies with patch files.

Problem description

The addition of the [patch] manifest section has made making localised changes to crates used by an individual project much easier. However, this still requires checking out the source code for the entire crate even if only a few lines of the source actually need to be modified.

This approach is fairly reasonable for individual Rust projects, however when embedded in a much larger build system that pulls hundreds of various projects together this can quickly become unwieldy (e.g. Buildroot, OpenWrt build system, etc). Rather than storing the entire source for any packages requiring modification, these build system instead store a set of patch files against a specific release version for the package in question.

Possible solution

This proposes adding an ability to let Cargo apply a set of patch files to a create after it is retrieved and extracted and before it is compiled. It would probably make sense to use the unified diff format given its popularity with tools like Git etc.

A previous proposal #3830 similar to this one, suggested the following syntax:

[dependencies.foobar]
version = "1.42"
patches = [
    "path/to/patch-1.patch",
    "path/to/patch-2/foobarize-a-bit-less.patch"
]

That syntax seems like a good starting point although someone with more Cargo experience might have a better idea.

Final remarks

One of the stated 2017 roadmap goals was for better integration with large build systems and I believe this would help further that goal.

hetmankp avatar Oct 20 '17 03:10 hetmankp

FWIW, I think applying patches is beyond Cargo's rational. Besides the intricacies of successfully applying patches (see the wilderness of git apply), at the end of the day Cargo itself is simply not a universal build system. For my money, it would be better turn this around and have the external build system fetch the source of the dependency, apply the patch and call cargo to build the desired crate, including the prepared, patched dependency.

lukaslueg avatar Oct 20 '17 10:10 lukaslueg

@lukaslueg I see where you're coming from but that would eliminate the primary purpose of Cargo; that is dependency management. Having build systems re-implement the crates.io dependecy tree and try to keep it in sync with upstream would be so onerous I don't think anyone would bother.

There is however an alternative to this proposal. Keep the behaviour of Cargo as it is now but expose more of the internal behaviour to outside build systems so they can have more control over the process. For example, what I'm picturing might be something like the following:

  1. A command for Cargo to list all the crates and their specific versions which are needed to build the project; listed in the correct build order.
  2. A way to download an individual crate from the spec in the above list to a custom location.
  3. A way to extract the crates (the external build system could do this but letting Cargo do it might make the process more consistent?)
  4. A way to build the project with the manually extracted crates (after the external build system has patched those crates)

The above is probably just one approach and someone more familiar with Cargo's internals might have a better suggestion, however the core idea here is a way for external build systems to hook into any part of the Cargo build process to make it easier to integrate into their own build process.

hetmankp avatar Oct 23 '17 00:10 hetmankp

+1

nhynes avatar Jan 17 '18 22:01 nhynes

Hello,

Here i use Rocket for a project. But there is a thing Rocket doesn't do well for my purpose : Chunked Transfer Encoding of few bytes. I had to patch Rocket AND Hyper (Rocket uses an old version of Hyper → 0.10.13).

There is an issue for this on Rocket, but i can't open an issue for this on Hyper as latest versions of hyper already has the patch (but Rocket isn't compatible with latest Hyper).

So, as long as Rocket doesnt use latest Hyper, i am stuck at forking both projects, to add my patches to them, while @hetmankp's proposal would allow me to store only 2 diff files of around 10 lines each.

That's why i +1 this issue. Sorry for my english if understanding me is hard.

gfriloux avatar Mar 29 '18 10:03 gfriloux

Another case where it might be useful, is when you need to provide a mini test project that expose a bug. You could provide a patch for the impacted library, and observe the bug by (de)commenting the patch from the Cargo.toml file.

I like the idea. Now, it may not be doable.

gfriloux avatar Mar 29 '18 10:03 gfriloux

Hey folks, I created a (very) dirty POC for a Cargo Patch sub command [1] which allows patching dependencies from crates.io. It is far away from being feature complete but it should be a nice base to get something started. I will work on it on the next days and weeks and help would be much appreciated.

[1] https://github.com/mettke/cargo-patch

mettke avatar May 20 '20 14:05 mettke

I've posted an implementation PR. Currently in draft.

da-x avatar Dec 19 '20 15:12 da-x

On the road to publishing an RFC for this, there's another use case I'd like to present - dealing with trivially broken dependencies upon a Cargo.lock update.

  • When updating a Cargo.lock to latest versions, sometimes we may discover dependencies that are broken either in compile time or run-time. Sometimes these breakages can remain unresolved for weeks.
  • Hand-crafting a set of versions to undo a breakage of an updated Cargo.lock can be more difficult (and sometimes impossible) compared to applying trivial patches to the broken versions by whatever means.
  • Specifically, patch files sometimes apply cleanly to more than one version, and they bear a small footprint on your source tree, and can be easily adjusted/updated/removed once Cargo.lock is updated.
  • Therefore, generating and using patch files may be more manageable than the alternatives (forking and vendoring).
  • This underscores the likelihood that this feature is mostly useful for repositories providing a Cargo.lock, i.e. for standalone programs or for testing, and likely to be coupled with a Cargo.lock source tracking.

Case study

Currently for my ptytest crate, if I recreate Cargo.lock the test suite currently breaks. However it breaks because of an indirect dependency: a dependency of vt100 crate, named enumset. See relevant issue in enumset.

The patching process with this feature can work as follows: I take or rebase an already existing fix for version 0.4.5, and generate the patch file using the following Git command:

$ git format-patch HEAD~1.. --stdout > /path/to/ptytest/patches/enumset-0.4.5.patch

With a full implementation of proposed change, patching over crates.io would work, and adding these two lines in the Cargo.toml would be sufficient:

[patch.crates-io]
enumset = { patch-files = ["patches/enumset-0.4.5.patch"] }

Discovery of patches

If anyone uses ptytest from above and also reset their Cargo.lock, they would have to import this the patch I mentioned above to their environment as well. It's also easier for them than vendoring, forking by themselves or relying on my own github fork (hell no, safer to just rely on crates.io). I wonder if is worth to add a Cargo command to list all the patches used by all dependencies and indirect dependencies.

Patch files vs a user-provided patching program

An alternative to patch files is providing an external program that would patch the crate, for example like with build.rs, but before the dependencies are fully calculated, e.g. a new and separately provided preprocess.rs, in addition to build.rs. For build.rs we provide a protocol that tells Cargo whether the program needs to be re-run. For this case, we can do the same.

This approach is much thicker than patch_files and requires specifying how the preprocess.rs would affect the build process, the calculated dependencies, and its protocol with the Cargo program. Also, it would run for libraries as well, and it is less clear what it would do for them. It also poses some portability concerns.

da-x avatar Jan 16 '21 12:01 da-x

@da-x Did you ever get around to publishing that RFC? I don't see it on https://rust-lang.github.io/rfcs/

PanopticaRising avatar Aug 18 '21 18:08 PanopticaRising

@PanopticaRising not yet, you can go ahead and work on it if you'd like (just let me know).

da-x avatar Aug 19 '21 10:08 da-x

I see there was an implementation, as well as good idea and good usecases.

I just wanted to add my own support for this feature with an example, that actually harms the entire ecosystem a bit.

So basically this is what triggered it: https://github.com/willcrichton/winit/commit/5fac5afb726a7a159eaea86027a4872dafb20f81 I wanted to use that in my bevy app I was playing around with. Now the crates.io version of bevy is 0.5, bevy_winit 0.5, which is a subcrate of bevy 0.5 has winit 0.24 included. I need winit 0.25, and i need to apply that patch to winit 0.25.

I also use a few libraries, "bevy_mod_picking", "bevy_web_fullscreen" and other small but useful crates, they are all pointing towards bevy 0.5.

What ended up happening, is that I need to download winit, apply that patch to winit. I then proceeded to download bevy, patch that so that it points to my newly pathched winit. Now bevy is a custom version and I need all the different bevy libraries to use the same bevy installation. I did not get the [patch.crates-io] to work as intended. So i vendored all the libraries i use, changed one line in the Cargo.toml file of each of the libraries to point to my patched bevy, that points to my patched winit.

I dont want to maintain or have a fork in my github that only changes one line of the Cargo toml file, to point to a file on my local disk. So what I ended up doing was removing the .git repo information, and are now keeping all of the bevy libraries I use as a local copy with all git history scrubbed.

The harm is basically that any fixes or changes I now do the subrepos, are not in any kind of git structure, and I cannot send a merge request in a simple way anymore. Had i been able to just store some git patches to the crates in question, I would have had an easier time to contribute back to the ecosystem, I suspect there are more silent vendoring, with bugfixes that never see the light of day because of this.

TLDR: wanted to apply one patch to a subdependency, ended up needing to vendor 10 different libraries to in total change maybe 20 lines of code, and removing the individual git history of each of those.

TotalKrill avatar Sep 15 '21 16:09 TotalKrill

@TotalKrill But wouldn't that concrete and probably common problem be better fixed with some way to pin/override dependencies of dependencies?

mohe2015 avatar Sep 15 '21 22:09 mohe2015

@mohe2015 yeah, but a way to apply patches would mean i get a way to do just that, and in the same way maybe fix some name change or other very minor, bit still breaking change in that library.

patches are also a nice way of keeping your own project tidy. It would be a lot easier for someone to test a patch from a merge request and similar as well.

I would however limit this cargo feature to work on only binaries, with the assumption that binaries are the final step in these deeply nested dependency trees we are seeing in projects.

TotalKrill avatar Sep 15 '21 22:09 TotalKrill

@da-x @PanopticaRising @mettke @gfriloux @nhynes @hetmankp

I started on a draft of this RFC proposal, this is my first attempt at creating such a proposal, and I based it upon my example in this case. Feel free to help out, extend or comment on this RFC draft, as well as to recommend other interested parties to take a look. You can find it here:

https://github.com/rust-lang/rfcs/pull/3177

TotalKrill avatar Sep 16 '21 18:09 TotalKrill

I really just wanted to comment on this thread as it was something that I really, badly need right now, so I was hoping that by fervently posting nice things in this thread, it would magically appear. At the very least, I'm expressing my support for this feature.

Whoever implements this, thank you so much.

rwthompsonii avatar Jan 06 '22 19:01 rwthompsonii

I really just wanted to comment on this thread as it was something that I really, badly need right now, so I was hoping that by fervently posting nice things in this thread, it would magically appear. At the very least, I'm expressing my support for this feature.

Whoever implements this, thank you so much.

Same. Seems like a pretty useful feature. I need it myself and I'm trying to figure out a way around it.

Titaniumtown avatar Feb 25 '22 23:02 Titaniumtown

I really just wanted to comment on this thread as it was something that I really, badly need right now, so I was hoping that by fervently posting nice things in this thread, it would magically appear. At the very least, I'm expressing my support for this feature. Whoever implements this, thank you so much.

Same. Seems like a pretty useful feature. I need it myself and I'm trying to figure out a way around it.

I'll second these. Kind of desired feature!

matislovas avatar Jul 13 '22 13:07 matislovas

So I agree that it goes against the soul of cargo. However, this world isn't perfect; plenty of crates have things in them that need to be fixed, at least temporarily until the depended upon crate is updated. I think we all understand and are okay with the fact that a patching system would be unstable, but sometimes it's just necessary.

MixusMinimax avatar Oct 05 '22 16:10 MixusMinimax

So I agree that it goes against the soul of cargo. However, this world isn't perfect; plenty of crates have things in them that need to be fixed, at least temporarily until the depended upon crate is updated. I think we all understand and are okay with the fact that a patching system would be unstable, but sometimes it's just necessary.

Absolutely. Not to mention it's common practice for projects that aggregate a large number of other projects. I'm thinking of build systems like Buildroot or OpenWRT. Needing to get something working quickly in production while you also kick off the lengthier process of updating the upstream project can often be very important. Sometimes you can also be stuck having to use a given crate version due to a web of dependencies within your system and an update to the fixed upstream version may not be possible until some future point. Having to create repository clones for every package that needs a small two line fix can become arduous when you're managing hundreds of packages.

I guess the short of it is, many developers are often engaged in plumbing work to interconnect many upstream pieces and while plumbing work can be dirty and unpleasant it's still vital to keep our society going. It would be nice to make that work just a little less unpleasant.

hetmankp avatar Oct 17 '22 05:10 hetmankp

See also rust-lang/rfcs#3177

epage avatar Oct 17 '23 14:10 epage