git2-rs
git2-rs copied to clipboard
Self-referential repo + tree
I've found that I often need to store a Repository
and a Tree<'repo>
that borrows from it. Unfortunately, this creates unsafe self-referential structs.
It would be helpful if this crate provided a safe abstraction for repo + tree together, e.g.
let repo = git2::Repository::open()?;
/// makes RepositoryWithTree
let repo = repo.with_tree(|repo| {
repo.find_commit(head)?.tree()
});
*repo // Derefs to Repository
repo.tree(); // returns borrowed Tree
Alternatively, there's owning-ref
that has OwningHandle
trait. It would require git2 types to implement its traits.
This I believe is one of the major design flaws of the git2
crate right now. It was one of the first pieces of Rust code I ever wrote and I was pretty excited to use lifetimes. I think that reasons like this make it so it wasn't the right choice for all types. Ideally Tree
would retain a strong reference to the repository or owner from which it came so it wouldn't have a lifetime. Similarly objects and other "major" items shouldn't have lifetimes but they do.
I think it would be reasonable though to implement traits from owning-ref
behind an optional feature, however, in the meantime.
Ideally Tree would retain a strong reference to the repository or owner from which it came so it wouldn't have a lifetime.
That would create a potentially misleading interface, where you could hold onto what you think is a standalone object, but that keeps various other large objects around for longer than expected. It would also add to the size of those objects, and some programs keep a large number of objects around.
I appreciate that git2 uses lifetimes the way it does. Personally, I'm hopeful that some future polonius-based rustc is capable of handling self-references "natively"; until then, using something like owning-ref
seems reasonable, and perhaps git2 could provide some additional support to make that easier.
Just adding my 2c worth of experience...
I just ran into this, tried making it work with thread-local statics (doesn't work because LocalKey::with
doesn't let the reference to the inner value escape the closure), tried making it work with OwningHandle
(realised I'd need to throw in some unsafe to make that work, which I'm trying to avoid in this project), and eventually ended up finding this issue.
My use case is a cache that persists for the lifetime of my program, and includes things like the current working tree diff so I don't need to recompute it all the time (once per thread would be fine) or pass it around to all the places that need it. So I'm trying to find a way to stash a Repository
and a Diff
together somewhere with a 'static
lifetime, that also doesn't require those structs to be Send
.
I feel like I'm missing something simple here, but I keep running into dead-ends. 😊
This I believe is one of the major design flaws of the
git2
crate right now. It was one of the first pieces of Rust code I ever wrote and I was pretty excited to use lifetimes. I think that reasons like this make it so it wasn't the right choice for all types. IdeallyTree
would retain a strong reference to the repository or owner from which it came so it wouldn't have a lifetime. Similarly objects and other "major" items shouldn't have lifetimes but they do.
Hey, I got a question which might sound silly. Why exactly are the lifetimes or references to the objects owner required at all? Maybe I am misreading the documentation of libgit2 (having never worked with it directly), but as far as I can understand the objects can exist on their own, as we are required to free almost all of them ourselves anyways, right?
I am currently getting bitten pretty hard by the lifetimes (and also the missing Send impls for objects, but a topic for another day) and would consider putting some work into a PR (if there is interest), ripping the lifetimes out for stuff like commits, objects, trees etc. But my C-knowledge is rudimentary enough to potentially be dangerous, so I'd definitely need some Feedback on the general approach.
Whatever the case, thank you (and all others who worked on this) a lot for this library :)
Maybe I am misreading the documentation of libgit2 (having never worked with it directly), but as far as I can understand the objects can exist on their own, as we are required to free almost all of them ourselves anyways, right?
TL;DR: you can't be sure about these things without diving into the libgit2 documentation and/or source.
First of all: deallocation/destruction being the library user's responsibility or not generally has nothing to do with whether an object can life on it's own or not. Especially not in C.
In languages like C, an object (e.g. tree or commit in this case) may contain arbitrary references to data held alive by another object (e.g. the repository). Free the repository while still having a commit in use and you'll end up with a use-after-free bug. Ideally, the documentation would contain some informal description of the requirements on the order of destruction, but as documentation is written by humans it may still be incomplete, and more often than not it doesn't make any declarations regarding this particular topic at all. In the case of libgit2, I'd have a closer look at some more general documentation they provide, then maybe the library reference (though you probably will not find any useful info regarding lifetimes there).
You'll probably need to dive into the source yourself, find out what each object references, whether you have shared data and who's responsible for desctucting and/or deallocating that data. You may encounter some form of reference counting (think Rc
or, if you're lucky, Arc
), which allows "safely" depending on data originally part of the repository after freeing it, but that's not a given. You really have to find out yourself. And maybe extend the documentation of libgit2 along the way.
Hi thank you a lot for the detailed explanation. The more I learn about C the less I understand how anyone ever manages to write correct or at least semi-correct software in it.
I'll definitely take a better look at the code and documentation to learn then, but I am also even more scared to touch any of it, at least if others might depend on my work in the future. So no PR to this project by me :)