New remotes save with unexpected metadata in the presence of global configuration
Current behavior 😯
When a user has global configuration like remote.origin.prune = true, a repository configuration with no remote is opened, and a remote is created and saved with that name to the repository’s configuration, the new configuration is written to the existing section from the global configuration. This means that the metadata from the global configuration (including trust) is used instead of the file’s metadata, and an attempt to save the modified configuration with config.write_to_filter(&mut config_file, |section| section.meta() == config.meta()) will omit the new remote.
This differs from the behaviour when there is no configuration for that remote outside of the repository configuration file, or the behaviour where a separate remote section already exiets in the repository configuration.
Expected behavior 🤔
A new remote section is created if it does not already exist in the gix::config::File, even if there is configuration for that remote from other sources.
In general I’m not sure when you would ever want the behaviour of the current section_mut family of functions, and I suspect this is downstream of the fact that gix::config::File mixes together sections from all sources, which also makes editing and saving configuration somewhat inconvenient. I think it is likely that a better design would have a gix::config::File per configuration source, with an abstraction on top that layers them together for look‐up of values. Mutating an abstraction consisting of the combination of multiple cascading configurations doesn’t make that much sense; you want to use it to decide the effective values, but operate on one specific file for mutation and writing.
However, this could be worked around in the remote API by simply creating a new section if one with metadata that matches the actual file doesn’t already exist.
Git behavior
yuyuko:/v/f/y/m/T/tmp.vVfQbOHnTq
❭ git init
Initialized empty Git repository in /private/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/tmp.vVfQbOHnTq/.git/
yuyuko:/v/f/y/m/T/tmp.vVfQbOHnTq
❭ git -c remote.origin.prune=true remote add origin https://example.com/
yuyuko:/v/f/y/m/T/tmp.vVfQbOHnTq
❭ cat .git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
ignorecase = true
precomposeunicode = true
[remote "origin"]
url = https://example.com/
fetch = +refs/heads/*:refs/remotes/origin/*
Steps to reproduce 🕹
use std::fs::File;
use tempfile::TempDir;
type Error = Box<dyn std::error::Error>;
fn save_config(config: &gix::config::File) -> Result<(), Error> {
let mut config_file = File::create(config.meta().path.as_ref().unwrap())?;
config.write_to(&mut config_file)?;
Ok(())
}
fn dump_remote_sections(header: &str, config: &gix::config::File) -> Result<(), Error> {
println!("# {header}");
println!("# {:?}", config.meta());
for section in config.sections_by_name("remote").unwrap() {
println!("## {:?}", section.meta());
section.write_to(&mut std::io::stdout())?;
}
println!();
Ok(())
}
fn open_repo(repo_dir: &TempDir) -> Result<gix::Repository, Error> {
Ok(gix::open::Options::default()
.with(gix::sec::Trust::Reduced)
.config_overrides([b"remote.origin.prune = true"])
.open(repo_dir.path())?
.to_thread_local())
}
fn demonstrate_bug(repo_dir: &TempDir) -> Result<(), Error> {
let repo = open_repo(repo_dir)?;
let mut config = repo.config_snapshot().clone();
dump_remote_sections("Initial state", &config)?;
repo.remote_at("https://example.com/")?
.save_as_to("origin", &mut config)?;
dump_remote_sections("After saving remote", &config)?;
save_config(&config)?;
dump_remote_sections("After reload", &open_repo(repo_dir)?.config_snapshot())?;
Ok(())
}
fn main() -> Result<(), Error> {
let repo_dir = tempfile::tempdir()?;
gix::init(&repo_dir)?;
demonstrate_bug(&repo_dir)?;
Ok(())
}
Output:
# Initial state
# Metadata { path: Some("/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/.tmpke1Cbq/.git/config"), source: Local, level: 0, trust: Reduced }
## Metadata { path: None, source: Api, level: 0, trust: Full }
[remote "origin"]
prune = true
# After saving remote
# Metadata { path: Some("/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/.tmpke1Cbq/.git/config"), source: Local, level: 0, trust: Reduced }
## Metadata { path: None, source: Api, level: 0, trust: Full }
[remote "origin"]
prune = true
url = https://example.com/
# After reload
# Metadata { path: Some("/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/.tmpke1Cbq/.git/config"), source: Local, level: 0, trust: Reduced }
## Metadata { path: Some("/var/folders/yd/mh726b5d2vqfyp132jtzq9t80000gn/T/.tmpke1Cbq/.git/config"), source: Local, level: 0, trust: Reduced }
[remote "origin"]
prune = true
url = https://example.com/
## Metadata { path: None, source: Api, level: 0, trust: Full }
[remote "origin"]
prune = true
Apologies for the late response, thanks a lot for reporting and sorry for the hassle.
I understand that section_mut() as used in Remote::save_as_to() finds a global section and changes it instead. When saving that configuration, it writes everything to the local configuration due to the way this is implemented, but also code has no way of telling global and local modifications apart anymore.
After reloading, this is also what we see - global and local modifications were merged effectively, something that isn't happening when Git is performing the same operation.
However, this could be worked around in the remote API by simply creating a new section if one with metadata that matches the actual file doesn’t already exist.
I agree that a fix would be pass only the local config to begin with, which currently isn't super convenient. Repository::config_snapshot() will always provide the fully-merged result.
To me, a fix would be a documentation change to make clear that the passed configuration isn't filtered, and is expected to be filtered based on the user needs. Typically, this config should only contain local sections, it kind of overrides everything that is present already anyway.
As a second step, there could be a config_snapshot(_mut)() variant which returns pre-filtered configuration, maybe even with a closure, or add a method to filter sections to get a similar effect as git2::Config::open_level().
Do you think that would solve the problem decently? Maybe there are other solutions you imagined? In any case, your help with this would be very welcome. Thank you.
I understand that
section_mut()as used inRemote::save_as_to()finds a global section and changes it instead. When saving that configuration, it writes everything to the local configuration due to the way this is implemented, but also code has no way of telling global and local modifications apart anymore.After reloading, this is also what we see - global and local modifications were merged effectively, something that isn't happening when Git is performing the same operation.
Right. So when we actually write out configuration in Jujutsu, we do config.write_to_filter(&mut config_file, |section| section.meta() == config.meta()), and this basically works fine except in this edge case and a few others. The result in that edge case is actually kinda worse than in the reproducer program I’ve shown here, because the section merging can end up with us removing the remote from the repository configuration entirely. We have documented some of these edge cases in https://github.com/jj-vcs/jj/commit/045b17e38edb886ba591fdbbd92399ddeb8fb01f and worked around this one in https://github.com/jj-vcs/jj/commit/af67d1d8c4ab4e53a3b8ba4862af07e5831428bc by creating a new empty section before writing out the remote.
I agree that a fix would be pass only the local
configto begin with, which currently isn't super convenient.Repository::config_snapshot()will always provide the fully-merged result.To me, a fix would be a documentation change to make clear that the passed configuration isn't filtered, and is expected to be filtered based on the user needs. Typically, this config should only contain local sections, it kind of overrides everything that is present already anyway.
As a second step, there could be a
config_snapshot(_mut)()variant which returns pre-filtered configuration, maybe even with a closure, or add a method to filter sections to get a similar effect asgit2::Config::open_level().
I am not sure this would be a great or at least complete fix. It would mean that you can’t really use the higher-level Remote API to do manipulation of remote configuration in files, since the Remote API operates on repositories rather than configurations. There’s also the existing problem where you can’t really make a new remote and save it to a repository’s mutable configuration, because both of them want to borrow the repository.
My feeling is that a more holistic fix might look like: gix::config::File represents one single file losslessly, i.e. it has a single Metadata, and gix::config::Layered (or something) stores a stack of gix::config::Files, and has lookup functions that resolves cascading overrides. You can borrow individual gix::config::Files from it to mutate them, but it does not directly offer mutation. APIs read from gix::config::Layered, but save to gix::config::Files. Remotes are created from gix::config::Layered, so you can construct one from a repository’s entire configuration in order to get the resolved configuration for a remote, or filter out a singleton gix::config::Layered with only the repository configuration for remote management operations, and write back modified configuration to the same. Since it keeps around a reference it can save directly to one of the Files backing its configuration rather than needing one passed in. That would also allow access to more elaborate remote‐related configuration, like the various other remote settings and branch remote config, which would help implement things like remote renaming.
Or something like that; it’s just a sketch and there are trade‐offs here. Unfortunately I don’t think I’ll have time to do a restructuring this big any time soon :( The workaround will probably work fine for us and we can do a more complicated one if our remote management needs become more complex.
I understand that
section_mut()as used inRemote::save_as_to()finds a global section and changes it instead. When saving that configuration, it writes everything to the local configuration due to the way this is implemented, but also code has no way of telling global and local modifications apart anymore. After reloading, this is also what we see - global and local modifications were merged effectively, something that isn't happening when Git is performing the same operation.Right. So when we actually write out configuration in Jujutsu, we do
config.write_to_filter(&mut config_file, |section| section.meta() == config.meta()), and this basically works fine except in this edge case and a few others. The result in that edge case is actually kinda worse than in the reproducer program I’ve shown here, because the section merging can end up with us removing the remote from the repository configuration entirely. We have documented some of these edge cases in jj-vcs/jj@045b17e and worked around this one in jj-vcs/jj@af67d1d by creating a new empty section before writing out the remote.
‼️
I am not sure this would be a great or at least complete fix. It would mean that you can’t really use the higher-level
RemoteAPI to do manipulation of remote configuration in files, since theRemoteAPI operates on repositories rather than configurations. There’s also the existing problem where you can’t really make a new remote and save it to a repository’s mutable configuration, because both of them want to borrow the repository.
I see, thanks for pointing this out! The save_as_to() method was created to be able to perform a full clone, and as such it's more of a hack than anything that is fully thought out. And of course, that comes with (unanticipated) problems that you now have to deal with - I am sorry for that.
My feeling is that a more holistic fix might look like:
gix::config::Filerepresents one single file losslessly, i.e. it has a singleMetadata, andgix::config::Layered(or something) stores a stack ofgix::config::Files, and has lookup functions that resolves cascading overrides. You can borrow individualgix::config::Files from it to mutate them, but it does not directly offer mutation. APIs read fromgix::config::Layered, but save togix::config::Files.
I also think something like this is needed, but only when writing files back is desired. There are legitimate cases when one wants to do in-memory edits, and this is (seemingly) fine with the current system.
However, when writing back changes is desired, the current system definitely falls over and is hard to use, the save_as_to() method being a good example for that (I tried and failed).
However, I also think that git2::Config::open_level() can show a way towards writing files back, somewhat in the vein of what you are proposing.
Without going into any detail, it feels like gix can abstract over this when offering an API to make writing changes back to disk easier, specifically cross-checked with use-cases like they occur with remotes: create/update/rename/delete should all work intuitively with it. All that is assuming that the underlying system coming with gix-config is powerful enough.
Finally, once gix has found a way, there might be opportunities to push that API down to gix-config.
Remotes are created from
gix::config::Layered, so you can construct one from a repository’s entire configuration in order to get the resolved configuration for a remote, or filter out a singletongix::config::Layeredwith only the repository configuration for remote management operations, and write back modified configuration to the same. Since it keeps around a reference it can save directly to one of theFiles backing its configuration rather than needing one passed in. That would also allow access to more elaborate remote‐related configuration, like the various other remote settings and branch remote config, which would help implement things like remote renaming.
That is an interesting thought, and I admittedly don't understand how it would work, but we will get to it when we get to it :). No matter what, any update to how configuration files are written needs to be able to provide answers to the issues that came up when dealing with remote configuration.