dulwich icon indicating copy to clipboard operation
dulwich copied to clipboard

Blob creation ignores line-ending policies

Open BoarGules opened this issue 7 years ago • 8 comments

If I read the code correctly, dulwich.index.blob_from_path_and_stat() pays no attention to the core.autocrlf setting and pays no attention to anything in the .gitattributes file.

This is unexpected, and on Windows it is a particular problem because code staged and committed using Dulwich will have line endings that are inconsistent with commits by other tools that do follow Git line-ending policies.

BoarGules avatar Oct 03 '18 09:10 BoarGules

On Wed, Oct 03, 2018 at 02:58:26AM -0700, BoarGules wrote:

If I read the code correctly, dulwich.index.blob_from_path_and_stat() pays no attention to the core.autocrlf setting and pays no attention to anything in the .gitattributes file.

This is unexpected, and on Windows it is a particular problem because code staged and committed using Dulwich will have line endings that are inconsistent with commits by other tools that do follow Git line-ending policies. Unfortunately there are a number of places where Dulwich doesn't support C Git options like this.

A pull request to add support for autocrlf would be great.

-- Jelmer Vernooij [email protected] PGP Key: https://www.jelmer.uk/D729A457.asc

jelmer avatar Oct 05 '18 08:10 jelmer

I'm working on a patch. Implementing the line-ending policy isn't hard. But first I have to read the configuration to discover what the policy is. I am running into NotImplementedErrors in dulwich.config and filling those gaps is taking longer than I would like.

BoarGules avatar Oct 05 '18 10:10 BoarGules

On Fri, Oct 05, 2018 at 03:41:28AM -0700, BoarGules wrote:

I'm working on a patch. Implementing the line-ending policy isn't hard. But first I have to read the configuration to discover what the policy is. I am running into NotImplementedErrors in dulwich.config and filling those gaps is taking longer than I would like. What method exactly is raising NotImplementedError?

You probably want to add a autocrlf argumen to Blob.from_path_and_stat() and then set that argument based on the config in Repo.stage().

You should be able to get a config object from a repository - by calling:

self.get_config_stack().get_boolean(("core", ), "autocrlf")

jelmer avatar Oct 06 '18 20:10 jelmer

According to my reference, the place to start is core.eol, which is almost never defined, and so raises an unavoidable KeyError, because the .get() method lacks a default parameter. It's hard to discover what is there because the discovery methods itersections(), iteritems(), has_section() all raise NotImplementedErrors. The module config needs attention in any case because it doesn't know to look in common Windows config file locations. That is what is taking longer than I would like.

BoarGules avatar Oct 08 '18 07:10 BoarGules

On Mon, Oct 08, 2018 at 12:21:31AM -0700, BoarGules wrote:

According to my reference, the place to start is core.eol, which is almost never defined, and so raises an unavoidable KeyError, because the .get() method lacks a default parameter. It's hard to discover what is there because the discovery methods itersections(), iteritems(), has_section() all raise NotImplementedErrors. The module config needs attention in any case because it doesn't know to look in common Windows config file locations. That is what is taking longer than I would like. There shouldn't be a need to iterate over sections or items here - just calling .get() and catching the KeyError is what I'd expect to happen here.

Jelmer

-- Jelmer Vernooij [email protected] PGP Key: https://www.jelmer.uk/D729A457.asc

jelmer avatar Oct 08 '18 07:10 jelmer

I have a lot of lookups to do. I don't want to litter the code with try...except blocks.

BoarGules avatar Oct 08 '18 07:10 BoarGules

I've started to work on this issue in https://github.com/dulwich/dulwich/pull/674

As @BoarGules already started from the configuration point of view, I started from the line-ending conversion point of view.

Lothiraldan avatar Dec 10 '18 15:12 Lothiraldan

Here is an example of how to clone a repository with pre-defined autocrlf settings using the low-level "plumbing". I looked at the examples provided in #674 for inspiration - though all those examples were for initializing and committing to a bare repository. I was not able to find a way of using the porcelain (without monkey patching) as it handles the repository initialization for you with assumptions.

from dulwich import client, index
from dulwich.repo import Repo

repo_path = <some_repo>
target = "./local"
c, path = client.get_transport_and_path(repo_path)
repo = Repo.init(target)

config = repo.get_config()
config.set("core", "autocrlf", False") # or True, or auto etc.
config.write_to_path()
# At this point, you'll get a bare repository with a `.git/config` autocrlf entry

remote_refs = client.fetch(path, repo, determine_wants=repo.ojbect_store.determine_wants_All)

repo[b"HEAD"] = remote_refs.refs[b"HEAD"]

tree = repo[b"HEAD"].tree
index_file = repo.index_path()
index.build_index_from_tree(repo.path, index_file, repo.object_store, tree)

themanifold avatar Feb 14 '19 12:02 themanifold