Parse .gitignore As Git Does

Open Kurt-von-Laven opened this issue 3 years ago • 9 comments

Describe the bug .gitignore files are parsed in a manner inconsistent with their specification.

To Reproduce Steps to reproduce the behavior:

  1. Set up a Yarn 2 project using Plug'n'Play.
  2. Create a .gitignore file as recommended by yarn, optionally replacing the .yarn/* with .yarn/**.
  3. Set the "gitignore" and "blame" options to true in .jscpd.json.
  4. Install some unplugged dependencies using yarn.
  5. PnPify some SDKs using yarn pnpify --sdks.
  6. Commit your work to git.
  7. Run jscpd on the current working directory.
  8. The .yarn/sdks/ and .yarn/unplugged/ directories are searched for duplicates despite being gitignored, resulting in a "fatal: no such path" crash from git blame since these directories are not tracked.

Expected behavior I expected the files and directories within the .yarn/ directory that were not listed in .gitignore with a leading ! to be excluded.

Desktop (please complete the following information):

  • OS: Ubuntu
  • OS Version 21.04
  • NodeJS Version 16.2.0
  • jscpd version 3.3.26

Additional context I have not fully investigated the issue, but I observed that gitignore-to-glob filters hidden files and directories out of the gitignore file, so I recommend simply running git-check-ignore and only supporting this feature when Git is installed. Alternatively, this issue may be resolved by switching to [email protected].

Workaround Add "./.yarn/**" to the "ignore" array in .jscpd.json.

Kurt-von-Laven avatar Jun 07 '21 05:06 Kurt-von-Laven

My problem is that .gitignore is not honored regardless of whether I pass the --gitignore command-line switch or put "gitignore": true in .jscpd.json.

The workaround is the only thing that works.

borys-pc33 avatar Nov 30 '21 10:11 borys-pc33

@borys-pc33, I am not able to reproduce your issue. I recommend posting your .jscpd.json on Stack Overflow and asking about it there.

Kurt-von-Laven avatar Nov 30 '21 15:11 Kurt-von-Laven

{ "threshold": 7, "reporters": ["html", "console", "badge"], "ignore": ["/node_modules/", "/bin/", "/obj/", ".git"], "absolute": false, "gitignore": true, "output": "jscpd-report", "minLines": 5, "minTokens": 40, "mode": "mild", "formatsExts": { "css": ["css"], "liquid": ["liquid"], "markdown": ["md"], "yaml": ["yaml", "yml"], "json": ["json"], "csharp": ["cs"] } }

I think I can make a clone of the repo with the irrelevant information removed.

borys-pc33 avatar Nov 30 '21 19:11 borys-pc33

Yeah, that would be helpful, in particular so your .gitignore can be seen. I don't see any issue with your .jscpd.json offhand. I don't think our issues are related. Also, see GitHub's guide to formatting code properly in Markdown.

Kurt-von-Laven avatar Dec 01 '21 05:12 Kurt-von-Laven

Hello,

Sorry, I was a little busy. Here is a public repo, https://github.com/borysl/jscpd-test, with the minimal settings necessary to reproduce the failure. The previous commit, https://github.com/borysl/jscpd-test/commit/f024de168bca11f1e51b1d0de4faf7b50132eef3, works.

borysl avatar Dec 08 '21 17:12 borysl

Apologies, @borysl; somehow I completely missed your message. Your configuration issue was quite tricky and ultimately unrelated to the issue I originally reported. Here is a patch: borysl/jscpd-test#1. If we all agree that gitignore-to-glob fails silently with Windows line endings, I can file a bug upstream.
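To illustrate the line-ending hypothesis, here is a minimal sketch of how a .gitignore reader could accept both LF and CRLF files. The function name parseGitignoreLines is hypothetical and this is not the code of gitignore-to-glob or jscpd; it only shows that splitting on "\n" alone would leave a trailing "\r" on every pattern, whereas splitting on an optional "\r" avoids that.

```typescript
// Hypothetical helper, not part of jscpd or gitignore-to-glob.
// Splits .gitignore content into pattern lines, accepting LF and CRLF endings.
function parseGitignoreLines(content: string): string[] {
  return content
    .split(/\r?\n/) // consume the "\r" of Windows line endings together with "\n"
    .filter((line) => line.trim() !== "" && !line.startsWith("#")); // drop blanks and comments
}

// A .gitignore saved with Windows line endings.
const crlfContent = "/cache/\r\n# build output\r\n/dist/\r\n";

console.log(parseGitignoreLines(crlfContent)); // → [ '/cache/', '/dist/' ]
```

This sketch deliberately ignores other details of the format, such as escaped "#" characters and trailing-space handling.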

Kurt-von-Laven avatar Apr 30 '22 08:04 Kurt-von-Laven

It looks like the issue is somewhat related to hidden paths.

Given the following .gitignore:

/.cache/
/cache/

The difference in behavior is easy to reproduce by storing the npm cache in the project's root directory:

# Produces no duplicate report from the `cache` directory
g clean -fdx && npm_config_cache=cache npx [email protected] --gitignore .
# Produces duplicate report from the `.cache` directory
g clean -fdx && npm_config_cache=.cache npx [email protected] --gitignore .

Judging from the related source code, this probably comes from either jscpd's custom pattern transformation or a bug in the gitignore-to-glob package it uses.

In the latter case, we could replace it with a better-maintained package named ignore.

A dirty workaround that may fit some situations: use the --ignore='./.cache/**/*' option alongside --gitignore.

soullivaneuh avatar Mar 01 '23 23:03 soullivaneuh

As mentioned in the original issue, the root cause is that gitignore-to-glob deliberately filters hidden files and directories out of the .gitignore. Using the ignore package instead seems like a great solution.

Kurt-von-Laven avatar Mar 01 '23 23:03 Kurt-von-Laven

jscpd also doesn't seem to understand that .gitignore lines can be prefixed with a /.

For example, if I have /foo in my .gitignore, I still see dupes in the foo directory. However, if I remove the leading slash, jscpd ignores it correctly (but now any file or subdirectory named foo, at any depth, is ignored as well).
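For reference, here is a rough sketch of how a single .gitignore line could be turned into glob patterns while respecting the leading slash. It follows the documented gitignore rules that a slash at the start or in the middle anchors a pattern to the .gitignore's directory and that a trailing slash matches directories only. The names GlobRule and gitignoreLineToGlobs are made up for illustration. This is a simplified assumption of what a converter might do, not jscpd's or gitignore-to-glob's actual code, and it skips escapes such as "\#" and "\!".

```typescript
// Hypothetical, simplified conversion of one .gitignore line into glob patterns
// relative to the directory that contains the .gitignore file.
interface GlobRule {
  negated: boolean; // true when the line starts with "!" (re-include)
  globs: string[];
}

function gitignoreLineToGlobs(rawLine: string): GlobRule | null {
  let line = rawLine.replace(/\r$/, ""); // tolerate Windows line endings
  if (line.trim() === "" || line.startsWith("#")) return null; // blank or comment

  const negated = line.startsWith("!");
  if (negated) line = line.slice(1);

  // A trailing slash means "directories only"; it does not anchor the pattern.
  const directoryOnly = line.endsWith("/");
  if (directoryOnly) line = line.slice(0, -1);

  // A slash at the start or in the middle anchors the pattern to the root.
  const anchored = line.includes("/");
  if (line.startsWith("/")) line = line.slice(1);

  // Unanchored patterns may match at any depth.
  const base = anchored ? line : `**/${line}`;
  const globs = directoryOnly ? [`${base}/**`] : [base, `${base}/**`];
  return { negated, globs };
}

console.log(gitignoreLineToGlobs("/foo"));     // globs: foo, foo/**
console.log(gitignoreLineToGlobs("foo"));      // globs: **/foo, **/foo/**
console.log(gitignoreLineToGlobs("/.cache/")); // globs: .cache/**
```

Hidden paths such as /.cache/ go through the same code path as /cache/ here; nothing filters out entries that start with a dot.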

FFdhorkin avatar Oct 21 '23 01:10 FFdhorkin