talisman SUGGESTION: ignore pattern instead of or in addition to ignore file

Hi, I have a lot of LaTeX documentation in a git repo which refers, e.g., to gdrive docs or sheets. Talisman does a good job of identifying these references, because they look like base64 code :-)

I am only able to disarm whole files that contain these patterns. I'd like to disarm only the pattern itself, not the whole file, so that I will still be stopped if I make another mistake in the same tex file.

The same problem may occur in comments in any programming language.
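To illustrate why such document IDs get flagged, here is a minimal sketch, assuming an entropy-style heuristic of the kind trufflehog-like scanners use. It is not Talisman's actual detector code, and the function name and sample ID are made up for illustration.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Bits of entropy per character, based on the string's own character frequencies."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Made-up ID in the style of a gdrive document link (assumption, not a real document).
doc_id = "1A2b3C4d5E6f7G8h9I0jK1l2M3n4O5p6Q7r8S9t0UvWx"

# An ordinary word repeats letters and stays well below the ID's entropy,
# which is why a threshold on this value catches the ID but not normal prose.
print(shannon_entropy(doc_id), shannon_entropy("documentation"))
```

A long token with few repeated characters scores high whether it is a secret or a harmless link ID, so the heuristic alone cannot tell the two apart.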

Aleks

Nov 17 '17 10:11 alesti

Related PR about ignoring only specific detectors: #38. It also discusses whether we need a more powerful configuration format to support features like the one you describe.

Oct 12 '18 08:10 flosell

This can be done from .talismanrc by configuring ignore_detectors under fileignoreconfig. @alesti Please check and let us know if that solves what you were looking for.

Refer: https://github.com/thoughtworks/talisman#ignoring-specific-detectors

May 20 '19 08:05 harinee

I have the same problem: false positives for files that are modified regularly. Having to update or add entries in .talismanrc in the repo root to approve the new content creates an unnecessary burden. It tends to train developers either to add filecontent to ignore_detectors for the file or to always regenerate the checksum, and consequently secrets can slip through.

@harinee I think the ignore_detectors currently available are a little too broad: if I ignore filecontent for a file, I won't be alerted when any other token is subsequently added to it.

The ability to specify 'allowed' patterns or strings, per file or per repo, would be an improvement over ignoring all filecontent in this case. It is essentially a way of whitelisting. I think it's much better to be able to say that a rule doesn't apply to this line or this string because we know it's OK, rather than to exempt a whole file, unless the file is a set of constants that are referenced elsewhere or a test ssh key.

Currently awslabs/git-secrets takes this approach, which is why I'm planning to use it combined with the regex patterns from trufflehog for now. A scanner that can pick up high-entropy strings and also supports whitelisting individual strings or patterns would make me switch to it, as otherwise the false-positive rate is too high. I know I could switch off entropy checking in trufflehog, but as it still works with whole-file exclusions, that's not as good as being able to allow specific strings, which is what I'll get by adding all of its regexes to git-secrets as well.

I'd imagine this could be:

Global to repo: add a repoignore section with 'allowed_patterns' (or something)
Local to file: add an entry for 'allowed_patterns'
Local to line: support reading text containing talisman: ignore_xxxxx on the same line (in a comment), or above a block together with something that tells it how many lines to cover, to ignore anything triggered there

I'd imagine that allowed_patterns under file entries in fileignoreconfig would cover enough use cases that the other two options would only make for a nicer experience around particular edge cases.
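A rough sketch of how the allowed-pattern and same-line marker ideas above could behave. Everything here is hypothetical: the `scan` function, the `SUSPICIOUS` regex standing in for a real detector, and the `talisman: ignore` marker text are assumptions for illustration, not Talisman's implementation or configuration format.

```python
import re

# Hypothetical stand-in for a secrets detector: flags any run of 20 or more
# base64-like characters. A real filecontent detector works differently.
SUSPICIOUS = re.compile(r"[A-Za-z0-9+_=-]{20,}")

# Hypothetical marker for the per-line option.
INLINE_MARKER = "talisman: ignore"


def scan(text, allowed_patterns=()):
    """Return (line_number, token) pairs that remain suspicious after
    applying allowed patterns and same-line ignore markers."""
    allowed = [re.compile(p) for p in allowed_patterns]
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if INLINE_MARKER in line:
            continue  # the whole line is explicitly exempted
        remainder = line
        for pattern in allowed:
            # Blank out allowed matches so only the rest of the line is checked.
            remainder = pattern.sub(" ", remainder)
        for match in SUSPICIOUS.finditer(remainder):
            findings.append((lineno, match.group()))
    return findings


# Made-up sample: a gdrive-style link plus a token in a comment.
tex = "\n".join([
    r"\href{https://docs.google.com/document/d/1A2b3C4d5E6f7G8h9I0jK1l2M3n4O5p6Q7r8S9t0UvWx/edit}{Spec}",
    "% token = c2VjcmV0LXRva2VuLXZhbHVlLTEyMzQ1Njc4OTA=",
])
print(scan(tex))
print(scan(tex, allowed_patterns=[r"https://docs\.google\.com/[^\s}]+"]))
```

With the allowed pattern in place, the link no longer triggers a finding, but the token on the second line is still reported, which is the behaviour a whole-file ignore cannot provide.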

Thoughts?

Jun 19 '19 13:06 electrofelix
