Trufflehog ignoring API tokens in Jupyter notebook and Python script

Open asmaier opened this issue 2 years ago • 2 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

TruffleHog Version

trufflehog 3.6.7

Trace Output

https://gist.github.com/asmaier/6cb73713eff0830e86eaed024128385f

Expected Behavior

I expect trufflehog to find the API token in each of the two files.

Actual Behavior

Trufflehog finds the leaked API token in neither the Python file nor the Jupyter notebook.

Steps to Reproduce

I created a minimal git repository with two files:

test.py

user = "[email protected]"
token = "fmq9UXkjXpSqD7UoGgqRhhntoI3sL0z9Cs6RlVnn"

test.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11290b8f-0422-426e-a13a-a08fe45246a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "user = \"[email protected]\"\n",
    "token = \"fmq9UXkjXpSqD7UoGgqRhhntoI3sL0z9Cs6RlVnn\""
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}

Then I ran

git init
git add test.py test.ipynb
git commit -a -m "First commit"

In the same directory I executed

trufflehog git --no-update --no-verification --trace file://.

No problem was reported. The API token was not detected in either file.

Environment

  • Mac OS X
  • 12.4

Additional Context

gitleaks detects the API token in the Python file, but it also misses the one in the Jupyter notebook.

❯ gitleaks detect -v -l debug --source .

11:36AM DBG No gitleaks config found in path .gitleaks.toml, using default gitleaks config
11:36AM DBG executing: /usr/local/bin/git -C . log -p -U0 --full-history --all
{
	"Description": "Generic API Key",
	"StartLine": 2,
	"EndLine": 2,
	"StartColumn": 2,
	"EndColumn": 51,
	"Match": "token = \"fmq9UXkjXpSqD7UoGgqRhhntoI3sL0z9Cs6RlVnn\"",
	"Secret": "fmq9UXkjXpSqD7UoGgqRhhntoI3sL0z9Cs6RlVnn",
	"File": "test.py",
	"Commit": "d544821afd9b44aa2a741f5c2c38ea0da5bddd0e",
	"Entropy": 4.734184,
	"Author": "Andreas Maier",
	"Email": "[email protected]",
	"Date": "2022-07-05T09:12:35Z",
	"Message": "Second commit",
	"Tags": [],
	"RuleID": "generic-api-key"
}
11:36AM DBG 2 commits scanned. Note: this number might be smaller than expected due to commits with no additions
11:36AM INF scan completed in 67.795762ms
11:36AM WRN leaks found: 1
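
The "Entropy" field in the report above appears to be the Shannon entropy of the matched secret, in bits per character. As a rough check (a minimal sketch, not gitleaks' own code; the function name `shannon_entropy` is made up for illustration), the value can be recomputed like this:

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # Sum -p * log2(p) over the relative frequency p of each distinct character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# The token from test.py in this issue.
token = "fmq9UXkjXpSqD7UoGgqRhhntoI3sL0z9Cs6RlVnn"
print(round(shannon_entropy(token), 6))  # 4.734184, same as the "Entropy" field
```

A 40-character string in which most characters occur only once scores high on this measure, which is the kind of signal a generic rule can use when there is no service-specific format to match.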

References

asmaier avatar Jul 05 '22 09:07 asmaier

Hi Asmaier. We do detect keys and tokens in Jupyter notebooks, just not generic keys. Our engine tries to identify which service a key belongs to, and if it can't, it won't report it. This is to reduce false positives and only alert on things we have high confidence in. The older V2 version of trufflehog actually did have a rule for generic keys that would have alerted on this, but we removed that capability because it produced too many false positives.
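
To make the distinction concrete, here is a minimal sketch of service-specific matching. It is not TruffleHog's actual detector code: the table `SERVICE_PATTERNS` and the function `identify_service` are hypothetical, and the two patterns are just the widely documented prefixes of AWS access key IDs and classic GitHub personal access tokens.

```python
import re
from typing import Optional

# Illustrative patterns only; NOT TruffleHog's real detectors.
SERVICE_PATTERNS = {
    "AWS access key ID": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "GitHub personal access token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def identify_service(candidate: str) -> Optional[str]:
    """Return the first service whose key format matches, or None."""
    for service, pattern in SERVICE_PATTERNS.items():
        if pattern.search(candidate):
            return service
    return None


# The example key ID used throughout the AWS documentation.
print(identify_service("AKIAIOSFODNN7EXAMPLE"))
# The token from this issue has no recognizable service format.
print(identify_service("fmq9UXkjXpSqD7UoGgqRhhntoI3sL0z9Cs6RlVnn"))
```

Under this scheme the first call names a service and the second returns `None`, so a tool that only reports identified keys would stay silent about the token in `test.py`.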

dxa4481 avatar Jul 06 '22 05:07 dxa4481

I understand that false positives are annoying. However, for a tool like trufflehog, false negatives mean leaked API keys, which can potentially cause huge damage and costs. So it would be nice if trufflehog could be configured to reduce the number of false negatives when that is needed.

Therefore, a solution could be to offer an option that activates a rule for detecting generic keys (maybe with a warning in the documentation that it might increase the number of false positives).
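
As an illustration of what such an opt-in rule might look like, here is a minimal sketch. It is an assumption-laden example, not an existing trufflehog or gitleaks feature: the regex `ASSIGNMENT`, the threshold `MIN_ENTROPY`, and all function names are hypothetical. It flags string literals assigned to names containing "token", "secret", "api_key" or "password" when their entropy is high, and it decodes notebook JSON first so that the rule sees the cell source rather than the escaped text.

```python
import json
import math
import re
from collections import Counter

# Hypothetical generic rule: secret-like variable name assigned a long string literal.
ASSIGNMENT = re.compile(
    r"(?i)\b\w*(?:token|secret|api_?key|password)\w*\s*=\s*[\"']([A-Za-z0-9_\-+/=]{20,})[\"']"
)
MIN_ENTROPY = 3.5  # assumed threshold in bits per character; would need tuning


def entropy(text: str) -> float:
    """Shannon entropy of `text` in bits per character."""
    counts = Counter(text)
    return -sum((n / len(text)) * math.log2(n / len(text)) for n in counts.values())


def notebook_source(raw: str) -> str:
    """Join the source of all code cells in a notebook's JSON text."""
    cells = json.loads(raw).get("cells", [])
    return "\n".join(
        "".join(cell.get("source", "")) for cell in cells if cell.get("cell_type") == "code"
    )


def find_generic_secrets(filename: str, content: str) -> list:
    """Return high-entropy string literals assigned to secret-like names."""
    if filename.endswith(".ipynb"):
        content = notebook_source(content)
    return [m.group(1) for m in ASSIGNMENT.finditer(content)
            if entropy(m.group(1)) >= MIN_ENTROPY]


token = "fmq9UXkjXpSqD7UoGgqRhhntoI3sL0z9Cs6RlVnn"
lines = ['user = "[email protected]"\n', 'token = "%s"' % token]
py_file = "".join(lines)
notebook = json.dumps({"cells": [{"cell_type": "code", "source": lines}],
                       "nbformat": 4, "nbformat_minor": 5})

print(find_generic_secrets("test.py", py_file))
print(find_generic_secrets("test.ipynb", notebook))
print(ASSIGNMENT.search(notebook))  # raw notebook text: quotes are escaped as \"
```

With this sketch both files yield the token, while the same pattern applied to the raw notebook text finds nothing because the quotes are JSON-escaped there. Low-entropy values such as a password made of one repeated character are skipped, which is one way to limit the false positives mentioned above.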

asmaier avatar Jul 06 '22 15:07 asmaier

Closing because we are already tracking requests for generic key detection, and are not adding it at this time.

dustin-decker avatar Aug 15 '22 16:08 dustin-decker