A UUID isn't detected by the high-entropy string detection
We have a UUID in a file:
$ cat key.txt
732f51d4-033a-47be-baf4-c4d2a22f9368
And we scan that file:
$ detect-secrets scan key.txt
Which outputs:
{
  "exclude": {
    "files": null,
    "lines": null
  },
  "generated_at": "2020-03-16T12:07:31Z",
  "plugins_used": [
    {
      "name": "AWSKeyDetector"
    },
    {
      "name": "ArtifactoryDetector"
    },
    {
      "base64_limit": 4.5,
      "name": "Base64HighEntropyString"
    },
    {
      "name": "BasicAuthDetector"
    },
    {
      "hex_limit": 3,
      "name": "HexHighEntropyString"
    },
    {
      "name": "JwtTokenDetector"
    },
    {
      "keyword_exclude": null,
      "name": "KeywordDetector"
    },
    {
      "name": "MailchimpDetector"
    },
    {
      "name": "PrivateKeyDetector"
    },
    {
      "name": "SlackDetector"
    },
    {
      "name": "SoftlayerDetector"
    },
    {
      "name": "StripeDetector"
    }
  ],
  "results": {},
  "version": "0.13.0",
  "word_list": {
    "file": null,
    "hash": null
  }
}
Why isn't the UUID detected?
I expected the output to contain:
"results": {
  "key.txt": [
    {
      "hashed_secret": "12345678901234567890etc",
      "is_verified": false,
      "line_number": 1,
      "type": "High entropy string"
    }
  ]
},
This is a known problem: https://github.com/Yelp/detect-secrets/blob/master/detect_secrets/plugins/high_entropy_strings.py#L318
You can also check this with the --string option:
$ detect-secrets scan --string '732f51d4-033a-47be-baf4-c4d2a22f9368'
AWSKeyDetector         : False
ArtifactoryDetector    : False
Base64HighEntropyString: False (3.85)
BasicAuthDetector      : False
CloudantDetector       : False
HexHighEntropyString   : False
IbmCloudIamDetector    : False
IbmCosHmacDetector     : False
JwtTokenDetector       : False
KeywordDetector        : False
MailchimpDetector      : False
PrivateKeyDetector     : False
SlackDetector          : False
SoftlayerDetector      : False
StripeDetector         : False
TwilioKeyDetector      : False
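The two entropy plugins fail for different reasons. HexHighEntropyString returns False with no score, which means the string doesn't match the hex charset (the hyphens are not hexadecimal characters). Base64HighEntropyString returns False (3.85), which means the string matches the charset but its entropy of 3.85 is below the base64_limit of 4.5.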
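To make the two failure modes concrete, here is a rough sketch that recomputes the numbers by hand. It is not detect-secrets' own code: shannon_entropy and the variable names are made up for illustration, and it assumes the hyphen counts as part of the Base64 charset (the 3.85 score above is consistent with that).

```python
import math
import string
from collections import Counter


def shannon_entropy(data: str) -> float:
    """Shannon entropy of `data`, in bits per character."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


candidate = "732f51d4-033a-47be-baf4-c4d2a22f9368"

# Characters that fall outside the hexadecimal alphabet.
non_hex = set(candidate) - set(string.hexdigits)

# Entropy of the full string, hyphens included (compare with base64_limit 4.5).
full_entropy = shannon_entropy(candidate)

# Entropy of the hex digits alone (compare with hex_limit 3).
hex_only_entropy = shannon_entropy(candidate.replace("-", ""))

print(non_hex)                     # the hyphen is the only non-hex character
print(round(full_entropy, 2))      # 3.85
print(round(hex_only_entropy, 2))  # 3.77
```

Under these assumptions the full string scores 3.85, below the 4.5 limit, while the hex digits on their own score above the hex limit of 3. That suggests the hyphens, not the entropy, are what keeps the hex plugin from flagging this string.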
In the upcoming v1 release (at the time of writing, on the pre-v1-launch branch), we introduce the concept of "filters". For more information, check out the docs: https://github.com/Yelp/detect-secrets/blob/pre-v1-launch/docs/filters.md
One of these filters specifically excludes UUIDs: https://github.com/Yelp/detect-secrets/blob/pre-v1-launch/detect_secrets/filters/heuristic.py#L48. We also intend to offer an option to disable filters, which you may want to use if you are explicitly looking for UUIDs.
I will also address, at a later time, the fact that our HexHighEntropyString scanner currently ignores hyphens.
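As a rough idea of what a UUID-excluding filter could look like, here is a minimal sketch. The names UUID_PATTERN, looks_like_uuid, keep_candidates, and the exclude_uuids flag are all hypothetical and are not taken from the linked heuristic.py; see that file for the real implementation.

```python
import re

# Hypothetical pattern: five hyphen-separated groups of hex digits (8-4-4-4-12).
UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def looks_like_uuid(secret: str) -> bool:
    """Return True if the whole string has the canonical UUID shape."""
    return UUID_PATTERN.fullmatch(secret) is not None


def keep_candidates(candidates, exclude_uuids=True):
    """Drop UUID-shaped candidates unless the (hypothetical) filter is disabled."""
    if not exclude_uuids:
        return list(candidates)
    return [c for c in candidates if not looks_like_uuid(c)]


found = ["732f51d4-033a-47be-baf4-c4d2a22f9368", "c3VwZXIgbG9uZyBzZWNyZXQ="]
print(keep_candidates(found))                       # UUID is filtered out
print(keep_candidates(found, exclude_uuids=False))  # UUID is kept
```

With the filter switched off, the UUID stays in the candidate list, which is the behaviour you would want when UUIDs are exactly what you are looking for.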