eth-phishing-detect
Add domain parsing with a custom PSL list and remove false positives
This PR introduces a major improvement to the test-lists.ts file, which is critical for preventing false positives within CI/CD.
Rationale
Previously, subdomains could bypass the Tranco check because this test did not parse the domain. As a result, auth.magic.link !== magic.link, even though magic.link is on the Tranco list and the subdomain is owned by the same company. This has been the source of false positives with widespread impact being merged into this repo, and this PR will fix nearly all future cases of the problem (assuming the Tranco list stays up to date so that it contains future popular websites).
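The mismatch can be illustrated with a minimal, self-contained sketch. This is not the code in test-lists.ts: `registrableDomain`, `SUFFIXES`, and `TRANCO` are hypothetical stand-ins, and the suffix set is a tiny hard-coded sample rather than the real public suffix list.

```typescript
// Hypothetical sample data; the real test uses the full Tranco list and PSL.
const SUFFIXES = new Set<string>(["link", "com", "co.uk"]);
const TRANCO = new Set<string>(["magic.link"]);

// Return the longest known suffix plus one label to its left,
// or null if the hostname is a bare suffix or no suffix matches.
function registrableDomain(hostname: string): string | null {
  const labels = hostname.toLowerCase().split(".");
  // Starting at i = 0 tries the longest candidate suffix first.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (SUFFIXES.has(candidate)) {
      return i === 0 ? null : labels.slice(i - 1).join(".");
    }
  }
  return null;
}

const hostname = "auth.magic.link";

// Old behaviour: the raw hostname is compared, so the subdomain slips through.
const exactMatch = TRANCO.has(hostname);

// New behaviour: the hostname is reduced to its registrable domain first.
const parsed = registrableDomain(hostname);
const parsedMatch = parsed !== null && TRANCO.has(parsed);

console.log({ exactMatch, parsed, parsedMatch });
```

Under these assumptions, the exact comparison misses the subdomain while the parsed comparison finds magic.link on the list.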
As you can see from the websites I removed in config.json, including berkeley.edu, bitcoin.com, consensys.net, and others, this does a very good job of detecting false positives correctly.
Extending public suffix list
What makes this possible is the custom-tlds.ts file, which now extends the public suffix list (PSL) with suspected hosting providers that are NOT already on the PSL. This may be annoying to keep up to date, but it is necessary because the PSL is slow to update and there are lots of hosting providers that allow you to host malicious websites. Unfortunately, these hosting providers have a high enough Tranco score that they may cause CI/CD to fail.
To remedy this, you have two options:
- If only one website in config.json uses this domain, you can just add the hostname to the bypass list inside test-lists.ts.
- If several websites are using this as a hosting provider OR the root website is advertising DNS services, add it to the custom-tlds.ts file.
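The effect of the second option can be sketched as follows. This is an illustration under assumptions, not the contents of custom-tlds.ts: `hosting.example` is a made-up provider, and `BASE_SUFFIXES`, `CUSTOM_SUFFIXES`, and `domainWithSuffixes` are hypothetical names with sample data.

```typescript
// Hypothetical sample data standing in for the PSL and the custom additions.
const BASE_SUFFIXES: string[] = ["com", "example"];
const CUSTOM_SUFFIXES: string[] = ["hosting.example"];

// Longest-suffix match: return the matched suffix plus one label to its left,
// or null if the hostname is a bare suffix or no suffix matches.
function domainWithSuffixes(hostname: string, suffixes: Set<string>): string | null {
  const labels = hostname.toLowerCase().split(".");
  for (let i = 0; i < labels.length; i++) {
    if (suffixes.has(labels.slice(i).join("."))) {
      return i === 0 ? null : labels.slice(i - 1).join(".");
    }
  }
  return null;
}

const site = "phish.hosting.example";

// Without the custom entry, the site collapses to the provider's own domain,
// which may rank highly on Tranco and make the entry look like a false positive.
const withoutCustom = domainWithSuffixes(site, new Set(BASE_SUFFIXES));

// With the custom entry, the provider is treated as a suffix,
// so the customer's subdomain is evaluated as its own site.
const withCustom = domainWithSuffixes(
  site,
  new Set([...BASE_SUFFIXES, ...CUSTOM_SUFFIXES]),
);

console.log({ withoutCustom, withCustom });
```

Treating the provider as a suffix means each customer subdomain is checked against the Tranco list on its own, rather than inheriting the provider's rank.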
To debug the above, I recommend the following code snippet (or CTRL+F 😉 )
// test-lists.ts
import { parse } from 'tldts';
//........
t.equal(blocked.length, 0, `The following domains should not be blocked: ${blocked}`);
// Count blocked hostnames per registrable domain to spot shared hosting providers
const map = new Map();
for (let x = 0; x < blocked.length; x++) {
  const parsedDomain = parse(blocked[x], { allowPrivateDomains: true }).domain;
  if (map.has(parsedDomain)) {
    map.set(parsedDomain, map.get(parsedDomain) + 1);
  } else {
    map.set(parsedDomain, 1);
  }
}
// Print the domains from most to least frequent
const sortedMap = new Map([...map.entries()].sort((a, b) => b[1] - a[1]));
sortedMap.forEach((val, key) => {
  console.log(`${key} : ${val}`);
});
t.end();