eth-phishing-detect

Add domain parsing with a custom PSL list and remove false positives

Open mindofmar opened this issue 5 months ago • 2 comments

This PR introduces a major improvement to the test-lists.ts file, which is critical for preventing false positives within CI/CD.

Rationale

Previously, subdomains could bypass the Tranco check because this test did not parse the domain. As a result, auth.magic.link !== magic.link, even though magic.link is on the Tranco list and the subdomain is owned by the same company. This has been the source of high-impact false positives being merged into this repo, and this PR should fix nearly all future cases of the problem (assuming the Tranco list stays up to date so that it contains future popular websites).
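To illustrate the difference, here is a minimal, dependency-free sketch comparing an exact-hostname check with a parsed-domain check. The `registrableDomain` helper and the tiny `SUFFIXES` and `TRANCO` sets are hypothetical stand-ins (the actual test uses `parse` from tldts and the real Tranco list), so treat this as an illustration rather than the code in this PR.

```typescript
// Hypothetical stand-ins: the real test relies on the full PSL (via tldts)
// and the real Tranco list.
const SUFFIXES = new Set<string>(["link", "com", "edu", "co.uk"]);
const TRANCO = new Set<string>(["magic.link", "berkeley.edu"]);

// Return the longest known suffix plus one label in front of it,
// or null when no known suffix matches.
function registrableDomain(hostname: string, suffixes: Set<string>): string | null {
  const labels = hostname.toLowerCase().split(".");
  // Start at 1 so at least one label remains in front of the suffix;
  // the first hit is therefore the longest matching suffix.
  for (let i = 1; i < labels.length; i++) {
    if (suffixes.has(labels.slice(i).join("."))) {
      return labels.slice(i - 1).join(".");
    }
  }
  return null;
}

// Old behaviour: compare the raw hostname against the list.
function exactMatchOnTranco(hostname: string): boolean {
  return TRANCO.has(hostname);
}

// New behaviour: compare the parsed domain against the list.
function domainOnTranco(hostname: string): boolean {
  const domain = registrableDomain(hostname, SUFFIXES);
  return domain !== null && TRANCO.has(domain);
}

console.log(exactMatchOnTranco("auth.magic.link")); // false
console.log(domainOnTranco("auth.magic.link")); // true
```

With the exact match, a blocklisted auth.magic.link goes unnoticed; with the parsed domain, the test can flag it as a likely false positive.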

As you can see from the websites I removed from config.json, including berkeley.edu, bitcoin.com, consensys.net, and others, this does a very good job of detecting false positives.

Extending public suffix list

This is made possible by the custom-tlds.ts file, which extends the Public Suffix List (PSL) with suspected hosting providers that are NOT already on the PSL. This may be annoying to keep up to date, but it is necessary because the PSL is slow to update and many hosting providers let you host malicious websites. Unfortunately, these hosting providers rank high enough on the Tranco list that they may cause CI/CD to fail.
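As a rough sketch of why the extra suffixes matter, the example below parses the same hostname with and without a custom entry. Everything here is hypothetical: `examplehost.com` stands in for a hosting provider missing from the PSL, `domainFor` is a simplified stand-in for a real PSL parser, and how custom-tlds.ts is actually wired into parsing in this PR is not shown.

```typescript
// Hypothetical data: a few base suffixes plus one custom hosting-provider suffix.
const BASE_SUFFIXES: string[] = ["com", "net", "io"];
const CUSTOM_SUFFIXES: string[] = ["examplehost.com"];

// Simplified parser: longest known suffix plus one label, or null if none matches.
function domainFor(hostname: string, suffixes: string[]): string | null {
  const known = new Set<string>(suffixes);
  const labels = hostname.toLowerCase().split(".");
  for (let i = 1; i < labels.length; i++) {
    if (known.has(labels.slice(i).join("."))) {
      return labels.slice(i - 1).join(".");
    }
  }
  return null;
}

const host = "evil-site.examplehost.com";

// Without the custom suffix, the hostname collapses to the provider's domain.
const withoutCustom = domainFor(host, BASE_SUFFIXES);

// With the custom suffix, the customer subdomain is treated as its own domain.
const withCustom = domainFor(host, BASE_SUFFIXES.concat(CUSTOM_SUFFIXES));

console.log(withoutCustom); // examplehost.com
console.log(withCustom); // evil-site.examplehost.com
```

Without the custom entry, a malicious site collapses to the provider's own popular domain, which is what trips the Tranco check.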

To remedy this you have 2 options:

  1. If only 1 website in config.json uses this domain, you can just add the hostname to the bypass list inside test-lists.ts.
  2. If several websites use this domain as a hosting provider OR the root website advertises DNS services, add it to the custom-tlds.ts file.

To debug the above, I recommend the following code snippet (or CTRL+F 😉 )

// test-lists.ts

            import { parse } from 'tldts';
            // ........

            t.equal(blocked.length, 0, `The following domains should not be blocked: ${blocked}`);

            // Count how many blocked hostnames share each parsed domain
            // (private PSL suffixes are honoured via allowPrivateDomains).
            const map = new Map<string | null, number>();
            for (let x = 0; x < blocked.length; x++) {
                const parsedDomain = parse(blocked[x], { allowPrivateDomains: true }).domain;
                map.set(parsedDomain, (map.get(parsedDomain) ?? 0) + 1);
            }

            // Print the domains with the most blocked hostnames first.
            const sortedMap = new Map([...map.entries()].sort((a, b) => b[1] - a[1]));
            sortedMap.forEach((val, key) => {
                console.log(`${key} : ${val}`);
            });

            t.end();
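The per-domain counts printed by the snippet can be mapped onto the two options roughly as follows. This is only a sketch: `suggestRemedies` is a hypothetical helper, "several" is assumed to mean more than one, and whether the root website advertises DNS services still has to be checked by hand.

```typescript
type Remedy = "bypass list in test-lists.ts" | "custom-tlds.ts";

// Assumption: more than one hostname under a domain suggests a hosting provider.
function suggestRemedies(counts: Map<string, number>): Map<string, Remedy> {
  const suggestions = new Map<string, Remedy>();
  counts.forEach((count, domain) => {
    suggestions.set(domain, count > 1 ? "custom-tlds.ts" : "bypass list in test-lists.ts");
  });
  return suggestions;
}

// Hypothetical counts, shaped like the output of the debug snippet.
const exampleCounts = new Map<string, number>([
  ["examplehost.com", 4],
  ["single-site.example", 1],
]);

suggestRemedies(exampleCounts).forEach((remedy, domain) => {
  console.log(`${domain} -> ${remedy}`);
});
```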

mindofmar · Sep 24 '24 04:09