purple-hats
PDFs are being scanned when they shouldn't be.
I am not setting the file type option (-i, --fileTypes). I am running:
node --max-old-space-size=6000 --no-deprecation purple-a11y/cli.js -u https://www.whitehouse.gov -c 2 -s same-domain -p 50 -a none --blacklistedPatternsFilename ./pa-gTracker-exclude-medicare.csv -k "Random Example:[email protected]"
But I am still finding PDFs in the list of crawled URLs. This shouldn't be the case. If the default is HTML only, then I shouldn't see any PDFs (or other documents) in my results.
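For anyone who wants to confirm the same thing on their own results, here is a minimal sketch of how the crawled URLs could be checked for PDFs. It assumes the URLs have already been pulled out of the scan results into a plain list. The function name `find_pdf_urls` and the sample URLs are hypothetical and not part of Purple A11y.

```python
from urllib.parse import urlparse


def find_pdf_urls(urls):
    """Return the URLs whose path ends in .pdf (case-insensitive)."""
    pdf_urls = []
    for url in urls:
        # Look only at the path, so query strings and fragments are ignored.
        path = urlparse(url).path
        if path.lower().endswith(".pdf"):
            pdf_urls.append(url)
    return pdf_urls


if __name__ == "__main__":
    # Hypothetical sample; replace with the URLs from your own scan results.
    crawled = [
        "https://example.com/",
        "https://example.com/report.PDF?download=1",
        "https://example.com/about",
    ]
    for url in find_pdf_urls(crawled):
        print(url)
```

This only matches on the file extension in the URL path, so a PDF served from a URL without a .pdf suffix would not be caught.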
Hi @mgifford, can I check which version of Purple A11y you are using to run the scan? E.g. 0.9.46, or newer (i.e. directly from GitHub master)?
If you are running a version from master, can you get the commit ID so I can check whether this issue was already fixed? You can use the following command:
git log -1 --format="%H"
I have not been able to replicate the issue of PDFs being scanned when the default strategy is html-only on the latest master commit.