nightmare-load-filter
Urls not listed in filter still pass through?
First of all thanks for making all these great things happen, big kudos!
Just did a quick run, and it seems URLs not listed in the filter still get passed to the fn. See below: edgesuite.net is not listed in the filter, so I would assume it shouldn't get passed into fn at all. Am I doing something wrong here?
this.browser
.filter({
urls: ['https://*.github.com/*', '*://electron.github.io']
}, function(details, cb){
// a request to http://img.edgesuite.net/foo.png got passed in and blocked, which shouldn't happen
return cb({cancel: (details.url.indexOf('edgesuite.net') !== -1 )});
})
.goto( url )
@coodoo What version of Nightmare and nightmare-load-filter are you using, out of curiosity?
I tried your example, and if you add logging in the filter callback, it doesn't look like it gets called. How are you asserting that the image is getting blocked? (The URL provided returns a 502.)
@rosshinkley I'll provide a detailed report soon. Quick question: how do I log in the filter callback? I tried the standard console.log('foo') to no avail.
Here's a short code sample to reproduce the issue. edgesuite.net is not listed in the rules, yet all images from that website were blocked. I would expect that not to happen?
const rules = [
'google.com',
// 'edgesuite.net'
]
this.browser
.filter( { urls: rules }, ( details, cb ) => cb({ cancel: details.url.indexOf('edgesuite.net') != -1 }) )
.goto( 'http://www.appledailytw.com/realtimenews/article/nextmag/20160531/874328/' )
Using:
"nightmare": "^2.5.0",
"nightmare-load-filter": "0.2.0",
I tried the standard console.log('foo') to no avail.
Output will be a part of the Electron stdout. Run your script with DEBUG and you'll have better luck.
...yet all images from that website were blocked, I would expect that should not happen?
That's odd. Maybe this is a quirk of later versions of Electron or Chromium. I would expect whole matches (e.g., http://www.google.com) to match only that address, but it looks like that filter is completely ignored. In fact, I'd expect it to behave the way WebRequest match patterns work. I'll dig into this as time permits.
It looks like it works as expected if you are willing to use wildcards. Your example, slightly modified:
const rules = [
'http://google.com/*',
// 'edgesuite.net'
]
this.browser
.filter( { urls: rules }, ( details, cb ) => cb({ cancel: details.url.indexOf('edgesuite.net') != -1 }) )
.goto( 'http://www.appledailytw.com/realtimenews/article/nextmag/20160531/874328/' )
Very interesting findings! After playing with it a bit more, I found the URL must contain :// and a / after the domain, so something like *://google.com/* works; any other form won't.
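To make that rule concrete, here is a rough sketch of a check for the "scheme, then ://, then host, then /path" shape. It only approximates the observation in this thread and is not the matching logic Electron or Chromium actually use. The name isQualifiedPattern and the regex are hypothetical, and only *, http and https schemes are considered.

```javascript
// Hypothetical helper, not part of nightmare-load-filter or Electron.
// Assumed shape, based on the observation above:
//   <scheme or *> "://" <host, optionally "*" or "*.host"> "/" <anything>
const QUALIFIED_PATTERN = /^(\*|https?):\/\/(\*|(\*\.)?[^\/*]+)\/.*$/;

function isQualifiedPattern(pattern) {
  return QUALIFIED_PATTERN.test(pattern);
}

// Patterns taken from the examples in this thread.
const candidates = [
  'google.com',
  'http://google.com/*',
  '*://google.com/*',
  'https://*.github.com/*',
  '*://electron.github.io',
];

for (const pattern of candidates) {
  console.log(pattern, isQualifiedPattern(pattern) ? 'looks qualified' : 'missing scheme or path');
}
```

By this check, '*://electron.github.io' from the first example also falls short, since nothing follows the host.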
That doesn't surprise me as much: google.com is ambiguous and should be "fully qualified" (even if the full qualification is done with wildcards, you have to be explicit about what you expect). I can kind of understand why that wouldn't work.
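If you would rather keep short entries in your rules list, one option is to expand them before passing them to .filter(). The sketch below is only an illustration: qualifyPattern is a hypothetical helper, and filling in the missing parts with wildcards is an assumption about what the author of the rule meant.

```javascript
// Hypothetical helper, not part of nightmare-load-filter.
// Adds a wildcard scheme and/or a wildcard path when they are missing.
function qualifyPattern(entry) {
  // No scheme given: assume any scheme is acceptable.
  const withScheme = entry.includes('://') ? entry : `*://${entry}`;
  // Everything after "://" is host plus optional path.
  const afterScheme = withScheme.slice(withScheme.indexOf('://') + 3);
  // No path given: assume every path on that host is wanted.
  return afterScheme.includes('/') ? withScheme : `${withScheme}/*`;
}

const rules = ['google.com', '*://electron.github.io', 'http://google.com/*'];
console.log(rules.map(qualifyPattern));
```

Note that the expanded pattern names the host exactly, so a subdomain would presumably still need its own entry or a leading *. wildcard.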