node-horseman

`Unhandled reject Error: Failed to load url`

Open winghouchan opened this issue 8 years ago • 62 comments

@minhchu has reported a bug in which an Unhandled rejection Error: Failed to load url is thrown. This is similar to #180; however, this case is not caused by SlimerJS, which is what #180 specifically covers. This new issue will track the case @minhchu is facing.

The test case which can reproduce the issue is below:

var Horseman = require('node-horseman');
var horseman = new Horseman();

horseman
  .userAgent('Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0')
  .open('https://www.google.com/search?site=&tbm=isch&source=hp&biw=1366&bih=640&q=moon')
  .screenshot('example.jpg')
  .close();

The output is below:

  horseman using PhantomJS from $PATH +0ms
  horseman .setup() creating phantom instance on 12406 +4ms
  horseman phantom created +673ms
  horseman phantom version 2.1.1 +17ms
  horseman page created +11ms
  horseman phantomjs onLoadFinished triggered +13ms success NaN
  horseman injected jQuery +46ms
  horseman .userAgent() set +20ms Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0
  horseman .open() +1ms https://www.google.com/search?site=&tbm=isch&source=hp&biw=1366&bih=640&q=moon
  horseman phantomjs onLoadFinished triggered +899ms success 1
  horseman phantomjs onLoadFinished triggered +103ms fail 2
  horseman injected jQuery +1ms
  horseman .close(). +4ms
Unhandled rejection Error: Failed to load url
    at checkStatus (/Users/user/Projects/node_modules/node-horseman/lib/index.js:276:16)
    at tryCatcher (/Users/user/Projects/node_modules/bluebird/js/release/util.js:16:23)
    at Function.Promise.attempt.Promise.try (/Users/user/Projects/node_modules/bluebird/js/release/method.js:39:29)
    at Object.loadFinishedSetup [as onLoadFinished] (/Users/user/Projects/node_modules/node-horseman/lib/index.js:274:43)
    at /Users/user/Projects/node_modules/node-phantom-simple/node-phantom-simple.js:636:30
    at Array.forEach (native)
    at IncomingMessage.<anonymous> (/Users/user/Projects/node_modules/node-phantom-simple/node-phantom-simple.js:617:17)
    at emitNone (events.js:72:20)
    at IncomingMessage.emit (events.js:166:7)
    at endReadableNT (_stream_readable.js:913:12)
    at nextTickCallbackWith2Args (node.js:442:9)
    at process._tickCallback (node.js:356:17)

winghouchan avatar May 27 '16 10:05 winghouchan

Duplicates #187 😵 which @minhchu was able to submit 40 seconds before this one.

winghouchan avatar May 27 '16 10:05 winghouchan

It's interesting that the onLoadFinished event is triggered twice:

  horseman phantomjs onLoadFinished triggered +899ms success 1
  horseman phantomjs onLoadFinished triggered +103ms fail 2

One with success and the other with fail.

winghouchan avatar May 27 '16 10:05 winghouchan

Test case above submitted by @minhchu does not reliably reproduce the issue. 😕

winghouchan avatar May 27 '16 10:05 winghouchan

Another case where the onLoadFinished event was triggered more than once:

  horseman using PhantomJS from $PATH +0ms
  horseman .setup() creating phantom instance on 12406 +5ms
  horseman phantom created +492ms
  horseman phantom version 2.1.1 +19ms
  horseman page created +11ms
  horseman phantomjs onLoadFinished triggered +14ms success NaN
  horseman injected jQuery +28ms
  horseman .userAgent() set +19ms Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0
  horseman .open() +1ms https://www.google.com/search?site=&tbm=isch&source=hp&biw=1366&bih=640&q=moon
  horseman phantomjs onLoadFinished triggered +904ms success 1
  horseman phantomjs onLoadFinished triggered +65ms fail 2
  horseman phantomjs onLoadFinished triggered +3ms success 3
  horseman injected jQuery +0ms
  horseman .close(). +5ms
  horseman jQuery not injected - already exists on page +10ms
Unhandled rejection Error: Failed to load url
    at checkStatus (/Users/user/Projects/node_modules/node-horseman/lib/index.js:276:16)
    at Object.loadFinishedSetup [as onLoadFinished] (/Users/user/Projects/node_modules/node-horseman/lib/index.js:274:43)
    at /Users/user/Projects/node_modules/node-phantom-simple/node-phantom-simple.js:636:30
    at Array.forEach (native)
    at IncomingMessage.<anonymous> (/Users/user/Projects/node_modules/node-phantom-simple/node-phantom-simple.js:617:17)
    at emitNone (events.js:72:20)
    at IncomingMessage.emit (events.js:166:7)
    at endReadableNT (_stream_readable.js:913:12)
    at nextTickCallbackWith2Args (node.js:442:9)
    at process._tickCallback (node.js:356:17)
From previous event:
    at Object.page.loadedPromise (/Users/user/Projects/node_modules/node-horseman/lib/index.js:233:27)
    at Horseman.<anonymous> (/Users/user/Projects/node_modules/node-horseman/lib/actions.js:70:26)
From previous event:
    at Horseman.exports.open (/Users/user/Projects/node_modules/node-horseman/lib/actions.js:60:20)
    at Horseman.(anonymous function) [as open] (/Users/user/Projects/node_modules/node-horseman/lib/index.js:398:17)
    at Horseman.<anonymous> (/Users/user/Projects/node_modules/node-horseman/lib/index.js:406:22)
    at processImmediate [as _immediateCallback] (timers.js:383:17)
From previous event:
    at Promise.HorsemanPromise.(anonymous function) [as open] (/Users/user/Projects/node_modules/node-horseman/lib/index.js:404:15)
    at Object.<anonymous> (/Users/user/Projects/foo.js:6:4)
    at Module._compile (module.js:409:26)
    at Object.Module._extensions..js (module.js:416:10)
    at Module.load (module.js:343:32)
    at Function.Module._load (module.js:300:12)
    at Function.Module.runMain (module.js:441:10)
    at startup (node.js:139:18)
    at node.js:968:3

winghouchan avatar May 27 '16 10:05 winghouchan

wow, haha :grin:

minhchu avatar May 27 '16 10:05 minhchu

This is an odd one. When I run the script it works fine and onLoadFinished is only triggered once after the open. What node version and operating system are you using?

awlayton avatar May 30 '16 17:05 awlayton

My Node.js version is v4.4.4 and my OS is OS X El Capitan v10.11.4.

winghouchan avatar May 30 '16 19:05 winghouchan

node -v 5.3.0, npm -v 3.3.12, Windows 7 64 bit.

minhchu avatar May 30 '16 20:05 minhchu

For what it's worth: I'm having this issue as well but it is very indeterminate for me. I've been unable to reproduce it reliably. My code runs on a loop and I see this error pop up around 25% of the time. I've used this to help reproduce it:

function main() {
  // ...horseman do stuff...
  horseman.finally(() => {
    console.log(`Waiting ${process.env.DELAY} minute(s)...`);
    setTimeout(main, process.env.DELAY*60*1000);
  });
}

Hope this helps, and for the record:

$ node --version
v6.2.2
$ npm --version
3.9.5
$ npm list node-horseman
...
└── [email protected] 

ianalexander avatar Jul 02 '16 18:07 ianalexander

I also have this issue. For me it happens more times than it works. After a click() and waitForNextPage(), it throws the failed url error.

node: v5.6.0
horseman: [email protected]
OS: Ubuntu 14.04LTS

bmills22 avatar Jul 14 '16 01:07 bmills22

I'm having this issue too: .value --> .click --> .waitForNextPage() --> "Failed to load url". Sometimes it happens after the click before it can even reach the wait. I can't identify any pattern in when it works and when it doesn't.

node: v4.2.3
npm: v3.10.3
horseman: v3.1.1
OS: Windows 10 64 bit

abagh0703 avatar Jul 19 '16 19:07 abagh0703

Same issue here. Works intermittently, but mostly it doesn't. Makes horseman unusable for /me. 😢

sayanriju avatar Jul 20 '16 18:07 sayanriju

Why can't you just catch the rejection @sayanriju? I have still been unable to reproduce this.
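
A minimal sketch of what catching the rejection could look like. It uses a stand-in promise (`fakeOpen`, a hypothetical helper, not part of node-horseman) in place of a real horseman chain, on the assumption that the chain behaves like any other promise and rejects with the error shown in the logs above:

```javascript
// Stand-in for a horseman chain: a promise that rejects with the same
// message as in the logs. Hypothetical helper for illustration only.
function fakeOpen(url) {
  return Promise.reject(new Error('Failed to load url'));
}

// Attach a rejection handler at the end of the chain so the failure is
// handled instead of surfacing as "Unhandled rejection".
function openAndReport(open, url) {
  return open(url)
    .then(function () {
      return { ok: true, url: url };
    })
    .catch(function (err) {
      return { ok: false, url: url, message: err.message };
    });
}

openAndReport(fakeOpen, 'https://example.com/').then(function (result) {
  console.log(result);
});
```

With a real chain, the equivalent would be a `.catch()` as the last step before closing the instance. This only handles the error; it does not explain why the load fails.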

awlayton avatar Jul 21 '16 16:07 awlayton

I have this problem too. It happens randomly. I'm trying to log several users in to a page using these statements:

var horseman = new Horseman(Global.horsemanOptions);
usersList.forEach(function(user){        
    horseman
               // Open the page
               .open(WEB_PAGE)
               .value('#loginform > input[name=username]', user.username)
               .value('#loginform > input[name=password]', user.password)
               .click('#loginform > button[type=submit]')
               .waitForNextPage()
               .screenshot(user.username + '.png')
               .then(function(){
                   console.log(user.username + ' LOGGED IN')
               })
               .....
               .....

});

Error seems to be raised for the sequence:

.click('#loginform > button[type=submit]')
.waitForNextPage()

because when I comment out those statements, this error is not raised (although others are).

The strange thing is that the screenshots are taken correctly and show a successful login, but the statements that follow are not executed correctly.

I'm running it in:

Arch Linux
nodejs: v6.3.1
node-horseman: 3.1.1

Does anybody know a solution for that?

grimaldello avatar Aug 03 '16 23:08 grimaldello

@OrtoNormale I don't think this is related to the bug you're seeing, but - you're trying to use horseman, which is asynchronous, in a synchronous loop. Each .action() takes time, but by the time the first page is open, your loop has probably completed.

johntitus avatar Aug 03 '16 23:08 johntitus

There was an error in my code snippet. In my real code a new horseman object is initialized at every iteration: the new Horseman() statement should be inside the loop, so every user has its own instance. But isn't .open() waiting for the page to load before going on?

grimaldello avatar Aug 03 '16 23:08 grimaldello

Yes, but by the time the first user's page is open, the loop has likely completed, and the user variable will be the last user in the list. Horseman doesn't block the loop. You could move the horseman stuff inside a function, and then just send the user to the function. That should fix your scope issue.

function doStuff(user){
  var horseman = new Horseman(Global.horsemanOptions);
  horseman
               // Open the page
               .open(WEB_PAGE)
               .value('#loginform > input[name=username]', user.username)
               .value('#loginform > input[name=password]', user.password)
               .click('#loginform > button[type=submit]')
               .waitForNextPage()
               .screenshot(user.username + '.png')
               .then(function(){
                   console.log(user.username + ' LOGGED IN')
               })
}
usersList.forEach(doStuff);

johntitus avatar Aug 03 '16 23:08 johntitus

Thanks for the answer. I'll try it as soon as possible and report back.

grimaldello avatar Aug 04 '16 07:08 grimaldello

Unfortunately the problem persists. Here is the code I tried:

var performLoginTask = function(usersList){

    var task = function(user){
        var horseman = new Horseman(Global.horsemanOptions);
        horseman
            .on('consoleMessage', function( msg ){
                console.log(msg);
            })
            // Open page
            .open(WEB_PAGE)
            .value('#loginform > input[name=username]', user.username)
            .value('#loginform > input[name=password]', user.password)
            .click('#loginform > button[type=submit]')
            .waitForNextPage()
            .....
            .....
            .....
    };

    usersList.forEach(task);


};

performLoginTask(usersList);

Anyway, it seems very similar to my version.

Error is the following:

Unhandled rejection Error: Failed to load url

and it's thrown randomly (sometimes it happens, sometimes it doesn't), and not for the same user every time.

It's a really powerful tool, and it would be a shame not to use it because of this problem.

grimaldello avatar Aug 04 '16 17:08 grimaldello

Can you add a .catch() at the bottom of your horseman chain?

johntitus avatar Aug 04 '16 17:08 johntitus

I added a catch, but the errors are still thrown:

 var performLoginTask = function(usersList){

     var task = function(user){
         var horseman = new Horseman(Global.horsemanOptions);
         horseman
             .on('consoleMessage', function( msg ){
                 console.log(msg);
             })
             // Open page
             .open(WEB_PAGE)
             .value('#loginform > input[name=username]', user.username)
             .value('#loginform > input[name=password]', user.password)
             .click('#loginform > button[type=submit]')
             .waitForNextPage()
             .....
             .....
             .....
             .catch(function(error){console.log(error)})
     };

     usersList.forEach(task);


 };

 performLoginTask(usersList);

and console.log() in the catch block prints the same text as the thrown error.

grimaldello avatar Aug 04 '16 17:08 grimaldello

Could the problem be related to a redirect (HTTP code 302) that is done after login?

grimaldello avatar Aug 04 '16 18:08 grimaldello

If it's being caught in the catch, can you just ignore it? Does it get thrown for every user, or just for one or two?

johntitus avatar Aug 04 '16 18:08 johntitus

The error is thrown only for one or two and randomly, not always the same users.

Ok, I'll try to handle that error in catch().

grimaldello avatar Aug 04 '16 18:08 grimaldello

How many users are you trying to do at the same time? I wonder if it's blowing up Phantom because of memory issues...
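
One way to rule out too many simultaneous Phantom instances would be to process the users one at a time. The sketch below is a generic promise pattern, not a node-horseman feature; `runSequentially` and `fakeLogin` are hypothetical names, and it assumes the per-user task returns its promise chain (the tasks posted above do not return theirs, so they would need a `return`):

```javascript
// Run one task per user, one after another, instead of starting them all at once.
// `task` is any function that takes a user and returns a promise.
function runSequentially(users, task) {
  return users.reduce(function (previous, user) {
    return previous.then(function (results) {
      return task(user).then(function (result) {
        return results.concat([result]);
      });
    });
  }, Promise.resolve([]));
}

// Stand-in task for illustration only; a real task would drive a browser.
function fakeLogin(user) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve(user.username + ' LOGGED IN');
    }, 10);
  });
}

runSequentially([{ username: 'alice' }, { username: 'bob' }], fakeLogin)
  .then(function (messages) {
    console.log(messages);
  });
```

If the error still shows up with only one instance alive at a time, memory pressure from parallel instances is probably not the cause.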

johntitus avatar Aug 04 '16 18:08 johntitus

At the moment 10 users at most, maybe more in the future. However, it sometimes happens even with only 1 user. For now I'm handling it in the catch() block by calling the task again. Not the best solution, but it works.
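
A rough sketch of that retry workaround, written as a generic helper. `retry` and `makeFlakyTask` are hypothetical names, and the flaky task only simulates the intermittent failure; with horseman, the task would presumably need to create a fresh instance on each attempt and close the failed one:

```javascript
// Retry a promise-returning task until it succeeds or attempts run out.
function retry(task, maxAttempts) {
  // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
  return Promise.resolve().then(task).catch(function (err) {
    if (maxAttempts <= 1) {
      throw err; // out of attempts: surface the last error
    }
    return retry(task, maxAttempts - 1);
  });
}

// Simulated intermittent failure: rejects a fixed number of times, then resolves.
function makeFlakyTask(failuresBeforeSuccess) {
  var calls = 0;
  return function () {
    calls += 1;
    if (calls <= failuresBeforeSuccess) {
      return Promise.reject(new Error('Failed to load url'));
    }
    return Promise.resolve('succeeded on attempt ' + calls);
  };
}

retry(makeFlakyTask(2), 3).then(function (message) {
  console.log(message); // prints "succeeded on attempt 3"
});
```

Capping the number of attempts keeps a URL that genuinely cannot be loaded from retrying forever.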

grimaldello avatar Aug 04 '16 18:08 grimaldello

Echoing what's been said in the earlier comments.

Horseman appears to hang after this happens. For me, it's reliably breaking my application.

Note that I receive three onLoadFinished events after my .open().

My output, from the .catch(err => console.log(err.stack || err)) I added, along with Horseman and Bluebird debug flags:

  horseman .keyboardEvent() +3ms keypress w null
  horseman .keyboardEvent() +4ms keypress X null
  horseman .keyboardEvent() +4ms keypress 8 null
  horseman .click() +0ms #sign-in-button
  horseman .click() done +22ms
  horseman .open() +1ms [redacted URL]
  horseman phantomjs onLoadFinished triggered +4ms fail 2
  horseman phantomjs onLoadFinished triggered +2ms fail 3
Error: Failed to GET url: [redacted URL]
    at checkStatus ([cwd]/node_modules/node-horseman/lib/actions.js:78:16)
    at [cwd]/node_modules/node-phantom-simple/node-phantom-simple.js:60:18
    at IncomingMessage.<anonymous> ([cwd]/node_modules/node-phantom-simple/node-phantom-simple.js:645:9)
    at emitNone (events.js:72:20)
    at IncomingMessage.emit (events.js:166:7)
    at endReadableNT (_stream_readable.js:903:12)
    at doNTCallback2 (node.js:439:9)
    at process._tickCallback (node.js:353:17)
From previous event:
    at Horseman.<anonymous> ([cwd]/node_modules/node-horseman/lib/actions.js:76:5)
From previous event:
    at Horseman.exports.open ([cwd]/node_modules/node-horseman/lib/actions.js:60:20)
    at Horseman.(anonymous function) [as open] ([cwd]/node_modules/node-horseman/lib/index.js:402:17)
    at Horseman.<anonymous> ([cwd]/node_modules/node-horseman/lib/index.js:410:22)
From previous event:
    at Promise.HorsemanPromise.(anonymous function) [as open] ([cwd]/node_modules/node-horseman/lib/index.js:408:15)
    at mainRoutine (app.js:30:6)
From previous event:
    at app.js:24:6
    at processImmediate [as _immediateCallback] (timers.js:368:17)
  horseman phantomjs onLoadFinished triggered +8s success 4

shockey avatar Aug 05 '16 03:08 shockey

Worth noting: I, like @OrtoNormale, am .open()ing a page that may be responding with a 302 Found.

This is just a guess... but Phantom following the Location header could result in a state inconsistency between Horseman and the Phantom instance, if Horseman doesn't take note of the page change.
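
To check the redirect theory, one could log every redirect response the page receives. The helper below is only a sketch: `describeRedirect` is a hypothetical name, and it assumes the response object has the `status`, `url` and `redirectURL` fields that PhantomJS passes to `onResourceReceived`:

```javascript
// HTTP status codes that tell the client to follow a Location header.
var REDIRECT_STATUSES = [301, 302, 303, 307, 308];

// Return a one-line description of a redirect response, or null if the
// response is not a redirect.
function describeRedirect(response) {
  if (REDIRECT_STATUSES.indexOf(response.status) === -1) {
    return null;
  }
  var target = response.redirectURL || '(no redirectURL)';
  return response.status + ' ' + response.url + ' -> ' + target;
}

// Example with a hand-written response object.
console.log(describeRedirect({
  status: 302,
  url: 'http://example.com/login',
  redirectURL: 'http://example.com/home'
}));
```

Assuming horseman forwards the `resourceReceived` page event through `.on()`, the helper could be called from that handler and its output compared against the timing of the failing `onLoadFinished` events.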

shockey avatar Aug 05 '16 03:08 shockey

As @shockey says, in my case horseman also jumps directly into catch(){...} after that error is thrown, without executing the rest of the code.

grimaldello avatar Aug 05 '16 08:08 grimaldello

Can anyone provide a full test script to reproduce this? There are a ton of open PhantomJS issues related to 302 redirects, but it's hard to pin down without a test script.

johntitus avatar Aug 05 '16 10:08 johntitus