capybara-envjs
remote scripts take an extremely long time (linux?)
I am using RSpec. With the same Gemfile everything works on Mac. On Linux it hangs until I Control-C, and then it slowly gives error messages until I kill -9.
I have tried 0.4.0 and the GitHub version.
I haven't seen this.
Are you able to run the capybara-envjs specs? (Clone the repo, bundle install, rake spec). Those are (obviously) rspec-based and run okay.
It's possible that it's hanging in a lost JS event that env-js is mishandling ....
The tests work; I just get 3 failures. So I guess it hangs on Linux with a particular setup. Is there a way I can get debug output, or some other way of trying to figure this out? I guess I will try executing a different test under envjs for starters.
Those three failures are known; Jonas is going to tweak Capybara for them.
I'd debug it by digging into the code and trying to figure out where it's stalling. It's not elegant, but I'd put some p/puts into the driver to see where it was stalling. On the env-js side, you can turn on debugging, though again, the easiest way is a hack (hard code the log level in env.js).
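(A minimal sketch of that kind of instrumentation, assuming the driver exposes the usual Capybara `visit(path)` method; the driver constant and the output format here are guesses, not part of the capybara-envjs API.)

```ruby
# Hypothetical timing hook: wrap the driver's visit call to see where time goes.
# Capybara::Driver::Envjs is an assumed constant; adjust to whatever your install defines.
module VisitTiming
  def visit(path)
    started = Time.now
    puts "envjs: visiting #{path} ..."
    super
  ensure
    puts "envjs: visit of #{path} took #{Time.now - started}s"
  end
end

Capybara::Driver::Envjs.prepend(VisitTiming) if defined?(Capybara::Driver::Envjs)
```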
I actually successfully ran a test now; I think I just needed to limit it and be more patient. It took ~300 seconds. Without envjs it runs in less than a second. As I said, someone else with the same Gemfile is running this on a Mac; he said using envjs created a 10-second delay.
Another curiosity: when within does not match I get a segfault.
The interesting thing is that removing scripts and css from my layout doesn't dramatically change the run-time of the test. If I remove everything from my layout the test fails after 55 seconds.
The envjs specs finish in 87 seconds, which seems reasonable since they run over 300 tests.
I reduced my test to just one action, visiting the login page, and it takes about a minute, independent of what is in the application layout:
2.84s user 0.59s system 5% cpu 1:01.90 total
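(For reference, a sketch of the kind of reduced spec described above, assuming a Rails app with a `/login` route and a spec_helper that sets up Capybara; none of the names come from the actual application.)

```ruby
# spec/requests/login_page_spec.rb (hypothetical): a single page visit under envjs.
require 'spec_helper'

describe "login page" do
  before { Capybara.current_driver = :envjs }  # capybara-envjs registers the :envjs driver
  after  { Capybara.use_default_driver }

  it "loads" do
    visit "/login"   # this one call accounts for the ~60 seconds reported above
  end
end
```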
There's a certain amount of startup time, to parse all the JS that implements env.js. This should be a mostly one-time thing, since the parsed script is cached.
The HTML5 parser is written in JS, which is slower than a C++ parser. That plus having to reload env.js into every page slows things down, but I haven't seen cases as extreme as you're noting.
If you can put an example somewhere, I'll try to look at it.
I just realized that there is a separate layout for the login method, so some of my previous statements were incorrect. env-js is just really, really slow to interpret the JavaScript. If I remove the JavaScript it runs quickly.
Actually, it tends not to be the interpretation of the JS that's slow, it's the setup of the DOM classes and parsing of the HTML, which still makes the overall process slower than a C++ parser/DOM.
I haven't seen things quite as slow as you've mentioned, though.
This problem seems entirely due to loading scripts remotely. It is as if the scripts aren't being downloaded in parallel. I am switching to local scripts and the run-time seems correct now: 6 seconds instead of 300+ seconds.
You could close this ticket now if you put a warning in the install documentation about remote scripts. I would be a lot happier though if there were some kind of explanation for the slowness with remote scripts (on linux or certain setups).
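(One hedged way to make that swap in a Rails app: a helper that points at a vendored copy of the library during tests. The helper name, file name, and CDN URL below are illustrative.)

```ruby
# app/helpers/script_source_helper.rb (hypothetical): serve a local copy in the
# test environment so envjs never has to fetch scripts over the network.
module ScriptSourceHelper
  CDN_JQUERY = "https://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"

  def jquery_source
    Rails.env.test? ? "jquery.min.js" : CDN_JQUERY  # local file under public/javascripts
  end
end
```

The layout would then call `javascript_include_tag jquery_source` instead of hard-coding the CDN URL.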
That's really helpful, thanks.
I have no idea what would cause a factor 60 on remote access. That bears some looking into.
I don't think the scripts are being downloaded in parallel. That's just extra complexity that there's been no call for so far. (And making that work with spidermonkey is a scary concept; you really have to avoid threads or the VMs fight).
But I can't see how that really accounts for the slowdown. Some of it, yes, but not that much.
I don't think the non-parallel download was causing all of the problem, but it was certainly accounting for a decent chunk of it. There were 9 scripts to download. With a high latency connection this really adds up.
I reproduced this with just jquery & jquery-ui from the google CDN- it took 17 seconds to run a single page visit to a simple page.
Hmmm ... that's a lot of latency ...
You could cache the results like a real web browser would. Re-fetching every time may also defeat the JS compiler cache, which would eat up more cycles.
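(A minimal sketch of that caching idea, assuming scripts are fetched with something like Net::HTTP; this is not how capybara-envjs currently behaves.)

```ruby
# Hypothetical per-process cache for remote scripts: download each URL once and
# reuse the body across page loads instead of re-fetching it for every page.
require 'net/http'
require 'uri'

module RemoteScriptCache
  CACHE = {}

  def self.fetch(url)
    CACHE[url] ||= Net::HTTP.get(URI.parse(url))
  end
end
```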
I wasn't saying that the 17 seconds is entirely due to latency effects; it seems like there is something else going on here. But at a minimum the next script must be waiting for the previous script to download and be evaluated.
It seems like two passes would be appropriate: the first just starts downloading all the scripts, and the second evaluates the JavaScript.
Yeah, not sure.
Doing async download is tricky. It's tricky because using threads with Johnson is somewhere between hard and impossible. And it's tricky because figuring out exactly which scripts you load sync vs. async is hideously subtle (per the HTML5 draft).
So unless it's a priority for someone, it's probably not going to happen soon.
But like you said, unless you're on a 300 baud modem, something I don't understand is going on here.
Sync vs. async, I would think, would be a matter of how the JS is evaluated. What I am suggesting is a download phase where nothing is evaluated and the scripts are just downloaded. After that the page can be evaluated as normal, but when it comes to a remote script it can be retrieved locally instead of making the actual remote request. But all this is useless without threads to make some of the downloads happen in parallel.
Unfortunately, that's not the way HTML works. It loads the scripts as the tags are seen. In particular, this happens on the fly during the parse. The parse is actually done by an HTML5 parser written in JS. So we don't even know what scripts to load until the previous ones are done.
I suppose it'd be possible to look at the HTML ahead of time and look for scripts that look like they'll be needed ... but that feels like a kludge.
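(To make that concrete, a hedged sketch of speculative prefetching: regex-scan the raw HTML for external script src attributes and warm a cache in parallel threads before the real parse runs. Nothing here reflects the actual capybara-envjs code, and the threading caveats discussed above still apply.)

```ruby
# Hypothetical speculative prefetch: guess which remote scripts the page will need
# and download them concurrently so the later synchronous loads hit a warm cache.
require 'net/http'
require 'uri'

def prefetch_scripts(html, cache = {})
  urls = html.scan(/<script[^>]*\bsrc=["']([^"']+)["']/i).flatten
  urls.select { |u| u =~ %r{\Ahttps?://} }.map do |url|
    Thread.new do
      begin
        cache[url] ||= Net::HTTP.get(URI.parse(url))
      rescue StandardError
        nil  # a failed prefetch just means the real load falls back to the network
      end
    end
  end.each(&:join)
  cache
end
```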
17 seconds for two files that are instantaneous from the local FS still makes me think I'm doing something egregiously stupid somewhere ...
I do have some understanding of how HTML works, or to be precise, how browsers work on HTML :). This library works differently than browsers because it doesn't fetch scripts in parallel. I don't know how everything is implemented, but if browsers have accomplished this there must be a way. But certainly it could be very hard to accomplish, and it may end up distracting from the real issue.
Yeah, sorry. You can certainly load scripts async. There are cases where you can't, in particular, normal script tags encountered during parsing, since an earlier script might do a document.write and change what happens later.
As you mention, I suppose it's possible that even in this case browsers might speculatively fetch files ahead of time, even though it's not known for sure whether those files will be parsed.
It's mind-bogglingly subtle, what with executing scripts during the parse. The world would have been a lot easier if document.write had never existed.