webmachine-ruby
The consistently failing spec
This has been mentioned in the other PR, and I remember a couple of random build failures in the past. Can someone point me to a failed build or provide intel? Let's try to come up with a fix. :zap:
Pretty sure this is a cascading failure in the adapter specs. Seems the most common culprit is the Reel adapter spec. If it fails, the next adapter test fails too. I struggled with the flakiness of these tests for a long time. They pass consistently on my machine every time, but not so much on Travis.
Can you point me at one of the Reel failures?
Here's a JRuby test run where the Reel adapter fails. Later, the Rack adapter fails.
https://travis-ci.org/seancribbs/webmachine-ruby/jobs/12748305
I can reproduce this, though not reliably, with: bundle exec rspec --order rand:53167 spec/webmachine/
That's the spec seed of https://travis-ci.org/seancribbs/webmachine-ruby/jobs/12748303
I'm not sure we should bother fixing the root cause of this leak. We could just as well use a fresh random port for each example and let garbage collection take care of the rest. The ports get released when the process exits anyway.
Example: https://github.com/lgierth/aeee/blob/master/spec/support/helper.rb#L15-L20
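For illustration, a minimal sketch of the random-port idea, not the linked helper itself. The method name random_free_port is hypothetical. It asks the OS for an unused port by binding to port 0 and then releases it, so there is a small window in which another process could grab the port before the adapter under test binds to it.

```ruby
require "socket"

# Hypothetical helper (not part of webmachine-ruby): ask the OS for a
# currently unused TCP port by binding to port 0, then release it.
def random_free_port
  server = TCPServer.new("127.0.0.1", 0)
  server.addr[1] # addr is [family, port, hostname, address]
ensure
  server.close if server
end
```

Each example could then call random_free_port when it configures its adapter, instead of sharing one fixed port across the whole run.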
Yeah, I previously had the adapter tests do just that (89ce86e568eb368b5c05f84f2dd1f97d45b7a26d). I removed that because I thought I had gotten the servers to play nice and shut down.
Seems like web servers generally expect to be at the center of the universe. Maybe the Adapter#shutdown API is just a pipe dream.
(PR reference was a typo)
One of today's builds shows a new error from Reel on rbx-19mode: https://travis-ci.org/seancribbs/webmachine-ruby/jobs/14224187#L1622
Could this be the actual error? Maybe it gets swallowed most of the time for some reason. It also always seems to be the same three examples that fail, except for the JRuby failure mentioned above, where all of the examples in that example group fail.
On the other hand, the error could just as well be a different symptom of something network-related.
cc @tarcieri
@lgierth hmmm...
ArgumentError: Data object has already been freed
This could be either an http_parser.rb bug or a Reel bug. Not sure? /cc @tmm1
I just experienced another symptom: the first example hangs indefinitely, and isn't even affected by the timeout(5) in #reel_server: https://gist.github.com/lgierth/7554999
Note how I send SIGINT in line 6, and the difference in time between the former and latter log timestamps.
Also note the spec command, which uses --order rand:53167 as an earlier CI failure did (see my first comment after opening this issue), but runs only the Reel spec.
the first example hangs indefinitely
And obviously also the second example, as I sent SIGINT again in line 7...
This is possibly inside of Celluloid's at_exit handler /cc @halorgium
here's an example of random Reel errors I see often: https://travis-ci.org/seancribbs/webmachine-ruby/jobs/14209468 it happened on rbx this time.
@robgleeson that's the same error:
ArgumentError: Data object has already been freed
Again this is either a bug in:
- rbx
- http_parser.rb
I would guess http_parser.rb if the same failure occurs on JRuby.
If the problem is Travis running multiple adapters in one build, and granted that that's not a real use case (?), could the adapter be parameterized to an env var and the build split into per-adapter?
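A rough sketch of what the env var approach could look like. Everything here is an assumption for illustration: the variable name WEBMACHINE_ADAPTER, the method name, and the spec/webmachine/adapters/ layout are hypothetical, not taken from the repository.

```ruby
# Hypothetical: pick which adapter specs to run from an environment
# variable, so that each CI job can exercise a single adapter.
# The variable name and the directory layout are assumptions.
def adapter_spec_pattern(adapter = ENV["WEBMACHINE_ADAPTER"])
  name = adapter.to_s.strip.downcase
  # No adapter selected: fall back to running every adapter spec.
  return "spec/webmachine/adapters/*_spec.rb" if name.empty?

  "spec/webmachine/adapters/#{name}_spec.rb"
end
```

The build matrix would then set the variable once per job, and each job would hand the resulting pattern to rspec in its own process.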
@samwgoldman regarding shutdown vs. random port assignment, I'm having trouble with shutdown and Rack+Mongrel. @server.server.shutdown seems to work only with WEBrick.
On a side note, I also noticed that only the WEBrick and Hatetepe adapters register shutdown as a SIGINT handler. FWIW, Rack::Handler#start traps SIGINT as well, and it simply exits the process unless the adapter implements shutdown.
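To make "register shutdown as a SIGINT handler" concrete, here is a generic sketch using only Ruby's Signal.trap. FakeServer and trap_sigint_for are hypothetical stand-ins, not the adapters' actual code; a real adapter would call its server's own shutdown method inside the block.

```ruby
# Hypothetical stand-in for a real server; only #shutdown matters here.
class FakeServer
  attr_reader :stopped

  def initialize
    @stopped = false
  end

  def shutdown
    @stopped = true
  end
end

# Registers server.shutdown as the SIGINT handler.
# Signal.trap returns the previously installed handler, so the caller
# can restore it later.
def trap_sigint_for(server)
  Signal.trap("INT") { server.shutdown }
end
```

Without such a handler, SIGINT falls through to whatever was installed before, which by default interrupts the process.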
@lgierth I can confirm that the shutdown implementation on the Rack adapter only works with Rack+WEBrick. This is definitely a problem. From the land of wishes, I really wish Ruby web servers provided usable shutdown methods. Until they do, it would seem that Adapter#shutdown is not a realistic API to maintain.
I guess that leaves us with two options: 1) random port assignment for adapter tests in a single test process and 2) separate test processes for each adapter. I'll also note that adapters can now be extracted into gems which depend on the exported RSpec shared examples.
I somewhat snuck the Adapter#shutdown method into the public API, so its removal might require a greater-than-patch-level version bump.
I somewhat snuck the Adapter#shutdown method into the public API, so its removal might require a greater-than-patch-level version bump.
From what I understand, adding support for PUT will require a major bump as well, so maybe we could combine the two.
more fails that look random/buggy in nature: https://travis-ci.org/seancribbs/webmachine-ruby/jobs/14867321 (rbx) https://travis-ci.org/seancribbs/webmachine-ruby/jobs/14867321 (jruby)
@robgleeson the Reel adapter is crashing on rbx. Not sure what's at fault: rbx or http_parser.rb. See:
ArgumentError: Data object has already been freed
Maybe @brixen or @dbussink can help? (or @tmm1 on the http_parser.rb side perhaps)
And my comments mysteriously disappear. Anyway, the failures I linked to are not related to that error. It looks like Reel can't acquire a port in one of its specs.
@robgleeson in those tests I'm seeing Reel crash due to that error, and a connection refused error because the server isn't running... unless there's a different test failure I'm not seeing
Ah, you're right, I didn't see that happen earlier in the trace. I'm not sure where the other failures (JRuby/MRI) are coming from. They look different and unrelated.
jruby on Travis refers to a 1.9 implementation, right? If it does, I think we can get rid of the RUBY_VERSION guards from the Gemfile. I don't think 1.8 is being tested against anymore.
The build passed this time, after dropping the RUBY_VERSION guard: https://travis-ci.org/seancribbs/webmachine-ruby/builds/14881037
@robgleeson Travis switched the default for jruby (some time back) and rbx to 1.9 recently, hence #138. So yeah we must have tested against jruby-18mode until some point.
Officially dropping 1.8 support makes one more reason for a major release :)
There are a couple of other RUBY_VERSION branches in the specs, we should remove these as well then.
Officially, it's already unsupported. :)
Sean Cribbs
On Dec 3, 2013, at 4:28 PM, Lars Gierth wrote:
@robgleeson Travis switched the default for jruby (some time back) and rbx to 1.9 recently, hence #138. So yeah we must have tested against jruby-18mode until some point.
Officially dropping 1.8 support makes one more reason for a major release :)
There are a couple of other RUBY_VERSION branches in the specs, we should remove these as well then.