pmap icon indicating copy to clipboard operation
pmap copied to clipboard

Figure out why Travis shows failures for RBX

Open bruceadams opened this issue 10 years ago • 9 comments

The tests run fine on my local laptop on the same RBX version that is failing on Travis.

bruceadams avatar Mar 06 '15 14:03 bruceadams

Still fails occasionally, even after #12. Still sad :cry:

bruceadams avatar Nov 08 '15 12:11 bruceadams

Can you post the Travis output when it fails?

davidbiehl avatar Nov 08 '15 18:11 davidbiehl

This is the most recent failure: https://travis-ci.org/bruceadams/pmap/jobs/89927576 I'm assuming that's a public web page. It doesn't always fail. When it does failure, the failure is consistent, including before your enhancements:

Failure:
  Parallel sleeps too slow: 3.2 seconds.
  <false> is not true.
test_defaut_thread_limit(Pmap_Test)
kernel/bootstrap/proc.rb:20:in `call'
kernel/bootstrap/proc.rb:20:in `call'
test/pmap_test.rb:61:in `test_defaut_thread_limit'
     58:     (1..128).pmap{ sleep 1 }
     59:     elapsed = Time.now-start
     60:     assert(elapsed >= 2, 'Limited threads too fast: %.1f seconds' % elapsed)
  => 61:     assert(elapsed <  3, 'Parallel sleeps too slow: %.1f seconds' % elapsed)
     62:   end
     63: 
     64:   def test_peach_with_index

Since I just did a release of 1.1.0, there are several more jobs running now: https://travis-ci.org/bruceadams/pmap/builds (again, I'm assuming you can see that).

bruceadams avatar Nov 08 '15 19:11 bruceadams

I think the intent of the test is confused. The test that fails is test_defaut_thread_limit but we're actually asserting that this code will finish in less than 3 seconds ... which is a performance test. I'm always leery of performance testing on shared CI infrastructure.

Can we find a better way to test the default thread limit that doesn't depend on the performance of the system or implementation running it?

All we really need to test is that the thread pool will receive a limit of 64 by default (aka, when we don't override it with an argument). A separate test could confirm that the thread pool adheres to the given limit.

davidbiehl avatar Nov 09 '15 15:11 davidbiehl

Several of the tests use sleep and timing to check that threading is really happening. I've never been very happy about it, but I want tests that really demonstrate that threading is happening.

The other tests leaning on sleep use ten threads or less. Those tests remain well behaved.

This failing test is using 64 threads. Apparently starting 64 threads on Rubinius in a Travis-CI container can take a full second of real time. I find that surprisingly slow, but agree it is boiling down to performance, which is not the goal. Using a mock in this to see that the default pool size gets used gets us what we need and avoids this failure.

Thanks for helping think this one through!

bruceadams avatar Nov 09 '15 16:11 bruceadams

Check out this commit as a potential solution: https://github.com/davidbiehl/pmap/commit/a54e1d75b7b1f79bb320cd4ffb5b12325922d698

I'm not quite sure if I like it ... but it ...

  • demonstrates that multiple threads are being created
  • demonstrates that the thread limit is being observed
  • doesn't use sleep and timing

If you're OK with it, I'll create a PR from that branch and have Travis test it.

davidbiehl avatar Nov 10 '15 00:11 davidbiehl

Fixed via 5091a3e6df1e9671500e30db73aabdc879f0a596

bruceadams avatar Nov 21 '15 20:11 bruceadams

Yikes! A new surprise:

https://travis-ci.org/bruceadams/pmap/jobs/92478348

Failure: test_defaut_thread_limit(Pmap_Test)
kernel/bootstrap/proc.rb:20:in `call'
test/pmap_test.rb:71:in `test_defaut_thread_limit'
     68: 
     69:   def test_defaut_thread_limit
     70:     last_size = (1..128).pmap{ ThreadGroup::Default.list.size }.last
  => 71:     assert_equal(65, last_size) # 65 = 64 threads, plus the main thread
     72:   end
     73: 
     74:   def test_peach_with_index
<65> expected but was
<66>

bruceadams avatar Nov 21 '15 20:11 bruceadams

I wonder if checking the ThreadGroup size is really an effective test ... I think checking sleep times is probably better

davidbiehl avatar Nov 21 '15 21:11 davidbiehl