whateverable
segfaults and other stability issues
See this: http://irclog.perlgeek.de/perl6/2016-08-19#i_13055233
Pretty sure that it is not our fault, but we have to rakudobug it.
- [x] Seems like RT #129291 is the most common problem at this moment. Once that is fixed, we will probably see other issues.
- [x] RT #129291 was fixed, next problem is RT #129781.
- [x] RT #129781 was fixed. The next problem is that a process is not killed if it writes a lot to the stdout of Proc::Async (see RT #130370). It is not a problem for us anymore, because a workaround was added in commit c564d8de71e5049e8c93d760fbc6af3316326996.
As of today, there are no segfaults. I still have to write tests for some of the cases mentioned here, but generally it is not an issue anymore.
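For anyone who wants to poke at the Proc::Async scenario above, here is a minimal sketch of the situation: a child that floods stdout and has to be killed after a timeout. This is not the workaround from the commit mentioned above; `run-with-timeout` and the returned hash keys are hypothetical names made up for this illustration, and the sketch assumes a rakudo where killing such a process works.

```raku
# Hypothetical helper (not from whateverable): run a command, drain its
# stdout, and kill the child if it is still running after $seconds.
sub run-with-timeout(Numeric $seconds, *@command) {
    my $proc  = Proc::Async.new: |@command;
    my $bytes = 0;
    # Tap stdout before starting, so that the output is consumed and counted.
    $proc.stdout(:bin).tap: { $bytes += .elems };

    my $done = $proc.start;
    # Wait until either the child exits or the timeout expires.
    await Promise.anyof: $done, Promise.in($seconds);

    my $timed-out = $done.status ~~ Planned;
    $proc.kill(SIGKILL) if $timed-out;

    my $result = await $done;   # a Proc object describing how the child ended
    return %( :$timed-out, :$bytes,
              exitcode => $result.exitcode,
              signal   => $result.signal );
}

# Usage: a child that prints forever should be stopped after one second.
my %report = run-with-timeout 1, $*EXECUTABLE.absolute, '-e',
                              'loop { say "x" x 1000 }';
say %report;
```

If the final `await` never returns on some rakudo version, that would be the kind of behavior the ticket describes.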
- [x] OK, bots are not stable anymore. I think it's due to https://github.com/rakudo/rakudo/commit/9658dd98c9.
Getting stuff like this:
MoarVM panic: Internal error: invalid thread ID 284 in GC work pass
Didn't look into it deeply at all, but leaving a note here anyway.
Can be reproduced by running t/bisectable.p6 on the server (sometimes you may get lucky and the whole file will pass, but usually it crashes halfway through).
I’m getting those a lot too (while debugging HTTP::Server::Async issues).
(In reply to the comment above, posted by Aleks-Daniel Jakimenko-Aleksejev on July 27, 2017.)
Seems to be alright now after fixes by @jnthn++.
- [x] Currently most bots are leaking memory (which is why some things are slower than they were before).
The leak was reported in RT #131879, and right now it is partially fixed: the bots no longer leak as much. Memory usage still increases if you keep throwing non-existent commits at the bots, but given 16 GB of RAM on the server this is hardly a problem.
Right now, the bots are stable.
- [x] Quotable does not work (and has not been working for a while): RT #131961. Greppable is also affected, but it is more or less usable.
RT #131961 is resolved, waiting for the next bug to appear now.
- [x] Well, didn't have to wait for too long. Most bots can't pass their tests, I don't know why yet. Things seem to hang.
- [x] Possibly it's RT #132030.
Well, I guess it's not. Things are still broken though.
- [x] Also, we're stuck with a non-HEAD version of rakudo because of RT #132191. See also https://github.com/zoffixznet/perl6-IRC-Client/issues/51.
OK, RT #132191 turned out to be an issue in IRC-Client (it was relying on a rakudo bug).
Now there are at least two other problems. Bisectable fails with this output:
ok 60 - Did you mean “HEAD” (new)?
# Failed to get expected result in 11.04535627 seconds (11 nominal)
not ok 61 - Did you mean “HEAD” (old)?
# Failed test 'Did you mean “HEAD” (old)?'
# at /home/bisectable/git/whateverable/t/lib/Testable.pm6 (Testable) line 81
# expected: ["testable742093, Cannot find revision “DEAD” (did you mean “HEAD”?)"]
# matcher: 'infix:<~~>'
# got: []
# Test failed. Stopping test suite, because PERL6_TEST_DIE_ON_FAIL environmental variable is set to a true value.
# Failed to get expected result in 11.04317088 seconds (11 nominal)
not ok 62 - _
# Failed test '_'
# at /home/bisectable/git/whateverable/t/lib/Testable.pm6 (Testable) line 81
# expected: [-> ;; $_? is raw { #`(Block|84942264) ... }]
# matcher: 'infix:<~~>'
# got: []
# Test failed. Stopping test suite, because PERL6_TEST_DIE_ON_FAIL environmental variable is set to a true value.
There is no reason why test 61 would fail. Actually, it passes if you put it higher in that file. I don't know what's going on there, but most likely it's an issue in rakudo.
The second problem is that it runs some other test after the first test failed. Why? It should not be like that.
Ah OK, the ‘_’ test is an issue in whateverable. Never mind that. Why test 61 fails in the first place is beyond me, however.
- [x] So, to make this clear, the tests are still failing. It simply stops dead when performing these tests: https://github.com/perl6/whateverable/blob/e9ccebadca9a44e4a27a2325737308828568786b/t/bisectable.t#L165-L170
The code involves a lot of calls to the Text::Diff::Sift4 module, but nothing special really. This issue didn't exist a few releases ago, and I am really not sure when exactly it appeared.
The same test works fine in committable.t, and actually if you move these tests higher in the bisectable.t file, they will pass. Really weird stuff going on.
Alright, some progress on that! First of all, it doesn't hang, it segfaults. I thought it was hanging because the test suite does not detect when the bot process dies unexpectedly, so there was no easy way to notice. I now have some code that will help notice this in the future and will commit it soon.
Now, the segfault happens in the react block here: https://github.com/perl6/whateverable/blob/e9ccebadca9a44e4a27a2325737308828568786b/lib/Whateverable.pm6#L220-L232
So, that's easy now, right? Just run it under valgrind and you'll immediately see the issue…
Ha-ha.
Nope. You run it under valgrind, and the issue goes away. 💩
I suspect that we may be seeing something like https://github.com/rakudo/rakudo/issues/1202 here, but it's hard to tell.
Same issue under gdb: https://gist.github.com/MasterDuke17/0312dd2af1e3b2b498d91cfacc45343c
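Since the confusion above came from the test suite not noticing that the bot process had died, here is a rough sketch of how a harness could tell a clean exit from a crash. It is not the code that was committed to whateverable; `run-and-classify` is a hypothetical helper, and it only relies on the `exitcode` and `signal` methods of the Proc object that `Proc::Async.start` eventually produces.

```raku
# Hypothetical helper (not from whateverable): run a command and describe
# how it ended, so that a crash is not mistaken for a hang.
sub run-and-classify(*@command) {
    my $proc   = Proc::Async.new: |@command;
    my $result = await $proc.start;   # a Proc object, available once the child exits

    # A non-zero signal means that the child was killed (for example by SIGSEGV).
    return "killed by signal { $result.signal }"   if $result.signal;
    return "exited with code { $result.exitcode }" if $result.exitcode;
    return 'exited cleanly';
}

# Usage: a child that exits with a non-zero status on purpose.
say run-and-classify $*EXECUTABLE.absolute, '-e', 'exit 3';
```

A test helper could call something like this when an expected reply does not arrive, and fail right away with the reported reason instead of waiting for the timeout.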
- [x] Reportable is currently suffering from this issue (SEGV): https://github.com/rakudo/rakudo/issues/1278
- [x] Bots are currently leaking memory like crazy. I will probably turn off some of them so that they don't max out the memory usage on the server.
- [x] Just had this intermittent fail:
Cannot find method 'specialize' on object of type NQPClassHOW
On this line: https://github.com/perl6/whateverable/blob/46337991a954885fe4c535319275bbb6f797b391/lib/Whateverable.pm6#L326
I cannot reproduce it, so we will just let it be…
- [x] MoarVM ticket that is probably related to our current memory leaks: https://github.com/MoarVM/MoarVM/issues/680
- [x] More or less isolated memory leak: https://github.com/rakudo/rakudo/issues/1501
- [x] I think it no longer leaks as much, but now bisectable segfaults here: https://github.com/perl6/whateverable/blob/177b77cb2ebc045736b8e7a1cf6eb8e25fdce7b6/t/bisectable.t#L186-L191
Again, there's nothing special about this test. And if you look closely, the previous tests have been commented out because they were causing another segfault earlier. Here's the ticket: https://github.com/rakudo/rakudo/issues/1259
Bots no longer leak memory like crazy, so that issue is resolved. Bisectable still can't get through its tests though.
- [ ] Issue #296 / R#1595
OK, issue #296 can be worked around like this:
-my $host-arch = $*KERNEL.hardware;
+my $host-arch = ‘x86_64’;
$host-arch = ‘amd64’|‘x86_64’ if $host-arch eq ‘amd64’|‘x86_64’;
-$host-arch = $*KERNEL.name ~ ‘-’ ~ $host-arch;
+$host-arch = ‘linux’ ~ ‘-’ ~ $host-arch;
Heh. Not committing this to the repo because I'm hoping it'll get resolved relatively quickly.
Could you try this diff and see if that makes a difference?
$ git diff
diff --git a/src/core/Kernel.pm6 b/src/core/Kernel.pm6
index 1cde4c4..7ce4cf8 100644
--- a/src/core/Kernel.pm6
+++ b/src/core/Kernel.pm6
@@ -180,8 +180,8 @@ class Kernel does Systemic {
     }
 }
-Rakudo::Internals.REGISTER-DYNAMIC: '$*KERNEL', {
+#Rakudo::Internals.REGISTER-DYNAMIC: '$*KERNEL', {
     PROCESS::<$KERNEL> := Kernel.new;
-}
+#}
If it does, then it’s something in the auto-vivification of dynamic variables that’s to blame, and not something specific to $*KERNEL.
(In reply to the comment above, posted by Aleks-Daniel Jakimenko-Aleksejev on 7 Mar 2018.)