nim-chronos
nim-chronos copied to clipboard
index error in readStreamLoop of transports/stream
I'm experiencing runtime crashes from unhandled exceptions re: nim-chronos, with L1436 in transports/stream.nim
reported as the culprit:
/Users/michael/repos/nim-dagger/experiments/localstore.nim(117) localstore
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(1091) waitFor
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/repos/nim-dagger/experiments/localstore.nim(114) main
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/transports/stream.nim(1436) readStreamLoop
/Users/michael/.choosenim/toolchains/nim-1.6.0/lib/system/fatal.nim(53) sysFatal
[[reraised from:
/Users/michael/repos/nim-dagger/experiments/localstore.nim(117) localstore
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(1091) waitFor
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
]]
Error: unhandled exception: index 4096 not in 0 .. 4095 [IndexDefect]
I've tried with Nim v1.2.14, v1.4.8, v1.6.0 — same thing regardless of version.
When I (naively) change L1435 from:
while true:
to:
while transp.offset < transp.buffer.len:
I no longer experience that runtime crash.
Context
I'm doing some early stage experiments combining use of task_runner
impl_beta_2 branch's threadpool with chronos' http server and async pipes:
https://github.com/michaelsbradleyjr/nim-dagger/blob/localstore/experiments/localstore.nim#L77-L78
Re: those two lines in particular, I only experience the runtime crash (i.e. without change mentioned above) if L77 is enabled and L78 is disabled, i.e. only when the Future returned by writer.write
is not discarded.
I can create separate issues re: the following, but this issue feels like a decent place to start the discussion/s...
(1) When composing async streams, is the way I did it in experiments/localstore.nim
, via readMessage
and a predicate, the way to go about it? Or is there another API for that purpose? I spent awhile looking through the tests and implementations of nim-chronos, but I may have missed something or a lot of things.
(2) Is there a reason why ReadMessagePredicate
of transports/stream.nim
does not return a Future? If it did, then my L77-L78 linked above could be replaced with:
asyncSpawn writer.write(@data, data.len)
And the index error would get re-raised as a Defect per asyncSpawn
.
(3) When the crash reported by this issue is "fixed", or ignored by discarding the Future returned by writer.write
, I'm experiencing a similar looking error when I try to increase the number of concurrent large-ish uploads to the running server (~7 MB per POST request) from a couple of dozen to several dozen:
/Users/michael/repos/nim-dagger/experiments/localstore.nim(117) localstore
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(1091) waitFor
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/repos/nim-dagger/experiments/localstore.nim(114) main
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/apps/http/httpserver.nim(748) processLoop
/Users/michael/repos/nim-dagger/experiments/localstore.nim(68) process
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/transports/stream.nim(2516) fromPipe
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(560) register
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/ioselects/ioselectors_kqueue.nim(221) registerHandle
/Users/michael/.choosenim/toolchains/nim-1.6.0/lib/system/fatal.nim(53) sysFatal
[[reraised from:
/Users/michael/repos/nim-dagger/experiments/localstore.nim(117) localstore
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(1091) waitFor
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/repos/nim-dagger/experiments/localstore.nim(114) main
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/apps/http/httpserver.nim(748) processLoop
]]
[[reraised from:
/Users/michael/repos/nim-dagger/experiments/localstore.nim(117) localstore
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(1091) waitFor
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/repos/nim-dagger/experiments/localstore.nim(114) main
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/.choosenim/toolchains/nim-1.6.0/lib/system/excpt.nim(144) processLoop
]]
[[reraised from:
/Users/michael/repos/nim-dagger/experiments/localstore.nim(117) localstore
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(1091) waitFor
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
/Users/michael/repos/nim-dagger/experiments/localstore.nim(114) main
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
]]
[[reraised from:
/Users/michael/repos/nim-dagger/experiments/localstore.nim(117) localstore
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(1091) waitFor
/Users/michael/.nimble/pkgs/chronos-#export-selector-field/chronos/asyncloop.nim(279) poll
]]
Error: unhandled exception: index -1 not in 0 .. 49151 [IndexDefect]
I've tried with Nim v1.2.14, v1.4.8, v1.6.0 — same thing regardless of version. With v1.6.0, if I add --gc:orc
I get the same crash but the stacktrace is missing, I only get the last line about index -1 ...
and it's sometimes garbled with other characters.
This one surely deserves its own bug report, but the reason I mention it here is that it has a (maybe superficial) resemblance to the problem reported at the top of this issue, i.e. involves another loop processor and index error.
Oooh, this is a bug of the kind that's been with us for quite a while - I'm not sure were never able to track it down in a satisfactory way - do you have a stable repro?
do you have a stable repro?
Yes. I'll give the steps to repro, and if it seems too hacky then I'll try to make a branch where the repro is more self-contained, i.e. less need to do prep work.
-
Clone my fork of nim-dagger and switch to the
localstore
branch: https://github.com/michaelsbradleyjr/nim-dagger/tree/localstore. -
Make or find some several+ MB file that you'll be uploading to the experimental web server + task_runner. And you'll want a script like this:
upload.sh
#!/usr/bin/env bash
filename="$(basename -- "$1")"
extension="${filename##*.}"
filename="${filename%.*}"
for x in {01..24}; do
curl -i -d @$1 -H "Content-Type: application/json" -X POST \
localhost:30080/upload/$filename$x.$extension &
done
-
In
experiments/localstore.nim
, enable L77 and disable L78; and disable L112. -
Run
mkdir -p experiments/files && nimble localstore
. -
After (4) has invoked the built
experiments/localstore
binary, in another terminal runupload.sh [your file]
Shortly after invoking upload.sh
, in the terminal running nimble localstore
you should see localstore
crash, and the output will include:
Error: unhandled exception: index 4096 not in 0 .. 4095 [IndexDefect]
@michaelsbradleyjr I think we finally found the root cause: https://github.com/status-im/nim-chronos/pull/267 - can you run your test again?