distributed-process
Support for testing process failures and netsplits
From @hyperthunk on January 3, 2013 12:57
When testing distributed systems, it's pretty useful to be able to simulate communication errors. I propose that we write some extensions to network-transport-inmemory that allow a test coordinator to mess with the connectivity between nodes, introducing arbitrary delays, forced disconnects and so on.
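As a rough illustration of the kind of control a test coordinator might want, here is a minimal sketch of a fault table for links between nodes. Every name in it (`LinkFault`, `FaultTable`, `setFault`, `netsplit`, `faultFor`) is hypothetical and is not part of network-transport-inmemory; it only models the decision "what should happen to a message on this link", assuming nodes are identified by plain strings.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical names throughout; none of this is the
-- network-transport-inmemory API.
type Node = String

data LinkFault
  = Healthy      -- deliver immediately
  | Delayed Int  -- deliver after this many microseconds
  | Severed      -- drop the message, as in a netsplit
  deriving (Eq, Show)

-- Faults the coordinator has injected, keyed by directed link (from, to).
type FaultTable = [((Node, Node), LinkFault)]

-- Install or overwrite the fault on one directed link.
setFault :: Node -> Node -> LinkFault -> FaultTable -> FaultTable
setFault from to fault table =
  ((from, to), fault) : filter ((/= (from, to)) . fst) table

-- Sever both directions between two nodes to simulate a netsplit.
netsplit :: Node -> Node -> FaultTable -> FaultTable
netsplit a b = setFault a b Severed . setFault b a Severed

-- Decide what happens to a message on a link; unlisted links are healthy.
faultFor :: Node -> Node -> FaultTable -> LinkFault
faultFor from to = fromMaybe Healthy . lookup (from, to)
```

A transport wrapper could consult `faultFor` before each send and then deliver, sleep, or drop accordingly. Keeping the table pure means a test can build up a scenario such as `setFault "a" "c" (Delayed 50000) (netsplit "a" "b" [])` and inspect it without any networking.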
I would also like to have something like https://github.com/dluna/chaos_monkey that kills arbitrary processes on demand, but stays away from system processes and supervisors. Quite how one should identify that a process is a supervisor I don't know. Erlang uses the process dictionary for this (storing the initial_call and such things in there) but we might want to think about other approaches.
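One of the "other approaches" could be an explicit role tag recorded when a process is spawned. The sketch below assumes such a tag exists; `Role`, `Proc`, `killable` and `pickVictim` are hypothetical names, and distributed-process does not tag processes this way. It shows only the victim selection, not the killing itself.

```haskell
-- Hypothetical types; distributed-process does not tag processes like this.
data Role = Worker | Supervisor | System
  deriving (Eq, Show)

data Proc = Proc
  { procName :: String
  , procRole :: Role
  } deriving (Eq, Show)

-- Only ordinary workers are fair game; supervisors and system
-- processes are left alone.
killable :: Proc -> Bool
killable p = procRole p == Worker

-- Pick a victim from the killable processes. The caller supplies the
-- (ideally random) number, so the choice stays pure and reproducible.
pickVictim :: Int -> [Proc] -> Maybe Proc
pickVictim n procs =
  case filter killable procs of
    []         -> Nothing
    candidates -> Just (candidates !! (n `mod` length candidates))
```

Passing the random number in from outside keeps the selection deterministic under test: replaying the same seed reproduces the same sequence of kills.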
Copied from original issue: haskell-distributed/distributed-process#107
From @pankajmore on January 3, 2013 16:43
@hyperthunk Are you suggesting that we build chaos_monkey in Haskell? I have not tried chaos_monkey, but could it be used to kill Haskell processes too?
I will have a look at network-transport-inmemory and see how to simulate communication errors, but first I need to get up to speed with the deployment. It seems that my GHC 7.6 upgrade broke things; distributed-process does not seem to be compatible with ghc-7.6.1 yet.
From @hyperthunk on January 3, 2013 17:19
@pankajmore
> It seems that my 7.6 ghc upgrade broke things. seems like distributed-process is not compatible with ghc-7.6.1 yet.
Please file a bug with a complete log of the build failure so we can fix this.
From @hyperthunk on January 7, 2013 12:59
Note that network-transport-tcp (n-t-tcp) uses a script-based approach to this, which might be better, or at least provide an alternative.