Rultor Needs to Allow Disabling Swapping for Containers

Open original-brownbear opened this issue 8 years ago • 14 comments

Rultor currently starts Docker containers with two somewhat tricky flags:

  • --memory-swap=16g
  • --oom-kill-disable

This makes tests that involve timeouts almost impossible to debug. Certain software, e.g. ZooKeeper, slows the system to a near standstill when it swaps, yet the process that needs more than the available physical memory is never actually killed.

--oom-kill-disable is even worse: a container that allocates too much memory freezes completely without any error. See https://github.com/docker/docker/issues/14440 for the somewhat interesting details on this :)

At any rate, we cannot simply remove this behaviour: Rultor is often under heavy load, and turning it off in general would make builds fail all over the place. For the time being, we should offer an option in .rultor.yml to disable it for a single build, so that builds whose behaviour depends heavily on not swapping die instead and at least produce a controlled, understood situation.

Task requires:

  • Adding an allow-swapping option to .rultor.yml
    • Set it to true by default, retaining the old behaviour
    • If set to false, the container must be started without the --memory-swap=16g --oom-kill-disable part of the docker run command in _head.sh.
  • Put a brief description of the feature in the Readme, explaining that tests involving timeouts and waits that are unstable on Rultor should try turning off swapping
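
To make the requirement concrete, here is a minimal sketch of how the flag selection could look in shell. It is not the actual _head.sh code: the function name memory_flags and the way the allow-swapping value reaches the script are assumptions made for illustration only.

```shell
#!/bin/bash
# Hypothetical helper (not part of Rultor): prints the memory-related
# flags for `docker run`, depending on the proposed allow-swapping option.
# $1 is the option value; it defaults to "true" to retain the old behaviour.
memory_flags() {
  local allow_swapping="${1:-true}"
  if [ "${allow_swapping}" = "false" ]; then
    # Swapping disabled: print nothing, so the container starts without
    # --memory-swap and --oom-kill-disable and can be killed on OOM.
    return 0
  fi
  echo "--memory-swap=16g --oom-kill-disable"
}

# Possible use when assembling the command (left unquoted on purpose so
# the two flags split into separate arguments):
#   docker run $(memory_flags "${allow_swapping}") ...
```

Any value other than false keeps the current flags, so builds that do not set the option would behave exactly as before.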

original-brownbear avatar Mar 15 '16 21:03 original-brownbear

@alex-palevsky this is a bug.

original-brownbear avatar Mar 15 '16 21:03 original-brownbear

@alex-palevsky this is a bug.

@original-brownbear I added bug tag to this ticket

alex-palevsky avatar Mar 16 '16 13:03 alex-palevsky

@original-brownbear I added milestone 2.0 to this issue, let me know if there has to be something else

alex-palevsky avatar Mar 16 '16 14:03 alex-palevsky

@original-brownbear thanks a lot for reporting, 30 mins added to your acc, pmt ID AP-7A52811671755653U

alex-palevsky avatar Mar 16 '16 16:03 alex-palevsky

@alex-palevsky this is urgent.

original-brownbear avatar Mar 16 '16 19:03 original-brownbear

@alex-palevsky this is urgent.

@original-brownbear sure, thanks, I added "urgent" label to it

alex-palevsky avatar Mar 17 '16 13:03 alex-palevsky

@alex-palevsky assign someone to this issue please if possible.

original-brownbear avatar Mar 27 '16 15:03 original-brownbear

@alex-palevsky assign someone to this issue please if possible.

@original-brownbear OK

alex-palevsky avatar Mar 30 '16 12:03 alex-palevsky

@alex-palevsky I'm ready to take this.

xupyprmv avatar Apr 01 '16 11:04 xupyprmv

@alex-palevsky this is not urgent, let's focus on ECS full force for now.

original-brownbear avatar Apr 03 '16 17:04 original-brownbear

@alex-palevsky this is postponed, let's focus on ECS ...

original-brownbear avatar Apr 03 '16 17:04 original-brownbear

@alex-palevsky this is not urgent, let's focus on ECS full force for now.

@original-brownbear thanks, I removed the "urgent" tag

alex-palevsky avatar Apr 03 '16 19:04 alex-palevsky

@alex-palevsky this is postponed, let's focus on ECS ...

@original-brownbear thanks, I added "postponed" label

alex-palevsky avatar Apr 03 '16 19:04 alex-palevsky

@alex-palevsky this is postponed, let's focus on ECS ...

@original-brownbear I will assign somebody else to this issue

alex-palevsky avatar Apr 03 '16 19:04 alex-palevsky