
CPU and memory resource detection should be Docker container aware.

Open jacarey opened this issue 4 years ago • 1 comments

Background:

Docker can place memory and CPU limits on processes running inside containers, restricting them to a subset of the system's resources. However, the current resource detection in dagr is not aware of the control groups (cgroups) that enforce these limits.

Because of this, the dagr pipeline can over-allocate resources, which can cause the container to exit with an out-of-memory error.

Potential Solutions:

The current resource detection uses the oshi library, whose recommended "fix" is to read the limits from the files on the underlying OS that contain them: https://github.com/oshi/oshi/issues/893

This requires the resource detection to include logic that determines whether it is running in a container and, if so, where to look for those files.
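To make the file-based approach concrete, here is a rough Java sketch of what reading the cgroup limits might look like. It is not dagr or oshi code: the class `CgroupLimits` and all of its method names are hypothetical, and the paths assume the common cgroup v2 (`/sys/fs/cgroup/memory.max`, `/sys/fs/cgroup/cpu.max`) and cgroup v1 (`/sys/fs/cgroup/memory/memory.limit_in_bytes`) mount points, which may differ on some hosts.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.OptionalInt;
import java.util.OptionalLong;

// Hypothetical helper; names and paths are assumptions, not part of dagr or oshi.
final class CgroupLimits {
    // Assumed default mount points; a real implementation may need to consult /proc/self/mountinfo.
    static final Path V2_MEMORY = Paths.get("/sys/fs/cgroup/memory.max");
    static final Path V1_MEMORY = Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes");

    /** Parses the contents of a memory limit file. Empty means "no container limit found". */
    static OptionalLong parseMemoryLimit(String raw, long hostMemoryBytes) {
        String s = raw.trim();
        // cgroup v2 writes the literal string "max" when no limit is set.
        if (s.isEmpty() || s.equals("max")) return OptionalLong.empty();
        try {
            long limit = Long.parseLong(s);
            // cgroup v1 reports a very large number when unlimited, so any value
            // at or above the host's memory is treated as "no limit".
            if (limit <= 0 || limit >= hostMemoryBytes) return OptionalLong.empty();
            return OptionalLong.of(limit);
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    /** Converts a CPU quota and period (both in microseconds) into a whole number of cores. */
    static OptionalInt coresFromQuota(long quotaMicros, long periodMicros) {
        // cgroup v1 uses a quota of -1 to mean "unlimited".
        if (quotaMicros <= 0 || periodMicros <= 0) return OptionalInt.empty();
        double cores = Math.ceil((double) quotaMicros / periodMicros);
        return OptionalInt.of((int) Math.max(1, cores));
    }

    /** Parses the contents of a cgroup v2 cpu.max file, e.g. "150000 100000" or "max 100000". */
    static OptionalInt parseCpuMax(String raw) {
        String[] parts = raw.trim().split("\\s+");
        if (parts.length != 2 || parts[0].equals("max")) return OptionalInt.empty();
        try {
            return coresFromQuota(Long.parseLong(parts[0]), Long.parseLong(parts[1]));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    /** Reads a file if it exists and is readable; otherwise returns empty. */
    static Optional<String> readIfPresent(Path path) {
        try {
            if (!Files.isReadable(path)) return Optional.empty();
            return Optional.of(new String(Files.readAllBytes(path), StandardCharsets.UTF_8));
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    /** Tries cgroup v2 first, then v1. Empty when not in a limited container. */
    static OptionalLong containerMemoryLimit(long hostMemoryBytes) {
        Optional<String> raw = readIfPresent(V2_MEMORY);
        if (!raw.isPresent()) raw = readIfPresent(V1_MEMORY);
        return raw.isPresent() ? parseMemoryLimit(raw.get(), hostMemoryBytes) : OptionalLong.empty();
    }
}
```

A caller would fall back to the host totals (for example, the values oshi already reports) whenever these methods return empty. The sketch also shows the maintenance cost of this option: both cgroup versions, and their different "unlimited" conventions, have to be handled by hand.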

Alternatively, JDK 10 has Docker-aware resource detection, which has been back-ported to JDK 8u191 and later.

https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8196595

These features are on by default, so resource querying does not need any logic to determine whether it is running in a container.
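For comparison, a minimal sketch of the JDK-based option, using only the standard `java.lang.Runtime` API. The class name `JvmResources` is hypothetical. Note one caveat: `Runtime.maxMemory()` returns the maximum heap the JVM will attempt to use, not the container's memory limit itself. On a container-aware JVM the default heap size is derived from the container limit, so dagr would likely need to account for that difference.

```java
// Hypothetical helper; only the java.lang.Runtime calls are real JDK API.
final class JvmResources {
    /** Number of processors available to the JVM; container-aware JVMs honor cgroup CPU limits here. */
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Maximum heap size in bytes that the JVM will attempt to use. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("cores: " + availableCores());
        System.out.println("max heap (bytes): " + maxHeapBytes());
    }
}
```

Running this inside a container started with, say, `docker run --cpus=2 --memory=4g ...` on a container-aware JVM should report values that reflect those limits rather than the host's totals. The relevant HotSpot flag is `-XX:+UseContainerSupport`, which is enabled by default on the JDK versions mentioned above.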

jacarey avatar Apr 15 '20 19:04 jacarey

I am +1 on this! The bug is rarely hit for me, but when it appears, the entire job comes crashing down.

clintval avatar Apr 18 '20 17:04 clintval