open-balena

Add documentation to address when to set `PRODUCTION_MODE: "true"`

Open andrewnhem opened this issue 3 years ago • 3 comments

We've had a few users work around openBalena API crashes by setting `PRODUCTION_MODE: "true"` to force restarts.

https://forums.balena.io/t/api-container-crashing/273784
https://forums.balena.io/t/openbalena-crash-error-getaddrinfo-enotfound-api-github-com/285697

Not sure if this is just a hack/workaround, or if we want to address this in documentation, since both users and support agents have tried to find a link or context on the variable.

andrewnhem, Apr 16 '21 18:04

Should production mode be 'true' by default instead of 'false'? As far as I can tell from the code in the API and VPN, setting production mode to false is kind of like a debug/development mode.

Most people running openBalena aren't debugging or developing. I can understand why it's still 'false' by default: the system then outputs the logs that are needed when debugging a problem. But the "stability" of the system is lower, because if an error occurs in the API, it stops and you have to restart it manually.

Maybe some food for thought 🙂

bartversluijs, Jun 27 '21 08:06

Today I found that the API had been killed by the OOM killer:

Aug 18 16:37:54 e7f399138796 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=37c3cee2673d0b9e70713ddcd7c282e32f6281de1d0d36e4b287807f4b8b75b1,mems_allowed=0,global_oom,task_memcg=/system.slice/containerd.service/system.slice/open-balena-api.service,task=node,pid=48384,uid=0
Aug 18 16:37:54 e7f399138796 kernel: Out of memory: Killed process 48384 (node) total-vm:2028628kB, anon-rss:218764kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:5692kB oom_score_adj:0
Aug 18 16:37:54 e7f399138796 kernel: oom_reaper: reaped process 48384 (node), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Aug 18 16:37:54 e7f399138796 api[780]: Program node index.js exited with code null

It was not restarted, because PRODUCTION_MODE is false by default and the supervisor doesn't restart the process if the exit code !== 0. Toggled mode in config/activate, don't know if that's the best place for toggling, but it works as expected now.
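To make the behaviour described here concrete, this is a minimal Python sketch of a restart policy gated on a production flag. It is only an illustration under assumptions, not the actual openBalena entry script: the function names `should_restart` and `production_mode_from_env` are hypothetical, and the rule "restart only in production mode" is taken from the observation in this thread rather than from the service code.

```python
import os
from typing import Mapping, Optional


def production_mode_from_env(env: Mapping[str, str] = os.environ) -> bool:
    """Read the PRODUCTION_MODE flag; anything other than "true" counts as off."""
    return env.get("PRODUCTION_MODE", "false").strip().lower() == "true"


def should_restart(exit_code: Optional[int], production_mode: bool) -> bool:
    """Hypothetical policy: restart a failed process only in production mode.

    exit_code is None when the process was killed by a signal, which is
    what the "exited with code null" line in the log above corresponds to.
    """
    failed = exit_code != 0
    return failed and production_mode


if __name__ == "__main__":
    mode = production_mode_from_env()
    print("restart after OOM kill:", should_restart(None, mode))
```

With the flag left at its default of false, an OOM kill (exit code null) leaves the service down; with the flag set to true, the same exit triggers a restart.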

I agree with @bartversluijs that it should be production by default. The docs should also list the available env vars somewhere and say where to modify them.

PaulAnnekov, Aug 18 '21 17:08

I don't think defaulting to production mode is desirable at this point, considering openBalena isn't supposed to be used in production yet. Apart from restart behaviour, another change this would cause is silencing almost all output from services (definitely the API, maybe others too), which is typically useful for debugging and/or reporting issues. For example, you wouldn't really see that your instance goes OOM without digging much deeper.

> Toggled mode in config/activate, don't know if that's the best place for toggling, but it works as expected now.

This is the way to do it 👍 Alternatively, you can selectively override this (or any other environment value) per service in your config/docker-compose.yml file.
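As a rough illustration of the per-service override, a fragment of config/docker-compose.yml might look like the following. This is a sketch under assumptions: the service name `api` is assumed here and may differ in your file, so check the service names your own installation uses before copying it.

```yaml
# config/docker-compose.yml (fragment)
# Assumption: the API service is named "api" in your compose file.
services:
  api:
    environment:
      # Overrides the value for this service only; other services
      # keep whatever config/activate sets.
      PRODUCTION_MODE: "true"
```

The containers need to be recreated for the new value to take effect.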

dfunckt, Aug 18 '21 17:08