oxidized
pid files not cleaned up reliably
PID files do not get cleaned up properly. I guess this is not normally an issue in a regular installation, but it is particularly noticeable when running the Docker image: the processes get the same IDs every time the container boots, so startup gets stuck in an infinite loop waiting for the previous process to exit when it is not actually running.
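To illustrate why PID reuse defeats this kind of check, here is a minimal sketch of a liveness test based only on the number stored in the pid file. It is not oxidized's actual implementation; the function name `pid_is_live` is hypothetical. In a fresh container the recorded PID is often handed out again, so a check like this reports "live" even though the old process is long gone.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the PID recorded in the given pid file
# refers to a process that currently exists.
pid_is_live() {
    pidfile=$1
    # No pid file means nothing is recorded as running.
    [ -f "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 sends no signal; it only tests that the PID can be signalled.
    # It cannot tell whether that PID still belongs to the original process.
    kill -0 "$pid" 2>/dev/null
}
```

For example, `pid_is_live /root/.config/oxidized/pid && echo "looks alive"` would print the message whenever any process holds the recorded PID, including the newly started one.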
Steps to reproduce:
docker compose up -d
docker compose down
docker compose up -d
docker compose logs -f
About half the time it fails to restart and prints this message forever, until the pid file is removed manually:
oxidized-oxidized-1 | A server is already running. Check /root/.config/oxidized/pid
I am having the same issue on restart. I need to delete the PID file manually.
I don't have a solution, but a while back I came up with a watchdog that has been working around this on our instance. Feel free to use it if needed.
docker-compose.yml addition:
  # A watchdog that checks for orphaned PID files that might be left over after a host system crash
  watchdog:
    restart: always
    image: mcr.microsoft.com/powershell:latest
    command: "pwsh -ExecutionPolicy Bypass -File /srv/script.ps1"
    environment:
      OXIDIZED_SERVER: oxidized
      OXIDIZED_PORT: 8888
      OXIDIZED_PIDPATH: /root/.config/oxidized/pid
      WATCHDOG_MAX_TRIES: 6
      WATCHDOG_INTERVAL: 10
    volumes:
      - ./oxidized:/root/.config/oxidized
      - ./watchdog:/srv
      - /etc/timezone:/etc/timezone
    networks:
      oxidized:
watchdog/script.ps1:
while(1){
    # Set current try
    $currentTry = 1
    Write-Host("[$(Get-Date)] Starting Watchdog..")
    # Loop until we've reached max tries (consecutive failed checks)
    while($currentTry -le $env:WATCHDOG_MAX_TRIES){
        # Test the connection
        if(Test-Connection -TcpPort $env:OXIDIZED_PORT -TargetName $env:OXIDIZED_SERVER -ErrorAction SilentlyContinue){
            # Reset current try as it was reachable
            Write-Host("[$(Get-Date)] System reachable.")
            $currentTry = 1
        }else{
            Write-Host("[$(Get-Date)] System unreachable. Try $($currentTry)/$($env:WATCHDOG_MAX_TRIES).")
            $currentTry++
        }
        Write-Host("[$(Get-Date)] Sleeping for $($env:WATCHDOG_INTERVAL) Seconds..")
        Start-Sleep -Seconds $env:WATCHDOG_INTERVAL
    }
    # We fell out of the loop, so the system is unreachable: remove the stale PID file so oxidized can start again
    Write-Host("[$(Get-Date)] System offline, removing PID File at `"$env:OXIDIZED_PIDPATH`"..")
    if(Test-Path -Path $env:OXIDIZED_PIDPATH -ErrorAction SilentlyContinue){
        Remove-Item -Path $env:OXIDIZED_PIDPATH -Force -Confirm:$false
    }
    # Sleep until we start the watchdog again
    Start-Sleep -Seconds 60
}
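For anyone who would rather not pull in the PowerShell image, the same consecutive-failure logic can be sketched in plain sh. This is only a rough equivalent of the script above, not a tested replacement; the function names `wait_until_down` and `remove_stale_pid` are hypothetical, and the health check is passed in as a command so you can use whatever probe your image provides.

```shell
#!/bin/sh
# Sketch of the watchdog's inner loop with a pluggable health check.
# Usage: wait_until_down MAX_TRIES INTERVAL CHECK_COMMAND [ARGS...]
# Returns once CHECK_COMMAND has failed MAX_TRIES times in a row.
wait_until_down() {
    max_tries=$1
    interval=$2
    shift 2
    failures=0
    while [ "$failures" -lt "$max_tries" ]; do
        if "$@"; then
            failures=0                  # reachable: reset the counter
        else
            failures=$((failures + 1))  # unreachable: count the failure
        fi
        sleep "$interval"
    done
}

# Hypothetical helper: remove the pid file only if it exists.
remove_stale_pid() {
    [ -f "$1" ] && rm -f "$1"
    return 0
}
```

Assuming a netcat with `-z` support is available in the container, one pass of the watchdog could look like `wait_until_down 6 10 nc -z oxidized 8888; remove_stale_pid /root/.config/oxidized/pid`, wrapped in an outer loop as in the PowerShell version.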
In docker-compose:
    command: ["/bin/sh", "-c", "rm -rf /home/oxidized/.config/oxidized/pid && /usr/local/bundle/bin/oxidized"]
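If you prefer a small wrapper script over an inline one-liner, the same idea can be sketched as below. The function name `start_clean` is hypothetical. Two small differences from the one-liner: it uses `rm -f`, since the pid path is a plain file, and it uses `exec`, so that oxidized replaces the shell and receives signals from Docker directly.

```shell
#!/bin/sh
# Hypothetical entrypoint helper: delete a leftover pid file, then
# replace this shell with the given command.
# Usage: start_clean PIDFILE COMMAND [ARGS...]
start_clean() {
    pidfile=$1
    shift
    # -f keeps rm quiet (and successful) when the file does not exist.
    rm -f "$pidfile"
    # exec never returns if the command starts successfully.
    exec "$@"
}
```

With the paths from the command above, the script would end with `start_clean /home/oxidized/.config/oxidized/pid /usr/local/bundle/bin/oxidized`.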
That's probably a better solution than mine, although mine would also detect a crashed instance.
Had the same issue and worked around it by changing
pid: "/root/.config/oxidized/pid"
to
pid: "/dev/null"
Gambiarra (a makeshift hack) at its finest.
Haha, yeah, that would do. I guess I still prefer mine even if it has overhead; at least it checks whether or not the service is actually running.