pid files not cleaned up reliably

Open davidc opened this issue 2 years ago • 4 comments

pid files do not get cleaned up properly. I guess this is not normally an issue in a regular installation, but it is particularly noticeable when running the Docker image: the processes get the same IDs every time the container boots, so startup loops forever waiting for the previous process to exit when it is not really running.

Steps to reproduce:

docker compose up -d
docker compose down
docker compose up -d
docker compose logs -f

About half the time it fails to restart and prints this forever until the pid file is removed manually:

oxidized-oxidized-1  | A server is already running. Check /root/.config/oxidized/pid

davidc, Mar 24 '22 21:03

I am having the same issue upon restart. I need to delete the PID file manually.

pfunkylol, Mar 30 '22 10:03

I don't have a solution, but a while back I came up with a watchdog which has been solving this on our instance. Feel free to use if needed.

docker-compose.yml addition:

 # A watchdog that checks for orphaned PID files that might be left over after the host system crashed
 watchdog:
   restart: always
   image: mcr.microsoft.com/powershell:latest
   command: "pwsh -ExecutionPolicy Bypass -File /srv/script.ps1"
   environment:
     OXIDIZED_SERVER: oxidized
     OXIDIZED_PORT: 8888
     OXIDIZED_PIDPATH: /root/.config/oxidized/pid
     WATCHDOG_MAX_TRIES: 6
     WATCHDOG_INTERVAL: 10
   volumes:
     - ./oxidized:/root/.config/oxidized
     - ./watchdog:/srv
     - /etc/timezone:/etc/timezone
   networks:
     oxidized:

watchdog/script.ps1

while(1){
    # Set current try
    $currentTry = 1
    Write-Host("[$(Get-Date)] Starting Watchdog..")

    # Loop until we've reached max tries
    while($currentTry -le $env:WATCHDOG_MAX_TRIES){
        # Test the connection
        if(Test-Connection -TcpPort $env:OXIDIZED_PORT -TargetName $env:OXIDIZED_SERVER -ErrorAction SilentlyContinue){
            # Reset current try as it was reachable
            Write-Host("[$(Get-Date)] System reachable.")
            $currentTry = 1
        }else{
            Write-Host("[$(Get-Date)] System unreachable. Try $($currentTry)/$($env:WATCHDOG_MAX_TRIES).")
            $currentTry++
        }
        Write-Host("[$(Get-Date)] Sleeping for $($env:WATCHDOG_INTERVAL) Seconds..")
        Start-Sleep -Seconds $env:WATCHDOG_INTERVAL
    }
    # We fell out of the loop, so the server is unreachable: remove the PID file so the next start can succeed
    Write-Host("[$(Get-Date)] System offline, removing PID File at `"$env:OXIDIZED_PIDPATH`"..")
    if(Test-Path -Path $env:OXIDIZED_PIDPATH -ErrorAction SilentlyContinue){
        Remove-Item -Path $env:OXIDIZED_PIDPATH -Force -Confirm:$false
    }

    # Sleep until we start the watchdog again
    Start-Sleep -Seconds 60
}

RobinBeismann, Apr 04 '22 15:04

in docker-compose: command: ["/bin/sh", "-c" , "rm -rf /home/oxidized/.config/oxidized/pid && /usr/local/bundle/bin/oxidized"]

resetsa, Apr 28 '22 15:04

in docker-compose: command: ["/bin/sh", "-c" , "rm -rf /home/oxidized/.config/oxidized/pid && /usr/local/bundle/bin/oxidized"]

That's probably the better solution than mine, even if mine would also detect a crashed instance.
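
For reference, the startup cleanup quoted above could also live in a small wrapper script instead of an inline command. This is only a minimal sketch: the file name entrypoint.sh, the clean_pidfile helper, and the OXIDIZED_PIDPATH override are hypothetical names for illustration, and the default path is simply the one from the quoted command. It assumes only one oxidized container uses this config directory, so any pid file present at container start is stale.

```shell
#!/bin/sh
# entrypoint.sh: hypothetical wrapper, not shipped with the oxidized image.
# Assumption: the pid path matches the one in the quoted command;
# override it with OXIDIZED_PIDPATH if your image uses another location.
PIDFILE="${OXIDIZED_PIDPATH:-/home/oxidized/.config/oxidized/pid}"

# Remove a leftover pid file. At container start no earlier oxidized
# process from this container can still be alive, so the file is stale.
clean_pidfile() {
    if [ -e "$1" ]; then
        echo "removing stale pid file $1"
        rm -f -- "$1"
    fi
}

clean_pidfile "$PIDFILE"

# If a command was passed, replace the shell with it so that oxidized
# receives signals (for example SIGTERM on `docker compose down`) directly.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

Assuming the script is mounted into the container at /entrypoint.sh and is executable, it could be used as command: ["/entrypoint.sh", "/usr/local/bundle/bin/oxidized"]. Unlike the watchdog, this only cleans up at start and does not detect a crashed instance.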

RobinBeismann, Apr 28 '22 17:04

Had the same issue and worked around it by changing pid: "/root/.config/oxidized/pid" to pid: "/dev/null"

Gambiarra (a makeshift hack) at its finest.

dMailonG, Feb 22 '23 19:02

Haha, yeah, that would do. I guess I still prefer mine even if it has overhead, since at least it checks whether or not oxidized is running.

RobinBeismann, Feb 22 '23 20:02