runc icon indicating copy to clipboard operation
runc copied to clipboard

bug:fix runc delete run before delete exec.fifo

Open ningmingxiao opened this issue 8 months ago • 11 comments

if runc delete run before delete exec.fifo runc delete will container root dir "delete /runc/nmx003"

func handleFifoResult(result openResult) error {
	if result.err != nil {
		return result.err
	}
	f := result.file
	defer f.Close()
	if err := readFromExecFifo(f); err != nil {
		return err
	}
	return os.Remove(f.Name())
}

os.remove will reurn a error ERRO[0030] remove /run/runc/nmx003/exec.fifo: no such file or directory

ningmingxiao avatar Apr 21 '25 16:04 ningmingxiao

@ningmingxiao can you provide a reproducer?

kolyshkin avatar Apr 21 '25 21:04 kolyshkin

I add some sleep at

func handleFifoResult(result openResult) error {
	if result.err != nil {
		return result.err
	}
	f := result.file
	defer f.Close()
	if err := readFromExecFifo(f); err != nil {
		return err
	}
        time.Sleep(time.Second*30)
	return os.Remove(f.Name())
}

runc create nmx001 runc start nmx001 when it is running at time.Sleep(time.Second*30) runc delete nmx001 it will happen. @kolyshkin It's not easy to reproduce unless a delay is added

ningmingxiao avatar Apr 21 '25 22:04 ningmingxiao

I think RemoveAll is better

ningmingxiao avatar Apr 22 '25 01:04 ningmingxiao

I think RemoveAll is better

In its current implementation, and if path points to a non-directory (or an empty directory) -- yes, it's essentially does the same as I suggested above, i.e.

        err := os.Remove(path)
        if err == nil || os.IsNotExist(err) {
                  return nil
        }

but in a (theoretically possible) case when path is a non-empty directory, it will go ahead and remove it recursively, which is probably not what we really want here. I would prefer an error in such a case (even if it doesn't look like it's practically possible).

kolyshkin avatar Apr 22 '25 06:04 kolyshkin

done, thanks @kolyshkin

ningmingxiao avatar Apr 22 '25 08:04 ningmingxiao

ping @kolyshkin can this pr be merged ?

ningmingxiao avatar Apr 24 '25 08:04 ningmingxiao

Pull Request Overview

This PR fixes a bug related to the improper deletion of exec.fifo, which causes issues when runc delete is executed before deleting exec.fifo.

  • Updated error handling in handleFifoResult to ignore os.IsNotExist errors during file removal.
  • Prevents an error from being returned when exec.fifo is already removed, ensuring smooth deletion of the container's root directory.

@Copilot Do you think this is really a bug. If the container is deleted by other process, I think runc start / run should return the exact error and have a non-zero exit code, but not to ignore some errors. Or else the users may wrongly think that we start/run the container successfully, but the result is opposite.

lifubang avatar May 28 '25 02:05 lifubang

Do you think this is really a bug. If the container is deleted by other process, I think runc start / run should return the exact error and have a non-zero exit code, but not to ignore some errors. Or else the users may wrongly think that we start/run the container successfully, but the result is opposite.

I think Copilot can't talk with us here.

@ningmingxiao @kolyshkin WDYT

lifubang avatar May 28 '25 02:05 lifubang

@ningmingxiao can you explain the race in more detail? Is this code executed on a runc exec into a container, that while that is executing another process runs runc delete and in the runc exec we return an error to delete the fifo? Or what is it exactly?

rata avatar Jun 18 '25 14:06 rata

we use docker 17.03,I find sometimes if runc start and then container exited shim will call runc delete before delete exec.fifo when run a short term process like ls or date

ningmingxiao avatar Jun 18 '25 15:06 ningmingxiao

Ohh, makes sense. Thanks!

rata avatar Jun 18 '25 17:06 rata

@lifubang friendly ping? I intend to merge if your concern has been solved

rata avatar Jul 14 '25 10:07 rata