
seqid mismatch during CLOSE

sraillard opened this issue 4 years ago

When multiple OPEN and CLOSE operations are interleaved, it looks like the wrong seqid is used in this check: https://github.com/dCache/nfs4j/blob/b4af4c33051c3d150ed94cbb503730c977fee310/core/src/main/java/org/dcache/nfs/v4/OperationCLOSE.java#L58

Just before this check, we added a log statement to print the different values:

_log.info("[contextCurrentStateId = " + context.currentStateid() + " , _args.opclose.open_stateid = " + _args.opclose.open_stateid + " , nfsState.stateid() = " + nfsState.stateid() + " , stateid = " + stateid + "]");

The seqid returned by nfsState.stateid() seems to be the one from the last OPEN operation, so the check fails.
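
To make the failure mode easier to see, here is a small self-contained toy model of the race (plain Java, not nfs4j code; the class, names, and exception are made up for illustration): every OPEN on the file bumps the seqid of the server-side open state, so a CLOSE that still carries the seqid of an earlier OPEN fails a strict equality check.

// Toy model of the seqid race (illustrative only, not nfs4j code).
public class SeqidRace {
    private static int currentSeqid = 0;          // seqid stored in the server-side open state

    // OPEN bumps the seqid and returns it to the client in the OPEN reply
    static int open() { return ++currentSeqid; }

    // CLOSE with a strict equality check, as observed in the issue
    static void close(int seqidFromClient) {
        if (seqidFromClient != currentSeqid) {
            throw new IllegalStateException("NFS4ERR_BAD_STATEID: client seq "
                    + seqidFromClient + ", server seq " + currentSeqid);
        }
    }

    public static void main(String[] args) {
        int firstOpen = open();   // first OPEN of the file, the client gets seq 1
        open();                   // a second, interleaved OPEN of the same file, server moves to seq 2
        close(firstOpen);         // the CLOSE for the first open still carries seq 1 -> rejected
    }
}

This is the pattern visible in the log below, for example the client presenting seq 7 while nfsState.stateid() is already at seq 8.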

Here are the capture and the log:

2020-11-23 22:32:53.007 [minikube.mshome.net/172.24.45.130:735] [nfs4j@2049 (6)] INFO  o.d.n.v.OperationCLOSE - [contextCurrentStateId = [000000000000000000000000, seq: 0] , _args.opclose.open_stateid = [5fbc217e000000020000007a, seq: 1] , nfsState.stateid() = [5fbc217e000000020000007a, seq: 1] , stateid = [5fbc217e000000020000007a, seq: 1]]
2020-11-23 22:32:53.010 [minikube.mshome.net/172.24.45.130:735] [nfs4j@2049 (3)] INFO  o.d.n.v.OperationCLOSE - [contextCurrentStateId = [000000000000000000000000, seq: 0] , _args.opclose.open_stateid = [5fbc217e000000020000007b, seq: 2] , nfsState.stateid() = [5fbc217e000000020000007b, seq: 3] , stateid = [5fbc217e000000020000007b, seq: 2]]
2020-11-23 22:32:53.011 [minikube.mshome.net/172.24.45.130:735] [nfs4j@2049 (10)] INFO  o.d.n.v.OperationCLOSE - [contextCurrentStateId = [000000000000000000000000, seq: 0] , _args.opclose.open_stateid = [5fbc217e000000020000007b, seq: 5] , nfsState.stateid() = [5fbc217e000000020000007b, seq: 7] , stateid = [5fbc217e000000020000007b, seq: 5]]
2020-11-23 22:32:53.012 [minikube.mshome.net/172.24.45.130:735] [nfs4j@2049 (12)] INFO  o.d.n.v.OperationCLOSE - [contextCurrentStateId = [000000000000000000000000, seq: 0] , _args.opclose.open_stateid = [5fbc217e000000020000007b, seq: 7] , nfsState.stateid() = [5fbc217e000000020000007b, seq: 7] , stateid = [5fbc217e000000020000007b, seq: 7]]
2020-11-23 22:32:53.012 [minikube.mshome.net/172.24.45.130:735] [nfs4j@2049 (14)] INFO  o.d.n.v.OperationCLOSE - [contextCurrentStateId = [000000000000000000000000, seq: 0] , _args.opclose.open_stateid = [5fbc217e000000020000007b, seq: 7] , nfsState.stateid() = [5fbc217e000000020000007b, seq: 8] , stateid = [5fbc217e000000020000007b, seq: 7]]
2020-11-23 22:32:53.013 [minikube.mshome.net/172.24.45.130:735] [nfs4j@2049 (22)] INFO  o.d.n.v.OperationCLOSE - [contextCurrentStateId = [000000000000000000000000, seq: 0] , _args.opclose.open_stateid = [5fbc217e000000020000007b, seq: 8] , nfsState.stateid() = [5fbc217e000000020000007b, seq: 8] , stateid = [5fbc217e000000020000007b, seq: 8]]
2020-11-23 22:32:53.013 [minikube.mshome.net/172.24.45.130:735] [nfs4j@2049 (13)] INFO  o.d.n.v.OperationCLOSE - [contextCurrentStateId = [000000000000000000000000, seq: 0] , _args.opclose.open_stateid = [5fbc217e000000020000007b, seq: 8] , nfsState.stateid() = [5fbc217e000000020000007b, seq: 8] , stateid = [5fbc217e000000020000007b, seq: 8]]
2020-11-23 22:32:53.013 [minikube.mshome.net/172.24.45.130:735] [nfs4j@2049 (13)] INFO  o.d.n.v.AbstractOperationExecutor - Lost client state: op: CLOSE : NFS4ERR_BAD_STATEID : State not known to the client: [5fbc217e000000020000007b, seq: 8]

bug-nfs-multiple-open-close.zip

In the log, you'll notice three cases where the seqid isn't the expected one, and in the capture, we can see that it is the seqid of the previous OPEN. The NFS client is a Linux VM (minikube).
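
For reference, my understanding of the NFSv4 specs (e.g. RFC 5661) is that the two mismatch directions should be treated differently: when the 'other' part of the stateid matches but the seqid presented by the client is older than the current one, the expected answer is NFS4ERR_OLD_STATEID rather than NFS4ERR_BAD_STATEID. As a rough sketch (again illustrative only, ignoring seqid wrap-around, not a patch against nfs4j), the close() check in the toy model above would become:

// Variant of the toy-model close() that distinguishes the two mismatch directions
// (ignores seqid wrap-around; the error strings name the corresponding NFS statuses).
static void close(int seqidFromClient) {
    if (seqidFromClient == currentSeqid) {
        return;                                           // exact match, the CLOSE proceeds
    }
    if (seqidFromClient < currentSeqid) {
        // stateid from an earlier OPEN: stale, but one the server did issue
        throw new IllegalStateException("NFS4ERR_OLD_STATEID");
    }
    // a seqid the server never issued for this open state
    throw new IllegalStateException("NFS4ERR_BAD_STATEID");
}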

sraillard, Nov 23 '20 22:11

Thanks for the info. I will try to reproduce it.

kofemann, Nov 24 '20 13:11

I'm not able to reproduce this. Do you run any special test, or do you see errors on the application side?

kofemann, Dec 08 '20 09:12

As we have experienced a lot of issues, I wrote a small piece of PHP code to reproduce it. It's quite straightforward: just do some reads or writes from different processes on the same file at the same time and you can reproduce the issue.

The code forks 4 processes, and each process writes a line to the same log file 4 times, so that is a total of 16 write operations with concurrent access to the same file.

Here is the test code (it can run either a write test or a read test):

<?php

	echo ("Start of PARENT\n");

	// Reading or writing test
	$WriteTest=TRUE;

	// Test file management
	$Filename="test-file.txt";
	if ($WriteTest) {
		// Write test, remove the file before testing
		if (file_exists($Filename)) unlink($Filename);
	} else {
		// Read test, create a test file
		file_put_contents($Filename,"Reading test");
	}

	// Loop for forking child processes
	$pid=-1;
	for ($i=0; $i<4; $i++) {
		$pid=pcntl_fork();
		if ($pid == -1) {
			die("ERROR fork #".($i+1)." !\n");
		} else if ($pid==0) {
			// pcntl_fork() returned 0: this is the newly forked process;
			// it keeps looping to fork the remaining workers
			// (it is the one labelled "PARENT" in the output below)
			echo("Still in PARENT\n");
		} else {
			// pcntl_fork() returned the new child's PID: this process
			// becomes a worker (labelled "CHILD" in the output below)
			goto child;
		}
	}
	die("End of PARENT\n");

	// Child operations
child:
	echo("Start of CHILD #".($i+1)." PID=".$pid."\n");

	$ts1=microtime(TRUE);
	for ($j=0; $j<4; $j++) {
		$ts2=microtime(TRUE);
		if ($WriteTest) {
			// Concurrent write test
			file_put_contents($Filename,"LOG - ".date("H:i:s.v")." - ".intval(($ts2-$ts1)*1000)." ms - CHILD #".($i+1).".".($j+1)." PID=".$pid."\n",FILE_APPEND );
		} else {
			// Concurrent read test
			$contents=file_get_contents($Filename);
		}
		$ts1=$ts2;
	}

	die("End of CHILD #".($i+1)." PID=".$pid."\n");

And the results:

time php fork.php
Start of PARENT
Start of CHILD #1 PID=77
Still in PARENT
Start of CHILD #2 PID=78
Still in PARENT
Start of CHILD #3 PID=79
Still in PARENT
Still in PARENT
End of PARENT
Start of CHILD #4 PID=80
End of CHILD #4 PID=80
End of CHILD #3 PID=79
End of CHILD #1 PID=77
End of CHILD #2 PID=78

real    0m16.801s
user    0m0.008s
sys     0m0.007s

Of course, the total time of about 16 seconds shows there is a blocking issue here!

And the content of the log file written:

LOG - 10:36:04.000 - 0 ms - CHILD #1.1 PID=77
LOG - 10:36:04.000 - 0 ms - CHILD #3.1 PID=79
LOG - 10:36:04.000 - 0 ms - CHILD #4.1 PID=80
LOG - 10:36:04.000 - 0 ms - CHILD #2.1 PID=78
LOG - 10:36:09.000 - 5590 ms - CHILD #4.2 PID=80
LOG - 10:36:09.000 - 5591 ms - CHILD #2.2 PID=78
LOG - 10:36:09.000 - 5441 ms - CHILD #3.2 PID=79
LOG - 10:36:09.000 - 5416 ms - CHILD #1.2 PID=77
LOG - 10:36:15.000 - 5551 ms - CHILD #4.3 PID=80
LOG - 10:36:15.000 - 5875 ms - CHILD #3.3 PID=79
LOG - 10:36:15.000 - 5637 ms - CHILD #2.3 PID=78
LOG - 10:36:15.000 - 323 ms - CHILD #4.4 PID=80
LOG - 10:36:15.000 - 149 ms - CHILD #3.4 PID=79
LOG - 10:36:15.000 - 6050 ms - CHILD #1.3 PID=77
LOG - 10:36:21.000 - 5312 ms - CHILD #1.4 PID=77
LOG - 10:36:20.000 - 5451 ms - CHILD #2.4 PID=78

You can see the blocking time of around 5 seconds on some operations.

sraillard, Dec 08 '20 09:12

Is this possibly related to #120?

dkocher, Mar 28 '23 13:03

@dkocher Sounds like it. The reproducer on my side was:

fio --name 'read-test' --description 'Simple NFS read test' \
        --rw=read  --numjobs 8 --thread --blocksize 1M --filename 10GB

kofemann, Mar 28 '23 13:03