Record locks and open file description locks are acquired by Shadow's process instead of emulated

Shadow's implementation of fcntl operations manipulating record locks (F_SETLK, F_SETLKW, F_GETLK) and open file description locks (F_OFD_SETLK, F_OFD_SETLKW, F_OFD_GETLK) performs the native operation on the native open file descriptor, from Shadow's process.

This is probably OK when the locks are never under contention, but will be incorrect otherwise. For example, if two different managed processes attempt to acquire a write lock on the same region of the same file, Linux will see it as the Shadow process trying to acquire the lock twice. From the man page, I think the second attempt will succeed, which will lead to incorrect and confusing behavior:

"A single process can hold only one type of lock on a file region; if a new lock is applied to an already-locked region, then the existing lock is converted to the new lock type."

Lock metadata as returned by e.g. F_GETLK also includes the PID of the process holding the lock, which will be the native PID of Shadow's process instead of the emulated PID of the corresponding managed process.
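
To make the expected semantics concrete, here is a rough sketch of the conflict check an emulated implementation of record locks would need. It is only an illustration, not Shadow's code: struct emu_lock, ranges_overlap, locks_conflict, and find_conflict are hypothetical names, offsets are assumed to be already resolved to absolute positions with a non-negative length, and OFD locks (which are owned by the open file description rather than a process) are not modeled.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical record of one emulated record lock (not Shadow's real type). */
struct emu_lock {
    pid_t owner; /* emulated PID of the managed process holding the lock */
    short type;  /* F_RDLCK or F_WRLCK */
    off_t start; /* absolute start offset (assumed already resolved) */
    off_t len;   /* length in bytes; 0 means "through end of file" */
};

/* True if the two byte ranges share at least one byte. */
bool ranges_overlap(const struct emu_lock *a, const struct emu_lock *b) {
    bool a_starts_before_b_ends = (b->len == 0) || (a->start < b->start + b->len);
    bool b_starts_before_a_ends = (a->len == 0) || (b->start < a->start + a->len);
    return a_starts_before_b_ends && b_starts_before_a_ends;
}

/* Two locks conflict when the owners differ, at least one of them is a
 * write lock, and the ranges overlap. A process never conflicts with itself. */
bool locks_conflict(const struct emu_lock *held, const struct emu_lock *req) {
    if (held->owner == req->owner) {
        return false;
    }
    if (held->type != F_WRLCK && req->type != F_WRLCK) {
        return false;
    }
    return ranges_overlap(held, req);
}

/* Returns the first held lock that would block `req`, or NULL if none does.
 * An emulated F_GETLK could report this lock's emulated owner PID. */
const struct emu_lock *find_conflict(const struct emu_lock *held, size_t n,
                                     const struct emu_lock *req) {
    for (size_t i = 0; i < n; i++) {
        if (locks_conflict(&held[i], req)) {
            return &held[i];
        }
    }
    return NULL;
}
```

With a table like this keyed on emulated PIDs, a second managed process requesting an overlapping write lock would be found to conflict, while a repeated request from the same managed process would not.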

Jul 06 '22 19:07 sporksmith

I wondered if we could use F_GETLK to check whether a lock is already held over the requested range, but AFAICT there's no way to distinguish between "the lock isn't held" and "the lock is held by the current process".

#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

int main() {
  int fd = open("test", O_CREAT|O_TRUNC|O_RDWR, S_IRUSR|S_IWUSR);
  if (fd < 0) {
    perror("open");
    abort();
  }
  if (write(fd, "test", 4) != 4) {
    perror("write");
    abort();
  }
  struct flock flock = {
    .l_type = F_WRLCK,
    .l_whence=SEEK_SET,
    .l_start=0,
    .l_len=4,
  };

  if (fcntl(fd, F_SETLK, &flock) != 0) {
    perror("fcntl");
    abort();
  }

  // Getting the lock again succeeds
  if (fcntl(fd, F_SETLK, &flock) != 0) {
    perror("fcntl");
    abort();
  }

  // Querying the lock reports F_UNLCK, even though this process holds it.
  if (fcntl(fd, F_GETLK, &flock) != 0) {
    perror("fcntl");
    abort();
  }
  printf("getlk: ");
  switch(flock.l_type) {
    case F_RDLCK:
      printf("F_RDLCK\n");
      break;
    case F_WRLCK:
      printf("F_WRLCK\n");
      break;
    case F_UNLCK:
      printf("F_UNLCK\n");
      break;
    default:
      printf("? (%d)\n", flock.l_type);
  }
}

Output:

getlk: F_UNLCK

Jul 06 '22 20:07 sporksmith

It looks like we could query the lock state via /proc/locks, but it's probably not worth investing the effort to parse it vs properly emulating and tracking the locks ourselves.

From man proc:

       /proc/locks
              This file shows current file locks (flock(2) and fcntl(2)) and leases (fcntl(2)).

              An example of the content shown in this file is the following:

                  1: POSIX  ADVISORY  READ  5433 08:01:7864448 128 128
                  2: FLOCK  ADVISORY  WRITE 2001 08:01:7864554 0 EOF
                  3: FLOCK  ADVISORY  WRITE 1568 00:2f:32388 0 EOF
                  4: POSIX  ADVISORY  WRITE 699 00:16:28457 0 EOF
                  5: POSIX  ADVISORY  WRITE 764 00:16:21448 0 0
                  6: POSIX  ADVISORY  READ  3548 08:01:7867240 1 1
                  7: POSIX  ADVISORY  READ  3548 08:01:7865567 1826 2335
                  8: OFDLCK ADVISORY  WRITE -1 08:01:8713209 128 191
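
For reference, parsing a single line of that shape is not much code. The following is only a sketch based on the example lines above: struct proc_lock and parse_proc_lock_line are hypothetical names, the device numbers are read as hexadecimal (as the "00:2f" entry suggests), and any line that does not match this exact shape is rejected.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of one /proc/locks line. */
struct proc_lock {
    char kind[16];       /* e.g. "POSIX", "FLOCK", "OFDLCK" */
    char mode[16];       /* e.g. "ADVISORY" */
    char access[16];     /* "READ" or "WRITE" */
    long pid;            /* -1 in the OFDLCK example line */
    unsigned int major;  /* device major number (hex in the file) */
    unsigned int minor;  /* device minor number (hex in the file) */
    unsigned long inode;
    long long start;
    long long end;       /* -1 is used here to represent "EOF" */
};

/* Parses one line such as
 *   "1: POSIX  ADVISORY  READ  5433 08:01:7864448 128 128"
 * Returns true on success, false if the line has a different shape. */
bool parse_proc_lock_line(const char *line, struct proc_lock *out) {
    int ordinal;
    char end[32];
    int n = sscanf(line, "%d: %15s %15s %15s %ld %x:%x:%lu %lld %31s",
                   &ordinal, out->kind, out->mode, out->access, &out->pid,
                   &out->major, &out->minor, &out->inode, &out->start, end);
    if (n != 10) {
        return false;
    }
    if (strcmp(end, "EOF") == 0) {
        out->end = -1;
        return true;
    }
    char *rest = NULL;
    out->end = strtoll(end, &rest, 10);
    return *rest == '\0'; /* reject trailing garbage in the end field */
}
```

Even with a parser like this, the PID column would presumably still show native PIDs rather than emulated ones, so it would not replace tracking the locks ourselves.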

Jul 06 '22 20:07 sporksmith
