
Suggestion for possible mitigation of the tape corruption problem in #446

Open JoakimZiegler opened this issue 7 months ago • 3 comments

Is your feature request related to a problem? Please describe.

I've just experienced, for the first time, the tape corruption described in #446. It was my own fault. I had not run into it before because, on earlier occasions when I tried to eject the tape too early, I had always used mt -f /dev/nst0 rewoffl rather than mt -f /dev/st0 rewoffl. As far as I can tell, the problem does not occur with the nst device, which does not rewind the tape when the device is closed; it is that rewind-on-close behavior that specifically causes the corruption. With nst, I just get an I/O error, the tape is not corrupted, and all I need to do is try ejecting it again. Still, I think it is confusing for the average user that the fairly standard system "eject tape" command can irreparably corrupt a tape with no warning.

I don't like using the eject mount option to ltfs, because I often run tape backups remotely, at night or on weekends, and it happens frequently that I forget to copy something to the tape before umounting it. If it automatically ejects on unmount, I need a person to be physically present to re-insert it in the drive.

Describe the solution you'd like

I think this problem would be largely avoided if umount simply did not return until ltfs has finished writing everything to the tape and the tape is safe to rewind, eject, or otherwise manipulate. Having umount return almost immediately while the actual flush to tape runs in the background for an indeterminate amount of time, which can only be discovered by checking the process list, is neither user friendly nor intuitive. It is also inconsistent with how umount normally works on Linux systems. When running umount on a filesystem on an external disk drive, for instance, umount does not return until all data is flushed to disk and the filesystem is properly updated (which can take a while on slow devices), and its return is the sign that the drive is safe to unplug. This also matches the behavior of GUIs and other operating systems, where the drive's icon does not disappear from the desktop until all data has been successfully flushed.

I think making umount wait until the tape has finished flushing would be the preferred behavior: it would reduce the chance of mishaps, be more intuitive, and make scripting easier. Additionally, I think the documentation should strongly recommend using only the /dev/nstX devices, not the /dev/stX devices, for any manipulation of the tape drive, since the /dev/nstX devices do not seem to risk corrupting the tape.
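Until something like this exists, the "peek at the process list" step can be scripted. The following is only a minimal sketch, assuming a Linux system (it polls /proc), a single LTFS process whose name is exactly ltfs, and a hypothetical mount point /mnt/ltfs; the function name wait_for_exit is made up for illustration and is not part of LTFS.

```shell
#!/bin/sh
# Sketch only: block until a given process has exited, or give up after a
# timeout. Assumes Linux, where /proc/<pid> exists while the process does.
# wait_for_exit is a hypothetical helper, not an LTFS or mt-st command.

wait_for_exit() {
    pid=$1
    timeout=${2:-3600}      # seconds to wait before giving up
    waited=0
    while [ -e "/proc/$pid" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1        # process is still running: NOT safe to eject
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0                # process is gone
}

# Hypothetical usage (assumes exactly one process named "ltfs" and a
# mount point of /mnt/ltfs; adjust both to your setup):
#   ltfs_pid=$(pgrep -x ltfs)
#   umount /mnt/ltfs
#   wait_for_exit "$ltfs_pid" 7200 && mt -f /dev/nst0 rewoffl
```

The PID is captured before umount so that the script waits on the process that actually holds the tape; the eject only runs if that process has really exited within the timeout.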

JoakimZiegler, May 03 '25 06:05

> I don't like using the eject mount option to ltfs, because I often run tape backups remotely, at night or on weekends, and it happens frequently that I forget to copy something to the tape before umounting it. If it automatically ejects on unmount, I need a person to be physically present to re-insert it in the drive.

Physical re-insertion is not needed, I believe. The drive can load the tape again with a LOAD command after the UNLOAD command has completed.

In other words, you can mount the tape again with the ltfs command without any manual (by-hand) operation, even if you use the -o eject option.

piste-jp, May 03 '25 13:05

> Physical re-insertion is not needed, I believe. The drive shall mount tape again by LOAD command after UNLOAD command is completed.
>
> In other words, you can mount the tape with ltfs command without any manual(hand) operation even if you add the -o eject option.

I haven't actually tried this, so I don't know, but if that's the case, what exactly does -o eject do? I assumed it was the same as calling mt -f /dev/nst0 rewoffl, which rewinds and offlines the tape, which at least on standalone tape units is equivalent to physically ejecting the tape.

Anyway, this wasn't really the main gist of my suggestion: I think it would be good to make umount on LTFS mounts not return until all data has been flushed and the tape device is safe to manipulate. This would avoid accidents, and would also be the same behavior as umount has on other file systems on Linux.

JoakimZiegler, May 05 '25 01:05

> Anyway, this wasn't really the main gist of my suggestion: I think it would be good to make umount on LTFS mounts not return until all data has been flushed and the tape device is safe to manipulate. This would avoid accidents, and would also be the same behavior as umount has on other file systems on Linux.

This is outside the scope of LTFS.

As you can see, ltfs_fuse_umount(), which is the fuse_operations.destroy handler of LTFS, is a blocking function. This means it is FUSE, or the umount command, that does not wait for the LTFS process to complete. I believe there is nothing LTFS can do about this.

https://github.com/LinearTapeFileSystem/ltfs/blob/adb37220d74e0bdd06dfdf9602a84fcd2f0d82e2/src/ltfs_fuse.c#L1209

https://github.com/LinearTapeFileSystem/ltfs/blob/adb37220d74e0bdd06dfdf9602a84fcd2f0d82e2/src/ltfs_fuse.c#L1151-L1176

> I haven't actually tried this, so I don't know, but if that's the case, what exactly does -o eject do? I assumed it was the same as calling mt -f /dev/nst0 rewoffl, which rewinds and offlines the tape, which at least on standalone tape units is equivalent to physically ejecting the tape.

LTFS just issues a SCSI LOAD_UNLOAD command with the 'UNLOAD' operation after it has written the fresh index to the index partition and closed the tape cleanly.

The drive holds the tape in the 'LOCKED' position once the 'UNLOAD' operation has finished. The user can pull the tape out while it is in the 'LOCKED' position, or the drive can re-insert the tape in response to a SCSI LOAD_UNLOAD command with the 'LOAD' operation.
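For a remote session, the 'LOAD' step described above could probably be issued with the load subcommand of mt (from mt-st). This is only a sketch under that assumption: it has not been verified against a tape unloaded by -o eject, the device path /dev/nst0 is taken from the examples earlier in this thread, and reload_tape and the MT variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch only: ask the drive to pull an unloaded (but not yet removed)
# cartridge back in. Assumes mt-st's "load" subcommand results in a SCSI
# LOAD on this drive; reload_tape and MT are hypothetical names.

MT=${MT:-mt}    # allows substituting another command, e.g. for a dry run

reload_tape() {
    dev=${1:-/dev/nst0}     # non-rewinding device, as recommended above
    "$MT" -f "$dev" load
}

# Hypothetical usage:
#   reload_tape /dev/nst0 && mt -f /dev/nst0 status
```

Setting MT=echo before calling reload_tape prints the command that would be run instead of touching the drive.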

piste-jp, May 06 '25 09:05