Loss of read/write activity for mounted devices within PINN

Open annaclets opened this issue 4 years ago • 53 comments

When a PINN-created backup is used to install/restore an OS, e.g. RaspiOS, it loses the ability to read any mounted device, i.e. USB flash drives and other partitions on the same SD card. Likewise, Twister OS, when installed by itself on an SD card, runs perfectly well. But when installed in PINN, Twister OS can no longer read any mounted device.

To be fair to PINN, which in my opinion is currently the best multiboot tool for RPi, this problem also occurs in BerryBoot - Twister OS does not have i/o on any mounted device. What gives?

annaclets avatar Dec 09 '20 12:12 annaclets

Weird... I wonder if some file / directory is getting created with the wrong permissions? Can you still see the files on the mounted device, just can't open any of them for reading?

lurch avatar Dec 09 '20 12:12 lurch

Complete blank - no file visible. The devices appear to be dismountable / mountable, though, but no files appear.

annaclets avatar Dec 09 '20 13:12 annaclets

Not seen that before. I shall have to experiment with it

procount avatar Dec 09 '20 13:12 procount

@annaclets Have you done any customisation of RaspiOS (e.g. adding new users), or is it still a vanilla install?

lurch avatar Dec 09 '20 14:12 lurch

RaspiOS was customized, which is why it was backed up so as not to lose the customization. When restored, it lost i/o on mounted devices. Same result when following the instructions from this: https://github.com/procount/pinn/wiki/How-to-Create-a-Multi-Boot-SD-card-out-of-2-existing-OSes-using-PINN

TwisterOS, however, is vanilla install and never had i/o on mounted devices in both PINN and BerryBoot.

annaclets avatar Dec 09 '20 16:12 annaclets

I'm sorry for the stupid question, but are you sure there's actually still files on the drive, and the Pi isn't just (correctly) showing an empty drive? :man_shrugging:

lurch avatar Dec 09 '20 22:12 lurch

I have found the solution, but it leads to some complications. Simply put, it turns out that the tar archiving system does not produce a fully accurate image of an ext4 rootfs partition. I switched from rootfs.tar.xz to rootfs.img.xz and the problem with mounted devices got solved for both TwisterOS and backup RaspiOS.

The complication is it's a tedious process involving shrinking and expanding partition sizes, as stated in the abovementioned wiki.

annaclets avatar Dec 09 '20 23:12 annaclets

Yes I think you are right. The archiving looks to be a bit lacking. I shall look at improving it. Thanks for reporting it.

procount avatar Dec 10 '20 00:12 procount

I hope to have an update to fix this shortly. Would you be willing to test it out?

procount avatar Dec 10 '20 01:12 procount

@procount I'm intrigued to hear what the actual underlying problem was, that caused such seemingly bizarre behaviour?

lurch avatar Dec 10 '20 05:12 lurch

@procount, yes, I'm looking forward to testing the fix.

annaclets avatar Dec 10 '20 07:12 annaclets

Please download https://sourceforge.net/projects/pinn/files/testing/pinn-353a.zip/download (password is backup). You should unzip it over an existing PINN installation (one that doesn't matter, since this is beta software!). It does not include recovery.cmdline or config.txt, so if you don't have an existing installation to test, just copy those 2 files across.

You will need to install a new OS (Twister/RasPiOs etc) to ensure the file attributes are all correct, then back them up and restore them.

(I have not had time to even run this to see if this boots, so use with caution! - but it should be ok)

procount avatar Dec 10 '20 20:12 procount

I have, so far, tried to back up ubuntu, lineage17, raspiOS and recalbox. Success with ubuntu and lineage17, but cannot proceed with raspiOS and recalbox because of detected incompatibility (with tar?). I noticed that you have now shifted to xxx.img.gz, i.e. raw format. Shouldn't you lift the restrictions on raspiOS (unconventional ext4 format?) and recalbox (exfat?) now that you're into raw imaging?

Initial restore ran into errors. The partition.json still specified the img partitions' filesystem_type as "ext4" rather than "raw". After correcting that, the restore proceeded.

The lineage17 restored successfully. However, the ubuntu restore did not complete due to an error in the partition_setup.sh, which I checked and confirmed is identical to the old one.

Tried booting lineage17, won't boot.

annaclets avatar Dec 11 '20 07:12 annaclets

Backup should use the same method of archiving as was used to create the OS install files originally, with some minor exceptions.

Images were used for some partitions in lineage because it was the only way to convert them and they are fixed size partitions. It is much more complicated to restore an image to a different sized partition, which is normally required.

I shall investigate the problem OSes.

procount avatar Dec 11 '20 07:12 procount

Oops, I'm sensing a misunderstanding here. I now understand that you have not actually switched to raw imaging? The install files I used were mine, which have raw images.

Ok, I will now try again with original base installations which used tarballs.

annaclets avatar Dec 11 '20 08:12 annaclets

At this point, a backup trial of TwisterOS is moot because it never had i/o on mounted devices due to the original install files being tar based.

Just tried raspiOS, ubuntu and retropie, all from original base installations. Oddly, all that was generated from raspiOS and ubuntu were 1KB xxx.tar.gz, so I would consider them failures. Backup of retropie could not proceed because of incompatibility.

annaclets avatar Dec 11 '20 09:12 annaclets

I'm afraid pinn 353a is not going to work. See https://github.com/raspberrypi/noobs/issues/500.

Currently v3.5.2 uses plain tar, but I've verified it has started to cause this problem. I don't remember it being an issue before. Perhaps it's due to a bump in the file system versions that tar does not understand?

It looks like the file manager gets half way through mounting a USB drive, because the drive labels are displayed in the folder list, but they are not really mounted. Clicking on the drive label produces an error about not being able to access the drive via its UUID.

@Lurch - another opportunity for you to try PINN perhaps? I'm a bit stumped. Might have to revert to backing up to an image!

procount avatar Dec 11 '20 23:12 procount

I've done some investigation this afternoon, and it looks like it's an ACL issue. The USB disk does get mounted (which is why you can see it in the file explorer), but the missing ACL means that only the root user is able to see the contents of the disk! @annaclets The way to fix this is sudo setfacl -m user:pi:r-x /media/pi

I don't remember it being an issue before. Perhaps it's due to a bump in the file system versions that tar does not understand?

It seems to me that it's entirely possible that you just never tried mounting a USB disk in a running OS that you'd restored from a PINN-created backup before? :shrug:

@procount WRT the bsdtar problem in https://github.com/raspberrypi/noobs/issues/500 it looks like libarchive has a HAVE_STATFS option, so maybe forcing that to 0 would fix the statfs failed error? In the meantime / as an alternative: a temporary fix is to not add /media/pi into the rootfs backup tarball (or delete it from disk after restoring the backup), and Raspberry Pi OS will then create it automatically (when needed) with the correct ACL.
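
A minimal sketch of the "don't add /media/pi to the tarball" alternative, assuming GNU tar and a root filesystem mounted at a path such as /tmp/2. The function name and arguments are illustrative only and are not taken from PINN's actual backup code.

```shell
#!/bin/sh
# Sketch only: the function name and both paths are hypothetical.
# Archives a mounted rootfs while leaving out media/pi, so the restored OS
# can recreate that directory itself with the correct ACL.
backup_rootfs_without_media_user_dir() {
    rootfs="$1"     # mounted root filesystem, e.g. /tmp/2
    archive="$2"    # output tarball, located outside the rootfs
    # Member names are stored as "./media/pi", so the pattern is written the
    # same way. --exclude must come before the "." it applies to.
    tar -C "$rootfs" --exclude='./media/pi' -czf "$archive" .
}
```

For example, `backup_rootfs_without_media_user_dir /tmp/2 /mnt/backup/rootfs.tar.gz` would keep the /media directory itself in the archive but omit the per-user mount directory beneath it.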

lurch avatar Dec 13 '20 20:12 lurch

Cheers! TBH, I can't remember specifically mounting a USB drive. It has not been part of my usual test suite, but perhaps it should be now... I'll give the HAVE_STATFS a try and see if it works. The 2nd temporary fix should allow previous backups to start working again if they are re-installed.

procount avatar Dec 13 '20 21:12 procount

The 2nd temporary fix should allow previous backups to start working again if they are re-installed.

Good point, I hadn't considered that :slightly_smiling_face: Also means that a fresh TwisterOS PINN-install would work for @annaclets , once you've implemented it.

lurch avatar Dec 13 '20 21:12 lurch

Although using your setfacl fix would be a lot quicker!

procount avatar Dec 13 '20 22:12 procount

This can be fixed in partition_setup.sh by adding: rm -rf /tmp/2/media/pi

I wonder if we should use rm -rf /tmp/2/media/* instead, so that the username would not matter...? Or could that erase something important?

procount avatar Dec 15 '20 00:12 procount

This can be fixed in partition_setup.sh by adding: rm -rf /tmp/2/media/pi

Probably better to do rmdir /tmp/2/media/pi so that it won't nuke the directory if it's non-empty?

Or could that erase something important?

Possibly... you could guard against that by only deleting empty directories :wink:

for d in /tmp/2/media/*; do
    if [[ -d "$d" ]] && [[ -z "$(ls -A "$d")" ]]; then
        echo "Deleting \"$d\""
        rmdir "$d"
    fi
done

(based on code at https://superuser.com/questions/352289/bash-scripting-test-for-empty-directory )

Although I guess this "solution" only works for OSes where you're able to modify partition_setup.sh and wouldn't work for e.g. backups of Raspberry Pi OS?

lurch avatar Dec 15 '20 04:12 lurch

@annaclets The way to fix this is sudo setfacl -m user:pi:r-x /media/pi

Yes, @lurch, I confirm that this indeed fixes the problem, tested with TwisterOS. Thanks!

Now I'm looking forward to testing pinn v3.5.3b, which I expect would integrate the fix in its generated backups.

annaclets avatar Dec 15 '20 10:12 annaclets

I need to think about how best to implement this. Is it OS dependent? I don't think every OS will have a media/pi folder. I'm sure some won't. Building it into PINN will hardcode it too much I think, causing a maintenance headache. I think adding it to partition_setup.sh is better, but as @lurch says, this would be difficult to add into OSes that I have no/little control over such as Raspios, unless XECDesign will accommodate us. But there are also other OSes.

But really the above is just a sticking plaster over the fact that the tar I use cannot save ACL, and I'm not able to use BSDTAR because of the statfs issue (I tried compiling with HAVE_STATFS=0 and got a kernel panic 😞)
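
One possible workaround while tar cannot store ACLs is to keep them in a sidecar file using the standard `getfacl -R` / `setfacl --restore` pair from the acl package. This is only a sketch under that assumption: the function names and the sidecar file are hypothetical, and it presumes both tools are available in the environment doing the backup and restore.

```shell
#!/bin/sh
# Sketch only: save_acls/restore_acls and the sidecar file are assumptions,
# not part of PINN.

# Dump the ACLs of everything under a mounted rootfs into a text file.
save_acls() {
    rootfs="$1"      # mounted root filesystem, e.g. /tmp/2
    acl_file="$2"    # absolute path of the sidecar file
    # Run from inside the rootfs so the recorded paths are relative.
    (cd "$rootfs" && getfacl -R .) > "$acl_file"
}

# Re-apply the saved ACLs after the tarball has been extracted.
restore_acls() {
    rootfs="$1"
    acl_file="$2"    # must be absolute, because we cd into the rootfs first
    (cd "$rootfs" && setfacl --restore="$acl_file")
}
```

The sidecar file would need to travel with the tarball, and the restore step would have to run after extraction but before the OS is first booted.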

procount avatar Dec 15 '20 11:12 procount

(I tried compiling with HAVE_STATFS=0 and got a kernel panic :disappointed: )

Oh, that's annoying :confused: I wonder if pulling in a newer version from upstream might help? :shrug: https://git.busybox.net/buildroot/tree/package/libarchive

https://www.libarchive.de/ has a newer version than even the latest version of buildroot includes!

I don't think every OS will have a media/pi folder. I'm sure some won't.

As I pointed out above, deleting any empty directories in /media/ might be a broader catch-all? (e.g. would account for different usernames)

lurch avatar Dec 15 '20 11:12 lurch

As I pointed out above, deleting any empty directories in /media/ might be a broader catch-all?

Sorry, I had already taken your point, but my typing was a bit too specific and didn't know when to stop. 😄 Is '/media' always included, or might it be missing e.g. in Android, or some small music box buildroot distro for example? (I have too many OSes to check). Perhaps there are other OSes that use ACL on other folders/files that we are just not aware of. How should I cater for those?

Getting a BSDTAR version that supports ACL without STATFS would be ideal, so I may try those later versions. BUT I recall some conversation with MAXNET about BSDTAR/libarchive and I think later versions omitted something else (xattr maybe - I think it was to do with xbian's btrfs or similar), hence why I use 3.3.1. Got to be careful we don't break something else by fixing this issue.

procount avatar Dec 15 '20 12:12 procount

Is '/media' always included, or might it be missing

Well, obviously you'd check for the existence of the /media directory first! :laughing:

if [[ -d /tmp/2/media ]]; then
    for d in /tmp/2/media/*; do
        if [[ -d "$d" ]] && [[ -z "$(ls -A "$d")" ]]; then
            echo "Deleting \"$d\""
            rmdir "$d"
        fi
    done
fi

Perhaps there are other OSes that use ACL on other folders/files that we are just not aware of. How should I cater for those?

Indeed, as you noted this is just a sticking plaster - the proper fix is to backup and restore ACLs too...

Getting a BSDTAR version that supports ACL without STATFS would be ideal

...agreed!

BUT I recall some conversation with MAXNET about BSDTAR/libarchive and I think later versions omitted something else

If you can dig up the details I'd be happy to do a bit of investigation - seems like it'd be weird for bsdtar to actually be dropping features?

Got to be careful we don't break something else by fixing this issue.

Indeed, that's the trouble with supporting so many different OSes :stuck_out_tongue_winking_eye: :rofl:

lurch avatar Dec 15 '20 13:12 lurch

Thinking of adding the above code (copied here):

if [[ -d /tmp/2/media ]]; then
    for d in /tmp/2/media/*; do
        if [[ -d "$d" ]] && [[ -z "$(ls -A "$d")" ]]; then
            echo "Deleting \"$d\""
            rmdir "$d"
        fi
    done
fi

as a permanent script to be executed on OS restoration, separate from partition_setup.sh

procount avatar Jan 10 '21 01:01 procount

Perhaps it's worth doing that for each of the partitions you restore (in case some OS uses a weird layout), rather than only for the 2nd partition? Although maybe it's also worth adding an override flag for that (to prevent this empty-dir-deletion) to the os.json files, just in case there's some OS where it causes problems? (always best to be prepared for the worst :laughing: )
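
A rough sketch of that idea, applying the empty-directory cleanup to every restored partition with an opt-out. The function name, the list of mount points, and the KEEP_EMPTY_MEDIA_DIRS variable are all hypothetical; the variable merely stands in for whatever flag might be read from os.json.

```shell
#!/bin/sh
# Sketch only: names are illustrative and not part of PINN.
# Removes empty directories under <mountpoint>/media for each mount point
# given, so the restored OS can recreate them with the correct ACL.
prune_empty_media_dirs() {
    # Assumed override, standing in for a per-OS flag from os.json.
    [ "${KEEP_EMPTY_MEDIA_DIRS:-0}" = "1" ] && return 0
    for mnt in "$@"; do
        [ -d "$mnt/media" ] || continue
        for d in "$mnt"/media/*; do
            # Only real, empty directories are removed; symlinks are skipped.
            if [ -d "$d" ] && [ ! -L "$d" ] && [ -z "$(ls -A "$d")" ]; then
                echo "Deleting \"$d\""
                rmdir "$d"
            fi
        done
    done
}
```

It could be called once after a restore with all mounted partitions, e.g. `prune_empty_media_dirs /tmp/1 /tmp/2`, and partitions without a media directory are simply skipped.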

lurch avatar Jan 10 '21 09:01 lurch