Versioning based on btrfs snapshots

Open Borgvall opened this issue 1 month ago • 8 comments

Description

I actually like the Bottles application for managing Wine prefixes. However, it bothers me that I cannot create bottles as btrfs subvolumes, which I want to use with my backup solution to back up and restore bottles independently. I also tried creating the subvolume in place before creating a bottle, but Bottles appends a random number to the path if the bottle directory already exists.

I went a bit over the top and implemented it further, until I could create and restore bottle snapshots using the Bottles GUI. With all updates added to this PR, I think it is ready to be merged.

This is a rework of #3420, which cannot be reopened for GitHub reasons. Compared to that PR, the commits have been rebased on bottlesdev main, the flatpak module btrfs-progs has been reworked and updated to the current version, and one small bug fix has been added.

Type of change

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [x] This change requires a documentation update

How Has This Been Tested?

  • [x] btrfsutil calls also work inside Flatpak's sandbox
  • [x] Bottles are created as btrfs subvolumes, if the filesystem is btrfs.
  • [x] Creating snapshots and resetting to them using btrfs snapshots.
  • [x] All snapshots have meaningful timestamps
  • [x] The "active" state is marked
  • [x] Duplicated bottles are created as subvolumes. This can be used to "import" existing bottles into btrfs snapshot based versioning
  • [x] Duplicated bottle, where the source bottle is a subvolume, is created as a lightweight snapshot bottle
  • [x] Deleting a bottle, deletes the associated btrfs snapshots
  • [x] If the bottle is not a subvolume or has pre-created FVS states, it falls back to the FVS system. (I encountered a critical bug #3416, which affects the release version, too)
  • [ ] TODO regression testing on non btrfs filesystem. (Partially done, and currently blocked due to UI issue, see https://github.com/bottlesdevs/Bottles/pull/4221#issuecomment-3593288722)

Borgvall avatar Nov 22 '25 20:11 Borgvall

I believe the plan is to use something similar to ostree or git for snapshots, not anything tied to a specific filesystem.

orowith2os avatar Nov 23 '25 03:11 orowith2os

Both proposals wouldn't work:

OSTree is designed for providing read-only filesystem-trees. This is unsuitable for wine-prefixes.

git is designed for text file repositories and doesn't scale well with large or numerous binary files.

Borgvall avatar Nov 23 '25 07:11 Borgvall

I believe the plan is to use something similar to ostree or git for snapshots, not anything tied to a specific filesystem.

Honestly, this is a pretty sophisticated solution: btrfs is the default for a bunch of modern distros, and its snapshot feature is mature and stable. Copy-on-write makes snapshots pretty lightweight, and because snapshot creation is atomic they are very safe to rely on. This implementation could even be extended to support prefix deduplication. I also think we should have a fallback for other filesystems (maybe rsync could be a good fit; I have seen other projects use it as a fallback for filesystem snapshots, though it is not atomic).

Like the OP said, Git would not be a good fit for this. I am not sure about OSTree though; do you have anything about it I can take a look at? Maybe we could design the feature around interfaces that these multiple backends could implement.

evertonstz avatar Nov 27 '25 16:11 evertonstz

Hi all,

This PR already lays the foundation for supporting multiple versioning backends. Currently, it implements btrfs and the existing FVS versioning, but the architecture could easily extend to other systems like XFS, ZFS, or potentially even OSTree (?) in the future.

One particularly useful enhancement this enables is the ability to delete specific snapshots - something that, to my knowledge, isn't possible with the current FVS system, but works seamlessly with btrfs snapshots. At the moment it isn't provided via the GUI. It would be a follow-up.

As it stands, this PR delivers significant benefits for btrfs users: reliable, lightweight snapshots and efficient bottle duplication through copy-on-write.

I'd be particularly interested to hear from @mirkobrombin, who showed interest in the original PR implementation.

Borgvall avatar Nov 29 '25 07:11 Borgvall

Hi all,

This PR already lays the foundation for supporting multiple versioning backends. Currently, it implements btrfs and the existing FVS versioning, but the architecture could easily extend to other systems like XFS, ZFS, or potentially even OSTree (?) in the future.

One particularly useful enhancement this enables is the ability to delete specific snapshots - something that, to my knowledge, isn't possible with the current FVS system, but works seamlessly with btrfs snapshots. At the moment it isn't provided via the GUI. It would be a follow-up.

As it stands, this PR delivers significant benefits for btrfs users: reliable, lightweight snapshots and efficient bottle duplication through copy-on-write.

I'd be particularly interested to hear from @mirkobrombin, who showed interest in the original PR implementation.

Yes, I think the PR is solid for BTRFS and the current FVS versioning, but I think we could actually improve the architecture itself (could be in a future PR ofc).

I have been thinking about this for a couple of days and making notes as I reach conclusions. (PS: I take my notes by voice and summarize everything via AI, so ignore it if this sounds too formal; I am a Portuguese speaker, so the robot is not very good with tone.) Here’s a practical way to make snapshots in Bottles easier to extend beyond Btrfs, without scattering filesystem logic all over the app. The idea is simple: one clean interface for versioning, small backend implementations for each filesystem, and a registry that picks the right one at runtime. This keeps the UI predictable, reduces risk, and makes future backends (ZFS, LVM, etc.) straightforward.

What we’re aiming for

  • Put all filesystem-specific code behind a single interface.
  • Have each backend tell us what it can do (capabilities), so the UI never guesses.
  • Use consistent snapshot metadata and error types across backends.
  • Always have a safe fallback (FVS), and make migrations (e.g., directory → subvolume) a first-class thing.
  • Keep bottle lifecycle code simple: resolve a backend, call methods, done.

The building blocks

  • VersioningBackend (the interface every backend implements):

    • backend_type() → short name like "btrfs", "zfs", "fvs"
    • capabilities() → what features are supported
    • ensure_bottle_initialized() → prep the bottle for this backend
    • create_snapshot(label?, read_only?) → returns SnapshotMetadata
    • list_snapshots() → returns normalized snapshot list
    • restore_snapshot(snapshot_id)
    • delete_snapshot(snapshot_id)
    • mark_active(snapshot_id) → optional; raise NotSupportedError if not supported
    • duplicate_bottle(target_path, mode=FULL_COPY|SNAPSHOT_CLONE)
    • cleanup_on_bottle_delete()
  • BackendCapabilities (tiny feature matrix):

    • supports_managed_container
    • supports_read_only_snapshots
    • supports_writable_snapshots
    • supports_active_marker
    • supports_streaming
    • supports_inplace_conversion
  • SnapshotMetadata (one shape for all backends):

    • id, label?, created_at, read_only, is_active, parent_id?, backend_type
  • Errors (shared and simple):

    • VersioningError → operation failed
    • NotSupportedError → backend doesn’t support that feature
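As a rough illustration, the building blocks above could be written down in Python along these lines. This is only a sketch: the method and field names come from the list, but the type hints, the default values, the `DuplicateMode` values, and making `NotSupportedError` a subclass of `VersioningError` are assumptions, not something the PR defines.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class VersioningError(Exception):
    """A versioning operation failed."""


class NotSupportedError(VersioningError):
    """The backend does not support the requested feature (subclassing is an assumption)."""


class DuplicateMode(Enum):
    FULL_COPY = "full_copy"
    SNAPSHOT_CLONE = "snapshot_clone"


@dataclass(frozen=True)
class BackendCapabilities:
    # Every flag defaults to False so a backend only opts in to what it supports.
    supports_managed_container: bool = False
    supports_read_only_snapshots: bool = False
    supports_writable_snapshots: bool = False
    supports_active_marker: bool = False
    supports_streaming: bool = False
    supports_inplace_conversion: bool = False


@dataclass(frozen=True)
class SnapshotMetadata:
    id: str
    created_at: datetime
    read_only: bool
    is_active: bool
    backend_type: str
    label: Optional[str] = None
    parent_id: Optional[str] = None


class VersioningBackend(ABC):
    @abstractmethod
    def backend_type(self) -> str: ...

    @abstractmethod
    def capabilities(self) -> BackendCapabilities: ...

    @abstractmethod
    def ensure_bottle_initialized(self) -> None: ...

    @abstractmethod
    def create_snapshot(
        self, label: Optional[str] = None, read_only: bool = True
    ) -> SnapshotMetadata: ...

    @abstractmethod
    def list_snapshots(self) -> list[SnapshotMetadata]: ...

    @abstractmethod
    def restore_snapshot(self, snapshot_id: str) -> None: ...

    @abstractmethod
    def delete_snapshot(self, snapshot_id: str) -> None: ...

    def mark_active(self, snapshot_id: str) -> None:
        # Optional feature: backends that support it override this method.
        raise NotSupportedError(f"{self.backend_type()} has no active marker")

    @abstractmethod
    def duplicate_bottle(
        self, target_path: str, mode: DuplicateMode = DuplicateMode.FULL_COPY
    ) -> None: ...

    @abstractmethod
    def cleanup_on_bottle_delete(self) -> None: ...
```

Only `mark_active` has a default body, mirroring the "optional" note in the list; everything else must be implemented by each backend.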

Choosing the backend

  • BackendRegistry.resolve(bottle_path, prefer_native=True) returns the best backend:
    • If we’re on Btrfs and can manage/convert → use BtrfsBackend
    • Otherwise → use FVSBackend (guaranteed to work everywhere)

This keeps logic out of the UI and bottle lifecycle. The app just asks the registry and delegates.
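A minimal sketch of that resolution step, under some loud assumptions: it detects the filesystem by reading `/proc/mounts` (Linux only), it returns the backend's short name instead of a backend instance so that it stays self-contained, and it skips the "can manage/convert" check entirely. `filesystem_type` is a hypothetical helper, not an existing Bottles function.

```python
import os
from typing import Callable, Optional


def filesystem_type(path: str, mounts_text: Optional[str] = None) -> str:
    """Return the filesystem type of the mount containing `path` (hypothetical helper)."""
    if mounts_text is None:
        with open("/proc/mounts", encoding="utf-8") as handle:
            mounts_text = handle.read()
    real = os.path.realpath(path)
    best_mount, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # Note: mount points containing spaces are escaped in /proc/mounts;
        # this sketch does not decode them.
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Keep the longest matching mount point; later entries win ties
        # because a later mount shadows an earlier one at the same path.
        if (real + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


class BackendRegistry:
    def __init__(self, detect: Callable[[str], str] = filesystem_type):
        self._detect = detect

    def resolve(self, bottle_path: str, prefer_native: bool = True) -> str:
        """Return "btrfs" when the bottle lives on Btrfs, otherwise the FVS fallback."""
        if prefer_native and self._detect(bottle_path) == "btrfs":
            return "btrfs"
        return "fvs"
```

Passing the detector in as a parameter keeps the registry testable without a real Btrfs mount.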

Where it plugs into Bottles

  • On bottle creation/import:
    • backend.ensure_bottle_initialized()
  • For snapshots:
    • create/list/restore/delete
    • mark_active when supported
  • For duplication:
    • duplicate_bottle(target_path, mode), choosing lightweight clone when available
  • On delete:
    • cleanup_on_bottle_delete()

In the UI, call capabilities() to decide which buttons to show. For example, only show “Lightweight clone” when the backend supports writable snapshots and managed containers.
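To make the “Lightweight clone” example concrete, here is a small sketch of how the UI layer might derive its visible actions. The `BackendCapabilities` class below is a trimmed stand-in with only three flags, and every action label except “Lightweight clone” is made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BackendCapabilities:
    # Trimmed stand-in for the full capability matrix.
    supports_managed_container: bool = False
    supports_writable_snapshots: bool = False
    supports_active_marker: bool = False


def visible_actions(caps: BackendCapabilities) -> list[str]:
    """Return the action labels the UI should show for a given backend."""
    actions = ["Create snapshot", "Restore snapshot", "Full copy"]
    # Lightweight clones need both writable snapshots and a managed container.
    if caps.supports_writable_snapshots and caps.supports_managed_container:
        actions.append("Lightweight clone")
    if caps.supports_active_marker:
        actions.append("Mark as active")
    return actions
```

With this shape, the widgets never ask which filesystem is in use; they only look at the flags.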

How to roll this out

  1. Add the base module:

    • VersioningBackend interface
    • BackendCapabilities, SnapshotMetadata
    • DuplicateMode enum and the shared errors
  2. Move Btrfs logic into BtrfsBackend:

    • Detect filesystem and subvolume robustly (prefer libbtrfsutil when available in Flatpak)
    • ensure_bottle_initialized(): convert directory → subvolume if allowed
    • Implement snapshot/restore/delete/duplicate using btrfs subvolume tools
    • Decide on an “active” marker (file/symlink) if the UI needs it
  3. Wrap the existing FVS code in FVSBackend:

    • Keep current behavior for snapshot/restore/delete/mark_active
    • duplicate_bottle: support FULL_COPY only; raise NotSupportedError for SNAPSHOT_CLONE
  4. Add BackendRegistry and update bottle lifecycle to use it.

  5. UI and settings:

    • Drive features from capabilities()
    • Offer migration options when supported (e.g., “Convert to Btrfs subvolume”)
  6. Tests:

    • Per-backend: capabilities, snapshot lifecycle, duplicate, cleanup
    • Fallback on non-Btrfs filesystems
    • Flatpak sandbox behavior (btrfsutil availability)
    • Regression coverage for existing FVS
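For step 3, the duplication rule could look roughly like the sketch below. It is not the existing FVS code: the class here only shows the `FULL_COPY` versus `SNAPSHOT_CLONE` decision, and using `shutil.copytree` for the full copy is an assumption about how a plain copy might be done.

```python
import shutil
from enum import Enum


class NotSupportedError(Exception):
    """The backend does not support the requested feature."""


class DuplicateMode(Enum):
    FULL_COPY = "full_copy"
    SNAPSHOT_CLONE = "snapshot_clone"


class FVSBackend:
    """Illustrative fallback backend; only duplication is sketched here."""

    def __init__(self, bottle_path: str):
        self.bottle_path = bottle_path

    def duplicate_bottle(
        self, target_path: str, mode: DuplicateMode = DuplicateMode.FULL_COPY
    ) -> None:
        if mode is not DuplicateMode.FULL_COPY:
            # A plain directory cannot be cloned as a snapshot.
            raise NotSupportedError("FVS only supports FULL_COPY duplication")
        # symlinks=True copies symlinks as links instead of following them.
        shutil.copytree(self.bottle_path, target_path, symlinks=True)
```

The caller can catch `NotSupportedError` and retry with `FULL_COPY`, or avoid the call altogether by checking `capabilities()` first.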

Notes for the current Btrfs PR

  • Keep all Btrfs commands and logic inside BtrfsBackend; don’t leak them into the UI or lifecycle.
  • Use real subvolume metadata for timestamps rather than “now”.
  • Where operations need multiple steps, consider a small transactional helper to avoid partial changes.
  • Document how “active” is determined so the UI can reflect it consistently.

Why this is worth it

  • Clean boundaries and cleaner code.
  • Easy to add new backends without touching the UI.
  • Fewer edge cases: the UI only exposes what’s supported.
  • Safer behavior and a consistent user experience across filesystems.

This approach respects the work in the Btrfs PR, gives us a solid foundation for more filesystems, and keeps Bottles maintainable as we grow.

evertonstz avatar Nov 30 '25 15:11 evertonstz

Hi,

I have tested the fallback on non-btrfs filesystems using an ext4 loop mount, except for snapshot restoring. The GUI does not show the created snapshots. This happens for me both on my development flatpak and on the official 60.1 flatpak from Flathub. Can anyone please check whether it's a local problem or a general issue?

@evertonstz about your notes for the current PR:

Keep all Btrfs commands and logic inside BtrfsBackend; don’t leak them into the UI or lifecycle.

At the moment, all work is delegated to the model.btrfssubvolume Python module.

Use real subvolume metadata for timestamps rather than “now”.

This is already done, isn't it? The modification time of the snapshot directory is the creation time of the snapshot.
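Reading the timestamp that way needs nothing beyond the standard library. The sketch below is an illustration of the idea, not the PR's code, and `snapshot_created_at` is a hypothetical name. One caveat: a directory's modification time changes when entries are added or removed directly inside it, so this only reflects the creation time as long as the snapshot's top level is left untouched. The subvolume info exposed by libbtrfsutil also carries a creation time, which might be a more robust source; that is not shown here.

```python
import os
from datetime import datetime, timezone


def snapshot_created_at(snapshot_path: str) -> datetime:
    """Interpret the snapshot directory's mtime as the snapshot's creation time."""
    mtime = os.stat(snapshot_path).st_mtime
    # Use an explicit timezone so timestamps compare consistently across backends.
    return datetime.fromtimestamp(mtime, tz=timezone.utc)
```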

Where operations need multiple steps, consider a small transactional helper to avoid partial changes.

Is this a Python feature? Can you point me to some documentation?

Document how “active” is determined so the UI can reflect it consistently.

Should I add the documentation to the versioning manager?

Borgvall avatar Nov 30 '25 20:11 Borgvall

Where operations need multiple steps, consider a small transactional helper to avoid partial changes.

Is this a Python feature? Can you point me to some documentation?

Not a built‑in magic feature; it’s a pattern. The goal is to group multi-step filesystem changes so you can roll them back if something fails midway. You could do this in multiple ways, e.g. by logging your steps in a journal and triggering a recovery to undo them in case of a failure (or a full recovery in case of a crash). A lifecycle with SQLite would look something like this (this is backend-agnostic, by the way; it would work for ZFS, btrfs, etc.):

  1. Begin Transaction

    • Insert row into transactions:
      • state = 'PENDING'
      • started_at = now, updated_at = now
    • Insert each planned step into steps with:
      • status = 'PENDING'
      • step_order (0-based)
      • details_json (paths, IDs, etc.)
  2. Execute Steps (Forward Phase). For each step, in step_order:

    • Update step status = 'IN_PROGRESS'
    • (Optional) set transaction state = 'APPLYING' if still PENDING
    • Perform the filesystem action (create subvolume, rsync, rename, etc.)
    • On success: update step status = 'DONE'
    • On failure:
      • Update step status = 'FAILED'
      • Update transaction state = 'ROLLING_BACK'
      • Record error message in transactions.error
      • Jump to Rollback Phase
  3. Rollback Phase (if any step failed)

    • Iterate previously DONE steps in reverse order.
    • For each, run its compensating (rollback) action.
      • Rollbacks should be idempotent (ignore missing targets).
    • After all possible rollbacks:
      • Update transaction state = 'ABORTED'
      • Optionally store a final error message / reason.
    • End (no commit).
  4. Commit Phase (only if all steps reached DONE)

    • Update transaction state = 'COMMITTING'
    • Perform any final atomic action (e.g., rename staging → live, delete backups).
    • Validate final state (paths exist, metadata intact).
    • Update transaction state = 'COMMITTED'
  5. Cleanup

    • Release any locks (file lock, advisory lock).
    • Optionally prune or archive old committed/aborted entries (housekeeping task).
  6. Crash Recovery (on next startup)

    • Query transactions where state IN ('PENDING','APPLYING','COMMITTING','ROLLING_BACK')
    • For each:
      • Load ordered steps and their statuses.
      • If state = 'ROLLING_BACK':
        • Ensure rollback finished (redo reverse rollback for any remaining DONE steps).
        • Set to ABORTED.
      • Else if all steps status = 'DONE':
        • Finalize commit actions (if not already done).
        • Set state = 'COMMITTED'.
      • Else:
        • Perform rollback of all DONE steps (reverse order).
        • Set state = 'ABORTED'.
    • Log outcomes for debugging/audit.
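The begin, forward, rollback, and commit phases above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions: the table layout, the `run_transaction` name, and the extra `ROLLED_BACK` step status are invented for the example, and locking, final validation, and crash recovery are left out. Recovery would also need a way to map persisted step names back to functions, since the callables themselves are not stored.

```python
import json
import sqlite3
import time
from typing import Callable, List, Tuple

# One step: (name, details, forward action, compensating rollback action).
Step = Tuple[str, dict, Callable[[], None], Callable[[], None]]

SCHEMA = """
CREATE TABLE IF NOT EXISTS transactions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    state TEXT NOT NULL,
    started_at REAL NOT NULL,
    updated_at REAL NOT NULL,
    error TEXT
);
CREATE TABLE IF NOT EXISTS steps (
    tx_id INTEGER NOT NULL REFERENCES transactions(id),
    step_order INTEGER NOT NULL,
    name TEXT NOT NULL,
    status TEXT NOT NULL,
    details_json TEXT NOT NULL,
    PRIMARY KEY (tx_id, step_order)
);
"""


def _set_tx(db, tx_id, state, error=None):
    # COALESCE keeps an already recorded error when no new one is given.
    db.execute(
        "UPDATE transactions SET state = ?, updated_at = ?, "
        "error = COALESCE(?, error) WHERE id = ?",
        (state, time.time(), error, tx_id),
    )
    db.commit()


def _set_step(db, tx_id, order, status):
    db.execute(
        "UPDATE steps SET status = ? WHERE tx_id = ? AND step_order = ?",
        (status, tx_id, order),
    )
    db.commit()


def run_transaction(db: sqlite3.Connection, steps: List[Step]) -> str:
    """Run steps in order; undo completed ones in reverse if any step fails."""
    db.executescript(SCHEMA)
    now = time.time()
    cur = db.execute(
        "INSERT INTO transactions (state, started_at, updated_at) "
        "VALUES ('PENDING', ?, ?)",
        (now, now),
    )
    tx_id = cur.lastrowid
    for order, (name, details, _, _) in enumerate(steps):
        db.execute(
            "INSERT INTO steps VALUES (?, ?, ?, 'PENDING', ?)",
            (tx_id, order, name, json.dumps(details)),
        )
    db.commit()

    _set_tx(db, tx_id, "APPLYING")
    done = []
    for order, (_, _, action, _) in enumerate(steps):
        _set_step(db, tx_id, order, "IN_PROGRESS")
        try:
            action()
        except Exception as exc:
            _set_step(db, tx_id, order, "FAILED")
            _set_tx(db, tx_id, "ROLLING_BACK", error=str(exc))
            for prev in reversed(done):
                steps[prev][3]()  # compensating action; should be idempotent
                _set_step(db, tx_id, prev, "ROLLED_BACK")  # status name is an assumption
            _set_tx(db, tx_id, "ABORTED")
            return "ABORTED"
        _set_step(db, tx_id, order, "DONE")
        done.append(order)

    _set_tx(db, tx_id, "COMMITTING")
    # A real implementation would perform the final atomic action and validation here.
    _set_tx(db, tx_id, "COMMITTED")
    return "COMMITTED"
```

Every status change is committed immediately, so the journal on disk always reflects the last step that was started; that is what a recovery pass on the next startup would rely on.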

evertonstz avatar Dec 01 '25 15:12 evertonstz

Work on this is currently blocked for me, because the Flatpak I build and install is extremely unstable. I do not even get as far as creating a bottle, let alone creating or restoring snapshots. I might recheck this in a few weeks or so.

Borgvall avatar Dec 10 '25 15:12 Borgvall

Did you try using the latest build from our CI?

mirkobrombin avatar Dec 12 '25 11:12 mirkobrombin

The CI version would not help me, as I need a Flatpak with the Python bindings for libbtrfsutil. I have attached my script to build and install the Bottles Flatpak below.

#!/bin/bash

# Abort on error
set -e

cd ~/builds
# Build the development manifest and export the result into a local repo
flatpak-builder --force-clean --install-deps-from=flathub --repo=repo ~/builds/bottles ~/src/Bottles/build-aux/com.usebottles.bottles.Devel.json
# Bundle the exported app into a single installable file
flatpak build-bundle repo bottles.flatpak com.usebottles.bottles.Devel --runtime-repo=https://flathub.org/repo/flathub.flatpakrepo
# Install the bundle
flatpak install ~/builds/bottles.flatpak

Borgvall avatar Dec 12 '25 14:12 Borgvall