Per-file metadata support

Open joonas-fi opened this issue 4 years ago • 2 comments

Collection vs. file metadata

Currently we support metadata on the collection level, like IMDb ID, plot description, movie length etc.

The only per-file metadata we have is the file's thumbnail.

Once per-file metadata is completed, we should keep in mind the goal of implementing collection metadata on top of file metadata.

File metadata structure

There's progress in the all-the-bits project, where the needs of individual files have been fleshed out. An example:

{
    "title": "Team Whack iskee taloyhtiöön",
    "published": "2019-03-03T22:01:00Z",
    "description": "Hakkerit hyökkäävät taloyhtiösi ohjauskeskukseen ja paljastavat ison tietoturva-aukon. Järjestelmän kautta tuhansien ihmisten arki voisi mennä sekaisin. Kyberturvallisuusuhka leviää myös useisiin muihin rakennuksiin.",
    "source": "https://areena.yle.fi/1-4664683",
    "season_episode": {
        "season": 1,
        "episode": 1
    },
    "custom": {
        "areena.yle.fi:id": "1-4664683"
    }
}

This data structure will be defined inside Varasto for both backend and frontend use, so the frontend can show this data in the preview cards. For videos we should also add video length, video codec, video resolution, audio track codecs, channels etc.
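A minimal sketch of how the structure above could be modelled and parsed. The names `FileMetadata`, `SeasonEpisode` and `parse_file_metadata` are hypothetical, every field is assumed to be optional, and the video-specific fields are left out because their shape is not decided yet.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SeasonEpisode:
    season: int
    episode: int


@dataclass
class FileMetadata:
    # All fields are treated as optional here; that is an assumption.
    title: Optional[str] = None
    published: Optional[str] = None  # timestamp kept as the raw string
    description: Optional[str] = None
    source: Optional[str] = None
    season_episode: Optional[SeasonEpisode] = None
    custom: dict = field(default_factory=dict)  # namespaced, source-specific keys


def parse_file_metadata(raw: str) -> FileMetadata:
    """Parse a metadata JSON document into a FileMetadata instance."""
    doc = json.loads(raw)
    se = doc.get("season_episode")
    return FileMetadata(
        title=doc.get("title"),
        published=doc.get("published"),
        description=doc.get("description"),
        source=doc.get("source"),
        season_episode=SeasonEpisode(se["season"], se["episode"]) if se else None,
        custom=dict(doc.get("custom", {})),
    )
```

Keeping `custom` as a free-form mapping lets source-specific keys such as `areena.yle.fi:id` pass through without the schema having to know about them.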

The Google Drive API also defines a metadata structure we should investigate: https://developers.google.com/drive/api/v3/reference/files (namely: videoMediaMetadata.* and imageMediaMetadata.*)

File identity

The metadata file will be linked to the concrete file by its identity.

Currently we use sha256(content) as the identity, but that does not allow changing the file's content. Using the file's name as its identity would not allow for renames or moves. Should we just use a random value in sha256 format as its identity?
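To illustrate the trade-off, here is a small sketch comparing a content-derived identity with a random one. The function names are hypothetical, and "random sha256" is interpreted here as 32 random bytes in hex, which is an assumption.

```python
import hashlib
import secrets


def content_identity(content: bytes) -> str:
    """Identity derived from content: changes whenever the bytes change."""
    return hashlib.sha256(content).hexdigest()


def random_identity() -> str:
    """Random 256-bit identity, hex-encoded so it looks like a SHA-256 digest."""
    return secrets.token_hex(32)


# Editing the file changes a content-derived identity...
before = content_identity(b"version 1")
after = content_identity(b"version 2")

# ...while a random identity is assigned once and stored with the file record,
# so it would survive edits, renames and moves.
file_id = random_identity()
```

The cost of the random approach is that the identity can no longer be recomputed from the file itself, so it has to be persisted somewhere alongside the file record.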

Metadata file path

This file will be saved in /.sto/meta/<file id>.json, where the file ID will probably be derived the same way as for thumbnails (i.e. a shortened sha256).
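A sketch of the path calculation. The shortening length is not stated in this issue, so `SHORT_ID_LENGTH` is a placeholder, and plain prefix truncation is only an assumption about how the thumbnail IDs are shortened.

```python
import hashlib

META_DIR = "/.sto/meta"
SHORT_ID_LENGTH = 16  # hypothetical: the real thumbnail shortening is not specified here


def short_file_id(file_id: str, length: int = SHORT_ID_LENGTH) -> str:
    """Shorten a hex identity by keeping its prefix (assumed scheme)."""
    return file_id[:length]


def metadata_path(file_id: str) -> str:
    """Build the metadata file path for a given file identity."""
    return f"{META_DIR}/{short_file_id(file_id)}.json"


example_id = hashlib.sha256(b"example").hexdigest()
example_path = metadata_path(example_id)
```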

We should also investigate having Varasto FUSE project symlinks to /.sto/meta-symlinks/<original file path>.json, so it's easier for client apps to resolve a file's metadata by its path instead of its identity. Think of all-the-bits downloading a video: it's easier for it to write the JSON to a path calculated from the original file path than to one calculated from the file's identity.
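From a client's point of view, resolving the symlink path could look roughly like this. `metadata_symlink_path` is a hypothetical helper, and the normalization rules are assumptions rather than anything Varasto defines.

```python
import posixpath

META_SYMLINK_DIR = "/.sto/meta-symlinks"


def metadata_symlink_path(original_path: str) -> str:
    """Map a file's path inside the collection to its metadata symlink path."""
    # Normalize and strip the leading slash so the result always stays
    # under META_SYMLINK_DIR, even if the input contains ".." segments.
    normalized = posixpath.normpath("/" + original_path).lstrip("/")
    if not normalized:
        raise ValueError("original_path must name a file")
    return posixpath.join(META_SYMLINK_DIR, normalized + ".json")
```

A downloader would only need the path it just wrote the video to, for example `metadata_symlink_path("/shows/s01/ep1.mkv")`, without knowing the file's identity.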

joonas-fi, Jun 25 '20 10:06

Alternative idea: just use xattrs

joonas-fi, Oct 31 '22 14:10

Common xattr keys: https://www.freedesktop.org/wiki/CommonExtendedAttributes/#proposedmetadataattributes
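A rough sketch of what the xattr approach might look like for the JSON fields from the original post. The field-to-key mapping is an assumption and should be checked against the linked list; `os.setxattr` is only available on Linux and requires a filesystem that supports user xattrs.

```python
import os

# Assumed mapping from the JSON fields in this issue to xattr names.
XATTR_KEYS = {
    "title": "user.dublincore.title",
    "description": "user.dublincore.description",
    "published": "user.dublincore.date",
    "source": "user.xdg.origin.url",
}


def metadata_to_xattrs(metadata: dict) -> dict:
    """Translate known metadata fields into xattr name -> bytes value pairs."""
    attrs = {}
    for field_name, xattr_name in XATTR_KEYS.items():
        value = metadata.get(field_name)
        if value is not None:
            attrs[xattr_name] = str(value).encode("utf-8")
    return attrs


def write_xattrs(path: str, metadata: dict) -> None:
    """Write the mapped attributes onto a file (Linux only)."""
    for name, value in metadata_to_xattrs(metadata).items():
        os.setxattr(path, name, value)
```

Nested fields such as `season_episode` and the `custom` map have no obvious standard key, so they are skipped in this sketch.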

joonas-fi, Jan 24 '23 10:01