varasto
Per-file metadata support
Collection vs. file metadata
Currently we support metadata only at the collection level, such as IMDb ID, plot description, and movie length.
The only per-file metadata we have is the file's thumbnail.
We should keep in mind the goal of implementing collection metadata on top of file metadata once this is completed.
File metadata structure
There's progress in the all-the-bits project, where the needs for individual-file metadata have been fleshed out. An example:
{
  "title": "Team Whack iskee taloyhtiöön",
  "published": "2019-03-03T22:01:00Z",
  "description": "Hakkerit hyökkäävät taloyhtiösi ohjauskeskukseen ja paljastavat ison tietoturva-aukon. Järjestelmän kautta tuhansien ihmisten arki voisi mennä sekaisin. Kyberturvallisuusuhka leviää myös useisiin muihin rakennuksiin.",
  "source": "https://areena.yle.fi/1-4664683",
  "season_episode": {
    "season": 1,
    "episode": 1
  },
  "custom": {
    "areena.yle.fi:id": "1-4664683"
  }
}
This data structure will be defined inside Varasto, for both backend and frontend use. The frontend can then show this data in the preview cards. For videos, we also plan to add video length, video codec, video resolution, audio track codecs, channels, etc.
The Google Drive API spec also has a metadata structure we should investigate: https://developers.google.com/drive/api/v3/reference/files (namely videoMediaMetadata.* and imageMediaMetadata.*).
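As a rough sketch of how the example JSON above could map to a Go type on the backend: only the JSON keys come from the example; the names FileMetadata, SeasonEpisode and parseFileMetadata are hypothetical, and the planned video fields (length, codecs, resolution) are left out because their shape is not decided yet.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// FileMetadata mirrors the example JSON. The type and field names are
// assumptions for illustration; only the JSON keys come from the example.
type FileMetadata struct {
	Title         string            `json:"title,omitempty"`
	Published     *time.Time        `json:"published,omitempty"` // RFC 3339 timestamp
	Description   string            `json:"description,omitempty"`
	Source        string            `json:"source,omitempty"`
	SeasonEpisode *SeasonEpisode    `json:"season_episode,omitempty"`
	Custom        map[string]string `json:"custom,omitempty"` // source-specific keys
}

// SeasonEpisode is only present for episodic content.
type SeasonEpisode struct {
	Season  int `json:"season"`
	Episode int `json:"episode"`
}

// parseFileMetadata decodes one metadata JSON document.
func parseFileMetadata(data []byte) (*FileMetadata, error) {
	meta := &FileMetadata{}
	if err := json.Unmarshal(data, meta); err != nil {
		return nil, fmt.Errorf("parseFileMetadata: %w", err)
	}
	return meta, nil
}

func main() {
	example := []byte(`{
		"title": "Team Whack iskee taloyhtiöön",
		"published": "2019-03-03T22:01:00Z",
		"source": "https://areena.yle.fi/1-4664683",
		"season_episode": {"season": 1, "episode": 1},
		"custom": {"areena.yle.fi:id": "1-4664683"}
	}`)

	meta, err := parseFileMetadata(example)
	if err != nil {
		panic(err)
	}
	fmt.Println(meta.Title, meta.SeasonEpisode.Season, meta.SeasonEpisode.Episode)
}
```

Optional fields are pointers or use omitempty so that files without, say, season/episode information do not serialize empty placeholders.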
File identity
The metadata file will be linked to the concrete file by its identity.
Currently we use sha256(content) as the identity, but that does not allow changing file content. Using the file's filename as its identity would not allow for renames or moves. Should we just use a random sha256 as its identity?
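To make the trade-off concrete, here is a minimal sketch of the two options: an identity derived from content, and a random identity of the same size. The function names contentID and randomID are hypothetical, and "random sha256" is interpreted here as 32 random bytes, hex-encoded, which is an assumption.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentID derives the identity from the file content (the current approach).
// Any change to the content yields a different identity.
func contentID(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// randomID returns 32 random bytes, hex-encoded. It has the same length as a
// sha256 digest but does not depend on content, filename, or location.
func randomID() (string, error) {
	buf := make([]byte, sha256.Size)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	fmt.Println("content-derived:", contentID([]byte("hello")))

	id, err := randomID()
	if err != nil {
		panic(err)
	}
	fmt.Println("random:         ", id)
}
```

A random identity stays stable across edits, renames, and moves, but it has to be stored alongside the file record because it cannot be recomputed from the file itself.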
Metadata file path
This file will be saved in /.sto/meta/<file id>.json, where the file identity will probably be derived the same way as for thumbnails (i.e. a shortened sha256).
We should also investigate having Varasto's FUSE layer project symlinks at /.sto/meta-symlinks/<original file path>.json, so it's easier for client apps to resolve a file's metadata by its path instead of its identity. Think of all-the-bits downloading a video: it's easier for it to write the JSON to a path calculated from the original file path instead.
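The two lookup paths described above could be computed roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the file ID is taken as already shortened (the exact shortening used for thumbnails is not specified here), and the path cleaning step is an added precaution rather than something the design prescribes.

```go
package main

import (
	"fmt"
	"path"
)

const (
	metaDir        = "/.sto/meta"
	metaSymlinkDir = "/.sto/meta-symlinks"
)

// metaPathForID returns the metadata file path for an (already shortened) file ID.
func metaPathForID(fileID string) string {
	return path.Join(metaDir, fileID+".json")
}

// metaSymlinkPath returns the path-based location of a file's metadata.
// Cleaning against a leading "/" resolves any ".." segments first, so the
// result always stays under metaSymlinkDir.
func metaSymlinkPath(originalPath string) string {
	cleaned := path.Clean("/" + originalPath)
	return path.Join(metaSymlinkDir, cleaned+".json")
}

func main() {
	fmt.Println(metaPathForID("abc123"))
	fmt.Println(metaSymlinkPath("/videos/episode1.mp4"))
}
```

With this layout a client such as all-the-bits only needs the original file path to find where the JSON goes, without knowing the file's identity.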
Alternative idea: just use xattrs
Common xattr keys: https://www.freedesktop.org/wiki/CommonExtendedAttributes/#proposedmetadataattributes