Add Configurations, Hashing & Release Info Providers [WIP]
PoC adding configurations, hashing & release providers. Still things left to test, but it's overall in a working state, minus the still missing MySQL/MariaDB & MS SQL Server support.
Changes internally:
- Added a new `IPluginManager` interface for getting information about enabled plugins. Can be extended in the future to support disabled plugins, and/or be made to enable/disable/install/update/uninstall plugins. For now I just needed a way to list information about the installed plugins, so the rest can come later if needed.
- Added a new `IConfigurationService`, `IConfiguration`, `IConfigurationDefinition`, and `ConfigurationProvider<TConfig>`, for plugins (and the core) to create, load, save and validate configurations, in addition to creating a JSON schema (with extensions) to generate a UI for the config in the web ui. The existing `ISettingsProvider` has been moved to use the new service, and the built-in providers have their own settings served through this service.
- Added the ability to load a setting from an environment variable as long as it's marked with the new `EnvironmentVariableAttribute(string name)`, and to track which of them are loaded globally and on a per-configuration basis. We can also lock the setting so it can't be changed when the env. var. is loaded (the default behavior), or allow it to be changed but load the initial value on every startup from the environment.
- Added the ability to indicate a setting needs a restart to take effect by marking it with the new `RestartRequiredAttribute`, and to track if a server restart is needed because one or more of them changed, on a per-configuration, per-setting basis. Complete with events and properties to track it globally and on a per-configuration basis.
- Added a new `IVideoHashingService`, `IHashProvider`, `IHashDigest`, and `HashDigest` to the plugin abstraction. The new `IVideoHashingService` operates on raw `System.IO.FileInfo`s and is responsible for providing hashes to the `HashFileJob` before an `IVideo` & `IVideoFile` is necessarily assigned to a file location. So far there are runtime checks in place to make sure at least one `"ED2K"` hasher is enabled at all times, since we still rely on it as our absolute ID (in combination with the video file size) internally, but the hasher doesn't necessarily need to be provided by the new `"Core"` hasher. It contains events for when an `IVideo` & `IVideoFile` has been hashed (and added to the system), and when providers have been updated. The service can be used to switch between sequential mode and parallel mode (which controls how plugin providers are called), view all available and/or enabled hash types, enable or disable hash types per provider, and re-order the run priority of providers in sequential mode.
  Note: The priority doesn't affect the parallel mode because every provider is… run in parallel.
- Added a new `"Core"` hasher (`CoreHashProvider`) implementing the `"ED2K"`, `"MD5"`, `"CRC32"`, `"SHA1"`, `"SHA256"`, & `"SHA512"` hash types, with `"ED2K"` enabled by default.
- Added a new `IVideoReleaseService`, `IReleaseInfoProvider`, `IReleaseInfo`, `IReleaseVideoCrossReference`, `IReleaseMediaInfo`, and `VideoReleaseEventArgs` to the plugin abstraction. The new `IVideoReleaseService` is responsible for everything related to release info, be it managing providers, doing the auto-search across multiple providers, showing provider info, saving release info to the database, and clearing saved release info from the database. It also contains events for when a release has been saved or cleared, when an auto-search has been started/completed, and when providers have been updated. The service can be used to switch between sequential mode (running each provider in a loop in priority order until we find a match or exhaust the list) and parallel mode (running all providers in parallel and selecting the highest valid priority release), view all providers, enable or disable providers, and re-order the priority of the providers.
- Added a new `"AniDB"` release info provider (`AnidbReleaseProvider`), hooking into the existing AniDB UDP lookup logic. As a side-effect of the change in the lookup process, the MyList support in the existing UDP lookup logic has been stripped out, and we now rely entirely on the `IVideoReleaseService` and the MyList sync job to add new files and/or pull watched state from the MyList.
- Added an `IHashProvider<TConfig>` interface to create an explicit relation between a hash provider and a configuration. This information is also available to plugins on the `HashProviderInfo` class retrieved from the `IVideoHashingService` and to RESTful clients in the APIv3 (e.g. for the web ui to act on the information).
- Added an `IReleaseInfoProvider<TConfig>` in the same way as the `IHashProvider<TConfig>`, but for release info providers.
- Added `IReadOnlyList<IHashDigest> Hashes`, `string? SHA256`, and `string? SHA512` to `IHashes`, to list all hashes stored for an `IVideo` that may not necessarily be strongly typed, and to expose all hash types supported by the `CoreHashProvider` (`"Core"` in the UI) as strongly typed hashes. The existing strongly typed hash types have been converted to helpers, each retrieving the first stored hash of the given type from the list.
- The `AniDB_File`, `AniDB_FileUpdate`, `AniDB_ReleaseGroup`, `CrossRef_Languages_AniDB_File`, & `CrossRef_Subtitles_AniDB_File` models/tables/repos are gone, and their functionality is replaced by the new `StoredReleaseInfo` & `StoredReleaseInfo_MatchAttempt`. The AniDB file has also been removed from the abstraction.
- Video file hashes, except the `"ED2K"` hash, have been moved to only being stored in the new `VideoLocal_HashDigest` table, but the `"ED2K"` is still stored on the video itself in addition to the new table.
- Currently I've assigned every existing link as a "manual link", because the user is now able to edit every link we store if they so desire, and this was the simplest way to show all the links in the current Web UI.
- Added a new plugin to simply export/import release info (`Shoko.Plugin.ReleaseExporter`). This is both my test case for the plugin system and a small handy provider if you ever need to re-index your collection from scratch and don't want to do the AniDB UDP dance, or if you want to transcode your collection to a newer format at some point and want to preserve the release info in the process.
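To illustrate the `IHashes` change described above (a list of digests plus strongly typed helpers that return the first stored hash of a given type), here is a minimal sketch. `SketchHashDigest` and `SketchHashes` are hypothetical stand-ins invented for this example; the real `IHashDigest` and `IHashes` shapes in the PR may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a stored hash digest (assumption, not the PR's IHashDigest).
public sealed record SketchHashDigest(string Type, string Value);

// Hypothetical stand-in for the hashes exposed for a video.
public sealed class SketchHashes
{
    // Every stored digest, including types that have no strongly typed helper.
    public IReadOnlyList<SketchHashDigest> Hashes { get; init; } = Array.Empty<SketchHashDigest>();

    // Strongly typed helpers: each returns the first stored hash of its type, or null.
    public string? ED2K => First("ED2K");
    public string? SHA256 => First("SHA256");
    public string? SHA512 => First("SHA512");

    private string? First(string type) =>
        Hashes.FirstOrDefault(h => string.Equals(h.Type, type, StringComparison.OrdinalIgnoreCase))?.Value;
}
```

With this shape, a digest from a plugin-provided hasher still shows up in `Hashes` even though no helper property exists for it, while `SHA256` simply returns `null` when that hash type was never stored.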
Changes in APIv1:
- All file linking/unlinking in APIv1 has been soft deprecated. Use APIv3 instead. By soft deprecated I mean the client can still make the requests, but will only get an error message back from the server.
- Release info has been migrated to use the new format, but only for releases provided by the `"AniDB"` provider.
- Release groups have been migrated to use the new format, but only for release groups with `"AniDB"` as a source.
Changes in APIv2:
- Release info has been migrated to use the new format, but only for releases provided by the `"AniDB"` provider.
- Release groups have been migrated to use the new format, but only for release groups with `"AniDB"` as a source.
Changes in APIv3:
- Release info has been migrated to a new API model.
- `File.Hashes` has been changed from a dict of well known, nullable hashes to a list of hash digests, where only the ED2K hash is guaranteed to be included in the list.
- `File.AniDB` has been replaced with `File.Release`, which now uses the new release info model. The `includeDataFrom=AniDB` query parameter for file endpoints
- Added a new hashing controller (mounted at `/api/v3/Hashing` for now), to view and edit hashing provider settings, enable or disable hash types per provider, and re-order the run order of providers in sequential mode (note: the order doesn't affect the parallel mode because every provider is… run in parallel).
- Added a new `ReleaseInfoController` (mounted at `/api/v3/ReleaseInfo`), allowing RESTful clients to also interact with the newly added `IVideoReleaseService`. You can do anything you can do
- File linking in APIv3 has been converted to use the new service, and as a result the artificial limit of not allowing the user to remove AniDB releases is gone. A release is simply a release now.
To-do:
- [X] Add missing MySQL/MariaDB database migrations.
- [x] Add missing MS SQL Server database migrations.
- [X] Test that the anidb provider somehow works as it should.
- [x] Test out that MyList is still working as it should.
- [x] Test out adding a workflow to edit the providers in the web ui.
- [x] Fix breakage in the web ui as a result of the removal of the anidb property on the file model.
- [x] Fix breakage in Shokofin as a result of the removal of the anidb property on the file model.
This is...a lot, so I'll need to look at it more later. One thing I see first off is the complication and kind of hacky handling of the scheduling and ProcessFile. I would split the jobs if possible, and make it so that it has a flow like so:
- Discover
- Hash
- ProcessFile
  - Check the state of the data and what providers can/need to update, prolly via an interface/abstraction
  - Schedule the relevant jobs for each provider. AniDB will have one, which can be handled in scheduler via the exclusion types. Ashen might have one. Maybe an NFO one? It's extensible, after all.
- Get Provider File Info
  - The job mentioned before. It can do the job that Process File did and orchestrate other things. We can make helpers or a base "Get Provider File Info" in the abstractions for providers to extend.
- ...
The plugin abstractions might need to provide a hook to add Acquisition Filters, jobs, etc
@Cazzar can you comment on some of the design? We aren't nitpicking code quality yet.
though if we were, stop making constructors for models. It'll mess up Entity Framework, and I'm going to get rid of them anyway. Use object initializers. i.e.
public StoredReleaseInfo(IVideo video, IReleaseInfo releaseInfo)
...
Models should be models. If processing needs to be done, it should be in a service/factory. I don't know if your "embedded" models will work. We will see. I'm not sure how Entity Framework will handle loading of relationships through them.
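As a rough illustration of the suggestion above (plain models populated with object initializers, with any processing moved into a service/factory), here is a minimal sketch. The type names, properties, and the normalization done in the factory are all hypothetical and only for illustration; they are not the PR's actual `StoredReleaseInfo` members.

```csharp
using System;

// Hypothetical, simplified model: plain properties only, no constructor logic.
public class SketchStoredReleaseInfo
{
    public string ED2K { get; set; } = string.Empty;
    public long FileSize { get; set; }
    public string ProviderName { get; set; } = string.Empty;
    public DateTime LastUpdatedAt { get; set; }
}

// Hypothetical factory holding the processing that would otherwise live in a constructor.
public static class SketchStoredReleaseInfoFactory
{
    public static SketchStoredReleaseInfo Create(string ed2k, long fileSize, string providerName, DateTime now) =>
        new SketchStoredReleaseInfo
        {
            // Normalization happens here, not inside the model.
            ED2K = ed2k.ToUpperInvariant(),
            FileSize = fileSize,
            ProviderName = providerName.Trim(),
            LastUpdatedAt = now,
        };
}
```

The model keeps an implicit parameterless constructor, so anything that materializes it by setting properties does not need to know about the factory.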
Stuff like this is fine imo, though:
    public IReadOnlyList<IReleaseVideoCrossReference> CrossReferences
    {
        get => EmbeddedCrossReferences
            .Split(',')
            .Select(EmbeddedCrossReference.FromString)
            .WhereNotNull()
            .ToList();
        set => EmbeddedCrossReferences = value
            .Select(x => x.ToEmbeddedString())
            .Join(',');
    }
Overall, I like the idea of the design. I haven't looked through it extensively, as the large amount of changes does make things complex.
The current service can be run inside or outside the queue/job system, and in the current PoC the release providers are processed in a user-configurable order until a release is found. Only a single release can be assigned to the same video at any given time, so scheduling "relevant jobs for each provider" won't ever happen in parallel. In short, the logic to select a release happens strictly inside the service, and the process-file job is now just asking the service to do its thing while running in the queue/job system in the background. There are also other ways to interact with the service, be it from other plugins through the abstraction, or from RESTful clients through the new endpoints.
I do admit that the way I modified the AniDB banned acquisition filter is kind of hacky, but also correct: it now blocks the process-file jobs while AniDB banned ONLY IF the AniDB release provider is enabled, since that provider needs to be able to use the AniDB UDP API. Flipping that around, as long as the AniDB provider isn't enabled we don't need to block the process-file jobs at all, since nothing is using the AniDB UDP API to find releases.
That is a reasonable argument, but I would allow multiple so that you can cross reference them. AniDB is more likely correct than perceptual hashing, even though perceptual hashing is pretty accurate. We could even have a filename plugin with very low accuracy. Maybe add an enum for how trustworthy we expect a provider to be in those cases.
We kinda already have a user configurable "priority" to use. Can we add two modes,
- one mode to run in a sync. loop in the priority order until a release is found (AKA the current way), and
- one mode to run multiple providers in parallel, await all the responses, and then pick the highest priority release of the available candidates?
I can add the new setting to the service and modify the description of the auto-finding method and endpoints to reflect the new behavior. The reason I would opt to have both modes is to let the user choose how they want to do it. By default we will only have one provider included (…unless…), so the default mode can be whatever.
My particular flow would require the finding to happen in sync. order; it would first check the "nfo" (quotes intentional) file before asking any remote services (or a potential fallback local/offline provider). But I know some would maybe want to do it in a parallel fashion as you described with the p-hash and AniDB provider, so I'll say "let them pick their own poison to swallow."
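The two modes proposed above could look roughly like the sketch below. Everything in it is hypothetical: `ISketchReleaseProvider`, `SketchDelegateProvider` and `SketchReleaseSearch` are invented for illustration, a release is reduced to a plain string, and the list position stands in for the user-configurable priority. It is not the actual `IVideoReleaseService` implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical provider shape (assumption, not the PR's IReleaseInfoProvider).
public interface ISketchReleaseProvider
{
    string Name { get; }

    // Returns a release identifier, or null when the provider found nothing.
    Task<string?> FindAsync(string ed2k);
}

// Tiny delegate-backed provider, only here so the sketch can be exercised.
public sealed class SketchDelegateProvider : ISketchReleaseProvider
{
    private readonly Func<string, Task<string?>> _find;

    public SketchDelegateProvider(string name, Func<string, Task<string?>> find)
    {
        Name = name;
        _find = find;
    }

    public string Name { get; }

    public Task<string?> FindAsync(string ed2k) => _find(ed2k);
}

public static class SketchReleaseSearch
{
    // Sequential mode: ask providers one by one in priority order and stop at the first match.
    public static async Task<(string Provider, string Release)?> FindSequentialAsync(
        IReadOnlyList<ISketchReleaseProvider> providersInPriorityOrder, string ed2k)
    {
        foreach (var provider in providersInPriorityOrder)
        {
            var release = await provider.FindAsync(ed2k);
            if (release is not null)
                return (provider.Name, release);
        }
        return null;
    }

    // Parallel mode: start every provider, await all responses,
    // then pick the candidate from the highest priority provider.
    public static async Task<(string Provider, string Release)?> FindParallelAsync(
        IReadOnlyList<ISketchReleaseProvider> providersInPriorityOrder, string ed2k)
    {
        var tasks = providersInPriorityOrder.Select(p => p.FindAsync(ed2k)).ToArray();
        var results = await Task.WhenAll(tasks);
        for (var i = 0; i < results.Length; i++)
        {
            if (results[i] is { } release)
                return (providersInPriorityOrder[i].Name, release);
        }
        return null;
    }
}
```

Both methods select the same release when given the same provider order; the difference is that sequential mode never calls the providers after the first match (useful when a local "nfo" check should run before any remote service), while parallel mode always calls every provider.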
Quality Gate failed
Failed conditions
36 Security Hotspots
C Security Rating on New Code (required ≥ A)
C Reliability Rating on New Code (required ≥ A)
See analysis details on SonarQube Cloud
Quality Gate failed
Failed conditions
28 Security Hotspots
E Reliability Rating on New Code (required ≥ A)
C Security Rating on New Code (required ≥ A)
See analysis details on SonarQube Cloud