
v5 plans

Open zoriya opened this issue 1 year ago • 13 comments

I talked a bit about it on the discord and twitter, but it's time to put everything in writing:

I'm planning a complete rewrite of the backend & a database restructuring. This is needed to support #87 and #282. This will also make #463 (big thanks to @Arthi-chaud for the brainstorming there) or #549 possible. This is also a great occasion to tackle tech debt and bad decisions I made during the 5 years I've been working on this codebase.

I plan on writing diagrams and asking for feedback on the discord at most turning points of the code, so please give your feedback if you're interested!

To give a concrete vision, I plan to:

  • [x] Rewrite the auth and extract it to its own service
    • [x] Write the spec #573
    • [x] Write the service #610
    • [ ] Handle #346
  • [ ] Rewrite the API
    • [x] Design core database logic (see below)
    • [x] Design a new API to handle mixes of episode/movies/special/recaps/extra
    • [x] Rewrite the existing API (elysia w/ bun - javascript)
    • [ ] Add websockets as a core part of the API (elysia has great websockets support)
    • [ ] Handle scaling (websockets needs custom scaling logic)
  • [ ] Update the scanning API
    • [x] Make the spec #678
    • [ ] Implement it
  • [ ] Probably merge autosync into the API

I'm going to explain each point and why I want to design Kyoo this way. As always, feel free to give your opinion or ideas.

Why a separate auth service

Kyoo has multiple HTTP services; for now we have the API & the transcoder. To ensure users have the correct level of permissions, all requests hit the API, which does permission validation and then proxies the transcoder. This is bad for performance, scaling and DX. The idea with a centralized auth service is to have the reverse-proxy/ingress/gateway call the service and trade an opaque auth token for a jwt (see #573's phantom token part). The short-lived jwt will be used by downstream services (API, transcoder, scanner...) to check for permissions.
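To make the token exchange concrete, here is a minimal sketch of the flow, not the actual service from #573/#610. All names (`sessions`, `exchange`, `isAllowed`, the permission strings) are hypothetical, and signing/verifying a real jwt is left out on purpose: the short-lived token is modeled as a plain object with an expiry.

```typescript
// Hypothetical names throughout; a real jwt would be signed and verified.
type Claims = { sub: string; permissions: string[] };
type ShortLivedToken = Claims & { exp: number }; // exp: unix time in seconds

// Opaque tokens only mean something to the auth service (here: an in-memory map).
const sessions = new Map<string, Claims>([
  ["opaque-abc", { sub: "user-1", permissions: ["core.read", "core.play"] }],
]);

// Called by the gateway on each request: trade the opaque token for short-lived claims.
function exchange(opaque: string, now: number, ttl = 60): ShortLivedToken | null {
  const claims = sessions.get(opaque);
  if (!claims) return null;
  return { ...claims, exp: now + ttl };
}

// Called by downstream services (API, transcoder...): no round trip to the auth service.
function isAllowed(token: ShortLivedToken | null, permission: string, now: number): boolean {
  return token !== null && token.exp > now && token.permissions.includes(permission);
}
```

The point of the design is that only the gateway talks to the auth service; everything behind it can check permissions locally until the token expires.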

Having this service stand alone also makes it possible/simple to have #346.

The auth service could also be used by other applications (as long as they are compliant w/ the license).

Why rewrite the API from scratch

The current backend is written in C#, which lacks sum-types. Kyoo's logic often works on types like Movie | Serie, Episode | Movie, Episode | Special and so on. The lack of sum-types in C# makes these hard to work with: we have multiple interfaces and custom logic scattered everywhere to handle this well. This is why JavaScript was chosen as the replacement (we could have used a more functional language like Elixir, OCaml or even Gleam, but the core value of kyoo is not its API, so I think trading some perfs for velocity here will be really important. I also think Gleam is too early in development to write everything in it).
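For readers unfamiliar with sum-types, this is roughly what a Movie | Serie type looks like with TypeScript's discriminated unions. The field names (`kind`, `runtime`, `seasonsCount`) are made up for the illustration and are not the v5 schema.

```typescript
// Illustrative fields only; not the real v5 types.
type Movie = { kind: "movie"; name: string; runtime: number };
type Serie = { kind: "serie"; name: string; seasonsCount: number };
type Show = Movie | Serie;

function describe(show: Show): string {
  // The compiler narrows `show` in each branch based on `kind`.
  switch (show.kind) {
    case "movie":
      return `${show.name} (${show.runtime} min)`;
    case "serie":
      return `${show.name} (${show.seasonsCount} seasons)`;
    default: {
      // Adding a new member to Show without handling it here fails to compile.
      const unreachable: never = show;
      return unreachable;
    }
  }
}
```

No interface hierarchy or runtime type checks are needed; forgetting a case becomes a compile error instead of scattered custom logic.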

For #87, we need to rewrite basically every single type of kyoo (Series, Episodes, Movies... all need their translatable fields moved to another type & so on). Fixing types one by one & their SQL interaction would probably take more time than just rewriting everything (and is wayyy more boring).

What's up with episodes/movies/special/recaps/extra

Right now, Kyoo takes the simplest approach of having either a Movie or a Serie containing seasons that contain episodes. In reality, it's a bit more complicated: a Serie can have movies that should be watched between seasons.

Most online databases (TVDB, TheMovieDB) use "Season 0" as a special season, and we've used that until now, but this feels more like a workaround than a proper feature. Some specials are:

  • critical to the watching experience and need to be watched between seasons/episodes.
  • simple recaps that rehash one/multiple episodes and can be skipped (but still need to be shown at their proper place in the timeline of the app)
  • extra content like short episodes (2/3min long)

Note that specials can also be movies.

To give an example:

Made in Abyss is an anime with 2 seasons & 3 movies (at the time of writing). The first 2 movies recap the first season and the 3rd movie must be watched before the 2nd season. This means the watch order is 1st season -> 3rd movie -> 2nd season. The 1st/2nd movies should be shown close to the 1st season but be greyed out since they are recaps.
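Here is a small sketch of that example as data. It is only an illustration: the `Entry` shape, the `order` field and the helper names are assumptions, and each season is collapsed into a single entry to keep it short.

```typescript
// Hypothetical shape; the real v5 schema may differ.
type EntryKind = "episode" | "movie" | "special" | "recap" | "extra";
type Entry = { name: string; kind: EntryKind; order: number };

// Seasons are collapsed into one entry each for brevity.
const madeInAbyss: Entry[] = [
  { name: "Season 2", kind: "episode", order: 3 },
  { name: "Movie 1", kind: "recap", order: 1.1 },
  { name: "Season 1", kind: "episode", order: 1 },
  { name: "Movie 3", kind: "movie", order: 2 },
  { name: "Movie 2", kind: "recap", order: 1.2 },
];

// Everything in timeline order; recaps/extras are flagged so the UI can grey them out.
function timeline(entries: Entry[]): (Entry & { skippable: boolean })[] {
  return [...entries]
    .sort((a, b) => a.order - b.order)
    .map((e) => ({ ...e, skippable: e.kind === "recap" || e.kind === "extra" }));
}

// Only what must be watched, in order.
function watchOrder(entries: Entry[]): string[] {
  return timeline(entries)
    .filter((e) => !e.skippable)
    .map((e) => e.name);
}

console.log(watchOrder(madeInAbyss).join(" -> "));
// prints "Season 1 -> Movie 3 -> Season 2"
```

The recap movies stay in the timeline right after the first season, they are just marked as skippable instead of being hidden in a "Season 0".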

Websockets

I've wanted to add websockets to kyoo for a long time (for features like #341, #297 or #342). This would also make invalidating the cache for "Continue watching", "Next up" and "Watch status" easier in various apps.

I never really got around to writing it, since I was not happy with the options I had. C#'s built-in websocket solution uses a weird format that can only be used w/ their own lib so it felt wrong & writing a service specifically for that was counterproductive since it would need lots of logic shared by the API (I still did a poc in the feat/ws-rabit branch).

Elysia has good websocket handling & the format is easily readable by any client, so I'm happy about this. We would just need a message queue to handle replication.

On the scanning API

Right now, the matcher (the part of the scanner that fetches metadata & pushes it to kyoo) is using a REST API to register new videos. When there are a lot of new videos to register, this kinda DDoSes the API. This is also inadequate for data that may or may not already exist. For example, when we register an episode, the associated season/series can be already registered in kyoo or not.

Migrating to a queue based system w/ the matcher producing items to register & the API consuming these items seems like the way to go. When the API encounters an episode missing season/series data, it could push a request in another queue.
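A rough in-memory sketch of that flow, to show the shape of the idea rather than the spec from #678. The queues are plain arrays standing in for a real broker, every name here is hypothetical, and registering the episode even when its series is missing is an assumption of this sketch.

```typescript
// Hypothetical message shapes; plain arrays stand in for real queues.
type RegisterEpisode = { serieSlug: string; season: number; episode: number };
type MetadataRequest = { serieSlug: string };

const registerQueue: RegisterEpisode[] = [];
const metadataRequests: MetadataRequest[] = [];
const knownSeries = new Set<string>(["made-in-abyss"]);
const registered: RegisterEpisode[] = [];

// Matcher side: producing is cheap and never hits the API directly.
function produce(item: RegisterEpisode): void {
  registerQueue.push(item);
}

// API side: consume at its own pace, `batchSize` items at a time.
function consume(batchSize: number): void {
  for (const item of registerQueue.splice(0, batchSize)) {
    const alreadyRequested = metadataRequests.some((r) => r.serieSlug === item.serieSlug);
    if (!knownSeries.has(item.serieSlug) && !alreadyRequested) {
      // Missing season/series data: ask for it through the second queue.
      metadataRequests.push({ serieSlug: item.serieSlug });
    }
    registered.push(item); // assumption: the episode is still registered
  }
}
```

Because the API pulls items instead of receiving requests, a large library scan just makes the queue longer instead of flooding the API.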

Why merge autosync

For those unaware, autosync is the service responsible for marking episodes as watched on external services (SIMKL and in the future Trakt, MyAnimeList, AniList & so on).

Making this a separate service was an error: some services need to hook at different times of the playback (for example Trakt wants to be notified when playback starts, is paused/resumed and finishes). The current way also makes it impossible to report errors to the client. Integrating it into the backend directly would make this way easier.

Open questions

I'm still undecided about some things:

Should we keep Meilisearch as a search backend, or can postgres do that for us?

This was discussed in #420 and I think meilisearch is a great way to solve search, but I'm open to reconsidering this if we can have similar results w/ postgres only. Side note: one of the most highly rated under-consideration features of their roadmap is a recommendation system.

Should we use both RabbitMQ & Redis?

I plan on adding Redis (probably via valkey) soon for #579, distributing the transcoder's lock and the scanner's cache. I know redis can be used as a message queue, should we simply use redis for everything?

zoriya avatar Aug 14 '24 13:08 zoriya

Here is a draft of the new database schema:

[image: draft of the new database schema]

I'll open a PR with it once I get some more work in it.

zoriya avatar Aug 14 '24 13:08 zoriya

Would be happy to help you with all this!

Arthi-chaud avatar Aug 14 '24 13:08 Arthi-chaud

Going into the next major version, it would be worth considering moving kyoo to a GitHub organization and moving each of the microservices into projects of their own. Outside of just organizing differently, the next priority would be ensuring a sane development experience.

acelinkio avatar Aug 14 '24 16:08 acelinkio

I think for a small team a multi-repo setup is worse for DX.

Having a single repo means a single issue tracker which is a definite +

It's also possible to do PRs impacting multiple services instead of having two/three and jumping between repo/pr to get the whole context.

zoriya avatar Aug 14 '24 17:08 zoriya

To give a small update: I've started working on the auth service. I decided to do it in golang instead of gleam; gleam feels too early for that. (branch is feat/auth)

I'll continue working on it and make a PR with the api's spec in the next week.

zoriya avatar Aug 28 '24 22:08 zoriya

With regard to the DB, I suggest you take a look at EdgeDB. Their demo dataset is actually a movie dataset 😄 I'm a big fan of graph representation of data, and am currently developing an application using EdgeDB. I'm not a developer myself, but my dev team has enjoyed it so far.

thinkbig1979 avatar Sep 06 '24 15:09 thinkbig1979

Wouldn't OneOf be the answer to your complaint about union types, rather than completely rewriting the backend just for this?
It could then easily be replaced in the (far) future when official unions are implemented.

K3UL avatar Sep 29 '24 13:09 K3UL

I looked at OneOf before using interfaces for types like Episode | Movie, but it lacked the tooling and support to make it worth using.

The database and most types of the backend need to be rewritten either way, to support #87, #282, #463 or #549. Elysia, which I plan to use for the v5, also has great websocket support.

zoriya avatar Sep 30 '24 10:09 zoriya

I don't know if you intend to add JWT validation to each microservice, but if so it might be worth using a proxy like Ory Oathkeeper (the docker images are very light). It can work with websockets too. -> https://www.ory.sh/docs/oathkeeper

Their ecosystem is very robust: https://www.ory.sh/docs/ecosystem/projects

felipemarinho97 avatar Nov 11 '24 18:11 felipemarinho97

The auth spec/service is already available on master here.

zoriya avatar Nov 11 '24 20:11 zoriya

Making this a separate service was an error: some services need to hook at different times of the playback (for example Trakt wants to be notified when playback starts, is paused/resumed and finishes).

I don't see it as a mistake, it seems much more elegant to have it as a separate service. Why not emit the playback events on rabbitMQ and listen for them in autosync? The same events could be used to generate statistics or do anything else for other microservices.

The current way also makes it impossible to report errors to the client. Integrating it to the backend directly would make this way easier.

You can use a dead-letter-queue for errors.

felipemarinho97 avatar Nov 15 '24 17:11 felipemarinho97

We want to be able to report errors synchronously to clients (if they use an http call to update the playback status, we need to answer in this http response that syncing to an external service failed). Using a queue & waiting for the separate service is just wasteful.

Playback events would anyway be sent to a queue & over websockets for listeners, so we will not lose the ability to use an external service if needed.

Overall, I think the autosync service is a lot of abstraction & communications for what should be a super simple thing.

zoriya avatar Nov 15 '24 17:11 zoriya

Just merged #680! It's not perfect, but the base is good enough. Now future PRs will be about:

  • [x] run tests & formatter in the ci -> #782
  • [x] about movies
    • [x] better translations handling -> #780
      • [x] preferOriginal parameter for images
      • [x] add tests
    • [x] add with parameter to /movies/id to allow response to contain:
      • [x] all translations
      • [x] all videos -> #834
    • [x] allow items to be sorted by random (w/ a seed so we can paginate) or get a single random item with /movies/random (also reserve the random slug) -> #771
    • [x] add a isAvailable bool in the movie's response to know if a video for this movie is available
    • [x] add a search route (or add a search query in the /movies, tbd) -> #772
    • [x] add all indexes (sort indexes & filter ones)
  • [x] Series metadata handling (series, seasons, episodes, extra...)
    • [x] Get seasons
    • [x] Get entries -> #809
    • [x] Get extras
    • [x] Register a series + all its metadata -> #797
    • [x] Add episodeCount + availableCount or something like that. -> #831
    • [x] Add isAvailable in entries -> #830
    • [x] with firstEpisode -> #841
    • [x] original title (also in romaji/latin if the writing system is different) -> #833
  • [x] Show meta-type:
    • [x] Get series -> #822
    • [x] Get movies
    • [x] Get collections -> #822
    • [x] /shows (group of all movies/series/collections -> the new /items) -> #823
      • [x] With #296 as user settings
  • [x] User specific information (watch-status, history)
    • [x] history -> #881
    • [x] /series/:id?with=nextEpisode -> #843
    • [x] /watchlists/me, /watchlist/:user?status=completed -> #874
    • [x] continue watching list -> #883
    • [x] add watch progress (entries count, percentage of movies/episodes) -> #843
    • [x] user settings in jwt -> #882
  • [x] Images handling -> #846
    • [x] Download images
    • [x] Scale images (low/medium/high)
    • [x] Calculate blurhash
    • [x] Serve images via id -> #852
    • [x] Serve image via /movie/id/poster & co (& support language header here)
  • [x] Mixed routes
    • [x] shows -> #823
    • [x] news -> #839
  • [x] Other metadata
    • [x] Collections -> #821
    • [x] Studios -> #824
    • [x] Staff -> #835
  • [x] Integrate with the auth service & find a way to test this -> #857
    • [x] Create profile table in api for fk
    • [x] Protect every routes -> #873
    • [x] Handle guest accounts (the auth could handle that for us)
    • [x] Merge swaggers from auth & api
    • [x] Fix auth jwt/opaque handling in public routes -> #872
  • [x] Scanner's workflow (see #678 for the spec)
    • [x] Create videos
      • [x] Auto link them if the guess is known
      • [x] Push event for scanner for others
    • [x] Delete videos
      • [ ] maybe add a buffer time for disk unavailable or things like that TBD

(this is more or less sorted by order of priority)
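On the "sorted by random (w/ a seed so we can paginate)" item in the list above: the seed matters because each page is a separate request, so the shuffle has to be reproducible. Below is an in-memory sketch of the idea, not what #771 does (a real implementation would most likely order in SQL). It uses the well-known mulberry32 PRNG with a Fisher-Yates shuffle; the function names are mine.

```typescript
// mulberry32: a small deterministic PRNG, returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the seeded PRNG: same seed, same order.
function seededShuffle<T>(items: readonly T[], seed: number): T[] {
  const out = [...items];
  const rand = mulberry32(seed);
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}

// Each page request recomputes the same order from the seed, then slices.
function randomPage<T>(items: readonly T[], seed: number, page: number, limit: number): T[] {
  return seededShuffle(items, seed).slice(page * limit, (page + 1) * limit);
}
```

As long as the client sends the same seed with every page, no item is skipped or repeated across pages; a new seed gives a new order.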

If anyone wants to help with any of those, feel free to drop a message here or on discord!

zoriya avatar Jan 10 '25 11:01 zoriya

Hey, are you still working on v5.0?

wouldntyouknow avatar Nov 06 '25 12:11 wouldntyouknow

yes! work can be tracked on this new issue: https://github.com/zoriya/Kyoo/issues/968.

Most of the old v4 code has been deleted from master & last needed things to release a v5-alpha-dont-use-in-prod are getting resolved.

zoriya avatar Nov 06 '25 13:11 zoriya