dim icon indicating copy to clipboard operation
dim copied to clipboard

feat(tmdb): new tmdb client with request coalescing

Open mental32 opened this issue 2 years ago • 11 comments

Todo

  • [x] request coalescing
  • [x] write new TMDB client
  • [x] cache eviction policy
  • [x] client-side in-flight request limiting
  • [x] proper error propagation

Notable types

struct TMDBMetadataProvider

TMDBMetadataProvider is the main type used to provide a centralized store of the response-cache, the shared reqwest Client type, and the API key.

TMDBMetadataProvider does not implement ExternalQuery or ExternalQueryShow since it is a unified cache and so the media type being searched for must be provided explicitly.

However since the ExternalQuery trait does not allow passing this in the method arguments we have another type MetadataProviderOf<K> which is used to parameterise the search type. TMDBMetadataProvider::movies and TMDBMetadataProvider::tv_shows will each produce MetadataProviderOf<Movies> and MetadataProviderOf<TvShows> respectively. and these are the types for which ExternalQuery is implemented on.

struct TMDBClient

This is a private type that is the API wrapper for the TMDB REST API, it only contains a reference to a provider instance for access of the API key, and nothing else. It has associated methods which each correspond to some kind of TMDB API endpoint/operation.

Example

let provider = TMDBMetadataProvider::new("API_KEY");

let breaking_bad/* : ExternalMedia */ = provider.tv_shows().search("breaking bad", None).await.expect("I am the one who panics.");

let el_camino/* : ExternalMedia */ = provider.movies().search("El Camino: A Breaking Bad Movie", Some(2019)).await.expect("this is the way.");

// only one http request will be made. the rest of the futures will wait on the shared result of the first one.
let _ = futures::join!(
    provider.movies().search("baby driver", None)),
    provider.movies().search("baby driver", None)),
    provider.movies().search("baby driver", None)),
    provider.movies().search("baby driver", None)),
).await;

mental32 avatar May 20 '22 23:05 mental32

Can we add some sort of in-flight request count limiting? Unless it is intended for the users to implement this on their own using something like a semaphore.

vgarleanu avatar May 22 '22 16:05 vgarleanu

Can we add some sort of in-flight request count limiting? Unless it is intended for the users to implement this on their own using something like a semaphore.

not sure what you mean with this, could you elaborate?

mental32 avatar May 22 '22 22:05 mental32

not sure what you mean with this, could you elaborate?

We dont want to end up with 10k in-flight requests to tmdb. We should provide some sort of mechanism for rate-limiting.

vgarleanu avatar May 22 '22 22:05 vgarleanu

We should limit all api consumers to something like 512 api requests at any given point in time.

vgarleanu avatar May 22 '22 22:05 vgarleanu

uh sure normally I'd say that'd be something for the user but since we're the ones using this it makes sense :smile:

pub struct RateLimitedTMDBMetadataProviderRef<K> { ... }

and re-impl the query traits for that but have a builder-style production for the type.

Also the names are starting to give me Java flashbacks, I should redo them at some point.

mental32 avatar May 22 '22 22:05 mental32

One thing to note here for this PR the main thing missing for this impl is a way to handle errors from the underlying in-flight request. The current impl will just discard the error away into None and the error type that gets yeeted back is always "timed out"

mental32 avatar May 22 '22 22:05 mental32

One thing to note here for this PR the main thing missing for this impl is a way to handle errors from the underlying in-flight request. The current impl will just discard the error away into None and the error type that gets yeeted back is always "timed out"

Yeah we should stop doing that. Tmdb could throw us a not found error that we'd need to handle server-side or client-side and we wont even know. We should propagate errors to the consumer, but ideally we should somewhat normalize them. For example tmdb could return a not found, and we'd want to turn that into a external::Error::NotFound. Other errors that we cant really handle can just be turned into an arc'd error and also passed downstream for logging.

vgarleanu avatar May 22 '22 22:05 vgarleanu

One thing to note here for this PR the main thing missing for this impl is a way to handle errors from the underlying in-flight request. The current impl will just discard the error away into None and the error type that gets yeeted back is always "timed out"

Yeah we should stop doing that. Tmdb could throw us a not found error that we'd need to handle server-side or client-side and we wont even know. We should propagate errors to the consumer, but ideally we should somewhat normalize them. For example tmdb could return a not found, and we'd want to turn that into a external::Error::NotFound. Other errors that we cant really handle can just be turned into an arc'd error and also passed downstream for logging.

will handle this in the next commit.

mental32 avatar May 22 '22 23:05 mental32

I think we can replace dashmap with https://crates.io/crates/stretto. If we do, we will also get cache eviction for free, and we could also set policies for how much memory our cache should consume at max before it starts evicting. All of this for free.

vgarleanu avatar Jul 14 '22 09:07 vgarleanu

@mental32, I talked with Val about helping with this. Any chance you can give me a status of what still needs to be done?

cobyge avatar Jul 15 '22 20:07 cobyge

@cobyge Looks like there are a couple of items that are still missing:

  • Proper error mapping and propagation
  • Fetching actors for a media id
  • Fetching seasons for an id
  • Fetching episodes for a season id
  • Cache eviction
  • In-flight request limits.

For easy cache eviction, we can switch our cache type from dashmap to stretto. For in-flight request limits we can use the governor crate.

I think it makes sense to first tackle the error propagation first by creating a couple more error variants in the top-level Error enum in external/mod.rs. When calling the search method for instance, if searching has failed for any reason, I want to get back a error type with the reason wrapped.

After this I think it makes sense to switch from dashmap to stretto and get cache eviction out of the way. The last steps would be implementing the rest of the API calls.

vgarleanu avatar Jul 21 '22 19:07 vgarleanu