icu4x
icu4x copied to clipboard
IterableDataProvider::supported_locales is a misnomer
Now that we have a more well-established notion of a "supported locale" (#58), we should rename this trait function.
I suggest: iter_locales, and maybe make it return an iterator (if not too hard).
I believe this should be
fn supported_requests(&self) -> Result<HashSet<DataRequest>, DataError>
This is because after moving key attributes to the request, supported-locales is not enough to enumerate all supported requests. And if we're using the "requests" phrasing, this will not clash with the notion of supported locales. Also, for performance, I'd like this to return a HashSet instead of an impl Iterator, like we already do internally in datagen.
I see you implemented this in #4981
I'm still a little worried about calling it "supported", since the set of supported requests is not finite, but it is not as bad as "supported locales". Calling it "iter" takes away any notion of the semantics of the return value.
I made the change from locales to requests, as the PR requires that, and from Vec to HashSet, as that gets rid of a hash-set-to-vec conversion in icu_datagen. I have not made a final decision on the name.
Yes, those two changes are good. I would say to use BTreeSet since it is in alloc but we can't do that easily because we don't have Ord, and HashSet is probably going to be a bit faster anyway since there could be hundreds or thousands of entries. Just need to bikeshed the name.
As part of fixing this, we should make the value of the HashSet be something that can implement Borrow to a fully borrowed type so that lookups can be cheaper. https://github.com/unicode-org/icu4x/pull/4981#discussion_r1622838917
I suggested up there that icu_provider should export a helper struct called something like LocaleWithAttributes that has all the impls we need.
Such a struct could be a field of DataRequest, but I am not proposing that.
^ If we use this name, the function can be called iter_locales_with_attributes. Problem solved.
Such a struct could be a field of DataRequest, but I am not proposing that.
No, because we use the fact that locale and attributes can be independently borrowed in fallback.
It would be another public exhaustive struct so things could still be borrowed separately
A struct over two borrowed values cannot be borrowed as a struct over the owned values. The pair we're currently using is basically a struct, so the problem is not solved by having a struct.
Re name, what about allowed_requests?
Maybe just make them both cows?
pub struct LocaleAndAttributes<'a, 'b> {
pub locale: Cow<'a, DataLocale>, // or LanguageIdentifier
pub key_attributes: Cow<'b, KeyAttributes>, // or KeyAttributes<'b>
}
I'd rather have the indirection than all the extra cloning.
Re name, what about
allowed_requests?
Maybe but I'm not sure "allowed" is much better than "supported". They both make it sound like these are the only requests you can pass in, but you can pass in any request.
Yes I'm using Cows in #4995. This API can stay the same.
Preliminary discussion:
struct DataIdentityOwned { pub locale: DataLocale, pub marker_attributes: DataMarkerAttributes }
struct DataIdentityBorrowed<'a> { pub locale: &'a DataLocale, pub marker_attributes: &'a DataMarkerAttributes }
struct DataRequest<'a> { id: DataIdentityBorrowed<'a>, metadata: DataRequestMetadata }
/// A [`DataProvider`] that can iterate over all supported [`DataLocale`] for a certain marker.
///
/// Implementing this trait means that a data provider knows all of the data it can successfully
/// return from a load request.
pub trait IterableDataProvider<M: DataMarker>: DataProvider<M> {
/// Returns a list of supported `DataRequests`, in owned form.
fn iter_requests(&self) -> Result<BTreeSet<DataIdentityOwned>, DataError>;
}
pub trait Bikeshed<M: DataMarker>: DataProvider<M> {
fn can_load(&self, req: DataRequest) -> Result<bool, DataError>;
}
check_reqis internal toDatagenProvider, sosupports_requestdoes not need to be part of the iter trait. The iter trait's invariant should be that it returns the tight set of requests, which is whatcheck_reqinternally does
@robertbastian What is left on this issue?
Nothing.