implement the full set of sort methods on QueryIter
Objective
Currently, a query iterator can be collected into a Vec and sorted, but this can be quite unwieldy, especially when many Components are involved. The itertools crate helps somewhat, but the need to write a closure over all of QueryData
can sometimes hurt ergonomics, anywhere from slightly to strongly. A key extraction function only partially helps, as sort_by_key does not allow returning non-Copy data. sort_by does not suffer from the Copy restriction, but now the user has to write out a cmp function over two QueryData::Items when it could have just been handled by the Ord impl for the key.
sort requires the entire Iterator Item to be Ord, which is rarely usable without manual helper functionality. If the user wants to hide away unused components with a .. range, they need to track item tuple order across their function. Mutable QueryData can also introduce further complexity.
Additionally, sometimes users solely include Components /Entity to guarantee iteration order.
For a user to write a function to abstract away repeated sorts over various QueryData types they use would require reaching for the all_tuples! macro, and continue tracking tuple order afterwards.
Fixes https://github.com/bevyengine/bevy/issues/1470.
Solution
Custom sort methods on QueryIter, which take a query lens as a generic argument, like transmute_lens in Query.
This allows users to choose what part of their queries they pass to their sort function calls, serving as a kind of "key extraction function" before the sort call. F.e. allowing users to implement Ord for a Component, then call query.iter().sort::<OrdComponent>()
This works independent of mutability in QueryData, QueryData tuple order, or the underlying iter/iter_mut call.
Non-Copy components could also be used this way, an internal Arc<usize> being an example.
If Ord impls on components do not suffice, other sort methods can be used. Notably useful when combined with EntityRef or EntityMut.
Another boon from using underlying transmute functionality, is that with the allowed transmutes, it is possible to sort a Query with Entity even if it wasn't included in the original Query.
The additional generic parameter on the methods other than sort and sort_unstable currently cannot be removed due to Rust limitations, however their types can be inferred.
The new methods do not conflict with the itertools sort methods, as those use the "sorted" prefix.
This is implemented barely touching existing code. That change to existing code being that QueryIter now holds on to the reference to UnsafeWorldCell that is used to initialize it.
A lens query is constructed with Entity attached at the end, sorted, and turned into an iterator. The iterator maps away the lens query, leaving only an iterator of Entity, which is used by QuerySortedIter to retrieve the actual items.
QuerySortedIter resembles a combination of QueryManyIter and QueryIter, but it uses an entity list that is guaranteed to contain unique entities, and implements ExactSizeIterator, DoubleEndedIterator, FusedIterator regardless of mutability or filter kind (archetypal/non-archetypal).
The sort methods are not allowed to be called after next, and will panic otherwise. This is checked using QueryIterationCursor state, which is unique on initialization. Empty queries are an exception to this, as they do not return any item in the first place.
That is because tracking how many iterations have already passed would require regressing either normal query iteration a slight bit, or sorted iteration by a lot. Besides, that would not be the intended use of these methods.
Testing
To ensure that next being called before sort results in a panic, I added some tests. I also test that empty QueryIters do not exhibit this restriction.
The query sorts test checks for equivalence to the underlying sorts.
This change requires that Query<(Entity, Entity)> remains legal, if that is not already guaranteed, which is also ensured by the aforementioned test.
Next Steps
Implement the set of sort methods for QueryManyIter as well.
- This will mostly work the same, other than needing to return a new
QuerySortedManyIterto account for iteration over lists of entities that are not guaranteed to be unique. This new query iterator will need a bit of internal restructuring to allow for double-ended mutable iteration, while not regressing read-only iteration.
The implementations for each pair of
sort,sort_unstable,sort_by, sort_unstable_by,sort_by_key,sort_by_cached_key
are the same aside from the panic message and the sort call, so they could be merged with an inner function.
That would require the use of higher-ranked trait bounds on WorldQuery::Item<'1>, and is unclear to me whether it is currently doable.
Iteration in QuerySortedIter might have space for improvement.
When sorting by Entity, an (Entity, Entity) lens QueryData is constructed, is that worth remedying?
When table sorts are implemented, a fast path could be introduced to these sort methods.
Future Possibilities
Implementing Ord for EntityLocation might be useful.
Some papercuts in ergonomics can be improved by future Rust features:
- The additional generic parameters aside from the query lens can be removed once these features are stable:
impl Fn(impl TraitforFntraits)Fn -> impl Trait(impl TraitinFntrait return position) - With type parameter defaults, the query lens generic can be defaulted to
QueryData::Item, allowing the sort methods to look and behave likeslice::sortwhen no query lens is specified. - With TAIT, the iterator generic on
QuerySortedIterand thus the huge visibleimpl Iteratortype in the sort function signatures can be removed. - With specialization, the bound on
Lcould be relaxed toQueryDatawhen the underlying iterator is mutable.
Changelog
Added sort, sort_unstable, sort_by, sort_unstable_by, sort_by_key, sort_by_cached_key to QueryIter.
I changed:
- the bound on
LfromQueryDatatoReadOnlyQueryData - removed
filterfromQuerySortedIteriteration, which made the internalfetch_nextinfallible. - added a test that checks whether the
QueryItersorts and thestd::slicesorts are equal.
(forgot cargo fmt)
#[allow(clippy::unnecessary_sort_by)] on the new test
I'm really proud of it! As soon as it clicked how satisfying this extension would be, I just had to implement it. And it'll become even cleaner with future Rust features! (granted, they'll take time to cook 😄)
Are these not orthogonal concerns?
I agree that table sorts will be very useful, and they can be used to early out of these sort calls, but I do not see why either feature would need to block the other.
Maybe in addition to these impls being located on QueryIter, it could be helpful to explicitly note that these do not sort the underlying tables, just the iterator.
Cases like under the query-sorts test are the simplest they can get.
I am personally working on a game-like project that has to sync tree-like state with an external binary, in which I use a lot of sorts to guarantee iteration order. This is currently rather painful as they are mostly the same actions, but cannot be abstracted away by a function over query types across systems from the user side.
I do agree that it doesn't fully address the concerns of commenters under the issue #1470. Should this issue remain, or a new one be opened?
I agree that table sorts will be very useful, and they can be used to early out of these sort calls, but I do not see why either feature would need to block the other.
It wouldn't block it, just make a future PR & migration guide more annoying. Currently the sort is done by the iterator not the query. If you want to store the result of a sort to early out the next time a system is called you have to store it somewhere (the QueryState is the best candidate for this). Which means this would break:
fn sys(q: Query<(&A, &B)>) {
for (a, b) in q.iter().sort::<&A>() {
// ..
}
for (a, b) in q.iter().sort::<&B>() {
// ..
}
}
In some surprising ways. Additionally putting the sort on the query rather than the iterator would mean you get mut iteration of sorted queries with less headaches.
I do agree that it doesn't fully address the concerns of commenters under the issue #1470. Should this issue remain, or a new one be opened?
IMO this will be clearer as follow-ups.
The problem with sort methods on anything other than the query iterators is that they would explode in API surface.
- There are 6 standard sort kinds.
- AFAIK being generic over mutability here is possible exactly because a different type from Query/QueryState is used. This
might be a Rust limitation or lack of my own knowledge, but I am pretty certain that the "
ReadOnly::Item" andWorldQuery::Itemsplit prevents unification. This split does not exist at the iterator level. This makes for a non_mut/mut method for each sort. - The sort API is not complete unless
iter_manyhas its own variants, they need their own return type. That makes for an additional 6.
That makes for a total of 24 sort methods, which is completely unnecessary IMO. For reference, Query currently has 32 non-trait methods, and QueryState 42.
Could you elaborate on mutable iteration being less of a headache with sort methods not on the iterators? I am not sure what you referring to here, as mutable iteration is completely unrestricted with these sorts.
The problem with sort methods on anything other than the query iterators is that they would explode in API surface
Isn't the opposite the case? You need to make 6 sorts for each iterator type. If you instead have Query::srot, Query::sort_by_key, etc.. & store the sorted vec of entities/table ranges in the query state that you just check for in each iterator type you only have 6 new functions.
AFAIK being generic over mutability here is possible exactly because a different type
You're going to have to take the Query/QueryState as mut with this method anyway & anything but readonly items in a sort function sounds like a code smell. None of the standard library sort functions allow this afaik.
Could you elaborate on mutable iteration being less of a headache with sort methods not on the iterators
As described. You don't need extra mut iterators multiplied by each iteration type.
We could use a trait for sorting (ala iter_tools) to manage some of that complexity. But I'm more than happy to push that to future work. This is useful and pleasant, and gathering feedback from users + performance benchmarks is a more reasonable step than trying to engineer this further in this PR.
I definitely think that some of the ideas above (especially caching!) have value, but I don't see a reason to block on them.
The problem with sort methods on anything other than the query iterators is that they would explode in API surface
Isn't the opposite the case? You need to make 6 sorts for each iterator type. If you instead have Query::srot, Query::sort_by_key, etc.. & store the sorted vec of entities/table ranges in the query state that you just check for in each iterator type you only have 6 new functions.
From the perspective of the user looking at one type at a time, they would still only be six sorts, not 12 "different" ones.
Hmm, I get the feeling we are imagining different sets of use cases/APIs here.
I think we should separate sorting the underlying data storage vs sorting an iteration over such storage.
One case will affect all future iteration, one is local to one call.
For mutation of storage vs an iteration over it to happen at the same time would conflate purposes.
Part of the intent here is f.e. to be able to call iter().sort_* multiple times in the same system! A user does not always want to touch persisting state.
Your description would also require regressing unsorted iteration, which is a much larger use case.
This change is as minimal as can be to existing code, if there really should be a redesign, then we can go for that later down the line! That would be much less invasive IMO.
AFAIK being generic over mutability here is possible exactly because a different type
You're going to have to take the Query/QueryState as mut with this method anyway & anything but readonly items in a sort function sounds like a code smell. None of the standard library sort functions allow this afaik.
Keep in mind that the sort function itself does not mutate anything, it works with read-only data. The mutability I was talking about was for the final returned WorldQuery::Items the user actually receives.
Hmm, I get the feeling we are imagining different sets of use cases/APIs here. I think we should separate sorting the underlying data storage vs sorting an iteration over such storage.
Yes everything I've described doesn't touch the table storage.
fn sys(mut q: Query<(&mut A, &B)>) {
// Doesn't change tables
for (a, b) in q.sort::<&A>().iter() {
}
// Iters unsorted
for (a, b) in q.iter_mut() {
}
// Iter by different sort
for (a, b) in q.sort::<&B>().iter() {
}
}
One case will affect all future iteration, one is local to one call. Mutation of storage vs an iteration over it to happen at the same time would conflate purposes. Part of the intent here is f.e. to be able to call iter().sort_* multiple times in the same system!
The proposed changes would be no more expensive than what the PR currently has. They would just make the query stateful & you would have less API surface.
Your description would also require regressing unsorted iteration, which is a much larger use case
It is precisely because unsorted is disproportionately more common that this kind of branch is the kind that would likely completely get optimized out by branch predictors.
To be clear, the different suggested scopes for sorts:
- global: a table sort
- system-level: a
QueryStatecached sort - local to call: a query iterator sort
Your description would also require regressing unsorted iteration, which is a much larger use case
It is precisely because unsorted is disproportionately more common that this kind of branch is the kind that would likely completely get optimized out by branch predictors.
While not nice, the branch isn't the main problem here. Sorted iteration, a.k.a iteration by a list of entities, even when they are guaranteed to be unique, is inherently slower because you lose locality.
Additionally, the entity iterator QuerySortedIter uses internally is up to date and guaranteed to be made of existing, sorted entities.
A list of entities that is cached across runs of a system would no longer uphold this unconditionally, and need more checks.
Sorted iteration, a.k.a iteration by a list of entities, even when they are guaranteed to be unique, is inherently slower because you lose locality .. is up to date and guaranteed to be made of existing, sorted entities
Sorry I confused myself when I made those examples. Updated. A sort call always recreates the list & an unsorted iteration always ignores the list.
No worries! That is already closer to the performance of this PR, but not there yet.
Putting the sort before the iterator means that the query iteration that happens in the sort cannot be unchecked.
Whatever is called first on Query will need to perform the checks that ensure valid iteration. Currently, sort comes strictly after iter, and as such can elide the checks iter has already performed. If both can be called first, both need internal checks.
Query currently holds a &QueryState, this would have to be turned mutable to allow for direct sort calls.
Many methods on Query use and hold on to their own QueryState reference, which would make sort calls impossible during their lifetime. An example of lost functionality would be a zip between two differently sorted iterators over the same query.
Aside from the branch in the iter call, you'd need to merge the current QuerySortedIter with the current QueryIter into a single return type. I hardly believe this is doable, because the Iterator subtrait conditions differ between sorted and unsorted iterators. F.e. ExactSizeIterator is always implemented for unsorted iteration (the impl itself depends on the internal entity_iter, the entity_iters we pass always impl the trait), whether normal iteration depends on filter type (F::Archetypal).
Ignoring the subtrait impls, this would add a generic and an unneeded iterator field to normal iteration, and a lot of unnecessary fields for sorted iteration. (QueryIterationCursor is quite large)
That is to say that the QuerySortedIter type or one like it can't really be avoided, because it enables mutable Iterator trait iteration of iter_many calls, after a not-yet-present entities_are_unique function call for QueryManyIter and/or QuerySortedManyIter.
In summary, the typestate pattern is indispensable here.
Both for implementation, and perceived API surface/consideration by the user.
I'm not against commenting on performance in the docs, but I found it difficult to do without speaking of implementation details. It doesn't collect the initial iterator, nor does it exactly query for the lens key (Entity is attached to it) and while it fetches for a second time each item, neither fetch is as expensive as a normal one, and so on.
I would like to have actual performance measurements first to actually make an accurate comment on it, so maybe we could open a small issue afterwards.
I'd be surprised if someone were to think that sorting is free, as sorting an iterator always involves collecting it, optionally extracting a key, sorting, if extracted keys, turn back into the actual items, and reiterating.
Iterating over the sorted list of entities by itself is essentially a faster version of what iter_many does, so that method could maybe also use a perf comment.
Some words could be included on Query, where a long comment on functionality/performance already lives!
I would like to have actual performance measurements first to actually make an accurate comment on it .. I'd be surprised if someone were to think that sorting is free
Benchmarks are irrelevant as they would just be measuring the sort impls of the std library. What you actually want to detail is the expectations users should have when using the API. They won't think it's free but might expect it to actually sort the underlying tables which is what is usually desired when this functionality is brought up in discussion.
The comment could definitely be in a followup, it's just useful to have a better idea of the performance implication so people can consider their options. Especially since some usecases could have alternatives that are cheaper (having a sorted list from the previous run, having your query in-order and collecting and sorting the whole thing, etc). There might also be some usecases where the archetypes could be sorted instead and it's not clear to the enduser if it then sorts archetypes or entities.
I'd personally think that the first sentence on each method; "Sorts all query items into a new iterator [...]" pretty clearly speaks of sorting of the iterator elements, and not the underlying query, given that the struct it is on, QueryIter is documented as "An Iterator over query results of a Query", and the iter/iter_mut methods still state iteration order is not guaranteed.
I based the wording on the itertools::sorted docs.
I would also be basing this on the understanding that sorting references doesn't sort the underlying data, whether mutable or not.
I think docs about the performance are still valuable, since these methods can be seen as "let me trade iteration speed for more powerful guarantees".
I am also trying not to bloat the docs for these methods too much, I've cut more comments on uses for these than is currently documented, and think the extra docs might better live in a different place, i.e. Query
An example of a use I have not yet documented or mentioned: a user can call iter().sort::<()>(), a "no-op sort" to get an iterator with more Iterator subtraits than QueryIter.
The docs can still be adjusted afterwards, and I'd rather base that on user feedback than more speculation in advance.
Okay. The suggested changes are more significant than I had scoped out to be. Happy for this to be merged after investigating if the
Fntypes can be turned intoimpl Fnto avoid the explicit type elision.
You were right! I was able to turn the Fn generic into an impl Fn, it works both on stable and nightly.
That makes me wonder one thing: Why doesn't the standard library do this? Their slice::sort_* have the same generics, aside from L of course. Usually, those generics don't have to be specified by the user, so maybe it is preferable for them to keep the slight bit of additional expressiveness of keeping the generics?
For us, it makes more sense to elide them though.
For the sort_by_key methods, the K generic remains.
Why doesn't the standard library do this? Their slice::sort_* have the same generics, aside from L of course. Usually, those generics don't have to be specified by the user, so maybe it is preferable for them to keep the slight bit of additional expressiveness of keeping the generics
Because that was written before opaque types were a thing. It is purely a legacy thing. Editions allow for backwards compatible breaking changes at a language level not at a library level. There is no benefit to providing the option to name the generics because
Fntraits cannot be impld by users for types- Closures are types that are unnameable
Leaving the generics there just creates API noise.
oops, how how do I reopen this? (how did I close this in the first place... I am confused)
thanks, is the PR in the correct state now? I tried to rebase on main, then add a new impl Fn version commit
Edit: It seems to be.
Because that was written before opaque types were a thing. It is purely a legacy thing. Editions allow for backwards compatible breaking changes at a language level not at a library level. There is no benefit to providing the option to name the generics because
* `Fn` traits cannot be impld by users for types * Closures are types that are unnameableLeaving the generics there just creates API noise.
Since a Fn trait generic can be passed explicitly, and impl Trait cannot, I was thinking that it could be useful in cases where inference fails, though that should only happen in esoteric code. (Not relevant for this PR, just wondering)
Sorts all query items into a new iterator [...]
To me this just tells the user they get a new iterator that has sorted query items. Since it uses another query to decide the ordering it could still apply all the weird obscure optimizations end users could think of.
I am also trying not to bloat the docs for these methods too much, I've cut more comments on uses for these than is currently documented, and think the extra docs might better live in a different place, i.e.
Query
If info on sorted query performance is the same between methods it makes sense to explain it once in a section on Query, and only link to that section on the methods
The docs can still be adjusted afterwards, and I'd rather base that on user feedback than more speculation in advance.
How is this speculation? I looked at the PR, thought of usecases I've written or seen, then had to look trough a lot of implementation details to see how the performance stacks up the current approaches taken there. The only speculation is that people could assume it's free or way cheaper than it is. I also doubt we'd get meaningful feedback on docs, usually it's just "too little" or "not up-to-date enough" or something vague like that. Most doc improvements are done by observing users running into issues. The more valuable user feedback would probably also come if end users are aware of the performance implications of this feature and can comment on if it's worth it, as well as suggest possible improvements :thinking:
is the PR in the correct state now?
Looks like it, but you also don't usually want to rebase once you have gotten reviews, since a lot of reviewers seem to hate that.
is the PR in the correct state now?
Looks like it, but you also don't usually want to rebase once you have gotten reviews, since a lot of reviewers seem to hate that.
This is my first proper contribution to anything, so I am still trying to get used to everything there is to manage. Do you mean rebasing on main, or rebasing to update the PR commits?
Do you mean rebasing on main, to rebasing to update the PR commits?
Rebasing your PR onto main and force pushing it. Iirc it breaks this feature: