bevy icon indicating copy to clipboard operation
bevy copied to clipboard

implement the full set of sort methods on QueryIter

Open Victoronz opened this issue 1 year ago • 35 comments

Objective

Currently, a query iterator can be collected into a Vec and sorted, but this can be quite unwieldy, especially when many Components are involved. The itertools crate helps somewhat, but the need to write a closure over all of QueryData can sometimes hurt ergonomics, anywhere from slightly to strongly. A key extraction function only partially helps, as sort_by_key does not allow returning non-Copy data. sort_by does not suffer from the Copy restriction, but now the user has to write out a cmp function over two QueryData::Items when it could have just been handled by the Ord impl for the key. sort requires the entire Iterator Item to be Ord, which is rarely usable without manual helper functionality. If the user wants to hide away unused components with a .. range, they need to track item tuple order across their function. Mutable QueryData can also introduce further complexity. Additionally, sometimes users solely include Components /Entity to guarantee iteration order.

For a user to write a function to abstract away repeated sorts over various QueryData types they use would require reaching for the all_tuples! macro, and continue tracking tuple order afterwards.

Fixes https://github.com/bevyengine/bevy/issues/1470.

Solution

Custom sort methods on QueryIter, which take a query lens as a generic argument, like transmute_lens in Query. This allows users to choose what part of their queries they pass to their sort function calls, serving as a kind of "key extraction function" before the sort call. F.e. allowing users to implement Ord for a Component, then call query.iter().sort::<OrdComponent>()

This works independent of mutability in QueryData, QueryData tuple order, or the underlying iter/iter_mut call. Non-Copy components could also be used this way, an internal Arc<usize> being an example. If Ord impls on components do not suffice, other sort methods can be used. Notably useful when combined with EntityRef or EntityMut. Another boon from using underlying transmute functionality, is that with the allowed transmutes, it is possible to sort a Query with Entity even if it wasn't included in the original Query. The additional generic parameter on the methods other than sort and sort_unstable currently cannot be removed due to Rust limitations, however their types can be inferred.

The new methods do not conflict with the itertools sort methods, as those use the "sorted" prefix.

This is implemented barely touching existing code. That change to existing code being that QueryIter now holds on to the reference to UnsafeWorldCell that is used to initialize it. A lens query is constructed with Entity attached at the end, sorted, and turned into an iterator. The iterator maps away the lens query, leaving only an iterator of Entity, which is used by QuerySortedIter to retrieve the actual items. QuerySortedIter resembles a combination of QueryManyIter and QueryIter, but it uses an entity list that is guaranteed to contain unique entities, and implements ExactSizeIterator, DoubleEndedIterator, FusedIterator regardless of mutability or filter kind (archetypal/non-archetypal).

The sort methods are not allowed to be called after next, and will panic otherwise. This is checked using QueryIterationCursor state, which is unique on initialization. Empty queries are an exception to this, as they do not return any item in the first place. That is because tracking how many iterations have already passed would require regressing either normal query iteration a slight bit, or sorted iteration by a lot. Besides, that would not be the intended use of these methods.

Testing

To ensure that next being called before sort results in a panic, I added some tests. I also test that empty QueryIters do not exhibit this restriction.

The query sorts test checks for equivalence to the underlying sorts. This change requires that Query<(Entity, Entity)> remains legal, if that is not already guaranteed, which is also ensured by the aforementioned test.

Next Steps

Implement the set of sort methods for QueryManyIter as well.

  • This will mostly work the same, other than needing to return a new QuerySortedManyIter to account for iteration over lists of entities that are not guaranteed to be unique. This new query iterator will need a bit of internal restructuring to allow for double-ended mutable iteration, while not regressing read-only iteration.

The implementations for each pair of

  • sort, sort_unstable,
  • sort_by, sort_unstable_by,
  • sort_by_key, sort_by_cached_key

are the same aside from the panic message and the sort call, so they could be merged with an inner function. That would require the use of higher-ranked trait bounds on WorldQuery::Item<'1>, and is unclear to me whether it is currently doable.

Iteration in QuerySortedIter might have space for improvement. When sorting by Entity, an (Entity, Entity) lens QueryData is constructed, is that worth remedying? When table sorts are implemented, a fast path could be introduced to these sort methods.

Future Possibilities

Implementing Ord for EntityLocation might be useful. Some papercuts in ergonomics can be improved by future Rust features:

  • The additional generic parameters aside from the query lens can be removed once these features are stable: impl Fn (impl Trait for Fn traits) Fn -> impl Trait (impl Trait in Fn trait return position)
  • With type parameter defaults, the query lens generic can be defaulted to QueryData::Item, allowing the sort methods to look and behave like slice::sort when no query lens is specified.
  • With TAIT, the iterator generic on QuerySortedIter and thus the huge visible impl Iterator type in the sort function signatures can be removed.
  • With specialization, the bound on L could be relaxed to QueryData when the underlying iterator is mutable.

Changelog

Added sort, sort_unstable, sort_by, sort_unstable_by, sort_by_key, sort_by_cached_key to QueryIter.

Victoronz avatar May 17 '24 20:05 Victoronz

I changed:

  • the bound on L from QueryData to ReadOnlyQueryData
  • removed filter from QuerySortedIter iteration, which made the internal fetch_next infallible.
  • added a test that checks whether the QueryIter sorts and the std::slice sorts are equal.

Victoronz avatar May 18 '24 10:05 Victoronz

(forgot cargo fmt)

Victoronz avatar May 18 '24 10:05 Victoronz

#[allow(clippy::unnecessary_sort_by)] on the new test

Victoronz avatar May 18 '24 10:05 Victoronz

I'm really proud of it! As soon as it clicked how satisfying this extension would be, I just had to implement it. And it'll become even cleaner with future Rust features! (granted, they'll take time to cook 😄)

Victoronz avatar May 18 '24 20:05 Victoronz

Are these not orthogonal concerns? I agree that table sorts will be very useful, and they can be used to early out of these sort calls, but I do not see why either feature would need to block the other. Maybe in addition to these impls being located on QueryIter, it could be helpful to explicitly note that these do not sort the underlying tables, just the iterator.

Cases like under the query-sorts test are the simplest they can get.

I am personally working on a game-like project that has to sync tree-like state with an external binary, in which I use a lot of sorts to guarantee iteration order. This is currently rather painful as they are mostly the same actions, but cannot be abstracted away by a function over query types across systems from the user side.

Victoronz avatar May 18 '24 23:05 Victoronz

I do agree that it doesn't fully address the concerns of commenters under the issue #1470. Should this issue remain, or a new one be opened?

Victoronz avatar May 18 '24 23:05 Victoronz

I agree that table sorts will be very useful, and they can be used to early out of these sort calls, but I do not see why either feature would need to block the other.

It wouldn't block it, just make a future PR & migration guide more annoying. Currently the sort is done by the iterator not the query. If you want to store the result of a sort to early out the next time a system is called you have to store it somewhere (the QueryState is the best candidate for this). Which means this would break:

fn sys(q: Query<(&A, &B)>) {
    for (a, b) in q.iter().sort::<&A>() {
        // ..
    }
    for (a, b) in q.iter().sort::<&B>() {
        // ..
    }
}

In some surprising ways. Additionally putting the sort on the query rather than the iterator would mean you get mut iteration of sorted queries with less headaches.

iiYese avatar May 18 '24 23:05 iiYese

I do agree that it doesn't fully address the concerns of commenters under the issue #1470. Should this issue remain, or a new one be opened?

IMO this will be clearer as follow-ups.

alice-i-cecile avatar May 18 '24 23:05 alice-i-cecile

The problem with sort methods on anything other than the query iterators is that they would explode in API surface.

  • There are 6 standard sort kinds.
  • AFAIK being generic over mutability here is possible exactly because a different type from Query/QueryState is used. This might be a Rust limitation or lack of my own knowledge, but I am pretty certain that the "ReadOnly::Item" and WorldQuery::Item split prevents unification. This split does not exist at the iterator level. This makes for a non_mut/mut method for each sort.
  • The sort API is not complete unless iter_many has its own variants, they need their own return type. That makes for an additional 6.

That makes for a total of 24 sort methods, which is completely unnecessary IMO. For reference, Query currently has 32 non-trait methods, and QueryState 42.

Could you elaborate on mutable iteration being less of a headache with sort methods not on the iterators? I am not sure what you referring to here, as mutable iteration is completely unrestricted with these sorts.

Victoronz avatar May 18 '24 23:05 Victoronz

The problem with sort methods on anything other than the query iterators is that they would explode in API surface

Isn't the opposite the case? You need to make 6 sorts for each iterator type. If you instead have Query::srot, Query::sort_by_key, etc.. & store the sorted vec of entities/table ranges in the query state that you just check for in each iterator type you only have 6 new functions.

AFAIK being generic over mutability here is possible exactly because a different type

You're going to have to take the Query/QueryState as mut with this method anyway & anything but readonly items in a sort function sounds like a code smell. None of the standard library sort functions allow this afaik.

Could you elaborate on mutable iteration being less of a headache with sort methods not on the iterators

As described. You don't need extra mut iterators multiplied by each iteration type.

iiYese avatar May 19 '24 00:05 iiYese

We could use a trait for sorting (ala iter_tools) to manage some of that complexity. But I'm more than happy to push that to future work. This is useful and pleasant, and gathering feedback from users + performance benchmarks is a more reasonable step than trying to engineer this further in this PR.

I definitely think that some of the ideas above (especially caching!) have value, but I don't see a reason to block on them.

alice-i-cecile avatar May 19 '24 00:05 alice-i-cecile

The problem with sort methods on anything other than the query iterators is that they would explode in API surface

Isn't the opposite the case? You need to make 6 sorts for each iterator type. If you instead have Query::srot, Query::sort_by_key, etc.. & store the sorted vec of entities/table ranges in the query state that you just check for in each iterator type you only have 6 new functions.

From the perspective of the user looking at one type at a time, they would still only be six sorts, not 12 "different" ones.

Hmm, I get the feeling we are imagining different sets of use cases/APIs here. I think we should separate sorting the underlying data storage vs sorting an iteration over such storage. One case will affect all future iteration, one is local to one call. For mutation of storage vs an iteration over it to happen at the same time would conflate purposes. Part of the intent here is f.e. to be able to call iter().sort_* multiple times in the same system! A user does not always want to touch persisting state. Your description would also require regressing unsorted iteration, which is a much larger use case. This change is as minimal as can be to existing code, if there really should be a redesign, then we can go for that later down the line! That would be much less invasive IMO.

AFAIK being generic over mutability here is possible exactly because a different type

You're going to have to take the Query/QueryState as mut with this method anyway & anything but readonly items in a sort function sounds like a code smell. None of the standard library sort functions allow this afaik.

Keep in mind that the sort function itself does not mutate anything, it works with read-only data. The mutability I was talking about was for the final returned WorldQuery::Items the user actually receives.

Victoronz avatar May 19 '24 00:05 Victoronz

Hmm, I get the feeling we are imagining different sets of use cases/APIs here. I think we should separate sorting the underlying data storage vs sorting an iteration over such storage.

Yes everything I've described doesn't touch the table storage.

fn sys(mut q: Query<(&mut A, &B)>) {
    // Doesn't change tables
    for (a, b) in q.sort::<&A>().iter() {
    }
    
    // Iters unsorted
    for (a, b) in q.iter_mut() {
    } 
    
    // Iter by different sort
    for (a, b) in q.sort::<&B>().iter() {
    }
}

One case will affect all future iteration, one is local to one call. Mutation of storage vs an iteration over it to happen at the same time would conflate purposes. Part of the intent here is f.e. to be able to call iter().sort_* multiple times in the same system!

The proposed changes would be no more expensive than what the PR currently has. They would just make the query stateful & you would have less API surface.

Your description would also require regressing unsorted iteration, which is a much larger use case

It is precisely because unsorted is disproportionately more common that this kind of branch is the kind that would likely completely get optimized out by branch predictors.

iiYese avatar May 19 '24 00:05 iiYese

To be clear, the different suggested scopes for sorts:

  • global: a table sort
  • system-level: a QueryState cached sort
  • local to call: a query iterator sort

Victoronz avatar May 19 '24 00:05 Victoronz

Your description would also require regressing unsorted iteration, which is a much larger use case

It is precisely because unsorted is disproportionately more common that this kind of branch is the kind that would likely completely get optimized out by branch predictors.

While not nice, the branch isn't the main problem here. Sorted iteration, a.k.a iteration by a list of entities, even when they are guaranteed to be unique, is inherently slower because you lose locality. Additionally, the entity iterator QuerySortedIter uses internally is up to date and guaranteed to be made of existing, sorted entities. A list of entities that is cached across runs of a system would no longer uphold this unconditionally, and need more checks.

Victoronz avatar May 19 '24 00:05 Victoronz

Sorted iteration, a.k.a iteration by a list of entities, even when they are guaranteed to be unique, is inherently slower because you lose locality .. is up to date and guaranteed to be made of existing, sorted entities

Sorry I confused myself when I made those examples. Updated. A sort call always recreates the list & an unsorted iteration always ignores the list.

iiYese avatar May 19 '24 01:05 iiYese

No worries! That is already closer to the performance of this PR, but not there yet.

Putting the sort before the iterator means that the query iteration that happens in the sort cannot be unchecked. Whatever is called first on Query will need to perform the checks that ensure valid iteration. Currently, sort comes strictly after iter, and as such can elide the checks iter has already performed. If both can be called first, both need internal checks.

Query currently holds a &QueryState, this would have to be turned mutable to allow for direct sort calls. Many methods on Query use and hold on to their own QueryState reference, which would make sort calls impossible during their lifetime. An example of lost functionality would be a zip between two differently sorted iterators over the same query.

Aside from the branch in the iter call, you'd need to merge the current QuerySortedIter with the current QueryIter into a single return type. I hardly believe this is doable, because the Iterator subtrait conditions differ between sorted and unsorted iterators. F.e. ExactSizeIterator is always implemented for unsorted iteration (the impl itself depends on the internal entity_iter, the entity_iters we pass always impl the trait), whether normal iteration depends on filter type (F::Archetypal). Ignoring the subtrait impls, this would add a generic and an unneeded iterator field to normal iteration, and a lot of unnecessary fields for sorted iteration. (QueryIterationCursor is quite large)

That is to say that the QuerySortedIter type or one like it can't really be avoided, because it enables mutable Iterator trait iteration of iter_many calls, after a not-yet-present entities_are_unique function call for QueryManyIter and/or QuerySortedManyIter.

In summary, the typestate pattern is indispensable here.

Both for implementation, and perceived API surface/consideration by the user.

Victoronz avatar May 19 '24 02:05 Victoronz

I'm not against commenting on performance in the docs, but I found it difficult to do without speaking of implementation details. It doesn't collect the initial iterator, nor does it exactly query for the lens key (Entity is attached to it) and while it fetches for a second time each item, neither fetch is as expensive as a normal one, and so on.

I would like to have actual performance measurements first to actually make an accurate comment on it, so maybe we could open a small issue afterwards. I'd be surprised if someone were to think that sorting is free, as sorting an iterator always involves collecting it, optionally extracting a key, sorting, if extracted keys, turn back into the actual items, and reiterating. Iterating over the sorted list of entities by itself is essentially a faster version of what iter_many does, so that method could maybe also use a perf comment.

Some words could be included on Query, where a long comment on functionality/performance already lives!

Victoronz avatar May 19 '24 13:05 Victoronz

I would like to have actual performance measurements first to actually make an accurate comment on it .. I'd be surprised if someone were to think that sorting is free

Benchmarks are irrelevant as they would just be measuring the sort impls of the std library. What you actually want to detail is the expectations users should have when using the API. They won't think it's free but might expect it to actually sort the underlying tables which is what is usually desired when this functionality is brought up in discussion.

iiYese avatar May 19 '24 14:05 iiYese

The comment could definitely be in a followup, it's just useful to have a better idea of the performance implication so people can consider their options. Especially since some usecases could have alternatives that are cheaper (having a sorted list from the previous run, having your query in-order and collecting and sorting the whole thing, etc). There might also be some usecases where the archetypes could be sorted instead and it's not clear to the enduser if it then sorts archetypes or entities.

NiseVoid avatar May 19 '24 14:05 NiseVoid

I'd personally think that the first sentence on each method; "Sorts all query items into a new iterator [...]" pretty clearly speaks of sorting of the iterator elements, and not the underlying query, given that the struct it is on, QueryIter is documented as "An Iterator over query results of a Query", and the iter/iter_mut methods still state iteration order is not guaranteed. I based the wording on the itertools::sorted docs. I would also be basing this on the understanding that sorting references doesn't sort the underlying data, whether mutable or not.

I think docs about the performance are still valuable, since these methods can be seen as "let me trade iteration speed for more powerful guarantees". I am also trying not to bloat the docs for these methods too much, I've cut more comments on uses for these than is currently documented, and think the extra docs might better live in a different place, i.e. Query An example of a use I have not yet documented or mentioned: a user can call iter().sort::<()>(), a "no-op sort" to get an iterator with more Iterator subtraits than QueryIter.

The docs can still be adjusted afterwards, and I'd rather base that on user feedback than more speculation in advance.

Victoronz avatar May 19 '24 15:05 Victoronz

Okay. The suggested changes are more significant than I had scoped out to be. Happy for this to be merged after investigating if the Fn types can be turned into impl Fn to avoid the explicit type elision.

You were right! I was able to turn the Fn generic into an impl Fn, it works both on stable and nightly. That makes me wonder one thing: Why doesn't the standard library do this? Their slice::sort_* have the same generics, aside from L of course. Usually, those generics don't have to be specified by the user, so maybe it is preferable for them to keep the slight bit of additional expressiveness of keeping the generics?

For us, it makes more sense to elide them though. For the sort_by_key methods, the K generic remains.

Victoronz avatar May 19 '24 16:05 Victoronz

Why doesn't the standard library do this? Their slice::sort_* have the same generics, aside from L of course. Usually, those generics don't have to be specified by the user, so maybe it is preferable for them to keep the slight bit of additional expressiveness of keeping the generics

Because that was written before opaque types were a thing. It is purely a legacy thing. Editions allow for backwards compatible breaking changes at a language level not at a library level. There is no benefit to providing the option to name the generics because

  • Fn traits cannot be impld by users for types
  • Closures are types that are unnameable

Leaving the generics there just creates API noise.

iiYese avatar May 19 '24 17:05 iiYese

oops, how how do I reopen this? (how did I close this in the first place... I am confused)

Victoronz avatar May 19 '24 17:05 Victoronz

thanks, is the PR in the correct state now? I tried to rebase on main, then add a new impl Fn version commit

Edit: It seems to be.

Victoronz avatar May 19 '24 17:05 Victoronz

Because that was written before opaque types were a thing. It is purely a legacy thing. Editions allow for backwards compatible breaking changes at a language level not at a library level. There is no benefit to providing the option to name the generics because

* `Fn` traits cannot be impld by users for types

* Closures are types that are unnameable

Leaving the generics there just creates API noise.

Since a Fn trait generic can be passed explicitly, and impl Trait cannot, I was thinking that it could be useful in cases where inference fails, though that should only happen in esoteric code. (Not relevant for this PR, just wondering)

Victoronz avatar May 19 '24 17:05 Victoronz

Sorts all query items into a new iterator [...]

To me this just tells the user they get a new iterator that has sorted query items. Since it uses another query to decide the ordering it could still apply all the weird obscure optimizations end users could think of.

I am also trying not to bloat the docs for these methods too much, I've cut more comments on uses for these than is currently documented, and think the extra docs might better live in a different place, i.e. Query

If info on sorted query performance is the same between methods it makes sense to explain it once in a section on Query, and only link to that section on the methods

The docs can still be adjusted afterwards, and I'd rather base that on user feedback than more speculation in advance.

How is this speculation? I looked at the PR, thought of usecases I've written or seen, then had to look trough a lot of implementation details to see how the performance stacks up the current approaches taken there. The only speculation is that people could assume it's free or way cheaper than it is. I also doubt we'd get meaningful feedback on docs, usually it's just "too little" or "not up-to-date enough" or something vague like that. Most doc improvements are done by observing users running into issues. The more valuable user feedback would probably also come if end users are aware of the performance implications of this feature and can comment on if it's worth it, as well as suggest possible improvements :thinking:

NiseVoid avatar May 19 '24 17:05 NiseVoid

is the PR in the correct state now?

Looks like it, but you also don't usually want to rebase once you have gotten reviews, since a lot of reviewers seem to hate that.

NiseVoid avatar May 19 '24 17:05 NiseVoid

is the PR in the correct state now?

Looks like it, but you also don't usually want to rebase once you have gotten reviews, since a lot of reviewers seem to hate that.

This is my first proper contribution to anything, so I am still trying to get used to everything there is to manage. Do you mean rebasing on main, or rebasing to update the PR commits?

Victoronz avatar May 19 '24 17:05 Victoronz

Do you mean rebasing on main, to rebasing to update the PR commits?

Rebasing your PR onto main and force pushing it. Iirc it breaks this feature: image

NiseVoid avatar May 19 '24 17:05 NiseVoid