
Optimize and improve `bs.move`, `bs.raycast` and `bs.hitbox`

Open Ethanout opened this issue 7 months ago • 10 comments

I implemented simple physics objects using #bs.move:apply_vel. Here's the performance chart after placing 6 objects, with the left side showing static object overhead and the right side showing moving objects (executing #bs.move):

[Image: performance chart]

gamecore.zip

While this performance cost is acceptable in my project, I feel the bs.move:apply_vel overhead might be disproportionately high for this: six moving objects consume as many resources as one of my major modules ^^'. After a preliminary code review, I identified several optimization opportunities (they might not significantly improve performance, but every little bit helps):

  1. Significant overhead when reading Pos for entities with multiple Passengers
  2. Too many NBT reads during passenger handling
  3. (Uncertain, requires benchmarking) Suspect modifying global entity positions before teleportation (tp) might outperform macro-based tp
  4. Certain macro/storage-based calls could potentially be replaced with scoreboard operations for better performance

Ethanout avatar May 10 '25 15:05 Ethanout

Hey! This one’s a bit tricky, actually.

  • For the first and second issues: The goal here was to compute a hitbox that includes all passengers. Entities without hitboxes are already ignored, so I’m not sure there’s much more I can do in that regard. One possible improvement could be having the hitbox module return the elevation of the riding position, which would avoid needing to grab the Y position manually (but this would really be an addition to another module). Another option might be to cache the hitbox values using the custom_data NBT component. This would eliminate the need to call the hitbox module or check passengers each time. However, it would also mean the entity is stuck with a fixed size once it's computed: no support for rescaling, baby growth, or new passengers being added later.

  • For the third issue: I’m not entirely sure either, but the main reason I’m using tp here is that the module operates in relative space, which allows it to work even near the world border (±30,000,000). If we replaced tp with an entity-based system, we'd need to rely on scaled integers. Reaching those values would significantly reduce accuracy, and any workaround would likely be more expensive than sticking with the macro.

  • For the last issue: Aside from tp, most of the macros should benefit from full caching. For example, in this case, there are only 8 possible configurations, so the macro can be fully cached and ends up being more efficient than using score checks. [Image] The only parts that might exceed the cache capacity are the start of the recursion, where I deliberately used int to minimize cache misses, and user inputs, which ideally don't change frequently anyway. That said, I might have missed a few spots where caching isn’t optimal, so if you noticed something I overlooked, I’m definitely open to checking it again.

Honestly, I’d be happy to optimize this module further, but I’m not sure how much more can realistically be done. A large part of the cost comes from the recursive section, where we need to check the full cube the entity moves through. That’s unfortunately unavoidable if we want to stick with a swept AABB, which is much more robust than a discrete approach. That’s also why we were planning a separate physics module, which I believe will end up using a simpler discrete approach instead (while also supporting OOBBs).

There’s also some overhead from moving data between modules like hitbox and move, but I’d really prefer not to duplicate hitbox logic inside the move module, as that would defeat the purpose of having a modular design.

aksiome avatar May 10 '25 16:05 aksiome

For the first point, I think teleporting the global entity to @s first (for example in overworld positioned as @s as B5-0-0-0-0-1) before getting entities' Pos might be better for certain entities (especially those with Passengers, since reading NBT processes passenger data as well).

For the second point, if we maintain the original logic but allow developers to define hitbox sizes using scoreboard scores (falling back to the original NBT hitbox reading when no scores exist), perhaps we could avoid frequent NBT reading issues?

Ethanout avatar May 10 '25 17:05 Ethanout

Yes, definitely for the second point, that’s something I was already considering. As for the first point, you’re probably right; using the marker might be a better option since it’s much faster to read.

aksiome avatar May 10 '25 17:05 aksiome

For the second point, if we maintain the original logic but allow developers to define hitbox sizes using scoreboard scores (falling back to the original NBT hitbox reading when no scores exist), perhaps we could avoid frequent NBT reading issues?

Actually, I now remember why I initially wanted to support custom scores but hesitated in the end. The main issue is that if the hitbox score is larger than the real hitbox, the entity might be missed by either the raycast or the move module. That's because these systems operate on a per-cube basis and detect entities using [dx=0]. So, for instance, if a marker is given a large hitbox through scores but the raycast or move doesn't pass through the exact cube it's actually in, it will never collide. Possible solutions:

  • We don’t implement custom scores.
  • We allow custom scores but add a warning in the documentation: the score must never exceed the entity's real hitbox size.
  • We add those custom scores but don't expose them publicly; they should always match the real entity size.
  • We expand the check to include all entities with a custom score, not just those in [dx=0]. But this could hurt performance significantly with many entities.
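
For illustration, here is a rough Python sketch of that failure mode; the data layout and the per-cube walk are hypothetical stand-ins for the actual selector-based logic:

```python
import math

def per_cube_hits(visited_cubes, entities):
    """Hypothetical per-cube check: an entity is only tested when the cube being
    checked contains its anchor position, mirroring the [dx=0] behaviour."""
    hits = []
    for cube in visited_cubes:                       # cubes traversed by the ray / move
        for entity in entities:
            anchor_cube = tuple(math.floor(c) for c in entity["pos"])
            if anchor_cube == cube:                  # anchor must be inside this cube
                hits.append(entity)                  # only then is the custom box considered
    return hits

# A marker at (5.5, 0.5, 0.5) with a 3-block custom half-width overlaps cube (4, 0, 0),
# but if the traversal only visits cubes (0,0,0) .. (4,0,0) its anchor cube (5,0,0)
# is never reached, so the collision with the custom box is missed.
```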

@Ethanout, @theogiraudet what do you think?

aksiome avatar May 13 '25 16:05 aksiome

We add those custom scores but don't expose them publicly; they should always match the real entity size.

Given the current implementation, I think this is the best choice, since it doesn't break too many things; it merely caches the entity's size in a score.

And there's another option: if it's possible to refactor the raycast or move module, I have a method to select entities with custom hitboxes in a single `as @e` (or a few such calls) using dichotomy. But in that case, a lot of logic would change, which might cause more issues; I'm not sure it's worth doing.

Ethanout avatar May 13 '25 18:05 Ethanout

I'm open to a refactor, but it depends a bit on how big the changes would be, especially if we want that for the next version. Could you explain how the selection is supposed to work?

aksiome avatar May 13 '25 18:05 aksiome

I'm open to a refactor, but it depends a bit on how big the changes would be, especially if we want that for the next version. Could you explain how the selection is supposed to work?

Here's the explanation (kinda complex :/):

First, select all the entities within a large cube; this is a rough pre-selection of the entities the ray might pass through. I think there are two good ways to do this:

[Image]

The first one might look like:

# get the pos of `positioned 0. 0. 0. ^ ^ ^max_distance` as (dx, dy, dz)
$execute as @e[dx=$(dx),dy=$(dy),dz=$(dz)] ...

The second one:

execute as @e[dx=100000,dy=100000,dz=100000]...

Second, calculate whether the ray passes through these entities. The method is derived as follows. Definitions:

  • Starting point of the ray is (0,0,0)
  • Ray: λ*(xd,yd,zd), xd^2 + yd^2 + zd^2 = 1
  • Collision box: from (x0,y0,z0) to (x1,y1,z1), with m0 < m1 for m = x, y, z

Process: Obviously, if there is a λ with x0 <= λ*xd <= x1, y0 <= λ*yd <= y1 and z0 <= λ*zd <= z1, the ray passes through the collision box. Now focus only on x, and assume xd is greater than 0:

x0/xd <= λ <= x1/xd, xd > 0

If xd is less than 0, the bounds should be swapped:

x1/xd <= λ <= x0/xd, xd < 0

For convenience, this can be written as:

bound_x0 <= λ <= bound_x1

The same holds for y and z:

bound_x0 <= λ <= bound_x1
bound_y0 <= λ <= bound_y1
bound_z0 <= λ <= bound_z1

Obviously, if max(bound_x0,bound_y0,bound_z0) <= min(bound_x1,bound_y1,bound_z1) holds true, then there must be a point on the ray that is within the collision box. So the process is:

# Starting point of the ray is (0,0,0)
get xd, yd, zd (direction unit vector)
get x0, y0, z0, x1, y1, z1 (collision box)
# Division-by-zero handling is omitted here for simplicity.
# It's OK not to handle it: if xd is 0, it can be treated as 1 (the smallest unit, maybe 0.001 after scaling)
bound_x0 = x0/xd
bound_y0 = y0/yd
bound_z0 = z0/zd
bound_x1 = x1/xd
bound_y1 = y1/yd
bound_z1 = z1/zd
if xd < 0: swap bound_x0, bound_x1
if yd < 0: swap bound_y0, bound_y1
if zd < 0: swap bound_z0, bound_z1
λ_min = max(bound_x0, bound_y0, bound_z0)
λ_max = min(bound_x1, bound_y1, bound_z1)
if λ_min <= λ_max: 
    collision occurs
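
For reference, here is a minimal Python sketch of the slab test above; the `eps` fallback for zero components is an assumption mirroring the note in the pseudocode comments:

```python
import math

def ray_aabb(direction, box_min, box_max, eps=1e-3):
    """Slab test: return the entry distance (λ_min) along the ray, or None on a miss.

    The ray starts at the origin; `direction` is (xd, yd, zd).
    """
    lam_min, lam_max = -math.inf, math.inf
    for d, lo, hi in zip(direction, box_min, box_max):
        if d == 0:
            d = eps                     # treat a zero component as the smallest unit
        b0, b1 = lo / d, hi / d
        if d < 0:
            b0, b1 = b1, b0             # swap the bounds when the direction is negative
        lam_min = max(lam_min, b0)      # latest entry across the three slabs
        lam_max = min(lam_max, b1)      # earliest exit across the three slabs
    return lam_min if lam_min <= lam_max else None
```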

The entity with the minimum λ_min is the first one hit, and similarly for the second, third, and so on. The disadvantages of this method:

  1. It is not well suited to piercing a fixed number of entities.
  2. Collision boxes need to be defined for all detected entities, which results in additional cost.
  3. The implementation is a little complicated and cannot be easily integrated into the current implementation.
  4. It might perform worse than the current implementation.

By the way, this approach is meant for more specific problems like raymarching, which is why I'm not sure whether it fits or is worth doing here.

Ethanout avatar May 16 '25 07:05 Ethanout

Except for shaped entities like paintings and item frames, this is more or less how the system already works. It calculates the relative distance between the starting coordinates and the entity coordinates. Running a single @e check at the start of the recursion is already feasible for sized entities.

The real issue, however, is illustrated in this example: in the image, the actual entity hitbox is shown in red, while the custom one is in blue. As you can see, a dx, dy, dz volume that covers the entire cube may still fail to detect that entity's custom hitbox (since the entity's real hitbox is not in the volume).

[Image: real hitbox in red, custom hitbox in blue]

The only solution I can think of is to extend the size of the check area, which aligns with what I mentioned in solution 4, but this has a noticeable impact on performance. Distance checks can be expensive depending on the size of the area being evaluated.

That said, I believe the most viable solution is to keep the current system, but at the start of the recursion, run a check for all entities with a custom hitbox score that are within a volume extended by the custom width and height of the largest possible entity. This ensures that even entities that have no real hitbox (marker, display, ...) are still detected within the dx, dy, dz range.
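
As a rough illustration of that pre-selection, here is a hedged Python sketch; the field names and the axis-aligned query volume are assumptions for the example:

```python
def padded_candidates(volume_min, volume_max, entities):
    """Pre-select entities whose custom hitbox could reach into the query volume."""
    # Pad the volume by the largest registered custom half-extent, so entities
    # with no real hitbox (marker, display, ...) whose anchor lies just outside
    # the dx/dy/dz range are still picked up as candidates.
    pad = max((e["custom_half_extent"] for e in entities), default=0.0)
    lo = tuple(v - pad for v in volume_min)
    hi = tuple(v + pad for v in volume_max)
    return [e for e in entities
            if all(l <= p <= h for p, l, h in zip(e["pos"], lo, hi))]
```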

As for the piercing behavior, I think it should be possible to make it work even with this method.

Edit: actually, in the move module, switching to a volume check for all entities could improve performance, and it would be a really simple refactor. The raycast is the tricky one here, with piercing and the fact that it works a bit differently.

aksiome avatar May 16 '25 09:05 aksiome

Extending the size of the check area usually won't add much cost, I think, since entities are first selected by chunk and then by actual position IIRC, unless all of them overlap one another. As for reading the maximum collision box of all entities at the start, I don't know how that performs. :d

What my solution means is that, by calculating once at the beginning, we can obtain the lambda values at which the ray enters the collision box of (almost) all entities, so there will be fewer @e selections.

In addition, I haven't read the part handling entities, but this might be useful: a collision between moving AABBs can be transformed into a collision between a ray and an AABB by extending one and shrinking the other:

  1. Add the stationary AABB's half-widths (dx, dy, dz) to the moving AABB's half-widths to get new half-widths (Dx, Dy, Dz).
  2. Use the moving velocity vector (vx, vy, vz) as the ray direction.
  3. Calculate the ray-AABB collision.
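
For illustration, here is a hedged Python sketch of that transformation, reusing the `ray_aabb` helper sketched earlier; the center/half-width representation is an assumption for the example:

```python
def swept_aabb_vs_aabb(moving_center, moving_half, velocity, static_center, static_half):
    """Reduce a moving AABB vs static AABB test to a ray against an expanded AABB."""
    # 1. Grow the static box by the moving box's half-widths (Dx, Dy, Dz).
    expanded_min = tuple(c - s - m for c, s, m in zip(static_center, static_half, moving_half))
    expanded_max = tuple(c + s + m for c, s, m in zip(static_center, static_half, moving_half))
    # 2. Cast a ray from the moving box's center along its per-tick velocity.
    #    ray_aabb assumes the ray starts at the origin, so shift the box first.
    box_min = tuple(lo - o for lo, o in zip(expanded_min, moving_center))
    box_max = tuple(hi - o for hi, o in zip(expanded_max, moving_center))
    # 3. A hit with 0 <= t <= 1 means the boxes collide during this tick.
    #    (Boxes already overlapping at t = 0 give a negative entry time and would
    #    need a separate check, omitted here for brevity.)
    t = ray_aabb(velocity, box_min, box_max)
    return t if t is not None and 0 <= t <= 1 else None
```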

Ethanout avatar May 16 '25 10:05 Ethanout

In addition, I haven't read the part handling entities, but this might be useful: a collision between moving AABBs can be transformed into a collision between a ray and an AABB by extending one and shrinking the other:

That’s actually exactly how it works right now, for both blocks and entities. The move module computes a volume to traverse the block grid and uses a swept AABB for collision detection. I used this approach to avoid a purely discrete check and instead rely on a continuous method, which helps prevent tunneling issues where fast-moving objects can skip past thin obstacles between frames.

Because of that, it should be pretty straightforward to adapt. I just looked at the code, and the raycasting logic doesn’t look too bad either. We’d mainly just need to adjust how collisions are returned so that we always pick the smallest t value (the closest intersection point).

And you must be right that adding the size of the biggest entity to the volume shouldn't be that bad, especially since we should gain a bit of performance in the recursive part.

EDIT: Honestly, now that I think about it more, I believe we could just use one large check that should cover almost all cases. Something like this:

$execute positioned ^ ^ ^12 as @n[type=!$(ignored_entities),tag=$(entities),tag=!bs.move.omit,distance=..24,limit=2147483647] 

This should be sufficient in nearly every scenario, unless someone creates an entity larger than an entire chunk and moves it at an extreme speed (like more than 12 blocks per tick), which is pretty unlikely and kind of a wild edge case anyway.

Also, there are ways to return early by checking an entity's size and distance against the current nearest collision point.
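
A rough Python sketch of that idea, again reusing the earlier `ray_aabb` helper; the per-entity fields are hypothetical:

```python
def nearest_entity_hit(direction, entities):
    """Return (entity, t) for the closest intersection along the ray, or None."""
    best = None
    for e in entities:
        # Early skip: this entity cannot beat the current nearest hit if even the
        # near side of its bounding volume lies beyond it.
        lower_bound = e["distance_to_origin"] - e["bounding_radius"]
        if best is not None and lower_bound > best[1]:
            continue
        t = ray_aabb(direction, e["box_min"], e["box_max"])
        if t is not None and t >= 0 and (best is None or t < best[1]):
            best = (e, t)
    return best
```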

aksiome avatar May 16 '25 16:05 aksiome