Zig-zag due to unfavorable angle increments and starting angle
I am experiencing some interesting behavior when using SmacPlanner Hybrid. I am not quite sure whether this is a common mistake caused by incorrect usage, or whether it is a known limitation. I currently have no idea how to approach or solve this. If ideas pop up along this thread, I am more than willing to contribute.
So the issue is that depending on which angle-bin the start node falls into, I am getting either a straight line or a very messy zig-zag.
We have 64 angle bins. Here the start is in bin 32 (pi)

In this image the starting angle is in bin 31

I also found that due to my choice of resolution (10cm) and min_turning_radius (50cm), the turning motions of the minimal primitive set (Dubins) cover four angle bins at a time, i.e., increments=4 (see https://github.com/ros-planning/navigation2/blob/634d2e3d9b6bde5558c90fe6583d52b3ed66cf55/nav2_smac_planner/src/node_hybrid.cpp#L104)
This is actually quite bad since one will therefore never end up in odd angle bins, but only in every fourth bin, i.e., the multiples of four. This also causes the zig-zagging, because the planner just can't reach an angle bin that would head directly in the right direction. Also, as can be seen in the image, the planner has to expand WAY more nodes to make progress.
This is a distribution of the angle bins. As you can see, we only reach every fourth angle bin.

I wonder / am collecting ideas on how we could find a remedy for this zig-zagging issue. I know we could double the resolution or use a larger turning radius (>2 meters if we want to cover only a single bin), but we run into other issues if we do so. Maybe there are other ideas. I have tried adding additional turning primitives which turn much less than min_turning_radius but didn't get good results. Again, if I understand the idea / solution, I'd be more than happy to implement, test, and submit a PR.
Interesting
Have you tried more angular bins? I did notice generally in my testing that 64 was really not enough for "great" quality paths. The 72 I put as the default was really the smallest amount that worked well, and it's also representative of a meaningful quantization of 5 degrees (many works on similar topics also use that discretization; I didn't see a reason to argue with it after trying coarser samplings and not getting good results). Though, I just noticed that 64 is in the readme, but 72 is the actual default value.
We've also changed quite a bit since the increments = ceil(angle / bin_size); was added for early testing results. This was added because looping was happening in the paths to get a slightly different approach angle to get through a passageway, e.g. it would go in a loop in a circle just to approach some area on a different vector / bin. It could be that we could remove that now and it would help? I do suspect, though, that to some degree we saw that happening in exactly the zig-zag situation, where we over-expanded the search and created that looping behavior.
More thoughts / options:
- I hate the idea of asking people to change the resolution of their costmap (which should be based on other requirements) or the turning radius (based on actual vehicle limits) to fix that issue. Though, fudging the turning radius to be like 1% higher/lower could do it -- but I don't want to immortalize that into official procedure.
- Seems like the issue rests with how we generate the primitives or the number of angular bins used - that's the real source of the problem; any other changes would just be trying to work around it
- I started this trying to avoid having to deal with conflicts from primitive sizes too small to be promised to leave the current cell (i.e., $\geq \sqrt{2}$ cells). We could relax that so that the primitives can be smaller, but we would also need to add logic to deal with the conflicts when an expansion doesn't actually leave its current cell.
- We could make a requirement that the primitive length is an odd multiple of bin sizes?
- We could make a requirement that the number of angular quantizations are odd?
Some more insights:
It's not about the number of angle bins. You can increase it and still get awful results. It's more about the (I have big trouble writing down what I mean) particular regularity of possible angles that emerges in some situations.
In the following image we have a starting angle of pi and bin = 32. Due to the turning radius and num_angle_bins we have increments = 4. So: the starting bin is 32 and we can hop 4 bins at a time. This is why we get 16 (64/4) possible directions of motion (indicated by the yellow star).

If we just double num_angle_bins (128), we get exactly the same result. Here is why: turning radius and resolution remain the same. The only thing that now changes is increments, which is doubled to 8. So on a 128-bin angle resolution we now hop 8 angle bins at a time, leaving us again with exactly 16 (128/8) possible directions. In the image the plan is very neat and you can see that the first action chosen is a right turn.

Plans however get super ugly and require a long time in the following two situations:
- The motion set (the yellow star in the image) is not perfectly axis aligned.
If we choose a starting point with an odd bin number (for example bin 31 instead of 32), then there is no motion primitive that allows moving directly to the right (the yellow star has turned a bit and nothing is pointing to the right), and hence we get these zig-zag motions. Note that you can always choose a direction that doesn't fit one of the sixteen direction primitives and ... you will still get zig-zag.

- The motion set (the yellow star in the image) is not regular and has no primitives along the axis. With an odd choice of angle bins (113) you might get irregularly shaped motion sets; planning times start to spike and the plans aren't really usable.

In total I have found that (at least for our use-cases, which have many parallel roads and right-angle intersections) it suffices to make sure that the primitive set is regular. We then round the actual starting bin (e.g. 31) to the nearest of the regular ones (to the nearest yellow line of the star) and take it from there. The results are satisfying even though my heart is aching. In the end it is an awful hack.
Cheers for the suggestions.
So it sounds like just more bins and having an odd number of bins won't help (though, an odd number of bins could help, but only when making the search pattern do a full turn around :laughing: )
We then round the actual starting bin (e.g. 31) to the nearest of the regular ones (to the nearest yellow line of the star), and take it from there.
Yeah, I'm not really liking that either. That only works for your situation because the map is exactly axially aligned. If the map were off at some odd angle, then it's possible that being offset is what would generate the better paths. Plus it in no way resolves the issue that all 64 bins cannot be reached, and instead uses a far-downsampled version.
What about enforcing a constraint that the primitives must go through an odd number of angular quantizations? So that it's impossible for 4 to be an option; it has to be 3 or 5. That would increase the primitive lengths slightly (assuming we round up and not down).
The other thing I'm thinking about is:
We've also changed quite a bit since the increments = ceil(angle / bin_size); was added for early testing results. This was added because looping was happening in the paths to get a slightly different approach angle to get through a passageway, e.g. it would go in a loop in a circle just to approach some area on a different vector / bin. It could be that we could remove that now and it would help? I do suspect, though, that to some degree we saw that happening in exactly the zig-zag situation, where we over-expanded the search and created that looping behavior.
Or maybe having a special case where the first expansion from the start pose uses the normal primitives, but also allows for primitives of length 1 (or 3 or something odd, maybe N-1?) angular quantization bins if that amount puts it into a new spatial bin.
The 2nd figure you show, though, looks like a straight-line path should have been largely possible? That's quite strange - there are a few expansion branches shown that could basically just be straight and do a fine enough job at it. What if you remove the planning penalty functions? Maybe the non-straight / change penalties are creating this to some level.
One way we could make sure more bins are visited is the following: we settle for a certain number of bins so that we have some regularity, such as 64.
We then check, given the settings for min_turning_radius and resolution, what the number of increments is (e.g. 4).
We then compute the turning radii (> min_turning_radius) which would achieve the intermediate bins and add the respective primitives to the set. All primitives have equal length (sqrt(2)).
Here is an example (I used one of your maps, with bin_size=4 and all other parameters default).
This is the old case: 64 bins, increments = 4, therefore we can only reach 16 different orientations. The diagonal motions are zig-zaggy. In blue you can actually see the distribution of bins in the graph: wide gaps between the blue lines, as we don't reach every orientation bin.

This is the new case: 64 bins, increments = 4. I add two additional primitives with increments +2 and -2 (I of course changed the TravelCost computation accordingly). Therefore we can now reach 32 different orientations.
The diagonal motions are not zig-zaggy. In blue you can actually see the distribution of bins covering every second bin.

This is the new case: 32 bins, increments = 2. I add two additional primitives with increments +1 and -1. Therefore we can now reach all possible 32 different orientations. The diagonal motions are not zig-zaggy. In blue you can actually see the distribution of bins covering every bin.

So to be honest I think we should settle for less but reachable bins (32) at the cost of having to add a bigger primitive set.
What about my suggestion to simply make the number of angular bins the trajectories represent non-exact? That seems like the lower-hanging fruit before we escalate to more primitives. Though the idea in the next paragraph below could be a good solution.
On the more primitives front - what if we only change the first expansion to use the +/- 1 and +/- 2 bin approach you describe, both? That would seed some expansion trees in the non-4 and non-even number distributions to expand from and in no way impact the run-time of the rest of the planner while still being entirely feasible. My assumption here is that the non-zig-zag paths are more optimal than the zig-zag paths so if we give those trees the ability to expand on the off-angle distributions, it should overtake the zig-zag ones. This seems like a good solution, actually. No fudging there since we're not just placing the robot at neighboring bins, we're just using less aggressive turning radii to generate still-feasible trajectories. It should have virtually the same impact.
What is the run-time impact of the additional primitives? We'd like to keep something like 64/72 bins (not 32) to give some really good resolution of paths at 5 deg.
In your last example with +/- 1 bin, is that in addition to the +/- 2 bins you did previously (so it's the normal set + 4 primitives, 1 bin apart and 2 bins apart) or in place of (so it's the normal set + 2 primitives, 1 bin apart)? Just for clarification.
So to be honest I think we should settle for less but reachable bins (32) at the cost of having to add a bigger primitive set.
I'd disagree; if the user says they want to use 64/128/whatever angular bins, we should be using them all. It doesn't mean they're all used uniformly, but it should be possible to achieve them. But I suppose my opinion on that could be swayed in the other direction if we don't think there is a large overlap of settings of turning radius + resolution that yields this problem, so it's a relatively minor occurrence.
What is the run-time impact of the additional primitives? We'd like to keep something like 64/72 bins (not 32) to give some really good resolution of paths at 5 deg.
On my end the additional primitives had the opposite runtime effect of what one would think. Once you use more primitives and all bins start to be reachable, it takes WAY less expansion to reach the goal. The additional cost of more primitives is negligible in comparison.
In your last example with +/- 1 bin, is that in addition to the +/- 2 bins you did previously (so it's the normal set + 4 primitives, 1 bin apart and 2 bins apart) or in place of (so it's the normal set + 2 primitives, 1 bin apart)? Just for clarification.
In the last example I reduced the angle bins to 32 (instead of 64). The increments variable is then 2. So I added primitives for +-1 and +-2 (the +-2 ones are those which represent the min_turning_radius, i.e., what is currently implemented). In the second-to-last image I had 64 bins (increments = 4 for min_turning_radius). There I had +-4 (for min_turning_radius) and +-2, so only every second bin is reachable. The solution is of course exactly identical to the last image.
I have tried the solution I proposed on several of our maps and I am really happy with the results for now. I will give what you proposed a try and let you know the results.
Results for:
On the more primitives front - what if we only change the first expansion to use the +/- 1 and +/- 2 bin approach you describe, both? That would seed some expansion trees in the non-4 and non-even number distributions to expand from and in no way impact the run-time of the rest of the planner while still being entirely feasible. My assumption here is that the non-zig-zag paths are more optimal than the zig-zag paths so if we give those trees the ability to expand on the off-angle distributions, it should overtake the zig-zag ones. This seems like a good solution, actually. No fudging there since we're not just placing the robot at neighboring bins, we're just using less aggressive turning radii to generate still-feasible trajectories. It should have virtually the same impact.
As you can see in the close-up below, I am expanding all possible primitives (angle bins) only in the very first iteration. Yet the results with 72 bins are very poor: most bins are never reached and the graph is large.
CloseUp (only first)

Results from dense primitive set in every expansion:

CloseUp (All)

Just out of curiosity I also tried expanding the first 500 A* iterations with all primitives, and after that with the minimal set (3).

CloseUp (First 500)

Interesting. So it seems like continuously expanding with more primitives could be helpful for this situation.
From these new results - what do you suggest as the move now?
I'd be interested in some performance testing on whatever you suggest relative to what's there now, for a handful of situations in both confined spaces (like what you show) and open-area maps. Either way, even if it comes with a performance hit, I think it's worth adding the option to expand in more orientation bins, but it is good to do so with eyes wide open about what the impact on memory/CPU is. For the zig-zag situation, you're right that it might result in less compute time because we're not over-expanding the tree, but for the situations where zig-zagging didn't occur (when the starting bin was some even increment), I'd like to know how much worse that is compute-wise. We're going to be checking a whole lot more things for collision, which is not free!
This is probably worth adding a boolean parameterization for, to enable / disable, so we can get the same behavior as today for folks whose motion primitive lengths / costmap resolutions don't have this problem. And add some description in the readme / navigation.ros.org about why it was added. You've made some really fantastic figures here as well; no reason to let those be lost to the void :wink:
Before going any further into this I must ask the following: if it is actually the case (and I didn't mess up) that most angle bins aren't visited at all, what do you see as the downside of reducing the number of bins? I mean, we could have an analytic expansion to the exact goal coordinate rather than to a discretized one. The planner would then reach the goal with floating-point precision irrespective of the discretization. We would get rid of the zig-zagging due to the initial bin and no further primitives would have to be added.
most angle bins aren't visited at all, what is it that you see as a downside of reducing the amount of bins
If you're OK with only using 16 bins or whatever number that resolves to - nothing. But you, like me I suspect, don't think it should be the role of special numbers in turning radii and costmap sizes to determine what fidelity a robot's behavior should be able to achieve. Further, it will cause problems in confined spaces where you need more fidelity to find a route through the space. This is kind of where that 72 bin number comes from - if you can't find a route through a space with 5 deg bins, any less probably isn't going to do it either. But the same cannot be said for 10+ degrees (and certainly not 22.5 with 16 bins).
I mean we could have an analytical expansion to the exact goal coordinate and not to a discretized one
That's only for the last few meters on approach - it's not used in full planning. The majority of the plan would still be restricted to those 16 bins, which would be problematic especially in narrow spaces where we require finer motion primitives to be able to find a route through, but also in situations where you have maps and start poses non-evenly aligned. This is the same reason why in this comment you added +/-2 primitives, so that you could make it through an odd-oriented hallway without zig-zags. You need more bins.
~32 bins @ 11.25 deg (let's say approximately 10 deg for orders of magnitude) is probably entirely suitable for wider spaces without other things like people, robots, or boxes partially blocking aisleways. But the moment you have some really narrow spaces, even if only due to temporary obstacles, having more discretizations gives you the ability to try more granular directions to find a valid path through if the easy "straight" doesn't work. From the research I did on this during development, 5 deg (72 bins) was that sweet spot. But 10 deg would work as long as there's always a wide berth like in the map you show. Add some other robots or boxes in the way, and I think you'll see it starts to become somewhat problematic in isolated situations -- but when replanning at 1 Hz or when operating with hundreds of robots, isolated situations happen quite frequently.
Thoughts? :smile:
From these new results - what do you suggest as the move now?
I'd be interested in some performance testing on whatever you suggest relative to what's there now for a handful of situations in both confined spaces (like what you show) and open-area maps.
I think one should really make sure that this is not just happening due to me looking at some isolated weird corner cases (I really think that is not the case, but...). Are there some test maps I could have a go at? Like the 3-planners one featured in the smac planner directory.
I guess the flag to switch the new behavior on/off is a good idea, especially since you could then directly compare the quality of the paths and the required runtime.
On my side, I only took a look at Dubins paths, but I guess the extension to RS should be straightforward. What I had to modify on my end:
- The primitive creation (added larger turning radius primitives to cover all increments till min_turning_radius)
- Modified getProjections. The way you currently use your precomputation of delta_xs and delta_ys causes rounding issues. I realized that once I wrote an interface which visualizes the graph. These rounding issues cause the primitives to frequently not expand as expected, giving these asymmetrical primitive expansions. I do think that the round-off errors in getProjections are the only reason you sometimes fall into odd bins.
- We would also have to modify the travel-cost computation. Maybe we should move this into the nodes rather than have it in astar, where we have to rely on indices and the order in which we emplaced the primitives.
If I only could make it to ROSCon ... A discussion on this topic could bring up so many fruitful cool ideas ...
Are there some test maps I could have a go at?
What's your email? I can send you a couple. I use the willow garage map for confined space testing and another warehouse map I found at some point for open-space testing. I can't find the links to them quickly to post here. Optionally, my email is on my github page if you want to email me first (if you don't want to post your contact info on github ha).
Mhm, that's a lot -- I'd really appreciate being able to see specifically what you did for these experiments. Any chance this could be open-sourced, or even just have a branch publicly available as the basis for me to continue working on this? That's a lot of reinventing the wheel for me, and time is generally limited. It'd be better for me to wait on the open-source process if you think it'd come through, especially if you're fixing other subtle bugs.
But I guess that still leaves the main question: what action do you recommend from your testing? Prims at every possible angular bin from a node? Just +/-1 or +/-2 (or both)? Expanding dozens at each node seems wasteful of the graph's memory consumption, but if that meaningfully and tangibly improves paths, it's worth me evaluating. If +/-1 and +/-2 bins give more or less that same result, that seems the most reasonable to me since it only expands the primitive set from 3 to 7, which is still palatable (for example, the lattice planner typically has 5 and has essentially identical runtime to Hybrid-A*, although its primitives are much longer). Or perhaps modes to support either?
Regarding the maps: I sent you a link in a PM.
Prims at every possible angular bin from a node? Just +/-1 or +/- 2 (or both)?
I’ll make some experiments soon and give an update.
This is the path. It took way too long to plan (214 ms). I am not happy. It's a good example though, as it is not axis-aligned.
I don't want to dive into the next topic, but somehow I think that the obstacle heuristic is a bit of an issue. It is aligned with the coordinate system; the map however may not be. I have the feeling that this may introduce an unfair bias towards direct north-south / east-west expansions. It's more likely to end up in a cell with a lower heuristic cost if a primitive goes directly up or directly to the right, especially since the heuristic's map is downsampled by a factor of 2. Does that make sense?
I've also noticed, that the early stages of the tree expansion seem crucial. If it sets off at a skewed angle it usually doesn't recover in the rest of the expansion phase, but expands towards the goal at the cost of zig-zagging.

That's unusually long; typically I find < 100 ms (between 30-80 ms depending on the specific nature of the request).
Let's stay focused on the first thing, the primitives using all the bins, for the zig-zags. The heuristic is a whole other subsystem and we should button up one issue before moving on to another.
It is aligned with the coordinate system. The map however may not be. I have the feeling that this may introduce an unfair bias towards direct North South / West East expansions.
I'm not sure that's entirely true, but some element of it could be. The actual primitives, though, are not in any way related to the costmap 2D grid structure; it's just an exploring tree of curves. It will of course try to incentivize the expansions with the lowest costs. If this were just a pure grid-search algorithm visiting its neighboring cells in a wide hallway (but still a hallway), you might expect the planner to produce a path with many cell-by-cell staircase oscillations to try to stay in the center of the space where the heuristic is minimized (when the cost penalty is set very high). Since we sample with curves in Hybrid/Lattice, you could expect something similar with little waves. But the turning paths have penalty functions on them that can be tuned (or removed) to help incentivize a bit more straight motion or, when turning, committing to those turns. Or we could reduce the cost penalty a little so that we care more about minimal-length paths than staying in the center of aisles. Reducing the weighting on cost in the travel cost functions and heuristics would make it care less about the absolute optimal dead-center aisleway and remove a great deal of those artifacts. Keep in mind my default tuning is supposed to give a good baseline for all applications, not to be optimal for any particular one (and not to say I perfectly tuned the system for the 4-5 cases I tuned based on).
Also keep in mind that just because the heuristic is represented by a 2D grid, that doesn't mean that our primitives hit every cell (since their length is $>\sqrt{2}$ cells), nor that the search neighborhoods are N/E/S/W. There's more analysis we could do here, but I don't believe that changing the heuristic to use, say, 8-connected neighborhoods vs 4-connected neighborhoods would actually make a difference. But it really wouldn't be super hard to test; I think you'd have to change only about 6 lines in this block https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/node_hybrid.cpp#L473-L520 to make the neighborhood and travel cost use 8-connected neighborhoods in expansion rather than 4. If you find that it helps and doesn't impact compute too much, it seems like an easy improvement.
Especially since the heuristics map is downsampled by a factor of 2.
You are right, though, that the downsampling does create a level of approximation - that is a real effect. I found though using the simple smoothing algorithm that the planner plugin uses removes those artifacts pretty well (the small waves, so to speak). The penalty functions also help here a bit, but if you increase those costs for non-straight/turning/etc you can create very rigid zig-zaggy behavior (since if you make changing direction so difficult, it won't until it absolutely has to). This is why those penalty functions are set super low (like 1%).
I've also noticed, that the early stages of the tree expansion seem crucial. If it sets off at a skewed angle it usually doesn't recover in the rest of the expansion phase, but expands towards the goal at the cost of zig-zagging.
I think that's a bit of a misconception; it's not that the early expansion matters. That expansion is deterministic, based only on the boundary conditions (e.g. location of the robot, environment, etc.), and it will always expand in the way the heuristics and the specific available primitive functions enable. I thought we covered that the zig-zagging was largely due to not using the bins properly, as we've been discussing above? If you're starting at an "odd" angle and the map is at a "normal" angle, then the restricted primitive set you have available can't go straight down spaces, causing the zig-zagging. That seems sensible. It wouldn't surprise me if the heuristic contributed a little bit of that in localized optimal zones, but it wouldn't be all over the map like what you've shown. Those would be little-bitty ones.
Hey, any updates here? I'd love to be able to work on this sooner than later to have it resolved! I think I was looking for your suggestion about what we should do here to maximize the use of the primitive space (e.g. add +/- 1 bin prims, and/or +/- 2, and/or ...) given your extensive experimentation.
Nothing new here. I should be able to invest more time over the next couple of weeks. I am still not 100% happy, especially on maps that are not well aligned. I am on it and will reach out asap.
Hey, I have been working on the problem, trying the different solutions proposed here along with some input from @SteveMacenski. I used the following parameters for testing:
```yaml
planner_server:
  ros__parameters:
    expected_planner_frequency: 20.0
    planner_plugins: ["GridBased"]
    GridBased:
      plugin: "nav2_smac_planner/SmacPlannerHybrid"
      downsample_costmap: false    # whether or not to downsample the map
      downsampling_factor: 1       # multiplier for the resolution of the costmap layer (e.g. 2 on a 5cm costmap would be 10cm)
      tolerance: 0.25              # dist-to-goal heuristic cost (distance) for valid tolerance endpoints if exact goal cannot be found
      allow_unknown: true          # allow traveling in unknown space
      max_iterations: 1000000      # maximum total iterations to search for before failing (in case unreachable), set to -1 to disable
      max_on_approach_iterations: 1000  # maximum number of iterations once within tolerance to continue to try to find exact solution
      max_planning_time: 5.0       # max time in s for planner to plan, smooth
      motion_model_for_search: "DUBIN"  # Hybrid-A*: Dubin, Reeds-Shepp
      angle_quantization_bins: 64  # number of angle bins for search
      analytic_expansion_ratio: 3.5     # the ratio to attempt analytic expansions during search for final approach
      analytic_expansion_max_length: 3.0  # for Hybrid/Lattice nodes: the maximum length of the analytic expansion to be considered valid to prevent unsafe shortcutting
      minimum_turning_radius: 0.50      # minimum turning radius in m of path / vehicle
      # maximum_turning_radius: 1.0
      reverse_penalty: 2.0         # penalty to apply if motion is reversing, must be >= 1
      change_penalty: 0.0          # penalty to apply if motion is changing directions (L to R), must be >= 0
      non_straight_penalty: 1.2    # penalty to apply if motion is non-straight, must be >= 1
      cost_penalty: 2.0            # penalty to apply to higher-cost areas when adding into the obstacle map dynamic programming distance expansion heuristic. This drives the robot more towards the center of passages. A value between 1.3 - 3.5 is reasonable.
      retrospective_penalty: 0.015
      lookup_table_size: 20.0      # size of the Dubin/Reeds-Shepp distance window to cache, in meters
      cache_obstacle_heuristic: false   # cache the obstacle map dynamic programming distance expansion heuristic between subsequent replannings of the same goal location. Dramatically speeds up replanning performance (40x) if costmap is largely static.
      smooth_path: True            # if true, does a simple and quick smoothing post-processing to the path
```
I was able to replicate the issue that during planning some of the bins are skipped. Also, out of curiosity, I would love to know how the turning radius 0.5 was converted to 10?
[planner_server-2] bin_size: 0.0981748
[planner_server-2] angle: 0.19635
[planner_server-2] min_turning_radius: 10
[planner_server-2] num_angle_quantization: 64
[planner_server-2] increments: 2
[planner_server]: Planner loop missed its desired rate of 20.0000 Hz. Current loop rate is 18.2578 Hz
The path generated during this planning:
In order to tackle the problem, I tried generating more projections. In this case the increment was +-2, so I generated projections for +-1. Based on the increment, I recomputed the angle. I also recomputed the turning radius using a fixed chord length of sqrt(2). Although the bin-skipping problem seemed reduced, the computation time was way higher. There might be some optimization techniques that could reduce this.
[planner_server-2] bin_size: 0.0981748
[planner_server-2] angle: 0.19635
[planner_server-2] min_turning_radius: 10
[planner_server-2] num_angle_quantization: 64
[planner_server-2] increments: 2
[planner_server-2] delta_x: 1.41251
[planner_server-2] delta_y: 0.0693922
[planner_server-2] angle_increment: 0.0981748
[planner_server-2] turn_radius: 14.4108
[planner_server-2] i: 1
[planner_server-2] delta_x: 1.4074
[planner_server-2] delta_y: 0.138618
[planner_server-2] angle_increment: 0.19635
[planner_server-2] turn_radius: 7.21411
[planner_server-2] i: 2
[planner_server-2] [WARN] [1679774605.351331509] [planner_server]: Planner loop missed its desired rate of 20.0000 Hz. Current loop rate is 2.3250 Hz

For the last case, I tried using non-exact increments but the skipping behavior still persisted. It even took a bit longer to compute.
[planner_server-2] bin_size: 0.0981748
[planner_server-2] angle: 0.141539
[planner_server-2] min_turning_radius: 10
[planner_server-2] num_angle_quantization: 64
[planner_server-2] increments: 1.44171
[planner_server-2] [WARN] [1679775659.963246199] [planner_server]: Planner loop missed its desired rate of 20.0000 Hz. Current loop rate is 14.5725 Hz

Also, the link to the theory provided in the code is not working (http://planning.cs.uiuc.edu/node821.html).
Hi! Great analysis starting point here!
I would love to know how the turning radius 0.5 was converted to 10?
I assume it's because the turning radius is 0.5 m at a 5 cm resolution, which would be 10 grid cells. You could verify this by changing the resolution to, say, 10 cm; then the radius should be 5.
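If that's right, the logged values can be reproduced with a small sketch (not the actual nav2 code; the struct and function names are my own) mirroring what initDubin in node_hybrid.cpp appears to do: convert the radius to cells, compute the heading change over a sqrt(2)-cell chord, then round it up to a whole number of bins.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical mirror of the primitive setup (names are my own, not nav2's).
struct DubinSetup {
  float min_turning_radius;  // in grid cells
  float bin_size;            // radians per angular bin
  float raw_angle;           // heading change before rounding
  float increments;          // whole bins per turning primitive
  float angle;               // heading change after rounding up
};

DubinSetup compute_setup(float radius_m, float resolution_m, int num_bins) {
  DubinSetup s;
  s.min_turning_radius = radius_m / resolution_m;
  s.bin_size = 2.0f * static_cast<float>(M_PI) / static_cast<float>(num_bins);
  // heading change over a sqrt(2)-cell chord at the minimum radius
  s.raw_angle = 2.0f * asinf(sqrtf(2.0f) / (2.0f * s.min_turning_radius));
  // round up to whole bins so search stays on clean multiples of quantization
  s.increments = (s.raw_angle < s.bin_size) ? 1.0f : ceilf(s.raw_angle / s.bin_size);
  s.angle = s.increments * s.bin_size;
  return s;
}
```

With radius_m = 0.5, resolution_m = 0.05, num_bins = 64 this reproduces the logged values: radius 10 cells, a raw angle of 0.141540 (the 0.141539 seen later in this thread), increments 2, and 0.19635 after rounding.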
Although, it seem that the problem of the bin skipping was reduced, the computational time way way higher
That is... odd. I'd have to see what you did. It's probably that you computed something incorrectly, so the heuristics or traversal costs are poorly estimated. Unless you're computing those primitives in a planning loop (which I doubt you did, if you computed them where we compute the other primitives).
I tried using non-exact increments but the skipping behavior still persisted.
That doesn't overly surprise me to some extent. I'm suspicious of the numbers on the plot you showed (why are they all ending in 0.12 if the angles aren't exact?). You should just be able to run that experiment by removing the ceil here. I'd like to understand what you did in more detail to make sure it's reasonable, but I think the conclusion you came to is generally in line with my expectations.
So the next questions would be: how do we find the number and increments of the primitives we need to add to fully leverage the bins specified by the user? For the particular example you show, every 2nd bin was not usable. But you could easily set parameters such that it's every 3rd or 4th or 5th. We need to be able to find that out when we're generating the primitives and create a set of primitives that more fully explores the space available to us. Though, perhaps first things first is to find the source of that additional planning time so we can scale it up!
Also, an update from @mangoschorle would be fabulous!
That doesn't overly surprise me to some extent. I'm suspicious of the numbers on the plot you showed (why are they all ending in 0.12 if the angles aren't exact?). You should just be able to run that experiment by removing the ceil here. I'd like to understand what you did here in more detail to make sure its reasonable, but I think the conclusion you came to is generally in line with my expectations.
Yes, I did it like this. For the non-exact bins, I just removed the ceil, while in order to add more bins I did the following:
for (int i = 1; i <= int(increments); i++) {
  angle_increment = i * bin_size;
  turn_radius = sqrt(2) / (2.0f * sin(angle_increment / 2.0));
  delta_x = turn_radius * sin(angle_increment);
  delta_y = turn_radius - (turn_radius * cos(angle_increment));
  projections.emplace_back(delta_x, delta_y, i);    // Left
  projections.emplace_back(delta_x, -delta_y, -i);  // Right
  std::cout << "delta_x: " << delta_x << std::endl;
  std::cout << "delta_y: " << delta_y << std::endl;
  std::cout << "angle_increment: " << angle_increment << std::endl;
  std::cout << "turn_radius: " << turn_radius << std::endl;
  std::cout << "i: " << i << std::endl;
}
projections.emplace_back(hypotf(delta_x, delta_y), 0.0, 0.0);  // Forward
That is... odd. I've have to see what you did. Its probably that you computed something incorrectly so the heuristics or traversal costs are poorly estimated. Unless you're computing those primitives in a planning loop (which I doubt you did, if you computed them where we compute the other primitives).
Actually, I am not computing it in a loop but during the initialization step.
So the next questions would be: How do we find the number and increments of the primitives we need to add to fully leverage the bins specified by the user? For the particular example you say, it is every 2nd bin was not usable. But you could easily set a set of parameters that its every 3rd or 4th or 5th. We need to be able to find that out when we're generating the primitives and create a set of primitives to more fully explore the space available to us. Though, perhaps first things first is to find the source of that additional planning time so we can scale it up!
Could the problem be that I did not adjust other parameter affecting the cost?
OK on the non-exact bins. But then why do they all end with X.12? If it's not exactly 1.0, they should be more distributed (e.g. a 1.4 increment would give you 0, 1.4, 2.8, 4.2, ...)? That seems to point to an error. The topic of non-exact bin sizes is independent from adding multiple primitives; make sure you're not mixing these together.
For the primitives, you need to make sure each of the primitives is the same length so the traversal cost between them is the same (or adapt the traversal cost function to account for the fact that they aren't all the same length). See the thread above in this ticket with @mangoschorle and the State Lattice planner. What is increments? Do these primitives look reasonable? While my head isn't in this problem in particular at the moment, this seems overly simplistic and not accounting for practical constraints.
For instance, https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/node_hybrid.cpp#L91 has the minimum turning radius; you can't willy-nilly change that. That's a real user constraint. It has to be higher than that, never lower. You use angle_increments in a way that at first glance seems nonsensical, making me think these primitives are wacky.
The ideal requirements for the new primitives: they must be at or above the turning radius of the car set. They should be the same lengths, and if not, the traversal cost function needs updates to correct for that. They must end in other angular bin increments than the full-turning-radius, minimum-distance maneuvers (the default ones). If you set the lengths as fixed based on the default primitives, then you should be able to back out the turning radius (strictly greater than the minimum radius) which ends in the desired angular bin increment. Some engineering drawings may help you visualize this. If you have a fixed length, a target angular bin change, and a minimum radius, there should be a closed-form geometric solution for a primitive of a less extreme turn.
If you know the default primitives have angular bin jumps of 3 (e.g. 3, 6, 9, 12, ...) then you know that the default set will do -3, 0, 3. Which means you need to fill in the 1, 2, -1, -2 angular bin primitives. Once you know how to generate the closed-form geometric equations, you can plug in these values to generate the appropriate primitives to fill in the search space.
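That closed form can be sketched like so (a hedged sketch, not the project's implementation; Primitive and fill_in_primitive are my own names). Hold the chord length of the default primitives fixed and solve for the radius that turns exactly i bins; smaller angles give larger radii, so the radius >= minimum-turning-radius constraint is satisfied for every fill-in bin:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical fill-in primitive: known angle (i bins), unknown radius,
// constant chord. radius = chord / (2 * sin(angle / 2)).
struct Primitive { float dx, dy, dbin; };

Primitive fill_in_primitive(int i, float chord, float bin_size) {
  float angle = static_cast<float>(i) * bin_size;
  float radius = chord / (2.0f * sinf(angle / 2.0f));  // gentler than min radius
  return {radius * sinf(angle),            // delta_x
          radius * (1.0f - cosf(angle)),   // delta_y
          static_cast<float>(i)};          // angular bin change
}
```

Plugging in the chord implied by the logs later in this thread (radius 10 cells and a 2-bin jump at 72 bins gives a chord of about 1.74311 cells), i = 1 reproduces the logged Radius: 19.981 and projection 1.74146, 0.0760326.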
OK on the non-exact bins. But then why do they all end with X.12? If it's not exactly 1.0, they should be more distributed (e.g. a 1.4 increment would give you 0, 1.4, 2.8, 4.2, ...)? That seems to point to an error. The topic of non-exact bin sizes is independent from adding multiple primitives; make sure you're not mixing these together.
You are right, it should give the increments of 0, 1.4 and so on. The problem was the way I plotted the histogram, specifically the way I specified the bins.

With respect to the second point, I will try it and see and let you know
Let me know if my reasoning is wrong, but I think the result is quite promising. In order to generate the primitives, following your advice, I computed the chord length from the default primitives (max left, max right, and forward), which were computed from the min turning radius and angular bins (-2, 0, 2). With that, I modified the travel cost based on this new chord length, which is >= sqrt(2) depending on the increments value. Then I computed the other primitives based on the remaining angular bins, in this case -1, 1.

More info:
[planner_server-2] bin_size: 0.0872665
[planner_server-2] angle: 0.174533
[planner_server-2] min_turning_radius: 10
[planner_server-2] num_angle_quantization: 72
[planner_server-2] increments: 2
[planner_server-2] Increments[ 1 ] Radius: 19.981
[planner_server-2] projections[0]: 1.74311, 0 ]
[planner_server-2] projections[1]: 1.73648, 0.151922 ]
[planner_server-2] projections[2]: 1.73648, -0.151922 ]
[planner_server-2] projections[3]: 1.74146, 0.0760326 ]
[planner_server-2] projections[4]: 1.74146, -0.0760326 ]
The generated path and the histogram:

Generally speaking, it does cover the remaining angular bins, and jumps only occur at the extremes of the histogram. For comparison, we can see the case where we just use the default motion primitives:
[planner_server-2] bin_size: 0.0872665
[planner_server-2] angle: 0.174533
[planner_server-2] min_turning_radius: 10
[planner_server-2] num_angle_quantization: 72
[planner_server-2] increments: 2
[planner_server-2] projections[0]: 1.74311, 0 ]
[planner_server-2] projections[1]: 1.73648, 0.151922 ]
[planner_server-2] projections[2]: 1.73648, -0.151922 ]

It would be nice if I could have a map where I could reproduce the zig-zag behavior exactly as @mangoschorle did, to really test whether this actually works.
Promising indeed!
With that, I modified the travelcost based on this new chordlength,
If you used the chord length from the default primitives, why did you need to change the travel cost if they're all the same length? Or are they not the same length...?
Then after this compute the other primitives based on the remaining angular bins
You show the case with +/-1; does the method you use work for arbitrary ones (e.g. if we do every 3rd, will it generate +/- 1 and 2)? Though, I think it could be +/- 2, 3, 5, 7, ...; realistic settings are probably mostly in the 2/3/5 range.
Are the run time issues fixed now? @mangoschorle even addressed that his changes decreased planning times in some situations due to having more granular options to follow the guiding heuristics.
That path looks much improved!
jumps only occur at the extremes of the histogram
Those aren't jumps anymore; those are just where it was never advantageous for the system to explore solutions at those headings (e.g. the "going right" or "going up" directions where the route is down/left). If you see a reasonable band of dense results filling in the known jumping rate, then we can explore them all (since the primitives are the same for all orientations).
It would be nice if I could have a map where I could reproduce the zig-zag behavior exactly
You can on that map; it's just finicky to get the right initial/goal headings to cause it. Try requesting the starting heading off-axis by a little bit.
This is really great progress!
Sorry for the late response, I was quite busy these weeks with work. I will try to push this forward asap.
You show the case with +/-1; does the method you use work for arbitrary ones (e.g. if we do every 3rd, will it generate +/- 1 and 2)? Though, I think it could be +/- 2, 3, 5, 7, ...; realistic settings are probably mostly in the 2/3/5 range.
Yes, the idea of the implementation is that it includes all increments lower than the one computed. For example, for +-4 we compute 0, +-1, 2, 3. You can see the implementation here: https://github.com/stevedanomodolor/navigation2/tree/zig_zag_issue
Are the run time issues fixed now? @mangoschorle even addressed that his changes decreased planning times in some situations due to having more granular options to follow the guiding heuristics.
Based on my implementation, the processing time is always higher because we have a larger search space. Maybe there is something I am not considering.
You can on that map, its just finnicky to get the right initial / goal headings to cause it. Try doing some stuff where you request the starting heading off-axis but a little bit.
Some bad news: my method only works when the starting angle is aligned with the next motion of the robot, like in the picture I sent you earlier. But when the starting angle varies, the behavior still remains. For example, using the starting angle below:
I will try changing the travel cost based on the arc length of the motion primitive, as @mangoschorle did, to see the effect.
Great! Thanks for the update!
Just a note, I'm starting to work on this now and have evaluated our options in detail. I want to detail it here, even though I know some of this is redundant to previous elements of the conversation.
We have a problem where, when the angular quantization bin jumps are multiple increments (e.g. 0, 2, 4, 6 or 0, 3, 6, 9), we skip some parts of the user-requested resolution. This is caused by the relationship of the costmap resolution and the minimum turning radius needed to ensure we leave a cell.
Our two options are to either (A) make primitives that have their angular bin changes be exactly 1 bin or (B) create more primitives that cover the range of skipped bins.
(A) If we do this, then the angular bin constraint might be met, but we will have spatial overlap requiring bin deconfliction. In this situation, we can have multiple short primitives that land in the same bin that we need to deal with. For turning motions, they'll be +/- 1 angular bin from the origin, but straight motions will exactly overlap. We can deconflict them, but then we'll proportionately increase the number of iterations, and thus collision checks, depending on the primitive length. If we skipped every second orientation bin, we'd need to reduce the spatial distance of each primitive by ~2x. So we'd have more collision checks, proportionate to 3x the skipping ratio (for the 3 new primitives with double the expansions).
(B) We add additional primitives that fully cover the skipped space, which means adding 2*(N-1) new primitives, where N is the angular quantization jump. We can directly compute these primitives using the same equations used for the minimum turning radius curve, but this time with a known angle and an unknown turning radius (with constant chord). We can then use that turning radius to find the x, y, yaw for the primitive pretty straightforwardly. This however adds 2*(N-1) new collision checks. I have the math done here as a proof in my notebook, but suffice it to say it aligns with other code blobs posted here (just validated), so no need to repeat it again.
Either way, I think that broadly makes the added costs of compute on collision checking equal. (A) is doing N * 3 (num of primitives) and (B) is doing 2 * (N-1). For N = 2,3,7 -- (B) is better and honestly preferable logically to not deal with annoying deconfliction logic that also has the negative of making the path's poses no longer evenly spaced without other interpolation logic. This is also what other implementations (like MATLAB) do for this reason as well.
An interesting experiment came to mind: try only adding the +/- 1 primitive searches. For the case of even jumps every 7 bins, it would then take at most 5 expansions to reach any given part of the skipped portion. While it wouldn't be possible everywhere, you'd be able to follow the heuristic functions within a relatively short period, and it opens up the full exploration space, somewhat. I'm not sure it's a great or best solution, but I'm interested to explore it briefly.
@mangoschorle you mention some rounding issues in https://github.com/ros-planning/navigation2/issues/3172#issuecomment-1249118410.
Modified the getProjections. The way you currently use your precomputation of delta_xs and delta_ys causes rounding issues. I realized that once I wrote an interface which visualizes the graph. These rounding issues cause the primitives to frequently not expand as expected; instead you get these asymmetrical primitive expansions. I do think the round-off errors in getProjections are the only reason you sometimes fall into odd bins.
Can you expand on that a bit? The angular bins are always exact-valued, so there's no partial fraction of a bin to cause us to go into another subset of bins. The expansions use the bin numbers (e.g. 1, 2, 3, ...), not the angle, so there's really no chance (?) of that kind of error. The raw angle is really only used for collision checking - though I only did a quick scan through, so my memory could be missing something that's worth you pointing out if there's an issue.
We would have to also modify the travel cost computation. Maybe we should move this into the nodes rather than have it in astar, where we have to rely on indices and the order in which we emplaced the primitives.
Our motion movement should still be sqrt(2) on the circle if we keep the sqrt(2) chord length constant. That's the distance of transit, so I don't think the travel cost function needs to be updated at all? Keep in mind these are distances on the curve (arc length), not necessarily the actual spatial hypot distance. Along the curve is the "actual" travel length.
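For intuition on the chord/arc distinction, here is a small hedged sketch (my own helper, not project code): with a fixed chord, the arc length slightly exceeds the chord, and the radius it implies matches the turn_radius logged earlier in this thread.

```cpp
#include <cassert>
#include <cmath>

// For a fixed chord c and turn angle theta:
//   radius = c / (2 * sin(theta / 2)),  arc length = radius * theta.
// The arc is the distance actually traveled along the curve.
float arc_length_for_chord(float chord, float theta) {
  float radius = chord / (2.0f * sinf(theta / 2.0f));
  return radius * theta;
}
```

For a chord of sqrt(2) and the 2-bin angle of 0.19635 rad, the implied radius is about 7.214 (matching the turn_radius: 7.21411 log line above) and the arc length is about 1.4165, marginally longer than the sqrt(2) ≈ 1.4142 chord.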
In progress branch: https://github.com/ros-planning/navigation2/tree/3172
This was a long time ago. So it is hard to wrap my head around this. I remember that I tried a lot of approaches including the two you mentioned to come up with a generic solution for the aforementioned issues.
The result was the following: I was able to get very good paths for scenarios, where the corridors and hallways are nicely aligned with the cell grid. The core problem that remained and which I finally gave up on was that there seemed to be a strong "bias" for the planner to prefer motions into the W N S E cells. For maps that are arbitrarily oriented and don't align nicely this will always result in sawtooth/stepping paths along a diagonal hallway.
Ideally, we would want the planner to be invariant to rotations of the map. One could nicely create a unit test for that and follow a test-driven implementation: the paths should coincide up to a rotation and translation around the start of the path.
Regarding the projections: if I remember correctly, the issue of asymmetrical expansions had to do with the way you compute the deltas, which is based on a quantized angle.
You add new_heading to the projection list, but you compute delta_x and delta_y based on quantized headings; this should rather be done based on the actual heading in new_heading. The center primitive is along the continuous part of the heading new_heading; the left and right primitives are non-symmetrical if new_heading isn't perfectly aligned with the angular bins, which it never is. The expansion currently uses delta_xs[i][node_heading] + node->pose.x, delta_ys[i][node_heading] + node->pose.y, new_heading.
I suggest something like this:
float new_x = node->x_cell + projections_[0].x * cos_angs.at(static_cast<size_t>(new_heading));
float new_y = node->y_cell + projections_[0].x * sin_angs.at(static_cast<size_t>(new_heading));
If you don't believe me plot your tree. You will see that the forward primitive isn't always centered between the left and right primitive expansion.
Another thing that I realized: The planner time is heavily dominated by the amount of expansions it takes and not by the amount of primitives. I wouldn't worry about amount of primitives so much. In my experiments, as soon as stepping/zig-zagging was avoided, the amount of expansions required to find the goal dropped by orders of magnitude. If it requires 11 Primitives to avoid zigzagging, you'll still be faster than using 3 primitives.
I'm trying to parse this to make sure I understand your point, so please bear with me:
You add new_heading to the projection list, but you compute delta_x and delta_y based on quantized headings; this should rather be done based on the actual heading in new_heading. The center primitive is along the continuous part of the heading new_heading; the left and right primitives are non-symmetrical if new_heading isn't perfectly aligned with the angular bins, which it never is.
I'm trying to figure out what you mean by this. Do you mean in the primitive construction in initDubin or in its live use in getProjections? I think you mean getProjections so I'm going to talk w.r.t. that.
In the Hybrid-A* algorithm as written, each primitive's ending is exactly aligned with an angular bin; that's the role of the following snippet when we compute the original delta_x/y. The spatial X/Y can land anywhere, but the angles always end up being in terms of the bins. In search, we store and use the angular quantization bins (e.g. 1, 2, 3, 4, ...), not the angles, when we expand (which is why you see projections.emplace_back(delta_x, delta_y, increments);, where increments is in bin numbers).
if (angle < bin_size) {
  increments = 1.0f;
} else {
  // Search dimensions are clean multiples of quantization - this prevents
  // paths with loops in them
  increments = ceil(angle / bin_size);
}
When we get to getProjections, the line I think you're referencing, float new_heading = node_heading + motion_model._theta;, should produce only whole-valued bin numbers, represented as floats. If you're saying that isn't true and that we have some outputs of new_heading that are non-whole numbers, I'd 100% agree we have an issue.
If that's what you're saying, please let me know!
I tried printing new_heading (std::cout << new_heading << std::endl;) in getProjections and I don't see any non-whole-valued numbers, however. Which makes me think perhaps my interpretation of this comment was incorrect.
I suggest something like this: float new_x = node->x_cell + projections_[0].x * cos_angs.at(static_cast<size_t>(new_heading));
That would quash the new_heading discretization error down to a whole value, which would indeed handle the issue. Would you agree, though, that fixing why new_heading isn't remaining the whole-valued number it should be is the core issue to resolve, rather than introducing measures to squash its effects? Though, pointing out where you think there are non-exact bin searches would be good, since where I looked above didn't yield it.
I don't argue that you may have found an issue; I'm just not seeing what I think you mean, so please feel free to berate me for barking up the wrong tree :smile:
You will see that the forward primitive isn't always centered between the left and right primitive expansion.
I don't think it should be. Recall that we're traversing on the chord of a curve, not straight-line paths. So the center primitive going straight should be "further" from the origin point because its distance is a perfect line. The turning primitives are traversing the curve of the minimum turning radius circle, so their total spatial displacement is less. The three primitives should not exactly align in the center; the straight one should be slightly out ahead.
Unless you mean like aligned symmetrically between right and left, that would indeed point to a problem.
Responses to other items
The planner time is heavily dominated by the amount of expansions it takes and not by the amount of primitives. I wouldn't worry about amount of primitives so much. In my experiments, as soon as stepping/zig-zagging was avoided, the amount of expansions required to find the goal dropped by orders of magnitude.
That generally follows for me too. Are you using a circular robot model? I'm wondering because more primitives do involve more collision checking, which is the most expensive part of planning, since we collision check on creating potential neighbors, not on visitation. Though, the cost-aware heuristic drives away from obstacles, so that ratio may still be small enough to not be super important. Just wondering.
The core problem that remained and which I finally gave up on was that there seemed to be a strong "bias" for the planner to prefer motions into the W N S E cells. For maps that are arbitrarily oriented and don't align nicely this will always result in sawtooth/stepping paths along a diagonal hallway.
It should also have diagonal biases given how the heuristics are computed. On this front, I don't think there's much to be done other than (A) adjust the obstacle heuristic neighborhood to be 16-connected rather than 8, or (B) do what many users do and use a path smoother after this algorithm, which is what the common Hybrid-A* technique does.