
move_base cancelGoal doesn't work while executing recovery behavior

Open corot opened this issue 11 years ago • 9 comments

I'm not sure whether this is a bug. I posted it on ROS Answers and someone suggested opening an issue. It looks like the move_base action client's cancelGoal doesn't work while a recovery behavior is executing. That is, if I call cancelGoal or cancelAllGoals, the subsequent call to waitForResult doesn't return until all recovery behaviors have failed. If I disable recovery behaviors in the meantime with dynamic_reconfigure, it still doesn't return until recovery fails (more than 2 minutes!). I suppose move_base gets blocked somehow while running a recovery behavior and doesn't even attend to reconfigure requests.

I'm using the hydro-devel code, but I experienced the same with the groovy package installation.

corot avatar May 02 '13 11:05 corot

Q&A page is here: http://answers.ros.org/question/61720/move_base-cannot-cancel-while-executing-recovery-behavior/

jack-oquin avatar May 02 '13 13:05 jack-oquin

Definitely a bug. The API will have to change to support this though. Currently the runBehavior call does its whole thing in one thread in the one function call, so there's no way to interrupt it.

I could give that function a return value (a boolean) which says whether it is done or not. Then the recovery behavior will need to have its loop opened up so it returns false after each cycle until it's done.
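A minimal sketch of what that "opened-up loop" could look like, in plain C++ with no ROS dependency. Everything here is hypothetical: `SteppedRotateRecovery`, `runUntilDoneOrCancelled`, and the `cancel_requested` flag are illustrative names, not part of move_base or nav_core. It assumes each call to runBehavior() does one short cycle of work and returns true only once the behavior is finished.

```cpp
#include <atomic>

// Hypothetical stand-in for a recovery behavior whose loop has been opened up.
// This is NOT the real nav_core::RecoveryBehavior interface.
class SteppedRotateRecovery {
public:
  explicit SteppedRotateRecovery(int total_steps) : total_steps_(total_steps) {}

  // Performs one short cycle of work.
  // Returns false after each cycle until the behavior is done, then true.
  bool runBehavior() {
    if (steps_done_ < total_steps_) {
      ++steps_done_;  // placeholder for one increment of real work
    }
    return steps_done_ >= total_steps_;
  }

  int stepsDone() const { return steps_done_; }

private:
  int total_steps_;
  int steps_done_ = 0;
};

// Hypothetical caller loop: checks for cancellation between cycles.
// Returns true if the behavior ran to completion, false if it was cancelled.
bool runUntilDoneOrCancelled(SteppedRotateRecovery& behavior,
                             const std::atomic<bool>& cancel_requested) {
  while (!cancel_requested.load()) {
    if (behavior.runBehavior()) {
      return true;
    }
  }
  return false;
}
```

Because the caller regains control after every cycle, a cancel request is noticed within one cycle rather than only after the whole recovery sequence has finished.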

hershwg avatar May 02 '13 13:05 hershwg

Is there some way to hack it by setting a global boolean variable for the behavior to check periodically, returning if it's set?

jack-oquin avatar May 02 '13 13:05 jack-oquin

Well, we could always hack it, but I would love to avoid that.

Options I'm seeing so far:

  • change RecoveryBehavior API so each RecoveryBehavior gets a pointer to a MoveBase object or a NodeHandle or even just a boolean that lets it query whether it has been cancelled.
  • change RecoveryBehavior API so runBehavior() returns a boolean indicating "done" or "not done" and the docs are updated to indicate that every call to runBehavior() should take only a short (sub-second) amount of time. This may mean opening loops and implementing state machines to break up long-running recovery behavior functions.
  • Leave the API the same, but run the RecoveryBehavior in a separate thread. When a pre-empt or cancel arrives and the behavior is still running, the thread gets killed. This "option" looks pretty dangerous and non-portable, based on my readings. On Windows you can get situations where the child thread is in the middle of allocating memory and it has a lock on that and then just dies, leaving the memory-allocation lock locked (!). Under Posix threads the situation is a bit better, but still pretty awkward.
  • It is true that we could make a global variable which move_base would write to and which recovery behaviors could read from if they wanted. This is just a backwards-compatible way to change the API.
  • API for RecoveryBehavior could have an added pair of functions: bool isCancelled() and setIsCancelledCallback( boost::function<...> ). Then MoveBase can call setIsCancelledCallback() on every recovery behavior at initialization, and recovery behavior implementers can add calls to isCancelled() at their leisure (when they run into this problem, most likely). This is another backwards-compatible way to change the API which does not involve a global variable. :)

So after typing all that out, my favorite is the last one.
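For illustration, the last option might be sketched roughly like this. This is a standalone example under stated assumptions, not the actual nav_core API: `CancellableRecoveryBehavior` and `ExampleRecovery` are hypothetical names, and `std::function` is swapped in for `boost::function` so the snippet compiles without Boost.

```cpp
#include <functional>
#include <utility>

// Hypothetical base class; NOT the real nav_core::RecoveryBehavior interface.
class CancellableRecoveryBehavior {
public:
  using IsCancelledCallback = std::function<bool()>;

  virtual ~CancellableRecoveryBehavior() = default;
  virtual void runBehavior() = 0;

  // The owner (move_base, in the proposal) would call this once at initialization.
  void setIsCancelledCallback(IsCancelledCallback cb) {
    is_cancelled_cb_ = std::move(cb);
  }

protected:
  // Returns false when no callback is installed, so behaviors that never
  // check for cancellation keep working exactly as before.
  bool isCancelled() const { return is_cancelled_cb_ && is_cancelled_cb_(); }

private:
  IsCancelledCallback is_cancelled_cb_;
};

// Hypothetical behavior that polls isCancelled() once per cycle.
class ExampleRecovery : public CancellableRecoveryBehavior {
public:
  void runBehavior() override {
    for (int i = 0; i < 100; ++i) {
      if (isCancelled()) {
        return;  // stop early when the owner reports a cancel
      }
      ++cycles_run_;  // placeholder for one cycle of real work
    }
  }

  int cyclesRun() const { return cycles_run_; }

private:
  int cycles_run_ = 0;
};
```

The callback keeps the behavior decoupled from its owner: it only needs something callable that answers "have I been cancelled?", and existing behaviors that never ask are unaffected.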

hershwg avatar May 02 '13 18:05 hershwg

My naive opinion: your last option looks good.

jack-oquin avatar May 02 '13 21:05 jack-oquin

Sorry, I'm not familiar with the RecoveryBehavior API, so I can't offer an informed opinion.

When do you plan to attack this issue?

corot avatar May 06 '13 01:05 corot

Hello from 2017! Has any solution been implemented, or is one available, to fix this issue?

tutorgaming avatar Feb 21 '17 08:02 tutorgaming

@Tutorgaming Hello from 2018! Did you ever find a way to work around this?

ccattywampus avatar Feb 07 '18 19:02 ccattywampus

As far as I remember, I disabled the original RecoveryBehavior and recovered the robot outside of move_base. Everything works fine.

To be cancellable, the recovery behavior would need to be connected to the move_base server somehow.

tutorgaming avatar Feb 08 '18 05:02 tutorgaming