BehaviorTree.CPP icon indicating copy to clipboard operation
BehaviorTree.CPP copied to clipboard

Why do control nodes only halt running children?

Open asasine opened this issue 2 years ago • 4 comments

Control nodes like SequenceNode, FallbackNode, and others, halt their children when they complete. However, we recently discovered a bug with some of our condition nodes where their halt method was not being called when their parent control nodes completed. I traced this down to the BT::ControlNode::haltChild(size_t) method implementation skipping any node that isn't RUNNING.

https://github.com/BehaviorTree/BehaviorTree.CPP/blob/12373ad8d85286181658ea91720538cb53082083/src/control_node.cpp#L44-L52

Why is this implemented as such?

Some of our condition nodes derive directly from BT::LeafNode, as BT::ConfiditionNode overrides halt as final, preventing us from overriding it ourselves.

https://github.com/BehaviorTree/BehaviorTree.CPP/blob/12373ad8d85286181658ea91720538cb53082083/include/behaviortree_cpp_v3/condition_node.h#L28-L32

We do this because, while our condition nodes are synchronous, only returning SUCCESS or FAILURE, they do maintain internal state that we want to reset between each halt.

Since the default implementation for BT::ConditionNode::halt (and BT::SyncActionNode::halt) is a no-op, it seems safe for control nodes to call halt on non-running children.

asasine avatar May 18 '22 14:05 asasine

I think that there is some confusion about what halt() means.

The idea is that halt means "cancel/abort the running Node". If your node was a process, halt would be the equivalent of the command "kill". This is also the reason why a halted Node will not return FAILURE nor SUCCESS... it just doesn't return anything, because it was aborted.

"halt" does not mean "do a cleanup of your Node".

Therefore the implementation of ControlNode::haltChild(size_t i) is correct, in my opinion. You can abort only a RUNNING action.

Some of our condition nodes derive directly from BT::LeafNode, as BT::ConfiditionNode overrides halt as final, preventing us from overriding it ourselves.

There is a reason why ConditionNode::halt() is final. A Condition is, by definition, always Synchronous, NEVER asynchronous. I.e., it can not be in the RUNNING state.

if it is not RUNNING, there is no reason to halt it.

Does it make sense?

facontidavide avatar May 20 '22 08:05 facontidavide

It does, though we find ourselves in a bind now.

"halt" does not mean "do a cleanup of your Node".

How do you propose we implement a “do a cleanup of [our] Node” for stateful, synchronous nodes?

asasine avatar May 20 '22 10:05 asasine

Interesting, is there another way you could construct the tree with the existing set of action, condition, control and decorator nodes that would achieve the same logic? How convoluted would it end up being?

Are you able to post a small, even if contrived example @asasine ?

cmraaron avatar Aug 02 '22 17:08 cmraaron

There is a reason why ConditionNode::halt() is final. A Condition is, by definition, always Synchronous, NEVER asynchronous. I.e., it can not be in the RUNNING state.

So if you did have a condition that was based on information not immediately available, the idea would be to obtain that with an action, returning RUNNING, and put the condition after it in a Sequence?

cmraaron avatar Aug 02 '22 17:08 cmraaron

@cmraaron I uploaded an example to this gist: https://gist.github.com/asasine/a201a88cfe3518b16d1bd3f296dd3c23

The gist of it is a BT condition node that subscribes to a ROS topic and evaluates some condition. In the example, it's a simple std_msgs/msg/Bool topic that copies the data field, but in practice, it can be any topic and condition of arbitrary complexity. One version of the condition node derives from BT::ConditionNode, the other derives directly from BT::LeafNode and overrides halt. In both cases, neither halt is called, so the nodes retain state even after the control node in the tree halts its children.

asasine avatar Aug 19 '22 15:08 asasine

We've had another run-in with this issue, but this time it felt much more surprising as we felt like were doing everything "right."

This CountToThree node will return RUNNING for the first two ticks then return SUCCESS. After it returns SUCCESS, the parent control node will not halt it, therefore it will never restart from 0.

class CountToThree : public BT::LeafNode
{
public:
  CountToThree(const std::string& name, const BT::NodeConfiguration& config)
    : BT::LeafNode(name, config),
    counter_(0)
  {
  }

  BT::NodeType type() const override final
  {
    return BT::NodeType::ACTION;
  }

  void halt() override final
  {
    counter_ = 0;
  }

protected:
  BT::NodeStatus tick() override
  {
    if (++counter_ < 3)
    {
      return BT::NodeStatus::RUNNING;
    }
    else
    {
      return BT::NodeStatus::SUCCESS;
    }
  }

private:
  int counter_;
};

Our actual BT node was a bit more complex than this but the gist remains the same: perform some asynchronous action (returning RUNNING) in tick, eventually complete, reset when halted to be ready for another go at the action.

asasine avatar Sep 09 '22 12:09 asasine

About the "when I do cleanup?" discussion. When you return SUCCESS or FAILURE you do know that your Action is done. You returned those values!!!!! If a terminated action require cleanup, it is your responsibility to do that.

On the other hand, when you are RUNNING and you are halted, that is an unexpected ending, but you are notified by the callback, so you can do your cleanup there.

This is what I would do:

  void halt() override final
  {  
    // DO NOTHING :)
  }

protected:
  BT::NodeStatus tick() override
  {
    if( status() == BT::NodeStatus::IDLE )
    {
      counter_ = 0;
    }
    if (++counter_ < 3)
    {
      return BT::NodeStatus::RUNNING;
    }
    else
    {
      return BT::NodeStatus::SUCCESS;
    }
  }

or, if you prefer:

  void cleanup()
  {
    counter_ = 0;
  }

  void halt() override final
  {  
    cleanup();
  }

protected:
  BT::NodeStatus tick() override
  {
    if (++counter_ < 3)
    {
      return BT::NodeStatus::RUNNING;
    }
    else
    {
      cleanup();
      return BT::NodeStatus::SUCCESS;
    }
  }

You are still forgetting that halt() is NOT a cleanup, you can not expect it to be called... unless your node was aborted during RUNNING.

facontidavide avatar Sep 09 '22 13:09 facontidavide