cpp icon indicating copy to clipboard operation
cpp copied to clipboard

binary-search-tree: left() and right() return modifiable subtree

Open siebenschlaefer opened this issue 5 years ago • 6 comments

In the exercise "binary-search-tree" the methods left() and right() are hard to implement correctly. The example implementation itself is IMHO incorrect.

The tests look like this:

template<typename T>
using tree_ptr = typename std::unique_ptr<binary_tree::binary_tree<T>>;

template<typename T>
static void test_leaf(const tree_ptr<T> &tree, const T& data, bool has_left, bool has_right)

//...

test_leaf<uint32_t>(tested->left(), 2, false, false);

That forces implementations of binary_tree::left() (and binary_tree::right()) to return a tree_ptr reference or rvaue which points to a non-const subtree.
Remember: const std::unique_ptr<some_type> means that the unique_ptr itself is const, not the object it points to.

Somebody could take the example implementation (example.h) and write

auto tree = binary_tree::binary_tree<int>(100);
tree.insert(50);
tree.left()->insert(150);

and thus invalidate the invariant of tree.

I consider any implementation of a binary search tree that can be corrupted that way as faulty. I came up with three alternatives that avoid this issue and pass the current tests:

  • binary_tree could have a member variable allow_insert that is true for the root and false for all subtrees
  • binary_tree could have a parent pointer and insert() could check for each of its parents if the new value violates that parent's invariant
  • left() and right() could copy the subtree and return a unique_ptr to that copy.

But IMHO none of those alternatives feels right.

The easiest solution would be if left() and right() could return const raw pointers. But one could argue that raw pointers result in unclear ownership.

siebenschlaefer avatar Apr 12 '19 02:04 siebenschlaefer

That is a good point, I missed the issue with the left() and right() return values when I reviewed it. I agree that allowing all inserts into whatever is returned from those functions is a bug. Having left() and right() at all feels weird to me for a container, but it seemed necessary to allow that to validate the tests, which are all about the structure of the resulting BST.

Of your options, I think having an allow_insert (or I'd probably prefer is_root) member variable is probably the best way to pass the current tests.

This weekend I'm going to go back and actually try to do this problem myself just from the test suite so I can weigh in a little better. (I did review it, but closed-book implementing it will be a bit different)

arcuru avatar Apr 12 '19 15:04 arcuru

In short, I think returning a const binary_tree<T>* is clear that it's non-owning and is probably the best approach. We could have instead used observer_ptr, but I think in the modern cpp world a raw pointer is equivalent. Bjarne made a good argument here

I wrote a full solution to mess with it. I think retaining the constness of the subtrees is a good extra exercise, since it's not entirely trivial, and passing in a const reference to the test_sort function forces the iterator to actually have the right const markers.

Forcing the student to return a const reference to a unique_ptr is probably not a good approach. @elyashiv may have some thoughts on this.

arcuru avatar Apr 14 '19 02:04 arcuru

Hi. Thanks for the review!

The left and right methods need not be in a tree - they expose internal structure to the user, and he has nothing good to do with them. The only reason I added them is so I can test...

I returned a const unique_ptr because that was the most straight forward way I thought of, and I didn't want any bare pointers... I guess a bare const pointer is a good approach here.

I wander how real world unit tests do these tests - do they test the internal structure?

elyashiv avatar Apr 19 '19 08:04 elyashiv

Yep, I understand why it's implemented the way it is, and ordinarily using unique_ptr everywhere is a great idea. The bare pointer I think more accurately reflects that it could be null and that you don't own it. It's an odd corner case.

In the real world you probably wouldn't test the internal structure at all, and just test for correctness (though possibly with benchmarks to validate the perf properties if needed). Choosing what to test would possibly be motivated by the structure of the implementation.

You could write it similarly to this to test the implementation, but you'd typically try to restrict access to the internal structure to just the test suite. Maybe by putting that access specifically into an internal namespace or something.

arcuru avatar Apr 27 '19 16:04 arcuru

C++ learner here, I was just starting this exercise and trying to understand the use of unique_ptr. If left and right return a unique_ptr to a tree node, that means the calling code takes ownership of that subtree. So am I correct in thinking that this means any call to left or right invalidates the parent tree, because the parent loses ownership of its subtree and no longer controls when its destructor gets called? If the variable holding the unique_ptr drops out of scope, the subtree is freed, the parent's pointer is dangling, and when the parent is freed, it will double-free the subtree?

alexkyllo avatar Oct 24 '19 15:10 alexkyllo

If left() or right would return a std::unique_ptr<binary_tree<T>> you would be right, that would transfer ownership and essentially remove that child from the tree.

But you could return a const std::unique_ptr<binary_tree<T>>& instead.
Does that help?

siebenschlaefer avatar Oct 24 '19 17:10 siebenschlaefer