cpp
cpp copied to clipboard
binary-search-tree: left() and right() return modifiable subtree
In the exercise "binary-search-tree
" the methods left()
and right()
are hard to implement correctly. The example implementation itself is IMHO incorrect.
The tests look like this:
template<typename T>
using tree_ptr = typename std::unique_ptr<binary_tree::binary_tree<T>>;
template<typename T>
static void test_leaf(const tree_ptr<T> &tree, const T& data, bool has_left, bool has_right)
//...
test_leaf<uint32_t>(tested->left(), 2, false, false);
That forces implementations of binary_tree::left()
(and binary_tree::right()
) to return a tree_ptr
reference or rvaue which points to a non-const
subtree.
Remember: const std::unique_ptr<some_type>
means that the unique_ptr
itself is const
, not the object it points to.
Somebody could take the example implementation (example.h
) and write
auto tree = binary_tree::binary_tree<int>(100);
tree.insert(50);
tree.left()->insert(150);
and thus invalidate the invariant of tree
.
I consider any implementation of a binary search tree that can be corrupted that way as faulty. I came up with three alternatives that avoid this issue and pass the current tests:
-
binary_tree
could have a member variableallow_insert
that istrue
for the root andfalse
for all subtrees -
binary_tree
could have a parent pointer andinsert()
could check for each of its parents if the new value violates that parent's invariant -
left()
andright()
could copy the subtree and return aunique_ptr
to that copy.
But IMHO none of those alternatives feels right.
The easiest solution would be if left()
and right()
could return const
raw pointers. But one could argue that raw pointers result in unclear ownership.
That is a good point, I missed the issue with the left()
and right()
return values when I reviewed it. I agree that allowing all inserts into whatever is returned from those functions is a bug. Having left()
and right()
at all feels weird to me for a container, but it seemed necessary to allow that to validate the tests, which are all about the structure of the resulting BST.
Of your options, I think having an allow_insert
(or I'd probably prefer is_root
) member variable is probably the best way to pass the current tests.
This weekend I'm going to go back and actually try to do this problem myself just from the test suite so I can weigh in a little better. (I did review it, but closed-book implementing it will be a bit different)
In short, I think returning a const binary_tree<T>*
is clear that it's non-owning and is probably the best approach. We could have instead used observer_ptr
, but I think in the modern cpp world a raw pointer is equivalent. Bjarne made a good argument here
I wrote a full solution to mess with it. I think retaining the constness of the subtrees is a good extra exercise, since it's not entirely trivial, and passing in a const reference to the test_sort
function forces the iterator to actually have the right const markers.
Forcing the student to return a const reference to a unique_ptr is probably not a good approach. @elyashiv may have some thoughts on this.
Hi. Thanks for the review!
The left
and right
methods need not be in a tree - they expose internal structure to the user, and he has nothing good to do with them. The only reason I added them is so I can test...
I returned a const unique_ptr because that was the most straight forward way I thought of, and I didn't want any bare pointers... I guess a bare const pointer is a good approach here.
I wander how real world unit tests do these tests - do they test the internal structure?
Yep, I understand why it's implemented the way it is, and ordinarily using unique_ptr
everywhere is a great idea. The bare pointer I think more accurately reflects that it could be null and that you don't own it. It's an odd corner case.
In the real world you probably wouldn't test the internal structure at all, and just test for correctness (though possibly with benchmarks to validate the perf properties if needed). Choosing what to test would possibly be motivated by the structure of the implementation.
You could write it similarly to this to test the implementation, but you'd typically try to restrict access to the internal structure to just the test suite. Maybe by putting that access specifically into an internal namespace or something.
C++ learner here, I was just starting this exercise and trying to understand the use of unique_ptr
.
If left
and right
return a unique_ptr
to a tree node, that means the calling code takes ownership of that subtree. So am I correct in thinking that this means any call to left
or right
invalidates the parent tree, because the parent loses ownership of its subtree and no longer controls when its destructor gets called? If the variable holding the unique_ptr
drops out of scope, the subtree is freed, the parent's pointer is dangling, and when the parent is freed, it will double-free the subtree?
If left()
or right
would return a std::unique_ptr<binary_tree<T>>
you would be right, that would transfer ownership and essentially remove that child from the tree.
But you could return a const std::unique_ptr<binary_tree<T>>&
instead.
Does that help?