ruby icon indicating copy to clipboard operation
ruby copied to clipboard

binary-search-tree: Binary Search Tree Exercise

Open bmulvihill opened this issue 9 years ago • 12 comments

Hey I've been working on the BST exercise over at https://github.com/exercism/xcrystal/pull/72 - and while researching other implementations I noticed a lot aren't fully implementing a binary search tree.

We are only implementing and testing insertion and traversal, but not searching and deletion. A binary search tree is a great data structure to learn and implementing all the operations provide additional challenge to the programmer. Maybe we could structure the tests in approximate descending difficulty - search, traverse, insert, delete.

I also think we should separate tests which are language specific (with some sort of annotation) from the actual BST functionality (i.e. returning an enumerator).

bmulvihill avatar Dec 21 '16 15:12 bmulvihill

We are only implementing and testing insertion and traversal, but not searching and deletion. A binary search tree is a great data structure to learn and implementing all the operations provide additional challenge to the programmer. Maybe we could structure the tests in approximate descending difficulty - search, traverse, insert, delete.

Some chaotically written down thoughts:

I like to add those options as suggestions when I nit them, but I kind of feel like search (#include?) should be mandatory for a BST. A nice thing about using #include? (I think) is that you can either include Enumerable, which gives you an easy solution, but isn't very performant ( vs ), or you can define it yourself to be more efficient. This is a nice way to let people decide for themselves which trade-offs they want to make.

It's worth pointing out that it's not necessarily standard to implement delete operations for binary trees, because in general doing so changes the performance behaviour of searching. Unless I'm much mistaken, depending on the order of operations, it's possible to end up with a tree in which searching takes .

There are also different ways of traversing the tree. Should we add those as well? More generally, what would comprise a "complete BST" and what goals would adding each requirement provide? Is the goal to introduce exercists to the data structure, to the ruby language and its features, or something else?

remcopeereboom avatar Jan 26 '17 12:01 remcopeereboom

@remcopeereboom When thinking about what operations to include on a BST I look back to CS Data Structures course and they usually always go through the main 4 (Traverse, Search, Insert, Delete).

Quick question how would implementing delete change the behavior of search? I've always seen these methods implemented separately of each other.

Regarding the different ways of traversal I don't think there is a "wrong" way, but through nits you can encourage more efficiency. I think the goal for this particular problem (and any algorithm type problem) is not only to practice the language but also learn the algorithm/data structure and its benefits since they are building blocks for a deeper understanding of programming in general.

bmulvihill avatar Jan 26 '17 15:01 bmulvihill

@bmulvihill

Quick question how would implementing delete change the behavior of search? I've always seen these methods implemented separately of each other.

The delete operations can cause the tree to be unbalanced towards one or the other direction. It's possible to end up with a tree that looks like:

/\
  /\
    /\

This transforms the search from logarithmic to linear worst case and average case performance. The problem stems from the fact that while it is trivial to delete nodes that have either zero children or one child, deleting nodes that have two children mean that you now have to select a branch to be the top node. Depending on your bias towards either the predecessor or the successor child node, this can eventually lead to either a left-biased or a right-biased tree respectively. This will lead to a tree that has linear performance for the major operations.

You can mitigate the problem by alternating left and right biasses. Experimentally (after many random insertions or deletions) this leads to O(sqrt n) average case performance, because deletions and insertions take place in random branches (apparently the alternating bias isn't quite enough to keep the tree perfectly balanced, but still good enough to prevent the tree from being linear). I'm pretty sure that nobody has proven this algebraically yet however, so that is still an open problem in computer science!

remcopeereboom avatar Jan 26 '17 17:01 remcopeereboom

@remcopeereboom Couldn't this problem also exist in insertion if we are always inserting values to one side of the tree, and wouldn't this be more of a problem of balancing the tree? Which can be solved by more specific types of BST (i.e. Red Black Trees). That would be outside the scope of this exercise though, or it could be a harder version of the same problem.

bmulvihill avatar Jan 26 '17 18:01 bmulvihill

Couldn't this problem also exist in insertion if we are always inserting values to one side of the tree, and wouldn't this be more of a problem of balancing the tree?

Absolutely. Worst case performance of a BST is O(n), not O(log n). But generally we don't have worst case performance because we don't insert elements in monotonically increasing or decreasing order. Instead we insert elements at random. The average case or expected performance is O(log n) for a BST with random insertions. But only O(sqrt n) for a tree with random insertions and deletions.

Which can be solved by more specific types of BST (i.e. Red Black Trees).

A self balancing tree like a Red-Black Tree does solve this problem. In fact Red-Black trees are used specifically for this purpose!

remcopeereboom avatar Jan 27 '17 11:01 remcopeereboom

Good points @remcopeereboom I had to do a little research and refresh my brain a bit. I still think there is value in implementing the delete operation as it requires some thought, and is considered one of the basic operations of a BST (a least from an educational standpoint). Through nits one can encourage a more robust BST, or this exercise could just be a "basic" BST and there could be a different exercise that builds on the basic tree and deals with balancing.

bmulvihill avatar Jan 27 '17 13:01 bmulvihill

I still think there is value in implementing the delete operation as it requires some thought, and is considered one of the basic operations of a BST (a least from an educational standpoint).

I can get behind that.

Through nits one can encourage a more robust BST, or this exercise could just be a "basic" BST and there could be a different exercise that builds on the basic tree and deals with balancing.

There's value in both approaches. My concern with the nits is that I think most people won't get one. Perhaps some information in the README on the performance would nudge people in the right direction?

remcopeereboom avatar Jan 27 '17 17:01 remcopeereboom

I've been working on writing new tests for this problem that fix some of the issues discussed above, will provide a little more challenge, and will hopefully guide people a little more to some Ruby conventions for data structures.

I am kind of stuck on whether to force the tree to add nodes in the order they are given or to allow people to reorder the tree as they see fit as long as they hold to the requirements that all value left are <= to the parent and that all values to the right are > to the parent. Didactically I think fixing the node order is best. It also has the benefit of making the tests easier. But the downside is that there is less flexibility in how to implement things. I would love other people's thoughts on that.

This question also has some implications for what to consider tree identity. Do we care only that the tree contains the same items, or do we care in which subtrees they are in. I am strongly leaning towards the latter, because it seems reasonable that if a == b, that a.left == b.left and a.right == b.right. That said, there are plenty of examples where that kind of transitive behaviour does not hold in Ruby (operations on Float's being a prime example).

The proposed (expanded) API:

class BST
  def self.new
    # Returns an empty tree.
  end

  def size
    # Returns the size of the tree.
  end

  def empty?
    # Returns true if the tree is empty, false otherwise.
  end

  def item
    # Returns the item at the root of the tree.
  end

  def left
    # Returns the left sub-tree.
  end

  def right
    # Returns the right sub-tree.
  end

  def include?(item)
     # Returns true if the item is in the tree, false otherwise.
  end

  def insert(item)
    # Inserts an item into the tree.
  end

  def delete(item)
    # Deletes the item if it is in the tree, does nothing if it is not.
  end

  def each_in_order(&block)
    # Yields the elements in sorted order
    # Returns an enumerator to each_in_order if no block is given.
    # Returns self if a block is given.
  end

  def ==(other)
    # Returns false if other is not a BST.
    # Returns true if item in root equals item in other's root and left == other.left and right == other.right,
    # false othewise.
  end
end

remcopeereboom avatar Feb 25 '17 13:02 remcopeereboom

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 29 '17 20:04 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 30 '17 09:06 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Aug 29 '17 19:08 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Oct 31 '17 10:10 stale[bot]