Subscribe to Stuck in an Infiniteloop        RSS Feed
-----

An In-Depth Look At Binary Search Trees

Icon 4 Comments
Author's Note: I started this project with the intention of using just a regular ol' binary tree. No qualifiers, no balancing functions, etc... As I wrote the functionality, I realized that a regular ol' binary tree serves no purpose, save academia. Why? It's not useful. Without a min/max check or any balancing capability it's just a complicated linked list. There is no time or execution complexity advantage (i.e. no left/right check means O(n) time). So this initial foray into Binary Trees will utilize a Binary Search Tree (AVL and other flavors to follow, if time allows).

--------------

Binary Search Trees


Quote

A binary tree is a data structure in which each node has at most two children. Typically the first node is known as the parent and the child nodes are called left and right.


A few examples:

Posted Image

The tree can be unbalanced, but as long as the "at most two children" is followed, it is a binary tree. A binary search tree is a binary tree where values smaller then the root go to the left and those larger go to the right. A few examples:

Posted Image


(Strings are/can be compared in various ways, here it is just an example using first character equivalence).

The following implementation is again templated C++. If you're not familiar with C++ or templates, that's ok, the concept of a linked list is the same regardless of specific language or feature implementation. Let your eyes glaze over the template<class T> portions :)/>


Like the Linked List, a Binary Tree is composed of Nodes:

template<class T>
class Node {
private:
	T data;
	Node<T>* left;
	Node<T>* right;
public:
	Node();
	Node(T);
	T getData()						{return data;};
	Node<T>* getLeftChild()			{return left;};
	Node<T>* getRightChild()		{return right;};
	void setLeftChild(Node<T>*);
	void setRightChild(Node<T>*);
	void setData(T);
};



It's implementation:

template<class T>
Node<T>::Node(){
	data = NULL;
	left = right = NULL;
}

template<class T>
Node<T>::Node(T element){
	data = element;
	left = right = NULL;
}

template<class T>
void Node<T>::setLeftChild(Node<T>* node){
	left = node;
}

template<class T>
void Node<T>::setRightChild(Node<T>* node){
	right = node;
}

template<class T>
void Node<T>::setData(T element){
	data = element;
}



A visual representation:

Posted Image

There are more scenarios then a Linked List, so the assignment of children will vary based on context. The retrieval functions are pretty much self explanatory.


Again, these Nodes are worthless on their own, they need a container/structure to reach their full potential:

#include "Node.h"
template<class T>
class BinarySearchTree{
private:
	Node<T>* root;
	Node<T>* curNode;
	//private recursive helper functions
	void destroyTree(Node<T>*);			//helper function for destructor
	void add(Node<T>*, T);
	void preOrder(Node<T>*);
	void postOrder(Node<T>*);
	void inOrder(Node<T>*);
	bool containsElement(T);
	int size;
public:
	BinarySearchTree();
	BinarySearchTree(T);
	~BinarySearchTree();

	//functionality
	void addNode(T);
	void removeNode(T);
	void preOrderTraversal();
	void postOrderTraversal();
	void inOrderTraversal();
	int totalNodes()					{return size;};
};



The first thing you'll notice right off the bat are the numerous private functions. These are not called by the user at all. Why are they there? A quick digression:

--

Recursion:

Binary Trees and by association, most tree structures, are excellent candidates for recursion. Looping through by calling itself is the most elegant way to handle the situation. This is certainly not a hard and fast rule, but most people agree on this method. However, recursion presents its own problems. While the code may be elegant, it can be harder to originally develop and later debug. For this post I have implemented nearly every function using recursion except for the removal of nodes. Why? I find it easier to explain what exactly is going on given each scenario (but I'm getting ahead of myself).

--

These private functions are the actual recursive elements of each part. the user/program calls the public "version" of the function, which in turn, calls the private one. This is also important because using a public function as a recursive one would require exposing the root of the tree (since the root is generally passed as the parameter), not a good idea.


The BinarySearchTree's implementation:


template<class T>
BinarySearchTree<T>::BinarySearchTree(){
	root = NULL;
	size = 0;
}

template<class T>
BinarySearchTree<T>::BinarySearchTree(T element){
	Node<T>* temp = new Node<T>(element);
	root = temp;
	size = 1;
}

template<class T>
BinarySearchTree<T>::~BinarySearchTree(){
	cout << endl << endl;
	//use helper function to deallocate the tree
	destroyTree(root);

	cout << "Size is (zero indicates all memory was freed): " << size << endl;
}

template<class T>
void BinarySearchTree<T>::addNode(T element){
	if (root == NULL){
		root = new Node<T>(element);
		size++;
		return;
	}
	else if (containsElement(element)){
		cout << "Tree already has value [" << element << "], insertion aborted." << endl;
		return;
	}
	else{
		add(root, element);
	}
}

template<class T>
bool BinarySearchTree<T>::containsElement(T element){
	curNode = root;
	while(curNode != NULL){
		if(curNode->getData() == element){
			return true;
		}
		else {
			if(element > curNode->getData())	curNode = curNode->getRightChild();
			else curNode = curNode->getLeftChild();
		}
	}
	return false;
}

template<class T>
void BinarySearchTree<T>::removeNode(T element){
	//some basic checks 
	if(root == NULL){
		cout << "No tree or it's empty!" << endl;
		return;
	}
		else if (root->getData() == element){
		cout << "Don't delete the root! (I'll work this in later.)" << endl;
		return;
	}
	bool found = false;
	Node<T>* parent; //temporary handle for the cur node's parent
	curNode = root; //get a temporary handle current node

	//search the tree for the element
	while(curNode != NULL){
		if(curNode->getData() == element){
			found = true;
			break;
		}
		else {
			parent = curNode; //this is important
			if(element > curNode->getData())	curNode = curNode->getRightChild();
			else curNode = curNode->getLeftChild();
		}
	}
	if(!found){
		cout << "Data does not exist in this tree!" << endl;
		return;
	}

	//ok we now know the node we want to rmeove is in here
	/*
		Three cases:
		1. leaf node -- very easy
		2. node w/ one child, somewhat challenging
		3. node w/e two children - annoying
	*/

	//leaf
	if((curNode->getLeftChild() == NULL) && (curNode->getRightChild() == NULL)){
		//ensure there are no dangling links
		if(parent->getLeftChild() == curNode) parent->setLeftChild(NULL);
		else parent->setRightChild(NULL);

		//remove
		delete curNode;
		size--; //our rudimentary memory allocator handle
		return;
	}

	//one child
	if((curNode->getLeftChild() == NULL && curNode->getRightChild() != NULL) ||
		(curNode->getLeftChild() != NULL && curNode->getRightChild() == NULL)){
			//handle the first instance
			if(curNode->getLeftChild() == NULL && curNode->getRightChild() != NULL) { //yes I did the same check twice
				/*
					This part in plain English:
					the node we want to renove only has one child, its right child
					we check the parent, if the "to be deleted" node is its left child
					assign linkage and delete
				*/
				if(parent->getLeftChild() == curNode){
					parent->setLeftChild(curNode->getRightChild());
				}
				else{ //same deal, but if the deleted node is on the right 
					parent->setRightChild(curNode->getRightChild());
				}
			}
			else { //left child present, no right child 
				/*
					This part in plain English:
					same deal as above, excpet now the left child is "alive"
					and the right child is nonexistent
				*/
				if(parent->getLeftChild() == curNode){
					parent->setLeftChild(curNode->getLeftChild());
				}
				else{
					parent->setRightChild(curNode->getLeftChild());
				}
			}
			//cut down on duplicate code
			//the delete/size-- could go inside each if/else portion
			delete curNode;
			size--;
			return;
	}

	//complex case, two children
	/*
		in short:
		We want to replace the node to be deleted with the smallest value
		in the right subtree. We then delete the leaf
	*/
	if((curNode->getLeftChild() != NULL)&&(curNode->getRightChild() != NULL)){
		Node<T>* treeChecker;
		treeChecker = curNode->getRightChild(); //assign handle and begin searching

		if((treeChecker->getLeftChild() == NULL) &&(treeChecker->getRightChild() == NULL)){
			//praise the lord its a leaf node, how easy
			curNode->setData(treeChecker->getData());
			delete treeChecker;
			size--;
			curNode->setRightChild(NULL);
		}
		else{ //right child has children
			//if it has a left child, move all the way down to find the smallest element
			if(curNode->getRightChild()->getLeftChild() != NULL){ //go to the curNode's right left child
				//long annoying names, but you'll thank me later
				Node<T>* leftCurrent;
				Node<T>* leftCurrentParent; 
				//inital assignment
				leftCurrentParent = curNode->getRightChild();
				leftCurrent = leftCurrentParent->getLeftChild();
				while(leftCurrent->getLeftChild() != NULL){ //let's go down the left child rabbit hole
					leftCurrentParent = leftCurrent;
					leftCurrent = leftCurrent->getLeftChild();
				}
				//assign the leftest most node's info to the node that was flagged for deletion
				curNode->setData(leftCurrent->getData());

				delete leftCurrent;
				size--;
				leftCurrentParent->setLeftChild(NULL); //take care of any danglers
			}
			else{ //left child is non existent, use the right instead
				Node<T>* rightTemp;
				rightTemp = curNode->getRightChild();
				curNode->setData(rightTemp->getData());
				delete rightTemp;
				size--;
			}
		}
		return;
	}

}

template<class T>
void BinarySearchTree<T>::add(Node<T>* root, T element){
	//we want to maintain a somewhat balanced tree
	//so we'll do a binary search tree check
	
	//lesser values to the left, greater to the right
	if (element < root->getData()){
		if(root->getLeftChild() != NULL){
			add(root->getLeftChild(), element);
		}
		else{
			root->setLeftChild(new Node<T>(element));
			size++;
		}
	}
	else{
		if(root->getRightChild() != NULL){
			add(root->getRightChild(), element);
		}
		else{
			root->setRightChild(new Node<T>(element));
			size++;
		}
	}
}

template<class T>
void BinarySearchTree<T>::destroyTree(Node<T>* root){
	//recurse down the tree
	if(root != NULL){
		destroyTree(root->getLeftChild());
		destroyTree(root->getRightChild());
		delete root;
		size--;
	}
}

template<class T>
void BinarySearchTree<T>::preOrderTraversal(){
	cout << endl <<  "Pre Order:" << endl;
	preOrder(root);
}

template<class T>
void BinarySearchTree<T>::postOrderTraversal(){
	cout << endl << "Post Order:" << endl;
	postOrder(root);
}

template<class T>
void BinarySearchTree<T>::inOrderTraversal(){
	cout << endl << "In Order:" << endl;
	inOrder(root);
}

template<class T>
void BinarySearchTree<T>::preOrder(Node<T>* root){
	cout << root->getData() << " ";
	if (root->getLeftChild() != NULL) {preOrder(root->getLeftChild());}
	if (root->getRightChild() != NULL) {preOrder(root->getRightChild());}
}

template<class T>
void BinarySearchTree<T>::postOrder(Node<T>* root){
	if (root->getLeftChild() != NULL) {postOrder(root->getLeftChild());}
	if (root->getRightChild() != NULL) {postOrder(root->getRightChild());}
	cout << root->getData() << " ";
}

template<class T>
void BinarySearchTree<T>::inOrder(Node<T>* root){
	if (root->getLeftChild() != NULL) {inOrder(root->getLeftChild());}
	cout << root->getData() << " ";
	if (root->getRightChild() != NULL) {inOrder(root->getRightChild());}
}



I've left a great deal of comments all around the above implementation just in case my self documenting code fails to adequately explain my thought process and/or intentions.

Few notes:

-- Every time a node is added or removed, size is incremented/decremented accordingly. This saves a full linear iteration to count the number of nodes currently in the tree.
-- You can delete a root node in practice, however for the purposes of this post I have intentionally removed that functionality. An error message is thrown if the user tries.
-- Adding a new node uses a java-esque method of adding newly allocated objects. Since the tree handles its own memory on clean up this is acceptable.
-- Duplicates are not allowed and will notify the user if attempted.
-- As previously mentioned, everything operates on recursion except for the removeNode() algorithm. If this offends you, feel free to stop reading ;)/>
-- It's challenging to fully encompass recursion in an illustration (see below)

Visual guides to the above functionality:

Constructors/Destructor:

Posted Image


Functions:

Posted Image

Posted Image

The code has this in the comments, but to briefly summarize what you just saw in the above picture in regards to insertion/deletion of nodes:

Insertion:
1. No root? Make the node the root and return
2. Else, check the value and determine which subtree to traverse
3. add the new node

Three cases in deletion:
1. leaf node -- very easy, just delete it and set parent's link to NULL
2. node w/ one child, assign the parent node's link of the "to be deleted" node to its child
3. node w/ two children - traverse left subtree OR go down the right if left node is NULL

Traversal:

Posted Image

These are pretty self explanatory. Each utilizes recursion to print the elements of the tree. The only difference is when the current item is printed. These traversals are also the basis of infix, prefix, and postfix equations and their evaluations (neat, huh?).

Examples:

Example one (basic tree building, no removals):

#include <iostream>
#include <string>
#include "BinarySearchTree.h"
using namespace std;

int main(){
	BinarySearchTree<int>* tree = new BinarySearchTree<int>(6);
	tree->addNode(10);
	tree->addNode(3);
	tree->inOrderTraversal();
	tree->preOrderTraversal();
	tree->postOrderTraversal();
	delete tree;
	return 0;
}



Output:

Quote

In Order:
3 6 10
Pre Order:
6 3 10
Post Order:
3 10 6

Size is (zero indicates all memory was freed): 0
Press any key to continue . . .


Diagram:

Posted Image

Example two (leaf node deletion):

#include <iostream>
#include <string>
#include "BinarySearchTree.h"
using namespace std;

int main(){
	BinarySearchTree<int>* tree = new BinarySearchTree<int>(6);
	tree->addNode(10);
	tree->addNode(3);
	tree->removeNode(3);
	tree->inOrderTraversal();
	tree->preOrderTraversal();
	tree->postOrderTraversal();
	delete tree;
	return 0;
}



Output:

Quote

In Order:
6 10
Pre Order:
6 10
Post Order:
10 6

Size is (zero indicates all memory was freed): 0
Press any key to continue . . .


Diagram:

Posted Image

Example three (deletion with one child):

#include <iostream>
#include <string>
#include "BinarySearchTree.h"
using namespace std;

int main(){
	BinarySearchTree<int>* tree = new BinarySearchTree<int>(6);
	tree->addNode(10);
	tree->addNode(3);
	tree->addNode(2);
	tree->removeNode(3);
	tree->inOrderTraversal();
	tree->preOrderTraversal();
	tree->postOrderTraversal();
	delete tree;
	return 0;
}



Output:

Quote

In Order:
2 6 10
Pre Order:
6 2 10
Post Order:
2 10 6

Size is (zero indicates all memory was freed): 0
Press any key to continue . . .


Diagram:

Posted Image

Example four (deletion with two children):

#include <iostream>
#include <string>
#include "BinarySearchTree.h"
using namespace std;

int main(){
	BinarySearchTree<int>* tree = new BinarySearchTree<int>(6);
	tree->addNode(10);
	tree->addNode(3);
	tree->addNode(2);
	tree->addNode(4);
	tree->removeNode(3);
	tree->inOrderTraversal();
	tree->preOrderTraversal();
	tree->postOrderTraversal();
	delete tree;
	return 0;
}



Output:

Quote

In Order:
2 4 6 10
Pre Order:
6 4 2 10
Post Order:
2 4 10 6

Size is (zero indicates all memory was freed): 0
Press any key to continue . . .


Diagram:

Posted Image

There are an infinite number of scenarios given the nature and order of possible input. However, all will boil down to one of the three cases (no children, one child, or two children) as mentioned in the code. I have shown one of each of those "base cases" above. (examples two-four).

Hopefully you found this post helpful. While I have tested extensively there is a chance that there are typos or errors. If you happen upon one please feel free to post a comment below or PM me.

Look for more data structures in the near future! Happy Coding!

4 Comments On This Entry

Page 1 of 1

LJ080805 Icon

26 November 2009 - 04:49 PM
Really neat stuff man. I don't know why but as I'm learning new concepts in my class you're posting up content exactly about what I'm about learn and code for our problem sets, just thought it was eerie how nicely coincidental it all is. Anyways great post it's been helping me learn a lot easier than my book is. Hope you keep it up!
0

KYA Icon

26 November 2009 - 05:48 PM

LJ080805, on 26 Nov, 2009 - 04:49 PM, said:

Really neat stuff man. I don't know why but as I'm learning new concepts in my class you're posting up content exactly about what I'm about learn and code for our problem sets, just thought it was eerie how nicely coincidental it all is. Anyways great post it's been helping me learn a lot easier than my book is. Hope you keep it up!



Weird :ph34r:

Glad you found it helpful :)
0

DaneAU Icon

30 November 2009 - 11:14 PM
This is exactly what this site needs - more comprehensive indepth looks and we could write a community course :o

Gj
0

Locke Icon

01 December 2009 - 12:25 AM
Yeah, I'm having the same thing pop up. We just started binary search trees about a week (or so) ago, and then you post this. :D

You're spying on us. LEAVE US ALONE!
0
Page 1 of 1

May 2018

S M T W T F S
  12345
6789101112
13141516171819
2021 22 23242526
2728293031  

Tags

    Recent Entries

    Recent Comments

    Search My Blog

    0 user(s) viewing

    0 Guests
    0 member(s)
    0 anonymous member(s)