Topic : Binary Trees
Author : Unknown
Page : 1

Binary Trees

Linked lists overcome some of the problems of arrays, but they create new ones. Accessing an element in a linked list is a linear (O(n)) time operation as is searching it. The only search that is possible on a linked list is linear search. Binary trees retain the linked lists' ability to grow and shrink as required, but they also have an O(log n) average access and search time. On this page we're going to particularly concentrate on binary search trees (but I'll just keep calling them binary trees).

The binary tree node is quite similar to the linked list node at first glance. However whereas the linked list node had one successor the binary tree node has two. I'll call these left and right (you'll see why later). So the basic structure is:
struct Node
{
int data;
Node *left, *right;
};

Again the tree starts simply with a pointer to the first node. The complete class specification is:
class binaryTree
{
private:
struct Node
{
int data;
Node *left, *right;
};

public:
void Insert(int newdata);
void Remove(int data);
int Find(int data);
};

Again this class doesn't do that much and you won't find it in a real program, but filling in the functions gives you an idea of how to deal with this type of structure. The power of the binary tree lies in where you put the data. When you are inserting a node you place it to the left if it is less than the value of the current node and put it to the right otherwise. This means that the data is effectively sorted. The Insert function is:
void binaryTree::Insert(int newdata)
{
Node *temp = new Node;
temp->data = newdata;
temp->left = NULL;
temp->right = NULL;
else
{
// find the correct location for the new node
// and insert it
int inserted = 0;
while(!inserted)
{
if(locn->data > temp->data)
{
if(locn->left == NULL)
{
locn->left = temp;
inserted = 1;
}
else
locn = locn->left;
}
else
{
if(locn->right == NULL)
{
locn->right = temp;
inserted = 1;
}
else
locn = locn->right;
}
}
}
}

The find function is probably the simplest of the three. It is quite similar to the binary search algorithm, but its worst case performance is O(n). It returns 1 if it finds the data, 0 otherwise.
int binaryTree::Find(int data)
{
while(1)   // infinite loop
{
if(temp == NULL)
return 0;
if(temp->data == data)
return 1;
if(temp->data < data)
temp = temp->right;
else
temp = temp->left;
}
}

As with linked lists the remove function is basically a find that also tracks the previous node.
void binaryTree::Remove(int data)
{
while(1)
{
if(temp == NULL)
return;
if(temp->data == data)
{
int left = 0;
if(prev->left == temp)
left = 1;
// deal with the simple case(s) where one
// child is empty
if(temp->left == NULL && left)
{
prev->left = temp->right;
delete temp;
return;
}
if(temp->right == NULL && left)
{
prev->left = temp->left;
delete temp;
return;
}
if(temp->left == NULL && !left)
{
prev->right = temp->right;
delete temp;
return;
}
if(temp->right == NULL && !left)
{
prev->right = temp->left;
delete temp;
return;
}
// the difficult case where both children are nonempty
while(temp->left != NULL && temp->right != NULL)
{
temp->data = temp->right->data;
prev = temp;
temp = temp->right;
}
if(temp->right == NULL)
prev->right = temp->left;
if(temp->left == NULL)
prev->right = temp->right;
delete temp;
return;
}
prev = temp;
if(temp->data < data)
temp = temp->right;
else
temp = temp->left;
}
}

As I said the remove function is basically just a find with some extra tracking. It looks much more complex because of what happens when we find the item we're looking for. The code before and after this if statement is identical to the code in the find function.

The complexity in the remove function is caused by having two nodes to go to from each node. For linked lists all we had to do was assign prev->next to temp->next to bypass temp. In this case the previous node has one link that points to temp and the other that points to a different tree altogether. We must take this into account. In addition there are four possible cases for the pointers from temp. These are:
-both pointers are NULL; ie no children
-the left pointer is NULL, but the right has children
-the right pointer is NULL, but the left has children
-both pointers are not NULL; temp has children from both
The first three cases are fairly easy to deal with. There is one pointer to temp and one meaningful pointer from temp so we set the pointer to temp equal to the pointer from temp and bypass temp that way (if both the pointers from temp are NULL then it doesn't matter which we choose). These cases are dealt with in the first four if statements. In these statements left is 1 if temp to the left of the previous node and 0 otherwise.

The last case is the hardest to deal with. In this case there is a pointer to temp in the previous node, but two meaningful pointers from temp. A simple assignment will not do. The answer is to turn it into one of the easy cases. To do this we copy the data from temp's right node into temp and then we go right. We keep doing this until we get to a node with one or zero children. Think about what this did. The first assignment removed the data in temp, but it also created a duplicate node. We're now moving that duplicate down the tree until we get to the end where we can delete it as above.

Page : 1