CMSC 335  Advanced Data Structures

 

Lecture # 4

Introduction To Binary Trees

 

1.      Definition

A binary tree is either NULL or consists of a root that holds some item or items of data and has a left and a right subtree which are themselves binary trees.

 

2.      External and Internal Path Length

Consider the binary tree (2-tree) where all paths terminate in a special external node (all leaves have left and right subtrees which are external nodes)

Define

E = external path length = the total distance of all external nodes from the root

I = internal path length = the total distance of all internal nodes from the root

 

       Then,

                        E = I + 2n        where n is the number of internal nodes

       Proof (by induction)

            Step 1.  Show that  P(n): E = I + 2n      is true for the base case (n= 1)

                        Then the tree consists of a single internal node, the root, and its two

(external node) children.

I = 0,   n = 1,  and E = 2           and the statement P(1) is correct.

            Step 2.  The inductive hypothesis – show P(n) à P(n+1)

                        Assume P(n) is true.

                        Replace any one of the external nodes – say the one at a distance d from

the root with an internal node (and its two external children).

 

Then E’ the external path length of the new tree (with n+1 nodes) is

 

            E’ = E – d + 2(d + 1) 

 

since the original tree lost an external node at depth d but gained two new

ones at depth d + 1.

 

And similarly,

 

            I’ = I + d

 

Since one additional internal node was added at depth d.  Then

 

            E’ = E – d + 2(d+1) = I’ + 2(n + 1)      ?

            E + d + 2 = I + d + 2(n + 1)

Which reduces to

            E = I + 2n

Which is assumed true.

 

3.      Binary Search tree

A binary search tree is a binary tree, one of whose data elements is a comparable object called a key, whose left subtree is a binary search tree all with all of its keys less than the key of the root, and whose right subtree is a binary search tree with all of its  keys greater than the key of the root.

 

Given the recursive definition of a binary tree, we will distinguish between a tree object and its nodes.  We will locate the recursion in the node objects.  The bstnode objects will enforce the constraint on the keys of its children, and will have a tree of nodes as its subtrees.

 

4.      Determine the number of compares to find a key in a binary search tree of size n nodes

Consider the 2-tree where an unsuccessful search terminates at an external leaf node. 

Let S(n) = number of compares done in an average successful search of a 2-tree with n internal nodes.

Let U(n) = number of compares for an average unsuccessful search of the same 2-tree.

 

The average number of comparisons in a successful search is equal to the average of the internal path length plus 1 (the initial comparison with the root)

 

      S(n) = I/n  + 1

 

And the average number of comparisons for an unsuccessful search would be equal to

 

      U(n) = E/(n + 1)

 

Since in a 2-tree with n internal nodes, there are always n + 1 external nodes.

 

Using the theorem stating E = I + 2n  that we proved earlier, we can now demonstrate that

     

      S(n) = (1 + 1/n) U(n) – 1                                             (1)

 

Where

 

      E = (n + 1) U(n)           I  = n S(n) - n  from above, and plugging into E = I + 2n

 

      (n + 1) U(n) = n S(n) - n + 2n

 

      S(n) = (1 + 1/n) U(n) – 1          the desired result.

 

The number of comparisons to find a node in the tree is exactly 1 more than the number of comparisons needed to insert it.  An insertion would involve a traversal down to an external node (an unsuccessful search).  Therefore

 

      S(n) = 1 + (U(0) + U(1) + … + U(n-1)) /n                   (2)

 

since the node to be found is equally likely to be the first node inserted {where number of compares is U(0) since insertion is into an empty tree} or the kth node

inserted {U(k-1) since there are k – 1 nodes already in the tree when this node is inserted} for k = 1, n.

 

Combining formulae (1) and (2) we obtain:

 

      (n + 1) U(n) = 2n + U(0) + U(1) + … + U(n-1)            (3)

 

Write this recurrence for n –1  {replace n with n-1 everywhere it appears in (3)

 

      n U(n – 1) = 2(n – 1) + U(0) + … + U(n –2)

 

and subtract from equation (3) to obtain:

 

      (n + 1) U(n) – n U(n – 1) = U(n – 1)  + 2

 

      (n + 1)U(n) = (n + 1)U(n – 1) + 2

 

      U(n) = U(n – 1) + 2/(n + 1)                  where U(0) = 0

 

The solution to this recurrence is

 

      U(n) = 2(1/(n+1) + 1/n + 1/(n-1) + … + ½ + 1)

 

And for n à ¥  this sum goes to 2ln(n), the natural log of n

 

And converting to base 2 logs, this becomes 2(ln 2)(lg n) = 1.39 lg(n)

 

Thus U(n) @ 2ln(2)lg(n) = 1.39 lg(n)

 

     The average number of comparisons needed needed in the average binary search tree

with n nodes is approximately 1.39 lg(n)

 

5.      Insertion into a Binary Search Tree

All new keys are inserted into a binary search tree at a leaf.

There can be no duplicate keys in a BST.

 

  template <class Comparable>

  void binstree<Comparable>::add(const Comparable &  item)

  {

        if (!root)

                    root = new bstnode<Comparable> (item);

        else

                    root -> add (item);

  }

 

  template <class Comparable>

  bstnode<Comparable> * bstnode<Comparable>::add(const Comparable & value)

  {

        bstnode<Comparable> * newNode = new bstnode<Comparable> (value);

        insert (newNode);

        return newNode;

  }

 

  template <class Comparable>

  void bstnode<Comparable>::insert (bstnode<Comparable> * newNode)

  {

        assert(newNode != 0);

        if (newNode -> data < data ) {

                    if (leftptr != 0)

                                leftptr -> insert (newNode);

                    else

                                leftptr = newNode;

                    }

        else if (newNode -> data > data) {

                    if (rightptr != 0)

                                rightptr -> insert (newNode);

                    else

                                rightptr = newNode;

                    }

        //else do nothing if key already appears in the tree

  }

 

6.      Deleting from a Binary Search Tree

If the key to be removed is in a leaf node, simply delete the node

 

If the key to be removed is in a node with no right child, make that node's left child, the parent's right child and delete the node.

 

Else, replace the node holding the key with the leftmost descendant of the right child, and delete the node.

 

 

The remove method in a binstree object delegates the actual working of removing the node to the bstnode objects beginning with the root.   A pointer to the deleted node will be returned to junknode which is then deleted.

 

  template <class Comparable>

  void binstree<Comparable>::remove (Comparable &  item)

  {

        if (root)  {

                    bstnode<Comparable> * junknode = 0;

                    bstnode<Comparable> * hold;

                    hold = root -> remove (item, junknode);

                    if (junknode) {

                           if (root == junknode)

                                   root = hold;

                    delete junknode;

                                          }

                    }

  }

 

The remove method in bstnode will return a pointer to the node to be reattached, and a reference parameter to a pointer to the node to be junked.

 

  template <class Comparable>

  bstnode<Comparable> *  bstnode<Comparable>::remove(Comparable & value,

                                          bstnode<Comparable> * & bst)

  {

        if (value == data) { // we're the one

                    bst = this;

                    if (!right ())

                                return left ();

                    // else find and remove the leftmost descendant of the right child

                    bstnode<Comparable> * newroot;

                    right(right() -> removeLeftmostDescendant(newroot) );

                    // connect the new root

                    //make the left child of this node the left child of  newroot

                    newroot -> left(left() ); 

                    newroot -> right(right() );

                    return  newroot;

                    }

        else if (value < data) {  // remove from left child

                    if (!left() )

                                return this;   // no left child

                    // do the deletion

                    left(left()-> remove(value, bst));

                    return this;

                    }

        else { // remove from right child

                    if (!right() )

                                return this;  // no right child

                    // do deletion

                    right(right() -> remove(value, bst));

                    return this;

                    }

  }

 

  template <class Comparable>

  bstnode<Comparable> * bstnode<Comparable>::removeLeftmostDescendant

                                                                                                                                          (bstnode<Comparable> * & childptr)

  {

        // see if we are the leftmost node

        bstnode<Comparable> * leftchild = left();

        if (!leftchild)  { // we are

                    childptr = this;

                    return right(); // remove self

                    }

        // else do the deletion

        left(leftchild -> removeLeftmostDescendant (childptr) );

        return this;

  }