INT2008 Data Structures and Algorithms

Lecture 9 - AVL (Adelson-Velskii and Landis) Trees

READING:

"Data Structures and Algorithm Analysis in C", M A Weiss – Chapter 4, section 4


Introduction

Trees are for Faster Search

Recall that the purpose of using binary trees is to combine the benefits of a linked list (ease of insertion and deletion) with the speed of a binary search (i.e. logarithmic time).

Balanced Trees

However, for a tree search to have the efficiency of a binary search the tree must be balanced (see below for a definition). Our first tree-building algorithm (InsertNode from Lecture 7) will create an approximately balanced tree given a random distribution of key values.

BUT: key values are often inserted in sequential order, in which case InsertNode will create the equivalent of a linked list with consequent linear sequential search times.

AND: our delete algorithm from the last lecture, although simple, can cause large imbalances depending on the point of deletion.

Answers:

  1. Periodically test the tree to see how imbalanced it has become (e.g. using the FindMaxDepth function from Lecture 8). If the imbalance goes beyond a predefined point we can rebuild the tree so that it becomes balanced again (a sketch of one way to do this follows this list). In this way we can keep using our simple (and therefore fast) insertion and deletion algorithms.
  2. Use better insertion and deletion algorithms so that the tree remains balanced after each insertion and deletion (i.e. the AVL approach described in the rest of this lecture).
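For the first answer, a common approach is to copy the values into a sorted array with an in-order traversal and then rebuild the tree by repeatedly taking the middle element as the root. The following is a minimal sketch, assuming the simple Node and Item types from Lecture 7; FillArray and RebuildBalanced are illustrative names, not from the lectures:

void FillArray(Node *Root, Item *A, int *i)
{
   if(!Root) return;
   FillArray(Root->Left, A, i);
   A[(*i)++] = Root->Value;           /* in-order visit = ascending order */
   FillArray(Root->Right, A, i);
}

Node *RebuildBalanced(Item *A, int lo, int hi)
{
   int mid;
   Node *Root;

   if(lo > hi) return NULL;           /* empty range: empty sub-tree */
   mid = (lo + hi) / 2;               /* middle value becomes the root */
   Root = malloc(sizeof(Node));
   Root->Value = A[mid];
   Root->Left  = RebuildBalanced(A, lo, mid - 1);
   Root->Right = RebuildBalanced(A, mid + 1, hi);
   return Root;
}

Because each recursive call splits the remaining values in half, the rebuilt tree has the minimum possible height.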

Balance Conditions

The strongest balance condition is to ensure that the difference between the longest and shortest paths from the root to any null pointer is zero for trees with exactly 2^k - 1 nodes (where k is a positive integer), and at most one for all other trees.

In practice this standard of balance is expensive to achieve, so a looser condition is used: a tree is considered balanced if, for every node in the tree, the height of its left sub-tree differs from the height of its right sub-tree by at most 1. This is the definition of an AVL tree.

For example:

[Figures: examples of an AVL tree, a non-AVL tree, and a fully balanced tree]

The definition of an AVL tree is important because fairly simple algorithms exist to insert and delete from an AVL tree without destroying the AVL property.
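The AVL condition is also easy to test with a recursive function in the style of FindMaxDepth from Lecture 8. The following is a minimal sketch, assuming the simple Node structure from the earlier lectures (Value, Left, Right); CheckAVL is an illustrative name, not from the reading:

/* Returns the height of the tree (an empty tree has height -1),
   or -2 if the AVL condition is violated anywhere in the tree. */
int CheckAVL(Node *Root)
{
   int LeftH, RightH, Diff;

   if(!Root)
      return -1;

   LeftH  = CheckAVL(Root->Left);
   RightH = CheckAVL(Root->Right);
   if(LeftH == -2 || RightH == -2)
      return -2;                      /* violation further down */

   Diff = LeftH - RightH;
   if(Diff < -1 || Diff > 1)
      return -2;                      /* this node is out of balance */

   return (LeftH >= RightH ? LeftH : RightH) + 1;
}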

 

Insertion into an AVL Tree

An AVL tree node has an additional data element to keep track of the height of each sub-tree. Kruse uses a balance factor that records whether the left and right sub-trees of a node are of equal height, the left is higher, or the right is higher. Weiss (Figure 4.37) stores the actual height of the sub-tree rooted at each node:

typedef struct avlnode  {
   Item Value;               /* the key stored at this node */
   int Height;               /* height of the sub-tree rooted here */
   struct avlnode *Left;     /* left sub-tree (smaller keys) */
   struct avlnode *Right;    /* right sub-tree (larger keys) */
} AVLNode;
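In the code that follows it is convenient (as in Weiss) to define the height of an empty sub-tree as -1, so that a node with no children has height 0. The functions below therefore assume a small helper that is safe to call on a NULL pointer:

/* height of a (possibly empty) sub-tree; an empty tree has height -1 */
int Height(AVLNode *Node)
{
   return Node ? Node->Height : -1;
}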

The basic insert procedure is the same as the simple InsertNode function from Lecture 7. However, if the insertion causes the left and right sub-trees of any node to differ by more than 1 then this needs to be fixed using a technique called rotation.

Four cases of insertion must be considered:

  1. Insertion into the left subtree of the left child of ‘root’
  2. Insertion into the right subtree of the left child of ‘root’
  3. Insertion into the left subtree of the right child of ‘root’
  4. Insertion into the right subtree of the right child of ‘root’

Rotation

Consider the following tree: the left and right sub-trees of k2 differ in height by two, so the tree is out of balance. The way we fix this is to move the root of the larger sub-tree (in this case k1) so that it becomes the new root of the overall tree (replacing k2); k2 then becomes the root of the right sub-tree. This is called a single right rotation because the root is rotated to the right:

Single Right Rotation

[Figure: single right rotation, before and after (Weiss Figure 4.31)]

Note that the original right sub-tree of k1 (Y) becomes the new left sub-tree of k2: this is the only way that the keys remain in the correct order (every key in Y is greater than k1 and less than k2).

This fixes case 1 above.

Consider the alternative case, where the larger sub-tree is on the far right:

Single Left Rotation

[Figure: single left rotation, before and after (Weiss Figure 4.33)]

This fixes case 4. It is called a single left rotation because the root of the imbalanced tree is rotated left, becoming the root of the left sub-tree. To make this clearer consider the following example:

[Figure: an example of a single rotation]

Double Rotations

So far we have looked at two cases where the larger sub-tree (the one we need to rotate upwards) is either on the far left of the out-of-balance sub-tree (case 1 = right rotation) or on the far right (case 4 = left rotation). Unfortunately, these are the simple cases. If the larger sub-tree is one of the inner sub-trees (which means its values lie between k1 and k2) then a single rotation will not fix the imbalance:

[Figure: a single rotation fails for case 2, before and after (Weiss Figure 4.34)]

Instead we use a procedure called double rotation, where we move the root of the inner (larger) sub-tree up so that it becomes the root of the original imbalanced tree:

[Figure: left right double rotation, before]

First we rotate k2 to the left:

[Figure: left right double rotation, after the first (left) rotation]

Now we have the longer sub-tree on the far left, so we are back in a situation we have already considered, and a right rotation will restore the tree’s balance:

[Figure: left right double rotation, after (Weiss Figure 4.35)]

The previous manipulation was a left rotation around the root of the left sub-tree, then a right rotation around the root of the original sub-tree. Hence it is called a left right double rotation. Note it still uses the basic rotation actions used in the simpler examples.

The final case is where the inner sub-tree on the right hand side is the larger tree. This is the mirror of the previous example and is fixed with a right left double rotation:

[Figures: right left double rotation, before and after (Weiss Figure 4.36)]

To make this clearer, consider the following tree:

[Figure: the example tree]

Here we insert 15, which makes the tree imbalanced at 9:

9’s left sub-tree has height 0 and its right sub-tree has height 2. Because the larger sub-tree is to the left of 16 (i.e. its values lie between 9 and 16), we need to do a right left double rotation.

First we rotate to the right with 16 as the root.

Then we rotate to the left with 9 as the root.

[Figures: the tree after the first (right) rotation and after the second (left) rotation]

The algorithm for AVL insertion follows the same structure as our earlier InsertNode algorithm, except that we have to keep track of the height of each node and test for each of the four cases described above:

AVLNode *InsertAVLNode(Item NewValue, AVLNode *Root)
{
   if(!Root){
      /* empty position reached: create a new leaf node */
      Root = malloc(sizeof(AVLNode));
      Root->Value = NewValue;
      Root->Height = 0;
      Root->Left = Root->Right = NULL;
   } else if(NewValue < Root->Value){
      Root->Left = InsertAVLNode(NewValue, Root->Left);
      if(Height(Root->Left) - Height(Root->Right) == 2){
         if(NewValue < Root->Left->Value)
            Root = SingleRightRotation(Root);      /* case 1 */
         else
            Root = DoubleLeftRightRotation(Root);  /* case 2 */
      }
   } else if(NewValue > Root->Value){
      Root->Right = InsertAVLNode(NewValue, Root->Right);
      if(Height(Root->Right) - Height(Root->Left) == 2){
         if(NewValue > Root->Right->Value)
            Root = SingleLeftRotation(Root);       /* case 4 */
         else
            Root = DoubleRightLeftRotation(Root);  /* case 3 */
      }
   }

   /* update this node's height from its (possibly empty) sub-trees */
   if(Height(Root->Left) >= Height(Root->Right))
      Root->Height = Height(Root->Left) + 1;
   else
      Root->Height = Height(Root->Right) + 1;

   return Root;
}
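Note that, like the rotation functions below, InsertAVLNode returns the (possibly new) root of the sub-tree it was given, so the top-level call takes the form Root = InsertAVLNode(NewValue, Root);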

The action of rotation is fairly simple:

AVLNode *SingleRightRotation(AVLNode *Root)
{
   AVLNode *NewRoot = Root->Left;

   /* the old root adopts NewRoot's right sub-tree as its left
      sub-tree, then becomes NewRoot's right child */
   Root->Left = NewRoot->Right;
   NewRoot->Right = Root;

   /* recompute the heights: the old root first, since it is now lower */
   if(Height(Root->Left) >= Height(Root->Right))
      Root->Height = Height(Root->Left) + 1;
   else
      Root->Height = Height(Root->Right) + 1;
   if(Height(NewRoot->Left) >= Height(NewRoot->Right))
      NewRoot->Height = Height(NewRoot->Left) + 1;
   else
      NewRoot->Height = Height(NewRoot->Right) + 1;

   return NewRoot;
}

A single left rotation is just a mirror of a right rotation. Given single left and right rotation functions, a double rotation becomes trivial:

AVLNode *DoubleRightLeftRotation(AVLNode *Root)
{
   Root->Right = SingleRightRotation(Root->Right);
   return SingleLeftRotation(Root);
}

Again, a double left right rotation is the mirror of the above function.
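For completeness, here is a sketch of the two mirrored functions under the same conventions (these follow directly from the versions above):

/* Mirror of SingleRightRotation: the right child becomes the new root */
AVLNode *SingleLeftRotation(AVLNode *Root)
{
   AVLNode *NewRoot = Root->Right;
   Root->Right = NewRoot->Left;
   NewRoot->Left = Root;

   if(Height(Root->Left) >= Height(Root->Right))
      Root->Height = Height(Root->Left) + 1;
   else
      Root->Height = Height(Root->Right) + 1;
   if(Height(NewRoot->Left) >= Height(NewRoot->Right))
      NewRoot->Height = Height(NewRoot->Left) + 1;
   else
      NewRoot->Height = Height(NewRoot->Right) + 1;

   return NewRoot;
}

/* Mirror of DoubleRightLeftRotation */
AVLNode *DoubleLeftRightRotation(AVLNode *Root)
{
   Root->Left = SingleLeftRotation(Root->Left);
   return SingleRightRotation(Root);
}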

Deletion from an AVL Tree

Deletion from an AVL tree involves a slight modification of our earlier deletion algorithm; we then test and rebalance the tree in the same way as for insertion.

The modification is that we only ever physically delete nodes with at most one child. What happens when a node has two children? The answer is that we find the immediate predecessor of that node: go left from the original node, then move all the way to the right of that sub-tree (i.e. the predecessor of a node with a left child is the extreme right value of that node’s left sub-tree). By definition the predecessor node cannot have a right child, or it would not be the largest value in the sub-tree, so we can delete the predecessor without dramatically changing the shape of the tree (recall the deletion algorithm from the last lecture).

But we didn’t want to delete the predecessor, so we first copy the value of the predecessor node into the node we want to delete. This has the effect of swapping the node to be deleted with its predecessor and then deleting it from its new position.

This swap does not change the ordering of the tree: the predecessor was the largest value in the left sub-tree, and having become the root it is still larger than every value in the left sub-tree and smaller than every value in the right sub-tree.

Consider an example:

[Figure: the tree before deletion]

Delete 5:

  1. Find 5.
  2. Go left one node, then all the way to the right, to find the predecessor of 5, which is 4.
  3. Copy 4 into 5’s position.
  4. Delete the node at 4’s original position.

[Figure: the tree after deletion]

(Deleting 4 is the simple operation of connecting node 3 to node 2 and is the same operation used in our earlier deletion algorithm.)

The above example leaves the tree balanced, so no further action is required. However, as with insertion, a deletion may cause a node’s left and right sub-trees to become unbalanced. In this case we can use the same rotation techniques introduced for insertion to rebalance the tree. This again (of course) involves updating the height of each sub-tree that is changed by a deletion.
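Putting the pieces together, the following is a minimal sketch of AVL deletion, assuming the Height helper and the four rotation functions above; DeleteAVLNode is an illustrative name and this is one way of arranging the logic, not code from the reading:

AVLNode *DeleteAVLNode(Item Value, AVLNode *Root)
{
   if(!Root)
      return NULL;                          /* value not in the tree */

   if(Value < Root->Value)
      Root->Left = DeleteAVLNode(Value, Root->Left);
   else if(Value > Root->Value)
      Root->Right = DeleteAVLNode(Value, Root->Right);
   else if(Root->Left && Root->Right){
      /* two children: copy the predecessor's value up, then
         delete the predecessor from the left sub-tree */
      AVLNode *Pred = Root->Left;
      while(Pred->Right)
         Pred = Pred->Right;
      Root->Value = Pred->Value;
      Root->Left = DeleteAVLNode(Pred->Value, Root->Left);
   } else {
      /* zero or one child: splice the node out, as in Lecture 8 */
      AVLNode *Old = Root;
      Root = Root->Left ? Root->Left : Root->Right;
      free(Old);
   }

   if(!Root)
      return NULL;                          /* we deleted a leaf */

   /* update the height, then rebalance if the sub-trees differ by 2 */
   if(Height(Root->Left) >= Height(Root->Right))
      Root->Height = Height(Root->Left) + 1;
   else
      Root->Height = Height(Root->Right) + 1;

   if(Height(Root->Left) - Height(Root->Right) == 2){
      if(Height(Root->Left->Left) >= Height(Root->Left->Right))
         Root = SingleRightRotation(Root);
      else
         Root = DoubleLeftRightRotation(Root);
   } else if(Height(Root->Right) - Height(Root->Left) == 2){
      if(Height(Root->Right->Right) >= Height(Root->Right->Left))
         Root = SingleLeftRotation(Root);
      else
         Root = DoubleRightLeftRotation(Root);
   }

   return Root;
}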

The Efficiency of AVL Trees

Although there is extra work in doing rotations for an AVL tree in comparison to a random binary tree, because the tree is kept in a balanced state we would expect the average path length for an insertion or deletion to be shorter (and this is the case in practice). Also many insertions and deletions do not require any rotation, so the only extra work is in updating the height of each sub-tree.

Empirical and theoretical studies have shown that as the size of the tree grows the average times for insertions and deletions in an AVL tree are better than for a random binary search tree. This means the effect of having a more balanced tree outweighs the extra work in keeping it balanced.

Note that an AVL tree is not perfectly balanced. The height of a perfectly balanced tree is approximately lg n, where n is the number of nodes (see the earlier lecture) whereas the worst case height of an AVL tree is approximately 1.44 lg n. In practice, however, the AVL height approaches lg n as n gets large and so is almost optimal.
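Where does the 1.44 come from? Consider the smallest number of nodes, S(h), that an AVL tree of height h can contain. The worst case is one sub-tree of height h - 1 and one of height h - 2 (any shorter and the root would violate the AVL condition), giving

   S(h) = S(h-1) + S(h-2) + 1,   with S(0) = 1 and S(1) = 2

This recurrence grows like the Fibonacci numbers, i.e. roughly as phi^h where phi ≈ 1.618 is the golden ratio. Inverting, a tree with n nodes has height at most about log_phi n = lg n / lg phi ≈ 1.44 lg n.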

Conclusions

The AVL tree requires more coding work than a simple random binary tree, but its performance advantage grows as n gets larger. Therefore, in an application that requires a binary tree, where performance is important and n is large, the AVL tree is the preferred data structure.


Copyright © Dr Ian G Graham, 2000. All rights reserved.
For use only by students enrolled for the INT2008 Data Structures and Algorithms course at Griffith University, Gold Coast Campus. Not to be printed out or copied by any other persons or used for any other purpose without written permission of the author.