Background Image

ADS 0x05: AVL Trees

Published December 21st, 2021

ADS Wallpaper

Introduction Get link to this section

Binary search trees are very important data structures in computer science. They can be used to implement ordered sets and ordered maps.
To make them more efficient, it is often necessary to implement them as self-balancing trees. Self-balancing trees are trees that balance themselves after every insertion and deletion.
In this article, we will implement an AVL tree, which is such a self-balancing tree.


Binary Search Trees Get link to this section

Binary search trees are trees that are ordered by their keys. If we were to do an in-order traversal of the tree, we would get the keys in ascending order:



Due to their sorted nature, binary search trees provide three operations:

  • Insertion: traverse down the tree until we find a leaf node, and insert a new node with the provided value as its left or right child.
    If we find a node with the same value along the way, the value is already in the tree and we do nothing. Binary search trees do not support duplicate values.
  • Deletion: traverse down the tree until we find a node with the value we want to delete. We replace the node with its left child. If it doesn't have a left child, we replace it with its right child.
    If we don't find a node with the value to delete, we do nothing.
  • Search: traverse down the tree until we find the node we are looking for. Then we return true. If we don't find it, we return false.


Binary search trees are quite easy to implement, but they can be very inefficient, because they are not self-balancing.
Imagine that we implement a binary search tree with the algorithms described above, and we insert the integers between 1 and 5 in order.
We will end up with a tree like this:

That doesn't look very much like a binary tree! In fact, we have a unary tree, which is a tree with only one node as child. Do you know another name for this data structure?
I do, it's a linked list.
When we try to search for the value 5, we have to traverse down the entire tree, which completely defeats the purpose of a binary search tree.
This is why we need self-balancing trees.


"Balanced" Trees Get link to this section

Before we can understand what a self-balancing tree is, we first have to define what makes a tree balanced. This definition depends on the type of self-balancing tree we are dealing with.

Since we are going to implement an AVL tree in this article, we will use the following definition:

An AVL tree is balanced if it consists solely of balanced nodes. A node is considered to be balanced if and only if the heights of its left and right child nodes differ by at most one.

The above tree is balanced, because there is no node which children's heights differ by more than one.
Note that the height of a null node is 0.

Let's look at node 3 and find out if it is balanced or not.
Node 3 only has a left child, which has a height of 1. The right child of node 3 is null, so it has a height of 0. Since the height difference between the heights of the children of node 3 is one, this node is balanced.

Now let's look at node 8.
Both children of node 8 have heights of 1, so node 8 is balanced.

We can use the same logic for all other nodes and see that they are all balanced.

The definition of a balanced AVL tree also allows for some intuition: AVL trees are balanced if they look "kind of complete". All levels in the tree must be filled, except for the last two levels, which may be partially filled with gaps in between.

If we were to add a node to the tree, we might have to rebalance the tree.

In the above image, we can see that when we add a node with value 2 to the tree, we have to rebalance the tree, because the tree is not balanced anymore at node 3.
We will have to perform a rotation to fix the tree. In particular, this rotation is called a "left-right rotation", because the three nodes we are rotating around are in a left-right order. We can see that the tree is properly balanced again after the rotation.

To make things more efficient, we store the height of each node in the tree. It's important to realise that we have to update the heights of all nodes in the traversal path when we insert the new node.


AVL Tree Rotations Get link to this section

We can generalise what we have discussed so far to four different scenarios, which all require their own rotation. Before we go over these four scenarios, let's first write some essential code for AVL tree nodes:

 1
class AVLTreeNode<T>
 2
{
 3
	public T value;
 4
	public int height;
 5
	public AVLTreeNode<T> left;
 6
	public AVLTreeNode<T> right;
 7

 8
	/**
 9
	 * Constructs an AVL tree leaf node with a given value.
10
	 */
11
	public
12
	AVLTreeNode(T value)
13
	{
14
		this.value = value;
15
		height = 1;
16
		left = null;
17
		right = null;
18
	}
19
}
20

21
class AVLTree<T extends Comparable<T>>
22
{
23
	public AVLTreeNode<T> root;
24
	int size;
25

26
	/**
27
	 * Constructs an empty AVL tree.
28
	 */
29
	public
30
	AVLTree()
31
	{
32
		root = null;
33
		size = 0;
34
	}
35

36
	public void
37
	insert(T value) { ... }
38

39
	public void
40
	remove(T value) { ... }
41

42
	public boolean
43
	has(T value) { ... }
44

45
	/**
46
	 * Returns the height of a node in an AVL tree.
47
	 * If the node is null, returns 0.
48
	 */
49
	private int
50
	avlHeight(AVLTreeNode<T> node)
51
	{
52
		if(node == null)
53
		{
54
			return 0;
55
		}
56

57
		return node.height;
58
	}
59
}

Now we will go over the four rotation scenarios. We will create a function for each of them, which will take in a node and perform the appropriate rotation.
In the end, the new root node of the subtree is returned.


Left-Left Rotation Get link to this section

In this scenario, we have a left-left imbalance. The tree leans too much towards the left-left side.

The height of Y is at least two greater than the height of T4.
The highest child of Y is its left child: X. Therefore we will have to perform a rotation around X, Y and Z to fix the tree.

 1
/**
 2
 * Rebalances the subtree rooted at node `z` for the left-left case.
 3
 */
 4
private AVLTreeNode<T>
 5
avlRebalanceLeftLeft(AVLTreeNode<T> z)
 6
{
 7
	AVLTreeNode<T> y = z.left;
 8
	AVLTreeNode<T> x = y.left;
 9
	AVLTreeNode<T> T1 = x.left;
10
	AVLTreeNode<T> T2 = x.right;
11
	AVLTreeNode<T> T3 = y.right;
12
	AVLTreeNode<T> T4 = z.right;
13

14
	// Reorganise the nodes.
15

16
	z.left = T3;
17
	y.right = z;
18

19
	// Update the heights.
20

21
	x.height = 1 + Math.max(avlHeight(T1), avlHeight(T2));
22
	y.height = 1 + Math.max(avlHeight(T3), avlHeight(T4));
23
	z.height = 1 + Math.max(x.height, y.height);
24

25
	// Return the new root.
26

27
	return y;
28
}

Left-Right Rotation Get link to this section

In this scenario, we have a left-right imbalance. The tree leans too much towards the left-right side.

The height of Y is at least two greater than the height of T4.
The highest child of Y is its right child: X. Therefore we will have to perform a rotation around X, Y and Z to fix the tree.

 1
/**
 2
 * Rebalances the subtree rooted at node `z` for the left-right case.
 3
 */
 4
private AVLTreeNode<T>
 5
avlRebalanceLeftRight(AVLTreeNode<T> z)
 6
{
 7
	AVLTreeNode<T> y = z.left;
 8
	AVLTreeNode<T> x = y.right;
 9
	AVLTreeNode<T> T1 = y.left;
10
	AVLTreeNode<T> T2 = x.left;
11
	AVLTreeNode<T> T3 = x.right;
12
	AVLTreeNode<T> T4 = z.right;
13

14
	// Reorganise the nodes.
15

16
	y.right = T2;
17
	z.left = T3;
18
	x.left = y;
19
	x.right = z;
20

21
	// Update the heights.
22

23
	y.height = 1 + Math.max(avlHeight(T1), avlHeight(T2));
24
	z.height = 1 + Math.max(avlHeight(T3), avlHeight(T4));
25
	x.height = 1 + Math.max(y.height, z.height);
26

27
	// Return the new root.
28

29
	return x;
30
}

Right-Right Rotation Get link to this section

In this scenario, we have a right-right imbalance. The tree leans too much towards the right-right side.

The height of Y is at least two greater than the height of T1.
The highest child of Y is its right child: X. Therefore we will have to perform a rotation around X, Y and Z to fix the tree.

 1
/**
 2
 * Rebalances the subtree rooted at node `z` for the right-right case.
 3
 */
 4
private AVLTreeNode<T>
 5
avlRebalanceRightRight(AVLTreeNode<T> z)
 6
{
 7
	AVLTreeNode<T> y = z.right;
 8
	AVLTreeNode<T> x = y.right;
 9
	AVLTreeNode<T> T1 = z.left;
10
	AVLTreeNode<T> T2 = y.left;
11
	AVLTreeNode<T> T3 = x.left;
12
	AVLTreeNode<T> T4 = x.right;
13

14
	// Reorganise the nodes.
15

16
	z.right = T2;
17
	y.left = z;
18

19
	// Update the heights.
20

21
	z.height = 1 + Math.max(avlHeight(T1), avlHeight(T2));
22
	x.height = 1 + Math.max(avlHeight(T3), avlHeight(T4));
23
	y.height = 1 + Math.max(x.height, z.height);
24

25
	// Return the new root.
26

27
	return y;
28
}

Right-Left Rotation Get link to this section

In this scenario, we have a right-left imbalance. The tree leans too much towards the right-left side.

The height of Y is at least two greater than the height of T1.
The highest child of Y is its left child: X. Therefore we will have to perform a rotation around X, Y and Z to fix the tree.

 1
/**
 2
 * Rebalances the subtree rooted at node `z` for the right-left case.
 3
 */
 4
private AVLTreeNode<T>
 5
avlRebalanceRightLeft(AVLTreeNode<T> z)
 6
{
 7
	AVLTreeNode<T> y = z.right;
 8
	AVLTreeNode<T> x = y.left;
 9
	AVLTreeNode<T> T1 = z.left;
10
	AVLTreeNode<T> T2 = x.left;
11
	AVLTreeNode<T> T3 = x.right;
12
	AVLTreeNode<T> T4 = y.right;
13

14
	// Reorganise the nodes.
15

16
	z.right = T2;
17
	y.left = T3;
18
	x.left = z;
19
	x.right = y;
20

21
	// Update the heights.
22

23
	z.height = 1 + Math.max(avlHeight(T1), avlHeight(T2));
24
	y.height = 1 + Math.max(avlHeight(T3), avlHeight(T4));
25
	x.height = 1 + Math.max(y.height, z.height);
26

27
	// Return the new root.
28

29
	return x;
30
}

AVL Tree Insertion Get link to this section

One of the three main operations in an AVL tree is the insertion of a particular value.

The insertion process takes time, where is the number of nodes in the tree.

The insertion process is implemented in the following three steps:

  1. Traverse down the tree in binary search fashion; if we come across a node with a value larger than the value we are trying to insert, we go left. If we come across a smaller node, we go right.

    We have to keep in mind one edge case: if we find a node with the same value as the value we are trying to insert, we will stop the insertion process and not do anything. AVL trees don't store duplicate values.

  2. When we are done traversing, we have found a leaf node where we will insert the new value. We create a node for the value and attach it to the correct side of the leaf node.

  3. We traverse back up the tree. In each step we recompute the height of the current node, and check if the node is balanced.

    If a node becomes unbalanced, we check which rotation we have to perform in order to fix the tree. We perform the rotation.

 1
private AVLTreeNode<T>
 2
avlInsert(AVLTreeNode<T> node, T value)
 3
{
 4
	// When we reach the end of the tree
 5
	// insert the new value at this position.
 6

 7
	if (node == null)
 8
	{
 9
		size++;
10
		return new AVLTreeNode<T>(value);
11
	}
12

13
	// Traverse the tree according to how the new value compares
14
	// to the left and right children of the current node.
15

16
	if (value.compareTo(node.value) < 0)
17
	{
18
		node.left = avlInsert(node.left, value);
19
	}
20
	else if (value.compareTo(node.value) > 0)
21
	{
22
		node.right = avlInsert(node.right, value);
23
	}
24
	else
25
	{
26
		// The value already exists in the tree.
27
		// We will do nothing.
28

29
		return node;
30
	}
31

32
	// Get the height of the left and right children.
33

34
	int leftHeight = avlHeight(node.left);
35
	int rightHeight = avlHeight(node.right);
36

37
	// After the recursive downwards step, we will traverse the
38
	// tree upwards again and update the height of the current node.
39

40
	node.height = 1 + Math.max(leftHeight, rightHeight);
41

42
	// We will check if the current node is unbalanced.
43
	// If it is, we will have to find a rotation to fix it.
44

45
	int balance = leftHeight - rightHeight;
46

47
	if (balance > 1)
48
	{
49
		// The left subtree is too tall.
50

51
		if (value.compareTo(node.left.value) < 0)
52
		{
53
			// Left-left case.
54

55
			return avlRebalanceLeftLeft(node);
56
		}
57
		else
58
		{
59
			// Left-right case.
60

61
			return avlRebalanceLeftRight(node);
62
		}
63
	}
64

65
	if (balance < -1)
66
	{
67
		// The right subtree is too tall.
68

69
		if (value.compareTo(node.right.value) > 0)
70
		{
71
			// Right-right case.
72

73
			return avlRebalanceRightRight(node);
74
		}
75
		else
76
		{
77
			// Right-left case.
78

79
			return avlRebalanceRightLeft(node);
80
		}
81
	}
82

83
	// This node is already balanced, so we just return it
84
	// as it currently is.
85

86
	return node;
87
}

Note for the attentive reader: I have received a very good question about the part of the above code where we decide which rotation to perform.

In the code above, we check in what subtree we inserted the new value. We make the decision based on that information, and not on which subtree is actually higher. This is not a mistake, but an optimisation. It is slightly more efficient to compare the node to the value we inserted than to compare heights of both children.

The reason this works is because the newly inserted node will always be in the subtree that is too large. After all, the new node caused the imbalance in the first place.

If we get to an unbalanced node, we can now simply check in what subtree we inserted the new value to get the correct rotation we have to perform. If the node is in the left-right subtree, we will perform a left-right rotation etc.

We cannot perform this optimisation if we want to remove a node. So in the AVL removal code you will see below, we will instead check which subtree is higher the unoptimised way.


AVL Tree Removal Get link to this section

The second fundamental operation of an AVL tree is the removal of a value from the tree.
This algorithm also takes time.

The algorithm is as follows:

  1. Traverse down the tree in binary search fashion; if we come across a node with a value larger than the value we are trying to remove, we go left. If we come across a smaller node, we go right.

  2. When we come across the node we want to delete, we will delete it from the AVL tree.

    There are three cases to consider:
    • If the node has no children, we can simply delete it.
    • If the node has one child, we will can it with its child.
    • If the node has two children, we will replace it with the smallest value in its right subtree.
    If we did't find the node we are looking for, we will simply return the tree as it is.

  3. We traverse back up the tree. In each step we recompute the height of the current node, and check if the node is balanced.

    If a node becomes unbalanced, we check which rotation we have to perform in order to fix the tree. We perform the rotation.

  1
private int
  2
avlBalance(AVLTreeNode<T> node)
  3
{
  4
	if (node == null)
  5
	{
  6
		return 0;
  7
	}
  8

  9
	return avlHeight(node.left) - avlHeight(node.right);
 10
}
 11

 12
private AVLTreeNode<T>
 13
avlRemove(AVLTreeNode<T> node, T value)
 14
{
 15
	// When we reach the end of the tree,
 16
	// the value we want to remove is not in the tree.
 17
	// We will simply return and do nothing.
 18

 19
	if (node == null)
 20
	{
 21
		return null;
 22
	}
 23

 24
	// Traverse the tree according to how the value to be removed
 25
	// compares to the left and right children of the current node.
 26

 27
	if (value.compareTo(node.value) < 0)
 28
	{
 29
		node.left = avlRemove(node.left, value);
 30
	}
 31
	else if (value.compareTo(node.value) > 0)
 32
	{
 33
		node.right = avlRemove(node.right, value);
 34
	}
 35
	else
 36
	{
 37
		// We found the node to be removed.
 38
		// Now we have to check if this node has one or two children
 39
		// or if it is a leaf.
 40

 41
		size--;
 42
		node = avlDeleteNode(node);
 43
	}
 44

 45
	// Make sure the node exists.
 46

 47
	if (node == null)
 48
	{
 49
		return null;
 50
	}
 51

 52
	// Get the height of the left and right children.
 53

 54
	int leftHeight = avlHeight(node.left);
 55
	int rightHeight = avlHeight(node.right);
 56

 57
	// After the recursive downwards step, we will traverse the
 58
	// tree upwards again and update the height of the current node.
 59

 60
	node.height = 1 + Math.max(leftHeight, rightHeight);
 61

 62
	// We will check if the current node is unbalanced.
 63
	// If it is, we will have to find a rotation to fix it.
 64

 65
	int balance = leftHeight - rightHeight;
 66

 67
	if (balance > 1)
 68
	{
 69
		// The left subtree is too tall.
 70

 71
		if (avlBalance(node.left) >= 0)
 72
		{
 73
			// Left-left case.
 74

 75
			return avlRebalanceLeftLeft(node);
 76
		}
 77
		else
 78
		{
 79
			// Left-right case.
 80

 81
			return avlRebalanceLeftRight(node);
 82
		}
 83
	}
 84

 85
	if (balance < -1)
 86
	{
 87
		// The right subtree is too tall.
 88

 89
		if (avlBalance(node.left) <= 0)
 90
		{
 91
			// Right-right case.
 92

 93
			return avlRebalanceRightRight(node);
 94
		}
 95
		else
 96
		{
 97
			// Right-left case.
 98

 99
			return avlRebalanceRightLeft(node);
100
		}
101
	}
102

103
	// This node is already balanced, so we just return it
104
	// as it currently is.
105

106
	return node;
107
}
108

109
private AVLTreeNode<T>
110
avlDeleteNode(AVLTreeNode<T> node)
111
{
112
	// Check how many children this node has.
113

114
	if (node.left == null && node.right == null)
115
	{
116
		// The node has no children.
117
		// We can just delete it from the tree.
118

119
		return null;
120
	}
121

122
	if (node.left == null)
123
	{
124
		// The node has only a right child.
125
		// We can just replace the node with its right child.
126

127
		return node.right;
128
	}
129

130
	if (node.right == null)
131
	{
132
		// The node has only a left child.
133
		// We can just replace the node with its left child.
134

135
		return node.left;
136
	}
137

138
	// The node has two children.
139
	// We will replace the node with its in-order successor.
140
	// This is the left-most child of the right subtree.
141

142
	AVLTreeNode<T> successor = node.right;
143

144
	while (successor.left != null)
145
	{
146
		successor = successor.left;
147
	}
148

149
	// Copy the value from the successor node.
150

151
	node.value = successor.value;
152

153
	// Delete the successor node.
154

155
	node.right = avlRemove(node.right, successor.value);
156

157
	// Calling the above function has the unwanted side-effect
158
	// of decrementing the size of the AVL tree again.
159
	// We will counter this by incrementing the size back up.
160

161
	size++;
162

163
	// Return the "new" root of the subtree.
164
	// It's not actually new, because we just replace the value
165
	// with the value of the successor node.
166

167
	return node;
168
}

AVL Tree Find Get link to this section

The final important AVL tree function is the search function. It takes a value and returns true if the value is in the tree, or false otherwise.

 1
public boolean
 2
avlHas(AVLTreeNode<T> node, T value)
 3
{
 4
	// If we reach the end of the tree, the value is not in the tree.
 5

 6
	if (node == null)
 7
	{
 8
		return false;
 9
	}
10

11
	// Traverse the tree according to how the value to be found
12
	// compares to the left and right children of the current node.
13

14
	if (value.compareTo(node.value) < 0)
15
	{
16
		return avlHas(node.left, value);
17
	}
18

19
	if (value.compareTo(node.value) > 0)
20
	{
21
		return avlHas(node.right, value);
22
	}
23

24
	// We found the value.
25

26
	return true;
27
}

Wrapping It Up Get link to this section

With the code segments above, we have now completed all the necessary functions to implement an AVL tree. We can now create implementations for the three fundamental methods of an AVL tree:

 1
public void
 2
insert(T value)
 3
{
 4
	root = avlInsert(root, value);
 5
}
 6

 7
public void
 8
remove(T value)
 9
{
10
	root = avlRemove(root, value);
11
}
12

13
public boolean
14
has(T value)
15
{
16
	return avlHas(root, value);
17
}

The full code listing can be found here.


More Ideas Get link to this section

Before ending this post, I would like to make you think about how our implementation of the AVL trees can be modified to implement a map data structure.

The implementation we made is a data structure that implements a set with operations. In a set, we can add distinct values, remove them, and check for their existence.
A map builds on top of the idea of the set, and allows us to look up values by their keys. In a map, we can add values by their keys, remove them by their keys, and check for their existence by their keys. We can also update the value associated with a key.

What would we have to change in the functions we wrote, and how can we implement the updating?


Final Thoughts Get link to this section

AVL trees are efficient self-balancing trees that can be used to implement sets and maps. AVL trees support operations for insertion, removal, and lookup.

I hope you enjoyed this post and properly understand how AVL trees work under the hood and how they are implemented.

Feel free to contact me if you have any feedback.