Introduction
Dynamic data structures are data structures that can grow and shrink as needed.
The simplest example of a dynamic data structure is a dynamic array.
Most other dynamic data structures are based on dynamic arrays.
Dynamic arrays solve most of the problems of a normal array
and enable us to perform almost all of the linked list operations
more efficiently.
In this post, we will discuss the dynamic array and how it is
used in other dynamic data structures.
We will also implement a stack and queue using dynamic arrays.
Dynamic Array
When we allocate a normal array, we allocate a fixed amount of memory.
If we have more elements than we can fit in the allocated memory,
we cannot add any more elements to the array.
Dynamic arrays, on the other hand, let us add any number of elements to them.
A dynamic array allocates a fixed amount of memory in the background,
but it will automatically grow and shrink the array as needed.
Growing and shrinking the array is done by reallocating the memory.
This is done by creating a new array, copying the elements from the old array
to the new array, and then deleting the old array.
Copying all $n$ elements over takes $O(n)$ time,
which is inefficient if we have to do it on every insertion.
A dynamic array solves this problem by allocating a new array
with double the capacity of the old array whenever it has to grow.
Dynamic Array Implementation
Let's consider a dynamic array implementation and analyse the time complexity of its methods to see that it is in fact more efficient.
```java
class DynamicArray<T>
{
    private T[] array;
    private int size;

    public
    DynamicArray(int capacity)
    {
        // Java cannot create a generic array directly,
        // so we create an Object array and cast it.
        array = (T[]) new Object[capacity];
        size = 0;
    }

    public int
    size()
    {
        return size;
    }

    public T
    get(int index)
    {
        if (index < 0 || index >= size)
        {
            throw new IndexOutOfBoundsException();
        }

        return array[index];
    }

    public void
    append(T element)
    {
        // Check if the array has space on it to append the element.
        if (size == array.length)
        {
            // Double the capacity of the array.
            reallocate(array.length * 2);
        }

        // Now we are sure the array has enough capacity.
        // Append the element to the array.
        array[size] = element;
        size++;
    }

    private void
    reallocate(int newCapacity)
    {
        // Create a new array of the requested new capacity.
        T[] newArray = (T[]) new Object[newCapacity];

        // Copy the elements over to the new array.
        for (int i = 0; i < size; i++)
        {
            newArray[i] = array[i];
        }

        // Delete the old array (the garbage collector reclaims it).
        array = newArray;
    }
}
```
This implementation of a dynamic array has two public methods
that can be used to access and add elements: get and append.
get just returns the element at the given index or throws if the
index is out of bounds. It takes $O(1)$ time.
The append method appends an element to the end of the array.
If the array has enough space, it will just append the element,
which takes $O(1)$ time.
If the array does not have enough space, it will double the capacity
of the array, copy the elements over to the new array, delete the old
array, and then append the element. This takes $O(n)$ time.
What's interesting, though, is that the average time complexity
of the append method is actually $O(1)$.
This might seem counter-intuitive, but it is true.
Consider the following scenario:
```java
DynamicArray<Integer> array = new DynamicArray<>(16);

for (int i = 0; i < 1073741824; i++)
{
    array.append(i);
}
```
Here, we create a dynamic array of initial capacity 16 and append
1073741824 ($2^{30}$) elements to it.
For the first 16 elements, the time complexity of the
append method is always $O(1)$, because the array has enough
space to append the element.
We have to grow the array after it has reached 16 elements.
The next time we have to grow the array is when the array
has 32 elements on it. Then, when the array has 64 elements,
and so on.
These are the powers of two, starting from 16.
In these cases, we have to copy all elements on the array
over to the new bigger array.
All other times, the array has enough space to append the
element, so we can just put the element in the next free slot.
So, we're doing two kinds of operations:
adding elements and copying elements.
Let's compute the total number of individual operations that
we have to perform to append all 1073741824 elements to the
dynamic array.
It is easy to see that we will have to perform $2^{30}$ add operations,
since we certainly have to add all 1073741824 elements to the array.
But how many times will we have to copy elements over to the
new array?
If the number of elements on the array is a power of two,
the array is completely full and we have to grow it,
which involves copying all currently stored elements
over to a bigger chunk of memory.
Since there are 26 powers of two from $2^4$ (16) up to,
but not including, $2^{30}$ (1073741824),
we have to construct a new array 26 times.
The first time, we have to copy $2^4 = 16$
elements, the second time $2^5 = 32$
elements, ..., the last time $2^{29} = 536870912$
elements.
Summing this geometric series gives $\sum_{k=4}^{29} 2^k = 2^{30} - 16$,
so it turns out we have to do approximately $2^{30}$ copy operations,
which is the same number as the number of add operations.
So for each call to the append
method, we have to perform two operations on average.
Since this is a constant number and we can generalise this
to any number of elements, we can conclude that the average time
complexity of the append method is $O(1)$.
Dynamic Array Interface
Many dynamic array implementations support the following operations.
(ArrayList, the dynamic array implementation from the Java standard library,
supports them too.)

Operation | Time complexity |
---|---|
Accessing by index (get & set) | $O(1)$ |
Appending an element to the end | $O(1)$ amortized |
Extracting an element from the end | $O(1)$ amortized |
Adding an element at a given index (shifts the elements after the index to the right) | $O(n)$ |
Removing an element at a given index (shifts the elements after the index to the left) | $O(n)$ |
We have already implemented the getting by index and appending an element operations. The implementations of the rest of the operations are left as an exercise for the reader.
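As a starting point for that exercise, here is one possible sketch of an add-at-index method for the DynamicArray class above (the method name and exact behaviour are my own choice, not something prescribed by the post); the remove-at-index operation mirrors it with a left shift:

```java
// A possible insert-at-index method, to be added inside DynamicArray<T>.
public void
add(int index, T element)
{
    if (index < 0 || index > size)
    {
        throw new IndexOutOfBoundsException();
    }

    // Make sure there is room for one more element.
    if (size == array.length)
    {
        reallocate(array.length * 2);
    }

    // Shift the elements after the index one place to the right,
    // starting from the back so that nothing is overwritten.
    for (int i = size; i > index; i--)
    {
        array[i] = array[i - 1];
    }

    array[index] = element;
    size++;
}
```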
Dynamic Array Vs. Linked List
In the previous post,
we discussed that arrays are much faster than linked lists,
but less flexible.
Dynamic arrays try to take the best of both worlds: they are
more flexible, while still storing their elements contiguously
for maximum performance.
Dynamic arrays can do everything linked lists can do,
and they are usually faster.
The one thing dynamic arrays are slower at is adding elements
at arbitrary indices, provided that the linked list already holds
a reference to the element we want to insert the new item after.
A linked list really only makes sense if we have a very large
number of elements and we keep a pointer to one or a couple
of hotspot elements that we're currently processing.
A good example of a problem where using a linked list makes
sense is the third problem of the TU Delft ADS Quadruple Quest.
The input file can be found here.
Funnily enough, an optimised implementation of a dynamic array
still beats linked lists in this problem, because the input
size (10000) is not extremely large.
Stack and Queue Implementation
Stacks and queues are two very common dynamic linear data structures. We will discuss both of them and how they can be efficiently implemented using dynamic arrays. Stacks and queues can also be implemented using linked lists, but as you can see from the results of the benchmark I did, those implementations are much slower.
Stacks
A stack is a last-in-first-out data structure. We can push elements onto the stack, and pop elements off the stack. The last element that is pushed on the stack is the first element that is popped off the stack.
The stack in the picture above is represented horizontally,
but it might be more intuitive to think of it as a vertical
stack of elements.
Since it is hard to make a vertical image look good in a blog
post, we will think of the stack as a horizontal array
with a "bottom" on the left and a "top" on the right.
When we push an element onto the stack, we add it to the
right of the stack:
When we pop an element off the stack, we remove the element from the right of the stack:
With the two main operations of a stack in mind, let's implement it using a dynamic array.
Stack Implementation
```java
import java.util.EmptyStackException;

class Stack<T>
{
    private T[] array;
    private int size;

    public
    Stack(int capacity)
    {
        // Java cannot create a generic array directly,
        // so we create an Object array and cast it.
        array = (T[]) new Object[capacity];
        size = 0;
    }

    public int
    size()
    {
        return size;
    }

    public void
    push(T element)
    {
        // If the array is too small, grow it.
        if (size == array.length)
        {
            reallocate(array.length * 2);
        }

        // Add the element to the top of the stack.
        array[size] = element;
        size++;
    }

    public T
    pop()
    {
        if (size == 0)
        {
            throw new EmptyStackException();
        }

        // Remove the element at the top of the stack.
        size--;
        T element = array[size];

        // If the array is very big, shrink it to save memory.
        if (array.length > 16 && size == array.length / 2)
        {
            reallocate(array.length / 2);
        }

        return element;
    }

    private void
    reallocate(int newCapacity)
    {
        // Create a new array of the requested new capacity.
        T[] newArray = (T[]) new Object[newCapacity];

        // Copy the elements over to the new array.
        for (int i = 0; i < size; i++)
        {
            newArray[i] = array[i];
        }

        // Delete the old array (the garbage collector reclaims it).
        array = newArray;
    }
}
```
As you can see, the stack implementation looks very similar
to the dynamic array we implemented earlier.
The push and reallocate methods are essentially the same as the
append and reallocate methods from the dynamic array implementation,
respectively.
We just added a method to pop an element off the stack.
Both the push and the pop methods are $O(1)$ on average,
because they double and halve the capacity of the array as needed.
The Java standard library has a stack implementation too.
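Here is a minimal usage sketch of the Stack class above, just to make the last-in-first-out order concrete (the expected output is written in the comments):

```java
Stack<String> stack = new Stack<>(16);

stack.push("first");
stack.push("second");
stack.push("third");

// Elements come off in the reverse order of insertion.
System.out.println(stack.pop()); // third
System.out.println(stack.pop()); // second
System.out.println(stack.pop()); // first
```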
Queues
A queue is a first-in-first-out data structure.
The first element that is pushed onto a queue is the first
element that is popped off.
Since the terms "bottom" and "top" don't really make sense
in a queue, we'll use the terms "front" and "back" instead.
The element at the front of the queue is the first element
that was inserted and will be the first element that is
removed.
In some implementations, the push and pop operations are
called "enqueue" and "dequeue" instead.
When we push an element onto the queue, we add it to the back of the queue:
When we pop an element off the queue, we remove the first element of the queue. Note that the indices of the array also change:
Queue Implementation
Implementing a queue using a dynamic array is slightly more
tricky than the stack implementation.
The reason is that we need to be able to remove the first
element of the queue, which means that we have to shift all
the elements in the array to the left.
This operation would be very expensive, because we have to
move all elements one place to the left, which takes $O(n)$ time.
However, we can also do something very clever and make the
pop operation $O(1)$.
We do this by making the dynamic array circular.
This means that we save the index of the actual first element
(the front), and we can just increment the front whenever we
pop an element off the queue.
We will also keep track of the index just past the last element
(the back) to make our lives easier.
Elements that would be placed after the end of the array
are circularly placed at the beginning of the array instead.
Let's bring it to life:
```java
import java.util.NoSuchElementException;

class Queue<T>
{
    private T[] array;
    private int front;
    private int back;
    private int size;

    public
    Queue(int capacity)
    {
        // Java cannot create a generic array directly,
        // so we create an Object array and cast it.
        array = (T[]) new Object[capacity];
        size = 0;
        front = 0;
        back = 0;
    }

    public int
    size()
    {
        return size;
    }

    public void
    push(T element)
    {
        // If the array is too small, grow it.
        if (size == array.length)
        {
            reallocate(array.length * 2);
        }

        // Add the element to the back of the queue.
        array[back] = element;
        size++;

        // If the back of the queue would exceed the array size,
        // wrap it around to the beginning of the array.
        if (back == array.length - 1)
        {
            back = 0;
        }
        // Else, we can safely increment the back.
        else
        {
            back++;
        }
    }

    public T
    pop()
    {
        if (size == 0)
        {
            throw new NoSuchElementException();
        }

        // Remove the element at the front of the queue.
        T element = array[front];
        size--;

        // If the front of the queue would exceed the array size,
        // wrap it around to the beginning of the array.
        if (front == array.length - 1)
        {
            front = 0;
        }
        // Else, we can safely increment the front.
        else
        {
            front++;
        }

        // If the array is very big, shrink it to save memory.
        if (array.length > 16 && size == array.length / 2)
        {
            reallocate(array.length / 2);
        }

        // Finally, return the element.
        return element;
    }

    private void
    reallocate(int newCapacity)
    {
        // Create a new array of the requested new capacity.
        T[] newArray = (T[]) new Object[newCapacity];

        // Copy the elements over to the new array.
        // We start at the front and circularly copy all
        // elements until we reach the back.
        for (int i = 0; i < size; i++)
        {
            // Note the use of the modulo operator;
            // it ensures that we iterate over a circular range.
            newArray[i] = array[(front + i) % array.length];
        }

        // Delete the old array (the garbage collector reclaims it).
        array = newArray;

        // Reset the front and back.
        front = 0;
        back = size % array.length;
    }
}
```
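Here is a similar minimal usage sketch of the Queue class above. It starts with a deliberately small capacity of 2 so that the circular wrap-around actually happens (the expected output is written in the comments):

```java
Queue<String> queue = new Queue<>(2);

queue.push("first");
queue.push("second");

System.out.println(queue.pop()); // first

// "third" is stored at array index 0: the back has wrapped around,
// while the front currently points at index 1.
queue.push("third");

// Elements still come off in the order they were inserted.
System.out.println(queue.pop()); // second
System.out.println(queue.pop()); // third
```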
I created an applet which can give you some more intuition
about how the circular dynamic array we used in the
queue implementation works.
Play around with it and push and pop some elements!
See what happens when you push more elements than
the array can hold.
Final Thoughts
I hope you enjoyed this post and acquired a greater
understanding of dynamic data structures.
In the next post, we will discuss heaps and priority queues.
If you found this post helpful, feel free to share it!
Any feedback is very welcome!