
ADS 0x00: Big O

Published November 9th, 2021


Introduction

Algorithms & Data Structures (ADS) is a fundamental area of computer science. A deep understanding of how algorithms and data structures work allows you to write more efficient code. Not only is ADS immensely useful, it is also electrifying! What's more fun than getting a 1000x performance improvement after rewriting a function using a faster algorithm or a more efficient data structure? Yes, this is what you'll be doing when studying ADS!

The central idea behind Algorithms & Data Structures is time and space complexity. Simply put, it's the amount of time and memory a computer needs to perform a task, depending on the input size. The idea is really quite simple. Let's get started!


Analysing code

Let's take a slice of programming knowledge and a pinch of intuition. Without any further context on what time and space complexity are, we'll just try to make sense of them.

Consider the following Java code:

```java
int
sumArrayElements(int[] arr)
{
	int sum = 0;

	// Add each element of the array to the sum.
	for (int i = 0; i < arr.length; i++)
	{
		sum += arr[i];
	}

	return sum;
}
```

This is a straightforward function that sums all elements in an array. We simply loop over each element of the array and add it to the output.

So, what is the time complexity of this code? Well, first we set sum to zero, which we do 1 time. Then we perform the loop over the array n times, where n is the length of the array. Lastly, we return the value of sum. If we define each of these steps as an operation, this function performs n + 2 operations.

According to the Big O notation, this function has a time complexity of O(n).
You might think something like "hey, but what about the 2 operations at the start and end of the function"? Well, Big O doesn't care about the 2. Big O only cares about the highest degree of growth, which is n.
This is because Big O is a measure of the growth of a function's time complexity.
For large input sizes, the 2 operations at the start and end of the function are simply neglected, as they no longer have a substantial effect on the execution time of the function.

For measuring the growth of a function's time complexity, we use the Big O notation. Inside the Big O, we put the highest degree of growth, which in the above example is n. The Big O notation is therefore O(n). Usually, this is pronounced like "Oh of n" or simply "Oh n".

We also use the Big O notation for the space complexity of a function. Space complexity measures the growth of a function's memory usage depending on the input size.
In our example, the space complexity of the function is O(1), since the function only has a single local variable sum, regardless of the input size.
Attentive readers might have noticed that the input array is not included in the computation of the space complexity. This is not a mistake; space complexity doesn't cover the input!
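To make the operation count concrete, here is a small sketch that instruments sumArrayElements with a counter. The class and field names are my own, not from any library; which steps count as "operations" is a simplification, but the linear growth is the point: for an array of length n, the counter ends at n + 2.

```java
class OperationCounter
{
	static int operations;

	static int sumArrayElements(int[] arr)
	{
		operations = 1; // int sum = 0;

		int sum = 0;

		for (int i = 0; i < arr.length; i++)
		{
			operations++; // one addition per element
			sum += arr[i];
		}

		operations++; // return sum;

		return sum;
	}

	public static void main(String[] args)
	{
		sumArrayElements(new int[] { 1, 2, 3 });
		System.out.println(operations); // n + 2 = 5
	}
}
```

Double the array length and the counter roughly doubles; the constant 2 quickly becomes irrelevant, which is exactly why Big O drops it.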

Common time complexities

Next to O(n) and O(1), there are other common time complexities, like O(log n), O(n log n), O(n^2), and O(2^n). You can see their growth rates in the following image:

Big O Graph

As you can see, the time complexity of a couple of functions grows very rapidly with the input size. O(2^n) and O(n^n) are very bad unless the input size is very small. O(n^2) is reasonable for inputs that aren't too large, but it is definitely not fast.

Things get interesting for O(n log n), which still has a much higher growth rate than O(n), but it flattens out much faster. In fact, the derivative of n log n is log n + 1. So far, that is the only non-polynomial, non-exponential derivative, which explains why O(n log n) doesn't grow as badly as the other time complexities.

O(n) is a very common and good time complexity. It is often the best time complexity you can achieve on problems with a collection of n input elements.

O(log n) is a special time complexity, which usually implies that the algorithm uses some form of binary search. O(1) simply means that the algorithm can compute its result in a constant amount of time.

I will go over some examples for all these time complexities.

Linear time complexity

```java
int
sumOneToN(int n)
{
	int sum = 0;

	for (int i = 1; i <= n; i++)
	{
		sum += i;
	}

	return sum;
}
```

The above function computes the sum of all integers from 1 to n. It uses a simple for loop to compute this sum.

The time complexity of this function is O(n), because the loop runs n times.

However, we can do much better, by using a well-known formula from our friend Gauss: 1 + 2 + ... + n = n(n + 1) / 2.

Constant time complexity

```java
int
gaussSum(int n)
{
	return n * (n + 1) / 2;
}
```

This function computes the Gauss sum of a number. The Gauss sum is a simple formula to compute the sum of all integers from 1 to n.


The proof is left as an exercise for the reader. Hint: Try using mathematical induction!

The important difference between this function and the previous one is that this function is O(1) instead of O(n). This is because the formula for the Gauss sum can simply be computed in a constant amount of time. We replaced the for loop with some basic arithmetic.

Imagine if we tried to compute the Gauss sum of a very large number using the O(n) function. It would take a long time to compute, while the O(1) version would compute it essentially instantly.
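To convince yourself that the two approaches really agree, here is a quick sketch (the class name is hypothetical) that pits the O(n) loop against the O(1) formula for the same n:

```java
class SumComparison
{
	// O(n): loops n times.
	static int sumOneToN(int n)
	{
		int sum = 0;

		for (int i = 1; i <= n; i++)
		{
			sum += i;
		}

		return sum;
	}

	// O(1): one multiplication, one addition, one division.
	static int gaussSum(int n)
	{
		return n * (n + 1) / 2;
	}

	public static void main(String[] args)
	{
		// Both compute 5050 for n = 100; only the amount of work differs.
		System.out.println(sumOneToN(100)); // 5050
		System.out.println(gaussSum(100));  // 5050
	}
}
```

Same answer, wildly different cost: the loop does 100 additions here, the formula does a fixed three arithmetic operations no matter how large n gets.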

Logarithmic time complexity

```java
boolean
sortedArrHasElement(int[] arr, int n)
{
	int left = 0;
	int right = arr.length - 1;

	while (left < right)
	{
		int mid = (left + right) / 2;

		if (arr[mid] < n)
		{
			left = mid + 1;
		}
		else
		{
			right = mid;
		}
	}

	return arr[left] == n;
}
```

This code looks complicated, so let's break it down. We're writing a function that takes a sorted array of integers and a number, and returns true if the array contains the number. Because the array is sorted, we can use binary search to find the number.

Binary search is a well-known algorithm, and it is used in many other algorithms. The idea is to divide the array in half in each iteration. We take the middle element and compare it to the number we want to find. If the number is bigger than the middle element, we continue searching in the right half, and otherwise we continue searching in the left half. We keep doing this until the array cannot be divided anymore. Lastly, we check if the one remaining element is the number we're looking for. If it is, we return true, because we found the number, otherwise we return false.

The above is a lot of text, so I made an applet that can give you some intuition as to how the binary search algorithm works. It's really not that difficult!

If you've played with the applet a couple of times, you might realise that it always takes three steps to get to the final value. This should come as no surprise, since log2(8) = 3. We have to cut the array of 8 elements in half three times to get to a final value.

It turns out that algorithms with a logarithmic time complexity are very efficient. Consider an extremely large input of 2^100 elements to our sortedArrHasElement function. Since log2(2^100) = 100, we only need a hundred iterations to find out if some value is in this incomprehensibly large array!
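Here is a small usage sketch of the binary search function from above (the wrapping class name is mine; mid is declared as a local int):

```java
class BinarySearchDemo
{
	static boolean sortedArrHasElement(int[] arr, int n)
	{
		int left = 0;
		int right = arr.length - 1;

		while (left < right)
		{
			int mid = (left + right) / 2;

			if (arr[mid] < n)
			{
				left = mid + 1;
			}
			else
			{
				right = mid;
			}
		}

		return arr[left] == n;
	}

	public static void main(String[] args)
	{
		int[] sorted = { 1, 3, 5, 7, 9, 11, 13, 15 };

		// 8 elements, so at most log2(8) = 3 halving steps.
		System.out.println(sortedArrHasElement(sorted, 7)); // true
		System.out.println(sortedArrHasElement(sorted, 8)); // false
	}
}
```

Note that the function relies on the array being sorted; on an unsorted array, the halving step may discard the half that actually contains the number.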

Quadratic time complexity

```java
int
maxSubArraySum(int[] arr)
{
	int maxSum = 0;

	for (int i = 0; i < arr.length; i++)
	{
		int currentSum = 0;

		for (int j = i; j < arr.length; j++)
		{
			currentSum += arr[j];

			if (currentSum > maxSum)
			{
				maxSum = currentSum;
			}
		}
	}

	return maxSum;
}
```

This function computes the maximum subarray sum of a given array. First, let's define what exactly a subarray is: a subarray is a contiguous sequence of elements in an array. For example, the array [ 2, 3, 4 ] is a subarray of the array [ 1, 2, 3, 4, 5 ]. On the other hand, the array [ 2, 3, 5 ] is not a subarray of the array [ 1, 2, 3, 4, 5 ], because there doesn't exist a contiguous sequence [ 2, 3, 5 ] in the array. Note that numbers may also be negative.

Now, let's look at the code. We start by initialising the maximum sum to zero. After all, a subarray of zero elements has a sum of zero. Then, we iterate through the array. For each element, we go through all the elements after it, and we compute the sum of the subarray beginning and ending at the two selected elements. This way, we go over all possible subarrays in the array. We keep track of the maximum sum we've found so far, and in the end, we return it. Since we have two nested loops over the array, the function performs on the order of n^2 operations, giving it a time complexity of O(n^2).
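A quick usage sketch of the function (class name hypothetical), using an array with negative numbers so the best subarray is a strict middle slice:

```java
class MaxSubArrayDemo
{
	static int maxSubArraySum(int[] arr)
	{
		int maxSum = 0;

		for (int i = 0; i < arr.length; i++)
		{
			int currentSum = 0;

			for (int j = i; j < arr.length; j++)
			{
				currentSum += arr[j];

				if (currentSum > maxSum)
				{
					maxSum = currentSum;
				}
			}
		}

		return maxSum;
	}

	public static void main(String[] args)
	{
		// The best subarray here is [ 4, -1, 2, 1 ], with sum 6.
		System.out.println(maxSubArraySum(new int[] { -2, 1, -3, 4, -1, 2, 1, -5, 4 })); // 6

		// With only negative numbers, the empty subarray (sum 0) wins.
		System.out.println(maxSubArraySum(new int[] { -1, -2 })); // 0
	}
}
```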

This problem can be solved in linear time too!
The linear time solution is left as an exercise for the reader.
Hint: look up Kadane's algorithm.

Exponential time complexity

```java
int
fibonacci(int n)
{
	if (n < 2)
	{
		return 1;
	}

	return fibonacci(n - 1) + fibonacci(n - 2);
}
```

The Fibonacci sequence is a sequence of numbers where the next number is the sum of the previous two numbers. The sequence starts with 1 and 1, and the first few terms are 1, 1, 2, 3, 5, 8, 13, 21, ...

To compute the n-th number in the sequence, we need to compute the (n - 1)-th and (n - 2)-th numbers. In this implementation, we use recursion to do that. Since the function calls itself twice, the time complexity is O(2^n). Eventually, we reach the base case, where the first and second numbers are returned directly.
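You can watch the blow-up happen by counting the recursive calls (the wrapping class and counter are my own additions, not part of the original function):

```java
class FibonacciCallCounter
{
	static long calls;

	static int fibonacci(int n)
	{
		calls++;

		if (n < 2)
		{
			return 1;
		}

		return fibonacci(n - 1) + fibonacci(n - 2);
	}

	public static void main(String[] args)
	{
		calls = 0;
		System.out.println(fibonacci(10)); // 89
		System.out.println(calls);         // 177 calls just for n = 10
	}
}
```

The call count roughly doubles with every +1 on n, which is the exponential growth in action: fibonacci(10) already triggers 177 calls, most of them recomputing values that were computed before.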

The fibonacci sequence can also be computed in linear time by caching values that have already been calculated, or even in constant time using a direct formula. Again, finding these algorithms is left as an exercise for the reader.

Linearithmic time complexity

Linearithmic time complexity, O(n log n), means that the time complexity is composed of a linear part and a logarithmic part. Many algorithms are linearithmic, such as quicksort, mergesort, and heapsort.

Those sorting algorithms all repeatedly split the array and recursively sort the parts (divide and conquer), doing a linear amount of work on each of the O(log n) levels of the recursion.
Therefore the algorithms have a linearithmic time complexity. We will take a look at sorting algorithms in a later blog post, so there will be no example for now.

Worst time complexity ever

O(n^n) is the worst time complexity you're ever going to encounter, and I have never seen a sensible algorithm using it, so I created a stupid example:

```java
int
computeNToThePowerN(int n, int m)
{
	if (m == 0)
	{
		return 1;
	}

	int sum = 0;

	for (int i = 0; i < n; i++)
	{
		sum += computeNToThePowerN(n, m - 1);
	}

	return sum;
}
```

This is basically the worst way to compute n^n, by calling the function with m = n.
Amazing, isn't it?!

This function makes on the order of n^n recursive calls, and each call adds up the results of the n calls below it. Essentially, this function has n nested for loops, each going over the range [0, n).
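A usage sketch (class name hypothetical), calling the function with m = n as described above:

```java
class PowerDemo
{
	static int computeNToThePowerN(int n, int m)
	{
		if (m == 0)
		{
			return 1;
		}

		int sum = 0;

		for (int i = 0; i < n; i++)
		{
			sum += computeNToThePowerN(n, m - 1);
		}

		return sum;
	}

	public static void main(String[] args)
	{
		// 3^3 = 27, computed by literally adding 1 together 27 times.
		System.out.println(computeNToThePowerN(3, 3)); // 27
	}
}
```

Even n = 10 already needs ten billion additions; the O(1) alternative is of course a single call to Math.pow.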

You're never going to encounter this time complexity, so take it as the upper bound of all possible time complexities.

Final thoughts

I hope you enjoyed this post and learnt the basics of space and time complexity! I hope to post more about Algorithms & Data Structures in the near future.

If you found this post helpful, feel free to share it! Any feedback is very welcome!