Introduction
Linear data structures are by far the most common kinds of data
structure in computer science.
If implemented correctly, they are easy, fast, and efficient.
The idea of any data structure is that they store arbitrary data
in a way that makes it easy to access and manipulate the data.
We will go over all the important linear data structures with
implementations in Java and amortised analysis of the methods
they provide.
Arrays
Arrays are the most common linear data structure. Most other linear data structures are implemented using arrays.
As you can see, arrays are simple:
they are simple contiguous blocks of memory that hold data.
Elements on the array are accessed by their index.
Indices start at 0, except they start at 1 in some ghoulish
programming languages that shall not be named for legal reasons.
In Java, which is a not a ghoulish language (or is it?),
indices start at 0.
Pros:
- Any element can be accessed easily by its index
- The elements are stored contiguously, which means that the caching of elements is very efficient
- It is very straightforward to iterate over the elements on the array
- Arrays are fixed size, so eventually we will run out of space
- Arrays don't have a lot of functionality, especially when it comes to searching for elements
Array Implementation
1
public class Array<T>
2
{
3
private T[] array;
4
5
public
6
Array(int size)
7
{
8
// Allocate memory for the array.
9
10
array = new T[size];
11
}
12
13
public T
14
get(int index)
15
{
16
// Check if the index is valid.
17
18
if(index < 0 || index >= array.length)
19
{
20
throw new IndexOutOfBoundsException();
21
}
22
23
// Return the element at the provided index.
24
25
return array[index];
26
}
27
28
public void
29
set(int index, T value)
30
{
31
// Check if the index is valid.
32
33
if(index < 0 || index >= array.length)
34
{
35
throw new IndexOutOfBoundsException();
36
}
37
38
// Set the element at the provided index.
39
40
array[index] = value;
41
}
42
}
In order to support any types of data, we use Java generics.
This allows us to use the same array implementation for all
types of data.
An array of integers can be written as
Array<Integer>
,
an array of strings can be written as
Array<String>
,
and so on.
As you can see, arrays are very simple and don't have a lot of
functionality.
Linked Lists
Linked lists are a non-contiguous data structure. Each node in the list contains a reference to the next node.
Linked Lists vs. Arrays
As you can see from the diagram, linked lists are very similar
to arrays. The biggest difference is that linked lists are not
contiguous, whereas arrays are.
Pros:
- It is very easy to add or remove elements from a linked list at any position
- It is still easy to iterate over all elements of a linked list
- Since linked lists are not contiguous, we can't access elements by their index
- Caching locality of linked lists is really bad because the elements are not stored contiguously
- Linked lists waste memory because we need to store a reference to the next node
Linked lists are slower because the elements are not contiguously stored. This can lead to a lot of cache misses when accessing elements. This can have a substantial performance impact in programs.
The only good reason to use linked lists is if you really need a container that supports adding or removing elements at any position.
If you store references to elements in a linked list in another data structure, you might be able to add or remove elements in constant time.
Linked List Implementation
1
public class Node<T>
2
{
3
public T data;
4
public Node<T> next;
5
6
public
7
Node(T data)
8
{
9
this.data = data;
10
next = null;
11
}
12
}
13
14
public class LinkedList<T>
15
{
16
private Node<T> head;
17
private Node<T> tail;
18
private int curSize;
19
20
public
21
LinkedList()
22
{
23
// Initialise the linked list.
24
25
head = null;
26
tail = null;
27
curSize = 0;
28
}
29
30
public int
31
size()
32
{
33
return curSize;
34
}
35
36
public Node<T>
37
getNodeByIndex(int index)
38
{
39
// Check if the index is valid.
40
41
if (index < 0 || index >= curSize)
42
{
43
throw new IndexOutOfBoundsException();
44
}
45
46
// Iterate over the list until we reach the index.
47
48
Node<T> curNode = head;
49
50
for (int i = 0; i < index; i++)
51
{
52
curNode = curNode.next;
53
}
54
55
return curNode;
56
}
57
58
public T
59
get(int index)
60
{
61
return getNodeByIndex(index).data;
62
}
63
64
public Node<T>
65
getNodeByData(T data)
66
{
67
// Iterate over the list until we reach
68
// the first node with the given data.
69
70
Node<T> curNode = head;
71
72
while (curNode != null)
73
{
74
if (curNode.data == data)
75
{
76
break;
77
}
78
79
curNode = curNode.next;
80
}
81
82
// If there was a node with the given data,
83
// a reference to it is returned.
84
// The reference will be null if a node
85
// with the given data was not found.
86
87
return curNode;
88
}
89
90
...
91
}
Above you can see the core structure of a linked list.
The head and tail nodes are used to keep track of the
first and last elements of the list.
There are some utility functions that can be used to
find elements in the list.
We also keep track of the current size of the list.
Every time we add or remove an element, we increment the size,
and every time we add or remove an element,
we decrement the size.
Next, we will add a couple of methods to the linked list
and go over how they work.
Linked List: Append Element
Appending an element to the end of a linked list is pretty easy.
We just need to create a new node and append it to the tail.
There is one edge case to be aware of though:
if the linked list is empty, we need to set the head and tail
to the new node instead.
1
public void
2
append(T data)
3
{
4
// Create a new node.
5
// Since `node.next` is initialised to NULL already,
6
// so we don't need to set it.
7
8
Node<T> node = new Node<T>(data);
9
10
// Check if the list is empty.
11
12
if(head == null)
13
{
14
// Set the head and tail to the new node.
15
16
head = node;
17
tail = node;
18
}
19
else
20
{
21
// Add the node to the end of the linked list.
22
23
tail.next = node;
24
25
// Update the tail.
26
27
tail = node;
28
}
29
}
Linked List: Prepend Element
Prepending an element to the beginning of a linked list is very similar to appending an element to the end of a linked list:
1
public void
2
prepend(T data)
3
{
4
// Create a new node.
5
6
Node<T> node = new Node<T>(data);
7
8
// Check if the list is empty.
9
10
if(head == null)
11
{
12
// Set the head and tail to the new node.
13
14
head = node;
15
tail = node;
16
}
17
else
18
{
19
// Add the node to the beginning of the linked list.
20
21
node.next = head;
22
23
// Update the head.
24
25
head = node;
26
}
27
}
Removing the first or last element from the linked list is also quite trivial and is left as an exercise for the reader.
Doubly Linked Lists
In order to support some of the more advanced operations
like adding or removing elements at the middle of the list,
we need to add a reference to the previous node.
A linked list with nodes that store references to the previous
node is called a doubly linked list and looks like this:
Note that each node now also points back to the previous node,
and the previous element of the head is NULL.
We simply add a reference to the previous node to the
Node<T>
class and update the constructor:
1
public class Node<T>
2
{
3
public T data;
4
public Node<T> next;
5
public Node<T> prev;
6
7
public
8
Node(T data)
9
{
10
this.data = data;
11
next = null;
12
prev = null;
13
}
14
}
That's it! We don't really have to change anything in the
LinkedList<T>
class, except for the append
and prepend
methods.
Think about how we would have to change these methods for a
doubly linked list.
Doubly Linked List: Add Element
If you have a reference to the previous node, you can add a new element to the middle of the list by updating a couple of references, as you can see in the picture above.
1
public void
2
add(T data, Node<T> prevNode)
3
{
4
// Create a new node.
5
6
Node<T> node = new Node<T>(data);
7
8
// Save a reference to the next node.
9
10
Node<T> nextNode = prevNode.next;
11
12
// Set the next and prev references on the new node.
13
14
node.next = nextNode;
15
node.prev = prevNode;
16
17
// Link the node with the previous node.
18
19
prevNode.next = node;
20
21
// Link the node with the next node.
22
23
if (nextNode != null)
24
{
25
nextNode.prev = node;
26
}
27
28
// If there is no next node, set the new node as tail
29
30
else
31
{
32
tail = node;
33
}
34
}
Doubly Linked List: Remove Element
Finally, let's create a method that takes a reference to a node and removes it from a doubly linked list.
1
public void
2
remove(Node<T> node)
3
{
4
// Save a reference to the next node.
5
6
Node<T> nextNode = node.next;
7
8
// Save a reference to the previous node.
9
10
Node<T> prevNode = node.prev;
11
12
// If we're removing the head, update the head.
13
14
if (prevNode == null)
15
{
16
head = nextNode;
17
nextNode.prev = null;
18
}
19
20
// Else, link the previous node with the next node.
21
22
else
23
{
24
prevNode.next = nextNode;
25
}
26
27
// If we're removing the tail, update the tail.
28
29
if (nextNode == null)
30
{
31
tail = prevNode;
32
prevNode.next = null;
33
}
34
35
// Else, link the next node with the previous node.
36
37
else
38
{
39
nextNode.prev = prevNode;
40
}
41
42
// Let Java's Garbage Collector delete the node.
43
}
Arrays vs. Linked Lists: Time Complexity
We have gone through some common operations on arrays and linked lists, so try to analyse their time complexity and compare your results to the below table.
Operation | Array | Linked List |
---|---|---|
Accessing by index | ||
Appending an element | - | |
Prepending an element | - | |
Adding an element | - | * |
Removing an element | - | * |
* Assuming we already have a reference to the node we want to
add or remove. If we don't have this reference, we will have to
traverse the list to find the node. This takes
time.
As you can see in the table, accessing elements from a linked
list takes time, and adding or
removing elements takes time.
The catch is that we can only add or remove elements if we have
a reference to the node we want to add to or to remove.
We first have to find the node, so we still have to traverse
the list to add or remove an element.
So, linked lists really only make sense if you store references
to nodes you might want to change, which is usually not
very efficient.
The only true operation
that a linked list supports is appending or prepending an
element, which we cannot do with arrays, since they are fixed
in size.
Also, since arrays are fixed in size, we cannot really do
anything with them except accessing and changing elements by
index.
This is why dynamic arrays and deques exist, which are arrays
that can change in size and support
appending and prepending
operations.
Dynamic arrays and deques are much more efficient than linked
lists, since they do store their elements contiguously.
We will talk about these data structures in the
next blog post.
Final Thoughts
In today's blog post, we've learnt how arrays and linked lists
work, and what their differences are. We've also learnt how to
implement linked lists and doubly linked lists in Java.
In the next blog post, we'll learn how to implement a
dynamic array works and we will use it to implement a stack and
a queue.
If you found this post helpful, feel free to share it!
Any feedback is very welcome!