Introduction
In the programming world, there are many different coding styles.
Each of them has its own set of rules and guidelines.
Sometimes a coding style is bound to the language it is used in,
because of specific syntax rules.
But usually the rules are more general and can be applied to
almost any language.
Coding style can be broken down into two main categories:
- Indentation & Whitespace
- Placement of brackets
In this blog post I will mostly discuss the most popular coding styles among C-like languages. These languages include C, C++, C#, Java, JavaScript, TypeScript, PHP, Go, Rust, Kotlin, Scala, R, Swift and many others.
Many things in this post still apply to languages that don't inherit their syntax from C, like Python and Ruby.
Brace Placement
C-like languages tend to use a lot of braces. Braces are used to
define blocks of code, and in many languages also to enclose
classes, structures, and array or object literals.
There are different brace placement styles. I will go over
the most popular ones and discuss their pros and cons.
Note that no brace placement style is objectively the best.
Not everyone finds the same style the most readable.
Much is subjective, and comes down to aesthetics.
1TBS (One True Brace Style)
#include <stdio.h>

int fibonacci(int n) {
    if (n < 2) {
        return 1;
    }

    return fibonacci(n - 1) + fibonacci(n - 2);
}

int main() {
    for (int i = 0; i < 10; i++) {
        printf("%d ", fibonacci(i));
    }
}
1TBS is a very popular brace style, and the variant shown here is the convention in Java. It descends from the K&R style described below rather than originating in Java. In 1TBS, every opening brace goes at the end of the line that opens the block, and every closing brace goes on its own line.
K&R (Kernighan and Ritchie)
#include <stdio.h>

int fibonacci(int n)
{
    if (n < 2) {
        return 1;
    }

    return fibonacci(n - 1) + fibonacci(n - 2);
}

int main()
{
    for (int i = 0; i < 10; i++) {
        printf("%d ", fibonacci(i));
    }
}
The creators of the C programming language, Brian Kernighan and
Dennis Ritchie, used this style of brace placement.
K&R is similar to 1TBS, but opening braces for functions are put
on their own lines.
The reason for this is that in the old days of C, the types
of the parameters were declared differently:
void my_function(a, b, c)
int a;
float b;
char *c;
{
    int local_var;

    ...
}
Let's look at the code again in 1TBS, and we will see that it becomes much less readable:
void my_function(a, b, c)
int a;
float b;
char *c; {
    int local_var;

    ...
}
It is very easy to misread the local variable declaration as
a parameter, because it comes right after the parameter list.
The K&R style fixes this, making it very clear where the
function body starts.
Even though parameters are rarely declared like this anymore in
C, many people stuck to K&R style because they found it easier
to read function declarations with more space between the
parameter list and the function body.
Allman
#include <stdio.h>

int fibonacci(int n)
{
    if (n < 2)
    {
        return 1;
    }

    return fibonacci(n - 1) + fibonacci(n - 2);
}

int main()
{
    for (int i = 0; i < 10; i++)
    {
        printf("%d ", fibonacci(i));
    }
}
The Allman style goes even further than K&R.
All opening braces are put on their own lines.
I personally find this style of brace placement the most
readable. Of course, I might be very biased, but let me explain
why it is so good:
The Allman style really shifts your perspective of writing
clean code. Sticking to any proper set of rules for indentation
will of course help, but I found that the Allman style provides
the best balance between freedom of writing code and readability
of the final product.
The Allman style gives you an extra dimension: a feel for space
in your code. Since putting all braces on their own lines leads
to the lines of code being further apart from each other, it
becomes very natural to leave lines blank every now and then,
intuitively grouping your code into logical blocks that fit
together:
/**
 * @brief Checks if a free memory block can be merged with
 * its neighbouring blocks.
 * If this is possible, the blocks are merged.
 *
 * @param block The free block to check.
 */
void
maybe_merge_free_blocks(FreeHeapBlockHeader *block)
{
    bool prev_exists = block->prev_block != nullptr;
    bool next_exists = block->next_block() != heap_end;

    // If both the previous and next blocks are free,
    // merge them with the current block.

    if (prev_exists && block->prev_block->is_free()
        && next_exists && block->next_block()->is_free())
    {
        merge_three_free_blocks(
            (FreeHeapBlockHeader *) block->prev_block, block,
            (FreeHeapBlockHeader *) block->next_block());
    }

    // If only the previous block is free, merge it with the current block.

    else if (prev_exists && block->prev_block->is_free())
    {
        merge_two_free_blocks(
            (FreeHeapBlockHeader *) block->prev_block, block);
    }

    // If only the next block is free, merge it with the current block.

    else if (next_exists && block->next_block()->is_free())
    {
        merge_two_free_blocks(
            block, (FreeHeapBlockHeader *) block->next_block());
    }

    // The neighbouring blocks are both allocated.
    // We cannot merge anything, so we return.
}
Good code should be readable, and using whitespace correctly
is a key part of that. When I write code that other people will
have to read as well, I want to make sure that the code does
not scream at the reader and is self-explanatory.
In the above snippet, several statements are too long for one
line and are split across multiple lines. You can see that
they are still very readable using the Allman style.
When you start grouping small chunks of code into logical
blocks, you start realising that each block does something
specific. If this specific thing is not very obvious, you
should add a comment explaining what your code block does and why.
The above code snippet has a lot of comments, because it is
part of a memory allocator, which is a rather complex thing.
When code becomes too long: Allman to the rescue
We've all been there: we write a simple function that does
something, and we start adding more and more code to it.
At some point we find ourselves in a situation where we
have a function with parts in it that don't fit nicely on the
screen anymore.
Sometimes, it just makes more sense to have a long line of
code, rather than breaking it up in shorter lines and losing
the context of what the code does.
As you have seen in the previous section, the Allman style
handles these situations really well, due to the fact that
all braces are on their own lines. This allows the reader
(and writer) of the code to easily see where code blocks start.
Let's look at the following long code:
HashMap<Integer, ArrayList<String>> mapNumbersToPlaceNames(ArrayList<Integer> inputs, ArrayList<ArrayList<String>> outputs) {
    HashMap<Integer, ArrayList<String>> map;

    ...

    if (foundIndex < inputs.size() && inputs[foundIndex] == number && !map.contains(number)) {
        map.put(number, outputs[foundIndex]);
    }

    ...

    return map;
}
In the above code, we have three problems:
- The function signature line is too long.
- The code inside the if-statement check is too long.
- We cannot easily tell where the function and if-statement bodies start.
The code is written using a 1TBS brace placement style, which makes it very hard to see where the function and if-statement bodies begin.
Let's convert this code to the Allman style and see what happens:
HashMap<Integer, ArrayList<String>> mapNumbersToPlaceNames(ArrayList<Integer> inputs, ArrayList<ArrayList<String>> outputs)
{
    HashMap<Integer, ArrayList<String>> map;

    ...

    if (foundIndex < inputs.size() && inputs[foundIndex] == number && !map.contains(number))
    {
        map.put(number, outputs[foundIndex]);
    }

    ...

    return map;
}
The code still looks horrible, but it is already a bit easier to
read because of the extra whitespace.
To fix problems like this in the Allman style, we will insert
more whitespace. We will simply continue the code on a new
line, and indent the code by one tab. This will make the code
look much more readable.
HashMap<Integer, ArrayList<String>> mapNumbersToPlaceNames(
    ArrayList<Integer> inputs, ArrayList<ArrayList<String>> outputs)
{
    HashMap<Integer, ArrayList<String>> map;

    ...

    if (foundIndex < inputs.size()
        && inputs[foundIndex] == number
        && !map.contains(number))
    {
        map.put(number, outputs[foundIndex]);
    }

    ...

    return map;
}
If we run out of screen space or we want to enforce the
80-column rule, we can just continue writing the code on the
next line and the indentation will be preserved.
This only works well because the brace is placed on its own
line. In the 1TBS style, one of the three initial problems
will still remain: we cannot easily tell where the function
and if-statement bodies start.
// Allman brace placement

if (foundIndex < inputs.size()
    && inputs[foundIndex] == number
    && !map.contains(number))
{
    map.put(number, outputs[foundIndex]);
}

// One True Brace Style

if (foundIndex < inputs.size()
    && inputs[foundIndex] == number
    && !map.contains(number)) {
    map.put(number, outputs[foundIndex]);
}

// One True Brace Style fix 1

if (foundIndex < inputs.size()
    && inputs[foundIndex] == number
    && !map.contains(number)) {
        map.put(number, outputs[foundIndex]);
}

// One True Brace Style fix 2

if (foundIndex < inputs.size()
        && inputs[foundIndex] == number
        && !map.contains(number)) {
    map.put(number, outputs[foundIndex]);
}
In the 1TBS style, the problem can be mitigated by adding a second level of indentation. Some people indent the continuation lines of the check by one level and the if-statement body by two (fix 1). I dislike this idea because it still looks messy, and it makes the body look two levels deep when it is actually only one level deep.
Another way to fix this in the 1TBS style is to use two levels of indentation for the continuation lines of the check, and one for the body (fix 2). This distinguishes the check from the body a bit better, but I think the Allman style just looks better, and it is more intuitive to write.
Tabs and Spaces
Indentation can be accomplished with tabs or spaces in most
programming languages. You could even mix them up, but I don't
think that's a good idea. (Indentation and alignment are
separate things, though. If you're using tabs for indentation,
you should still align your code using spaces.)
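One way to make the difference between indentation and alignment concrete is to look at the leading whitespace of a single line: tabs come first (indentation), spaces may follow (alignment), and a tab should never appear after a space. The sketch below assumes exactly that rule. The helper name `leading_whitespace_is_consistent` is made up for this post and is not part of any existing tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true if the leading whitespace of `line`
 * is zero or more tabs (indentation) followed by zero or more spaces
 * (alignment). A tab that comes after a space is rejected.
 */
bool
leading_whitespace_is_consistent(const char *line)
{
    assert(line != NULL);

    size_t i = 0;

    // Indentation: any number of tabs.
    while (line[i] == '\t')
    {
        i++;
    }

    // Alignment: any number of spaces.
    while (line[i] == ' ')
    {
        i++;
    }

    // A tab at this point means tabs and spaces were mixed.
    return line[i] != '\t';
}
```

A line such as one tab followed by four spaces and `&& b` passes the check, while two spaces followed by a tab does not.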
I personally prefer using tabs, because they are easier to
delete and replace, since they are only one character long.
Tabs allow you to set your own indentation size,
while allowing other people to use their preferred indentation
size. For some people, including me, it can be annoying to work
on a project which uses spaces with a different indentation
size than my own. On top of that, visually impaired people
might not even be able to work on code that does not use tabs,
because they often have to use a huge font size.
They are forced to convert the spaces to tabs, and then
convert the tabs back to spaces after editing the code.
That is a lot of unnecessary pain, so I encourage everyone
to use tabs.
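Because a tab is a single character, it is the reader's editor that decides how wide it is drawn. A minimal sketch of that rendering step is shown here, assuming a single line of input. The helper name `expand_tabs` and its parameters are hypothetical and not taken from any particular editor.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copies the single line `in` to `out`, replacing
 * each tab with spaces up to the next tab stop. Returns the number of
 * characters written (excluding the terminator), or -1 if `out` is too
 * small to hold the result.
 */
int
expand_tabs(const char *in, char *out, int out_size, int tab_width)
{
    assert(in != NULL && out != NULL);
    assert(out_size > 0 && tab_width > 0);

    int column = 0;

    for (int i = 0; in[i] != '\0'; i++)
    {
        // A tab advances to the next multiple of tab_width;
        // every other character takes exactly one column.
        int count = (in[i] == '\t')
            ? tab_width - (column % tab_width)
            : 1;

        // Keep one byte free for the terminator.
        if (column + count >= out_size)
        {
            return -1;
        }

        for (int j = 0; j < count; j++)
        {
            out[column++] = (in[i] == '\t') ? ' ' : in[i];
        }
    }

    out[column] = '\0';
    return column;
}
```

The same tab-indented line comes out with four spaces of indentation for a reader who sets `tab_width` to 4 and eight spaces for a reader who sets it to 8, while the file itself stays unchanged.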
The size of indentation is really up to you. I personally
prefer tabs with a width of 8, although I used a width of 2 before.
Some people heavily advocate using 8 width tabs:
- Linus Torvalds @ The Linux Coding Style
Tabs with a width of 4 are a good compromise between readability and screen real estate, and are the default in many editors. The code snippets on my website all use a tab width of 4 for these reasons.
Return Types
In statically typed languages, the return type of a function often comes directly before the function name and its parameters. Return types can get very long, so some people prefer to put them on their own line. I have recently switched to enforcing this rule and found that it really does make the code more readable, so I would like to bring it up in this blog post.
HashMap<Integer, ArrayList<String>>
mapNumbersToPlaceNames(...)
{
    ...
}

int
otherFunction(...)
{
    ...
}
For functions with long return types this definitely
makes the code more readable, but for functions with short
return types it might seem a little strange at first.
A cool side-effect of enforcing this rule in languages like C,
where functions can only be declared on the first level of
indentation, is that the regex ^functionName\( matches only
function declarations.
This allows you to easily find the function declaration
by searching for the function name, but excluding all the
calls to the function.
I'm not telling you that you must use this rule from now on
of course, but putting return types on their own lines might
improve readability for functions with long return types,
so feel free to try it out.
Final Thoughts
In the end, there is of course no objectively best style
for indentation. You should try to find a style that works
for you. I hope this blog post will make you more aware of
the different styles of indentation and tools to make your
code look more readable. After all, readable code is much
nicer to work with than code that is hard to read.
Hopefully, you will write all your code using the Allman style
from now on, because it's simply the best style.
(No, I'm just kidding.)
If you have any comments or suggestions, feel free to
contact me.