February 11, 2022
Writing good SQL code requires sound knowledge of SQL syntax. But that is not enough: writing professional SQL code requires good formatting skills too.
Beginners often neglect SQL indentation and formatting. It’s only when they find their code difficult to read, edit, and compare with other queries that they understand the true meaning of “well-written” code.
Experienced professionals, on the other hand, argue over formatting styles. For instance, there is a long-running debate over whether SQL queries should use trailing or leading commas.
In this article, we’ll delve into SQL indentation and formatting, including the benefits of using a house style, general best practices, and how to format your SQL code using Mode.
Indentation in SQL refers to the spacing in a SQL query. In SQL scripts, indentation ensures that the code is displayed with spacing that improves the readability of the queries. The amount of spacing is subjective and depends on individual preferences for SQL commands, identifiers, functions, etc. For instance, you may keep an indentation of 3 spaces for column and table names, and 6 spaces for nested queries.
Indentation in SQL scripts is one component of formatted SQL. In formatted SQL, queries are written with uniform case, fonts, and spacing so that they are easy and pleasant to read. This also saves time that would otherwise be spent breaking a large piece of code into chunks.
Lengthy text is difficult to comprehend if it’s not well formatted, be it plain text or SQL code. Using indentation in SQL not only reduces the chance of errors but also cuts down the time needed to troubleshoot them. Indented code is easier to maintain, which helps when different teams are working on the same project.
Although SQL indentation doesn’t affect how the code executes, it improves the visual layout and thus makes it easier for programmers to interpret the code. It helps particularly with long queries that have multiple levels of nesting.
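As an illustration, here is a sketch of an indented query with one level of nesting. The tables `customers` and `orders` and their columns are hypothetical, and the 3-space and 6-space indents are just one possible choice.

```sql
-- Hypothetical schema: customers(id, name), orders(customer_id, total, created_at)
SELECT
   c.name,
   recent.order_count,
   recent.total_spent
FROM customers AS c
INNER JOIN (
      -- Nested query indented one level deeper than the outer query
      SELECT
         o.customer_id,
         COUNT(*) AS order_count,
         SUM(o.total) AS total_spent
      FROM orders AS o
      WHERE o.created_at >= '2022-01-01'
      GROUP BY o.customer_id
   ) AS recent
   ON recent.customer_id = c.id
WHERE recent.order_count > 1
ORDER BY recent.total_spent DESC;
```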
As you can see, the indentation makes this code easy to digest. You can quickly identify the columns, table, operator, join type and condition, etc. The code is also indented to deeper levels as the query gets longer.
To show you how unindented SQL code looks, here’s a snippet of a nested SQL statement with an inner join query.
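A hypothetical nested query with an inner join, written on a single line with no indentation and inconsistent case, might look like the following. The table and column names are assumptions for illustration only.

```sql
select c.name, recent.order_count, recent.total_spent from customers c inner join (select o.customer_id, count(*) as order_count, sum(o.total) as total_spent from orders o where o.created_at >= '2022-01-01' group by o.customer_id) recent on recent.customer_id = c.id where recent.order_count > 1 order by recent.total_spent desc;
```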
The code shown above looks unorganized and has poor readability. It can take a while to understand what the query actually does.
SQL programmers and data scientists usually develop a habit of writing indented SQL code over time. However, if you’re not used to it, you can use online tools to format the code.
You can use the TAB key to space the code or use tools built into a SQL editor. Most query editors let you configure whether to insert spaces or tabs when indenting the code. You can customize the spacing option for various statements, choosing the number of spaces or tabs to be inserted at the start of a new line of SQL code.
For instance, Microsoft SSMS lets you format code using SQL formatting options. You can also set how many spaces compose a single tab or indent. Alternatively, you can format SQL queries using third-party tools such as ApexSQL.
Mode offers a one-click option to format your SQL code: the “Format SQL” button. It transforms bulky, messy code into neat-looking SQL. To undo the formatting, simply click the same button.
There are no set rules for SQL indentation; formatting patterns largely depend on individual style and preference.
Speaking of preference, there is generally less disagreement around the importance of indentation and more about smaller formatting details like case, brackets, underscores, commas, etc.
Let’s take a look at some of the industry best practices for SQL formatting.
1. Commas
Many data scientists who read code vertically prefer trailing commas, whereas those with a ‘horizontal’ style like leading commas. The latter group believes that leading commas make it easier to spot a missing comma and troubleshoot errors faster.
Around 80% of Mode users follow a trailing comma pattern, 5% like leading commas, and the rest use a mixed pattern, according to Benn's analysis. It doesn’t matter whether you use a trailing or leading pattern; what’s important is to stick to one convention. That’s why a mixed pattern can create problems: not only is it visually inconsistent, but it might also irritate readers.
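The two comma conventions can be sketched as follows, using a hypothetical `users` table. Both queries return the same result; only the layout differs.

```sql
-- Trailing commas: each comma ends the line of the column it follows.
SELECT
  id,
  first_name,
  last_name
FROM users;

-- Leading commas: each comma starts the next line,
-- so a missing comma breaks the visual column of commas.
SELECT
    id
  , first_name
  , last_name
FROM users;
```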
2. Brackets
Place the opening bracket on the same line as the clause it belongs to and the closing bracket on a separate line. Another convention is to use brackets for WHERE and ON clauses, with the closing brackets at the same indentation as the line that opens them.
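One way to read this convention, sketched with hypothetical `users` and `orders` tables: each opening bracket stays on the line that introduces it, and each closing bracket sits on its own line, aligned with the start of that line.

```sql
SELECT
  u.id,
  u.first_name
FROM users AS u
WHERE u.id IN (
  SELECT o.user_id
  FROM orders AS o
  WHERE (
    o.total > 100
    OR o.item_count > 5
  )
);
```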
3. Joins
When joining data from different sources, prefix each field with the name of its data source to avoid confusion. Use the same indentation for all JOIN clauses.
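A minimal sketch of both points, assuming hypothetical tables `users`, `orders`, and `payments`: every field carries its table name, and every JOIN and ON clause sits at the same indentation.

```sql
SELECT
  users.id,
  users.first_name,
  orders.total,
  payments.paid_at
FROM users
INNER JOIN orders
  ON orders.user_id = users.id
LEFT JOIN payments
  ON payments.order_id = orders.id;
```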
4. Column names
Some programmers like to have each column name on its own line, whereas others like to have them on the same line or distributed over just a couple of lines. The latter don’t like to scroll up and down too often and prefer to view more code on the same screen.
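The two layouts might look like this for a hypothetical `users` table. Neither is more correct; the trade-off is between scanning one column per line and fitting more code on screen.

```sql
-- One column per line
SELECT
  id,
  first_name,
  last_name,
  email,
  created_at
FROM users;

-- Columns distributed over a couple of lines
SELECT
  id, first_name, last_name,
  email, created_at
FROM users;
```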
5. Case
Usually, keywords, operators, and functions are written in uppercase, while field names are typed in lowercase. Avoid CamelCase as it’s difficult to read.
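The contrast can be sketched as below. The table and column names are hypothetical, and the first query assumes a schema that was named in CamelCase.

```sql
-- Harder to scan: mixed-case keywords and CamelCase identifiers
select FirstName, count(*) As OrderCount from UserOrders group By FirstName;

-- Easier to scan: uppercase keywords and functions, lowercase field names
SELECT
  first_name,
  COUNT(*) AS order_count
FROM user_orders
GROUP BY first_name;
```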
6. Nested queries
Nested queries, or sub-queries, often contain arithmetic operators. These can be formatted by moving each operation onto a new line and setting a specific indentation for each operation. Consider the code below for instance:
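A hypothetical example of such a query before formatting, with arithmetic in both the outer and the nested query. The `order_items` table and its columns are assumptions for illustration.

```sql
select product_id, revenue, cost, (revenue - cost) / revenue * 100 as margin_pct from (select product_id, sum(price * quantity) as revenue, sum(unit_cost * quantity) as cost from order_items group by product_id) totals where revenue > 0;
```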
After applying the Mode formatting style, it looks like:
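The exact output depends on the formatter. The following is a hand-formatted sketch of a nested query with arithmetic operators, not verified output from Mode's Format SQL button, and the `order_items` table and its columns are hypothetical. Each arithmetic operation in the margin calculation is moved to its own line and indented under the expression it continues.

```sql
SELECT
  product_id,
  revenue,
  cost,
  (revenue - cost)
    / revenue
    * 100 AS margin_pct
FROM (
  SELECT
    product_id,
    SUM(price * quantity) AS revenue,
    SUM(unit_cost * quantity) AS cost
  FROM order_items
  GROUP BY product_id
) AS totals
WHERE revenue > 0;
```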
7. General
Use an underscore where you would naturally use a space. However, avoid using them to separate name parts. Start boolean fields with `has_`, `is_`, or `does_`. End timestamp fields with `_at` and date fields with `_date`.
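Applied to a table definition, these naming conventions might look like the sketch below. The table, its columns, and the column types are hypothetical, and type names vary between databases.

```sql
CREATE TABLE user_accounts (
  id INTEGER PRIMARY KEY,
  first_name VARCHAR(100),   -- underscore where a space would naturally go
  is_active BOOLEAN,         -- boolean fields start with is_, has_, or does_
  has_newsletter BOOLEAN,
  does_auto_renew BOOLEAN,
  created_at TIMESTAMP,      -- timestamps end with _at
  birth_date DATE            -- dates end with _date
);
```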
SQL indentation is a subjective matter! We suggest keeping a notebook of guidelines—or house style—across the company so that all data teams follow the same preferences. The aim should be to enable a programmer to understand the code by skimming through it quickly.
If you want to experiment with our Format SQL feature, sign up for Mode for free and use it in our freemium Mode studio option.
Data analysts and scientists prefer Mode for fast, flexible, and collaborative data analysis. Explore the tool and share your feedback with us at hi [at] modeanalytics.com or talk to our sales team. We’d love to hear from you!