From Code to Clarity: Visualizing SQL code for Documentation and Debugging

Published on May 29, 2023
Updated on September 18, 2024

In this post we will show you how you can create a visual diagram of your SQL code. We will use the FlowHigh SQL visualiser to achieve this.

FlowHigh is the essential toolbox for SQL data engineers and SQL developers. It helps engineers to increase their productivity when developing SQL code and data pipelines.

Apart from the FlowHigh SQL visualiser it is made up of the following components

FlowHigh SQL Optimiser. The Optimiser detects 30+ anti patterns in SQL code and gives recommendations on how to fix them.

FlowHigh SQL Parser. The SQL parser helps to break down individual SQL code into its components.

FlowHigh SQL Tuning. A marketplace that brings together SQL experts and for tuning poorly performing SQL queries.

We will add new features such as automated SQL data pipelines and SQL impact analysis over time. Get access to the FlowHigh to keep up to date and try out existing and new features.

Let’s take a simple SELECT statement and show how the FlowHigh visualiser converts the code to a visual flow diagram.

Visualise SQL code

The SQL code is of low complexity and contains multiple Common Table Expressions (CTE). Let’s have a look at the visual diagram that FlowHigh Visualiser generates

Using the visual diagram of the SQL code we can quickly see that it breaks down into two CTEs.

The first CTE f_customer_sales joins the tables FactInternetSales and DimCustomer together.

The second CTE d_date queries the table DimDate and applies a filter on the CalendarYear field.

The two CTEs are then joined and an aggregation is applied. In a final step the resultset is sorted and returned to the user.

Table types

For representing SQL code we have come up with a categorisation for tables. We have already come across three of these table types in the previous section: physical tables, CTEs, and Select.

Let’s go through them one by one

Physical table

A physical table is a standard table that has been created with the CREATE TABLE DDL statement.

The following query references a physical table

… and the representation by FlowHigh on the diagram.

Pseudo tables

Pseudo tables are tables that are created on the fly by hard coding values.

An example

Common Table Expressions (CTE)

We have already seen CTEs in action in our first example. CTEs are reusable snippets of SQL that can be nested and referenced in other components of the same SQL statement.

There is one CTE visualisation feature I did not mention. Sometimes developers write a CTE and then don’t use it.

Here is an example

We have added a third CTE d_geo to the SQL. However, this is not referenced anywhere else.

We can see that this results in an orphaned component on the diagram which is not connected to any other parts of the SQL. We also grey this out on the diagram.

This will also be detected by the FlowHigh optimiser as an SQL anti pattern.

Inline view

Inline views are similar to CTEs. They can both be nested. One of the differences is that CTEs are reusable and multiple instances can be referenced inside the same SQL statement.

Recursive CTE

Recursive CTEs traverse a hierarchy. Here is an example

The diagram shows the anchor and recursive clause. For a great explanation on how recursive CTEs work check this blog post Understanding Recursive CTEs.

Function table

Some databases and SQL dialects offer User Defined Table Functions (UDTF).

Here is an example for a UDTF

And how FlowHigh displays a UDTF in the diagram

Browsing and searching SQL

You can browse the diagram

You can find tables by browsing and selecting tables from the Table List. This is very useful if you need to quickly find a table in a very large and complex SQL statement with many subqueries and table joins.

In the figure below we have selected the DimCutsomer table and the f_customer_sales CTE. They are highlighted on the diagram.

You can also browse the diagram itself and select tables which will then highlight the corresponding part of the SQL code.

In the figure below we have selected the f_customer_sales CTE. This will highlight the definition of the CTE and also any instances of the CTE.

We search for the d_date CTE and results are highlighted in the editor and the diagram

Expand and collapse

You can collapse and also expand any CTEs and Inline Views. This is useful to drill down into the SQL hierarchy and nest or unnest any lower or higher level parts of the SQL.

Let’s collapse both CTEs to focus on the top level tables that make up our SQL.

We can now drill further into our code one CTE at a time

The same expand and collapse functionality also exists for the SQL code editor. We can also expand and collapse and inline views or CTEs.

This drill down feature particularly useful for deeply nested and complex SQL

Level 1

Level 2

Level 3

Level 4

Columns

You can also view column level detail

Hover over a table

…or an operator

Expand a table

Expand all tables

Other features

Export and download SQL diagram

You can also download the SQL diagram as a PNG to use in your documentation.

Zoom

You can zoom in and out of a diagram by using your mouse wheel or the zoom feature in the bottom right corner of the diagram

FlowHigh [Video]: Visualise, format, optimise and parse your SQL

Use cases for visualising SQL

Visualising SQL is useful for the following scenarios

  • SQL query visualisation helps understand complex queries. It allows developers to see how each part of the query contributes to the final result, making it easier to understand and debug.
  • SQL query visualisations can be used to document SQL code

Contact us if you would like to be part of the private preview of the FlowHigh SQL revolution