Tricks with SQL: Beware of the Predicate in the Outer Join

Uli Bethke SQL

Today we will have a look what happens when we place a filter condition into an outer join. We will also have a look at how this compares to placing the filter condition into the WHERE clause. Finally we'll have a look where this could be useful.

Let's first create some sample data.

Let's run our first outer join query where we put the filter condition into the WHERE clause. All is as expected. The query just returns customer_id 1.

Now let's run the same query, but we put the filter condition into the join. This may be different from what you may have expected. What happens is that the query returns all of the rows in the customer table and those rows of table customer_loc where the join condition is met, i.e. those rows where customer_id = 1.

Let's verify what we have just seen with another query. This time we will put customer_id = 4 into the Join condition. There are no rows in table customer_loc that match this. As expected the query returns all rows for table customer but now rows for table customer_loc

What could this be useful for? One use case would be some limited form of data densification whereby you need to return all of the tables in one table but only a subset of rows in another table. Typically this can only be done using some sort of subselect. An example:

The query below is run in the SH schema and returns all of the customers, but only those sales transactions that are larger than 1000 in sales_amount.