SQL Server Self Join

Summary: in this tutorial, you will learn how to use the SQL Server self join to join a table to itself.

SQL Server self join syntax

A self join allows you to join a table to itself. It helps query hierarchical data or compare rows within the same table.

A self join uses the inner join or left join clause. Because the query that uses the self join references the same table, the table alias is used to assign different names to the same table within the query.

Note that referencing the same table more than once in a query without using table aliases will result in an error.

The following shows the syntax of joining the table T to itself:

SELECT
    select_list
FROM
    T t1
[INNER | LEFT]  JOIN T t2 ON
    join_predicate; 
Code language: SQL (Structured Query Language) (sql)

The query references the table T twice. The table aliases t1 and t2 are used to assign the T table different names in the query.

SQL Server self join examples

Let’s take some examples to understand how the self join works.

1) Using self join to query hierarchical data

Consider the following  staffs table from the sample database:

SQL Server Self Join - staffs table

The  staffs table stores the staff information such as id, first name, last name, and email. It also has a column named manager_id that specifies the direct manager. For example, Mireya reports to Fabiola because the value in the manager_id of  Mireya is Fabiola.

Fabiola has no manager, so the manager id column has a NULL.

To get who reports to whom, you use the self join as shown in the following query:

SELECT
    e.first_name + ' ' + e.last_name employee,
    m.first_name + ' ' + m.last_name manager
FROM
    sales.staffs e
INNER JOIN sales.staffs m ON m.staff_id = e.manager_id
ORDER BY
    manager;
Code language: SQL (Structured Query Language) (sql)
SQL Server Self Join with INNER JOIN

In this example, we referenced to the  staffs table twice: one as e for the employees and the others as m for the managers. The join predicate matches the employee and manager relationship using the values in the e.manager_id and m.staff_id columns.

The employee column does not have Fabiola Jackson because of the INNER JOIN effect. If you replace the INNER JOIN clause by the LEFT JOIN clause as shown in the following query, you will get the result set that includes Fabiola Jackson in the employee column:

SELECT
    e.first_name + ' ' + e.last_name employee,
    m.first_name + ' ' + m.last_name manager
FROM
    sales.staffs e
LEFT JOIN sales.staffs m ON m.staff_id = e.manager_id
ORDER BY
    manager;
Code language: SQL (Structured Query Language) (sql)
SQL Server Self Join with LEFT JOIN clause

2) Using self join to compare rows within a table

See the following customers table:

customers table

The following statement uses the self join to find the customers located in the same city.

SELECT
    c1.city,
    c1.first_name + ' ' + c1.last_name customer_1,
    c2.first_name + ' ' + c2.last_name customer_2
FROM
    sales.customers c1
INNER JOIN sales.customers c2 ON c1.customer_id > c2.customer_id
AND c1.city = c2.city
ORDER BY
    city,
    customer_1,
    customer_2;Code language: SQL (Structured Query Language) (sql)
SQL Server Self Join - compare rows in the same table

The following condition makes sure that the statement doesn’t compare the same customer:

c1.customer_id > c2.customer_idCode language: SQL (Structured Query Language) (sql)

The following condition matches the city of the two customers:

AND c1.city = c2.cityCode language: SQL (Structured Query Language) (sql)

Note that if you change the greater than ( > ) operator by the not equal to (<>) operator, you will get more rows:

SELECT
    c1.city,
    c1.first_name + ' ' + c1.last_name customer_1,
    c2.first_name + ' ' + c2.last_name customer_2
FROM
    sales.customers c1
INNER JOIN sales.customers c2 ON c1.customer_id <> c2.customer_id
AND c1.city = c2.city
ORDER BY
    city,
    customer_1,
    customer_2;Code language: SQL (Structured Query Language) (sql)
SQL Server Self Join - compare rows in the same table with not equal to operator

Let’s see the difference between > and <> in the ON clause by limiting to one city to make it easier for comparison.

The following query returns the customers located in Albany:

SELECT 
   customer_id, first_name + ' ' + last_name c, 
   city
FROM 
   sales.customers
WHERE
   city = 'Albany'
ORDER BY 
   c;Code language: SQL (Structured Query Language) (sql)
SQL Server Self Join - customers in a city

This query uses ( >) operator in the ON clause:

SELECT
    c1.city,
    c1.first_name + ' ' + c1.last_name customer_1,
    c2.first_name + ' ' + c2.last_name customer_2
FROM
    sales.customers c1
INNER JOIN sales.customers c2 ON c1.customer_id > c2.customer_id
AND c1.city = c2.city
WHERE c1.city = 'Albany'
ORDER BY
    c1.city,
    customer_1,
    customer_2;Code language: SQL (Structured Query Language) (sql)

The output is:

SQL Server Self Join - compare rows in the same table with greater than operator

This query uses ( <>) operator in the ON clause:

SELECT
    c1.city,
	c1.first_name + ' ' + c1.last_name customer_1,
    c2.first_name + ' ' + c2.last_name customer_2
FROM
    sales.customers c1
INNER JOIN sales.customers c2 ON c1.customer_id <> c2.customer_id
AND c1.city = c2.city
WHERE c1.city = 'Albany'
ORDER BY
	c1.city,
    customer_1,
    customer_2;Code language: SQL (Structured Query Language) (sql)

Here is the output:

SQL Server Self Join with not equal to operator

In this tutorial, you have learned how to use an SQL Server self join to query hierarchical data and compare rows in the same table.

Was this tutorial helpful?