SQL Server CREATE INDEX Statement

Summary: in this tutorial, you will learn how to use the SQL Server CREATE INDEX statement to create nonclustered indexes for tables.

Introduction to SQL Server non-clustered indexes #

A nonclustered index is a data structure that improves the speed of data retrieval from tables. Unlike a clustered index, a nonclustered index sorts and stores data separately from the data rows in the table. It is a copy of selected columns of data from a table with the links to the associated table.

Like a clustered index, a nonclustered index uses the B-tree structure to organize its data.

A table may have one or more nonclustered indexes and each non-clustered index may include one or more columns in a table.

The following picture illustrates the structure of a non-clustered index:

Besides storing the index key values, the leaf nodes also store row pointers to the data rows that contain the key values. These row pointers are also known as row locators.

If the underlying table is a clustered table, the row pointer is the clustered index key. In case the underlying table is a heap, the row pointer points to the row of the table.

SQL Server CREATE INDEX statement #

To create a non-clustered index, you use the CREATE INDEX statement:

CREATE [NONCLUSTERED] INDEX index_name
ON table_name(column_list);Code language: SQL (Structured Query Language) (sql)

In this syntax:

First, specify the name of the index after the CREATE NONCLUSTERED INDEX clause. Note that the NONCLUSTERED keyword is optional.
Second, specify the table name on which you want to create the index and a list of columns of that table as the index key columns.

SQL Server CREATE INDEX statement examples #

We will use the sales.customers from the sample database for the demonstration.

The sales.customers table is a clustered table because it has a primary key customer_id.

1) Using the CREATE INDEX statement to create a nonclustered index for one column example #

This statement finds customers who are located in Atwater:

SELECT 
    customer_id, 
    city
FROM 
    sales.customers
WHERE 
    city = 'Atwater';Code language: SQL (Structured Query Language) (sql)

If you display the estimated execution plan, you will see that the query optimizer scans the clustered index to find the row. This is because the sales.customers table does not have an index for the city column.

To improve the speed of this query, you can create a new index named ix_customers_city for the city column:

CREATE INDEX ix_customers_city
ON sales.customers(city);Code language: SQL (Structured Query Language) (sql)

Now, if you display the estimated execution plan of the above query again, you will find that the query optimizer uses the nonclustered index ix_customers_city:

SQL Server CREATE INDEX one column index seek

2) Using the CREATE INDEX statement to create a nonclustered index for multiple columns #

The following statement finds the customer whose last name is Berg and the first name is Monika:

SELECT 
    customer_id, 
    first_name, 
    last_name
FROM 
    sales.customers
WHERE 
    last_name = 'Berg' AND 
    first_name = 'Monika';Code language: SQL (Structured Query Language) (sql)

SQL Server CREATE INDEX on multiple columns index scan

The query optimizer scans the clustered index to locate the customer.

To speed up the retrieval of data, you can create a nonclustered index that includes both last_name and first_name columns:

CREATE INDEX ix_customers_name 
ON sales.customers(last_name, first_name);Code language: SQL (Structured Query Language) (sql)

Now, the query optimizer uses the index ix_customers_name to find the customer.

SELECT 
    customer_id, 
    first_name, 
    last_name
FROM 
    sales.customers
WHERE 
    last_name = 'Berg' AND 
    first_name = 'Monika';Code language: SQL (Structured Query Language) (sql)

SQL Server CREATE INDEX on multiple columns index seek

When you create a nonclustered index that consists of multiple columns, the order of the columns in the index is very important. You should place the columns that you often use to query data at the beginning of the column list.

For example, the following statement finds customers whose last name is Albert. Because the last_name is the leftmost column in the index, the query optimizer can leverage the index and use the index seek method for searching:

SELECT 
    customer_id, 
    first_name, 
    last_name
FROM 
    sales.customers
WHERE 
    last_name = 'Albert';Code language: SQL (Structured Query Language) (sql)

This statement finds customers whose first name is Adam. It also leverages the ix_customer_name index. But it needs to scan the whole index for searching, which is slower than index seek.

SELECT 
    customer_id, 
    first_name, 
    last_name
FROM 
    sales.customers
WHERE 
    first_name = 'Adam';
Code language: SQL (Structured Query Language) (sql)

SQL Server CREATE INDEX multiple columns not leftmost column index scan

Therefore, it is a good practice to place the columns that you often use to query data at the beginning of the column list of the index.

Summary #

A non-clustered index copies the table data and stores it in a separate data structure (B-tree).
A table can have multiple non-clustered indexes.
Use the CREATE INDEX statement to create a non-clustered index to enhance the query speed.

Was this tutorial helpful?