Parts of a SELECT Statement

Accessing and Changing Relational Data

Accessing and Changing Relational Data

Parts of a SELECT Statement

The full syntax of the SELECT statement is complex, but the main clauses can be summarized as:

SELECT select_list
[INTO new_table_name]
FROM table_list
[WHERE search_conditions]
[GROUP BY group_by_list]
[HAVING search_conditions]
[ORDER BY order_list [ASC | DESC] ]

select_list

Describes the columns of the result set. It is a comma-separated list of expressions. Each expression defines both the format (data type and size) and the source of the data for the result set column. Each select list expression is usually a reference to a column in the source table or view the data is coming from, but can be any other expression, such as a constant or a Transact-SQL function. Using the * expression in a select list specifies that all columns in the source table are returned.

INTO new_table_name

Specifies that the result set is used to create a new table. new_table_name specifies the name of the new table.

FROM table_list

Contains a list of the tables from which the result set data is retrieved. These sources can be:

  • Base tables in the local server running Microsoft® SQL Server™.

  • Views in the local SQL Server. SQL Server internally resolves a view reference to references against the base tables that make up the view.

  • Linked tables, which are tables in OLE DB data sources made accessible to SQL Server. This is called a distributed query. OLE DB data sources can be accessed from SQL Server by linking them as a linked server, or referencing the data source in an OPENROWSET or OPENQUERY function.

    The FROM clause can also contain join specifications, which define the specific path SQL Server is to use in navigating from one table to another.

    The FROM clause is also used on the DELETE and UPDATE statements to define the tables that are modified.

WHERE search_conditions

The WHERE clause is a filter that defines the conditions each row in the source tables must meet to qualify for the SELECT. Only rows that meet the conditions contribute data to the result set. Data from rows that do not meet the conditions are not used.

The WHERE clause is also used on the DELETE and UPDATE statements to define the rows in the target tables that are modified.

GROUP BY group_by_list

The GROUP BY clause partitions the result set into groups based on the values in the columns of the group_by_list. For example, the Northwind Orders table has three values in ShipVia. A GROUP BY ShipVia clause partitions the result set into three groups, one for each value of ShipVia.

HAVING search_conditions

The HAVING clause is an additional filter that is applied to the result set. Logically, the HAVING clause filters rows from the intermediate result set built from the application of any FROM, WHERE, or GROUP BY clauses in the SELECT statement. HAVING clauses are most commonly used with a GROUP BY clause, although a GROUP BY clause is not required before a HAVING clause.

ORDER BY order_list [ ASC | DESC ]

The ORDER BY clause defines the order in which the rows in the result set are sorted. order_list specifies the result set columns that make up the sort list. The ASC and DESC keywords are used to specify if the rows are sorted in an ascending or descending sequence.

ORDER BY is important because relational theory specifies that the rows in a result set cannot be assumed to have any sequence unless ORDER BY is specified. ORDER BY must be used in any SELECT statement for which the order of the result set rows is important.

The clauses in a SELECT statement must be specified in the proper order.

Each reference to a database object must be unambiguous. Ambiguity can come from these sources:

  • There may be multiple objects with the same name in a system. For example, both User1 and User2 may have defined a table named TableX. To resolve the ambiguity and specify the TableX owned by User1, qualify the table name with at least the user ID:
    SELECT *
    FROM User1.TableX
    
  • The database in which the object resides may not always be the current database when the SELECT statement is executed. To ensure that the proper object is always used, regardless of the current database setting, qualify the object name with the database and owner:
    SELECT *
    FROM Northwind.dbo.Shippers
    
  • The tables and views specified in the FROM clause may have duplicate column names. It is especially likely that foreign keys will have the same column name as their related primary key. To resolve the ambiguity between duplicate names, the column name must be qualified with the table or view name:
    SELECT DISTINCT Customers.CustomerID, Customers.CompanyName
    FROM Customers JOIN Orders ON
           ( Customers.CustomerID = Orders.CustomerID)
    WHERE Orders.ShippedDate > 'May 1 1998'
    

    This syntax becomes cumbersome when the table and view names must themselves be fully qualified. This problem is resolved by assigning a correlation name (also known as a range variable or alias) to the table, using the AS keyword in the FROM clause. The fully qualified table or view name has to be specified only in the FROM clause. All other table or view references can then use the correlation name. Applying correlation names and fully qualifying the tables in the earlier sample results in this SELECT statement:

    SELECT DISTINCT Cst.CustomerID, Cst.CompanyName
    FROM Northwind.dbo.Customers AS Cst
       JOIN
         Northwind.dbo.Orders AS Ord
       ON ( Cst.CustomerID = Ord.CustomerID)
    WHERE Ord.ShippedDate > 'May 1 1998'
    

For more information about object qualification, see Using Identifiers.

Many Transact-SQL examples in the SQL Server Books Online are simplified by not using qualified names. Although these elements are left out of the examples to promote readability, it is recommended that Transact-SQL statements in production systems use qualified names.

See Also

Expressions

SELECT

IDENTITY (Property)