Eliminating Duplicates with DISTINCT
The DISTINCT keyword eliminates duplicate rows from the results of a SELECT statement. If DISTINCT is not specified, all rows are returned, including duplicates. For example, if you select all the author IDs in titleauthor without DISTINCT, the following rows are returned (with some duplicate listings):
USE pubs
SELECT au_id
FROM titleauthor
Here is the result set:
au_id
-----------
172-32-1176
213-46-8915
213-46-8915
238-95-7766
267-41-2394
267-41-2394
274-80-9391
409-56-7008
427-17-2319
472-27-2349
486-29-1786
486-29-1786
648-92-1872
672-71-3249
712-45-1867
722-51-5454
724-80-9391
724-80-9391
756-30-7391
807-91-6654
846-92-7186
899-46-2035
899-46-2035
998-72-3567
998-72-3567
(25 row(s) affected)
With DISTINCT, you can eliminate duplicates and see only the unique author IDs:
USE pubs
SELECT DISTINCT au_id
FROM titleauthor
Here is the result set:
au_id
-----------
172-32-1176
213-46-8915
238-95-7766
267-41-2394
274-80-9391
409-56-7008
427-17-2319
472-27-2349
486-29-1786
648-92-1872
672-71-3249
712-45-1867
722-51-5454
724-80-9391
756-30-7391
807-91-6654
846-92-7186
899-46-2035
998-72-3567
(19 row(s) affected)
Important The output for statements involving DISTINCT depends on the collation of the column or expression on which the DISTINCT is applied. For more information about the effects of different collations, see SQL Server Collation Fundamentals.
For the DISTINCT keyword, null values are considered to be duplicates of each other. When DISTINCT is included in a SELECT statement, only one NULL is returned in the results, regardless of how many null values are encountered.
Note For compatibility with the SQL-92 standard and other implementations of Microsoft® SQL Server™, the ALL keyword can explicitly ask for all rows. However, there is no need to specify ALL because it is the default.