This section describes operations that take character set information into account.
MySQL has many operators and functions that return a string. This section answers the question: What is the character set and collation of such a string?
For simple functions that take string input and return a string
result as output, the output's character set and collation are
the same as those of the principal input value. For example,
UPPER(
X
) returns a
string whose character string and collation are the same as that
of X
. The same applies for
INSTR()
, LCASE()
,
LOWER()
, LTRIM()
,
MID()
, REPEAT()
,
REPLACE()
, REVERSE()
,
RIGHT()
, RPAD()
,
RTRIM()
, SOUNDEX()
,
SUBSTRING()
, TRIM()
,
UCASE()
, and UPPER()
.
Note: The REPLACE()
function, unlike all
other functions, always ignores the collation of the string
input and performs a case-sensitive comparison.
If a string input or function result is a binary string, the
string has no character set or collation. This can be check by
using the CHARSET()
and
COLLATION()
functions, both of which return
binary
to indicate that their argument is a
binary string:
mysql> SELECT CHARSET(BINARY 'a'), COLLATION(BINARY 'a');
+---------------------+-----------------------+
| CHARSET(BINARY 'a') | COLLATION(BINARY 'a') |
+---------------------+-----------------------+
| binary | binary |
+---------------------+-----------------------+
For operations that combine multiple string inputs and return a single string output, the “aggregation rules” of standard SQL apply for determining the collation of the result:
-
If an explicit
COLLATE
X
occurs, useX
. -
If explicit
COLLATE
X
andCOLLATE
Y
occur, raise an error. -
Otherwise, if all collations are
X
, useX
. -
Otherwise, the result has no collation.
For example, with CASE ... WHEN a THEN b WHEN b THEN c
COLLATE
X
END, the
resulting collation is X
. The same
applies for UNION
, ||
,
CONCAT()
, ELT()
,
GREATEST()
, IF()
, and
LEAST()
.
For operations that convert to character data, the character set
and collation of the strings that result from the operations are
defined by the character_set_connection
and
collation_connection
system variables. This
applies to CAST()
, CONV()
,
FORMAT()
, HEX()
,
SPACE()
. Before MySQL 5.0.15, it also applies
to CHAR()
.
CONVERT()
provides a way to convert data
between different character sets. The syntax is:
CONVERT(expr
USINGtranscoding_name
)
In MySQL, transcoding names are the same as the corresponding character set names.
Examples:
SELECT CONVERT(_latin1'Müller' USING utf8); INSERT INTO utf8table (utf8column) SELECT CONVERT(latin1field USING utf8) FROM latin1table;
CONVERT(... USING ...)
is implemented
according to the standard SQL specification.
You may also use CAST()
to convert a string
to a different character set. The syntax is:
CAST(character_string
AScharacter_data_type
CHARACTER SETcharset_name
)
Example:
SELECT CAST(_latin1'test' AS CHAR CHARACTER SET utf8);
If you use CAST()
without specifying
CHARACTER SET
, the resulting character set
and collation are defined by the
character_set_connection
and
collation_connection
system variables. If you
use CAST()
with CHARACTER SET
X
, the resulting character set and collation are
X
and the default collation of
X
.
You may not use a COLLATE
clause inside a
CAST()
, but you may use it outside. That is,
CAST(... COLLATE ...)
is illegal, but
CAST(...) COLLATE ...
is legal.
Example:
SELECT CAST(_latin1'test' AS CHAR CHARACTER SET utf8) COLLATE utf8_bin;
Several SHOW
statements provide additional
character set information. These include SHOW CHARACTER
SET
, SHOW COLLATION
, SHOW
CREATE DATABASE
, SHOW CREATE TABLE
and SHOW COLUMNS
. These statements are
described here briefly. For more information, see
Section 13.5.4, “SHOW
Syntax”.
INFORMATION_SCHEMA
has several tables that
contain information similar to that displayed by the
SHOW
statements. For example, the
CHARACTER_SETS
and
COLLATIONS
tables contain the information
displayed by SHOW CHARACTER SET
and
SHOW COLLATION
.
Chapter 20, The INFORMATION_SCHEMA
Database.
The SHOW CHARACTER SET
command shows all
available character sets. It takes an optional
LIKE
clause that indicates which character
set names to match. For example:
mysql> SHOW CHARACTER SET LIKE 'latin%';
+---------+-----------------------------+-------------------+--------+
| Charset | Description | Default collation | Maxlen |
+---------+-----------------------------+-------------------+--------+
| latin1 | cp1252 West European | latin1_swedish_ci | 1 |
| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |
| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |
| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |
+---------+-----------------------------+-------------------+--------+
The output from SHOW COLLATION
includes all
available character sets. It takes an optional
LIKE
clause that indicates which collation
names to match. For example:
mysql> SHOW COLLATION LIKE 'latin1%';
+-------------------+---------+----+---------+----------+---------+
| Collation | Charset | Id | Default | Compiled | Sortlen |
+-------------------+---------+----+---------+----------+---------+
| latin1_german1_ci | latin1 | 5 | | | 0 |
| latin1_swedish_ci | latin1 | 8 | Yes | Yes | 0 |
| latin1_danish_ci | latin1 | 15 | | | 0 |
| latin1_german2_ci | latin1 | 31 | | Yes | 2 |
| latin1_bin | latin1 | 47 | | Yes | 0 |
| latin1_general_ci | latin1 | 48 | | | 0 |
| latin1_general_cs | latin1 | 49 | | | 0 |
| latin1_spanish_ci | latin1 | 94 | | | 0 |
+-------------------+---------+----+---------+----------+---------+
SHOW CREATE DATABASE
displays the
CREATE DATABASE
statement that creates a
given database:
mysql> SHOW CREATE DATABASE test;
+----------+-----------------------------------------------------------------+
| Database | Create Database |
+----------+-----------------------------------------------------------------+
| test | CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET latin1 */ |
+----------+-----------------------------------------------------------------+
If no COLLATE
clause is shown, the default
collation for the character set applies.
SHOW CREATE TABLE
is similar, but displays
the CREATE TABLE
statement to create a given
table. The column definitions indicate any character set
specifications, and the table options include character set
information.
The SHOW COLUMNS
statement displays the
collations of a table's columns when invoked as SHOW
FULL COLUMNS
. Columns with CHAR
,
VARCHAR
, or TEXT
data
types have collations. Numeric and other non-character types
have no collation (indicated by NULL
as the
Collation
value). For example:
mysql> SHOW FULL COLUMNS FROM person\G
*************************** 1. row ***************************
Field: id
Type: smallint(5) unsigned
Collation: NULL
Null: NO
Key: PRI
Default: NULL
Extra: auto_increment
Privileges: select,insert,update,references
Comment:
*************************** 2. row ***************************
Field: name
Type: char(60)
Collation: latin1_swedish_ci
Null: NO
Key:
Default:
Extra:
Privileges: select,insert,update,references
Comment:
The character set is not part of the display but is implied by the collation name.