10.6. Operations Affected by Character Set Support

MySQL 5.0

10.6. Operations Affected by Character Set Support

This section describes operations that take character set information into account.

10.6.1. Result Strings

MySQL has many operators and functions that return a string. This section answers the question: What is the character set and collation of such a string?

For simple functions that take string input and return a string result as output, the output's character set and collation are the same as those of the principal input value. For example, ) returns a string whose character string and collation are the same as that of . The same applies for , , , , , , , , , , , , , , , and .

Note: The function, unlike all other functions, always ignores the collation of the string input and performs a case-sensitive comparison.

If a string input or function result is a binary string, the string has no character set or collation. This can be check by using the and functions, both of which return to indicate that their argument is a binary string:

mysql> 
+---------------------+-----------------------+
| CHARSET(BINARY 'a') | COLLATION(BINARY 'a') |
+---------------------+-----------------------+
| binary              | binary                |
+---------------------+-----------------------+

For operations that combine multiple string inputs and return a single string output, the “aggregation rules” of standard SQL apply for determining the collation of the result:

  • If an explicit occurs, use .

  • If explicit and occur, raise an error.

  • Otherwise, if all collations are , use .

  • Otherwise, the result has no collation.

For example, with END, the resulting collation is . The same applies for , , , , , , and .

For operations that convert to character data, the character set and collation of the strings that result from the operations are defined by the and system variables. This applies to , , , , . Before MySQL 5.0.15, it also applies to .

10.6.2. CONVERT() and CAST()

provides a way to convert data between different character sets. The syntax is:

CONVERT( USING )

In MySQL, transcoding names are the same as the corresponding character set names.

Examples:

SELECT CONVERT(_latin1'Müller' USING utf8);
INSERT INTO utf8table (utf8column)
    SELECT CONVERT(latin1field USING utf8) FROM latin1table;

is implemented according to the standard SQL specification.

You may also use to convert a string to a different character set. The syntax is:

CAST( AS  CHARACTER SET )

Example:

SELECT CAST(_latin1'test' AS CHAR CHARACTER SET utf8);

If you use without specifying , the resulting character set and collation are defined by the and system variables. If you use with , the resulting character set and collation are and the default collation of .

You may not use a clause inside a , but you may use it outside. That is, is illegal, but is legal.

Example:

SELECT CAST(_latin1'test' AS CHAR CHARACTER SET utf8) COLLATE utf8_bin;

10.6.3. SHOW Statements and INFORMATION_SCHEMA

Several statements provide additional character set information. These include , , , and . These statements are described here briefly. For more information, see Section 13.5.4, “ Syntax”.

has several tables that contain information similar to that displayed by the statements. For example, the and tables contain the information displayed by and . Chapter 20, The Database.

The command shows all available character sets. It takes an optional clause that indicates which character set names to match. For example:

mysql> 
+---------+-----------------------------+-------------------+--------+
| Charset | Description                 | Default collation | Maxlen |
+---------+-----------------------------+-------------------+--------+
| latin1  | cp1252 West European        | latin1_swedish_ci |      1 |
| latin2  | ISO 8859-2 Central European | latin2_general_ci |      1 |
| latin5  | ISO 8859-9 Turkish          | latin5_turkish_ci |      1 |
| latin7  | ISO 8859-13 Baltic          | latin7_general_ci |      1 |
+---------+-----------------------------+-------------------+--------+

The output from includes all available character sets. It takes an optional clause that indicates which collation names to match. For example:

mysql> 
+-------------------+---------+----+---------+----------+---------+
| Collation         | Charset | Id | Default | Compiled | Sortlen |
+-------------------+---------+----+---------+----------+---------+
| latin1_german1_ci | latin1  |  5 |         |          |       0 |
| latin1_swedish_ci | latin1  |  8 | Yes     | Yes      |       0 |
| latin1_danish_ci  | latin1  | 15 |         |          |       0 |
| latin1_german2_ci | latin1  | 31 |         | Yes      |       2 |
| latin1_bin        | latin1  | 47 |         | Yes      |       0 |
| latin1_general_ci | latin1  | 48 |         |          |       0 |
| latin1_general_cs | latin1  | 49 |         |          |       0 |
| latin1_spanish_ci | latin1  | 94 |         |          |       0 |
+-------------------+---------+----+---------+----------+---------+

displays the statement that creates a given database:

mysql> 
+----------+-----------------------------------------------------------------+
| Database | Create Database                                                 |
+----------+-----------------------------------------------------------------+
| test     | CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET latin1 */ |
+----------+-----------------------------------------------------------------+

If no clause is shown, the default collation for the character set applies.

is similar, but displays the statement to create a given table. The column definitions indicate any character set specifications, and the table options include character set information.

The statement displays the collations of a table's columns when invoked as . Columns with , , or data types have collations. Numeric and other non-character types have no collation (indicated by as the value). For example:

mysql> 
*************************** 1. row ***************************
     Field: id
      Type: smallint(5) unsigned
 Collation: NULL
      Null: NO
       Key: PRI
   Default: NULL
     Extra: auto_increment
Privileges: select,insert,update,references
   Comment:
*************************** 2. row ***************************
     Field: name
      Type: char(60)
 Collation: latin1_swedish_ci
      Null: NO
       Key:
   Default:
     Extra:
Privileges: select,insert,update,references
   Comment:

The character set is not part of the display but is implied by the collation name.