Character Set
BIG5
A character encoding standard from the Chinese computer industry that is widely
used in Taiwan and Hong Kong. It is a general-purpose character encoding, distinct
from the code page concept used in Windows. Users in different countries and
regions use different character encoding methods.
Unicode
A character encoding standard developed by the Unicode Consortium. Unicode
assigns every character a unique code position, which is stored as one or more
bytes depending on the encoding form (for example UTF-8 or UTF-16). This allows
characters from all languages to be represented in a single character set.
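As a small illustration of how one code position can occupy a different number of bytes, the sketch below encodes single characters with Python's built-in codecs. The helper name encoded_sizes and the choice of sample characters are assumptions made for this example only.

```python
# Minimal sketch: byte length of one character under several encodings.
# encoded_sizes is a hypothetical helper name, not part of any standard.

def encoded_sizes(ch, codecs=("utf-8", "utf-16-be", "big5")):
    """Return {codec name: number of bytes used to encode ch}."""
    return {codec: len(ch.encode(codec)) for codec in codecs}

# U+4E2D is a common Chinese character; U+0041 is the Latin letter A.
for ch in ("\u4e2d", "\u0041"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The same code position therefore has no fixed byte length; the length is a property of the encoding chosen to store it.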
CJK Unified Ideographs Extension A
Defined at Unicode code positions 3400 - 4DBF.
CJK Unified Ideographs Extension B
Defined at Unicode code positions 20000 - 2A6DF, which lie beyond the range that
a single 16-bit value can address. The encoding method divides the Unicode code
positions D800 - DFFF into 2 areas. The upper area, D800 - DBFF, contains 1,024
code positions; the lower area, DC00 - DFFF, also contains 1,024 code positions.
Pairing one value from each area yields 1,024×1,024=1,048,576 code positions in
total. Unicode defines this method as Surrogates.
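The pairing arithmetic can be sketched in a few lines of Python. The function names to_surrogates and from_surrogates are hypothetical and chosen for this example; the offset of 10000 (hexadecimal) is the standard starting point of the range addressed by surrogate pairs in UTF-16.

```python
# Minimal sketch of surrogate pair arithmetic as used by UTF-16.
# Function names are illustrative assumptions, not a library API.

def to_surrogates(code_position):
    """Split a code position in 10000 - 10FFFF into (upper, lower) surrogates."""
    if not 0x10000 <= code_position <= 0x10FFFF:
        raise ValueError("code position does not need a surrogate pair")
    offset = code_position - 0x10000      # 20-bit value
    upper = 0xD800 + (offset >> 10)       # top 10 bits select 1 of 1,024
    lower = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits select 1 of 1,024
    return upper, lower

def from_surrogates(upper, lower):
    """Recombine an (upper, lower) surrogate pair into one code position."""
    return 0x10000 + ((upper - 0xD800) << 10) + (lower - 0xDC00)

# First code position of CJK Unified Ideographs Extension B.
upper, lower = to_surrogates(0x20000)
print(f"{upper:04X} {lower:04X}")
```

Each surrogate carries 10 bits, which is where the 1,024×1,024 total comes from.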
HKSCS
Traditional Chinese computer systems in Hong Kong mainly use the BIG5 character
set, but BIG5 does not contain many characters that are frequently used in Hong
Kong. To meet this need, the Hong Kong SAR Government defined the Hong Kong
Supplementary Character Set (HKSCS), which includes Chinese characters that are
used in Hong Kong but are not contained in the standard BIG5 character set.
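One way to check whether a given character is covered by a character set is to try encoding it. The sketch below uses Python's built-in big5 and big5hkscs codecs; the helper name can_encode and the sample characters are assumptions for illustration, and the result for any particular character depends on the mapping tables shipped with the codec.

```python
# Minimal sketch: test whether text is representable in a given codec.
# can_encode is a hypothetical helper name.

def can_encode(text, codec):
    """Return True if every character in text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# U+4E2D is a common Chinese character; U+5497 is a character
# commonly used in written Cantonese (sample choice is an assumption).
for ch in ("\u4e2d", "\u5497"):
    for codec in ("big5", "big5hkscs"):
        print(f"U+{ord(ch):04X}", codec, can_encode(ch, codec))
```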
Related Links:
Unicode: http://www.unicode.org
HKSCS: http://www.microsoft.com/hk/hkscs/chinese/default.aspx
CNS11643: http://www.cns11643.gov.tw/web/word.jsp