Encoded Python Source Files
PyScripter fully supports PEP 263. The editor uses Unicode strings internally. When
saved, Python files can be encoded as either UTF-8 or ANSI.
UTF-8 encoded source files
You can select this encoding from the File Formats submenu of the Edit menu. From that
menu you can also select whether UTF-8 encoded source files include the UTF-8 BOM signature,
which is detected by the Python interpreter. This signature is also detected by PyScripter
when a file is loaded, and by other Windows editors. Although it is not necessary, you are
advised to include an encoding comment such as
# -*- coding: utf-8 -*-
as the first or second line of the Python script. The advantage of using UTF-8 encoded files is
that they run without modification on other computers with a different default encoding.
When using UTF-8 encoding, you should specify all strings that are not plain ASCII as Python
Unicode strings by adding the prefix 'u'.
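As a rough illustration of the BOM signature described above, the following sketch checks
for it and decodes a file's bytes accordingly. The helper names has_utf8_bom and decode_utf8
are hypothetical and are not part of PyScripter; only the standard codecs module is used.

```python
import codecs


def has_utf8_bom(data):
    """Return True if the raw bytes start with the UTF-8 BOM signature."""
    return data.startswith(codecs.BOM_UTF8)


def decode_utf8(data):
    """Decode UTF-8 bytes to text, dropping a leading BOM if one is present."""
    # The "utf-8-sig" codec strips the signature and otherwise acts like "utf-8".
    return data.decode("utf-8-sig")


# Example: a one-line script saved with the signature (hypothetical content).
sample = codecs.BOM_UTF8 + u"# caf\u00e9\n".encode("utf-8")
```

Decoding with "utf-8-sig" instead of "utf-8" keeps the signature from showing up as a stray
character at the start of the text.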
ANSI encoded files
If the UTF-8 flag of the File Formats submenu of the Edit menu is not selected, then the file is
treated as ANSI encoded. To define a specific source code encoding, a magic comment
must be placed in the source file as either the first or the second line, e.g.:
#!/usr/bin/python
# -*- coding: <encoding name> -*-
More precisely, the first or second line must match the regular expression
"coding[:=]\s*([-\w.]+)". The first group of this expression is then interpreted as the
encoding name. If the encoding is unknown to Python, an error is raised during compilation.
There must not be any Python statement on the line that contains the encoding declaration.
If such a comment is not present, the default system encoding is assumed. PyScripter detects
such comments when it loads Python source files and decodes them to Unicode using the
appropriate encoding.
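The matching rule above can be sketched in a few lines of Python. This is a minimal
illustration rather than the code PyScripter or the interpreter actually uses; the function
name declared_encoding is hypothetical, and restricting the search to comment lines is an
assumption based on the rule that no Python statement may share the line.

```python
import re

# Regular expression quoted in the text above.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(source_bytes):
    """Return the encoding name declared on line 1 or 2, or None if absent."""
    # Only the first two lines may carry the declaration.
    for line in source_bytes.splitlines()[:2]:
        # latin-1 maps every byte to one character, so this decode cannot fail.
        text = line.decode("latin-1")
        # Assumption: the declaration must sit in a comment line.
        if not text.lstrip().startswith("#"):
            continue
        match = CODING_RE.search(text)
        if match:
            return match.group(1)
    return None
```

A caller would fall back to the default system encoding when the function returns None.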
The default Python encoding is controlled by a Python file called "site.py", which is located in
the Python lib directory (see function "setencoding" in site.py). The default encoding when
Python is installed is ASCII, which does not support non-ASCII characters (character values
greater than 127). If you are planning to use non-ASCII strings in Python without using the
UTF-8 encoding, you will need to modify site.py and enable support for a locale-aware default
string encoding.
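To see which default encoding an interpreter is currently using, a short check such as the
following may help. The function name report_encodings is hypothetical; the calls
sys.getdefaultencoding and locale.getpreferredencoding are standard library functions. Note
that on Python 3 the interpreter default is always utf-8, so the site.py change described
above concerns Python 2 installations.

```python
import locale
import sys


def report_encodings():
    """Collect the interpreter default and the locale's preferred encoding."""
    return {
        "default": sys.getdefaultencoding(),
        "locale": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for name, value in sorted(report_encodings().items()):
        print("%s: %s" % (name, value))
```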
IDE encoding options for new files
· Default line breaks for new files
· Default encoding for new files
IDE option for detecting UTF-8 encoding when opening files
Another IDE option (Detect UTF-8 when opening files) controls whether PyScripter attempts
to detect UTF-8 encoding when opening files without a BOM. This detection is done by
analyzing the first 4000 characters of the file and is imperfect. It only applies to non-Python
files, since UTF-8 encoded Python files are required to have either the BOM or an
encoding comment.
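One simple way to approximate this kind of detection is to test whether a leading sample of
the file decodes as valid UTF-8. The sketch below is an assumption about how such a check
could look, not PyScripter's actual algorithm; the name looks_like_utf8 is hypothetical and
the sample size is applied to bytes rather than characters.

```python
import codecs

SAMPLE_SIZE = 4000  # sample length mentioned in the text, counted in bytes here


def looks_like_utf8(data, sample_size=SAMPLE_SIZE):
    """Guess whether raw bytes are UTF-8 by strictly decoding a leading sample."""
    sample = data[:sample_size]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte sequence cut off at the sample boundary.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

The check is imperfect in the same way the text describes: plain ASCII files always pass, and
invalid bytes that appear after the sample are never seen.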