12 Other Language Changes
Here are all of the changes that Python 2.5 makes to the core Python language.
- The dict type has a new hook for letting subclasses
provide a default value when a key isn't contained in the dictionary.
When a key isn't found, the dictionary's
__missing__(key)
method will be called. This hook is used to implement
the new defaultdict class in the collections
module. The following example defines a dictionary
that returns zero for any missing key:
    class zerodict (dict):
        def __missing__ (self, key):
            return 0

    d = zerodict({1:1, 2:2})
    print d[1], d[2]   # Prints 1, 2
    print d[3], d[4]   # Prints 0, 0
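The defaultdict class mentioned above builds on the same hook. As a minimal sketch, assuming a made-up word list and illustrative variable names, counting occurrences needs no explicit test for missing keys:

```python
from collections import defaultdict

# int() returns 0, so every missing key starts out at zero.
counts = defaultdict(int)
for word in ['spam', 'eggs', 'spam']:
    counts[word] += 1

total_spam = counts['spam']   # counted twice in the list above
unseen = counts['bacon']      # missing key: the factory supplies 0
```

Note that looking up a missing key in a defaultdict also stores the default value under that key, unlike the zerodict example, which only returns it.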
- Both 8-bit and Unicode strings have new partition(sep)
and rpartition(sep) methods that simplify a common use case.
The find(S) method is often used to get an index which is then used to slice the string and obtain the pieces that are before and after the separator. partition(sep) condenses this pattern into a single method call that returns a 3-tuple containing the substring before the separator, the separator itself, and the substring after the separator. If the separator isn't found, the first element of the tuple is the entire string and the other two elements are empty. rpartition(sep) also returns a 3-tuple but starts searching from the end of the string; the "r" stands for 'reverse'.
Some examples:
    >>> ('http://www.python.org').partition('://')
    ('http', '://', 'www.python.org')
    >>> ('file:/usr/share/doc/index.html').partition('://')
    ('file:/usr/share/doc/index.html', '', '')
    >>> (u'Subject: a quick question').partition(':')
    (u'Subject', u':', u' a quick question')
    >>> 'www.python.org'.rpartition('.')
    ('www.python', '.', 'org')
    >>> 'www.python.org'.rpartition(':')
    ('', '', 'www.python.org')
(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)
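To make the comparison with the older idiom concrete, here is a rough sketch of the find-and-slice pattern next to the single partition() call. The helper name split_with_find is hypothetical and exists only for illustration:

```python
def split_with_find(s, sep):
    # The older idiom: locate the separator, then slice around it.
    i = s.find(sep)
    if i == -1:
        # Mirror partition(): whole string first, then two empty strings.
        return (s, '', '')
    return (s[:i], sep, s[i + len(sep):])

url = 'http://www.python.org'
old_style = split_with_find(url, '://')
new_style = url.partition('://')   # same result in one call
```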
- The startswith() and endswith() methods
of string types now accept tuples of strings to check for.
    def is_image_file (filename):
        return filename.endswith(('.gif', '.jpg', '.tiff'))
(Implemented by Georg Brandl following a suggestion by Tom Lynn.)
- The min() and max() built-in functions
gained a key keyword parameter analogous to the key
argument for sort(). This parameter supplies a function that takes a single argument and is called for every value in the list; min()/max() will return the element with the smallest/largest return value from this function. For example, to find the longest string in a list, you can do:
    L = ['medium', 'longest', 'short']
    # Prints 'longest'
    print max(L, key=len)
    # Prints 'short', because lexicographically 'short' has the largest value
    print max(L)
(Contributed by Steven Bethard and Raymond Hettinger.)
- Two new built-in functions, any() and
all(), evaluate whether an iterator contains any true or
false values. any() returns True if any value
returned by the iterator is true; otherwise it will return
False. all() returns True only if
all of the values returned by the iterator evaluate as true.
(Suggested by Guido van Rossum, and implemented by Raymond Hettinger.)
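A short sketch of how the two functions behave, using made-up sample values; the variable names are illustrative only:

```python
values = [0, '', 3, None]

has_true = any(values)    # True: 3 is a true value
all_true = all(values)    # False: 0, '' and None are false

# Both functions accept any iterable, including generator expressions.
all_positive = all(n > 0 for n in [1, 2, 3])
any_negative = any(n < 0 for n in [1, 2, 3])

# Edge case: with nothing to examine, any() is False and all() is True.
empty_any = any([])
empty_all = all([])
```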
- The result of a class's __hash__() method can now
be either a long integer or a regular integer. If a long integer is
returned, the hash of that value is taken. In earlier versions the
hash value was required to be a regular integer, but in 2.5 the
id() built-in was changed to always return non-negative
numbers, and users often seem to use id(self)
in __hash__() methods (though this is discouraged).
- ASCII is now the default encoding for modules. It's now
a syntax error if a module contains string literals with 8-bit
characters but doesn't have an encoding declaration. In Python 2.4
this triggered a warning, not a syntax error. See PEP 263
for how to declare a module's encoding; for example, you might add
a line like this near the top of the source file:
# -*- coding: latin1 -*-
- A new warning, UnicodeWarning, is triggered when
you attempt to compare a Unicode string and an 8-bit string
that can't be converted to Unicode using the default ASCII encoding.
The result of the comparison is false:
    >>> chr(128) == unichr(128)   # Can't convert chr(128) to Unicode
    __main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
    False
    >>> chr(127) == unichr(127)   # chr(127) can be converted
    True
Previously this would raise a UnicodeDecodeError exception, but in 2.5 this could result in puzzling problems when accessing a dictionary. If you looked up unichr(128) and chr(128) was being used as a key, you'd get a UnicodeDecodeError exception. Other changes in 2.5 resulted in this exception being raised instead of suppressed by the code in dictobject.c that implements dictionaries.
Raising an exception for such a comparison is strictly correct, but the change might have broken code, so instead UnicodeWarning was introduced.
(Implemented by Marc-André Lemburg.)
- One error that Python programmers sometimes make is forgetting
to include an __init__.py module in a package directory.
Debugging this mistake can be confusing, and usually requires running
Python with the -v switch to log all the paths searched.
In Python 2.5, a new ImportWarning warning is triggered when
an import would have picked up a directory as a package but no
__init__.py was found. This warning is silently ignored by default;
provide the -Wd option when running the Python executable
to display the warning message.
(Implemented by Thomas Wouters.)
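The warning can also be switched on from inside a program through the warnings module. The sketch below is only an approximation of the command-line switch: -Wd applies the "default" action to every warning, whereas this hypothetical helper (the name enable_import_warnings is made up) touches ImportWarning alone:

```python
import warnings

def enable_import_warnings():
    """Hypothetical helper: display ImportWarning instead of ignoring it."""
    # "default" shows a warning the first time it is issued at a location.
    warnings.simplefilter("default", ImportWarning)

enable_import_warnings()
```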
- The list of base classes in a class definition can now be empty.
As an example, this is now legal:
    class C():
        pass
(Implemented by Brett Cannon.)
12.1 Interactive Interpreter Changes
In the interactive interpreter, quit and exit
have long been strings so that new users get a somewhat helpful message
when they try to quit:
    >>> quit
    'Use Ctrl-D (i.e. EOF) to exit.'
In Python 2.5, quit and exit are now objects that still
produce string representations of themselves, but are also callable.
Newbies who try quit() or exit() will now exit the
interpreter as they expect. (Implemented by Georg Brandl.)
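The idea of an object that both prints a hint and can be called may be sketched as follows. This is an illustration of the technique, not the interpreter's actual code; the class name Quitter, the variable my_quit, and the message wording are all assumptions:

```python
class Quitter(object):
    """Illustrative stand-in for the interpreter's quit/exit objects."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # Shown when the user types the bare name at the prompt.
        return 'Use %s() or Ctrl-D (i.e. EOF) to exit.' % self.name

    def __call__(self, code=None):
        # Calling the object ends the program by raising SystemExit.
        raise SystemExit(code)

my_quit = Quitter('quit')
message = repr(my_quit)   # the hint; my_quit() would raise SystemExit
```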
The Python executable now accepts the standard long options --help and --version; on Windows, it also accepts the /? option for displaying a help message. (Implemented by Georg Brandl.)
12.2 Optimizations
Several of the optimizations were developed at the NeedForSpeed sprint, an event held in Reykjavik, Iceland, from May 21-28, 2006. The sprint focused on speed enhancements to the CPython implementation and was funded by EWT LLC with local support from CCP Games. The optimizations added at this sprint are specially marked in the following list.
- When they were introduced
in Python 2.4, the built-in set and frozenset types
were built on top of Python's dictionary type.
In 2.5 the internal data structure has been customized for implementing sets,
and as a result sets will use a third less memory and are somewhat faster.
(Implemented by Raymond Hettinger.)
- The speed of some Unicode operations, such as finding
substrings, string splitting, and character map encoding and decoding,
has been improved. (Substring search and splitting improvements were
added by Fredrik Lundh and Andrew Dalke at the NeedForSpeed
sprint. Character maps were improved by Walter Dörwald and
Martin von Löwis.)
- The long(str, base) function is now
faster on long digit strings because fewer intermediate results are
calculated. The speedup peaks for strings of around 800-1000 digits, where
the function is 6 times faster.
(Contributed by Alan McIntyre and committed at the NeedForSpeed sprint.)
- The struct module now compiles structure format
strings into an internal representation and caches this
representation, yielding a 20% speedup. (Contributed by Bob Ippolito
at the NeedForSpeed sprint.)
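Code that reuses one format string across many calls is the case that stands to benefit from this cache. A minimal sketch, assuming a hypothetical record layout of a little-endian unsigned short followed by an unsigned int:

```python
import struct

# Hypothetical layout: little-endian unsigned short + unsigned int.
FORMAT = '<HI'

records = [(1, 100), (2, 200), (3, 300)]

# The same format string is passed on every call, so the compiled
# representation built for the first call can be reused afterwards.
packed = [struct.pack(FORMAT, a, b) for a, b in records]
unpacked = [struct.unpack(FORMAT, data) for data in packed]
```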
- The re module got a 1 or 2% speedup by switching to
Python's allocator functions instead of the system's
malloc() and free().
(Contributed by Jack Diederich at the NeedForSpeed sprint.)
- The code generator's peephole optimizer now performs
simple constant folding in expressions. If you write something like
a = 2+3, the code generator will do the arithmetic and produce code
corresponding to a = 5. (Proposed and implemented by Raymond Hettinger.)
- Function calls are now faster because code objects now keep
the most recently finished frame (a "zombie frame") in an internal
field of the code object, reusing it the next time the code object is
invoked. (Original patch by Michael Hudson, modified by Armin Rigo
and Richard Jones; committed at the NeedForSpeed sprint.)
Frame objects are also slightly smaller, which may improve cache locality and reduce memory usage a bit. (Contributed by Neal Norwitz.)
- Python's built-in exceptions are now new-style classes, a change
that speeds up instantiation considerably. Exception handling in
Python 2.5 is therefore about 30% faster than in 2.4.
(Contributed by Richard Jones, Georg Brandl and Sean Reifschneider at
the NeedForSpeed sprint.)
- Importing now caches the paths tried, recording whether
they exist or not so that the interpreter makes fewer
open() and stat() calls on startup.
(Contributed by Martin von Löwis and Georg Brandl.)
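The constant folding described earlier in this list can be observed by compiling a statement and looking at the constants stored in the resulting code object. This is a rough sketch that relies on a CPython implementation detail (the co_consts attribute); the exact contents of that tuple vary between versions:

```python
code = compile('a = 2+3', '<example>', 'exec')

# If the expression was folded at compile time, the result 5 is stored
# among the code object's constants.
folded = 5 in code.co_consts

# Running the compiled code still binds a to 5 either way.
namespace = {}
exec(code, namespace)
```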