14 Build and C API Changes

Python 2.5


14 Build and C API Changes

Changes to Python's build process and to the C API include:

  • The Python source tree was converted from CVS to Subversion, in a complex migration procedure that was supervised and flawlessly carried out by Martin von Löwis. The procedure was developed as PEP 347.

  • Coverity, a company that markets a source code analysis tool called Prevent, provided the results of their examination of the Python source code. The analysis found about 60 bugs that were quickly fixed. Many of the bugs were refcounting problems, often occurring in error-handling code. See http://scan.coverity.com for the statistics.

  • The largest change to the C API came from PEP 353, which modifies the interpreter to use a Py_ssize_t type definition instead of int. See the earlier section 10 for a discussion of this change.

  • The design of the bytecode compiler has changed a great deal, no longer generating bytecode by traversing the parse tree. Instead the parse tree is converted to an abstract syntax tree (or AST), and it is the abstract syntax tree that's traversed to produce the bytecode.

    It's possible for Python code to obtain AST objects by using the compile() built-in and specifying _ast.PyCF_ONLY_AST as the value of the flags parameter:

    from _ast import PyCF_ONLY_AST
    ast = compile("""a=0
    for i in range(10):
        a += i
    """, "<string>", 'exec', PyCF_ONLY_AST)
    
    assignment = ast.body[0]
    for_loop = ast.body[1]
    

    No official documentation has been written for the AST code yet, but PEP 339 discusses the design. To start learning about the code, read the definition of the various AST nodes in Parser/Python.asdl. A Python script reads this file and generates a set of C structure definitions in Include/Python-ast.h. The PyParser_ASTFromString() and PyParser_ASTFromFile(), defined in Include/pythonrun.h, take Python source as input and return the root of an AST representing the contents. This AST can then be turned into a code object by PyAST_Compile(). For more information, read the source code, and then ask questions on python-dev.

    The AST code was developed under Jeremy Hylton's management, and implemented by (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil Schemenauer, plus the participants in a number of AST sprints at conferences such as PyCon.

  • Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005, was applied. Python 2.4 allocated small objects in 256K-sized arenas, but never freed arenas. With this patch, Python will free arenas when they're empty. The net effect is that on some platforms, when you allocate many objects, Python's memory usage may actually drop when you delete them and the memory may be returned to the operating system. (Implemented by Evan Jones, and reworked by Tim Peters.)

    Note that this change means extension modules must be more careful when allocating memory. Python's API has many different functions for allocating memory that are grouped into families. For example, PyMem_Malloc(), PyMem_Realloc(), and PyMem_Free() are one family that allocates raw memory, while PyObject_Malloc(), PyObject_Realloc(), and PyObject_Free() are another family that's supposed to be used for creating Python objects.

    Previously these different families all reduced to the platform's malloc() and free() functions. This meant it didn't matter if you got things wrong and allocated memory with the PyMem function but freed it with the PyObject function. With 2.5's changes to obmalloc, these families now do different things and mismatches will probably result in a segfault. You should carefully test your C extension modules with Python 2.5.

  • The built-in set types now have an official C API. Call PySet_New() and PyFrozenSet_New() to create a new set, PySet_Add() and PySet_Discard() to add and remove elements, and PySet_Contains and PySet_Size to examine the set's state. (Contributed by Raymond Hettinger.)

  • C code can now obtain information about the exact revision of the Python interpreter by calling the Py_GetBuildInfo() function that returns a string of build information like this: "trunk:45355:45356M, Apr 13 2006, 07:42:19". (Contributed by Barry Warsaw.)

  • Two new macros can be used to indicate C functions that are local to the current file so that a faster calling convention can be used. Py_LOCAL(type) declares the function as returning a value of the specified type and uses a fast-calling qualifier. Py_LOCAL_INLINE(type) does the same thing and also requests the function be inlined. If PY_LOCAL_AGGRESSIVE is defined before python.h is included, a set of more aggressive optimizations are enabled for the module; you should benchmark the results to find out if these optimizations actually make the code faster. (Contributed by Fredrik Lundh at the NeedForSpeed sprint.)

  • PyErr_NewException(name, base, dict) can now accept a tuple of base classes as its base argument. (Contributed by Georg Brandl.)

  • The PyErr_Warn() function for issuing warnings is now deprecated in favour of PyErr_WarnEx(category, message, stacklevel) which lets you specify the number of stack frames separating this function and the caller. A stacklevel of 1 is the function calling PyErr_WarnEx(), 2 is the function above that, and so forth. (Added by Neal Norwitz.)

  • The CPython interpreter is still written in C, but the code can now be compiled with a C++ compiler without errors. (Implemented by Anthony Baxter, Martin von Löwis, Skip Montanaro.)

  • The PyRange_New() function was removed. It was never documented, never used in the core code, and had dangerously lax error checking. In the unlikely case that your extensions were using it, you can replace it by something like the following:
    range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll", 
                                  start, stop, step);
    


14.1 Port-Specific Changes

  • MacOS X (10.3 and higher): dynamic loading of modules now uses the dlopen() function instead of MacOS-specific functions.

  • MacOS X: a --enable-universalsdk switch was added to the configure script that compiles the interpreter as a universal binary able to run on both PowerPC and Intel processors. (Contributed by Ronald Oussoren.)

  • Windows: .dll is no longer supported as a filename extension for extension modules. .pyd is now the only filename extension that will be searched for.

See About this document... for information on suggesting changes.