11.20 cookielib -- Cookie handling for HTTP clients

The cookielib module defines classes for automatic handling of HTTP cookies. It is useful for accessing web sites that require small pieces of data - cookies - to be set on the client machine by an HTTP response from a web server, and then returned to the server in later HTTP requests.
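As a minimal sketch of typical use, a CookieJar can be attached to a urllib2 opener so that cookies set by responses are sent back on later requests. The import fallback assumes the Python 3 renames (http.cookiejar and urllib.request), and the URL in the comments is only a placeholder; no request is actually made here.

```python
# Import under the Python 2 names, falling back to the Python 3 renames.
try:
    import cookielib
    import urllib2
except ImportError:
    import http.cookiejar as cookielib
    import urllib.request as urllib2

# The jar keeps cookies in memory for the lifetime of the object.
jar = cookielib.CookieJar()

# HTTPCookieProcessor stores cookies from responses in the jar and adds
# matching cookies to later requests made through this opener.
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

# No request has been made yet, so the jar is still empty.
cookie_count = len(jar)

# A real fetch would look like this (placeholder URL, not executed here):
# response = opener.open("http://www.example.com/")
# for cookie in jar:
#     print("%s=%s" % (cookie.name, cookie.value))
```

Every request made through opener then shares the same jar, which is usually what a scripted login session needs.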

Both the regular Netscape cookie protocol and the protocol defined by RFC 2965 are handled. RFC 2965 handling is switched off by default. RFC 2109 cookies are parsed as Netscape cookies and subsequently treated as RFC 2965 cookies. Note that the great majority of cookies on the Internet are Netscape cookies. cookielib attempts to follow the de facto Netscape cookie protocol (which differs substantially from that set out in the original Netscape specification), including taking note of the max-age and port cookie-attributes introduced with RFC 2109.

Note: The various named parameters found in Set-Cookie: and Set-Cookie2: headers (e.g. domain and expires) are conventionally referred to as attributes. To distinguish them from Python attributes, the documentation for this module uses the term cookie-attribute instead.

The module defines the following exception:

exception LoadError

Instances of FileCookieJar raise this exception on failure to load cookies from a file.
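The following sketch shows one way the load failure might be handled. It assumes the MozillaCookieJar subclass of FileCookieJar and a temporary file whose contents are deliberately not a valid cookies file; the import fallback assumes the Python 3 rename to http.cookiejar.

```python
import os
import tempfile

try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib

# Write a file that lacks the header line MozillaCookieJar expects.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("this is not a cookies file\n")

jar = cookielib.MozillaCookieJar(path)
load_failed = False
try:
    jar.load()
except cookielib.LoadError as exc:
    # The file exists but could not be parsed as a cookies file.
    load_failed = True
    print("could not load cookies: %s" % exc)
finally:
    os.remove(path)
```

A missing file is a different failure: opening it raises the ordinary I/O error from the operating system before any parsing happens.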

The following classes are provided:

class CookieJar(policy=None)

policy is an object implementing the CookiePolicy interface.

The CookieJar class stores HTTP cookies. It extracts cookies from HTTP responses, and returns them in HTTP requests. CookieJar instances automatically expire contained cookies when necessary. Subclasses are also responsible for storing and retrieving cookies from a file or database.
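The flow of cookies through a jar can be sketched without a network connection. In the example below, FakeHeaders and FakeResponse are hypothetical stand-ins for a real response object (they are not part of cookielib), the host name and cookie are made up, and the import fallback assumes the Python 3 renames.

```python
try:
    import cookielib
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib
    from urllib.request import Request


class FakeHeaders(object):
    """Hypothetical stand-in for the header object of a real response."""

    def __init__(self, set_cookie_values):
        self._values = list(set_cookie_values)

    def get_all(self, name, default=None):
        # Lookup style used by http.cookiejar on Python 3.
        if name.lower() == "set-cookie":
            return list(self._values)
        return default

    def getheaders(self, name):
        # Lookup style used by cookielib on Python 2.
        return self.get_all(name, [])


class FakeResponse(object):
    """Hypothetical response; the jar only calls info() on it."""

    def __init__(self, set_cookie_values):
        self._headers = FakeHeaders(set_cookie_values)

    def info(self):
        return self._headers


jar = cookielib.CookieJar()

# The server's response to the first request sets a cookie.
first_request = Request("http://www.example.com/")
jar.extract_cookies(FakeResponse(["session=abc123"]), first_request)

# A later request to the same host gets the cookie added as a header.
later_request = Request("http://www.example.com/account")
jar.add_cookie_header(later_request)
cookie_header = later_request.get_header("Cookie")
```

In normal use these two calls are made by urllib2's cookie-handling machinery rather than by hand.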

class FileCookieJar(filename, delayload=None, policy=None)

policy is an object implementing the CookiePolicy interface. For the other arguments, see the documentation for the corresponding attributes.

A CookieJar which can load cookies from, and perhaps save cookies to, a file on disk. Cookies are NOT loaded from the named file until either the load() or revert() method is called. Subclasses of this class are documented in section 11.20.2.
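The deferred loading described above can be illustrated with the MozillaCookieJar subclass. This sketch assumes a hand-written file in the Netscape cookies.txt layout, with a made-up domain and cookie and an expiry date in January 2038; the import fallback assumes the Python 3 rename to http.cookiejar.

```python
import os
import tempfile

try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib

# One cookie in cookies.txt layout: domain, subdomain flag, path,
# secure flag, expiry (seconds since the epoch), name, value.
fields = [".example.com", "TRUE", "/", "FALSE", "2147483647",
          "session", "abc123"]
contents = "# Netscape HTTP Cookie File\n" + "\t".join(fields) + "\n"

fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write(contents)

try:
    # Constructing the jar only records the filename; nothing is read.
    jar = cookielib.MozillaCookieJar(path)
    count_before_load = len(jar)

    # load() reads the file and adds its unexpired cookies to the jar.
    jar.load()
    count_after_load = len(jar)
finally:
    os.remove(path)
```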

class CookiePolicy()

This class is responsible for deciding whether each cookie should be accepted from, or returned to, the server.

class DefaultCookiePolicy(blocked_domains=None, allowed_domains=None, netscape=True, rfc2965=False, hide_cookie_two=False, strict_domain=False, strict_rfc2965_unverifiable=True, strict_ns_unverifiable=False, strict_ns_domain=DefaultCookiePolicy.DomainLiberal, strict_ns_set_initial_dollar=False, strict_ns_set_path=False)

Constructor arguments should be passed as keyword arguments only. blocked_domains is a sequence of domain names from which cookies are never accepted and to which cookies are never returned. allowed_domains, if not None, is a sequence of the only domains for which cookies are accepted and returned. For all other arguments, see the documentation for CookiePolicy and DefaultCookiePolicy objects.

DefaultCookiePolicy implements the standard accept / reject rules for Netscape and RFC 2965 cookies. RFC 2109 cookies (i.e. cookies received in a Set-Cookie: header with a version cookie-attribute of 1) are treated according to the RFC 2965 rules. DefaultCookiePolicy also provides some parameters to allow some fine-tuning of policy.
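As a rough illustration of policy tuning, the sketch below builds one policy that blocks a domain and switches on RFC 2965 handling, and another that only allows a single domain. The domain names are made up, and the import fallback assumes the Python 3 rename to http.cookiejar.

```python
try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib

# RFC 2965 handling is off unless requested explicitly.
default_rfc2965 = cookielib.DefaultCookiePolicy().rfc2965

# Never accept cookies from, or return cookies to, the blocked domain.
policy = cookielib.DefaultCookiePolicy(
    blocked_domains=["ads.example.net"],
    rfc2965=True,
)
jar = cookielib.CookieJar(policy)

blocked = policy.is_blocked("ads.example.net")
other_blocked = policy.is_blocked("www.example.com")

# With allowed_domains set, every other domain is rejected.
allow_only = cookielib.DefaultCookiePolicy(
    allowed_domains=["www.example.com"],
)
rejected = allow_only.is_not_allowed("other.example.org")
accepted = not allow_only.is_not_allowed("www.example.com")
```

Passing the policy to the CookieJar constructor is enough; the jar consults it each time a cookie is set or returned.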

class Cookie()

This class represents Netscape, RFC 2109 and RFC 2965 cookies. It is not expected that users of cookielib construct their own Cookie instances. Instead, if necessary, call make_cookies() on a CookieJar instance.
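A minimal sketch of obtaining Cookie instances through make_cookies() follows. FakeHeaders and FakeResponse are hypothetical stand-ins for a real response (they are not part of cookielib), the host and cookie are made up, and the import fallback assumes the Python 3 renames.

```python
try:
    import cookielib
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib
    from urllib.request import Request


class FakeHeaders(object):
    """Hypothetical header object answering both lookup styles."""

    def __init__(self, set_cookie_values):
        self._values = list(set_cookie_values)

    def get_all(self, name, default=None):  # Python 3 style
        if name.lower() == "set-cookie":
            return list(self._values)
        return default

    def getheaders(self, name):  # Python 2 style
        return self.get_all(name, [])


class FakeResponse(object):
    """Hypothetical response; the jar only calls info() on it."""

    def __init__(self, set_cookie_values):
        self._headers = FakeHeaders(set_cookie_values)

    def info(self):
        return self._headers


jar = cookielib.CookieJar()
request = Request("http://www.example.com/docs/index.html")
response = FakeResponse(["theme=dark; path=/docs"])

# make_cookies() parses the headers but does not store anything.
cookies = jar.make_cookies(response, request)
cookie = cookies[0]
count_after_make = len(jar)

# Storing a cookie is a separate, explicit step.
jar.set_cookie(cookie)
count_after_set = len(jar)
```

The resulting object exposes the parsed values as ordinary Python attributes such as cookie.name, cookie.value, cookie.domain and cookie.path.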

See Also:

Module urllib2: URL opening with automatic cookie handling.

Module Cookie: HTTP cookie classes, principally useful for server-side code. The cookielib and Cookie modules do not depend on each other.

Extensions to this module, including a class for reading Microsoft Internet Explorer cookies on Windows.

Netscape cookie specification (cookie_spec.html): The specification of the original Netscape cookie protocol. Though this is still the dominant protocol, the 'Netscape cookie protocol' implemented by all the major browsers (and cookielib) only bears a passing resemblance to the one sketched out in cookie_spec.html.

RFC 2109, HTTP State Management Mechanism: Obsoleted by RFC 2965. Uses Set-Cookie: with version=1.

RFC 2965, HTTP State Management Mechanism: The Netscape protocol with the bugs fixed. Uses Set-Cookie2: in place of Set-Cookie:. Not widely used.

Unfinished errata to RFC 2965.

