8 PEP 327: Decimal Data Type


Python has always supported floating-point (FP) numbers, based on the underlying C double type, as a data type. However, while most programming languages provide a floating-point type, many people (even programmers) are unaware that floating-point numbers don't represent certain decimal fractions accurately. The new Decimal type can represent these fractions accurately, up to a user-specified precision limit.

8.1 Why is Decimal needed?

The limitations arise from the representation used for floating-point numbers. FP numbers are made up of three components:

  • The sign, which is positive or negative.
  • The mantissa, which is a single-digit binary number followed by a fractional part. For example, 1.01 in base-2 notation is 1 + 0/2 + 1/4, or 1.25 in decimal notation.
  • The exponent, which tells where the binary point (the base-2 counterpart of the decimal point) is located in the number represented.

For example, the number 1.25 has positive sign, a mantissa value of 1.01 (in binary), and an exponent of 0 (the binary point doesn't need to be shifted). The number 5 has the same sign and mantissa, but the exponent is 2 because the mantissa is multiplied by 4 (2 to the power of the exponent, 2); 1.25 * 4 equals 5.
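The decomposition described above can be checked from Python itself. The following is a minimal sketch, assuming a current Python 3 interpreter; decompose is a hypothetical helper name, not part of any module, and it only handles nonzero finite values. It rescales the result of math.frexp() so that the mantissa has the 1.xxx form used in this section.

```python
import math

def decompose(x):
    """Return (sign, mantissa, exponent) with the mantissa in [1, 2).

    Hypothetical helper for illustration; assumes x is nonzero and finite.
    """
    sign = 0 if math.copysign(1.0, x) > 0 else 1
    # math.frexp() returns m in [0.5, 1) with abs(x) == m * 2**e,
    # so doubling m and lowering e by one gives the 1.xxx form.
    m, e = math.frexp(abs(x))
    return sign, m * 2, e - 1

print(decompose(1.25))  # (0, 1.25, 0)
print(decompose(5.0))   # (0, 1.25, 2)
```

Both values share the mantissa 1.25 (binary 1.01); only the exponent differs.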

Modern systems usually provide floating-point support that conforms to a standard called IEEE 754. C's double type is usually implemented as a 64-bit IEEE 754 number, which uses 52 bits of space for the mantissa. This means that numbers can only be specified to 52 bits of precision. If you're trying to represent numbers whose expansion repeats endlessly, the expansion is cut off after 52 bits. Unfortunately, most software needs to produce output in base 10, and common fractions in base 10 often have endlessly repeating expansions in binary. For example, 1.1 decimal is binary 1.0001100110011 ...; 0.1 = 1/16 + 1/32 + 1/256 plus an infinite number of additional terms. IEEE 754 has to chop off that infinitely repeating expansion after 52 binary digits, so the representation is slightly inaccurate.

Sometimes you can see this inaccuracy when the number is printed:

>>> 1.1
1.1000000000000001

The inaccuracy isn't always visible when you print the number because the FP-to-decimal-string conversion is provided by the C library, and most C libraries try to produce sensible output. Even if it's not displayed, however, the inaccuracy is still there and subsequent operations can magnify the error.
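Even when the printed form looks clean, the stored value can be inspected exactly. The sketch below assumes a current Python 3 interpreter and uses the standard fractions module, which is not otherwise discussed in this section, to recover the exact rational value held in the double; it then shows a small error accumulating over repeated additions.

```python
from fractions import Fraction

# Fraction(float) recovers the exact rational value stored in the double.
stored = Fraction(1.1)
exact = Fraction(11, 10)
error = stored - exact

print(stored == exact)  # False: the stored value is not exactly 11/10
print(float(error))     # a tiny but nonzero representation error

# Repeated operations let such errors accumulate.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)     # False
```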

For many applications this doesn't matter. If I'm plotting points and displaying them on my monitor, the difference between 1.1 and 1.1000000000000001 is too small to be visible. Reports often limit output to a certain number of decimal places, and if you round the number to two or three or even eight decimal places, the error is never apparent. However, for applications where it does matter, it's a lot of work to implement your own custom arithmetic routines.

Hence, the Decimal type was created.

8.2 The Decimal type

A new module, decimal, was added to Python's standard library. It contains two classes, Decimal and Context. Decimal instances represent numbers, and Context instances are used to wrap up various settings such as the precision and default rounding mode.

Decimal instances are immutable, like regular Python integers and FP numbers; once an instance has been created, you can't change the value it represents. Decimal instances can be created from integers or strings:

>>> import decimal
>>> decimal.Decimal(1972)
Decimal("1972")
>>> decimal.Decimal("1.1")
Decimal("1.1")

You can also provide tuples containing the sign, the mantissa represented as a tuple of decimal digits, and the exponent:

>>> decimal.Decimal((1, (1, 4, 7, 5), -2))
Decimal("-14.75")

Cautionary note: the sign bit is a Boolean value, so 0 is positive and 1 is negative.
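The tuple form can also be read back from an existing instance with the as_tuple() method. The following is a small sketch, assuming a current Python 3 interpreter; it round-trips the value from the example above and then flips the sign field to build the positive counterpart.

```python
from decimal import Decimal

d = Decimal((1, (1, 4, 7, 5), -2))
print(d)                        # -14.75
print(d == Decimal("-14.75"))   # True

# as_tuple() returns the same three components: sign, digits, exponent.
sign, digits, exponent = d.as_tuple()
print(sign, digits, exponent)   # 1 (1, 4, 7, 5) -2

# A sign of 0 yields the positive value with the same digits and exponent.
positive = Decimal((0, digits, exponent))
print(positive)                 # 14.75
```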

Converting from floating-point numbers poses a bit of a problem: should the FP number representing 1.1 turn into the decimal number for exactly 1.1, or for 1.1 plus whatever inaccuracies are introduced? The decision was to dodge the issue and leave such a conversion out of the API. Instead, you should convert the floating-point number into a string using the desired precision and pass the string to the Decimal constructor:

>>> f = 1.1
>>> decimal.Decimal(str(f))
Decimal("1.1")
>>> decimal.Decimal('%.12f' % f)
Decimal("1.100000000000")

Once you have Decimal instances, you can perform the usual mathematical operations on them. One limitation: exponentiation requires an integer exponent:

>>> a = decimal.Decimal('35.72')
>>> b = decimal.Decimal('1.73')
>>> a+b
Decimal("37.45")
>>> a-b
Decimal("33.99")
>>> a*b
Decimal("61.7956")
>>> a/b
Decimal("20.64739884393063583815028902")
>>> a ** 2
Decimal("1275.9184")
>>> a**b
Traceback (most recent call last):
  ...
decimal.InvalidOperation: x ** (non-integer)
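To connect these operations back to the motivation in section 8.1, here is a brief sketch, assuming a current Python 3 interpreter. It compares a sum that goes wrong in binary floating point with the same sum in Decimal, and then rounds a product to two places with the quantize() method; the variable names are only illustrative.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
print(0.1 + 0.2 == 0.3)                                    # False
# Decimal represents them exactly, so the comparison holds.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True

a = Decimal("35.72")
b = Decimal("1.73")
product = a * b
print(product)                                # 61.7956

# quantize() rounds to the exponent of its argument, here two places.
rounded = product.quantize(Decimal("0.01"))
print(rounded)                                # 61.80
```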

You can combine Decimal instances with integers, but not with floating-point numbers:

>>> a + 4
Decimal("39.72")
>>> a + 4.5
Traceback (most recent call last):
  ...
TypeError: You can interact Decimal only with int, long or Decimal data types.
>>>
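If you do need to combine a Decimal with a floating-point value, one option is to convert the float explicitly through a string, as recommended earlier for construction. This is a minimal sketch, assuming a current Python 3 interpreter, where the mixed addition still raises TypeError although the wording of the message differs from the one shown above.

```python
from decimal import Decimal

a = Decimal("35.72")
f = 4.5

try:
    mixed = a + f                 # Decimal + float is rejected
except TypeError:
    # Convert the float explicitly, choosing the string form yourself.
    mixed = a + Decimal(str(f))

print(mixed)  # 40.22
```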

Decimal numbers can be used with the math and cmath modules, but note that they'll be immediately converted to floating-point numbers before the operation is performed, resulting in a possible loss of precision and accuracy. You'll also get back a regular floating-point number and not a Decimal.

>>> import math, cmath
>>> d = decimal.Decimal('123456789012.345')
>>> math.sqrt(d)
351364.18288201344
>>> cmath.sqrt(-d)
351364.18288201344j

Decimal instances have a sqrt() method that returns a Decimal, but if you need other things such as trigonometric functions you'll have to implement them.

>>> d.sqrt()
Decimal("351364.1828820134592177245001")
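As an illustration of what implementing such a function yourself might look like, here is a Taylor-series sketch of cosine, assuming a current Python 3 interpreter. The name decimal_cos is hypothetical, and the approach is only reasonable for arguments of modest size, since the series converges slowly and loses accuracy for large values.

```python
import math
from decimal import Decimal, getcontext

def decimal_cos(x):
    """Approximate cos(x) for a Decimal x by summing its Taylor series.

    Hypothetical helper for illustration only.
    """
    ctx = getcontext()
    ctx.prec += 2                  # guard digits for intermediate results
    try:
        i, last, total = 0, None, Decimal(1)
        fact, num, sign = 1, Decimal(1), 1
        # Add terms x**i / i! with alternating sign until the sum stops changing.
        while total != last:
            last = total
            i += 2
            fact *= i * (i - 1)
            num *= x * x
            sign = -sign
            total += num / fact * sign
    finally:
        ctx.prec -= 2              # restore the caller's precision
    return +total                  # unary plus rounds to the restored precision

print(decimal_cos(Decimal("0.5")))
print(math.cos(0.5))               # floating-point value for comparison
```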

8.3 The Context type

Instances of the Context class encapsulate several settings for decimal operations:

  • prec is the precision, the number of significant digits kept in results.
  • rounding specifies the rounding mode. The decimal module has constants for the various possibilities: ROUND_DOWN, ROUND_CEILING, ROUND_HALF_EVEN, and various others.
  • traps is a dictionary specifying what happens on encountering certain error conditions: either an exception is raised or a value is returned. Some examples of error conditions are division by zero, loss of precision, and overflow.
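The effect of the rounding setting can be seen by applying several contexts to the same value. The sketch below assumes a current Python 3 interpreter and creates separate Context instances rather than touching the default one; Context.plus() is used because it applies the context's precision and rounding to its argument.

```python
from decimal import Decimal, Context, ROUND_DOWN, ROUND_CEILING, ROUND_HALF_EVEN

value = Decimal("2.665")
results = {}
for mode in (ROUND_DOWN, ROUND_CEILING, ROUND_HALF_EVEN):
    # prec=3 keeps three digits in total, so the last digit must be rounded.
    ctx = Context(prec=3, rounding=mode)
    results[mode] = ctx.plus(value)
    print(mode, results[mode])

# ROUND_DOWN      -> 2.66 (truncate toward zero)
# ROUND_CEILING   -> 2.67 (round toward positive infinity)
# ROUND_HALF_EVEN -> 2.66 (tie goes to the even neighbour)
```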

There's a thread-local default context available by calling getcontext(); you can change the properties of this context to alter the default precision, rounding, or trap handling. The following example shows the effect of changing the precision of the default context:

>>> decimal.getcontext().prec
28
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal("0.1428571428571428571428571429")
>>> decimal.getcontext().prec = 9
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal("0.142857143")
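If you'd rather not alter the thread's default context, a separate Context instance can perform the operation instead. This is a short sketch, assuming a current Python 3 interpreter; it uses the Context.divide() method so that the nine-digit precision applies to this one division only.

```python
import decimal

# A private context; the thread's default context is left as it was.
ctx = decimal.Context(prec=9)

one, seven = decimal.Decimal(1), decimal.Decimal(7)
short = ctx.divide(one, seven)
print(short)  # 0.142857143
```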

The default action for error conditions is selectable; the module can either return a special value such as infinity or not-a-number, or exceptions can be raised:

>>> decimal.Decimal(1) / decimal.Decimal(0)
Traceback (most recent call last):
  ...
decimal.DivisionByZero: x / 0
>>> decimal.getcontext().traps[decimal.DivisionByZero] = False
>>> decimal.Decimal(1) / decimal.Decimal(0)
Decimal("Infinity")
>>>
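The same switch can be exercised on a private Context so that the default context's traps stay untouched. The following sketch assumes a current Python 3 interpreter and shows both behaviours side by side for division by zero.

```python
from decimal import Decimal, Context, DivisionByZero

ctx = Context()

# With the trap enabled, the error condition raises an exception.
ctx.traps[DivisionByZero] = True
try:
    ctx.divide(Decimal(1), Decimal(0))
    raised = False
except DivisionByZero:
    raised = True
print(raised)   # True

# With the trap disabled, a special value is returned instead.
ctx.traps[DivisionByZero] = False
result = ctx.divide(Decimal(1), Decimal(0))
print(result)   # Infinity
```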

The Context instance also has various methods for formatting numbers such as to_eng_string() and to_sci_string().
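A brief sketch of the two formatting methods, assuming a current Python 3 interpreter: scientific notation keeps a single digit before the point, while engineering notation adjusts the exponent to a multiple of three.

```python
from decimal import Decimal, getcontext

ctx = getcontext()
d = Decimal("1.23E+4")

sci = ctx.to_sci_string(d)
eng = ctx.to_eng_string(d)
print(sci)  # 1.23E+4
print(eng)  # 12.3E+3
```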

For more information, see the documentation for the decimal module, which includes a quick-start tutorial and a reference.

See Also:

PEP 327, "Decimal Data Type": written by Facundo Batista and implemented by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.

A more detailed overview of the IEEE-754 representation.

The article uses Fortran code to illustrate many of the problems that floating-point inaccuracy can cause.

A description of a decimal-based representation. This representation is being proposed as a standard, and underlies the new Python decimal type. Much of this material was written by Mike Cowlishaw, designer of the Rexx language.
