Device

The different format handlers (PDF, XPS, etc.) interpret pages to a “device”. Devices are the basis for everything that can be done with a page: rendering, text extraction and searching. The type of device is determined by which constructor is used.

Class API

class Device
__init__(self, object, clip)

Constructor for either a pixel map or a display list device.

Parameters:
  • object (Pixmap or DisplayList) – either a Pixmap or a DisplayList.
  • clip (IRect) – An optional IRect for Pixmap devices to restrict rendering to a certain area of the page. If the complete page is required, specify None. For display list devices, this parameter must be omitted.
__init__(self, textpage, flags = 0)

Constructor for a text page device.

Parameters:
  • textpage (TextPage) – TextPage object
  • flags (int) – controls how text is parsed into the text page. Currently, three options can be encoded in this parameter; see Preserve Text Flags. To set them, combine the constants with bitwise OR, e.g. flags = 0 | TEXT_PRESERVE_LIGATURES | ....

Note

In higher-level code (Page.getText(), Document.getPageText()), text devices are created as follows: (1) TEXT_PRESERVE_LIGATURES and TEXT_PRESERVE_WHITESPACE are always set; (2) TEXT_PRESERVE_IMAGES is set for JSON and HTML output and is off otherwise.
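
The flag selection described in this note can be sketched in plain Python. This is an illustration only, not the library's code: the numeric values are stand-ins assumed to match the library's TEXT_PRESERVE_* constants, and the helper text_flags_for() and the output format names passed to it are hypothetical. In real code, use the constants exported by the library instead of redefining them.

```python
# Stand-in values (assumption): real code should use the library's constants.
TEXT_PRESERVE_LIGATURES = 1
TEXT_PRESERVE_WHITESPACE = 2
TEXT_PRESERVE_IMAGES = 4


def text_flags_for(output):
    """Hypothetical helper: build the flags value for an output format name."""
    # (1) Ligatures and whitespace are always preserved.
    flags = 0 | TEXT_PRESERVE_LIGATURES | TEXT_PRESERVE_WHITESPACE
    # (2) Images are preserved only for JSON and HTML output.
    if output.lower() in ("json", "html"):
        flags |= TEXT_PRESERVE_IMAGES
    return flags


for fmt in ("text", "html", "json", "xml"):
    flags = text_flags_for(fmt)
    # A single option is tested with bitwise AND.
    keeps_images = bool(flags & TEXT_PRESERVE_IMAGES)
    print(f"{fmt}: flags={flags}, images kept={keeps_images}")
```

The resulting integer is what would be passed as the flags argument of the text page device constructor.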