
IM - Imaging Library

Architecture Guide

Image Representation (Data Model)

In the IM library, an image is a 2D matrix of pixels with a given width and height. Stacks, animations, videos and volumes are represented as sequences of individual images.

The pixels can have one of several color spaces: IM_RGB, IM_MAP, IM_GRAY, IM_BINARY, IM_CMYK, IM_YCBCR, IM_LAB, IM_LUV and IM_XYZ. IM_MAP is a subset of the IM_RGB color space with a maximum of 256 colors. IM_BINARY is a subset of the IM_GRAY color space with only 2 colors, black and white. IM_GRAY usually means luma (nonlinear luminance), but it can represent any other intensity value that is not necessarily related to color.

The number of components of the color space defines the depth of the image. The color components can be packed sequentially in one plane (like rgbrgbrgb...) or separated in several planes (like rrr...ggg...bbb...). Packed color components are normally used by graphics systems. We allow both options because many users define their own image structures, which can have a packed or a separated organization. The following picture illustrates the difference between the two options:

[Figure: Separated and Packed RGB Components]

An extra component, the alpha channel, may be present; the number of components is then increased by one. Its organization follows the same packed or separated rule as the color components.

There are several numeric representations for the color components, i.e. several data types: IM_BYTE, IM_USHORT, IM_INT, IM_FLOAT and IM_CFLOAT. There is no bit type; binary images use 1 byte per pixel (wasting space, but keeping processing simple).

Image orientation can be bottom up, with the origin at the bottom left corner, or top down, with the origin at the top left corner.

[Figure: Top Down and Bottom Up Orientations]

Since all these options are relative to data organization, we created a parameter called color mode that contains the color space, the packing, the orientation and the optional alpha channel. The color space definitions are combined using a bitwise OR with the flags IM_ALPHA, IM_PACKED and IM_TOPDOWN. When a flag is absent the opposite definition is assumed: no alpha, separated components and bottom up orientation. See some examples:

IM_RGB | IM_ALPHA - rgb color space with an alpha channel, bottom up orientation and separated components
IM_GRAY | IM_TOPDOWN - gray color space with no alpha channel and top down orientation
IM_RGB | IM_ALPHA | IM_PACKED - rgb color space with an alpha channel, bottom up orientation and packed components
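
In C these combinations and tests are plain bitwise operations. A minimal sketch, assuming only the IM_* constants from the im.h header:

    #include <im.h>

    void color_mode_example(void)
    {
      int color_mode = IM_RGB | IM_ALPHA | IM_PACKED;

      /* test each optional flag; an absent flag means the default */
      int has_alpha   = (color_mode & IM_ALPHA)   != 0;  /* 1 */
      int is_packed   = (color_mode & IM_PACKED)  != 0;  /* 1 */
      int is_top_down = (color_mode & IM_TOPDOWN) != 0;  /* 0: bottom up */
      (void)has_alpha; (void)is_packed; (void)is_top_down;
    }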

So these four parameters define our raw image data: width, height, color_mode and data_type. The raw data buffer is always byte aligned and each component is stored sequentially in the buffer with the specified packing.

If the raw data buffer is cast to the proper C type, then locating the pixel at line y, column x, component d is done like this:

    if (is_packed) pixel = idata[y*width*depth  + x*depth + d];
    else           pixel = idata[d*width*height + y*width + x];

But this returns different pixel locations for top down and bottom up orientations, since line 0 is at the bottom in one case and at the top in the other.
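
As a concrete sketch, here is a routine that fetches one component of one pixel from a raw IM_BYTE buffer, normalizing the orientation so that line 0 is always the bottom line (the helper name raw_pixel is ours, not part of the library):

    #include <im.h>

    /* component d of the pixel at line y, column x, in a raw IM_BYTE buffer,
       with y = 0 at the bottom regardless of the buffer orientation */
    unsigned char raw_pixel(const unsigned char* idata, int width, int height,
                            int depth, int color_mode, int x, int y, int d)
    {
      if (color_mode & IM_TOPDOWN)
        y = height - 1 - y;  /* flip to the buffer's line order */

      if (color_mode & IM_PACKED)
        return idata[y*width*depth  + x*depth + d];
      else
        return idata[d*width*height + y*width + x];
    }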

We could restrict the data organization by eliminating the extra flags, but several users requested these features in the library. So we keep them, restricted to raw data access. For the high level image processing functions we created a structure called imImage that eliminates the extra flags and assumes no alpha channel, bottom up orientation and separated components.

The imImage structure is created using the four image parameters: width, height, color_space and data_type. It is an open structure in C, where you can access all the parameters. In addition to the 4 creation parameters there are many auxiliary parameters, like depth, count, line_size, plane_size and size, whose values are calculated at creation time.

The data is allocated like the raw image data, with the separated color components stored one after another, but it is accessed through an array of pointers, each one pointing to the beginning of a color component plane. So data[0] points to all the data, data[1] is a shortcut to the second component, and so on.
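
For example, a minimal sketch of creating an image and touching one pixel in each plane; we assume the imImageCreate/imImageDestroy calls and the field names shown here, check the reference for the exact signatures:

    #include <im.h>
    #include <im_image.h>

    void image_example(void)
    {
      imImage* image = imImageCreate(640, 480, IM_RGB, IM_BYTE);
      if (!image)
        return;
      /* auxiliary parameters were computed at creation:
         image->depth == 3, image->plane_size == 640*480 */

      unsigned char* red   = (unsigned char*)image->data[0];
      unsigned char* green = (unsigned char*)image->data[1];
      unsigned char* blue  = (unsigned char*)image->data[2];

      /* pixel (x, y), bottom up orientation, separated components */
      int x = 10, y = 20;
      red  [y*image->width + x] = 255;
      green[y*image->width + x] = 0;
      blue [y*image->width + x] = 0;

      imImageDestroy(image);
    }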

So that the structure can hold all the image information obtained from a file, it also has support for attributes and for a palette. The palette can be used for IM_MAP images and for pseudo color of IM_GRAY images.

An important subset of images is what we call a bitmap image: an image that can be directly used for display by graphics systems. Its color space must be IM_RGB, IM_MAP, IM_GRAY or IM_BINARY, and its data type must be IM_BYTE.
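
That rule can be written as a small predicate. This sketch just restates the definition above (the library provides its own helpers for this, see the reference):

    #include <im.h>

    /* bitmap test, straight from the definition above */
    int is_bitmap(int color_space, int data_type)
    {
      return (color_space == IM_RGB  || color_space == IM_MAP ||
              color_space == IM_GRAY || color_space == IM_BINARY) &&
             data_type == IM_BYTE;
    }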

Conversions between image data types, between color modes, and to bitmap are defined only for the imImage structure.

See: Reference / Image Representation,
        Reference / Image Representation / Structure,
        Reference / Image Representation / Conversion,
        Reference / Structures / imImage,
        Guide / Basics / Creating.

Image Storage (File Format Model)

Essentially all file formats save the same image data. There is no such thing as a GIF image; instead we have a color indexed image that can be saved in a file with the GIF format, or the TIFF format, etc. However the compression encoding can be lossy and degrade the original image. The point is that file formats and image data are two different things.

A file format is a file organization of the image data and its attributes. The IM library treats all file formats under the same model, including image, video, animation, stack and volume file formats. When there is more than one image, each one is treated as an independent frame. Each frame can have its own parameters and its own set of attributes.

We consider only formats that start with a signature, so we can recognize the format without relying on the file extension. If more than one driver handles the same signature, the first registered driver opens the file. Since the internal drivers are registered automatically, external drivers can be registered first, as long as no imFile function has been called yet; this also lets you control which external driver takes precedence.
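
As an illustration, a minimal sketch that loads every frame of a file as an imImage; we assume the imFileOpen, imFileGetInfo, imFileLoadImage and imFileClose calls, check the reference for the exact parameters:

    #include <im.h>
    #include <im_image.h>

    void load_all_frames(const char* file_name)
    {
      int error, image_count, i;
      char format[50], compression[50];

      imFile* ifile = imFileOpen(file_name, &error);  /* recognized by signature */
      if (!ifile)
        return;  /* error holds the reason */

      imFileGetInfo(ifile, format, compression, &image_count);

      for (i = 0; i < image_count; i++)
      {
        imImage* frame = imFileLoadImage(ifile, i, &error);  /* one frame */
        if (!frame)
          break;
        /* ... each frame has its own parameters and attributes ... */
        imImageDestroy(frame);
      }

      imFileClose(ifile);
    }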

See: Reference / Image Storage,
        Guide / Basics / Reading,
        Guide / Basics / Writing.

Image Capture (Live Image Input Model)

You must have a video capture device installed. It must be a device capable of live video; it cannot be a passive digital camera that only transfers pictures already taken. Valid devices include USB cameras (like most webcams), FireWire (IEEE 1394) cameras, and analog video capture boards, including TV tuners.

You can list the installed devices, and once you connect to a specific device you can control its parameters. Each connected device captures data frames continuously while in the Live state; otherwise it stays in standby. The user should then retrieve frames from the device, either inside a closed loop or inside an idle function. The user is not notified when a new frame is available; instead, every successful retrieval returns a new frame, and old frames are discarded when a new one arrives.
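
A minimal capture sketch following this model; the imVideoCapture* calls are assumed from im_capture.h, check the reference for the exact signatures and for sizing the data buffer:

    #include <im.h>
    #include <im_capture.h>

    void capture_frames(unsigned char* data, int n)
    {
      /* data must be large enough for the device image size */
      imVideoCapture* vc = imVideoCaptureCreate();
      if (!vc)
        return;
      if (!imVideoCaptureConnect(vc, 0))  /* first device */
      {
        imVideoCaptureDestroy(vc);
        return;
      }

      imVideoCaptureLive(vc, 1);  /* start continuous capture */

      for (int i = 0; i < n; i++)
      {
        /* a successful retrieval always returns a new frame */
        if (!imVideoCaptureFrame(vc, data, IM_RGB, 1000))
          break;
        /* ... use the frame in data ... */
      }

      imVideoCaptureLive(vc, 0);  /* back to standby */
      imVideoCaptureDisconnect(vc);
      imVideoCaptureDestroy(vc);
    }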

See: Reference / Image Capture,
        Guide / Basics / Capturing.

Image Processing (Operation Model)

We use the simplest model possible: a function with input data, output data and control parameters. There is no ROI (Region Of Interest) management.

Operations usually have one or more input images and one or more output images. We avoid implementing in-place operations, but many operations can use the same data for input and output. The data type, color mode and size of the images depend on the operation. Sometimes operations change the data type to increase the precision of the results, but normally only a few operations change the size (resize and geometric) or the color mode (color conversion). All of these details are described in each function's documentation; check it before using them.
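
For instance, a sketch of the typical input/output pattern using one of the point operations; we assume imImageClone (same parameters, data not copied) and imProcessNegative from the processing reference:

    #include <im.h>
    #include <im_image.h>
    #include <im_process.h>

    void negative_example(const imImage* src)
    {
      /* output with the same width, height, color space and data type */
      imImage* dst = imImageClone(src);
      if (!dst)
        return;

      imProcessNegative(src, dst);

      /* ... use dst ... */
      imImageDestroy(dst);
    }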

See: Reference / Image Processing,
        Guide / Basics / Processing.