
Using *DBOPTIMIZE / *DBOPTIMIZE_Batch

If you write an RDML program like this:

REQUEST  FIELDS(#PRONUM)
FETCH    FIELDS(#PANEL01) FROM_FILE(PRODMST) WITH_KEY(#PRONUM)
FETCH    FIELDS(#PANEL01) FROM_FILE(PRODCAT) WITH_KEY(#PROCAT)

 

then compile and execute it, the objects being used would look something like this:

 __________            _____________            ____________
|          |          |             |    to    |            |
|          |  calls   | I/O Module  |  access  |  PRODMST   |
|          | -------> | for PRODMST | <------> | data file  |
| RDML     |          |_____________|          |____________|
| Function |
|          |  calls    _____________            ____________
|          |          |             |    to    |            |
|          | -------> | I/O Module  |  access  |  PRODCAT   |
 ----------           | for PRODCAT | <------> | data file  |
                      |_____________|          |____________|

 

This is a typical LANSA application, using I/O module calls to control all access to all database files.

Using this approach has several advantages, the chief of which is the ability to alter the definition of a database file and/or its validation rules without ever having to recompile the RDML functions that access it.

Usually, only the I/O module needs to be recompiled.

However, there is a disadvantage as well: calling the I/O module imposes a performance overhead.

Usually this overhead is small and acceptable, given the advantages the approach brings. However, in some situations the call overhead is magnified by other factors.

These factors can range from accessing a file with a very large number of records to trying to run a large and complex application system on a small or undersized machine.

To alleviate this call overhead, a facility called *DBOPTIMIZE (or *DBOPTIMISE) can be used.

If the program above were changed to this:

FUNCTION OPTIONS(*DBOPTIMIZE)     (or OPTIONS(*DBOPTIMISE))
REQUEST  FIELDS(#PRONUM)
FETCH    FIELDS(#PANEL01) FROM_FILE(PRODMST) WITH_KEY(#PRONUM)
FETCH    FIELDS(#PANEL01) FROM_FILE(PRODCAT) WITH_KEY(#PROCAT)

 

then (re)compiled and executed, the objects being used would look something like this:

     ________________                     ___________
    |                |      access       |           |
    |                | <---------------> | PRODMST   |
    | RDML Function  |                   | data file |
    |                |                   |___________|
    |                |
    |                |                    ___________
    |                |      access       |           |
    |                | <---------------> | PRODCAT   |
    |                |                   | data file |
    |________________|                   |___________|

 

This alleviates the I/O module call overhead imposed by the IBM i operating system.

This technique also produces efficient code, because only the code for the specific database access method(s) the RDML function requires is actually brought into the function.

It can produce significant performance benefits in some situations.

However, it also has several disadvantages that you should be aware of:

  • Recompilations of RDML functions are now required   
    If the definition of a database file is changed, all functions that use *DBOPTIMIZE, and reference the file (directly or indirectly), need to be recompiled.
  • RDML functions produce larger compiled objects   
    Since the RDML function now contains more database access logic, it will produce larger compiled objects.
  • RDML functions take longer to compile   
    Since the RDML function now contains more database access logic, it will take longer to compile.
  • Some RDML commands cannot be used   
    The current release of LANSA does not allow the BROWSE or RENAME commands to be used in an RDML function that uses the *DBOPTIMIZE option. These restrictions are checked by the full function checker.
  • Virtual derivation logic may be more complex   
    Virtual field derivation code (refer to the section on virtual fields) can become more complex in a *DBOPTIMIZE environment. For instance, suppose two different database files have the same data structure and subroutines defined to handle a date reversal. This works fine when I/O modules are used, but if an RDML program is written that accesses both files, it will fail to compile, because the data structure and subroutine would be included in the translated RPG code twice. This problem can be overcome by using the VC_COPY command. Refer to the section of this guide that deals with virtual fields for details.
  • RDML level overrides may not work the same way   
    Some LANSA users have RDML functions that issue specific file overrides by setting the file up as SECURE=NO and then issuing overrides by calling QCMDEXC or QCAEXEC. These may not work as they do when I/O modules are used, because IBM i and CPF handle overrides issued at the same invocation level differently from overrides issued at different invocation levels.
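
    For illustration only, the pattern being described looks something like the following hypothetical fragment (the override command, field names and lengths are assumptions for this example; the IBM-supplied QCMDEXC program expects the command length as a packed 15,5 value):

    DEFINE   FIELD(#CMD) TYPE(*CHAR) LENGTH(80)
    DEFINE   FIELD(#CMDLEN) TYPE(*DEC) LENGTH(15) DECIMALS(5)
    CHANGE   FIELD(#CMD) TO('OVRDBF FILE(PRODMST) SHARE(*YES)')
    CHANGE   FIELD(#CMDLEN) TO(80)
    CALL     PGM(QCMDEXC) PARM(#CMD #CMDLEN)

    Whether such an override takes effect as intended under *DBOPTIMIZE depends on the invocation level at which it is issued, as described above.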
  • A limit now exists on how many files can be accessed   
    When I/O modules are used, there is no effective limit on the number of files that an RDML function can access.
    However, since *DBOPTIMIZE causes all the required code to be centralized into the RDML function, the RPG file limit now applies.
    To calculate the file usage total, use the following (a worked example is given below):
  • If the RDML function uses DISPLAY, REQUEST or POP_UP commands, count 1.
  • For each file referenced in the RDML function, count 1.
  • For each file referenced in the RDML function for insert, update, delete or fetch by relative record, count another 1.
  • For each report produced by the RDML function, count 1.
  • For each table/code file validation check referenced by one or more of the files referenced in the RDML function, count 1.
  • For each batch control file referenced by one or more of the files referenced in the RDML function, count 1.
  • For each file accessed by one or more access routes which have predetermined join fields that are referenced in the function, count 1.
    If this total exceeds 49, the translated RPG code will not compile. Note that one file spot is reserved by LANSA, so the limit is 49, not 50.
    This limit is not checked by LANSA in the current release.
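
    For example (hypothetical figures, reusing the file names from the earlier examples), a function that uses a REQUEST command, reads files PRODMST and PRODCAT, updates PRODMST, produces one report, and whose files reference one table validation check would count:

        REQUEST command used                      1
        Files referenced (PRODMST, PRODCAT)       2
        PRODMST also accessed for update          1
        Report produced                           1
        Table/code file validation check          1
                                                ---
        Total                                     6

    A total of 6 is well inside the limit of 49, so the translated RPG code will compile.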

 

  • File changes must be made operational before compiling RDML   
    When I/O modules are used, RDML functions that access a file can be coded and compiled before the file actually exists.
    However, when using *DBOPTIMIZE:
  • New database files must be made operational, with or without an I/O module, before attempting to compile any RDML function that references the file.
  • Modified database files must be made operational before attempting to (re)compile new or existing RDML functions that access the file.
  • After a changed file definition has been made operational, all existing RDML functions that reference the file, directly or indirectly, and use *DBOPTIMIZE, should be recompiled.

    None of these situations is specifically checked for by the current release of LANSA. However, the cross-reference abilities of LANSA have been enhanced to aid in the identification of all RDML functions that use *DBOPTIMIZE.
  • Accessing "OTHER" files with omitted fields requires more care   
    When "OTHER" (i.e. non_LANSA) files are made known to LANSA, fields within the file are sometimes "omitted" from the loaded definition because they are too long, have an unknown type or conflict with the definition of a field already in the dictionary.

    For example: an S/36 file which has no external definition may be loaded. It contains one 300-byte field called DATA. The load procedures generally ignore this field, and the developer uses the "virtual fields" facility to assemble and disassemble the field DATA during reads from and writes to the file.

    When an I/O module is used, this all works okay, and field DATA is extracted from the external file definition, allowing it to be accessed by virtual field derivation code, although, strictly speaking, LANSA does not know that this field exists.

    However, if the file is used with *DBOPTIMIZE and another file also contains a field called DATA, two different types of conflicts are possible:
  • Field DATA has a different type and/or length in the two files. This is the preferable conflict, as the RDML function will not compile, clearly indicating the conflicting situation. The only resolution for this situation in the current release of LANSA is not to use *DBOPTIMIZE.
  • Field DATA has the same attributes in both files. In this situation it is possible that the value of field DATA will be overwritten erratically, possibly affecting user-defined virtual derivation code. In this situation the use of *DBOPTIMIZE is not recommended unless very extensive and thorough testing is performed.

Some other things to think about when using *DBOPTIMIZE are:

  • Using *DBOPTIMIZE_BATCH for large volume batch updates/deletes   
    The *DBOPTIMIZE_BATCH keyword invokes the same facilities as *DBOPTIMIZE, but in a form that is more appropriate for batch applications doing large volumes of update or delete operations.

To understand how *DBOPTIMIZE_BATCH changes the way file records are processed, you must first understand how records are processed by I/O modules, or by normal *DBOPTIMIZE logic.

Consider the following RDML function:

    FUNCTION OPTIONS(*DBOPTIMISE)
 -- SELECT   FIELDS(#FIELD01) FROM_FILE(TEST)
|   CHANGE   FIELD(#FIELD01) TO('#FIELD01 * 1.1')
|   UPDATE   FIELDS(#FIELD01) IN_FILE(TEST)
 -- ENDSELECT

 

This RDML logic is translated into I/O requests like this:

SELECT   - Read next record from file TEST (in keyed order)
         | and save its associated relative record number
         | (RRN). The record is not locked because the open
         | data path being used is only open for input, not
         | for update.
         |
UPDATE   | Reread the record from file TEST via an alternate
         | open data path using its RRN. This RRN access is
         | much faster than access by key. The record is
         | locked because this open data path is open for
         | update.
         | Check for any changes to the record since it was
         | originally read via the input open data path.
         |
         | Update the record via the update open data path
         | and release the record lock.
         |
ENDSELECT - Go back to the SELECT command

 

This approach produces very robust program logic, does not leave records locked for long periods of time, and may actually work faster than an *DBOPTIMIZE_BATCH version of the program when only a low percentage of the records read by the SELECT loop is updated. This is because the input open data path being used to read records does not have to lock every record it reads.

However, when a large proportion of the records is being updated (or deleted), the extra open data path, and the extra RRN I/O performed on it, may impose an unnecessary overhead.
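
To illustrate with hypothetical figures: if the SELECT loop reads 100,000 records but updates only 1,000 of them, the *DBOPTIMIZE version performs 100,000 unlocked reads plus just 1,000 extra RRN reads and updates. If, instead, 90,000 of the records are updated, the second open data path adds 90,000 extra RRN reads, and the *DBOPTIMIZE_BATCH approach described below will almost certainly be faster.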

To remove this overhead, you can make the following change to the function:

FUNCTION OPTIONS(*DBOPTIMIZE_BATCH)

 

This changes the translated I/O requests to this:

 

SELECT   - Read next record from file TEST (in keyed order).
         |  The record is locked because the open data path
         |  being used is open for update.
         |
UPDATE   |  Update the file record. No check for changes
         |  since the record was read is required because the
         |  file record has been locked since it was read.
         |  Use the same open data path as the read operation.
         |
ENDSELECT - Go back to the SELECT command

 

which will probably execute faster than the *DBOPTIMIZE version.

Some other things to note about *DBOPTIMIZE_BATCH include:

  • The I/O logic used for read only files in RDML programs (i.e. files that do not have any INSERT, UPDATE or DELETE operations performed against them) is identical, regardless of whether *DBOPTIMIZE or *DBOPTIMIZE_BATCH is used.
  • The I/O logic used for update files is changed. Files that have UPDATE, INSERT or DELETE operations performed against them use logic that opens only one data path to the file (instead of 2), and that data path is open for update operations. This means that a record read from the file by any I/O command such as SELECT or FETCH is left locked until a subsequent read, update or delete operation releases the lock.
  • The previously stated method for calculating how many files can be used in a function when using *DBOPTIMIZE is slightly changed. Files accessed for update, insert or delete do not have to be counted twice when *DBOPTIMIZE_BATCH is used.
  • Do not use *DBOPTIMIZE_BATCH in online functions unless extreme care is taken. Using *DBOPTIMIZE_BATCH in online programs that have any form of screen panel interaction is an inappropriate use of this facility, and it will most likely cause record locking (and releasing) problems that enormously complicate the function logic.
  • Do not use *DBOPTIMIZE_BATCH in programs that use FETCH, UPDATE or DELETE operations that address records by relative record number (RRN). Such operations use the WITH_RRN parameter of the associated command. The use of the RETURN_RRN parameter is okay.
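
    For illustration, here is a hypothetical sketch of the pattern this restriction refers to (file TEST and the field names are assumed for this example). This use of RETURN_RRN is acceptable:

    FETCH    FIELDS(#FIELD01) FROM_FILE(TEST) WITH_KEY(#KEY01) RETURN_RRN(#RRN01)

    whereas this use of WITH_RRN is not, because it addresses the record by relative record number:

    UPDATE   FIELDS(#FIELD01) IN_FILE(TEST) WITH_RRN(#RRN01)
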
  • Using data file SHARE options can improve processing   

    The sharing of an open data path (SHARE=YES) facility provided by IBM i and CPF does not have any significant performance impact when using I/O modules, because all I/O is centralized into one I/O module anyway.

    However, when using *DBOPTIMIZE, I/O operations become dispersed across several RDML functions, so using SHARE=YES can have performance benefits.

    Please ensure that the concept of an open data path and the conflict and interference implications of using an open data path are fully understood before attempting to use this option. Additionally, the difference between a shared open data path and a shared access path should be clearly understood.

    Use of SHARE=YES on files under batch control logic (i.e. the file the batch control totals are kept in) is not recommended, as it will almost certainly cause shared open data path usage conflicts.
  • Cross-reference reports show RDML functions using *DBOPTIMIZE   

    The cross-reference reports for field and file usage clearly indicate RDML functions that use the *DBOPTIMIZE facility. This enables functions that need to be recompiled to be identified quickly.
  • You don't have to have an I/O module   

    If all RDML functions that access a file use *DBOPTIMIZE, then the file's I/O module is not really required as it will never be invoked. In this situation, you can actually define the file so as to permanently indicate that no I/O module is required.

    Refer to Database File Attributes for details of this option. If this option is used to indicate that no I/O module is required, any existing I/O module will be deleted to maintain future synchrony between the database and I/O module(s).

    Please note that the "File Maintenance Utility" performs database access via I/O modules only. Attempting to use the FMU on a file that does not have an I/O module will cause it to fail.

    Generally this is not a limitation, because a purpose built RDML function, using *DBOPTIMIZE, can be generated in a matter of minutes from an application template to perform any role for which the FMU was required, as in the sketch below.
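
    As an illustrative (hypothetical and untested) sketch, a minimal update function of this kind, reusing the field and file names from the earlier examples and omitting all error handling, might look like this:

    FUNCTION OPTIONS(*DBOPTIMIZE)
    BEGIN_LOOP
    REQUEST  FIELDS(#PRONUM)
    FETCH    FIELDS(#PANEL01) FROM_FILE(PRODMST) WITH_KEY(#PRONUM)
    REQUEST  FIELDS(#PANEL01)
    UPDATE   FIELDS(#PANEL01) IN_FILE(PRODMST)
    END_LOOP
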
  • Be selective about the use of *DBOPTIMIZE   

    Most commercial applications more or less follow the 80/20 rule. This means that 20% of the application programs are used 80% of the time. If this is true of your system, you should plan on using *DBOPTIMIZE in the 20% of the applications, and not bother with the other 80%. It is better to optimize an online inquiry that is used all day by lots of users, than a table update routine that is used once a month.