Build Options - Defining OpenCL and DirectX build options

CodeXL

CodeXL User Guide

In the Static Analyze toolbar, you can define specific OpenCL or HLSL build options:

The Build Options box is where you set compiler build flags such as -x clc++ or -O3. Any compiler build flag can be placed in this box.

You can set the build options by typing the options directly in the designated text box or by using the OpenCL/HLSL Build Options dialog.

OpenCL Build Options Dialog

This dialog helps you choose the correct OpenCL build options and avoids the spelling mistakes that can occur when typing the options manually.

Press the button to open the dialog. You can switch between the “General & Optimization” tab and the “Other” tab to view all the available options. Once you choose an option, the option text is displayed in the “OpenCL Build Command Line” text box that appears below. This string will also appear in the menu bar after you click the OK button.

While you type a command in the “OpenCL Build Command Line” text box, the relevant controls are updated accordingly (for example, if you type “-w”, the “Disable all warnings” check box becomes checked).
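The two-way link between the command line text and the dialog controls can be pictured with a minimal sketch. It is not CodeXL's implementation: the `CHECKBOXES` mapping and the `checked_boxes` function are hypothetical names introduced here, and only the option/label pairs are taken from the option table in this guide.

```python
import shlex

# Hypothetical mapping from option text to check box label.
# The pairs come from the option table in this guide; the
# dictionary itself is an illustration, not CodeXL code.
CHECKBOXES = {
    "-w": "Disable all warnings",
    "-Werror": "Treat any warning as an error",
    "-cl-mad-enable": "Enable MAD",
    "-cl-no-signed-zeros": "Ignore the signedness of zero",
}

def checked_boxes(command_line):
    """Return the labels of the check boxes that a command line would tick."""
    tokens = shlex.split(command_line)  # split on whitespace, honoring quotes
    return [label for flag, label in CHECKBOXES.items() if flag in tokens]

print(checked_boxes("-w -cl-mad-enable -D GRID_SHIFT_X=6"))
```

Options that have no check box in the mapping, such as the -D macro here, are simply ignored by this sketch.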

Usage Example: build options

For building the tpAdvectFieldScalar.cl kernel from CodeXL’s AMDTTeaPot sample project, enter the following options:

-D GRID_NUM_CELLS_X=64 -D GRID_NUM_CELLS_Y=64 -D GRID_NUM_CELLS_Z=64 -D GRID_INV_SPACING=1.000000f -D GRID_SPACING=1.000000f -D GRID_SHIFT_X=6 -D GRID_SHIFT_Y=6 -D GRID_SHIFT_Z=6 -D GRID_STRIDE_Y=64 -D GRID_STRIDE_SHIFT_Y=6 -D GRID_STRIDE_Z=4096 -D GRID_STRIDE_SHIFT_Z=12 -I path_to_example_src

On Windows, path_to_example_src should be:

C:\Program Files\CodeXL\Examples\Teapot\res

On Linux, path_to_example_src should be:

/opt//CodeXL/bin/examples/Teapot/AMDTTeaPotLib/AMDTTeaPotLib/
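The macro values in the example above follow a visible pattern: the shifts are log2 of the cell count and the strides are powers of the cell count. The sketch below regenerates the -D part of that option string from a single grid size. The relationships between the macros are inferred from the example values only, not from the kernel source, and `grid_defines` is a hypothetical helper name.

```python
def grid_defines(cells=64, spacing=1.0):
    """Build the -D options for a cubic grid whose size is a power of two.

    Assumption (inferred from the example values): shift = log2(cells),
    GRID_STRIDE_Y = cells and GRID_STRIDE_Z = cells * cells.
    """
    shift = cells.bit_length() - 1
    if cells <= 0 or (1 << shift) != cells:
        raise ValueError("cells must be a positive power of two")
    macros = [
        ("GRID_NUM_CELLS_X", cells),
        ("GRID_NUM_CELLS_Y", cells),
        ("GRID_NUM_CELLS_Z", cells),
        ("GRID_INV_SPACING", f"{1.0 / spacing:.6f}f"),
        ("GRID_SPACING", f"{spacing:.6f}f"),
        ("GRID_SHIFT_X", shift),
        ("GRID_SHIFT_Y", shift),
        ("GRID_SHIFT_Z", shift),
        ("GRID_STRIDE_Y", cells),
        ("GRID_STRIDE_SHIFT_Y", shift),
        ("GRID_STRIDE_Z", cells * cells),
        ("GRID_STRIDE_SHIFT_Z", 2 * shift),
    ]
    return " ".join(f"-D {name}={value}" for name, value in macros)

# The include path still has to be appended by hand.
print(grid_defines(64) + " -I path_to_example_src")
```

Generating the string this way keeps the dependent macros consistent when the grid size changes, which is easy to get wrong when editing twelve values by hand.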

Adding the option ‘-h’ dumps the list of available OpenCL compiler options in the output tab. For additional details, see the ‘Compile Build Options’ appendix.

Build Options

General Options

-D

Predefined macros

Predefined macros should be separated by ';'. If a predefined macro needs to include a space, enclose the macro within parentheses.

-I

Additional include directories.

Additional include directories should be separated by ';'. If the directory path includes a space, enclose the path within parentheses.

-x clc,-x clc++

OpenCL format

 

-w

Disable all warnings

Inhibit all warning messages.

-Werror

Treat any warning as an error

Make all warnings into errors.
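The ';'-separated format described above for the -D and -I fields can be illustrated with a short sketch. How CodeXL actually converts these fields into compiler arguments is not documented here, so the behavior below (split on ';', strip one pair of enclosing parentheses, emit one flag per entry) is an assumption, and `split_entries` and `as_arguments` are hypothetical names.

```python
def split_entries(field):
    """Split a ';'-separated dialog field into individual entries.

    Assumption: an entry wrapped in parentheses is taken literally,
    spaces included, and the parentheses themselves are dropped.
    """
    entries = []
    for raw in field.split(";"):
        entry = raw.strip()
        if len(entry) >= 2 and entry[0] == "(" and entry[-1] == ")":
            entry = entry[1:-1]
        if entry:  # skip empty entries such as a trailing ';'
            entries.append(entry)
    return entries

def as_arguments(flag, field):
    """Return an argument list with `flag` repeated before every entry."""
    args = []
    for entry in split_entries(field):
        args += [flag, entry]
    return args

print(as_arguments("-D", "WIDTH=64;HEIGHT=64"))
print(as_arguments("-I", "/opt/include;(/home/user/my kernels)"))
```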

Optimization Options

-O0,-O1,-O2,-O3,-O4,-O5

Optimization level

 

-cl-single-precision-constant

Treat double float-point constant as single one

Treat double precision floating-point constant as single precision constant

-cl-denorms-are-zero

Flush denormalized floating point numbers as zeros

This option controls how single precision and double precision denormalized numbers are handled. If specified as a build option, the single precision denormalized numbers may be flushed to zero and if the optional extension for double precision is supported, double precision denormalized numbers may also be flushed to zero. This is intended to be a performance hint and the OpenCL compiler can choose not to flush denorms to zero if the device supports single precision (or double precision) denormalized numbers. This option is ignored for single precision numbers if the device does not support single precision denormalized numbers i.e. if CL_FP_DENORM bit is not set in CL_DEVICE_SINGLE_FP_CONFIG. This option is ignored for double precision numbers if the device does not support double precision or if it does support double precision but CL_FP_DENORM bit is not set in CL_DEVICE_DOUBLE_FP_CONFIG. This flag only applies to scalar and vector single precision floating-point variables and to computations on these floating-point variables inside a program. It does not apply to reading from or writing to image objects.

-cl-strict-aliasing

Compiler assumes the strict aliasing rules

This option allows the compiler to assume the strictest aliasing rules.

-cl-mad-enable

Enable MAD

Allow a * b + c to be replaced by a mad. The mad computes a * b + c with reduced accuracy. For example, some OpenCL devices implement mad by truncating the result of a * b before adding it to c.

-cl-no-signed-zeros

Ignore the signedness of zero

Allow optimizations for floating-point arithmetic that ignore the signedness of zero. IEEE 754 arithmetic specifies the behavior of distinct +0.0 and -0.0 values, which then prohibits the simplification of expressions such as x+0.0 or 0.0*x (even with -cl-finite-math-only). This option implies that the sign of a zero result isn't significant.

-cl-unsafe-math-optimizations

Allow unsafe optimization

Allow optimizations for floating-point arithmetic that (a) assume that arguments and results are valid, (b) may violate IEEE 754 standard and (c) may violate the OpenCL numerical compliance requirements as defined in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5. This option includes the -cl-no-signed-zeros and -cl-mad-enable options.

-cl-finite-math-only

Assume no NaN nor infinite

Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or ±∞. This option may violate the OpenCL numerical compliance requirements defined in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5.

-cl-fast-relaxed-math

Do aggressive Math Optimization

Sets the optimization options -cl-finite-math-only and -cl-unsafe-math-optimizations. This allows optimizations for floating-point arithmetic that may violate the IEEE 754 standard and the OpenCL numerical compliance requirements defined in the specification in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5. This option causes the preprocessor macro __FAST_RELAXED_MATH__ to be defined in the OpenCL program.

-cl-fp32-correctly-rounded-divide-sqrt

Correctly round single-precision FP divide & sqrt

The -cl-fp32-correctly-rounded-divide-sqrt build option to clBuildProgram or clCompileProgram allows an application to specify that single precision floating-point divide (x/y and 1/x) and sqrt used in the program source are correctly rounded. If this build option is not specified, the minimum numerical accuracy of single precision floating-point divide and sqrt are as defined in section 7.4 of the OpenCL specification.
This build option can only be specified if CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT is set in CL_DEVICE_SINGLE_FP_CONFIG (as defined in the table of allowed values for param_name for clGetDeviceInfo) for the devices for which the program is being built. clBuildProgram or clCompileProgram will fail to compile the program for a device if the -cl-fp32-correctly-rounded-divide-sqrt option is specified and CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT is not set for the device.
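The -cl-no-signed-zeros description above says that IEEE 754 prohibits simplifying x+0.0 to x or 0.0*x to 0.0. The following sketch shows why, using Python floats as a stand-in. Python floats are double precision, but the same IEEE 754 sign rules apply to the single precision values an OpenCL kernel would use. The helper names are hypothetical.

```python
import math

def sign_of(x):
    """Return +1.0 or -1.0 according to the sign bit of x (also for zeros)."""
    return math.copysign(1.0, x)

def add_zero(x):
    """Evaluate x + 0.0 with IEEE 754 rules (no simplification)."""
    return x + 0.0

def times_zero(x):
    """Evaluate 0.0 * x with IEEE 754 rules (no simplification)."""
    return 0.0 * x

# x + 0.0 is not always x: (-0.0) + 0.0 yields +0.0, so the sign changes.
print(sign_of(-0.0), sign_of(add_zero(-0.0)))

# 0.0 * x is not always +0.0: a negative x yields -0.0.
print(sign_of(times_zero(-3.0)))

# With a non-finite x, 0.0 * x is not even zero.
print(times_zero(math.inf))
```

A compiler that rewrites these expressions therefore changes the sign of some zero results, which is what the option permits by declaring the sign of a zero result insignificant.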

Other Options

-cl-std

CL version supported

Determine the OpenCL C language version to use. A value for this option must be provided. Valid values are:
CL1.1 - Support all OpenCL C programs that use the OpenCL C language features defined in section 6 of the OpenCL 1.1 specification.
CL1.2 - Support all OpenCL C programs that use the OpenCL C language features defined in section 6 of the OpenCL 1.2 specification.

-cl-kernel-arg-info

Kernel argument info

This option allows the compiler to store information about the arguments of a kernel(s) in the program executable. The argument information stored includes the argument name, its type, the address and access qualifiers used. Refer to the description of clGetKernelArgInfo for information about how to query this information.

-create-library

Create library

Create a library of compiled binaries specified in input_programs argument to clLinkProgram.

-enable-link-options

Enable link options

Allows the linker to modify the library behavior based on one or more link options (described in Program Linking Options, below) when this library is linked with a program executable. This option must be specified with the -create-library option.

-g

Produce debugging information

This is an experimental feature that lets you use the GNU project debugger, GDB, to debug kernels on x86 CPUs running Linux, or Cygwin/MinGW under Windows. For more details, see Chapter 3, “Debugging OpenCL.” This option does not affect the default optimization of the OpenCL code.

-fper-pointer-uav
-fno-per-pointer-uav

Specify that UAV per pointer should be used
(HD5XXX and HD6XXX series GPUs only)

 

-fbin-bif30
-fno-bin-bif30

Allow OpenCL binary to be BIF3.0 format

 

-fbin-encrypt
-fno-bin-encrypt

Generate an encrypted OpenCL binary (not by default)

 

-save-temps

Store temporary files in current directory

This option dumps intermediate temporary files, such as IL and ISA code, for each OpenCL kernel. If <prefix> is not given, temporary files are saved in the default temporary directory (the current directory for Linux, C:\Users\<user>\AppData\Local for Windows). If <prefix> is given, those temporary files are saved with the given <prefix>. If <prefix> is an absolute path prefix, such as C:\your\work\dir\mydumpprefix, those temporaries are saved under C:\your\work\dir, with mydumpprefix as prefix to all temporary names. For example, under the default directory

-fuse-jit
-fno-use-jit

Use JIT for CPU target (disable if debugging is enabled)

 

-fforce-jit
-fno-force-jit

Force use JIT for CPU target (even if debugging is enabled)

 

-fdisable-avx
-fno-disable-avx

Disable AVX code generation

 

-ffma-enable
-fno-fma-enable

Enable fma for a*b+c

 

 

-fuse-native

Replace math function calls with native version

 

 

 

HLSL Build Options Dialog

This dialog helps you choose the correct HLSL build options and avoids the spelling mistakes that can occur when typing the options manually.

Press the button to open the dialog. Click the “HLSL Build Options” node to view the available options.
Once you choose an option, the option text is displayed in the “HLSL Build Command Line” text box that appears below.
This build option string will also appear in the toolbar’s build options box after you click the OK button.

As an alternative to selecting options through the radio buttons, you can type a command in the “HLSL Build Command Line” text box. Build options typed in the text box automatically update the relevant controls. For example, typing “D3DCOMPILE_DEBUG” in the lower text box automatically checks the “Debug” check box.

 

 

Build Options

The compilation of DirectX shaders can be executed either by directly referencing the D3D compiler DLL or by going through Microsoft’s FXC tool.

The CodeXL installation includes a copy of the Microsoft DirectX compiler DLL: d3dcompiler_47.dll. You may specify a different path if you want CodeXL to use a different d3dcompiler module. If you select the FXC compiler tool, you must specify a path to the location of FXC.exe.

To select the path of the compiler module, click the “Browse…” option in the combo box. A dialog box opens for selecting the compiler file.

·         For the D3D compiler, any file named d3dcompiler_*.dll can be selected.

·         For the FXC compiler, only files named FXC.exe can be selected.

D3D compile command

FXC compile command

Build Option

 

-D

Predefined macros

Predefined macros should be separated by ';'. If a predefined macro needs to include a space, enclose the macro within parentheses.
The macro won’t appear in the “HLSL Build Command Line”.

-I

Additional include directories

Additional include directories should be separated by ';'. If the directory path includes a space, enclose the path within parentheses.
The include won’t appear in the “HLSL Build Command Line”.

D3DCOMPILE_AVOID_FLOW_CONTROL

/Gfa

Avoid Flow Control

Directs the compiler to not use flow-control constructs where possible.

D3DCOMPILE_DEBUG

/Zi

Debug

Directs the compiler to insert debug file/line/type/symbol information into the output code.

D3DCOMPILE_ENABLE_BACKWARDS_COMPATIBILITY

/Gec

Enable Backwards Compatibility

Directs the compiler to enable older shaders to compile to 5_0 targets.

D3DCOMPILE_ENABLE_STRICTNESS

/Ges

Enable Strictness

Forces strict compile, which might not allow for legacy syntax.
By default, the compiler disables strictness on deprecated syntax.

D3DCOMPILE_FORCE_PS_SOFTWARE_NO_OPT

 

Force Pixel Shader Optimization Off

Directs the compiler to compile a pixel shader for the next highest shader profile. This constant also turns debugging on and optimizations off.

D3DCOMPILE_FORCE_VS_SOFTWARE_NO_OPT

 

Force Vertex Shader Optimizations Off

Directs the compiler to compile a vertex shader for the next highest shader profile. This constant turns debugging on and optimizations off.

D3DCOMPILE_IEEE_STRICTNESS

/Gis

IEEE Strictness

Forces the IEEE strict compile.

D3DCOMPILE_NO_PRESHADER

/Op

No Preshader

Directs the compiler to disable Preshaders. If you set this constant, the compiler does not pull out static expression for evaluation.


* D3DCOMPILE_SKIP_OPTIMIZATION
* D3DCOMPILE_OPTIMIZATION_LEVEL0
* .. (no flag for default optimization)
* D3DCOMPILE_OPTIMIZATION_LEVEL2
* D3DCOMPILE_OPTIMIZATION_LEVEL3

 

* /Od
* /O0
* .. (no flag for default)
* /O1
* /O2
* /O3

Optimization Level:
* Skip Optimization
* Level 0 - Lowest optimization
* Level 1 – Default Optimization
* Level 2
* Level 3 - Highest optimization

Directs the compiler to use the specified level of optimization:
* Skip - skip optimization steps during code generation. We recommend that you set this constant for debug purposes only.
* Lowest level - If you set this constant, the compiler might produce slower code but produces the code more quickly.
      Set this constant when you develop the shader iteratively.
* Second lowest level - the default optimization level (Level 1).
* Second highest level - Level 2.
* Highest level - If you set this constant, the compiler produces the best possible code but might take significantly longer to do so.
      Set this constant for final builds of an application when performance is the most important factor.

D3DCOMPILE_PACK_MATRIX_COLUMN_MAJOR

/Zpc

Pack Matrix Column Major

Directs the compiler to pack matrices in column-major order on input and output from the shader. This type of packing is generally more efficient because a series of dot-products can then perform vector-matrix multiplication.

D3DCOMPILE_PACK_MATRIX_ROW_MAJOR

/Zpr

Pack Matrix Row Major

Directs the compiler to pack matrices in row-major order on input and output from the shader.

D3DCOMPILE_PARTIAL_PRECISION

/Gpp

Partial Precision

Directs the compiler to perform all computations with partial precision. If you set this constant, the compiled code might run faster on some hardware.

D3DCOMPILE_PREFER_FLOW_CONTROL

/Gfp

Prefer Flow Control

Directs the compiler to use flow-control constructs where possible.

D3DCOMPILE_RESOURCES_MAY_ALIAS

/res_may_alias

Resources May Alias

Directs the compiler to assume that unordered access views (UAVs) and shader resource views (SRVs) may alias for cs_5_0.
Note: This is a new compiler symbol (supported as of D3dcompiler_47.dll).

D3DCOMPILE_SKIP_VALIDATION

/Vd

Skip Validation

Directs the compiler not to validate the generated code against known capabilities and constraints. We recommend that you use this constant only with shaders that have been successfully compiled in the past. DirectX always validates shaders before it sets them to a device.

D3DCOMPILE_WARNINGS_ARE_ERRORS

/WX

Warnings Are Errors

Directs the compiler to treat all warnings as errors when it compiles the shader code. We recommend that you use this constant for new shader code, so that you can resolve all warnings and lower the number of hard-to-find code defects.

 

/Lx

Output hexadecimal literals

 

 

/Ni

Numbering of instructions in assembly listings

 

 

/No

Output instruction byte offset in assembly listings

 

 

/Qstrip_debug

Strip debug data from 4.0 + shader bytecode

 

 

/Qstrip_priv

Strip private data from 4.0 + shader bytecode

 

 

/Qstrip_reflect

Strip reflection data from 4.0 + shader bytecode

 

 

·         Note: some of the flags are only relevant to the FXC tool.
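The correspondence between the D3D compile constants and the FXC switches listed in the table above can be captured in a lookup, as in the sketch below. The pairs are copied from the table. The optimization level rows are left out except for the skip-optimization pair, and `D3D_TO_FXC` and `to_fxc` are hypothetical names, not part of CodeXL or of the D3D compiler API.

```python
# D3D compile constant -> FXC switch, as listed in the table above.
D3D_TO_FXC = {
    "D3DCOMPILE_AVOID_FLOW_CONTROL": "/Gfa",
    "D3DCOMPILE_DEBUG": "/Zi",
    "D3DCOMPILE_ENABLE_BACKWARDS_COMPATIBILITY": "/Gec",
    "D3DCOMPILE_ENABLE_STRICTNESS": "/Ges",
    "D3DCOMPILE_IEEE_STRICTNESS": "/Gis",
    "D3DCOMPILE_NO_PRESHADER": "/Op",
    "D3DCOMPILE_SKIP_OPTIMIZATION": "/Od",
    "D3DCOMPILE_PACK_MATRIX_COLUMN_MAJOR": "/Zpc",
    "D3DCOMPILE_PACK_MATRIX_ROW_MAJOR": "/Zpr",
    "D3DCOMPILE_PARTIAL_PRECISION": "/Gpp",
    "D3DCOMPILE_PREFER_FLOW_CONTROL": "/Gfp",
    "D3DCOMPILE_RESOURCES_MAY_ALIAS": "/res_may_alias",
    "D3DCOMPILE_SKIP_VALIDATION": "/Vd",
    "D3DCOMPILE_WARNINGS_ARE_ERRORS": "/WX",
}

def to_fxc(constants):
    """Translate D3D constants to FXC switches.

    Returns (switches, unmapped): constants with no FXC switch in the
    table, such as D3DCOMPILE_FORCE_PS_SOFTWARE_NO_OPT, go to `unmapped`.
    """
    switches, unmapped = [], []
    for name in constants:
        if name in D3D_TO_FXC:
            switches.append(D3D_TO_FXC[name])
        else:
            unmapped.append(name)
    return switches, unmapped

print(to_fxc(["D3DCOMPILE_DEBUG", "D3DCOMPILE_WARNINGS_ARE_ERRORS",
              "D3DCOMPILE_FORCE_PS_SOFTWARE_NO_OPT"]))
```

Returning the unmapped constants separately makes it visible when a selected option has no equivalent for the other compiler, which matches the note that some flags apply to only one tool.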