Build Options - Defining OpenCL and DirectX build options

CodeXL

CodeXL User Guide

In the Static Analyze toolbar, you can define specific OpenCL or HLSL build options:

The Build Options box is where you set compiler build flags such as -x clc++ or -O3. Any compiler build flag can be placed in this box.

You can set the build options by typing the options directly in the designated text box or by using the OpenCL/HLSL Build Options dialog.

OpenCL Build Options Dialog

This dialog helps you choose the correct OpenCL build options and avoid the spelling mistakes that can occur when typing the options manually.

To open the dialog, press the button. You can switch between the “General & Optimization” tab and the “Other” tab to view all the available options. Once you choose an option, the option text is displayed in the “OpenCL Build Command Line” text box below. This string also appears in the menu bar after you click the OK button.

While you type a command in the “OpenCL Build Command Line” text box, the relevant controls are updated accordingly. For example, if you type “-w”, the “Disable all warnings” check box becomes checked.
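The link between the command-line text and the dialog controls can be pictured with a small sketch. The Python below is illustrative only and is not CodeXL's implementation: the `FLAG_TO_CONTROL` table and the `controls_for` function are hypothetical names, and only a few flags from the options table are included.

```python
import shlex

# Hypothetical mapping from a build flag to the label of the control it toggles.
# The labels come from the OpenCL options table; the mapping code is an assumption.
FLAG_TO_CONTROL = {
    "-w": "Disable all warnings",
    "-Werror": "Treat any warning as an error",
    "-cl-mad-enable": "Enable MAD",
    "-cl-no-signed-zeros": "Ignore the signedness of zero",
}


def controls_for(command_line):
    """Return the labels of the controls that would be checked for this text."""
    tokens = shlex.split(command_line)
    return {FLAG_TO_CONTROL[t] for t in tokens if t in FLAG_TO_CONTROL}


# Flags without an entry in the table (here -O3) are simply ignored by the sketch.
checked = controls_for("-w -cl-mad-enable -O3")
print(sorted(checked))
```

A table-driven lookup keeps the flag text and the control label in one place, which is why the sketch uses a dictionary rather than a chain of conditions.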

Usage Example: build options

For building the tpAdvectFieldScalar.cl kernel from CodeXL’s AMDTTeaPot sample project, enter the following options:

-D GRID_NUM_CELLS_X=64 -D GRID_NUM_CELLS_Y=64 -D GRID_NUM_CELLS_Z=64 -D GRID_INV_SPACING=1.000000f -D GRID_SPACING=1.000000f -D GRID_SHIFT_X=6 -D GRID_SHIFT_Y=6 -D GRID_SHIFT_Z=6 -D GRID_STRIDE_Y=64 -D GRID_STRIDE_SHIFT_Y=6 -D GRID_STRIDE_Z=4096 -D GRID_STRIDE_SHIFT_Z=12 -I path_to_example_src

On Windows, path_to_example_src should be:

C:\Program Files\CodeXL\Examples\Teapot\res

On Linux, path_to_example_src should be:

/opt//CodeXL/bin/examples/Teapot/AMDTTeaPotLib/AMDTTeaPotLib/
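If you generate the option string from a script rather than typing it, a sketch like the following may help. It is an assumption-laden illustration: `GRID_MACROS` and `build_options` are hypothetical names, the macro values are copied from the example above, and `path_to_example_src` is left as the same placeholder used in the text.

```python
# Macro definitions copied from the usage example; the dictionary is a convenience
# for this sketch, not something CodeXL reads.
GRID_MACROS = {
    "GRID_NUM_CELLS_X": "64",
    "GRID_NUM_CELLS_Y": "64",
    "GRID_NUM_CELLS_Z": "64",
    "GRID_INV_SPACING": "1.000000f",
    "GRID_SPACING": "1.000000f",
    "GRID_SHIFT_X": "6",
    "GRID_SHIFT_Y": "6",
    "GRID_SHIFT_Z": "6",
    "GRID_STRIDE_Y": "64",
    "GRID_STRIDE_SHIFT_Y": "6",
    "GRID_STRIDE_Z": "4096",
    "GRID_STRIDE_SHIFT_Z": "12",
}


def build_options(macros, include_dir):
    """Join -D definitions and one -I directory into a single option string."""
    parts = [f"-D {name}={value}" for name, value in macros.items()]
    parts.append(f"-I {include_dir}")
    return " ".join(parts)


# The placeholder must be replaced with the platform-specific path given above.
options = build_options(GRID_MACROS, "path_to_example_src")
print(options)
```

The resulting string can be pasted into the Build Options box. The sketch does not quote paths that contain spaces.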

Adding the option ‘-h’ dumps the list of available OpenCL compiler options to the output tab. For additional details, see the ‘Compile Build Options’ appendix.

Build Options

General Options
-D	Predefined macros	Predefined macros should be separated by ';'. If a predefined macro needs to include a space, enclose the macro within parentheses.
-I	Additional include directories.	Additional include directories should be separated by ';'. If the directory path includes a space, enclose the path within parentheses.
-x clc,-x clc++	OpenCL format
-w	Disable all warnings	Inhibit all warning messages.
-Werror	Treat any warning as an error	Make all warnings into errors.
Optimization Options
-O0,-O1,-O2,-O3,-O4,-O5	Optimization level
-cl-single-precision-constant	Treat double float-point constant as single one	Treat double precision floating-point constant as single precision constant
-cl-denorms-are-zero	Flush denormalized floating point numbers as zeros	This option controls how single precision and double precision denormalized numbers are handled. If specified as a build option, the single precision denormalized numbers may be flushed to zero and if the optional extension for double precision is supported, double precision denormalized numbers may also be flushed to zero. This is intended to be a performance hint and the OpenCL compiler can choose not to flush denorms to zero if the device supports single precision (or double precision) denormalized numbers. This option is ignored for single precision numbers if the device does not support single precision denormalized numbers i.e. if CL_FP_DENORM bit is not set in CL_DEVICE_SINGLE_FP_CONFIG. This option is ignored for double precision numbers if the device does not support double precision or if it does support double precision but CL_FP_DENORM bit is not set in CL_DEVICE_DOUBLE_FP_CONFIG. This flag only applies to scalar and vector single precision floating-point variables and to computations on these floating-point variables inside a program. It does not apply to reading from or writing to image objects.
-cl-strict-aliasing	Compiler assumes the strict aliasing rules	This option allows the compiler to assume the strictest aliasing rules.
-cl-mad-enable	Enable MAD	Allow a * b + c to be replaced by a mad. The mad computes a * b + c with reduced accuracy. For example, some OpenCL devices implement mad by truncating the result of a * b before adding it to c.
-cl-no-signed-zeros	Ignore the signedness of zero	Allow optimizations for floating-point arithmetic that ignore the signedness of zero. IEEE 754 arithmetic specifies the behavior of distinct +0.0 and -0.0 values, which then prohibits the simplification of expressions such as x+0.0 or 0.0*x (even with -cl-finite-math-only). This option implies that the sign of a zero result isn't significant.
-cl-unsafe-math-optimizations	Allow unsafe optimization	Allow optimizations for floating-point arithmetic that (a) assume that arguments and results are valid, (b) may violate IEEE 754 standard and (c) may violate the OpenCL numerical compliance requirements as defined in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5. This option includes the -cl-no-signed-zeros and -cl-mad-enable options.
-cl-finite-math-only	Assume no NaN nor infinite	Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or ±∞. This option may violate the OpenCL numerical compliance requirements defined in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5.
-cl-fast-relaxed-math	Do aggressive Math Optimization	Sets the optimization options -cl-finite-math-only and -cl-unsafe-math-optimizations. This allows optimizations for floating-point arithmetic that may violate the IEEE 754 standard and the OpenCL numerical compliance requirements defined in the specification in section 7.4 for single-precision floating-point, section 9.3.9 for double-precision floating-point, and edge case behavior in section 7.5. This option causes the preprocessor macro __FAST_RELAXED_MATH__ to be defined in the OpenCL program.
-cl-fp32-correctly-rounded-divide-sqrt	Correctly round single-precision FP divide & sqrt	The -cl-fp32-correctly-rounded-divide-sqrt build option to clBuildProgram or clCompileProgram allows an application to specify that single precision floating-point divide (x/y and 1/x) and sqrt used in the program source are correctly rounded. If this build option is not specified, the minimum numerical accuracy of single precision floating-point divide and sqrt are as defined in section 7.4 of the OpenCL specification. This build option can only be specified if CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT is set in CL_DEVICE_SINGLE_FP_CONFIG (as defined in the table of allowed values for param_name for clGetDeviceInfo) for the devices for which the program is being built. clBuildProgram or clCompileProgram will fail to compile the program for a device if the -cl-fp32-correctly-rounded-divide-sqrt option is specified and CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT is not set for the device.
Other Options
-cl-std	CL version supported	Determine the OpenCL C language version to use. A value for this option must be provided. Valid values are: CL1.1 - Support all OpenCL C programs that use the OpenCL C language features defined in section 6 of the OpenCL 1.1 specification; CL1.2 - Support all OpenCL C programs that use the OpenCL C language features defined in section 6 of the OpenCL 1.2 specification.
-cl-kernel-arg-info	Kernel argument info	This option allows the compiler to store information about the arguments of a kernel(s) in the program executable. The argument information stored includes the argument name, its type, the address and access qualifiers used. Refer to the description of clGetKernelArgInfo for information about how to query this information.
-create-library	Create library	Create a library of compiled binaries specified in input_programs argument to clLinkProgram.
-enable-link-options	Enable link options	Allows the linker to modify the library behavior based on one or more link options (described in Program Linking Options, below) when this library is linked with a program executable. This option must be specified with the -create-library option.
-g	Produce debugging information	This is an experimental feature that lets you use the GNU project debugger, GDB, to debug kernels on x86 CPUs running Linux, or cygwin/minGW under Windows. For more details, see Chapter 3, “Debugging OpenCL.” This option does not affect the default optimization of the OpenCL code.
-fper-pointer-uav -fno-per-pointer-uav	Specify that UAV per pointer should be used (HD5XXX and HD6XXX series GPUs only)
-fbin-bif30 -fno-bin-bif30	Allow OpenCL binary to be BIF3.0 format
-fbin-encrypt -fno-bin-encrypt	Generate an encrypted OpenCL binary (not by default)
-save-temps	Store temporary files in current directory	This option dumps intermediate temporary files, such as IL and ISA code, for each OpenCL kernel. If <prefix> is not given, temporary files are saved in the default temporary directory (the current directory for Linux, C:\Users\<user>\AppData\Local for Windows). If <prefix> is given, those temporary files are saved with the given <prefix>. If <prefix> is an absolute path prefix, such as C:\your\work\dir\mydumpprefix, those temporaries are saved under C:\your\work\dir, with mydumpprefix as prefix to all temporary names. For example, under the default directory
-fuse-jit -fno-use-jit	Use JIT for CPU target (disable if debugging is enabled)
-fforce-jit -fno-force-jit	Force use JIT for CPU target (even if debugging is enabled)
-fdisable-avx -fno-disable-avx	Disable AVX code generation
-ffma-enable -fno-fma-enable	Enable fma for a*b+c
-fuse-native	Replace math function calls with native version
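Several of the math options in the table imply others: -cl-fast-relaxed-math sets -cl-finite-math-only and -cl-unsafe-math-optimizations, and the latter includes -cl-no-signed-zeros and -cl-mad-enable. The sketch below expands a list of options into everything it implies. It is a reading aid, assuming only the relationships stated in the table; `IMPLIES` and `effective_options` are hypothetical names, not part of any OpenCL or CodeXL API.

```python
# Implication rules taken from the option descriptions in the table above.
IMPLIES = {
    "-cl-fast-relaxed-math": [
        "-cl-finite-math-only",
        "-cl-unsafe-math-optimizations",
    ],
    "-cl-unsafe-math-optimizations": [
        "-cl-no-signed-zeros",
        "-cl-mad-enable",
    ],
}


def effective_options(options):
    """Return the given options plus everything they imply, in first-seen order."""
    result = []
    pending = list(options)
    while pending:
        opt = pending.pop(0)
        if opt in result:
            continue  # already recorded, so skip duplicates
        result.append(opt)
        pending.extend(IMPLIES.get(opt, []))
    return result


expanded = effective_options(["-cl-fast-relaxed-math"])
print(expanded)
```

Expanding the options this way makes it easier to see which IEEE 754 guarantees a single flag gives up.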

HLSL Build Options Dialog

This dialog helps you choose the correct HLSL build options and avoid the spelling mistakes that can occur when typing the options manually.

To open the dialog, press the button. Click the “HLSL Build Options” node to view the available options.
Once you choose an option, the option text is displayed in the “HLSL Build Command Line” text box below.
This build option string also appears in the toolbar’s build options box after you click the OK button.

As an alternative to selecting options through the radio buttons, you can type a command in the “HLSL Build Command Line” text box. Build options typed in the text box automatically update the relevant controls. For example, typing “D3DCOMPILE_DEBUG” in the lower text box automatically checks the “Debug” check box.

Build Options

The compilation of DirectX shaders can be executed either by directly referencing the D3D compiler DLL or by going through Microsoft’s FXC tool.

The CodeXL installation includes a copy of the Microsoft DirectX compiler DLL: d3dcompiler_47.dll. You may specify a different path if you want CodeXL to use a different d3dcompiler module. If you select the FXC compiler tool, you must specify a path to the location of FXC.exe.

To select the path of the compiler module, choose the ‘Browse…’ option in the combo box. A dialog box opens for selecting the compiler file.

· For D3D compiler – any file called d3dcompiler_*.dll can be selected.

· For FXC compiler – only files named FXC.exe can be selected.
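The two file-name rules above can be expressed as a short check. This is a sketch rather than CodeXL's own validation: `is_valid_compiler_file` is a hypothetical helper, the pattern `d3dcompiler_*.dll` is assumed from the bundled d3dcompiler_47.dll, and case-insensitive matching is an assumption based on Windows file-name conventions.

```python
import fnmatch
import ntpath


def is_valid_compiler_file(kind, path):
    """Check a selected file name against the rule for the chosen compiler kind.

    kind is "d3d" for the D3D compiler DLL or "fxc" for the FXC tool.
    """
    # ntpath parses Windows-style paths regardless of the host platform.
    name = ntpath.basename(path).lower()
    if kind == "d3d":
        return fnmatch.fnmatchcase(name, "d3dcompiler_*.dll")
    if kind == "fxc":
        return name == "fxc.exe"
    raise ValueError(f"unknown compiler kind: {kind!r}")


print(is_valid_compiler_file("d3d", r"C:\Tools\d3dcompiler_47.dll"))
print(is_valid_compiler_file("fxc", r"C:\Tools\fxc2.exe"))
```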

D3D compile command	FXC compile command	Build Option	Description
-D		Predefined macros	Predefined macros should be separated by ';'. If a predefined macro needs to include a space, enclose the macro within parentheses. The macro won’t appear in the “HLSL Build Command Line”.
-I		Additional include directories	Additional include directories should be separated by ';'. If the directory path includes a space, enclose the path within parentheses. The include won’t appear in the “HLSL Build Command Line”.
D3DCOMPILE_AVOID_FLOW_CONTROL	/Gfa	Avoid Flow Control	Directs the compiler to not use flow-control constructs where possible.
D3DCOMPILE_DEBUG	/Zi	Debug	Directs the compiler to insert debug file/line/type/symbol information into the output code.
D3DCOMPILE_ENABLE_BACKWARDS_COMPATIBILITY	/Gec	Enable Backwards Compatibility	Directs the compiler to enable older shaders to compile to 5_0 targets.
D3DCOMPILE_ENABLE_STRICTNESS	/Ges	Enable Strictness	Forces strict compile, which might not allow for legacy syntax. By default, the compiler disables strictness on deprecated syntax.
D3DCOMPILE_FORCE_PS_SOFTWARE_NO_OPT		Force Pixel Shader Optimization Off	Directs the compiler to compile a pixel shader for the next highest shader profile. This constant also turns debugging on and optimizations off.
D3DCOMPILE_FORCE_VS_SOFTWARE_NO_OPT		Force Vertex Shader Optimizations Off	Directs the compiler to compile a vertex shader for the next highest shader profile. This constant turns debugging on and optimizations off.
D3DCOMPILE_IEEE_STRICTNESS	/Gis	IEEE Strictness	Forces the IEEE strict compile.
D3DCOMPILE_NO_PRESHADER	/Op	No Preshader	Directs the compiler to disable Preshaders. If you set this constant, the compiler does not pull out static expression for evaluation.
* D3DCOMPILE_SKIP_OPTIMIZATION * D3DCOMPILE_OPTIMIZATION_LEVEL0 * .. (no flag for default optimization) * D3DCOMPILE_OPTIMIZATION_LEVEL2 * D3DCOMPILE_OPTIMIZATION_LEVEL3	* /Od * /O0 * .. (no flag for default) * /O1 * /O2 * /O3	Optimization Level: * Skip Optimization * Level 0 - Lowest optimization * Level 1 – Default Optimization * Level 2 * Level 3 - Highest optimization	Directs the compiler to use the specified level of optimization: * Skip - skip optimization steps during code generation. We recommend that you set this constant for debug purposes only. * Lowest level - If you set this constant, the compiler might produce slower code but produces the code more quickly. Set this constant when you develop the shader iteratively. * Second lowest level - Second highest level. * Highest level - If you set this constant, the compiler produces the best possible code but might take significantly longer to do so. Set this constant for final builds of an application when performance is the most important factor.
D3DCOMPILE_PACK_MATRIX_COLUMN_MAJOR	/Zpc	Pack Matrix Column Major	Directs the compiler to pack matrices in column-major order on input and output from the shader. This type of packing is generally more efficient because a series of dot-products can then perform vector-matrix multiplication.
D3DCOMPILE_PACK_MATRIX_ROW_MAJOR	/Zpr	Pack Matrix Row Major	Directs the compiler to pack matrices in row-major order on input and output from the shader.
D3DCOMPILE_PARTIAL_PRECISION	/Gpp	Partial Precision	Directs the compiler to perform all computations with partial precision. If you set this constant, the compiled code might run faster on some hardware.
D3DCOMPILE_PREFER_FLOW_CONTROL	/Gfp	Prefer Flow Control	Directs the compiler to use flow-control constructs where possible.
D3DCOMPILE_RESOURCES_MAY_ALIAS	/res_may_alias	Resources May Alias	Directs the compiler to assume that unordered access views (UAVs) and shader resource views (SRVs) may alias for cs_5_0. Note: This is a new compiler symbol (supported as of D3dcompiler_47.dll).
D3DCOMPILE_SKIP_VALIDATION	/Vd	Skip Validation	Directs the compiler not to validate the generated code against known capabilities and constraints. We recommend that you use this constant only with shaders that have been successfully compiled in the past. DirectX always validates shaders before it sets them to a device.
D3DCOMPILE_WARNINGS_ARE_ERRORS	/WX	Warnings Are Errors	Directs the compiler to treat all warnings as errors when it compiles the shader code. We recommend that you use this constant for new shader code, so that you can resolve all warnings and lower the number of hard-to-find code defects.
	/Lx	Output hexadecimal literals
	/Ni	Numbering of instructions in assembly listings
	/No	Output instruction byte offset in assembly listings
	/Qstrip_debug	Strip debug data from 4.0+ shader bytecode
	/Qstrip_priv	Strip private data from 4.0+ shader bytecode
	/Qstrip_reflect	Strip reflection data from 4.0+ shader bytecode

· Note: some of the flags are only relevant to the FXC tool.
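When moving between the two compile paths, it can help to translate D3D compile constants into the matching FXC switches. The following sketch encodes the single-flag rows of the table; `D3D_TO_FXC` and `to_fxc_switches` are hypothetical names, and the optimization-level rows are left out because the table lists them as a group.

```python
# D3D constant -> FXC switch, copied from the table rows that list both columns.
D3D_TO_FXC = {
    "D3DCOMPILE_AVOID_FLOW_CONTROL": "/Gfa",
    "D3DCOMPILE_DEBUG": "/Zi",
    "D3DCOMPILE_ENABLE_BACKWARDS_COMPATIBILITY": "/Gec",
    "D3DCOMPILE_ENABLE_STRICTNESS": "/Ges",
    "D3DCOMPILE_IEEE_STRICTNESS": "/Gis",
    "D3DCOMPILE_NO_PRESHADER": "/Op",
    "D3DCOMPILE_PACK_MATRIX_COLUMN_MAJOR": "/Zpc",
    "D3DCOMPILE_PACK_MATRIX_ROW_MAJOR": "/Zpr",
    "D3DCOMPILE_PARTIAL_PRECISION": "/Gpp",
    "D3DCOMPILE_PREFER_FLOW_CONTROL": "/Gfp",
    "D3DCOMPILE_RESOURCES_MAY_ALIAS": "/res_may_alias",
    "D3DCOMPILE_SKIP_VALIDATION": "/Vd",
    "D3DCOMPILE_WARNINGS_ARE_ERRORS": "/WX",
}


def to_fxc_switches(d3d_flags):
    """Split constants into (FXC switches, constants with no FXC entry in the table)."""
    switches, unmapped = [], []
    for flag in d3d_flags:
        if flag in D3D_TO_FXC:
            switches.append(D3D_TO_FXC[flag])
        else:
            unmapped.append(flag)
    return switches, unmapped


switches, unmapped = to_fxc_switches(
    ["D3DCOMPILE_DEBUG", "D3DCOMPILE_WARNINGS_ARE_ERRORS",
     "D3DCOMPILE_FORCE_PS_SOFTWARE_NO_OPT"]
)
print(" ".join(switches), unmapped)
```

Constants without an FXC column entry, such as D3DCOMPILE_FORCE_PS_SOFTWARE_NO_OPT, are returned separately so they are not silently dropped.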
