Guidelines for Installing and Running GLUE Program
Jianqiang He, Cheryl Porter, Paul Wilkens, Fabio Marin, Howard Hu, and James W. Jones July 7, 2010
The GLUE (Generalized Likelihood Uncertainty Estimation) program estimates genotype-specific coefficients for the DSSAT crop models. It is a Bayesian estimation method that uses Monte Carlo sampling from prior distributions of the coefficients and a Gaussian likelihood function to determine the best coefficients based on the data used in the estimation process.
The GLUE program allows users to select a crop and then a cultivar to be estimated. The program then identifies all experiments and treatments in the DSSAT data files for that crop that have measurements for the cultivar. The user can then select one or more experiments and treatments to be used in the coefficient estimation process. The user can also specify whether to estimate only the coefficients that control phenological development, only those that govern expansive and dry matter growth, or both sets. Generally, one would want to estimate all parameters.
When both sets are estimated, the GLUE program makes 3,000 simulation runs for the phenology coefficients and another 3,000 runs for the growth coefficients. In each run, the program randomly generates values for the parameters being estimated (either phenology or growth) from their prior distributions and runs the model with them. The simulated outputs are then compared with the observed variables, and the parameter set with the maximum likelihood value is selected, first for the phenology parameters and then for the growth parameters. The program also computes the uncertainty (variance) of the estimate for each parameter.
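The sampling and selection steps described above can be illustrated with a minimal sketch. It is not the GLUE program's code: toy_model is a made-up stand-in for a crop model run, coef_a and coef_b are hypothetical coefficients, and the likelihood-weighted variance is only one plausible way the parameter uncertainties might be computed.

```python
import math
import random


def toy_model(params):
    """Hypothetical stand-in for one crop model run; not a DSSAT model."""
    flowering = 40.0 + 0.15 * params["coef_a"]
    maturity = flowering + 0.06 * params["coef_b"]
    return {"flowering_day": flowering, "maturity_day": maturity}


def log_likelihood(simulated, observed, variances):
    """Gaussian log-likelihood summed over observed variables (assumed independent)."""
    total = 0.0
    for name, obs in observed.items():
        var = variances[name]
        total -= 0.5 * math.log(2.0 * math.pi * var)
        total -= (obs - simulated[name]) ** 2 / (2.0 * var)
    return total


def glue_estimate(model, priors, observed, variances, n_runs=3000, seed=0):
    """Sample uniform priors, score each run, return best set and variances."""
    rng = random.Random(seed)
    samples = []
    log_liks = []
    for _ in range(n_runs):
        # Draw each coefficient from its uniform prior range (min, max).
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in priors.items()}
        samples.append(params)
        log_liks.append(log_likelihood(model(params), observed, variances))

    # Maximum likelihood parameter set among the sampled runs.
    best_index = max(range(n_runs), key=lambda i: log_liks[i])
    best = samples[best_index]

    # Likelihood-weighted variance per coefficient (an assumption of this sketch).
    peak = log_liks[best_index]
    weights = [math.exp(ll - peak) for ll in log_liks]
    weight_sum = sum(weights)
    spread = {}
    for name in priors:
        mean = sum(w * s[name] for w, s in zip(weights, samples)) / weight_sum
        spread[name] = (
            sum(w * (s[name] - mean) ** 2 for w, s in zip(weights, samples))
            / weight_sum
        )
    return best, spread


# Synthetic example: "observations" generated from known coefficient values.
priors = {"coef_a": (100.0, 400.0), "coef_b": (600.0, 1000.0)}
observed = toy_model({"coef_a": 250.0, "coef_b": 800.0})
variances = {"flowering_day": 4.0, "maturity_day": 4.0}
best, spread = glue_estimate(toy_model, priors, observed, variances)
```

In this synthetic case the selected set should fall close to the values used to generate the observations, and the weighted variances should be much smaller than those of the uniform priors. Weights are computed relative to the largest log-likelihood so that the exponentials do not underflow.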
The maximum likelihood coefficients are written to a file in the same format as the cultivar file for the selected crop. These values can be copied into the CUL file (e.g., MZCER045.CUL or SBGRO045.CUL) for use in routine DSSAT applications and further model evaluations.
What measurements are used to estimate the coefficients? For the development coefficients, the dates of first flower, first reproductive organ appearance, and physiological maturity are all used. For the growth coefficients, final grain yield, aboveground biomass, maximum leaf area during the season, final pod weight, final main stem leaf number, and unit grain weight are used. Thus, the measurements that go into File A in DSSAT are used; these are variables measured only once during the season, most of them at harvest.
There are several assumptions that may have important effects on the resulting parameters. First are the prior distributions of coefficients, which are stored in a file called ParameterProperty.xls. This file has information for all of the DSSAT v4.5 crops. We assumed that the parameters have uniform distributions with minimum and maximum values. This is a conservative assumption, and values are provided in the files based on previous work with the models. A second assumption is that the final errors between simulated and observed values are normally distributed and are unbiased. The assumed values of the variances are given in a file named MeasurementVariances.xls. This assumption may be a problem, particularly if the model is not able to describe responses for a particular experiment very well or if observations are not reliable. Another problem will occur if the experiment had water, nutrient, or other stresses that are either not in the model or that the model does not represent well. Users should only use treatments that are near stress-free conditions, if possible, to minimize these problems. Coefficients estimated using treatments with moderate to severe stress effects will not be reliable. In any case, users should carefully check results from any estimation process to make sure that results are realistic and provide good comparisons to observations used in estimation.
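The two assumptions above can be written compactly. The notation is a sketch and is not taken from the program itself: $\theta$ is a candidate coefficient set, $O_j$ is the $j$-th observation, $Y_j(\theta)$ is the corresponding simulated value, $\sigma_j^2$ is the assumed measurement variance (as supplied in MeasurementVariances.xls), and $[\theta_k^{\min}, \theta_k^{\max}]$ is the prior range of coefficient $k$ (as supplied in ParameterProperty.xls). Treating the $M$ observations as independent is an additional assumption made here for illustration.

```latex
p(\theta_k) =
\begin{cases}
  \dfrac{1}{\theta_k^{\max} - \theta_k^{\min}}, & \theta_k^{\min} \le \theta_k \le \theta_k^{\max} \\[1ex]
  0, & \text{otherwise}
\end{cases}
\qquad
L(\theta \mid O) = \prod_{j=1}^{M} \frac{1}{\sqrt{2\pi\sigma_j^2}}
  \exp\!\left( -\frac{\bigl(O_j - Y_j(\theta)\bigr)^2}{2\sigma_j^2} \right)
```

Written this way, a larger assumed variance flattens the likelihood, so candidate sets are discriminated less sharply. A biased model or an unreliable observation moves the peak away from realistic coefficient values, which is why the choice of treatments matters.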
There are other cautions that users should be aware of. For example, results from an estimation process provide conditional estimates of coefficients. That means that the coefficients are the best set given the measurements that were used, but the coefficients also depend on the set of observations used in the process. Our aim is for the coefficients to be robust and useful across environments, but this may not be the case. Another caution is that coefficients estimated from end of season measurements may not reproduce observed time series results very well if such measurements were made. We have seen this occur in various experiments when only end of season measurements are used, whether using GLUE or other estimation procedures. If users have time series data, these data can be used manually to refine the coefficients estimated from the GLUE procedure. It is possible to use in-season measurements and simulations in this type of Bayesian estimation process, but there are certain complications that make it difficult to create a robust and reliable automated procedure.
The GLUE program is one of two tools in DSSAT for estimating cultivar coefficients for the different crops. The other tool, developed by L. A. Hunt and others, evolved from the GENCALC software available in DSSAT v3.5. There are advantages and disadvantages to each. A disadvantage of the GLUE technique is that it may require a lot of computation time, depending on the number of treatments selected for the estimation process. If there are only 2-5 measurement data sets, one would expect the GLUE procedure to finish its calculations in less than 2 hours. However, if there are many measurement data sets, say more than 15, the GLUE method will likely require several hours of calculations. This is a practical limit. On the other hand, the GLUE method can produce a set of estimated coefficients without intervention by users. It also provides estimates of the uncertainties of the parameters. Because the method does not depend on heuristic rules, it is simple to implement for additional crops as they are added to DSSAT.