Algorithms - Mathematics and Statistics

OSTRICH Mathematics and Statistics features

This group configures the finite-difference method employed by the algorithms in Table 1 that have shaded entries in the “Math and Stats?” column. If calibration is being performed, additional variables in this group can be used to request various statistical and diagnostic outputs. The general format of the MathAndStats group is given in Figure 1, and Figure 2 provides a concrete example.

BeginMathAndStats
DiffType         [dtype]
DiffIncType      [ditype]
DiffRelIncrement [drinc]
DiffIncrement    [dinc]
DiffMinIncrement [dmin]
CI_Pct           [cipct]

. . .

BeginPredictions
  --- see section 2.15
EndPredictions
EndMathAndStats

Figure 1: General Format of the Math and Stats Group

As shown in Figure 1, a “Predictions” sub-group can be used to instruct OSTRICH to compute confidence limits on predicted quantities of a calibrated model. The format of the Predictions sub-group is identical to the Response Variables group.

BeginMathAndStats
DiffType forward
DiffIncType value-relative
DiffIncrement 0.01
DiffMinIncrement 1.00E-6
CI_Pct 0.95
AllStats
ExcludeInsensitiveParameters
ExcludeInsensitiveObservations
WriteResidualsEachIteration

BeginPredictions
#name   filename      key  line    col     token augmented?
MinDef  CanBeam.out ; Umin   2      2       '='    no
MaxDef  CanBeam.out ; Umax   3      2       '='    no
EndPredictions
EndMathAndStats

Figure 2: Example of the Math and Stats Group
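The derivative machinery configured in Figure 2 can be illustrated with a short sketch. This is not OSTRICH source code — the model function and parameter bounds are hypothetical — but the increment and difference formulas follow the DiffType and DiffIncType descriptions given below:

```python
def fd_increment(p_val, p_min, p_max, inc=0.01,
                 inc_type="value-relative", min_inc=1.0e-6):
    # Mirror the DiffIncType options: range-relative scales the
    # parameter's range, value-relative scales its current value,
    # and absolute uses the increment directly.
    if inc_type == "range-relative":
        dp = inc * (p_max - p_min)
    elif inc_type == "value-relative":
        dp = inc * abs(p_val)
    elif inc_type == "absolute":
        dp = inc
    else:
        raise ValueError("unsupported DiffIncType: %s" % inc_type)
    # DiffMinIncrement: never let the increment shrink below min_inc.
    return max(abs(dp), min_inc)

def forward_diff(model, p, i, dp):
    # DiffType "forward": one extra model run per parameter.
    p_hi = list(p)
    p_hi[i] += dp
    return (model(p_hi) - model(p)) / dp

def central_diff(model, p, i, dp):
    # DiffType "outside" (central): two extra runs per parameter,
    # twice the cost of forward differences but generally more accurate.
    p_lo, p_hi = list(p), list(p)
    p_lo[i] -= dp
    p_hi[i] += dp
    return (model(p_hi) - model(p_lo)) / (2.0 * dp)
```

With the Figure 2 settings (value-relative increments, DiffIncrement 0.01), a parameter currently valued at 2.0 would be perturbed by 0.02.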

Variables in the Math and Stats group are described below:

  • DiffType: This variable selects the type of finite-difference approach to use for approximating derivatives (i.e. ∂Y/∂p, where Y is some model output and p is a parameter). Options are “forward” (forward differences), “outside” (outside central differences), “parabolic” (parabolic central differences), and “best-fit” (linear central differences). Central differences require twice as much computation as forward differences but can be more accurate. The default value is “forward”.
  • DiffIncType: The type of increment used in the selected finite-difference approach. Supported values for this variable are described below, and the default is “range-relative”:

    • range-relative: Increments will be computed by multiplying the range of a given parameter by the value of the “DiffIncrement” or “DiffRelIncrement” variable.
    • value-relative: Increments will be computed by multiplying the current value of a given parameter by the value of the “DiffIncrement” variable.
    • absolute: Increments will be directly specified by the value of the “DiffIncrement” variable.
    • optimal: Finite-difference increments will be computed according to the iterative procedure outlined by Yager (2004).
  • DiffRelIncrement: When this variable is assigned, the program will use a range-relative finite-difference increment, irrespective of the value of DiffIncType. The value of this variable can be a single value that will be applied to each parameter, or a space-separated list of values corresponding to each parameter listed in the parameters group. The default value is 0.001 for all parameters.
  • DiffIncrement: The value used in computing a finite-difference for each parameter. The value of this variable can be a single value that will be applied to each parameter, or a space-separated list of values corresponding to each parameter listed in the parameters group. The default value is 0.001 for all parameters.
  • DiffMinIncrement: The minimum increment that will be used, irrespective of the computed value. The default value is 1.00E-20.
  • CI_Pct: The desired confidence level for computing linear confidence intervals on parameters and predictions. The default value is 95.
  • Stat1 .. StatN: These entries serve as flags to select various statistical output. Options are described below. The default flags are “NoStats”, “ExcludeInsensitiveParameters”, and “ExcludeInsensitiveObservations”:

    • Default: Selects a default list of parameter statistics, including correlation, standard error, and linear confidence intervals.
    • AllStats: Enables all available statistical output.
    • NoStats: Disables all statistical output.
    • BestBoxCox: Instructs OSTRICH to compute an estimate of the best Box-Cox power transformation for obtaining normalized residuals.
    • StdDev: Selects standard deviation of the regression (i.e. root mean squared error, RMSE).
    • StdErr: Selects parameter standard error.
    • CorrCoeff: Selects parameter correlation matrix.
    • NormPlot: Selects plot points for a normal probability plot along with corresponding R2N value.
    • Beale: Selects Beale’s linearity measure.
    • Linssen: Selects Linssen’s linearity measure.
    • CooksD: Selects the Cook’s D measure of observation influence.
    • DFBETAS: Selects the DFBETAS measure of observation influence.
    • Matrices: Selects non-linear regression matrices, including the Jacobian, normal and inverse normal matrices.
    • Confidence: Selects linear confidence intervals on estimated parameters.
    • Sensitivity: Selects measures of parameter sensitivity, including composite and dimensionless scaled sensitivities.
    • RunsTest: Selects the runs test for serial correlation among residuals.
    • AutorunFunction: Selects the autorun function for serial correlation among residuals.
    • MMRI: Selects various information-theoretic measures for assessing multi-model ranking and inference.
    • ExcludeInsensitiveParameters: If present, insensitive parameters will be excluded from statistical calculations. This can help avoid problems with singular matrices.
    • IncludeInsensitiveParameters: If present, insensitive parameters will be included in statistical calculations.
    • ExcludeInsensitiveObservations: If present, insensitive observations will be excluded from statistical calculations. This can help avoid problems with singular matrices.
    • IncludeInsensitiveObservations: If present, insensitive observations will be included in statistical calculations.
    • WriteResidualsEachIteration: If present, a residuals file will be created for each iteration or step of the algorithm and named OstResiduals_P*_S*.txt. The P* portion of the filename will identify the processor (i.e. rank) and the S* portion of the filename will identify the iteration (i.e. step). The file will list the residuals associated with the best-fit parameter set discovered by the algorithm up to the indicated algorithm iteration (i.e. step). This option only applies to the WSSE objective function and is not available for the following algorithms: DDSAU, GLUE, MCMC, PADDS, ParaPADDS, RJSMP, BEERS, and SMOOTH.
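The relationship among the StdDev, StdErr, and CI_Pct outputs described above can be sketched as follows. This is an illustrative calculation, not OSTRICH code; the Student-t critical value is supplied by the caller because Python's standard library has no t-distribution quantile function:

```python
import math

def regression_std_dev(residuals, n_params):
    # StdDev: standard deviation of the regression (i.e. RMSE),
    # s = sqrt(WSSE / (n - k)) for n observations and k parameters.
    n = len(residuals)
    wsse = sum(r * r for r in residuals)
    return math.sqrt(wsse / (n - n_params))

def linear_ci(p_hat, std_err, t_crit):
    # Linear confidence interval on an estimated parameter:
    # p_hat +/- t_crit * StdErr, where t_crit is the two-sided
    # Student-t value for the chosen CI_Pct level and (n - k)
    # degrees of freedom.
    half = t_crit * std_err
    return (p_hat - half, p_hat + half)
```

For example, at a 95% confidence level with 10 degrees of freedom, t_crit is approximately 2.228, so a parameter estimated at 1.5 with standard error 0.2 has the linear interval (1.0544, 1.9456).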

References

Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19, 716-723.

Beale, E. M. 1968. Confidence Regions in Non-linear Estimation. Journal of the Royal Statistical Society, Series B 22, 41-88.

Belsley, D., Kuh, E., Welsch, R. 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley & Sons, New York (NY).

Belsley, D. A., Kuh, E., Welsch, R. E. 2005. Regression diagnostics: Identifying influential data and sources of collinearity. John Wiley & Sons.

Carroll, R. J., Ruppert, D. 1988. Transformation and weighting in regression. CRC Press.

Chatterjee, S., Hadi, A. S. 1986. Influential observations, high leverage points, and outliers in linear regression. Statistical Science 379-393.

Cook, R., Weisberg, S. 1982. Residuals and Influence in Regression. Chapman and Hall, New York (NY).

Draper, N. R., Smith, H., Pownell, E. 1966. Applied regression analysis. Wiley, New York.

Filliben, J. J. 1975. The Probability Plot Correlation Coefficient Test for Normality. Technometrics 17, 111-117.

Hannan, E., Quinn, B. 1979. The determination of the order of an autoregression. Journal of the Royal Statistical Society, Series B 41, 190–195.

Hurvich, C. M., Tsai, C. L. 1993. A corrected Akaike information criterion for vector autoregressive model selection. Journal of Time Series Analysis 14, 271-279.

Hurvich, C. M., Tsai, C.-L. 1994. Autoregressive model selection in small samples using a bias-corrected version of AIC. Kluwer Academic Publishers, Dordrecht, Netherlands.

Linssen, H. N. 1975. Nonlinearity measures: A case study. Statistica Neerlandica 29, 93-99.

Looney, S. W., Gulledge Jr, T. R. 1985. Use of the correlation coefficient with normal probability plots. The American Statistician 39, 75-79.

McKenzie, E. 1984. The autorun function: A non-parametric autocorrelation function. Journal of Hydrology 67, 45-53.

Sakia, R. 1992. The Box-Cox transformation technique: a review. The Statistician 169-178.

Schwarz, G. 1978. Estimating the dimension of a model. Annals of Statistics 6, 461–464.

Seber, G. A., Wild, C. J. 1989. Nonlinear Regression. John Wiley and Sons, New York (NY).

Straume, M., Johnson, M. L. 2010. Analysis of residuals: criteria for determining goodness-of-fit. Essential Numerical Computer Methods 37.