WebCab Probability and Statistics for COM v3.6

GeneralLinearFactorModel Class

Offers the ability to find the function of best fit in accordance with the least squares approach where the class of functions considered is built from a factor model.

For a list of all members of this type, see GeneralLinearFactorModel Members.

System.Object
   GeneralLinearFactorModel

public class GeneralLinearFactorModel

Remarks

The factor model builds the class of functions considered by iteratively adding, term by term, the basis functions from which the class is constructed (see below for details). The key advantage of the factor model over implementing the function basis interface FunctionBasis and passing it to the GeneralLinear class is that this approach requires only static function calls to define the class of functions from which the best fit function is selected.

Details concerning the Function of Best Fit

The function of best fit is constructed as a linear combination of basis functions f1(x), f2(x), ..., fn(x), where the basis functions are set using the iterative basis building procedure detailed below. That is, the function of best fit can be any function which takes the form:

a1 * f1(x) + a2 * f2(x) + ... + an * fn(x),

where a1,..., an are any real numbers.
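The form above can be illustrated with a minimal Python sketch. This is not the WebCab API; the helper name linear_combination and the three basis functions chosen here are assumptions made purely for illustration.

```python
import math

def linear_combination(coeffs, basis):
    """Return the function x -> a1*f1(x) + ... + an*fn(x)."""
    if len(coeffs) != len(basis):
        raise ValueError("need exactly one coefficient per basis function")

    def f(x):
        return sum(a * fk(x) for a, fk in zip(coeffs, basis))

    return f

# Hypothetical basis: f1(x) = x, f2(x) = cos(x), f3(x) = 1
basis = [lambda x: x, math.cos, lambda x: 1.0]

# Coefficients a1 = 2, a2 = -1, a3 = 0.5
f = linear_combination([2.0, -1.0, 0.5], basis)

# At x = 0: 2*0 - 1*cos(0) + 0.5*1
value_at_zero = f(0.0)
```

The fitting step only has to choose the coefficients; the basis functions themselves stay fixed.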

Providing the Function to Fit via the Factor Model

The function of best fit is constructed as the linear sum of functions f1(x), f2(x), ..., fn(x), where each of the individual functions is given by calling one of the methods:

  1. AddConstantTerm - Add a constant basis function.
  2. AddSumOfPowerTerms - Add a basis function which is a sum of power terms.
  3. AddRationalSumOfPowerTerms - Add a basis function which is a rational function whose numerator and denominator are themselves sums of power terms.
  4. AddCosineSum - Add a basis function which is a sum of Cosine functions.
  5. AddSineSum - Add a basis function which is a sum of Sine functions.
  6. AddTanSum - Add a basis function which is a sum of Tangent functions.
  7. AddLogSum - Add a basis function which is a sum of Logarithm functions.
  8. AddExpSum - Add a basis function which is a sum of Exponential functions.
  9. AddAbsoluteSum - Add a basis function which is a sum of absolute functions.
  10. AddStepSum - Add a basis function which is a sum of step functions.
  11. AddTabulatedFunction - Add a basis function which is constructed from a set of tabulation points which are then interpolated in order to construct the function.

That is, we construct the function which is to be fitted by iteratively adding terms to the basis elements from which it is constructed. We illustrate exactly how this works with the following example.

Example of the Factor Model Function Builder

Say we wish to fit a function of the following type to the data set considered:

f(x) = (a1 * x) + (a2 * sin(x+3) ) + (a3 * 2)

where the terms a1, a2, a3 are just constants which are yet to be decided.

Then in order to build this model we would need to make the following (ordered) method calls:

  1. AddSumOfPowerTerms - To add the x term you need to pass the parameters: powerTermsCoeff = {1}, powerTermsExponents = {1}.
  2. AddSineSum - The above sine term is added by passing the parameters: sinCoeff = {1}, coefficientf = {1}, constant = {3}, exponent = {1}.
  3. AddConstantTerm - Passing the value 2, corresponding to the constant.

Once you have added the basis elements of the linear model, the variables of the linear model, namely a1, a2, a3, will be determined by the General Linear fitting algorithm (i.e. SetGeneralFit). Moreover, once the optimal values for these variables have been found in accordance with the least squares methodology, the order in which they are returned by the General Linear fitting algorithm will correspond to the order in which the elements of the factor model were set.

In the example given above, if the factor model is set in the order given, then once the General Linear algorithm is called the returned results array will consist of three elements: the first is the fitted value of a1, the second is the fitted value of a2, and the third is the fitted value of a3.
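The example can be mimicked outside the library with a short Python sketch. It does not call the WebCab methods; instead it defines the three basis functions directly, in the same order as the method calls above, and fits them by solving the unweighted normal equations. The helper names, the synthetic data and the "true" coefficients (2, 0.5, 3) are all assumptions made for illustration, and the class itself may use a different numerical method.

```python
import math

def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("basis functions are linearly dependent on this data")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def fit_general_linear(xs, ys, basis):
    """Unweighted least squares via the normal equations (F^T F) a = F^T y."""
    F = [[fk(x) for fk in basis] for x in xs]  # one row per data point
    n = len(basis)
    FtF = [[sum(row[j] * row[k] for row in F) for k in range(n)] for j in range(n)]
    Fty = [sum(row[j] * y for row, y in zip(F, ys)) for j in range(n)]
    return solve_linear_system(FtF, Fty)

# Basis in the same order as the three method calls of the example.
basis = [
    lambda x: x,                # f1(x) = x
    lambda x: math.sin(x + 3),  # f2(x) = sin(x + 3)
    lambda x: 2.0,              # f3(x) = 2
]

# Synthetic, noise-free data generated from assumed coefficients.
true_coeffs = [2.0, 0.5, 3.0]
xs = [0.5 * i for i in range(20)]
ys = [sum(a * fk(x) for a, fk in zip(true_coeffs, basis)) for x in xs]

# The fitted coefficients come back in the order the basis was built.
a1, a2, a3 = fit_general_linear(xs, ys, basis)
```

Because the data here are noise-free, the fit recovers the assumed coefficients up to rounding, and their positions in the result match the order in which the basis functions were listed.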

The Least Squares Approach

Strictly speaking, the function of best fit is the linear combination of basis functions whose coefficients are selected as follows. The coefficients are chosen so that the resulting function has the maximum likelihood of being the best fit, in accordance with the least squares approach, given the measurement errors of the y-axis coordinates of the known data points.

This implementation offers the ability to incorporate measurement errors in the observed data to which the function is fitted. In particular, we assume that the measurement errors of the yi's are independent and Normally (Gaussian) distributed about a true value. It follows from these assumptions (though not rigorously) that the most likely coefficients are obtained by finding the coefficients a1, ..., an such that the sum of the squares of the terms:

( yi - (a1 * f1(xi) + ... + an * fn(xi)) ) / sigmai,

is minimized, where we sum over i for 1 <= i <= m; m is the number of data points (xi, yi); f1, ..., fn are the function basis elements; and sigmai is the standard deviation of the measurement error of yi.

Remarks:

  1. Measurement Error Unknown: If you do not wish to incorporate measurement error in the selection of the maximum likelihood coefficients, then set the standard deviation of each of the errors to 1.0.
  2. Weighted Best Fit: You are also able to apply this class to find the best weighted fit, where 1 / sigmai corresponds to the (relative) weight applied to the i-th data point.
  3. Chi-Squared Measure: The value of the sum of the squares of the above terms is referred to as the Chi-Squared measure. The measure is non-negative, and the lower its value the better (globally, over the range of values) the selected curve fits the given data.
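The remarks above can be made concrete with a small Python sketch of the Chi-Squared sum. The function name chi_squared, the straight-line model and the data values are assumptions made for illustration and are not part of the library.

```python
def chi_squared(xs, ys, sigmas, basis, coeffs):
    """Sum over i of ((y_i - sum_k a_k * f_k(x_i)) / sigma_i) ** 2."""
    total = 0.0
    for x, y, sigma in zip(xs, ys, sigmas):
        model = sum(a * fk(x) for a, fk in zip(coeffs, basis))
        total += ((y - model) / sigma) ** 2
    return total

# Hypothetical model 1 + 2x and three data points.
basis = [lambda x: 1.0, lambda x: x]
coeffs = [1.0, 2.0]
xs = [0.0, 1.0, 2.0]
ys = [1.5, 3.0, 4.0]  # residuals are 0.5, 0.0 and -1.0

# Remark 1: all sigmas equal to 1.0 gives the plain sum of squared residuals.
unit = chi_squared(xs, ys, [1.0, 1.0, 1.0], basis, coeffs)

# Remark 2: a larger sigma on the last point gives it a smaller weight.
weighted = chi_squared(xs, ys, [1.0, 1.0, 2.0], basis, coeffs)
```

With unit standard deviations the result is 1.25; doubling the standard deviation of the last point quarters its contribution and the result drops to 0.5.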

Using this class

In order to apply this class you must perform the following steps:

  1. Set the Basis Functions: by iteratively building the basis elements by calling the 'add...' functions.
  2. Fit the Function set to the Data: using SetGeneralFit

Once the function has been fit you are able to 'read' the following quantitative information about the fitted function:

  1. Return the value of Chi-Squared using GetChiSquare
  2. Return the coefficients in the function of best fit using GetCoefficients
  3. Evaluate the value of the function of best fit at a given point using BestFitValue

Evaluating the Goodness and Significance of the Fit

Once the curve has been fitted you are able to measure the goodness and significance of the fit using the following functionality:

  1. Measures of the 'distance from' the Regression Line
    1. SumSquaresError - The sum of the squares due to error (SSE).
    2. MeanSquaresError - The mean square due to error (MSE).
    3. StandardError - The Standard Error of the Estimate.
    4. TotalSumSquares - The Total Sum of Squares (SST).
    5. SumSquaresRegression - The Sum of Squares Due to Regression (SSR).
    6. MeanSquaresRegression - The mean square due to regression (MSR).
  2. Measures of the Goodness-of-Fit
    1. RSquared - The multiple R2 (or multiple coefficient of determination).
    2. RSquaredAdjusted - The Adjusted R2 (or Adjusted Multiple Coefficient of Determination).
    3. FTest - The F-Test Statistic.
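As a rough guide to what these quantities measure, the following Python sketch computes them from the usual textbook definitions for an unweighted fit. It is an assumption that the class uses exactly these definitions; in particular, the degrees of freedom used here (m - n for the error and n - 1 for the regression, with n fitted coefficients including a constant term) are a common convention and are not stated in this documentation. The function name and sample values are hypothetical.

```python
import math

def goodness_of_fit(ys, fitted, n_coeffs):
    """Textbook regression summaries for observed ys and fitted values."""
    m = len(ys)
    mean_y = sum(ys) / m
    sse = sum((y - yhat) ** 2 for y, yhat in zip(ys, fitted))  # error
    sst = sum((y - mean_y) ** 2 for y in ys)                   # total
    ssr = sum((yhat - mean_y) ** 2 for yhat in fitted)         # regression
    mse = sse / (m - n_coeffs)   # assumed error degrees of freedom
    msr = ssr / (n_coeffs - 1)   # assumed regression degrees of freedom
    r2 = 1.0 - sse / sst
    return {
        "SSE": sse,
        "MSE": mse,
        "StandardError": math.sqrt(mse),
        "SST": sst,
        "SSR": ssr,
        "MSR": msr,
        "RSquared": r2,
        "RSquaredAdjusted": 1.0 - (1.0 - r2) * (m - 1) / (m - n_coeffs),
        "FTest": msr / mse,
    }

# Hypothetical observed and fitted values for a two-coefficient model.
stats = goodness_of_fit([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5], 2)
```

For a least squares fit whose basis includes a constant term, SST equals SSE plus SSR, so R2 can equivalently be read as SSR / SST.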

Requirements

Namespace: WebCab.COM.Statistics.CurveFitting

Assembly: WebCab.COM.Statistics (in WebCab.COM.Statistics.dll)

See Also

GeneralLinearFactorModel Members | WebCab.COM.Statistics.CurveFitting Namespace