|
WebCab Probability and Statistics v3.5 (J2SE Edition) |
java.lang.Object
  +--webcab.lib.statistics.curveFitting.GeneralLinearFactorModel
Finds the function of best fit, in accordance with the least squares approach, where the class of
functions considered is built from a factor model. The factor model builds this class of functions
iteratively, adding term by term the basis functions from which the class is constructed (see below
for details). The key advantage of the factor model over implementing the function basis interface
FunctionBasis and passing it to the GeneralLinear class is that this approach only requires method
calls (no interface implementation) in order to define the class of functions from which the best
fit function is selected.
The function of best fit is constructed as a linear combination of basis functions
f_1(x), f_2(x), ..., f_n(x), where the basis functions are set using the iterative basis building
procedure detailed below. That is, the function of best fit can be any function which takes the form:
a_1 * f_1(x) + a_2 * f_2(x) + ... + a_n * f_n(x),
where a_1,..., a_n are any real numbers.
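The linear combination above can be sketched in a few lines of Java. This is only an illustration of the formula; the class name LinearCombinationSketch and its evaluate method are hypothetical and are not part of the WebCab API.

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only; this is not part of the WebCab API. */
final class LinearCombinationSketch {

    /** Returns a_1 * f_1(x) + ... + a_n * f_n(x) for the given coefficients and basis. */
    static double evaluate(double[] coefficients, DoubleUnaryOperator[] basis, double x) {
        if (coefficients.length != basis.length) {
            throw new IllegalArgumentException("one coefficient is required per basis function");
        }
        double sum = 0.0;
        for (int k = 0; k < basis.length; k++) {
            sum += coefficients[k] * basis[k].applyAsDouble(x);
        }
        return sum;
    }
}
```

Each basis function is held fixed; only the coefficients a_1,..., a_n change during fitting, which is what makes the model linear.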
The function of best fit is constructed as the linear sum of functions: f_1(x), f_2(x),..., f_n(x)
where each of the individual functions is given by calling one of the methods:
addConstantTerm(double) - Add a constant basis function.
addSumOfPowerTerms(double[], double[]) - Add a basis function which is a sum of power terms.
addRationalSumOfPowerTerms(double[], double[], double[], double[]) - Add a basis function which is a rational term where the
denominator and numerator are themselves sums of power terms.
addCosineSum(double[], double[], double[], double[]) - Add a basis function which is a sum of Cosine functions.
addSineSum(double[], double[], double[], double[]) - Add a basis function which is a sum of Sine functions.
addTanSum(double[], double[], double[], double[]) - Add a basis function which is a sum of Tangent functions.
addLogSum(double[], double[], double[], double[]) - Add a basis function which is a sum of Logarithm functions.
addExpSum(double[], double[], double[], double[]) - Add a basis function which is a sum of Exponential functions.
addAbsoluteSum(double[], double[], double[], double[]) - Add a basis function which is a sum of absolute functions.
addStepSum(double[], double[], double[]) - Add a basis function which is a sum of step functions.
That is, we construct the function which is to be fitted by iteratively adding terms to the basis elements from which it is constructed. We illustrate exactly how this works with the following example.
Say we wish to fit a function of the following type to the data set considered:
f(x) = (a_1 * x) + (a_2 * sin(x+3) ) + (a_3 * 2)
where the terms a_1, a_2, a_3 are constants which are yet to be determined.
Then in order to build this model we would need to make the following (ordered) method calls:
addSumOfPowerTerms(double[], double[]) - To add the term x, pass the parameters:
powerTermsCoeff = {1}, powerTermsExponents = {1}.
addSineSum(double[], double[], double[], double[]) - To add the sine term above, pass the parameters:
sinCoeff = {1}, coeff = {1}, constant = {3}, exponent = {1}.
addConstantTerm(double) - Pass the value 2, corresponding to the constant.
Once you have added the basis elements of the linear model, the variables of the linear model,
namely a_1, a_2, a_3, will be determined by the General Linear fitting algorithm (i.e.
setGeneralFit(double[], double[], double[], double[], boolean[])). Moreover, once the optimal values for these variables have been found in
accordance with the least squares methodology, the order in which they are returned by the General Linear
fitting algorithm will correspond to the order in which the elements of the factor model were set.
In the example given above, if the factor model is set in the order given, then once the General
Linear algorithm is called the returned results array will consist of three elements,
where the first element will be the fitted value of a_1, the second element will
be the fitted value of a_2, and the third element will be the fitted value of
a_3.
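To make the ordering concrete, the following self-contained sketch fits the example model to synthetic data and returns the coefficients in the order in which the basis functions were supplied. It is an assumption-laden illustration: it solves the unweighted normal equations by Gaussian elimination, which is not necessarily how the library works, and the class OrderedFitSketch and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only; this is not the WebCab implementation. */
final class OrderedFitSketch {

    /** Unweighted least squares fit; result[k] is the coefficient of basis[k]. */
    static double[] fit(double[] x, double[] y, DoubleUnaryOperator[] basis) {
        int n = basis.length;
        double[][] m = new double[n][n + 1]; // augmented normal equations [A^T A | A^T y]
        for (int i = 0; i < x.length; i++) {
            double[] row = new double[n];
            for (int k = 0; k < n; k++) {
                row[k] = basis[k].applyAsDouble(x[i]);
            }
            for (int r = 0; r < n; r++) {
                for (int c = 0; c < n; c++) {
                    m[r][c] += row[r] * row[c];
                }
                m[r][n] += row[r] * y[i];
            }
        }
        return solve(m);
    }

    /** Gaussian elimination with partial pivoting on an augmented matrix. */
    static double[] solve(double[][] m) {
        int n = m.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) {
                    pivot = r;
                }
            }
            double[] swap = m[col];
            m[col] = m[pivot];
            m[pivot] = swap;
            if (Math.abs(m[col][col]) < 1e-12) {
                throw new ArithmeticException("basis functions are not independent on this data");
            }
            for (int r = col + 1; r < n; r++) {
                double factor = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) {
                    m[r][c] -= factor * m[col][c];
                }
            }
        }
        double[] a = new double[n];
        for (int r = n - 1; r >= 0; r--) { // back substitution
            double s = m[r][n];
            for (int c = r + 1; c < n; c++) {
                s -= m[r][c] * a[c];
            }
            a[r] = s / m[r][r];
        }
        return a;
    }

    public static void main(String[] args) {
        // Basis in the same order as the example: x, sin(x + 3), 2
        DoubleUnaryOperator[] basis = { x -> x, x -> Math.sin(x + 3), x -> 2.0 };
        double[] xs = {0.0, 0.5, 1.0, 1.5, 2.0, 2.5};
        double[] ys = new double[xs.length];
        for (int i = 0; i < xs.length; i++) {
            // Synthetic data generated with a_1 = 1.5, a_2 = -0.5, a_3 = 4
            ys[i] = 1.5 * xs[i] - 0.5 * Math.sin(xs[i] + 3) + 4.0 * 2.0;
        }
        System.out.println(Arrays.toString(fit(xs, ys, basis)));
    }
}
```

Because the data in main are generated without noise, the returned array should recover a_1, a_2, a_3 in that order, up to rounding error.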
When we refer to the function of best fit, strictly speaking we are referring to the linear combination of basis functions whose coefficients are selected such that the resulting function has the maximum likelihood of being the best fit in accordance with the least squares approach, given the measurement errors of the y-axis coordinates of the known data points.
In this implementation we offer the ability to incorporate measurement errors within the observed data to
which the function is fit. In particular, we assume that the measurement errors of the y_i's
are independent and Normally (Gaussian) distributed about a true value. It follows
from these assumptions (though not rigorously) that the most likely coefficients which generate the best fit
are achieved by finding the coefficients a_1, ..., a_n such that the sum of the squares of the
terms:
( y_i - (a_1 * f_1(x_i) + ... + a_n * f_n(x_i)) ) / sigma_i,
is minimized, where we sum over the i, for 1 <= i <= m; where m
is the number of data points, namely (x_i, y_i); f_1,..., f_n are the function basis
elements and sigma_i is the standard deviation of the measurement error of the value of
y_i.
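The step from the Gaussian error assumption to the least squares criterion can be sketched as follows. This is the standard textbook argument, assuming independent errors with known standard deviations sigma_i; it is not taken from the library's own documentation.

```latex
% Likelihood of the observed y_i under independent Gaussian errors
L(a_1,\dots,a_n) = \prod_{i=1}^{m} \frac{1}{\sigma_i\sqrt{2\pi}}
  \exp\!\left(-\frac{1}{2}\left(\frac{y_i - \sum_{k=1}^{n} a_k f_k(x_i)}{\sigma_i}\right)^{2}\right)

% Negative log-likelihood: the last sum does not depend on the a_k
-\ln L = \frac{1}{2}\sum_{i=1}^{m}\left(\frac{y_i - \sum_{k=1}^{n} a_k f_k(x_i)}{\sigma_i}\right)^{2}
  + \sum_{i=1}^{m}\ln\!\left(\sigma_i\sqrt{2\pi}\right)
```

Maximizing L over a_1,..., a_n is therefore the same as minimizing the first sum, which is the sum of squares described above.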
Remarks:
1. sigma_i determines the weight applied to the i-th data point (a larger sigma_i gives that point less influence on the fit).
In order to apply this class you must perform the following steps:
1. Set the function basis using the add... methods contained within this class.
2. Fit the function using setGeneralFit(double[], double[], double[], double[], boolean[]) or #setGeneralFitOrdered
Once the function has been fit you are able to 'read' the following quantitative information about the fitted function:
getChiSquare()
#getCoefficients
bestFitValue(double)
| Constructor Summary |
GeneralLinearFactorModel()
    Creates a new GeneralLinearFactorModel instance.
| Method Summary |
void addAbsoluteSum(double[] absCoeff, double[] coeff, double[] constant, double[] exponent)
    Adds a sum of absolute functions to the (ordered) basis functions which form the factor model.
void addConstantTerm(double constant)
    Adds a constant function to the (ordered) function basis of the factor model.
void addCosineSum(double[] cosCoeff, double[] coeff, double[] constant, double[] exponent)
    Adds a sum of Cosine functions to the (ordered) basis elements of the factor model.
void addExpSum(double[] expCoeff, double[] coeff, double[] constant, double[] exponent)
    Adds a sum of exponential functions to be used as basis elements.
void addLogSum(double[] logCoeff, double[] coeff, double[] constant, double[] exponent)
    Adds a sum of (natural) log functions to the (ordered) basis elements of the factor model.
void addRationalSumOfPowerTerms(double[] upperRationalCoeff, double[] upperRationalExp, double[] lowerRationalCoeff, double[] lowerRationalExp)
    Adds a rational term, where the numerator and denominator are each sums of power terms, to the (ordered) function basis of the factor model.
void addSineSum(double[] sinCoeff, double[] coeff, double[] constant, double[] exponent)
    Adds a sum of Sine functions to the (ordered) basis elements of the factor model.
void addStepSum(double[] stepCoeff, double[] coefficient, double[] constant)
    Adds a sum of step functions to the (ordered) basis functions which form the factor model.
void addSumOfPowerTerms(double[] powerTermsCoeff, double[] powerTermsExponents)
    Adds a sum of power terms to the (ordered) function basis of the factor model.
void addTanSum(double[] tanCoeff, double[] coeff, double[] constant, double[] exponent)
    Adds a sum of Tangent functions to the (ordered) basis elements of the factor model.
double bestFitValue(double evaluationPoint)
    Evaluates the value of the function of best fit at a given point on the x-axis.
double getChiSquare()
    Evaluates the Chi-Squared measure, which provides a quantitative measure of how well the given function fits the data.
double[] setGeneralFit(double[] xDataPoints, double[] yDataPoints, double[] sigma, double[] initialValue, boolean[] fit)
    Evaluates the maximum likelihood coefficients of the function of best fit in accordance with the least squares approach for the function basis given (i.e. fitting the function).
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
public GeneralLinearFactorModel()
| Method Detail |
public void addConstantTerm(double constant)
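Adds a constant function to the (ordered) function basis of the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
The constant function takes the form:
f(x) = constant,
where constant is a real number provided as a parameter.
Parameters:
constant - a double which corresponds to the constant value of the constant function (i.e. constant above).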
public void addSumOfPowerTerms(double[] powerTermsCoeff,
double[] powerTermsExponents)
Adds a sum of power terms to the (ordered) function basis of the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
The sum of power terms takes the form:
f(x) = powerTermsCoeff(0) * x^powerTermsExponents(0) + ... + powerTermsCoeff(n) * x^powerTermsExponents(n),
where powerTermsCoeff(k), powerTermsExponents(k) are real numbers for all k and
n+1 corresponds to the number of terms within the sum.
Parameters:
powerTermsCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th term of the sum (i.e. powerTermsCoeff(k) above).
powerTermsExponents - an array of doubles where the k-th term corresponds to the exponent of the variable of the k-th term of the sum (i.e. powerTermsExponents(k) above).
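As a quick illustration of the formula above, the sum of power terms can be evaluated as follows. The class PowerSumSketch is a hypothetical helper written for this page and is not part of the library.

```java
/** Illustrative sketch only; this is not part of the WebCab API. */
final class PowerSumSketch {

    /** Returns coeff[0] * x^exponents[0] + ... + coeff[n] * x^exponents[n]. */
    static double evaluate(double[] coeff, double[] exponents, double x) {
        if (coeff.length != exponents.length) {
            throw new IllegalArgumentException("coefficient and exponent arrays must have equal length");
        }
        double sum = 0.0;
        for (int k = 0; k < coeff.length; k++) {
            sum += coeff[k] * Math.pow(x, exponents[k]);
        }
        return sum;
    }
}
```

With coeff = {1} and exponents = {1} this reduces to f(x) = x, the first basis function of the class-level example.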
public void addRationalSumOfPowerTerms(double[] upperRationalCoeff,
double[] upperRationalExp,
double[] lowerRationalCoeff,
double[] lowerRationalExp)
Adds a rational term, where the numerator and denominator are each sums of power terms, to the (ordered) function basis of the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
The rational term takes the form:
f(x) = numerator(x) / denominator(x)
where:
numerator(x) = upperRationalCoeff(0) * x^upperRationalExp(0) + ... + upperRationalCoeff(n) * x^upperRationalExp(n)
denominator(x) = lowerRationalCoeff(0) * x^lowerRationalExp(0) + ... + lowerRationalCoeff(m) * x^lowerRationalExp(m)
where upperRationalCoeff(k), upperRationalExp(k), lowerRationalCoeff(k), lowerRationalExp(k)
are real numbers for all k, n+1 corresponds to the number of terms within the
numerator and m+1 corresponds to the number of terms within the denominator.
Parameters:
upperRationalCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th term of the numerator (i.e. upperRationalCoeff(k) above).
upperRationalExp - an array of doubles where the k-th term corresponds to the exponent of the variable of the k-th term of the numerator (i.e. upperRationalExp(k) above).
lowerRationalCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th term of the denominator (i.e. lowerRationalCoeff(k) above).
lowerRationalExp - an array of doubles where the k-th term corresponds to the exponent of the variable of the k-th term of the denominator (i.e. lowerRationalExp(k) above).
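The rational form above is a quotient of two sums of power terms. A minimal sketch, using the hypothetical class RationalTermSketch (not part of the library), might look like this:

```java
/** Illustrative sketch only; this is not part of the WebCab API. */
final class RationalTermSketch {

    /** Sum of coeff[k] * x^exp[k] over all k. */
    static double powerSum(double[] coeff, double[] exp, double x) {
        double sum = 0.0;
        for (int k = 0; k < coeff.length; k++) {
            sum += coeff[k] * Math.pow(x, exp[k]);
        }
        return sum;
    }

    /** numerator(x) / denominator(x), each given as a sum of power terms. */
    static double evaluate(double[] upperCoeff, double[] upperExp,
                           double[] lowerCoeff, double[] lowerExp, double x) {
        return powerSum(upperCoeff, upperExp, x) / powerSum(lowerCoeff, lowerExp, x);
    }
}
```

In this sketch a zero denominator is not treated specially, so Java's floating point rules apply and the result is infinite or NaN. How the library itself treats such points is not stated in this documentation.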
public void addCosineSum(double[] cosCoeff,
double[] coeff,
double[] constant,
double[] exponent)
Adds a sum of Cosine functions to the (ordered) basis elements of the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
Each term within the sum of Cosine functions takes the following form:
cosCoeff * (Cos(coeff * x + constant))^n,
where the Cosine function is evaluated in radians, and cosCoeff, coeff, constant, n
are real numbers.
Radians are a unit of angle and are related to the more
commonly used degrees as follows:
360 degrees = 2 * Pi radians
therefore, 1 radian = 180 / Pi degrees = 57.295... degrees.
Parameters:
cosCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th Cosine function term (i.e. cosCoeff above).
coeff - an array of doubles where the k-th term corresponds to the coefficient of the variable x within the argument of the k-th Cosine function term (i.e. coeff above).
constant - an array of doubles where the k-th term corresponds to the constant within the argument of the k-th Cosine function term (i.e. constant above).
exponent - an array of doubles where the k-th term corresponds to the power to which the result of the k-th Cosine function is raised (i.e. n above).
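The per-term form above can be sketched as follows, under the assumption that index k of each array describes the k-th term of the sum. The class CosineSumSketch is hypothetical and not part of the library; angles given in degrees are converted with the standard Math.toRadians.

```java
/** Illustrative sketch only; this is not part of the WebCab API. */
final class CosineSumSketch {

    /** Sum over k of cosCoeff[k] * (cos(coeff[k] * x + constant[k]))^exponent[k], x in radians. */
    static double evaluate(double[] cosCoeff, double[] coeff, double[] constant,
                           double[] exponent, double x) {
        double sum = 0.0;
        for (int k = 0; k < cosCoeff.length; k++) {
            double inner = Math.cos(coeff[k] * x + constant[k]);
            sum += cosCoeff[k] * Math.pow(inner, exponent[k]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double x = Math.toRadians(60.0); // 60 degrees expressed in radians
        System.out.println(evaluate(new double[] {2.0}, new double[] {1.0},
                new double[] {0.0}, new double[] {2.0}, x));
    }
}
```

Note that Math.pow returns NaN for a negative base with a non-integer exponent, so in this sketch fractional exponents only make sense where the cosine is non-negative.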
public void addSineSum(double[] sinCoeff,
double[] coeff,
double[] constant,
double[] exponent)
Adds a sum of Sine functions to the (ordered) basis elements of the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
Each term of the sum which makes up the basis element to be added to the factor model takes the form:
sinCoeff * (Sin(coeff * x + constant))^n,
where the Sine function is evaluated in radians, and sinCoeff, coeff, constant, n
are real numbers.
Radians are a unit of angle and are related to the
more commonly used degrees as follows:
360 degrees = 2 * Pi radians
therefore, 1 radian = 180 / Pi degrees = 57.295... degrees.
Parameters:
sinCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th Sine function term (i.e. sinCoeff above).
coeff - an array of doubles where the k-th term corresponds to the coefficient of the variable x within the argument of the k-th Sine function term (i.e. coeff above).
constant - an array of doubles where the k-th term corresponds to the constant within the argument of the k-th Sine function term (i.e. constant above).
exponent - an array of doubles where the k-th term corresponds to the power to which the result of the k-th Sine function is raised (i.e. n above).
public void addTanSum(double[] tanCoeff,
double[] coeff,
double[] constant,
double[] exponent)
Adds a sum of Tangent functions to the (ordered) basis elements of the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
Each term of the sum which makes up the basis function takes the form:
tanCoeff * (tan(coeff * x + constant))^n,
where the Tangent function is evaluated in radians, and tanCoeff,
coeff, constant, n are real numbers.
Radians are a unit of angle and are related to the
more commonly used degrees as follows:
360 degrees = 2 * Pi radians
therefore, 1 radian = 180 / Pi degrees = 57.295... degrees.
Parameters:
tanCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th Tangent function term (i.e. tanCoeff above).
coeff - an array of doubles where the k-th term corresponds to the coefficient of the variable x within the argument of the k-th Tangent function term (i.e. coeff above).
constant - an array of doubles where the k-th term corresponds to the constant within the argument of the k-th Tangent function term (i.e. constant above).
exponent - an array of doubles where the k-th term corresponds to the power to which the result of the k-th Tangent function is raised (i.e. n above).
public void addLogSum(double[] logCoeff,
double[] coeff,
double[] constant,
double[] exponent)
Adds a sum of (natural) log functions to the (ordered) basis elements of the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
Each term of the sum of log terms takes the following form:
logCoeff * (log(coeff * x + constant))^n
where the Log function (log) is the natural (i.e. base e) logarithm, and logCoeff, coeff, constant, n
are real numbers.
Parameters:
logCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th Log function term (i.e. logCoeff above).
coeff - an array of doubles where the k-th term corresponds to the coefficient of the variable x within the argument of the k-th Log function term (i.e. coeff above).
constant - an array of doubles where the k-th term corresponds to the constant within the argument of the k-th Log function term (i.e. constant above).
exponent - an array of doubles where the k-th term corresponds to the power to which the result of the k-th Log function is raised (i.e. n above).
public void addExpSum(double[] expCoeff,
double[] coeff,
double[] constant,
double[] exponent)
Adds a sum of exponential functions to be used as basis elements. For details on the factor model, see the GeneralLinearFactorModel documentation.
Each of the elements of the sum of Exponential function terms takes the following form:
expCoeff * (exp(coeff * x + constant))^n
where exp is the Exponential function, and expCoeff, coeff, constant, n are real numbers.
Parameters:
expCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th Exponential function term (i.e. expCoeff above).
coeff - an array of doubles where the k-th term corresponds to the coefficient of the variable x within the argument of the k-th Exponential function term (i.e. coeff above).
constant - an array of doubles where the k-th term corresponds to the constant within the argument of the k-th Exponential function term (i.e. constant above).
exponent - an array of doubles where the k-th term corresponds to the power to which the result of the k-th Exponential function is raised (i.e. n above).
public void addAbsoluteSum(double[] absCoeff,
double[] coeff,
double[] constant,
double[] exponent)
Adds a sum of absolute functions to the (ordered) basis functions which form the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
Each of the elements of the sum takes the following form:
absCoeff * (Abs(coeff * x + constant))^n,
where Abs is the absolute function given by:
Abs(x) = -x, if x ≤ 0
Abs(x) = x, if x > 0
and absCoeff, coeff, constant, n are real numbers.
Parameters:
absCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th Absolute function term (i.e. absCoeff above).
coeff - an array of doubles where the k-th term corresponds to the coefficient of the variable x within the argument of the k-th Absolute function term (i.e. coeff above).
constant - an array of doubles where the k-th term corresponds to the constant within the argument of the k-th Absolute function term (i.e. constant above).
exponent - an array of doubles where the k-th term corresponds to the power to which the result of the k-th Absolute function is raised (i.e. n above).
public void addStepSum(double[] stepCoeff,
double[] coefficient,
double[] constant)
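Adds a sum of step functions to the (ordered) basis functions which form the factor model. For details on the factor model, see the GeneralLinearFactorModel documentation.
Each of the elements of this sum takes the following form:
stepCoeff * Step(coeff * x + constant),
where Step(x) is given by:
Step(x) = 0, if x ≤ 0
Step(x) = 1, if x > 0
and stepCoeff, coeff, constant are real numbers.
Parameters:
stepCoeff - an array of doubles where the k-th term corresponds to the coefficient of the k-th Step function term (i.e. stepCoeff above).
coefficient - an array of doubles where the k-th term corresponds to the coefficient of the variable x within the argument of the k-th Step function term (i.e. coeff above).
constant - an array of doubles where the k-th term corresponds to the constant within the argument of the k-th Step function term (i.e. constant above).
public double getChiSquare()
Evaluates the Chi-Squared measure, which provides a quantitative measure of how well the given function fits the data, for the fit obtained from setGeneralFit(double[], double[], double[], double[], boolean[]).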
Remark: Roughly speaking, for an unweighted function of best fit, the square root of the Chi-Squared measure divided by the number of data points is the root-mean-square distance in the y-direction from a data point to the curve of best fit.
Recall that here we are finding the curve of best fit by minimizing the sum of the squares of the terms:
( y_i - (a_1 * f_1(x_i) + ... + a_n * f_n(x_i)) ) / sigma_i,
where we sum over the i, for 1 <= i <= m; the (x_i, y_i)
are the m data points, f_1,..., f_n are the function basis elements and sigma_i are
the standard deviations of the measurement error of the values of y_i. The Chi-Squared measure
is precisely the value of this sum.
You must fit the function before the corresponding Chi-Squared can be evaluated.
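The sum described above can be computed directly once the coefficients are known. The following is a minimal sketch of that arithmetic; the class ChiSquareSketch and its parameters are hypothetical names, and the code is not the library's own implementation of getChiSquare().

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only; this is not the WebCab implementation. */
final class ChiSquareSketch {

    /** Sum over i of ((y_i - sum_k a_k * f_k(x_i)) / sigma_i)^2. */
    static double chiSquare(double[] x, double[] y, double[] sigma,
                            double[] a, DoubleUnaryOperator[] basis) {
        double chi2 = 0.0;
        for (int i = 0; i < x.length; i++) {
            double model = 0.0;
            for (int k = 0; k < basis.length; k++) {
                model += a[k] * basis[k].applyAsDouble(x[i]);
            }
            double scaledResidual = (y[i] - model) / sigma[i];
            chi2 += scaledResidual * scaledResidual;
        }
        return chi2;
    }
}
```

For an unweighted fit every sigma_i can be taken as 1, in which case the value is the plain sum of squared residuals.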
public double[] setGeneralFit(double[] xDataPoints,
double[] yDataPoints,
double[] sigma,
double[] initialValue,
boolean[] fit)
throws Exception
Remark: Please note that before this method is called the function basis must be set using the
add... methods contained within this class.
In the simplest case this method, given a data set (x_i, y_i), for i=1, ..., m,
finds the coefficients a_1, ..., a_n such that the function:
f(x) = a_1 * f_1(x) + a_2 * f_2(x) + ... + a_n * f_n(x)
where f_1,...,f_n are the function basis set using the add... methods, is the function
of best fit in accordance with the least squares approach for the data set considered. That is, the coefficients
are selected in such a way that the points (x_i, f(x_i)), for i=1,...,m, are a best fit
in accordance with the least squares approach for the given data set (x_i, y_i), i=1,...,m.
Please note that this implementation allows some of the
coefficients to be fixed before the function is fitted. In such instances the fitting described above is performed
with those coefficients held at the initial values which they are given. The fit parameter
is used to determine which (if any) of the coefficients are kept fixed during the fitting.
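One way to picture fitting with some coefficients held fixed is to subtract the fixed terms from the y values and fit only the remaining coefficients. The sketch below does this with weights 1 / sigma_i^2 and the normal equations. It is an assumption about how such a fit can be carried out, not a description of the library's algorithm, and the class FixedCoefficientFitSketch with its basis argument is hypothetical.

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only; this is not the WebCab implementation. */
final class FixedCoefficientFitSketch {

    /** Coefficient k is fitted when fit[k] is true and held at initialValue[k] otherwise. */
    static double[] fitWithFixed(double[] x, double[] y, double[] sigma, double[] initialValue,
                                 boolean[] fit, DoubleUnaryOperator[] basis) {
        int n = basis.length;
        int p = 0;
        int[] free = new int[n];             // indices of the coefficients to be fitted
        for (int k = 0; k < n; k++) {
            if (fit[k]) {
                free[p++] = k;
            }
        }
        double[][] m = new double[p][p + 1]; // weighted normal equations for the free coefficients
        double[] row = new double[p];
        for (int i = 0; i < x.length; i++) {
            double target = y[i];            // y_i minus the contribution of the fixed terms
            for (int k = 0; k < n; k++) {
                if (!fit[k]) {
                    target -= initialValue[k] * basis[k].applyAsDouble(x[i]);
                }
            }
            double w = 1.0 / (sigma[i] * sigma[i]); // sigma[i] must be non-zero
            for (int j = 0; j < p; j++) {
                row[j] = basis[free[j]].applyAsDouble(x[i]);
            }
            for (int r = 0; r < p; r++) {
                for (int c = 0; c < p; c++) {
                    m[r][c] += w * row[r] * row[c];
                }
                m[r][p] += w * row[r] * target;
            }
        }
        double[] solved = solve(m);
        double[] a = initialValue.clone();   // fixed coefficients keep their initial values
        for (int j = 0; j < p; j++) {
            a[free[j]] = solved[j];
        }
        return a;
    }

    /** Gaussian elimination with partial pivoting on an augmented matrix. */
    static double[] solve(double[][] m) {
        int n = m.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) {
                    pivot = r;
                }
            }
            double[] swap = m[col];
            m[col] = m[pivot];
            m[pivot] = swap;
            if (Math.abs(m[col][col]) < 1e-12) {
                throw new ArithmeticException("free basis functions are not independent on this data");
            }
            for (int r = col + 1; r < n; r++) {
                double factor = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) {
                    m[r][c] -= factor * m[col][c];
                }
            }
        }
        double[] a = new double[n];
        for (int r = n - 1; r >= 0; r--) {   // back substitution
            double s = m[r][n];
            for (int c = r + 1; c < n; c++) {
                s -= m[r][c] * a[c];
            }
            a[r] = s / m[r][r];
        }
        return a;
    }
}
```

The returned array has one entry per basis function, in the order of the basis, with the fixed entries equal to their initial values.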
As mentioned within the overview of this class (see GeneralLinear), the measurement errors (of the
y_i's) can be taken into account when determining the maximum likelihood coefficients. In order to
fit the function taking into account the measurement errors, provide as the k-th term of the parameter
sigma the standard deviation of the measurement error of y_k (i.e. the value of the y-axis coordinate of the
k-th data point), which is then used within the fitting of the function.
Alternatively, sigma_i can be viewed as determining the weight applied to the i-th data point.
0, where the array has a length equal to the number of data points
considered.
Parameters:
xDataPoints - an array where the k-th term corresponds to the x-axis coordinate value of the k-th data element of the data set to which the (weighted or unweighted) function of best fit is fitted.
yDataPoints - an array where the k-th term corresponds to the y-axis coordinate value of the k-th data element of the data set to which the (weighted or unweighted) function of best fit is fitted.
sigma - an array where the k-th term corresponds, depending on your point of view, to one of the following: the standard deviation of the measurement error of y_k, or the weight applied to the k-th data point. 0, where the array has a length equal to the number of data points considered.
initialValue - an array where the k-th term corresponds to the initial estimate of the value of the k-th coefficient within the function of best fit (i.e. a_k above).
fit - an array of booleans where the k-th term is true if the k-th coefficient is fitted to the data set considered, and false if the k-th coefficient is not fitted and is held fixed. Please note that if a coefficient is held fixed then its value within the best fit function will correspond to the initialValue parameter value provided.
Throws:
Exception
public double bestFitValue(double evaluationPoint)
Evaluates the value of the function of best fit at a given point on the x-axis. Before this method is called the function basis must be set using the add... methods and the function must be fitted using setGeneralFit(double[], double[], double[], double[], boolean[]).
Parameters:
evaluationPoint - the coordinate value of the point on the x-axis at which the value of the fitted function is evaluated.