Which method gives a unique set of values to the constants in the equation of fitting the curves?

Has nobody answered this question? The most common reason a method produces different results on repeated runs is that there is a random element in the method. In the case of the Curve Fitting Toolbox, if you do not specify the starting point for the estimation, then fit uses a RANDOM start point.

If you start ANY optimizer from a different point, you should expect at least slightly different results. The difference may be as small as the convergence tolerances allow, or it may be worse if the problem has multiple local solutions. Sometimes those solutions are all equivalent, although not always. The point is, this is behavior you SHOULD EXPECT TO SEE HAPPEN. It is entirely natural. Occasionally your data is so poor that the model is underspecified; in that case you might see multiple solutions, and the real fix is to get better data. The hint in this last case is usually that you get many warning messages, probably about singular or nearly singular matrices.

Can you avoid such problems, assuming your data is at least good enough to produce a solution? That is, can you get the same solution each time? There are two ways to resolve this, if it really bothers you.

  1. Best is to provide intelligent starting values of your own. That gives you the best chance of a viable solution. Even if you cannot choose them intelligently, a fixed set of starting values is an option.
  2. If you cannot provide starting values at all, then set the seed of the random number generator, so the random start point is the same on every run (see the sketch below).
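
As an illustration only, here is a minimal Python sketch (using scipy rather than the MATLAB toolbox itself; the model and data are made up) of both options: a fixed, user-chosen start point, and a seeded random start point.

```python
# Minimal sketch (not the MATLAB toolbox): making a curve fit reproducible by
# fixing either the starting point or the random seed that generates it.
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c):
    # Hypothetical exponential-plus-offset model used only for illustration.
    return a * np.exp(-b * x) + c

rng = np.random.default_rng(0)                 # fixed seed -> reproducible data
x = np.linspace(0.0, 4.0, 50)
y = model(x, 2.5, 1.3, 0.5) + 0.05 * rng.standard_normal(x.size)

# Option 1 (preferred): supply an intelligent starting point yourself.
popt_fixed, _ = curve_fit(model, x, y, p0=[2.0, 1.0, 0.0])

# Option 2: if the start must be random, seed the generator so the same
# random start point is drawn on every run.
p0_random = np.random.default_rng(42).uniform(0.1, 3.0, size=3)
popt_seeded, _ = curve_fit(model, x, y, p0=p0_random)

print(popt_fixed, popt_seeded)
```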

When using the method of least squares, these costs increase proportionally to the measuring interval, since the bulk of the computation is the numerical integration of the equations of motion of the body to obtain the calculated values of the angular velocity components.

From: Rigid Body Dynamics for Space Applications, 2017

Curve Fitting

Kumar Molugaram, G. Shanker Rao, in Statistical Techniques for Transportation Engineering, 2017

5.2 The Method of Least Squares

The method of least squares assumes that the best fit curve of a given type is the curve that has the minimal sum of deviations, i.e., least square error from a given set of data.

Suppose that the data points are (x1, y1), (x2, y2), …, (xn, yn) where x is the independent variable and y is the dependent variable. The fitting curve f(x) has the deviation (error) ei from each data point, as follows:

$e_1 = y_1 - f(x_1),\quad e_2 = y_2 - f(x_2),\quad \ldots,\quad e_n = y_n - f(x_n)$

According to the method of least squares, the best fitting curve has the property that $\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n}\left[y_i - f(x_i)\right]^2$ is a minimum.

We now introduce the method of least squares using polynomials in the following sections.
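
For instance, a minimal sketch (with assumed example data, not taken from this chapter) of fitting a quadratic polynomial by minimizing the sum of squared deviations:

```python
# Sketch: minimize the sum of squared deviations e_i = y_i - f(x_i) for a
# quadratic fitting curve f(x) = a0 + a1*x + a2*x^2 (assumed example data).
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 20)
y = 1.0 + 0.5 * x - 0.2 * x**2 + 0.1 * rng.standard_normal(x.size)

# Design matrix with columns [1, x, x^2]; lstsq solves the normal equations
# and therefore minimizes sum((y - X @ a)**2).
X = np.column_stack([np.ones_like(x), x, x**2])
a = np.linalg.lstsq(X, y, rcond=None)[0]

e = y - X @ a                 # deviations e_i from the fitted curve
print(a, np.sum(e**2))        # coefficients and the minimized sum of squares
```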


URL: https://www.sciencedirect.com/science/article/pii/B9780128115558000052

Variational Forms and Finite Element Approximation: 1-D Problems

O.C. Zienkiewicz, ... J.Z. Zhu, in The Finite Element Method: its Basis and Fundamentals (Seventh Edition), 2013

4.7 Least squares approximations

A general variational principle also may be constructed if the constraints described in the previous section are simply the governing equations of the problem

(4.69) $C(u) = A(u)$

Obviously the same procedure can be used in the context of the penalty function approach by setting Π=0 in (4.59). We can thus write a “variational principle”

(4.70) $\bar{\bar{\Pi}} = \dfrac{1}{2}\int_\Omega \left(A_1^2 + A_2^2 + \cdots\right) dx = \dfrac{1}{2}\int_\Omega \mathbf{A}^{\mathrm{T}}(\mathbf{u})\,\mathbf{A}(\mathbf{u})\, dx$

for any set of differential equations. In the above equation the boundary conditions are assumed to be satisfied by u (forced boundary condition) and the parameter α is dropped as it becomes a multiplier.

Clearly, the above statement is a requirement that the sum of the squares of the residuals of the differential equations should be a minimum at the correct solution. This minimum is obviously zero at that point, and the process is simply the well-known least squares method of approximation.

It is equally obvious that we could obtain the correct solution by minimizing any functional of the form

(4.71) $\bar{\bar{\Pi}} = \dfrac{1}{2}\int_\Omega \left(p_1 A_1^2 + p_2 A_2^2 + \cdots\right) dx = \dfrac{1}{2}\int_\Omega \mathbf{A}^{\mathrm{T}}(\mathbf{u})\,\mathbf{p}\,\mathbf{A}(\mathbf{u})\, dx$

in which p1,p2,…, etc., are positive valued weighting functions or constants and p is a diagonal matrix:

(4.72) $\mathbf{p} = \begin{bmatrix} p_1 & & & 0 \\ & p_2 & & \\ & & p_3 & \\ 0 & & & \ddots \end{bmatrix}$

The above alternative form is sometimes convenient as it places a different importance on the satisfaction of individual components of the equation set and allows additional freedom in the choice of the approximate solution. Once again this weighting function could be chosen so as to ensure a constant ratio of terms contributed by various equations.

A least squares method of the kind shown above is a very powerful alternative procedure for obtaining integral forms from which an approximate solution can be started, and has been used with considerable success [15–18]. As a least squares variational principle can be written for any set of differential equations without introducing additional variables, we may well inquire as to what the difference is between these and the natural variational principles discussed previously. On performing a variation in a specific case the reader will find that the Euler equations which are obtained no longer give the original differential equations but give higher order derivatives of these. Thus, higher order continuity of trial functions is now generally needed. This may be a serious drawback but frequently can be bypassed by stating the original problem as a set of lower order equations. The appearance of higher order derivatives in the Euler equations also introduces the possibility of spurious solutions if incorrect boundary conditions are used.
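
To make the idea concrete, the following sketch minimizes a discretized version of the functional (4.70) for an assumed model problem (the two-point boundary-value problem u'' + u + x = 0 on (0, 1) with u(0) = u(1) = 0, which is not an example from the text), using a two-term polynomial trial function that satisfies the forced boundary conditions.

```python
# Sketch (assumed model problem, not from the text): minimize a discretized
# least squares functional
#     Pi_bar = 1/2 * integral over (0,1) of A(u)^2 dx,   A(u) = u'' + u + x,
# over a two-term polynomial trial function satisfying u(0) = u(1) = 0.
import numpy as np
from scipy.optimize import minimize

# Trial functions that satisfy the forced boundary conditions, and their
# second derivatives.
phi   = [lambda x: x * (1.0 - x),     lambda x: x**2 * (1.0 - x)]
d2phi = [lambda x: -2.0 + 0.0 * x,    lambda x: 2.0 - 6.0 * x]

# Gauss-Legendre quadrature mapped to (0, 1).
xg, wg = np.polynomial.legendre.leggauss(8)
xg, wg = 0.5 * (xg + 1.0), 0.5 * wg

def functional(c):
    u   = sum(ci * f(xg) for ci, f in zip(c, phi))
    d2u = sum(ci * f(xg) for ci, f in zip(c, d2phi))
    residual = d2u + u + xg               # A(u) at the quadrature points
    return 0.5 * np.sum(wg * residual**2)

result = minimize(functional, x0=np.zeros(2))
print(result.x, functional(result.x))
```

With only two trial functions the minimum is small but not exactly zero; it vanishes only when the trial family can represent the exact solution.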

Example 4.8

Least squares solution for Helmholtz equation

To illustrate the use of a least squares approach consider the Helmholtz problem governed by (4.26), for which we have already obtained a natural variational principle [Eq. (4.30)] in which only first derivatives were involved, requiring C0 continuity for ϕ. Now, if we use the operator L and term b defined by (4.28), we have a set of approximating equations with

(4.73) $K_{ab} = \int_\Omega \left(\dfrac{d^2 N_a}{dx^2} + cN_a\right)\left(\dfrac{d^2 N_b}{dx^2} + cN_b\right) dx, \qquad f_a = \int_\Omega \left(\dfrac{d^2 N_a}{dx^2} + cN_a\right) Q\, dx$

The reader will observe that due to the presence of second derivatives C1 continuity is now needed for the trial functions N.

An alternative, avoiding the requirement of C1 functions, is to write (4.26) as a first-order system. This can be written as

(4.74) $\mathbf{A}(\mathbf{u}) = \begin{Bmatrix} q + \dfrac{d\phi}{dx} \\ -\dfrac{dq}{dx} + c\phi + Q \end{Bmatrix} = \mathbf{0}$

or, introducing the vector u,

(4.75) $\mathbf{u} = \begin{Bmatrix} q \\ \phi \end{Bmatrix}$

as the unknown we can write an approximation as

(4.76) $\mathbf{u} \approx \hat{\mathbf{u}} = \begin{bmatrix} N_q & 0 \\ 0 & N_\phi \end{bmatrix}\begin{Bmatrix} \tilde{q} \\ \tilde{\phi} \end{Bmatrix} = \mathbf{N}\tilde{\mathbf{u}}$

where $N_q$ and $N_\phi$ are C0 shape functions for the q and ϕ variables, respectively. The least squares approximation is now given by

(4.77a) $\delta\bar{\bar{\Pi}} = \delta\tilde{\mathbf{u}}^{\mathrm{T}}\int_\Omega (\mathbf{L}\mathbf{N})^{\mathrm{T}}\left[(\mathbf{L}\mathbf{N})\tilde{\mathbf{u}} + \mathbf{b}\right] dx = 0$

where

(4.77b) $\mathbf{L}\mathbf{N} = \begin{bmatrix} N_q & \dfrac{dN_\phi}{dx} \\ -\dfrac{dN_q}{dx} & cN_\phi \end{bmatrix}; \qquad \mathbf{b} = \begin{Bmatrix} 0 \\ Q \end{Bmatrix}$

The reader can now perform the final steps to obtain the K and f matrices. The approximation equations in a form requiring only C0 continuity are obtained, however, at the expense of additional variables. Use of such forms has been made extensively in the finite element context [15–22].


URL: https://www.sciencedirect.com/science/article/pii/B9781856176330000046

Analysis of Time Series

Kumar Molugaram, G. Shanker Rao, in Statistical Techniques for Transportation Engineering, 2017

12.6.4 Method of Least Square

The method of least squares is a widely used method of fitting a curve to given data. It is the most popular method used to determine the position of the trend line of a given time series. The trend line is technically called the best fit. In this method a mathematical relationship is established between the time factor and the given variable. Let (t1, y1), (t2, y2), …, (tn, yn) denote the given time series. In this method the trend values yc of the variable y are computed so as to satisfy the conditions:

1. The sum of the deviations of y from their corresponding trend values is zero, i.e., ∑(y − yc) = 0.

2. The sum of the squares of the deviations of the values of y from their corresponding trend values is the least, i.e., ∑(y − yc)² is least.

The equation of the trend line can be expressed as

yc = a + bx

where a and b are constants and the trend line satisfies the conditions:

1. ∑(y − yc) = 0

2. ∑(y − yc)² is least.

The values of a and b are determined such that they satisfy the equations:

(12.1) ∑y = na + b∑x

(12.2) ∑xy = a∑x + b∑x²

Eqs. (12.1) and (12.2) are called the normal equations.

Solving Eqs. (12.1) and (12.2) we get

a = (∑y·∑x² − ∑x·∑xy) / (n∑x² − (∑x)²)

and

b = (n∑xy − ∑x·∑y) / (n∑x² − (∑x)²)

In the trend equation yc = a + bx, a represents the trend value of the variable when x = 0 and b represents the slope of the trend line. If b is positive, the trend line slopes upward; if b is negative, the trend line slopes downward.

When the origin is chosen so that the deviations from it, denoted by x, sum to zero, we get

a = ∑y/n, b = ∑xy/∑x²

(since the sum of deviations from the origin, ∑x, is 0).

Note: If n is odd, we take the middle value (Middle year) as the origin. If n is even, there will be two middle values. In this case we take the mean of the two middle values as the origin.
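
A small Python sketch (illustrative only) of these formulas, showing the general normal-equation solution and the shortcut a = ∑y/n, b = ∑xy/∑x² when the origin is chosen so that ∑x = 0; the data are those of Example 12.7 below.

```python
# Sketch (illustrative only): trend line y_c = a + b*x from the normal
# equations, and the shortcut a = sum(y)/n, b = sum(xy)/sum(x^2) when the
# origin is chosen so that sum(x) = 0.
import numpy as np

def trend_line(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    Sx, Sy, Sxx, Sxy = x.sum(), y.sum(), (x * x).sum(), (x * y).sum()
    denom = n * Sxx - Sx**2
    a = (Sy * Sxx - Sx * Sxy) / denom     # general normal-equation solution
    b = (n * Sxy - Sx * Sy) / denom
    return a, b

# Example 12.7 data with the middle year (1960) as origin, so sum(x) = 0 and
# the formulas reduce to a = sum(y)/n, b = sum(xy)/sum(x^2).
x = [-2, -1, 0, 1, 2]
y = [65, 95, 80, 115, 105]
print(trend_line(x, y))                   # (92.0, 10.0)
```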

Merits

1. The method is mathematically sound.

2. The estimates a and b are unbiased.

3. The least squares method gives trend values for all the years, and the method is devoid of all kinds of subjectivity.

4. The algebraic sum of deviations of actual values from trend values is zero, and the sum of the squared deviations ∑(y − yc)² is minimum.

Demerits

1. The least squares method is highly mathematical; therefore, it is difficult for a layman to understand.

2. The method is not flexible. If new values are included in the given time series, the values of n, ∑x, ∑y, ∑x², and ∑xy change, which affects the trend values.

3. It is assumed that y is a linear function of the time period x, which may not be true in many situations.

12.6.4.1 Solved Examples

Example 12.6: Find the least squares line y = a + bx for the data shown in the x and y columns of the following table.

Solution:

x       y       x²       xy
−2      1       4        −2
−1      2       1        −2
0       3       0        0
1       3       1        3
2       4       4        8
∑x = 0  ∑y = 13 ∑x² = 10 ∑xy = 7

∑x = 0, ∑y = 13, ∑x² = 10, ∑xy = 7, and n = 5

The normal equations are

na + b∑x = ∑y

a∑x + b∑x² = ∑xy

Putting the values of n, ∑x,∑y,∑x2,∑xy in the above equation, we get

(12.3) 5a + b·0 = 13 ⇒ 5a = 13

(12.4) a·0 + b(10) = 7 ⇒ 10b = 7

From Eqs. (12.3) and (12.4) we get

a = 13/5 = 2.6, b = 7/10 = 0.7

The required least squares line is

y=2.6+(0.7)x
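
As a quick check (a sketch, not part of the text), numpy.polyfit minimizes the same squared error and reproduces a = 2.6 and b = 0.7 for these data:

```python
# Quick check of Example 12.6: numpy.polyfit minimizes the same squared error.
import numpy as np

x = np.array([-2, -1, 0, 1, 2], dtype=float)
y = np.array([1, 2, 3, 3, 4], dtype=float)

b, a = np.polyfit(x, y, deg=1)   # returns [slope, intercept] for deg = 1
print(a, b)                      # 2.6 0.7
```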

Example 12.7: Fit a straight line trend by the method of least squares to the following data and find the trend values.

Year 1958 1959 1960 1961 1962
Sales (in lakhs of units) 65 95 80 115 105

Solution: We have n=5

∴ n is odd.

Taking the middle year, i.e., 1960, as the origin, we get:

Year    Sales   x      x²       xy
1958    65      −2     4        −130
1959    95      −1     1        −95
1960    80      0      0        0
1961    115     1      1        115
1962    105     2      4        210
Total   ∑y = 460   ∑x = 0   ∑x² = 10   ∑xy = 100

∴ n = 5, ∑x = 0, ∑x² = 10, ∑y = 460, and ∑xy = 100

a = ∑y/n = 460/5 = 92

b = ∑xy/∑x² = 100/10 = 10

∴ The equation of the straight line trend is

yc=a+bx⇒yc=92+10x

For the year 1958, x = −2 ⇒ yc = y1958 = 92 + 10(−2) = 92 − 20 = 72

For the year 1959, x = −1 ⇒ yc = y1959 = 92 + 10(−1) = 92 − 10 = 82

For the year 1960, x = 0 ⇒ yc = y1960 = 92 + 10(0) = 92 + 0 = 92

For the year 1961, x = 1 ⇒ yc = y1961 = 92 + 10(1) = 92 + 10 = 102

For the year 1962, x = 2 ⇒ yc = y1962 = 92 + 10(2) = 92 + 20 = 112

We have

Year    Trend value
1958    72
1959    82
1960    92
1961    102
1962    112

and the straight line trend is yc=92+10x or simply y=92+10x.

Example 12.8: Determine the trend by the method of least squares. Also find the trend values.

Year 1950 1951 1952 1953 1954 1955 1956 1957
Value 346 411 392 512 626 640 611 796

Solution: Here n = 8

∴ n is even.

1953 and 1954 are the middle years.

The origin is (1953 + 1954)/2 = 1953.5

We take x=year−1953.5

Year    x       y       x²       xy
1950    −3.5    346     12.25    −1211.00
1951    −2.5    411     6.25     −1027.50
1952    −1.5    392     2.25     −588.00
1953    −0.5    512     0.25     −256.00
1954    0.5     626     0.25     313.00
1955    1.5     640     2.25     960.00
1956    2.5     611     6.25     1527.50
1957    3.5     796     12.25    2786.00
Total   0       4334    42.00    2504.00

∴ We have n = 8, ∑x = 0, ∑y = 4334, ∑x² = 42, and ∑xy = 2504

a = ∑y/n = 4334/8 = 541.75

b = ∑xy/∑x² = 2504/42 ≈ 59.60

∴ The equation of the trend line is

y=541.75+(59.60)x

The trend values are:

Year    x       Trend: y = 541.75 + (59.60)x
1950    −3.5    333.15
1951    −2.5    392.75
1952    −1.5    452.35
1953    −0.5    511.95
1954    0.5     571.55
1955    1.5     631.15
1956    2.5     690.75
1957    3.5     750.35


URL: https://www.sciencedirect.com/science/article/pii/B978012811555800012X

Main equations of diffraction theory

In Computer Design of Diffractive Optics, 2013

The method of least squares

The method of least squares [6] is also a variational method which can be used for the approximate solution of equation (1.95) by minimising a functional of the type:

(1.103) $J(u) = \int_V \left|\hat{L}u - f\right|^2 dV = \left(\hat{L}u - f,\; \hat{L}u - f\right)$

The functional (1.103) attains its minimum on the functions that solve the system of Euler equations (1.99). The sought complex amplitude function is represented as a linear combination of known, linearly independent approximating basis functions with unknown coefficients (1.97), which are found from the system of linear algebraic equations (1.100). The matrix of the system (1.100) and the column vectors have the following form in this case:

(1.104) $\hat{M}_{ij} = \left(\hat{L}\psi_i^{(n)},\; \hat{L}\psi_j^{(n)}\right),\quad i, j = \overline{1, N_n};\qquad \hat{C}_i = C_i,\qquad \hat{g}_j = \left(f,\; \hat{L}\psi_j^{(n)}\right)$

If the operator L̂ is nonsingular, i.e., an inverse operator L̂⁻¹ exists, then the matrix M̂ is symmetric and positive definite, and the sequence of approximate solutions u⁽ⁿ⁾ from equation (1.97) converges in norm to the exact solution of equation (1.95) as n → ∞.
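
The following sketch illustrates the structure of the system (1.104) for an assumed model operator (L̂u = −u'' on (0, 1) with homogeneous boundary conditions and a sine basis), rather than the diffraction operator of the text; the inner products are evaluated by quadrature and the resulting symmetric matrix is solved for the expansion coefficients.

```python
# Sketch under assumptions: least squares solution of L u = f for the model
# operator L u = -u'' on (0, 1), u(0) = u(1) = 0 (not the diffraction operator
# of the text), with basis psi_k(x) = sin(k*pi*x) and the system
#     M_ij = (L psi_i, L psi_j),   g_j = (f, L psi_j),   M c = g.
import numpy as np

N = 5
xg, wg = np.polynomial.legendre.leggauss(32)
xg, wg = 0.5 * (xg + 1.0), 0.5 * wg               # quadrature on (0, 1)

def L_psi(k, x):
    # L psi_k = -(sin(k*pi*x))'' = (k*pi)^2 * sin(k*pi*x)
    return (k * np.pi) ** 2 * np.sin(k * np.pi * x)

f = lambda x: x * (1.0 - x)                        # assumed right-hand side

Lpsi = np.array([L_psi(k, xg) for k in range(1, N + 1)])   # N x n_quad
M = (Lpsi * wg) @ Lpsi.T                           # M_ij = (L psi_i, L psi_j)
g = (Lpsi * wg) @ f(xg)                            # g_j  = (f, L psi_j)

c = np.linalg.solve(M, g)                          # expansion coefficients
u_approx = lambda x: sum(ci * np.sin((k + 1) * np.pi * x) for k, ci in enumerate(c))
print(c, u_approx(0.5))
```

Because the model operator is nonsingular, the assembled matrix is symmetric and positive definite, in line with the remark above.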


URL: https://www.sciencedirect.com/science/article/pii/B9781845696351500018

Appendix 2: Least squares analysis

Mark S. Nixon, Alberto S. Aguado, in Feature Extraction & Image Processing for Computer Vision (Third Edition), 2012

11.2 Curve fitting by least squares

Curve fitting by the method of least squares concerns combining a set of measurements to derive estimates of the parameters which specify the curve that best fits the data. By the least squares criterion, given a set of N (noisy) measurements fi, i∈1, N, which are to be fitted to a curve f(a), where a is a vector of parameter values, we seek to minimize the square of the difference between the measurements and the values of the curve to give an estimate of the parameters according to

(11.7) $\hat{\mathbf{a}} = \min_{\mathbf{a}} \sum_{i=1}^{N}\left(f_i - f(x_i, y_i, \mathbf{a})\right)^2$

Since we seek a minimum, by differentiation we obtain

(11.8) $\dfrac{\partial \sum_{i=1}^{N}\left(f_i - f(x_i, y_i, \mathbf{a})\right)^2}{\partial \mathbf{a}} = 0$

which implies that

(11.9) $2\sum_{i=1}^{N}\left(f_i - f(x_i, y_i, \mathbf{a})\right)\dfrac{\partial f(\mathbf{a})}{\partial \mathbf{a}} = 0$

The solution is usually of the form

(11.10) $\mathbf{M}\mathbf{a} = \mathbf{F}$

where M is a matrix of summations of products of the index i and F is a vector of summations of products of the measurements and i. The solution, the best estimate of the values of a, is then given by

(11.11) $\hat{\mathbf{a}} = \mathbf{M}^{-1}\mathbf{F}$

For example, let us consider the problem of fitting a 2D surface to a set of data points. The surface is given by

(11.12) $f(x, y, \mathbf{a}) = a + bx + cy + dxy$

where the vector of parameters a=[a b c d]T controls the shape of the surface and (x,y) are the coordinates of a point on the surface. Given a set of (noisy) measurements of the value of the surface at points with coordinates (x,y), fi=f(x,y)+vi, we seek to estimate values for the parameters using the method of least squares. By Eq. (11.7), we seek

(11.13) $\hat{\mathbf{a}} = \left[\hat{a}\;\hat{b}\;\hat{c}\;\hat{d}\right]^{\mathrm{T}} = \min_{\mathbf{a}}\sum_{i=1}^{N}\left(f_i - f(x_i, y_i, \mathbf{a})\right)^2$

By Eq. (11.9), we require

(11.14) $2\sum_{i=1}^{N}\left(f_i - (a + bx_i + cy_i + dx_i y_i)\right)\dfrac{\partial f(x_i, y_i, \mathbf{a})}{\partial \mathbf{a}} = 0$

By differentiating f(x, y, a) with respect to each parameter, we have

(11.15) $\dfrac{\partial f(x_i, y_i)}{\partial a} = 1$

(11.16) $\dfrac{\partial f(x_i, y_i)}{\partial b} = x_i$

(11.17) $\dfrac{\partial f(x_i, y_i)}{\partial c} = y_i$

and

(11.18) $\dfrac{\partial f(x_i, y_i)}{\partial d} = x_i y_i$

and by substituting Eqs (11.15)–(11.18) in Eq. (11.14), we obtain four simultaneous equations:

(11.19) $\sum_{i=1}^{N}\left(f_i - (a + bx_i + cy_i + dx_i y_i)\right)\times 1 = 0$

(11.20) $\sum_{i=1}^{N}\left(f_i - (a + bx_i + cy_i + dx_i y_i)\right)\times x_i = 0$

(11.21) $\sum_{i=1}^{N}\left(f_i - (a + bx_i + cy_i + dx_i y_i)\right)\times y_i = 0$

and

(11.22) $\sum_{i=1}^{N}\left(f_i - (a + bx_i + cy_i + dx_i y_i)\right)\times x_i y_i = 0$

Since ∑i=1Na=Na, Eq. (11.19) can be reformulated as

(11.23) $\sum_{i=1}^{N} f_i - Na - b\sum_{i=1}^{N} x_i - c\sum_{i=1}^{N} y_i - d\sum_{i=1}^{N} x_i y_i = 0$

and Eqs (11.20)–(11.22) can be reformulated likewise. By expressing the simultaneous equations in matrix form, we get

(11.24) $\begin{bmatrix} N & \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} y_i & \sum_{i=1}^{N} x_i y_i \\ \sum_{i=1}^{N} x_i & \sum_{i=1}^{N} x_i^2 & \sum_{i=1}^{N} x_i y_i & \sum_{i=1}^{N} x_i^2 y_i \\ \sum_{i=1}^{N} y_i & \sum_{i=1}^{N} x_i y_i & \sum_{i=1}^{N} y_i^2 & \sum_{i=1}^{N} x_i y_i^2 \\ \sum_{i=1}^{N} x_i y_i & \sum_{i=1}^{N} x_i^2 y_i & \sum_{i=1}^{N} x_i y_i^2 & \sum_{i=1}^{N} x_i^2 y_i^2 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{N} f_i \\ \sum_{i=1}^{N} f_i x_i \\ \sum_{i=1}^{N} f_i y_i \\ \sum_{i=1}^{N} f_i x_i y_i \end{bmatrix}$

and this is the same form as Eq. (11.10) and can be solved by inversion, as in Eq. (11.11). Note that the matrix is symmetric, and its inversion, or solution, does not impose as great a computational penalty as it might appear. Given a set of data points, the values need to be entered in the summations, thus completing the matrices from which the solution is found. This technique can replace the one used in the zero-crossing detector within the Marr–Hildreth edge detection operator (Section 4.3.3) but appeared to offer no significant advantage over the (much simpler) function implemented there.
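
A brief sketch of this surface fit (with synthetic data, purely illustrative): assembling M and F exactly as the summations in Eq. (11.24) and solving for [a, b, c, d].

```python
# Sketch: least squares fit of the surface f(x, y) = a + b*x + c*y + d*x*y
# by assembling the matrix M and vector F of Eq. (11.24) and solving M a = F.
import numpy as np

rng = np.random.default_rng(0)
N = 200
x, y = rng.uniform(-1, 1, N), rng.uniform(-1, 1, N)
f = 0.5 + 1.5 * x - 0.8 * y + 2.0 * x * y + 0.05 * rng.standard_normal(N)

# Columns of the design matrix correspond to the parameters [a, b, c, d];
# M = D^T D and F = D^T f reproduce the summations of Eq. (11.24).
D = np.column_stack([np.ones(N), x, y, x * y])
M = D.T @ D
F = D.T @ f

a_hat = np.linalg.solve(M, F)     # best estimate of [a, b, c, d]
print(a_hat)
```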


URL: https://www.sciencedirect.com/science/article/pii/B9780123965493000173

Static State Estimation

Soliman Abdel-hady Soliman Ph.D., Ahmad M. Al-Kandari Ph.D., in Electrical Load Forecasting, 2010

2.4.1 Historical Perspective

The development of the least absolute value method, as well as the least error squares method, can be traced back to the middle of the eighteenth century. The development of these methods was a result of trying to find the best way to summarize the information obtained from a number of measurements. The pioneers in regression analysis—Boscovich (1757), Laplace (1781), Gauss (1809), and Glaisher (1872)—proposed criteria for determining the best-fitting straight line through three or more points.

In 1781, Laplace presented a procedure for finding the best set of measurements based on minimizing the sum of the absolute deviations—namely, the least absolute value.

In 1809, Gauss demonstrated that the method of least squares is a consequence of the Gaussian Law of Error (normal distribution). Glaisher (1872) later showed that for a Laplacian (double exponential) distribution, the least absolute value estimator gives the most probable true value. In the early 1800s, regression analysis work focused on the conditions under which least squares regression and least absolute value regression give the best estimates. Laplace in 1812 showed that, for a large sample size, the method of least squares is superior. Houber in 1830 evaluated the least error squares and the least absolute value estimators and showed that for a Gaussian distribution, the least squares estimator gives the best results. Meanwhile, he noted that one advantage of the least absolute value estimator is that unbiased estimates can be obtained for any symmetrical distribution. However, Laurent in 1875 questioned the exactness of the Gaussian distribution, and on the basis of actual studies of measurements, he concluded that the method of least error squares should not be used when one has only a small number of observations. Jeffreys, in 1939, showed that there is equivalence between the following three statements:

1. The Gaussian distribution is correct.

2. The mean value is the best value.

3. The method of least error squares (LES) gives the best estimates.

He also pointed out that, for other symmetrical distributions, the method of least error squares should not be used, and stated that there is much to be said for the use of least absolute value when the distribution law is unknown, because it is less affected by large residuals than the least error squares.

The debates took place before the advent of computers and fast efficient methods of calculating the least error squares and least absolute value estimates; therefore, the discussions were primarily restricted to relatively small-size problems. Larger problems were primarily solved using analytical methods. In general, only least error squares regression was used because there were no efficient techniques for obtaining least absolute value estimates. In the early 1950s, emphasis was placed on developing efficient computational procedures for solving large problems. However, by this time the method of least squares was well established as the method for doing regression analysis.

The popularity of least error squares continued to grow even though it was known that it does not lead to the best available estimates of unknown parameters when the law of error (distribution) is other than Gaussian. But if the number of independent observations available is much larger than the number of parameters to be estimated, the method of least squares can usually be counted on to yield nearly best estimates.

In summary, the results of research to date indicate that the least absolute value of error technique gives better approximation when the errors in the measurements set have unknown distribution and also when the sample size is small.


URL: https://www.sciencedirect.com/science/article/pii/B9780123815439000026

Comprehensive error analysis beyond system innovations in Kalman filtering

Jianguo Wang, ... Baoxin Hu, in Learning Control, 2021

3.4 Redundancy contribution in Kalman filtering

Baarda [1] initiated the reliability theory in the method of least squares. It consists of the internal reliability and the external reliability. The former is a measure of the system capability to detect measurement outliers with a specific probability while the latter is the model response to undetected model errors (systematic and measurement errors) [3]. Three measures are commonly used in internal reliability analysis [3]: (1) the redundancy contribution, which controls how a given error in a measurement affects its residual; (2) the minimal detectable outlier on a measurement at a significance level of α and with the test power of 1−β; and (3) the maximum non-centrality parameter as a global measure, which is based on the quadratic form of the measurement residual vector. For further details about reliability analysis, refer to [1,3,17].

Reliability analysis was systematically introduced into Kalman filtering in [25]. Here, the discussion is limited to the redundancy contribution of measurements (real and pseudo measurements alike) because it is the key quantity in reliability analysis. Refer to [25,27] for more details on the subject.

Quantitatively, the redundancy contribution of a measurement vector is represented by the diagonal elements of the idempotent matrix DvvDll−1 originating from the well-known equation,

(3.31) $v_l = D_{vv} D_{ll}^{-1} e_l$

where el and vl are the error vector and the estimated residual vector of a measurement vector l, respectively, with the measurement variance matrix Dll and the variance matrix Dvv of vl. In order to derive the redundancy contribution in Kalman filtering, the residual vectors given in (3.22), (3.27), and (3.28) are further expressed in terms of the system innovation vector at epoch k:

(3.32) $v_{l_x}(k) = D_{l_x l_x}(k)\, D_{xx}^{-1}(k/k-1)\, G(k)\, d(k/k-1)$

(3.33) $v_w(k) = Q(k)\, B^{\mathrm{T}}(k,k-1)\, D_{xx}^{-1}(k/k-1)\, G(k)\, d(k/k-1)$

(3.34) $v_z(k) = \left[C(k)\, G(k) - I\right] d(k/k-1)$

with their variance matrices

(3.35) $D_{v_{l_x} v_{l_x}}(k) = A(k,k-1)\, D_{xx}(k-1)\, A^{\mathrm{T}}(k,k-1)\, C^{\mathrm{T}}(k)\, D_{dd}^{-1}(k-1)\, C(k)\, A(k,k-1)\, D_{xx}(k-1)\, A^{\mathrm{T}}(k,k-1)$

(3.36) $D_{v_w v_w}(k) = Q(k)\, B^{\mathrm{T}}(k,k-1)\, C^{\mathrm{T}}(k)\, D_{dd}^{-1}(k-1)\, C(k)\, B(k,k-1)\, Q(k)$

(3.37) $D_{v_z v_z}(k) = \left[I - C(k)\, G(k)\right] R(k)$

The redundancy contributions in measurement groups corresponding to (3.27), (3.28) and (3.22) are

(3.38) $r_{l_x}(k) = \operatorname{trace}\left\{A(k,k-1)\, D_{xx}(k-1)\, A^{\mathrm{T}}(k,k-1)\, C^{\mathrm{T}}(k)\, D_{dd}^{-1}(k-1)\, C(k)\right\}$

(3.39) $r_{l_w}(k) = \operatorname{trace}\left\{Q(k)\, B^{\mathrm{T}}(k,k-1)\, C^{\mathrm{T}}(k)\, D_{dd}^{-1}(k-1)\, C(k)\, B(k,k-1)\right\}$

(3.40) $r_z(k) = \operatorname{trace}\left\{I - C(k)\, G(k)\right\}$

For the entire system either after (3.1) and (3.2), after (3.21) and (3.22), or after (3.27), (3.28) and (3.22), the total redundancy number at epoch k satisfies [25,27,4]

(3.41) $r(k) = r_{l_x}(k) + r_{l_w}(k) + r_z(k) = p(k)$

wherein p(k) is the number of the real measurements or the dimension of z(k).

In practice, Q(k) and R(k) are commonly diagonal so that the individual redundancy indices in components for the process noise vector are

(3.42) $r_{w_i}(k) = \left\{Q(k)\, B^{\mathrm{T}}(k,k-1)\, C^{\mathrm{T}}(k)\, D_{dd}^{-1}(k-1)\, C(k)\, B(k,k-1)\right\}_{ii}, \quad i = 1, 2, \ldots, m(k)$

and for the measurement vector

(3.43) $r_{z_i}(k) = \left\{I - C(k)\, G(k)\right\}_{ii}, \quad i = 1, 2, \ldots, p(k)$

Since Dlxlx(k) in (3.26) is in general not a diagonal matrix, no individual component-wise redundancy indices are defined here for lx(k). For more on the reliability measures for correlated measurements, refer to [17,37].
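
As a purely numerical illustration with assumed matrices (not from the text), the measurement-group redundancy of Eqs. (3.40) and (3.43) can be evaluated for a single Kalman update, with the gain G(k) computed in the standard way:

```python
# Sketch with assumed numbers: redundancy contribution of the real measurements
# at one Kalman update, r_z(k) = trace{I - C(k) G(k)} as in Eq. (3.40), with the
# individual indices taken from the diagonal as in Eq. (3.43).
import numpy as np

# Assumed predicted state covariance, measurement matrix, and measurement noise.
P_pred = np.diag([4.0, 1.0])                  # D_xx(k/k-1)
C = np.array([[1.0, 0.0],
              [0.0, 1.0]])                    # C(k)
R = np.diag([2.0, 2.0])                       # R(k), diagonal

# Standard Kalman gain G(k) = P C^T (C P C^T + R)^(-1)
S = C @ P_pred @ C.T + R                      # innovation covariance
G = P_pred @ C.T @ np.linalg.inv(S)

r_z_individual = np.diag(np.eye(2) - C @ G)   # Eq. (3.43)
r_z_total = np.trace(np.eye(2) - C @ G)       # Eq. (3.40)
print(r_z_individual, r_z_total)
```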


URL: https://www.sciencedirect.com/science/article/pii/B9780128223147000080

Recursive Estimation

GENE H. HOSTETTER, in Handbook of Digital Signal Processing, 1987

I INTRODUCTION

In 1795 Karl Friedrich Gauss (1777–1855) invented the method of least squares estimation in the course of calculating planetary and comet orbits from telescopic measurement data [1]. Six precise measurements would suffice to determine the six parameters characterizing each orbit, but individual measurements were likely to be quite inaccurate. More measurements than the minimum number were used, and the “best fit” to an orbit was found, in the sense of minimizing the sum of squares of the corresponding parameter measurement errors. Gauss's approach was to develop the method, then argue eloquently that it yielded the “most accurate” estimate. Adrien-Marie Legendre (1752–1833) independently developed least squares estimation and published the results first, in 1806.

Through the years least squares methods have become increasingly important in many applications, including communications, control systems, navigation, and signal and image processing [2, 3]. The next section develops the fundamental ideas of least squares estimation. The solution involves a linear transformation of the measurements to obtain the optimal estimate. Then a recursive formulation [4, 5] of the least squares solution is derived in which the measurements are processed sequentially. The digital processing for recursive least squares constitutes filtering of incoming discrete-time measurement signals to produce discrete-time outputs representing estimates of the measured system parameters. Several illustrative examples are given. The section concludes with discussion of probabilistic interpretations of least squares and an indication of how recursive least squares methods can be generalized.
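
As a toy illustration of the sequential idea described here (not the chapter's derivation), a recursive least squares estimator updates its estimate and its covariance as each scalar measurement arrives:

```python
# Minimal recursive least squares sketch (illustrative only): estimate theta in
# z_k = h_k^T theta + noise, updating estimate and covariance per measurement.
import numpy as np

def rls_update(theta, P, h, z):
    # Gain, estimate update, and covariance update for one scalar measurement.
    K = P @ h / (h @ P @ h + 1.0)             # unit measurement noise assumed
    theta = theta + K * (z - h @ theta)       # correct with the new residual
    P = P - np.outer(K, h @ P)
    return theta, P

rng = np.random.default_rng(0)
true_theta = np.array([2.0, -1.0])
theta = np.zeros(2)
P = 1e3 * np.eye(2)                           # large initial uncertainty

for _ in range(200):
    h = rng.standard_normal(2)                # measurement direction
    z = h @ true_theta + 0.1 * rng.standard_normal()
    theta, P = rls_update(theta, P, h, z)

print(theta)                                  # approaches [2, -1]
```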

In 1960, building on the work of others, Rudolph E. Kalman published his first paper [6] on linear minimum mean square (MMS) estimation. The approach was a fundamental departure from that of Gauss in that it began with a stochastic formulation rather than giving stochastic interpretation to an already developed procedure. The result, now known as the Kalman filter [7–10], is an elegant generalization of recursive least squares that nicely unifies and extends many earlier results. It is especially convenient for digital computer implementation.

With the ideas of recursive least squares established, we formulate the basic linear MMS estimation problem in Section III and derive the recursive Kalman filter equations. Measurements to be processed are represented by a state-variable noise-driven model that has additive measurement noise. As each measurement is incorporated, the Kalman filter produces an optimal estimate of the model state based on all previous measurements through the latest one. With each filter iteration the estimate is updated and improved by the incorporation of new data. If the noises involved have Gaussian probability distributions, the filter produces minimum mean-square error (MSE) estimates. Otherwise, it produces estimates with the smallest MSE obtainable with a linear filter; nonlinear filters could be superior.

Section IV begins with a summary of the matrix Kalman filtering equations and a block diagram of the filter, which includes a replica of the state-variable model for the measurements. A BASIC language computer program for demonstrating first-order Kalman filters is given, and important considerations in the programming of multivariate filters are discussed. The next section introduces extensions of the Kalman filter to situations involving noise coupling matrices, deterministic inputs to the model, nonzero mean values, known initial conditions, correlated noises, and bias estimation.

Section VI is concerned with some of the computational aspects of Kalman filtering [11–14]. Insufficient care in modeling can lead to unrealistic confidence in the estimation accuracy, to the point where additional measurements are effectively ignored by the filter—a situation called divergence. The effects of computational inaccuracies can be reduced by using alternative arrangements of the computations, such as square-root filtering. Examples [15–22] are given to illustrate key concepts.

The final section is a short bibliography that includes references to other material on optimal smoothing [23–28], to filtering for continuous-time systems [26–28], and to several papers describing applications of Kalman filtering.


URL: https://www.sciencedirect.com/science/article/pii/B9780080507804500187

Linear regression

J. Hayavadana, in Statistics for Textile and Apparel Management, 2012

7.8 Confidence limits

It is often the case that the dependent variable is subject to random errors, and the method of least squares can be used in such cases. If the variables are linearly related, we can write

Y=α+βx

The coefficients α and β can be regarded as the population parameters of the true regression of y on x. Their sample estimates are likely to vary from one sample to another; hence confidence limits for the mean and for individual values are set.

The 100(1 − 2r)% confidence limits for α and β are

$a \pm t_{k,r}\, S\sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2}}, \qquad b \pm \dfrac{t_{k,r}\, S}{\sqrt{\sum (x - \bar{x})^2}}$

where a and b are the estimates of α and β from the fitted linear equation.

Note: k = n − 2 and S is the standard deviation about the regression line. Referring to Example 1 and taking r = 2.5%, the table value is t5, 2.5% = 2.57; solving the normal equations gives a = 27.18, b = 0.5982, and S = 0.615. The 95% confidence limits for α are given by

$27.18 \pm 2.57 \times 0.615\sqrt{\dfrac{1}{7} + \dfrac{9.571^2}{47.7143}} = 27.18 \pm 2.27$

And similarly for β

0.5982± 0.228

Confidence limits for variables X and Y

It is accepted that regression is mainly used to predict the value of Y based on X. However, these Y values are subject to error because of uncertainties in the estimation of α and β, and hence confidence limits are found for them.

If X = Xf, then the value of Y at Xf is Y = a + bXf.

The confidence limits for the true mean are given by

$a + bX_f \pm t_{k,r}\, S\sqrt{\dfrac{1}{n} + \dfrac{(X_f - \bar{X})^2}{\sum (X - \bar{X})^2}}$

Referring to Example 1, the estimate of the modulus when Xf = 8 and its limits are

$27.2 + 0.6 \times 8 \pm 2.57 \times 0.615\sqrt{\dfrac{1}{7} + \dfrac{(8 - 9.571)^2}{47.7143}} = 32 \pm 0.70$

The interval within which an individual modulus value is expected to lie is found from

$27.2 + 0.6 \times 8 \pm 2.57 \times 0.615\sqrt{1 + \dfrac{1}{7} + \dfrac{(8 - 9.571)^2}{47.7143}} = 32 \pm 1.7$

Thus it can be concluded from the above exercise that future cords of this type with processing tension 8 are expected to have a mean modulus lying between 31.3 and 32.7, while individual test pieces will have moduli ranging between 30.3 and 33.7, provided the other conditions remain constant.
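
A sketch reproducing these worked limits in Python, using only the quantities quoted above (n = 7, x̄ = 9.571, ∑(x − x̄)² = 47.7143, a = 27.18, b = 0.5982, S = 0.615):

```python
# Sketch reproducing the worked confidence limits with the quantities quoted
# in the text (n = 7, x_bar = 9.571, Sxx = 47.7143, a, b, S as above).
import numpy as np
from scipy.stats import t

n, x_bar, Sxx = 7, 9.571, 47.7143
a, b, S = 27.18, 0.5982, 0.615
t_val = t.ppf(1 - 0.025, df=n - 2)            # approx 2.57 for 5 d.o.f.

# Limits for the regression coefficients.
a_half_width = t_val * S * np.sqrt(1.0 / n + x_bar**2 / Sxx)      # about 2.27
b_half_width = t_val * S / np.sqrt(Sxx)                           # about 0.23

# Limits for the true mean response and a single future value at X_f = 8.
Xf = 8.0
mean_pred = a + b * Xf
mean_half = t_val * S * np.sqrt(1.0 / n + (Xf - x_bar)**2 / Sxx)            # ~0.70
single_half = t_val * S * np.sqrt(1.0 + 1.0 / n + (Xf - x_bar)**2 / Sxx)    # ~1.7

print(a_half_width, b_half_width, mean_pred, mean_half, single_half)
```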


URL: https://www.sciencedirect.com/science/article/pii/B9780857090027500072

APPLICATIONS OF THE POINT KERNEL TECHNIQUE

JAMES WOOD, in Computational Methods in Reactor Shielding, 1982

For R > c

In this range there is an additional parameter c/R. We therefore fit, by the method of least squares, the family of curves given by Rockwell to a function of two variables. The following function provides a rough fit to the data – that is adequate for our purpose.

(5.78) $y = \left[1 + \left(2.65 - 2.92\,t_2 + 1.23\,t_2^2\right) e^{-0.208\,t_1}\right] \times \left[0.166 + 0.330\,t_2 - 0.0814\,t_2^2 - \left(0.000417 + 0.00364\,t_2\right) t_1\right]$

where

t1 = μs(c+R)

t2 = c/R

and y = z/R.
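
A direct transcription of (5.78) as a small function (the example input values are arbitrary):

```python
# Direct transcription of the fitted function (5.78), a sketch for evaluating
# the approximation given t1 = mu_s*(c + R) and t2 = c/R; y approximates z/R.
import numpy as np

def fitted_y(t1, t2):
    bracket1 = 1.0 + (2.65 - 2.92 * t2 + 1.23 * t2**2) * np.exp(-0.208 * t1)
    bracket2 = 0.166 + 0.330 * t2 - 0.0814 * t2**2 - (0.000417 + 0.00364 * t2) * t1
    return bracket1 * bracket2

# Example evaluation for assumed parameter values.
print(fitted_y(t1=2.0, t2=0.5))
```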


URL: https://www.sciencedirect.com/science/article/pii/B9780080286853500083