The least squares method: theory in brief

If a physical quantity $y$ depends on another quantity $x$, the dependence can be studied by measuring $y$ at different values of $x$. The measurements yield a set of values:

$x_1, x_2, \dots, x_i, \dots, x_n$;

$y_1, y_2, \dots, y_i, \dots, y_n$.

From the data of such an experiment one can plot the dependence $y = f(x)$. The resulting curve makes it possible to judge the form of the function $f(x)$, but the constant coefficients that enter this function remain unknown. The least squares method allows them to be determined. Experimental points, as a rule, do not lie exactly on the curve. The least squares method requires that the sum of the squared deviations of the experimental points from the curve, i.e. $\sum_i [y_i - f(x_i)]^2$, be as small as possible.

In practice the method is used most often (and most simply) for a linear dependence, i.e. when

$y = kx$ or $y = a + bx$.

Linear dependence is very common in physics. Even when a dependence is nonlinear, one usually tries to plot it so as to obtain a straight line. For example, if the refractive index of glass $n$ is assumed to be related to the light wavelength $\lambda$ by $n = a + b/\lambda^2$, then $n$ is plotted against $\lambda^{-2}$.

Consider the dependence $y = kx$ (a straight line through the origin). Form the quantity $\varphi$, the sum of the squared deviations of our points from the line:

$$\varphi = \sum_{i=1}^{n} (y_i - k x_i)^2.$$

The value of $\varphi$ is never negative, and it is smaller the closer our points lie to the line. The least squares method states that $k$ should be given the value at which $\varphi$ has a minimum:

$$\frac{d\varphi}{dk} = -2 \sum_{i=1}^{n} x_i (y_i - k x_i) = 0,$$

or

$$k = \frac{\sum x_i y_i}{\sum x_i^2}. \qquad (19)$$

Calculation shows that the root-mean-square error in determining $k$ is

$$S_k = \sqrt{\frac{1}{n-1} \cdot \frac{\sum (y_i - k x_i)^2}{\sum x_i^2}}, \qquad (20)$$

where $n$ is the number of measurements.

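Now consider a somewhat more difficult case, when the points must satisfy the formula $y = a + bx$ (a straight line that does not pass through the origin).

The task is to find the best values of $a$ and $b$ from the available set of values $x_i$, $y_i$.

Again form the quadratic form $\varphi$, equal to the sum of the squared deviations of the points $x_i$, $y_i$ from the line,

$$\varphi = \sum_{i=1}^{n} (y_i - a - b x_i)^2,$$

and find the values of $a$ and $b$ at which $\varphi$ has a minimum:

$$\frac{\partial \varphi}{\partial a} = -2 \sum (y_i - a - b x_i) = 0;$$

$$\frac{\partial \varphi}{\partial b} = -2 \sum x_i (y_i - a - b x_i) = 0.$$

Solving these equations jointly gives

$$b = \frac{\sum (x_i - \bar{x})\, y_i}{\sum (x_i - \bar{x})^2}, \qquad (21)$$

$$a = \bar{y} - b \bar{x}, \qquad (22)$$

where $\bar{x}$ and $\bar{y}$ are the arithmetic means of the $x_i$ and $y_i$.

The root-mean-square errors in determining $a$ and $b$ are

$$S_b = \sqrt{\frac{1}{n-2} \cdot \frac{\sum (y_i - b x_i - a)^2}{\sum (x_i - \bar{x})^2}}, \qquad (23)$$

$$S_a = \sqrt{\left( \frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2} \right) \frac{\sum (y_i - b x_i - a)^2}{n-2}}. \qquad (24)$$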

When processing measurement results by this method, it is convenient to collect all the data in a table in which all the sums entering formulas (19)–(24) are calculated beforehand. The layout of these tables is shown in the examples below.

Example 1. The basic equation of rotational dynamics, $\varepsilon = M/J$ (a straight line through the origin), was studied. The angular acceleration $\varepsilon$ of a body was measured at different values of the torque $M$. The moment of inertia of this body is to be determined. The measured torques and angular accelerations are listed in the second and third columns of Table 5.

Table 5.
n    M, N·m    ε, s⁻²    M²          M·ε        ε − kM      (ε − kM)²
1    1.44      0.52      2.0736      0.7488     0.039432    0.001555
2    3.12      1.06      9.7344      3.3072     0.018768    0.000352
3    4.59      1.45      21.0681     6.6555     -0.08181    0.006693
4    5.90      1.92      34.81       11.328     -0.049      0.002401
5    7.45      2.56      55.5025     19.072     0.073725    0.005435
Σ    –         –         123.1886    41.1115    –           0.016436

By formula (19) we find

$$k = \frac{\sum M_i \varepsilon_i}{\sum M_i^2} = \frac{41.1115}{123.1886} = 0.3337\ \text{kg}^{-1}\cdot\text{m}^{-2}.$$

To find the root-mean-square error we use formula (20):

$$S_k = \sqrt{\frac{0.016436}{(5-1) \cdot 123.1886}} = 0.005775\ \text{kg}^{-1}\cdot\text{m}^{-2}.$$

By formula (18) we have

$J = 1/k = 2.996$ kg·m²; $S_J / J = S_k / k$,

$S_J = (2.996 \cdot 0.005775)/0.3337 = 0.05185$ kg·m².

Taking the confidence level $P = 0.95$, from the table of Student's coefficients for $n = 5$ we find $t = 2.78$ and determine the absolute error $\Delta J = 2.78 \cdot 0.05185 = 0.1441 \approx 0.2$ kg·m².

We write the result in the form:

$J = (3.0 \pm 0.2)$ kg·m².
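The arithmetic of Example 1 can be retraced with a short script. This is a minimal sketch, assuming that formula (20) has the residual form $S_k = \sqrt{\sum(\varepsilon_i - kM_i)^2 / ((n-1)\sum M_i^2)}$, which reproduces the value 0.005775 quoted in the text. The variable names are illustrative, and the Student coefficient is simply copied from the text.

```python
import math

# Data from Table 5: torque M (N·m) and angular acceleration eps (s^-2).
M = [1.44, 3.12, 4.59, 5.90, 7.45]
eps = [0.52, 1.06, 1.45, 1.92, 2.56]
n = len(M)

# Formula (19): slope of the line eps = k*M through the origin.
sum_mm = sum(m * m for m in M)
sum_me = sum(m * e for m, e in zip(M, eps))
k = sum_me / sum_mm

# Formula (20), assumed residual form: standard error of k.
sum_res2 = sum((e - k * m) ** 2 for m, e in zip(M, eps))
s_k = math.sqrt(sum_res2 / ((n - 1) * sum_mm))

# J = 1/k, so the relative errors of J and k are equal.
J = 1.0 / k
s_J = J * s_k / k

t_student = 2.78  # taken from the text: P = 0.95, n = 5
delta_J = t_student * s_J

print(f"k = {k:.4f}, S_k = {s_k:.6f}")
print(f"J = {J:.3f}, S_J = {s_J:.5f}, delta_J = {delta_J:.4f}")
```

The sums are recomputed from the raw columns rather than copied from the table, so the last digits may differ slightly from the tabulated ones.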


Example 2. Let us calculate the temperature coefficient of resistance of a metal by the least squares method. Resistance depends on temperature according to the linear law

$R_t = R_0 (1 + \alpha t) = R_0 + R_0 \alpha t$.

The constant term gives the resistance $R_0$ at a temperature of 0 °C, and the slope is the product of the temperature coefficient $\alpha$ and the resistance $R_0$.

The results of the measurements and calculations are given in Table 6.

Table 6.
n      t, °C       R, Ω      t − t̄        (t − t̄)²     (t − t̄)·R    R − bt − a    (R − bt − a)², 10⁻⁶
1      23          1.242     -62.8333     3948.028     -78.039      0.007673      58.8722
2      59          1.326     -26.8333     720.0278     -35.581      -0.00353      12.4959
3      84          1.386     -1.83333     3.361111     -2.541       -0.00965      93.1506
4      96          1.417     10.16667     103.3611     14.40617     -0.01039      107.898
5      120         1.512     34.16667     1167.361     51.66        0.021141      446.932
6      133         1.520     47.16667     2224.694     71.69333     -0.00524      27.4556
Σ      515         8.403     –            8166.833     21.5985      –             746.804
Σ/n    85.83333    1.4005    –            –            –            –             –

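By formulas (21) and (22) we find

$$\alpha R_0 = \frac{\sum (t_i - \bar{t}) R_i}{\sum (t_i - \bar{t})^2} = \frac{21.5985}{8166.833} = 0.002645\ \Omega/{}^{\circ}\text{C},$$

$$R_0 = \bar{R} - \alpha R_0 \bar{t} = 1.4005 - 0.002645 \cdot 85.83333 = 1.1735\ \Omega,$$

so that $\alpha = 0.002645 / 1.1735 \approx 0.00225$ °C⁻¹.

Let us find the error in determining $\alpha$. Since $\alpha = (\alpha R_0)/R_0$, by formula (18) we have:

$$S_\alpha = \alpha \sqrt{\left( \frac{S_{\alpha R_0}}{\alpha R_0} \right)^2 + \left( \frac{S_{R_0}}{R_0} \right)^2}.$$

Using formulas (23) and (24) we have

$$S_{\alpha R_0} = \sqrt{\frac{746.804 \cdot 10^{-6}}{(6-2) \cdot 8166.833}} = 0.000151\ \Omega/{}^{\circ}\text{C};$$

$$S_{R_0} = \sqrt{\left( \frac{1}{6} + \frac{85.83333^2}{8166.833} \right) \frac{746.804 \cdot 10^{-6}}{6-2}} = 0.014126\ \Omega,$$

which gives $S_\alpha = 0.000132$ °C⁻¹.

Taking the confidence level $P = 0.95$, from the table of Student's coefficients for $n = 6$ we find $t = 2.57$ and determine the absolute error $\Delta\alpha = 2.57 \cdot 0.000132 = 0.000338$ °C⁻¹.

$\alpha = (23 \pm 4) \cdot 10^{-4}$ °C⁻¹ at $P = 0.95$.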
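The calculation of Example 2 can be retraced as follows. This is a sketch under stated assumptions, not the author's own procedure: formulas (23) and (24) are taken in the form with $n - 2$ in the denominator (this reproduces the quoted 0.014126 Ω), and the error of $\alpha$ is combined from the relative errors of the slope and the intercept as if they were independent, which reproduces the quoted 0.000132. Variable names are illustrative.

```python
import math

# Data from Table 6: temperature t (deg C) and resistance R (ohm).
t = [23, 59, 84, 96, 120, 133]
R = [1.242, 1.326, 1.386, 1.417, 1.512, 1.520]
n = len(t)

t_mean = sum(t) / n
R_mean = sum(R) / n

# Formulas (21), (22): slope b = alpha*R0 and intercept a = R0.
s_tt = sum((ti - t_mean) ** 2 for ti in t)
s_tR = sum((ti - t_mean) * Ri for ti, Ri in zip(t, R))
b = s_tR / s_tt
a = R_mean - b * t_mean

# Formulas (23), (24), assumed form with n - 2 degrees of freedom.
res2 = sum((Ri - b * ti - a) ** 2 for ti, Ri in zip(t, R))
s_b = math.sqrt(res2 / ((n - 2) * s_tt))
s_a = math.sqrt((1 / n + t_mean ** 2 / s_tt) * res2 / (n - 2))

# alpha = b / a; relative errors added in quadrature (independence assumed).
alpha = b / a
s_alpha = alpha * math.sqrt((s_b / b) ** 2 + (s_a / a) ** 2)

t_student = 2.57  # taken from the text: P = 0.95, n = 6
delta_alpha = t_student * s_alpha

print(f"b = {b:.6f}, a = {a:.4f}, alpha = {alpha:.6f}")
print(f"S_b = {s_b:.6f}, S_a = {s_a:.6f}, S_alpha = {s_alpha:.6f}")
print(f"delta_alpha = {delta_alpha:.6f}")
```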


Example 3. The radius of curvature of a lens is to be determined from Newton's rings. The radii $r_m$ of the Newton's rings were measured and the numbers $m$ of these rings were determined. The radii of the rings are related to the radius of curvature of the lens $R$ and to the ring number by the equation

$r_m^2 = m \lambda R - 2 d_0 R$,

where $d_0$ is the thickness of the gap between the lens and the plane-parallel plate (or the deformation of the lens),

$\lambda$ is the wavelength of the incident light.

Let
$\lambda = (600 \pm 6)$ nm;
$r_m^2 = y$;
$m = x$;
$\lambda R = b$;
$-2 d_0 R = a$,

then the equation takes the form $y = a + bx$.

.

The results of the measurements and calculations are listed in Table 7.

Table 7.
n      x = m    y = r², 10⁻² mm²    m − m̄    (m − m̄)²    (m − m̄)·y    y − bx − a, 10⁻⁴    (y − bx − a)², 10⁻⁶
1      1        6.101               -2.5     6.25        -0.152525    12.01               1.44229
2      2        11.834              -1.5     2.25        -0.17751     -9.6                0.930766
3      3        17.808              -0.5     0.25        -0.08904     -7.2                0.519086
4      4        23.814              0.5      0.25        0.11907      -1.6                0.0243955
5      5        29.812              1.5      2.25        0.44718      3.28                0.107646
6      6        35.760              2.5      6.25        0.894        3.12                0.0975819
Σ      21       125.129             –        17.5        1.041175     –                   3.12176
Σ/n    3.5      20.8548333          –        –           –            –                   –
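As an illustration of how the columns of Table 7 feed formulas (21) and (22), the sketch below recomputes the slope and intercept from the raw data and then uses the substitution $b = \lambda R$ from the text to get the radius of curvature. The worked result is not given in the text, so the printed values are only what this sketch produces. The variable names are illustrative, and the conversion of $y$ to mm² is an assumption based on the column header.

```python
# Ring numbers m and squared ring radii from Table 7 (header: 10^-2 mm^2).
m = [1, 2, 3, 4, 5, 6]
r2 = [v * 1e-2 for v in [6.101, 11.834, 17.808, 23.814, 29.812, 35.760]]  # mm^2
n = len(m)

m_mean = sum(m) / n
r2_mean = sum(r2) / n

# Formulas (21), (22) with x = m and y = r_m^2.
s_mm = sum((mi - m_mean) ** 2 for mi in m)
s_my = sum((mi - m_mean) * yi for mi, yi in zip(m, r2))
b = s_my / s_mm          # b = lambda * R, in mm^2
a = r2_mean - b * m_mean  # a = -2 * d0 * R, in mm^2

wavelength_mm = 600e-6    # 600 nm expressed in mm
R_lens_mm = b / wavelength_mm

print(f"b = {b:.6f} mm^2, a = {a:.6f} mm^2")
print(f"R = {R_lens_mm:.2f} mm")
```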

Least squares method

The least squares method (OLS, Ordinary Least Squares) is one of the basic methods of regression analysis for estimating the unknown parameters of regression models from sample data. The method is based on minimizing the sum of squared regression residuals.

It should be noted that a method of solving a problem in any field can be called a least squares method if the solution consists in, or satisfies, some criterion of minimizing the sum of squares of certain functions of the unknown variables. Therefore the least squares method can also be used for the approximate representation (approximation) of a given function by other (simpler) functions, for finding a set of quantities that satisfy equations or constraints whose number exceeds the number of those quantities, and so on.

The essence of OLS

Let there be given some (parametric) model of a probabilistic (regression) dependence between the (explained) variable $y$ and a set of factors (explanatory variables) $x$:

$$y = f(x, b) + \varepsilon,$$

where $b = (b_1, b_2, \dots, b_k)$ is the vector of unknown model parameters and

$\varepsilon$ is the random model error.

Suppose there are also sample observations of the values of these variables. Let $t$ be the observation number ($t = 1, \dots, n$). Then $y_t$, $x_t$ are the values of the variables in the $t$-th observation. For given values of the parameters $b$, one can calculate the theoretical (model) values of the explained variable $y$:

$$\hat{y}_t = f(x_t, b).$$

The size of the residuals $e_t = y_t - \hat{y}_t = y_t - f(x_t, b)$ depends on the values of the parameters $b$.

The essence of (ordinary, classical) OLS is to find the parameters $b$ for which the sum of squared residuals (Residual Sum of Squares, RSS) is minimal:

$$\hat{b}_{OLS} = \arg\min_b RSS(b), \qquad RSS(b) = \sum_{t=1}^{n} e_t^2 = \sum_{t=1}^{n} \left( y_t - f(x_t, b) \right)^2.$$

In the general case this problem can be solved by numerical optimization (minimization) methods. One then speaks of nonlinear least squares (NLS or NLLS, Non-Linear Least Squares). In many cases an analytical solution can be obtained. To solve the minimization problem, one finds the stationary points of the function $RSS(b)$ by differentiating it with respect to the unknown parameters $b$, setting the derivatives to zero, and solving the resulting system of equations:

$$\sum_{t=1}^{n} \left( y_t - f(x_t, b) \right) \frac{\partial f(x_t, b)}{\partial b} = 0.$$

If the random model errors are normally distributed, have the same variance and are uncorrelated, the OLS parameter estimates coincide with the maximum likelihood estimates (MLE).

OLS in the case of a linear model

Let the regression dependence be linear:

$$y_t = \sum_{j=1}^{k} b_j x_{tj} + \varepsilon_t = x_t^T b + \varepsilon_t.$$

Let $y$ be the column vector of observations of the explained variable and $X$ the matrix of observations of the factors (the rows of the matrix are the vectors of factor values in a given observation, the columns are the vectors of values of a given factor over all observations). The matrix representation of the linear model is:

$$y = Xb + \varepsilon.$$

Then the vector of estimates of the explained variable and the vector of regression residuals are

$$\hat{y} = Xb, \qquad e = y - \hat{y} = y - Xb,$$

and accordingly the sum of squared regression residuals is

$$RSS = e^T e = (y - Xb)^T (y - Xb).$$

Differentiating this function with respect to the parameter vector $b$ and setting the derivatives to zero, we obtain a system of equations (in matrix form):

$$(X^T X)\, b = X^T y.$$

The solution of this system of equations gives the general formula for the OLS estimates of a linear model:

$$\hat{b}_{OLS} = (X^T X)^{-1} X^T y = \left( \frac{1}{n} X^T X \right)^{-1} \frac{1}{n} X^T y.$$
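To make the matrix recipe concrete, here is a minimal pure-Python sketch that builds the system $(X^T X) b = X^T y$ and solves it by Gaussian elimination. The helper names `solve_linear_system` and `ols` and the data are hypothetical, chosen so that the points lie exactly on $y = 1 + 2x_1 - x_2$. A production implementation would more likely rely on a numerical library.

```python
def solve_linear_system(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    # Augmented matrix [A | rhs]; copying keeps the inputs unchanged.
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(A, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, size + 1):
                aug[row][j] -= factor * aug[col][j]
    z = [0.0] * size
    for i in reversed(range(size)):
        known = sum(aug[i][j] * z[j] for j in range(i + 1, size))
        z[i] = (aug[i][size] - known) / aug[i][i]
    return z


def ols(X, y):
    """Return the OLS estimates by solving (X^T X) b = X^T y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yt for row, yt in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: first column is the constant, then two factors.
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y = [1, 3, 0, 2, 4]  # exactly y = 1 + 2*x1 - x2

b_hat = ols(X, y)
print(b_hat)
```

Because the data are noise-free, the estimates recover the generating coefficients up to rounding error.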

For analytical purposes the last representation of this formula is useful. If the data in the regression model are centered, then in this representation the first matrix has the meaning of the sample covariance matrix of the factors, and the second is the vector of covariances of the factors with the dependent variable. If, in addition, the data are also normalized by the standard deviation (that is, ultimately standardized), then the first matrix has the meaning of the sample correlation matrix of the factors, and the second vector is the vector of sample correlations of the factors with the dependent variable.

An important property of OLS estimates for models with a constant is that the fitted regression line passes through the center of gravity of the sample data, that is, the following equality holds:

$$\bar{y} = \hat{b}_1 + \sum_{j=2}^{k} \hat{b}_j \bar{x}_j.$$

In particular, in the extreme case where the only regressor is a constant, the OLS estimate of the single parameter (the constant itself) equals the mean of the explained variable. That is, the arithmetic mean, known for its good properties from the laws of large numbers, is also an OLS estimate: it satisfies the criterion of the minimum sum of squared deviations from it.

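Example: simple (paired) regression

In the case of paired linear regression $y_t = a + b x_t + \varepsilon_t$ the calculation formulas simplify (matrix algebra is not needed):

$$\hat{b} = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2}, \qquad \hat{a} = \bar{y} - \hat{b}\,\bar{x}.$$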

Properties of OLS estimates

First of all, note that for linear models OLS estimates are linear estimates, as follows from the formula above. For OLS estimates to be unbiased, it is necessary and sufficient that the most important condition of regression analysis holds: the expectation of the random error, conditional on the factors, must be zero. This condition holds, in particular, if

  1. the expectation of the random errors is zero, and
  2. the factors and the random errors are independent random variables.

The second condition, the exogeneity of the factors, is fundamental. If this property does not hold, one can assume that practically any estimates will be highly unsatisfactory: they will not even be consistent (that is, even a very large amount of data does not allow good estimates to be obtained in this case). In the classical case a stronger assumption is made, that the factors are deterministic, in contrast to the random error, which automatically means that the exogeneity condition holds. In the general case, for the estimates to be consistent it suffices that the exogeneity condition holds together with the convergence of the matrix $V_x = \frac{1}{n} X^T X$ to some nonsingular matrix as the sample size grows to infinity.

For (ordinary) OLS estimates to be, in addition to consistent and unbiased, also efficient (the best in the class of linear unbiased estimates), additional properties of the random error are required:

  1. constant (identical) variance of the random errors in all observations (no heteroscedasticity): $V(\varepsilon_t) = \sigma^2$;
  2. no correlation (autocorrelation) between the random errors in different observations: $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \ne j$.

These assumptions can be formulated for the covariance matrix of the vector of random errors: $V(\varepsilon) = \sigma^2 I$.

A linear model satisfying these conditions is called classical. OLS estimates for classical linear regression are unbiased, consistent and the most efficient estimates in the class of all linear unbiased estimates (in the English-language literature the abbreviation BLUE, Best Linear Unbiased Estimator, is sometimes used; in the Russian-language literature the Gauss–Markov theorem is cited more often). As is easy to show, the covariance matrix of the vector of coefficient estimates is:

$$V(\hat{b}_{OLS}) = \sigma^2 (X^T X)^{-1}.$$

Generalized least squares

The least squares method admits a broad generalization. Instead of minimizing the sum of squared residuals, one can minimize some positive definite quadratic form of the residual vector, $e^T W e$, where $W$ is some symmetric positive definite weight matrix. Ordinary least squares is a special case of this approach, in which the weight matrix is proportional to the identity matrix. As is known from the theory of symmetric matrices (or operators), such matrices admit a decomposition $W = P^T P$. Therefore the functional can be represented as $e^T P^T P e = (Pe)^T (Pe) = e_*^T e_*$, that is, as the sum of squares of some transformed "residuals". Thus one can single out a class of least squares methods, the LS methods (Least Squares).

It has been proved (Aitken's theorem) that for the generalized linear regression model (in which no restrictions are imposed on the covariance matrix of the random errors) the most efficient estimates (in the class of linear unbiased estimates) are those of so-called generalized least squares (GLS, Generalized Least Squares), the LS method with a weight matrix equal to the inverse covariance matrix of the random errors: $W = V_\varepsilon^{-1}$.

It can be shown that the formula for the GLS estimates of the parameters of a linear model has the form

$$\hat{b}_{GLS} = (X^T V^{-1} X)^{-1} X^T V^{-1} y.$$

The covariance matrix of these estimates is accordingly

$$V(\hat{b}_{GLS}) = (X^T V^{-1} X)^{-1}.$$

In fact, the essence of GLS is a certain (linear) transformation ($P$) of the original data followed by the application of ordinary least squares to the transformed data. The purpose of this transformation is that for the transformed data the random errors already satisfy the classical assumptions.

Weighted least squares

In the case of a diagonal weight matrix (and hence a diagonal covariance matrix of the random errors) we have so-called weighted least squares (WLS, Weighted Least Squares). In this case the weighted sum of squared model residuals is minimized, that is, each observation receives a "weight" inversely proportional to the variance of the random error in that observation: $e^T W e = \sum_{t=1}^{n} e_t^2 / \sigma_t^2$. In effect, the data are transformed by weighting the observations (dividing by a quantity proportional to the assumed standard deviation of the random errors), and ordinary least squares is applied to the weighted data.
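For a straight line $y = a + bx$ the weighted criterion has a simple closed form in terms of weighted means. The sketch below is one way to write it, assuming the standard deviations $\sigma_t$ of the errors are known. The function name `weighted_line_fit` and the sample data are hypothetical.

```python
def weighted_line_fit(x, y, sigma):
    """Fit y = a + b*x by minimizing sum(((y - a - b*x) / sigma)**2)."""
    w = [1.0 / s ** 2 for s in sigma]  # weight = inverse error variance
    sw = sum(w)
    x_w = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    y_w = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    s_xx = sum(wi * (xi - x_w) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_w) * (yi - y_w) for wi, xi, yi in zip(w, x, y))
    b = s_xy / s_xx
    a = y_w - b * x_w
    return a, b


# Hypothetical measurements: the last two points are five times noisier.
x = [1.0, 2.0, 3.0, 4.0]
y = [5.1, 7.9, 11.2, 13.8]
sigma = [0.1, 0.1, 0.5, 0.5]

a_w, b_w = weighted_line_fit(x, y, sigma)
a_o, b_o = weighted_line_fit(x, y, [1.0] * len(x))  # equal weights = ordinary fit
print(f"weighted:   a = {a_w:.4f}, b = {b_w:.4f}")
print(f"unweighted: a = {a_o:.4f}, b = {b_o:.4f}")
```

With equal weights the function reduces to the ordinary fit, which illustrates that ordinary least squares is the special case of a weight matrix proportional to the identity.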

Some special cases of applying least squares in practice

Approximation of a linear dependence

Consider the case when, in studying the dependence of some scalar quantity $y$ on some scalar quantity $x$ (this may be, for example, the dependence of the voltage $U$ on the current $I$: $U = I \cdot R$, where $R$ is a constant, the resistance of the conductor), these quantities were measured, yielding the values $x_i$ and the corresponding values $y_i$. The measurement data should be recorded in a table.

Table. Measurement results.

Measurement number    1      2      3      4      5      6
x                     x_1    x_2    x_3    x_4    x_5    x_6
y                     y_1    y_2    y_3    y_4    y_5    y_6

The question is: what value of the coefficient $k$ describes the dependence $y = kx$ in the best way? According to least squares, this value should be such that the sum of the squared deviations of the values $y_i$ from the values $k x_i$,

$$\sum_i (y_i - k x_i)^2,$$

is minimal.

The sum of squared deviations has a single extremum, a minimum, which allows us to use the condition

$$\frac{d}{dk} \sum_i (y_i - k x_i)^2 = 0.$$

Let us find the value of the coefficient $k$ from this formula. To do this, we transform its left-hand side as follows:

$$-2 \sum_i x_i (y_i - k x_i) = 0 \;\Rightarrow\; \sum_i x_i y_i - k \sum_i x_i^2 = 0 \;\Rightarrow\; k = \frac{\sum_i x_i y_i}{\sum_i x_i^2}.$$

The last formula allows us to find the value of the coefficient $k$, as required in the problem.

History

Until the beginning of the 19th century, scientists had no definite rules for solving a system of equations in which the number of unknowns is smaller than the number of equations. Until then, ad hoc techniques were used that depended on the type of equations and on the ingenuity of the calculators, so that different calculators, starting from the same observational data, reached different conclusions. The first application of the method belongs to Gauss (1795), and Legendre (1805) independently discovered it and published it under its modern name (French: Méthode des moindres quarrés). Laplace connected the method with probability theory, and the American mathematician Adrain (1808) considered its probability-theoretic applications. The method was spread and improved by the further research of Encke, Bessel, Hansen and others.

Alternative uses of least squares

The idea of the least squares method can also be used in other cases not directly related to regression analysis. The point is that the sum of squares is one of the most common proximity measures for vectors (the Euclidean metric in finite-dimensional spaces).

One application is the "solution" of systems of linear equations in which the number of equations is greater than the number of variables,

$$Ax = b,$$

where the matrix $A$ is not square but rectangular, of size $m \times n$ with $m > n$.

Such a system of equations generally has no solution (if the rank of the augmented matrix is in fact greater than the number of variables). Therefore this system can be "solved" only in the sense of choosing a vector $x$ that minimizes the "distance" between the vectors $Ax$ and $b$. To do this, one can apply the criterion of minimizing the sum of squared differences between the left and right sides of the equations of the system, that is, $(Ax - b)^T (Ax - b) \to \min$. It is easy to show that solving this minimization problem leads to solving the following system of equations:

$$A^T A x = A^T b.$$
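A small numerical illustration may help here. The sketch below takes a hypothetical system of three equations in two unknowns, forms $A^T A x = A^T b$, solves the resulting 2×2 system by Cramer's rule, and checks that the residual $Ax - b$ is orthogonal to the columns of $A$, which is exactly what the normal equations state. All numbers are made up for the example.

```python
# Hypothetical overdetermined system:
#   x1 + x2 = 2
#   x1 - x2 = 0
#   x1      = 2
A = [[1, 1], [1, -1], [1, 0]]
b = [2, 0, 2]

# Build the normal equations A^T A x = A^T b.
ata = [[sum(row[i] * row[j] for row in A) for j in range(2)] for i in range(2)]
atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(2)]

# Solve the 2x2 system by Cramer's rule.
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
x1 = (atb[0] * ata[1][1] - ata[0][1] * atb[1]) / det
x2 = (ata[0][0] * atb[1] - atb[0] * ata[1][0]) / det

# The residual of the least squares "solution" is orthogonal to A's columns.
residual = [row[0] * x1 + row[1] * x2 - bi for row, bi in zip(A, b)]
at_r = [sum(row[i] * ri for row, ri in zip(A, residual)) for i in range(2)]

print(f"x1 = {x1:.4f}, x2 = {x2:.4f}")
print("residual:", [round(r, 4) for r in residual])
print("A^T r:", [round(v, 12) for v in at_r])
```

No exact solution exists for this system (the first two equations give $x_1 = 1$, the third $x_1 = 2$), so the residual is nonzero, but no other choice of $x$ makes its sum of squares smaller.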

3. Approximation of functions by the least squares method

The least squares method is used in processing experimental results to approximate the experimental data by an analytical formula. The specific form of the formula is chosen, as a rule, from physical considerations. Such formulas may be, for example, linear, quadratic, and others.

The essence of the least squares method is as follows. Let the measurement results be presented in a table:

Table 4

x    x_1    x_2    ...    x_n
y    y_1    y_2    ...    y_n

The table is to be approximated by a function of the form

$$y = F(x, a_0, a_1, \dots, a_m), \qquad (3.1)$$

where $F$ is a known function and $a_0, a_1, \dots, a_m$ are unknown constant parameters whose values are to be found. In the least squares method, the approximation of the experimental dependence by function (3.1) is considered the best if the condition

$$Q = \sum_{i=1}^{n} \left[ y_i - F(x_i, a_0, a_1, \dots, a_m) \right]^2 \to \min \qquad (3.2)$$

holds, i.e. the sum of the squared deviations of the sought analytical function from the experimental dependence must be minimal.

Note that the function $Q$ is called the residual.

Since the residual is non-negative,

$$Q \ge 0,$$

it has a minimum. A necessary condition for a minimum of a function of several variables is that all partial derivatives of this function with respect to the parameters equal zero. Thus, finding the best values of the parameters of the approximating function (3.1), that is, the values for which $Q = Q(a_0, a_1, \dots, a_m)$ is minimal, reduces to solving the system of equations:

$$\frac{\partial Q}{\partial a_0} = 0, \quad \frac{\partial Q}{\partial a_1} = 0, \quad \dots, \quad \frac{\partial Q}{\partial a_m} = 0. \qquad (3.3)$$

The least squares method can be given the following geometric interpretation: among the infinite family of curves of a given type, the one curve is found for which the sum of the squared differences between the ordinates of the experimental points and the corresponding ordinates of the points given by the equation of that curve is smallest.

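Linear function

Let the experimental data be represented by a linear function:

$$y = ax + b.$$

We need to choose values of $a$ and $b$ for which the function

$$Q(a, b) = \sum_{i=1}^{n} (y_i - a x_i - b)^2 \qquad (3.4)$$

is minimal. The necessary conditions for a minimum of function (3.4) reduce to the system of equations:

$$\frac{\partial Q}{\partial a} = -2 \sum_{i=1}^{n} (y_i - a x_i - b)\, x_i = 0, \qquad \frac{\partial Q}{\partial b} = -2 \sum_{i=1}^{n} (y_i - a x_i - b) = 0.$$

After transformations we obtain a system of two linear equations in two unknowns:

$$a \sum x_i^2 + b \sum x_i = \sum x_i y_i, \qquad a \sum x_i + b\, n = \sum y_i, \qquad (3.5)$$

solving which, we find the required values of the parameters $a$ and $b$.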

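Finding the parameters of a quadratic function

If the approximating function is the quadratic dependence

$$y = a x^2 + b x + c,$$

then its parameters $a$, $b$, $c$ are found from the condition that the following function be minimal:

$$Q(a, b, c) = \sum_{i=1}^{n} (y_i - a x_i^2 - b x_i - c)^2. \qquad (3.6)$$

The conditions for a minimum of function (3.6) reduce to the system of equations:

$$\frac{\partial Q}{\partial a} = -2 \sum (y_i - a x_i^2 - b x_i - c)\, x_i^2 = 0, \quad \frac{\partial Q}{\partial b} = -2 \sum (y_i - a x_i^2 - b x_i - c)\, x_i = 0, \quad \frac{\partial Q}{\partial c} = -2 \sum (y_i - a x_i^2 - b x_i - c) = 0.$$

After transformations we obtain a system of three linear equations in three unknowns:

$$a \sum x_i^4 + b \sum x_i^3 + c \sum x_i^2 = \sum x_i^2 y_i,$$
$$a \sum x_i^3 + b \sum x_i^2 + c \sum x_i = \sum x_i y_i, \qquad (3.7)$$
$$a \sum x_i^2 + b \sum x_i + c\, n = \sum y_i,$$

solving which, we find the required values of the parameters $a$, $b$ and $c$.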

Example. Suppose the experiment yielded the following table of values of $x$ and $y$:

Table 5

y_i    0.705    0.495    0.426    0.357    0.368    0.406    0.549    0.768

It is required to approximate the experimental data by linear and quadratic functions.

Solution. Finding the parameters of the approximating functions reduces to solving the systems of linear equations (3.5) and (3.7). To solve the problem we use the Excel spreadsheet program.

1. First, group sheets 1 and 2. Enter the experimental values $x_i$ and $y_i$ in columns A and B, starting from the second row (put the column headings in the first row). Then calculate the sums for these columns and place them in the tenth row.

In columns C–G, place the calculation and summation of the quantities needed in systems (3.5) and (3.7): $x_i^2$, $x_i^3$, $x_i^4$, $x_i y_i$ and $x_i^2 y_i$.

2. Ungroup the sheets. Carry out the further calculations in the same way, for the linear dependence on sheet 1 and for the quadratic dependence on sheet 2.

3. Below the resulting table, form the matrix of coefficients $A$ and the column vector of constant terms $B$. Solve the system of linear equations by the following algorithm: the vector of unknowns equals $A^{-1} B$.

To compute the inverse matrix and to multiply matrices, use the Function Wizard and the functions MINVERSE and MMULT.

4. In the cell block H2:H9, use the obtained coefficients to calculate the values of the approximating polynomial $y_i^{calc}$; in the block I2:I9, the deviations $\Delta y_i = y_i^{exp} - y_i^{calc}$; and in column J, the residual $Q = \sum (\Delta y_i)^2$.

The tables and the graphs built with the Chart Wizard are shown in Figures 6, 7 and 8.


Fig. 6. Table for calculating the coefficients of the linear function approximating the experimental data.

Fig. 7. Table for calculating the coefficients of the quadratic function approximating the experimental data.

Fig. 8. Graphical representation of the results of approximating the experimental data by linear and quadratic functions.

Answer. The experimental data are approximated by the linear dependence $y = 0.07881x + 0.442262$ with residual $Q = 0.165167$ and by the quadratic dependence $y = 3.115476x^2 - 5.2175x + 2.529631$ with residual $Q = 0.002103$.
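The same computation can be sketched outside a spreadsheet. One caveat: the column of $x$ values is not present in the table as it appears here, so the sketch assumes $x = 0.5, 0.6, \dots, 1.2$. That is an assumption, chosen because it reproduces the coefficients and residuals quoted in the answer. The helper names `solve` and `power_sum` are illustrative.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    size = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, size + 1):
                aug[row][j] -= factor * aug[col][j]
    sol = [0.0] * size
    for i in reversed(range(size)):
        known = sum(aug[i][j] * sol[j] for j in range(i + 1, size))
        sol[i] = (aug[i][size] - known) / aug[i][i]
    return sol


# ASSUMPTION: the x column is missing from the table; these values are inferred.
x = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]
y = [0.705, 0.495, 0.426, 0.357, 0.368, 0.406, 0.549, 0.768]
n = len(x)


def power_sum(p, q=0):
    """Return sum(x_i**p * y_i**q), the sums entering systems (3.5) and (3.7)."""
    return sum(xi ** p * yi ** q for xi, yi in zip(x, y))


# System (3.5): linear function y = a*x + b.
a1, b1 = solve([[power_sum(2), power_sum(1)],
                [power_sum(1), n]],
               [power_sum(1, 1), power_sum(0, 1)])
q1 = sum((yi - a1 * xi - b1) ** 2 for xi, yi in zip(x, y))

# System (3.7): quadratic function y = a*x^2 + b*x + c.
a2, b2, c2 = solve([[power_sum(4), power_sum(3), power_sum(2)],
                    [power_sum(3), power_sum(2), power_sum(1)],
                    [power_sum(2), power_sum(1), n]],
                   [power_sum(2, 1), power_sum(1, 1), power_sum(0, 1)])
q2 = sum((yi - a2 * xi ** 2 - b2 * xi - c2) ** 2 for xi, yi in zip(x, y))

print(f"linear:    y = {a1:.5f} x + {b1:.6f}, Q = {q1:.6f}")
print(f"quadratic: y = {a2:.6f} x^2 + {b2:.4f} x + {c2:.6f}, Q = {q2:.6f}")
```

The much smaller residual of the quadratic fit reflects the clear minimum in the data, which a straight line cannot follow.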

Tasks. Approximate the function given in the table by linear and quadratic functions.

Table 6.

No.    x =    0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8
0      y =    3.030    3.142    3.358    3.463    3.772    3.251    3.170    3.665
1      y =    3.314    3.278    3.262    3.292    3.332    3.397    3.487    3.563
2      y =    1.045    1.162    1.264    1.172    1.070    0.898    0.656    0.344
3      y =    6.715    6.735    6.750    6.741    6.645    6.639    6.647    6.612
4      y =    2.325    2.515    2.638    2.700    2.696    2.626    2.491    2.291
5      y =    1.752    1.762    1.777    1.797    1.821    1.850    1.884    1.944
6      y =    1.924    1.710    1.525    1.370    1.264    1.190    1.148    1.127
7      y =    1.025    1.144    1.336    1.419    1.479    1.530    1.568    1.248
8      y =    5.785    5.685    5.605    5.545    5.505    5.480    5.495    5.510
9      y =    4.052    4.092    4.152    4.234    4.338    4.468    4.599

The least squares method finds the widest application in various fields of science and practice. It can be physics, chemistry, biology, economics, sociology, psychology, and so on and so forth. As fate would have it, I often have to deal with economics, and so today I will issue you a ticket to an amazing country called Econometrics =) ... How can you not want that?! It's very nice there, you just need to make up your mind! ... But what you probably do want is to learn to solve problems by the least squares method. And especially diligent readers will learn to solve them not only without mistakes but also very quickly ;-) But first, the general statement of the problem and an accompanying example:

Suppose that in some subject area, indicators $x$ and $y$ that have a quantitative expression are studied. There is every reason to believe that the indicator $y$ depends on the indicator $x$. This assumption may be either a scientific hypothesis or based on elementary common sense. Let us leave science aside, however, and explore more appetizing areas, namely grocery stores. Denote by:

$x$ – the retail floor area of a grocery store, sq. m,
$y$ – the annual turnover of a grocery store, million rubles.

It is quite clear that the larger the store's area, the greater its turnover will be in most cases.

Suppose that after observations / experiments / calculations / dances with a tambourine we have numerical data at our disposal:

With the grocery stores, I think, everything is clear: $x_1$ is the area of the 1st store, $y_1$ is its annual turnover, $x_2$ is the area of the 2nd store, $y_2$ is its annual turnover, and so on. By the way, it is not at all necessary to have access to classified materials: a fairly accurate estimate of the turnover can be obtained by means of mathematical statistics. However, let's not get distracted, the commercial espionage course is a paid one =)

The tabular data can also be written as points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ and plotted in the familiar Cartesian coordinate system.

Let us answer an important question: how many points are needed for a good-quality study?

The more, the better. The minimum acceptable set consists of 5–6 points. In addition, with a small amount of data, "anomalous" results must not be included in the sample. For example, a small elite store can earn orders of magnitude more than "its colleagues", thereby distorting the general pattern that is to be found!

Put quite simply, we need to choose a function $y = f(x)$ whose graph passes as close as possible to the points. Such a function is called an approximating function (approximation means "bringing close") or a theoretical function. Generally speaking, an obvious "contender" appears at once: a polynomial of high degree whose graph passes through all the points. But this option is complicated and often simply incorrect (because the graph will "loop" all the time and poorly reflect the main trend).

Thus, the sought function should be fairly simple and at the same time reflect the dependence adequately. As you may guess, one of the methods for finding such functions is called the least squares method. First, let us look at its essence in general terms. Suppose some function $y = f(x)$ approximates the experimental data:

How do we assess the accuracy of this approximation? Let us calculate the differences (deviations) $e_i = y_i - f(x_i)$ between the experimental and the function values (study the drawing). The first thought that comes to mind is to estimate how large the sum $e_1 + e_2 + \dots + e_n$ is, but the problem is that the differences may be negative, and the deviations would cancel one another out in such a sum. Therefore, as an estimate of the accuracy of the approximation, it is tempting to take the sum of the absolute values of the deviations:

$$|e_1| + |e_2| + \dots + |e_n|,$$

or in compact form: $\sum_{i=1}^{n} |e_i|$ (in case someone doesn't know: $\sum$ is the summation sign, and $i$ is an auxiliary "counter" variable that takes values from 1 to $n$).

Approximating the experimental points by different functions, we will obtain different values of $\sum |e_i|$, and obviously, the function for which this sum is smaller is the more accurate one.

Such a method exists and is called the method of least absolute deviations. However, in practice the least squares method has become far more widespread; in it, possible negative values are eliminated not by the absolute value but by squaring the deviations:

$$\sum_{i=1}^{n} e_i^2 = e_1^2 + e_2^2 + \dots + e_n^2,$$

after which efforts are directed at choosing a function such that the sum of squared deviations is as small as possible. This, in fact, is where the name of the method comes from.
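The two criteria are easy to compare on a toy example. In the sketch below the three points and the two candidate functions `f` and `g` are hypothetical. They are chosen so that both candidates have the same sum of absolute deviations, while the sum of squares tells them apart.

```python
# Hypothetical experimental points (x_i, y_i).
points = [(1, 2), (2, 3), (3, 5)]


def sum_abs(func):
    """Sum of absolute deviations between the data and func."""
    return sum(abs(y - func(x)) for x, y in points)


def sum_sq(func):
    """Sum of squared deviations between the data and func."""
    return sum((y - func(x)) ** 2 for x, y in points)


def f(x):
    return x + 1      # candidate 1: misses only the last point, by 1


def g(x):
    return 1.5 * x    # candidate 2: misses two points, by 0.5 each


for name, func in (("f", f), ("g", g)):
    print(name, "sum |e_i| =", sum_abs(func), " sum e_i^2 =", sum_sq(func))
```

Squaring penalizes one large deviation more heavily than several small ones, which is why the least squares criterion prefers the second candidate here.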

And now we return to another important point: as noted above, the chosen function should be fairly simple, but there are many such functions too: linear, hyperbolic, exponential, logarithmic, quadratic, etc. And of course one would like to "narrow the field of activity" right away. Which class of functions should be chosen for the study? A primitive but effective technique:

– The easiest way is to plot the points on a drawing and analyze their arrangement. If they tend to lie along a straight line, then one should look for the equation of a line $y = ax + b$ with optimal values of $a$ and $b$. In other words, the task is to find coefficients $a$ and $b$ such that the sum of squared deviations $\sum_{i=1}^{n} (y_i - (a x_i + b))^2$ is smallest.

If the points are arranged, for example, along a hyperbola, it is obvious in advance that a linear function will give a poor approximation. In this case we look for the most "favorable" coefficients for the hyperbola equation $y = a/x + b$: those that give the minimum sum of squares $\sum_{i=1}^{n} (y_i - (a/x_i + b))^2$.

Now note that in both cases we are dealing with functions of two variables whose arguments are the parameters of the sought dependences:

$$F(a, b) = \sum_{i=1}^{n} (y_i - (a x_i + b))^2, \qquad F(a, b) = \sum_{i=1}^{n} \left( y_i - \left( \frac{a}{x_i} + b \right) \right)^2.$$

And in essence we need to solve a standard problem: find the minimum of a function of two variables.

Recall our example: suppose the "store" points tend to lie along a straight line and there is every reason to assume a linear dependence of turnover on retail area. Let us find coefficients "a" and "b" such that the sum of squared deviations $F(a, b) = \sum_{i=1}^{n} (y_i - (a x_i + b))^2$ is smallest. Everything is as usual: first the first-order partial derivatives. By the linearity rule, we can differentiate right under the summation sign:

$$\frac{\partial F}{\partial a} = -2 \sum_{i=1}^{n} x_i (y_i - a x_i - b), \qquad \frac{\partial F}{\partial b} = -2 \sum_{i=1}^{n} (y_i - a x_i - b).$$

If you want to use this information for an essay or a term paper, I will be very grateful for a link in the list of sources; you will find calculations this detailed in few places:

Let us set up the standard system:

$$-2 \sum_{i=1}^{n} x_i (y_i - a x_i - b) = 0, \qquad -2 \sum_{i=1}^{n} (y_i - a x_i - b) = 0.$$

We divide each equation by "two" and, in addition, "split up" the sums:

$$a \sum x_i^2 + b \sum x_i - \sum x_i y_i = 0, \qquad a \sum x_i + \sum b - \sum y_i = 0.$$

Note: analyze on your own why "a" and "b" can be taken outside the summation sign. By the way, formally this can also be done with the sum $\sum_{i=1}^{n} b = nb$.

Let us rewrite the system in "applied" form:

$$a \sum x_i^2 + b \sum x_i = \sum x_i y_i, \qquad a \sum x_i + b\, n = \sum y_i,$$

after which the algorithm for solving our problem begins to emerge:

Do we know the coordinates of the points? We do. Can we find the sums $\sum x_i$, $\sum y_i$, $\sum x_i^2$, $\sum x_i y_i$? Easily. We set up the simplest system of two linear equations in two unknowns ("a" and "b"). We solve the system, for example, by Cramer's method, and obtain a stationary point. By checking the sufficient condition for an extremum, one can verify that at this point the function $F(a, b)$ attains precisely a minimum. The check involves extra calculations, so we will leave it behind the scenes (if necessary, the missing frame can be viewed). We draw the final conclusion:

The function $y = ax + b$ approximates the experimental points in the best way (at least compared with any other linear function). Roughly speaking, its graph passes as close as possible to these points. In the tradition of econometrics, the resulting approximating function is also called the paired linear regression equation.

The problem under consideration has great practical value. In the situation of our example, the equation $y = ax + b$ allows one to predict what turnover ("y") a store will have at one or another value of retail area (one or another value of "x"). Yes, the resulting forecast will be only a forecast, but in many cases it will turn out to be quite accurate.

I will work through just one problem with "real" numbers, because there are no difficulties in it: all the calculations are at the level of the 7th–8th grade school curriculum. In 95 percent of cases you will be asked to find precisely a linear function, but at the very end of the article I will show that it is no harder to find the equations of the optimal hyperbola, exponential and some other functions.

In fact, it remains to hand out the promised goodies, so that you learn to solve such examples not only without mistakes but also quickly. Study the standard carefully:

A task

As a result of the study of the relationship between two indicators, the following pairs of numbers were obtained:

Using the least squares method, find the linear function that best approximates the empirical (experimental) data. Make a drawing showing the experimental points and the graph of the approximating function in a Cartesian rectangular coordinate system. Find the sum of squared deviations between the empirical and theoretical values. Determine whether the proposed exponential function would approximate the experimental points better (from the point of view of the least squares method).

Note that the "x" values are natural numbers, and this has a characteristic meaning that I will explain a little later; they can, of course, also be fractional. In addition, depending on the content of a particular problem, both the "x" and the "y" values can be fully or partially negative. Well, we have a "faceless" problem, and we begin its solution:

We find the coefficients of the optimal function as the solution of the system:
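Written out for a line y = ax + b through n points (x_i, y_i), the normal equations have the standard form below. The letters a and b are assumed to be the two unknowns named earlier, and i is the "counter" variable of the summation.

```latex
\begin{cases}
a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i \\
a \sum_{i=1}^{n} x_i + b\,n = \sum_{i=1}^{n} y_i
\end{cases}
```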

For a more compact notation, the "counter" variable can be omitted, since it is clear that the summation runs from 1 to n.

It is more convenient to arrange the calculation of the required sums in a table:


The calculations can be done on a calculator, but it is much better to use Excel: it is both faster and free of errors.

Thus, we get the following system:

Here you can multiply the second equation by 3 and subtract the 2nd equation from the 1st term by term. But that is luck: in practice the systems are usually not so convenient, and in such cases Cramer's method comes to the rescue:
The main determinant is nonzero, so the system has a unique solution.

Let us run a check. I know you don't feel like it, but why let errors slip through where they can be ruled out completely? Substitute the solution found into the left-hand side of each equation of the system:

The right-hand sides of the corresponding equations are obtained, which means the system has been solved correctly.

Thus, the desired approximating function is found: of all linear functions, it approximates the experimental data best.

Unlike the direct dependence of a store's turnover on its area, the dependence found here is inverse (the principle "the more, the less"), and this fact is immediately revealed by the negative slope. The function tells us that when a certain indicator increases by 1 unit, the value of the dependent indicator decreases by 0.65 units on average. As they say, the higher the price of buckwheat, the less of it is sold.

To plot the graph of the approximating function, we find two of its values:

and make the drawing:


The line constructed is called a trend line (more precisely, a linear trend line; in general, a trend is not necessarily a straight line). Everyone knows the expression "to be in trend", so I think this term needs no further comment.

Let us calculate the sum of squared deviations between the empirical and theoretical values. Geometrically, this is the sum of the squared lengths of the crimson segments in the drawing (two of which are so small that they are not even visible).

We arrange the calculations in a table:


Again, they can be done by hand; just in case, here is an example for the 1st point:

but it is much more efficient to do them in the already familiar way:

Once again: what is the meaning of the result? Of all linear functions, this function gives the smallest value of the indicator, that is, within its family it is the best approximation. And here, by the way, the final question of the problem is no accident: what if the proposed exponential function approximates the experimental points better?

We find the corresponding sum of squared deviations; to tell the two apart, I will denote it by the letter "epsilon". The technique is exactly the same:


And again, just in case, the calculation for the 1st point:

In Excel, we use the standard function EXP (the syntax can be found in Excel Help).

Conclusion: the sum of squared deviations is larger for the exponential function, so it approximates the experimental points worse than the straight line.

But it should be noted that "worse" does not yet mean "bad". I have now plotted the graph of this exponential function, and it also passes close to the points, so close that without an analytical study it is hard to say which function is more accurate.
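The comparison of the two candidates can be sketched in Python as follows. The data points and both candidate functions here are hypothetical, chosen only to illustrate the technique; they are not the numbers of the problem above.

```python
import math


def sse(f, xs, ys):
    """Sum of squared deviations between the observed ys and the values f(x)."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [9.0, 7.5, 6.2, 5.4, 4.1]


def linear(x):
    # Hypothetical linear candidate
    return 10.0 - 1.2 * x


def exponential(x):
    # Hypothetical exponential candidate
    return 11.0 * math.exp(-0.2 * x)


sigma = sse(linear, xs, ys)         # indicator for the straight line
epsilon = sse(exponential, xs, ys)  # indicator for the exponential function

print(sigma, epsilon)
print("linear fits better" if sigma < epsilon else "exponential fits better")
```

The candidate with the smaller sum wins, but as noted above, a close second can still be a reasonable approximation.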

This completes the solution, and I return to the question of the natural values of the argument. In various studies, usually economic or sociological, natural "x" values are used to number months, years or other equal time intervals. Consider, for example, the following problem.

The essence of the method is that the criterion for the quality of a candidate solution is the sum of squared errors, which one seeks to minimize. To apply it, you need to take as many measurements of the unknown random variable as possible (the more, the higher the accuracy of the solution) and have a set of candidate solutions from which the best is to be chosen. If the set of solutions is parameterized, you need to find the optimal values of the parameters.

Why minimize the squares of the errors rather than the errors themselves? The point is that in most cases errors occur in both directions: the estimate may be larger or smaller than the measurement. If you add up errors with different signs, they cancel each other out, and the resulting sum gives a false idea of the quality of the estimate. Often, so that the final figure has the same dimension as the measured values, the square root of the sum of squared errors is taken.
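A small illustration of this cancellation in Python, with made-up measurements and a made-up estimate:

```python
import math

# Hypothetical measurements scattered around an estimate of 10.0
measurements = [10.0, 12.0, 8.0, 11.0, 9.0]
estimate = 10.0

errors = [m - estimate for m in measurements]

signed_sum = sum(errors)                   # positive and negative errors cancel
squared_sum = sum(e * e for e in errors)   # squares cannot cancel
root = math.sqrt(squared_sum)              # same dimension as the measurements

print(signed_sum, squared_sum, root)
```

Here the signed sum is zero even though four of the five measurements miss the estimate, while the sum of squares reflects every miss.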



The least squares method (LSM) is used in mathematics, in particular in probability theory and mathematical statistics. The method finds its widest application in filtering problems, where a useful signal must be separated from the noise superimposed on it.

It is also used in mathematical analysis to approximate a given function by simpler functions. Another application of LSM is solving systems of equations in which the number of unknowns is smaller than the number of equations.

I have come up with a few more rather unexpected applications of LSM, which I would like to describe in this article.

LSM and typos

The scourge of automatic translators and search engines is typos and spelling errors. Indeed, if a word differs by just 1 letter, the program treats it as a different word and translates or searches for it incorrectly, or does not translate or find it at all.

I had a similar problem: there were two databases with the addresses of Moscow buildings, and they had to be merged into one. But the addresses were written in different styles. One database used the KLADR standard (the All-Russian address classifier), for example: "Babushkina Pilot Str. D10K3". The other used a postal style, for example: "ul. Pilot Babushkin, house 10 Corp.3". There seem to be no errors in either case, but automating the process is incredibly difficult (40 thousand records in each database!). And there were plenty of typos as well... How do you make a computer understand that these 2 addresses refer to the same building? This is where LSM came in handy.

What did I do? Taking the next letter in the first address, I looked for the same letter in the second address. If both were in the same position, I took the error for that letter to be 0. If they were in adjacent positions, the error was 1. If there was a shift of 2 positions, the error was 2, and so on. If the letter did not occur in the other address at all, the error was taken to be n + 1, where n is the number of letters in the 1st address. In this way I computed the sum of squared errors and linked those records for which this sum was minimal.
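The procedure can be sketched in Python roughly as follows. This is only one reading of the description: when a letter occurs several times in the second address, the sketch assumes the nearest occurrence is used, and the function names address_distance and best_match are made up for illustration.

```python
def address_distance(first, second):
    """Sum of squared per-letter position errors between two address strings."""
    n = len(first)
    total = 0
    for i, letter in enumerate(first):
        # Positions of the same letter in the second address
        positions = [j for j, other in enumerate(second) if other == letter]
        if positions:
            # Assumption: take the nearest occurrence of the letter
            error = min(abs(i - j) for j in positions)
        else:
            # Letter is missing from the second address: penalty of n + 1
            error = n + 1
        total += error * error
    return total


def best_match(address, candidates):
    """Return the candidate with the smallest sum of squared errors."""
    return min(candidates, key=lambda c: address_distance(address, c))


print(address_distance("abc", "bac"))
print(best_match("lenina 5", ["mira 12", "lenina 5"]))
```

The measure is not symmetric, since the penalty n + 1 depends on the length of the first address, so the order of the arguments matters.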

Of course, building and block numbers were processed separately. I don't know whether I reinvented the wheel or whether this really was something new, but the problem was solved quickly and well. I wonder, is this method used in search engines? Perhaps it is, since every self-respecting search engine, on encountering an unfamiliar word, suggests a replacement from familiar words ("Did you mean ..."). However, they may do this analysis in some other way.

LSM and searching for pictures, faces and maps

This method can be applied to searching for pictures, drawings, maps and even people's faces.


Nowadays all search engines, instead of searching by picture, actually search by picture captions. This is undoubtedly a useful and convenient service, but I suggest supplementing it with a true search by picture.

A sample picture is entered, and the candidates are ranked by the sum of squared deviations of characteristic points. Identifying these characteristic points is itself a nontrivial task. However, it is entirely solvable: for faces, for example, these are the corners of the eyes and lips, the tip of the nose, the nostrils, the edges and centers of the eyebrows, the pupils, and so on.

By comparing these parameters, you can find the face most similar to the sample. I have already seen sites where such a service works: you can find the celebrity who most resembles the photo you submit, and even make an animation that turns you into the celebrity and back. Surely the same method is used in the Ministry of Internal Affairs databases containing photographs of criminals.
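A minimal sketch of such a ranking in Python. It assumes the characteristic points have already been extracted, listed in the same order for every picture, and brought to a common scale; the names landmark_score and rank_by_similarity and the coordinates are hypothetical.

```python
def landmark_score(sample, candidate):
    """Sum of squared distances between corresponding characteristic points."""
    if len(sample) != len(candidate):
        raise ValueError("both pictures must have the same number of points")
    return sum((xs - xc) ** 2 + (ys - yc) ** 2
               for (xs, ys), (xc, yc) in zip(sample, candidate))


def rank_by_similarity(sample, gallery):
    """Return gallery names ordered from most to least similar to the sample."""
    return sorted(gallery, key=lambda name: landmark_score(sample, gallery[name]))


# Hypothetical points: two characteristic points per picture
sample = [(0.0, 0.0), (1.0, 0.0)]
gallery = {
    "far": [(0.5, 0.0), (1.0, 1.0)],
    "close": [(0.0, 0.0), (1.0, 0.1)],
}
print(rank_by_similarity(sample, gallery))
```

The hard part in practice is the step this sketch skips: locating the points and aligning the pictures before the scores are compared.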


The same method can also be used to search by fingerprints. Map search relies on the natural irregularities of geographic objects: river bends, mountain ranges, and the outlines of shores, forests and fields.

Such is the wonderful and versatile LSM. I am sure that you, dear readers, will be able to find many unusual and unexpected applications of this method yourselves.