Gradient methods: first-order methods of unconstrained optimization

The gradient method and its variants are among the most common methods for finding extrema of functions of several variables. The idea of the gradient method is that, in the process of searching for an extremum (for definiteness, a maximum), one moves at each step in the direction of the greatest increase of the objective function.

The gradient method requires calculating the first derivatives of the objective function with respect to its arguments. Like the methods considered previously, it is an approximate method: it does not reach the optimum point exactly but only approaches it in a finite number of steps.

Fig. 4.11 (two-dimensional case).

Fig. 4.12.

First, a starting point is chosen. Whereas in the one-dimensional case (see Subsection 4.2.6) it is only possible to move from it to the left or to the right (see Fig. 4.9), in the multidimensional case the number of possible directions of movement is infinitely large. In Fig. 4.11, which illustrates the case of two variables, the arrows leaving the starting point A show various possible directions. Movement along some of them increases the value of the objective function relative to point A (for example, directions 1-3), while movement along others decreases it (directions 5-8). Since the position of the optimum point is unknown, the best direction is considered to be the one in which the objective function increases fastest. This direction is called the gradient of the function. Note that at each point of the coordinate plane the direction of the gradient is perpendicular to the tangent to the level line drawn through that point.

In mathematical analysis it is proved that the components of the gradient of a function y = f(x1, x2, ..., xn) are its partial derivatives with respect to the arguments, i.e.

grad f(x1, x2, ..., xn) = (∂y/∂x1, ∂y/∂x2, ..., ∂y/∂xn). (4.20)

Thus, when searching for a maximum by the gradient method, on the first iteration the components of the gradient are calculated by formulas (4.20) for the initial point x^(0), and a working step is made in the direction found, i.e. a transition is made to the new point x^(1) with coordinates

xi^(1) = xi^(0) + λ · ∂y/∂xi |x=x^(0), i = 1, 2, ..., n, (4.21)

or, in vector form,

x^(1) = x^(0) + λ grad f(x^(0)),

where λ is a constant or variable parameter determining the length of the working step, λ > 0. On the second iteration the gradient vector is calculated again, now for the new point x^(1), after which the analogous formula gives the transition to the point x^(2), and so on (Fig. 4.12). For an arbitrary k-th iteration we have

x^(k+1) = x^(k) + λ grad f(x^(k)). (4.22)

If not the maximum but the minimum of the objective function is sought, then on each iteration a step is made in the direction opposite to the direction of the gradient. This is called the direction of the antigradient. Instead of formula (4.22) we then have

x^(k+1) = x^(k) − λ grad f(x^(k)).

There are many varieties of the gradient method, differing in the choice of the working step. One can, for example, pass to each subsequent point with a constant value of λ; then the length of the working step, i.e. the distance between adjacent points x^(k) and x^(k+1), turns out to be proportional to the modulus of the gradient vector. One can, on the contrary, choose λ on each iteration so that the length of the working step remains constant.

Example. It is required to find the maximum of the function

y = 110 − 2(x1 − 4)² − 3(x2 − 5)².

Of course, using the necessary condition for an extremum, we immediately obtain the desired solution: x1 = 4; x2 = 5. However, this simple example conveniently demonstrates the algorithm of the gradient method. Let us calculate the gradient of the objective function:

grad y = (∂y/∂x1; ∂y/∂x2) = (4(4 − x1); 6(5 − x2))

and choose the starting point

x^(0) = (x1^(0) = 0; x2^(0) = 0).

The value of the objective function at this point, as is easily calculated, is y(x^(0)) = 3. Let λ = const = 0.1. The value of the gradient at the point x^(0) is grad y(x^(0)) = (16; 30). Then on the first iteration we obtain, by formulas (4.21), the coordinates of the point x^(1):

x1^(1) = 0 + 0.1 · 16 = 1.6; x2^(1) = 0 + 0.1 · 30 = 3;

y(x^(1)) = 110 − 2(1.6 − 4)² − 3(3 − 5)² = 86.48.

As can be seen, this is significantly greater than the previous value. On the second iteration we have, by formula (4.22):

x1^(2) = 1.6 + 0.1 · 4(4 − 1.6) = 2.56; x2^(2) = 3 + 0.1 · 6(5 − 3) = 4.2.
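The iterations above can be sketched in code. This is a minimal illustration and not part of the original text: fixed-step gradient ascent on the example function, with `lam` playing the role of λ = 0.1; the function names are my own.

```python
def y(x1, x2):
    # the example objective: y = 110 - 2*(x1-4)^2 - 3*(x2-5)^2
    return 110 - 2 * (x1 - 4) ** 2 - 3 * (x2 - 5) ** 2

def grad_y(x1, x2):
    # components of the gradient: dy/dx1 = 4*(4 - x1), dy/dx2 = 6*(5 - x2)
    return 4 * (4 - x1), 6 * (5 - x2)

lam = 0.1                      # constant working-step parameter
x1, x2 = 0.0, 0.0              # starting point x^(0)
for k in range(100):
    g1, g2 = grad_y(x1, x2)
    x1, x2 = x1 + lam * g1, x2 + lam * g2   # step along the gradient
```

With λ = 0.1 each iterate contracts toward (4; 5) by a constant factor per step, so a few dozen iterations already give the optimum to machine precision.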


The coordinate descent method

The idea of this method is that the search proceeds along one coordinate direction at a time, with a new direction taken at each iteration. The descent is carried out successively along each coordinate; the number of coordinate directions equals the number of variables.

To demonstrate how this method works, first take a function z = f(x1, x2, ..., xn) and select any point M0(x10, x20, ..., xn0) in n-dimensional space, where n is the number of arguments of the function. The next step is to fix all the arguments of the function as constants except the first one. This reduces the multidimensional optimization problem to a one-dimensional optimization on a certain segment, namely the search over the argument x1.

To find the value of this variable, we descend along this coordinate to a new point M1. The function is then differentiated, and the new value is found from the expression:

After finding this value, the iteration is repeated with all arguments fixed except x2, and a descent is performed along the new coordinate to the next point M2. The value at the new point is obtained from the expression:

The iterations with fixing are repeated in this way until all the arguments from x1 to xn have been processed. On the last iteration we will have passed successively through all the coordinate directions, finding a local minimum along each, so the objective function reaches a minimum on the last coordinate. One of the advantages of this method is that the descent can be interrupted at any moment, and the last point found will be a minimum point. This is useful when the method enters an infinite loop, in which case the last coordinate found can be taken as the result of the search. However, the goal of finding the global minimum in the region may then not be achieved, because we interrupted the search for the minimum (see Figure 1).


Figure 1 - Interrupting the execution of coordinate descent

Investigation of this method shows that every computed limit point in space is a global minimum point of the given function when the function z = f(x1, x2, ..., xn) is convex and differentiable.

Hence we may conclude: if the function z = f(x1, x2, ..., xn) is convex and differentiable in the space, then every limit point of the sequence starting from M0(x10, x20, ..., xn0) will be a global minimum point of this function under the coordinate descent method (see Figure 2).


Figure 2 - Local minimum points along the coordinate axes

It can be concluded that this algorithm copes perfectly with simple multidimensional optimization problems by successively solving n one-dimensional optimization problems, for example by the golden-section method.

The coordinate descent method proceeds according to the block diagram of the algorithm (see Figure 3). The iterations of the method are as follows:

First, several parameters must be entered: the accuracy ε, which must be strictly positive; the starting point x1 from which the algorithm begins; and the value of λ with index j;

In the next step, the first starting point x1 is taken, after which an ordinary one-dimensional equation in one variable is solved and the minimum is found from the formula, with k = 1, j = 1:

Now, after the extremum point has been calculated, the number of arguments of the function must be checked: if j is less than n, the previous step must be repeated with the index reassigned as j = j + 1. In all other cases, go to the next step.

Now the variable x must be reassigned by the formula x(k+1) = y(n+1), and the convergence of the function to the given accuracy must be tested by the expression:

The extremum point now depends on this expression. If the expression is true, the calculation of the extremum point reduces to x* = x(k+1). Often, however, additional iterations are required, depending on the accuracy; in that case the argument values are reassigned as y(1) = x(k+1) and the indices as j = 1, k = k + 1.
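The steps above can be sketched as follows. This is a hedged illustration rather than the article's exact block diagram: the one-dimensional search along each coordinate is done by the golden-section method mentioned earlier, and the search interval `bound` as well as all names are my own assumptions.

```python
import math

def golden_section(f, a, b, eps=1e-8):
    # 1-D minimization of f on [a, b] by the golden-section method
    phi = (math.sqrt(5) - 1) / 2
    while b - a > eps:
        c, d = b - phi * (b - a), a + phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2

def coordinate_descent(f, x0, eps=1e-6, bound=10.0, max_iter=100):
    # Minimize f over each coordinate in turn, all others held fixed
    x = list(x0)
    for _ in range(max_iter):
        x_prev = list(x)
        for j in range(len(x)):
            def f1(t, j=j):          # f as a function of coordinate j only
                trial = list(x)
                trial[j] = t
                return f(trial)
            x[j] = golden_section(f1, x[j] - bound, x[j] + bound)
        if max(abs(a - b) for a, b in zip(x, x_prev)) < eps:
            break                    # no coordinate moved: converged
    return x

# Example: the quadratic from the gradient-method section, written for minimization
f = lambda x: 2 * (x[0] - 4) ** 2 + 3 * (x[1] - 5) ** 2
x_min = coordinate_descent(f, [0.0, 0.0])
```

Because this test function is separable, a single sweep over the two coordinates already lands on the minimizer (4; 5); the second sweep merely confirms convergence.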


Figure 3 - Block diagram of the coordinate descent method

In total, we have an excellent and versatile multidimensional optimization algorithm that can break a complex problem into several successive one-dimensional iterations. The method is quite easy to implement and guarantees convergence to a local minimum point. But even with such weighty advantages, the method can fall into infinite loops because it may land in a kind of ravine.

There are ravine-shaped functions with depressions. Having fallen into one of these depressions, the algorithm can no longer get out, and the minimum point it reports will be found there. Likewise, a large number of successive applications of the same one-dimensional optimization method can weigh heavily on weak computing machines. Not only is convergence on such a function very slow, since all the variables must be computed, but a demanding accuracy setting often greatly increases the time to solve the problem; the main disadvantage of this algorithm is its limited applicability.

Concluding this study of the various algorithms for solving optimization problems, one cannot fail to note that the quality of these algorithms plays a huge role. Nor should one forget such important characteristics as execution time and stability, the ability to find the best values minimizing or maximizing the objective function, and simplicity of implementation for solving practical problems. The coordinate descent method is easy to use, but in multidimensional optimization problems it is most often necessary to perform comprehensive calculations rather than split the whole problem into subproblems.

The Nelder–Mead method

It is worth noting the fame of this algorithm among researchers of multidimensional optimization methods. The Nelder–Mead method is one of the few methods based on the concept of successive transformation of a deformable simplex around the extremum point; it does not use an algorithm of movement toward the global minimum.

The simplex is regular and is represented as a polyhedron with equidistant vertices in N-dimensional space. In different spaces the simplex takes different forms: in R2 it is an equilateral triangle, and in R3 a regular tetrahedron.

As mentioned above, the algorithm is a development of the method of Spendley, Hext, and Himsworth, but, in contrast to the latter, it allows the use of irregular simplexes. Most often, a simplex means a convex polyhedron with n + 1 vertices, where n is the number of model parameters in n-dimensional space.
To start using this method, the base vertex must be determined from all the available coordinate sets using the expression:

The most remarkable thing about this method is that the simplex can independently perform certain operations:
Reflection through the center of gravity, and reflection with compression or stretching;
Stretching;
Compression.
Reflection takes priority among these properties, since it is the most functionally flexible operation. From any chosen vertex it is possible to reflect through the center of gravity of the simplex using the expression:

where xc is the center of gravity (see Figure 1).


Figure 1 - Reflection through the center of gravity

In the next step, the values of the objective function must be calculated at all vertices of the reflected simplex. After that we obtain complete information about how the simplex will behave in space, and hence information about the behavior of the function.

To search for the minimum or maximum point of the objective function using simplex methods, the following sequence must be observed:
At each step a simplex is built; the function must be evaluated at each of its vertices, after which the results obtained are sorted in ascending order;
The next step is reflection. An attempt must be made to obtain the values of a new simplex; by reflection we get rid of unwanted values that would move the simplex away from the global minimum;
To obtain the values of the new simplex, we take from the sorted results the two vertices with the worst values. It may happen that suitable values cannot be picked right away; then we must return to the first step and compress the simplex toward the point with the smallest value;
The search for the extremum point ends at the center of gravity, provided that the difference between the function values at the simplex points has the smallest values.
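The sequence above can be illustrated with a minimal Nelder–Mead sketch. This is my own simplified variant, not the article's block diagram: it uses reflection through the center of gravity, stretching (expansion), compression (contraction), and a shrink toward the best vertex; the coefficients `alpha`, `gamma`, `rho`, `sigma` are commonly used defaults, and all names are assumptions.

```python
def nelder_mead(f, x0, alpha=1.0, gamma=2.0, rho=0.5, sigma=0.5,
                eps=1e-8, max_iter=1000):
    # Minimal Nelder-Mead sketch: reflection, stretching, compression, shrink.
    n = len(x0)
    lerp = lambda a, b, t: [ai + t * (bi - ai) for ai, bi in zip(a, b)]
    # initial simplex: x0 plus n vertices offset by 1 along each axis
    simplex = [list(x0)] + [
        [x0[j] + (1.0 if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    for _ in range(max_iter):
        simplex.sort(key=f)                      # best vertex first, worst last
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < eps:        # function spread small: stop
            break
        # center of gravity of all vertices except the worst one
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        xr = lerp(centroid, worst, -alpha)       # reflect worst through centroid
        if f(xr) < f(best):
            xe = lerp(centroid, xr, gamma)       # stretching (expansion)
            simplex[-1] = xe if f(xe) < f(xr) else xr
        elif f(xr) < f(simplex[-2]):
            simplex[-1] = xr                     # accept the reflected point
        else:
            xc = lerp(centroid, worst, rho)      # compression (contraction)
            if f(xc) < f(worst):
                simplex[-1] = xc
            else:                                # shrink toward the best vertex
                simplex = [lerp(best, v, sigma) for v in simplex]
    simplex.sort(key=f)
    return simplex[0]

f = lambda x: 2 * (x[0] - 4) ** 2 + 3 * (x[1] - 5) ** 2
x_min = nelder_mead(f, [0.0, 0.0])
```

On this convex quadratic the simplex collapses onto the minimizer (4; 5); on ravine-shaped functions the same loop can stagnate, which is exactly the degeneration problem discussed later in the text.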

The Nelder–Mead algorithm also uses these simplex operations, according to the following formulas:

The reflection through the center of gravity of the simplex is calculated by the following expression:

This reflection is performed strictly toward the extremum point and only through the center of gravity (see Figure 2).


Figure 2 - Reflection of the simplex through the center of gravity

The inward compression of the simplex is calculated by the following expression:

To carry out compression, the point with the smallest value must be determined (see Figure 3).


Figure 3 - Compression of the simplex toward the smallest argument.

The reflection with compression of the simplex is calculated by the following expression:

To perform reflection with compression (see Figure 4), it is necessary to recall the operation of two separate functions: reflection through the center of gravity and compression of the simplex toward the smallest value.


Figure 4 - Reflection with compression

The reflection with stretching of the simplex (see Figure 5) is performed using two operations: reflection through the center of gravity and stretching toward the best value.


Figure 5 - Reflection with stretching.

To demonstrate how the Nelder–Mead method works, refer to the block diagram of the algorithm (see Figure 6).

First of all, as in the previous examples, the accuracy parameter ε must be set, which must be strictly greater than zero; the auxiliary parameters α, β, and γ used in the calculations must also be specified. They are needed to calculate the function f(x0) and to build the simplex itself.

Figure 6 - The first part of the Nelder–Mead method.

After the simplex is built, all values of the objective function must be calculated. As described above for the extremum search with a simplex, the function f(x) must be evaluated at all points of the simplex. Next we sort, and the base point will be:

Now that the base point has been calculated, and all the others are sorted in the list, we check the termination condition for the previously given accuracy:

As soon as this condition becomes true, the point x(0) of the simplex is considered the desired extremum point. Otherwise we go to the next step, where the new value of the center of gravity must be determined by the formula:

If this condition is satisfied, then the point x(0) is a minimum point; otherwise we must go to the next step, in which the smallest function argument must be found:

From the function it is necessary to obtain the argument with the minimum value in order to proceed to the next step of the algorithm. Sometimes several arguments at once have the same value calculated from the function. This problem is solved by redefining the argument value to higher precision.

After recalculating the minimum argument, the newly obtained values must be stored again at the n positions of the arguments.


Figure 7 - The second part of the Nelder–Mead method.

The value calculated by the previous function must be substituted into the condition fmin < f(xN). If this condition is true, the point x(N) is the minimal one of the group stored in the sorted list, and we must return to the step where the center of gravity was calculated; otherwise, we compress the simplex by a factor of 2 and return to the very beginning with the new set of points.

Studies of this algorithm show that methods with irregular simplexes (see Figure 8) are still poorly studied, but this does not prevent them from coping perfectly with the tasks set.

Deeper tests show that the parameters of the stretching, compression, and reflection functions can be tuned experimentally, but the generally accepted parameters of these functions, α = 1/2, β = 2, γ = 2 or α = 1/4, β = 5/2, γ = 2, can also be used. Therefore, before discarding this method for the task at hand, it should be understood that for each new search for an unconstrained extremum one must closely observe the behavior of the simplex during its operation and note the method's non-standard solutions.


Figure 8 - The process of finding a minimum.

Statistics show that one of the most common problems in the operation of this algorithm is degeneration of the deformable simplex. This happens when several vertices of the simplex repeatedly fall into one subspace whose dimension does not satisfy the problem.

Thus the given dimension and the dimension of the configuration push several simplex vertices onto one straight line during operation, sending the method into an infinite loop. The algorithm in this modification is not yet equipped with a way to get out of such a position and shift one vertex aside, so a new simplex with new parameters has to be created so that this does not happen again.

This method has another peculiarity: it works incorrectly with six or more simplex vertices. However, by modifying the method, one can get rid of this problem without even losing execution speed, though the amount of allocated memory will noticeably increase. The method can be considered cyclic, since it is entirely based on loops, which is why incorrect operation is noticed with a large number of vertices.

The Nelder–Mead algorithm can rightfully be considered one of the best methods for finding an extremum point with a simplex, and it is excellent for use in various kinds of engineering and economic problems. Even despite its cyclic nature, it uses a very small amount of memory compared with the coordinate descent method, and to find the extremum itself only the values of the center of gravity and of the function need to be calculated. The small but sufficient number of comprehensive parameters makes this method suitable for widespread use in complex mathematical and real production problems.

Simplex algorithms are a frontier whose horizons we will not soon exhaust, but even now they greatly simplify our life with their visual component.

P.S. The text is entirely my own. I hope this information will be useful to someone.

1. The concept of gradient methods. A necessary condition for the existence of an extremum of a continuously differentiable function is a condition of the form

∂f/∂x1 = 0, ∂f/∂x2 = 0, ..., ∂f/∂xn = 0,

where x1, ..., xn are the arguments of the function. More compactly, this condition can be written as

grad f(X) = 0, (2.4.1)

where grad f(X) denotes the gradient of the function at the given point.

Optimization methods that use the gradient of the objective function to determine its extremum are called gradient methods. They are widely used in optimal adaptive control systems for steady states, in which the optimal (in the sense of the selected criterion) steady state of the system is sought when its parameters, structure, or external influences change.

Equation (2.4.1) is in general nonlinear. Its direct solution is either impossible or quite difficult. Finding solutions of such equations is possible by organizing a special search procedure for the extremum point based on recurrent formulas of various kinds.

The search procedure is built as a multi-step process in which each successive step increases or decreases the objective function, i.e., the conditions

f(X^n) > f(X^(n−1)) (maximum search) or f(X^n) < f(X^(n−1)) (minimum search)

are satisfied. Here n and n − 1 denote the step numbers, and X^n and X^(n−1) are the vectors of the values of the objective function arguments at the n-th and (n−1)-th steps. After the r-th step one may obtain

i.e. after r steps the objective function no longer increases (decreases) under any further change of its arguments; the latter means that a point with coordinates X* has been reached, for which we can write

X* = (x1*, x2*, ..., xn*), (2.4.2)
f(X*) = f* = extr f(X), (2.4.3)

where f* is the extreme value of the objective function.

To solve (2.4.1) in the general case, the following procedure can be applied. We write the change of the coordinates of the objective function arguments in the form

X^(n+1) = X^n + λ^n grad f(X^n),

where λ^n is some coefficient (a scalar) not equal to zero.

At the extremum point, grad f(X) = 0, and the iteration therefore stops, because X^(n+1) = X^n.

The solution of equation (2.4.1) by this method is possible only if the convergence condition of the iterative process is satisfied for any initial value.

Methods of determining the extremum based on the solution of equation (2.4.1) differ from each other in the choice of λ, i.e., in the choice of the step of change of the objective function in the course of the extremum search. This step can be constant or variable; in the second case, the law of change of the step value may in turn be predetermined or may depend on the current value of the function (and may be nonlinear).

2. The method of steepest descent. The idea of the steepest descent method is that the search for the extremum is carried out in the direction of the greatest change of the gradient or antigradient, since this is the shortest path to the extremum point. To implement it, first of all the gradient at the given point must be calculated and the step value chosen.

Calculating the gradient. Since the result of optimization is the coordinates of the extremum point, for which the relation grad f(X*) = 0 is true, the computational procedure of determining the gradient analytically can be replaced by a procedure of determining the components of the gradient at discrete points of the objective function space:

∂f/∂x_i ≈ [f(x_i + Δx_i) − f(x_i)] / Δx_i, (2.4.5)

where Δx_i is a small change of the coordinate x_i.

If we assume that the point of gradient determination lies in the middle of the segment, then

∂f/∂x_i ≈ [f(x_i + Δx_i) − f(x_i − Δx_i)] / (2Δx_i). (2.4.6)

The choice between (2.4.5) and (2.4.6) depends on the steepness of the function on the segment [x_i − Δx_i, x_i + Δx_i]. If the steepness is not large, preference should be given to (2.4.5), since fewer calculations are required here; otherwise, calculation by (2.4.6) gives more accurate results. The accuracy of the gradient determination can also be improved by averaging out random deviations.
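The two estimates can be sketched in code. This is an illustration with my own function names: `grad_forward` corresponds to the one-sided formula (2.4.5), and `grad_central` to the mid-segment formula (2.4.6).

```python
def grad_forward(f, x, dx=1e-6):
    # One-sided difference (2.4.5): one extra evaluation per coordinate.
    f0 = f(x)
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += dx
        g.append((f(xp) - f0) / dx)
    return g

def grad_central(f, x, dx=1e-6):
    # Central difference (2.4.6): the point sits in the middle of the
    # segment; two evaluations per coordinate, but higher accuracy.
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += dx
        xm[i] -= dx
        g.append((f(xp) - f(xm)) / (2 * dx))
    return g

f = lambda x: 2 * (x[0] - 4) ** 2 + 3 * (x[1] - 5) ** 2
g = grad_central(f, [0.0, 0.0])   # exact gradient here is (-16, -30)
```

For this quadratic the central difference reproduces the analytic gradient up to floating-point rounding, while the one-sided estimate carries an O(Δx) bias.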

Choosing the step value. The difficulty of choosing the step value is that the direction of the gradient may vary from point to point. Too large a step will lead to a deviation from the optimal trajectory, i.e. from the direction of the gradient or antigradient, while too small a step means very slow movement toward the extremum because of the large number of calculations required.

One possible method of estimating the step value is the Newton–Raphson method. Let us consider it for the one-dimensional case, under the assumption that the extremum is reached at the point x* determined by the solution of the equation f′(x) = 0 (Fig. 2.4.2).

Let the search start from the point x0, and in the vicinity of this point let f′ be expandable in a Taylor series. Then

f′(x) ≈ f′(x0) + f″(x0)(x − x0). (2.4.7)

The direction of the gradient at the point x0 coincides with the direction of the tangent. When searching for the minimum extremum point, the change of the coordinate x while moving along the gradient can be written in the form

x1 = x0 + λ f′(x0). (2.4.8)

Fig. 2.4.2. Scheme for calculating the step by the Newton–Raphson method.

Substituting (2.4.7) into (2.4.8), we get

f′(x1) ≈ f′(x0)[1 + λ f″(x0)].

Since by the condition of this example the extremum is reached at the point x* determined by the solution of the equation f′(x) = 0, we may try to choose the step so that f′(x1) = 0, i.e. λ = −1 / f″(x0), whence

x1 = x0 − f′(x0) / f″(x0).

We substitute the new value x1 into the objective function. If the change of the function is not yet small enough, the procedure is repeated at the point x1, resulting in the value

x2 = x1 − f′(x1) / f″(x1),

and so on. The calculation is stopped when the changes of the objective function become small, i.e.

|f(x_{k+1}) − f(x_k)| ≤ ε,

where ε is the permissible error of determining the objective function.
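The iteration above can be sketched as follows (a minimal illustration with my own names, assuming the first and second derivatives `df` and `d2f` are available):

```python
def newton_raphson(df, d2f, x0, eps=1e-10, max_iter=50):
    # Newton-Raphson iteration for f'(x) = 0:
    #   x_{k+1} = x_k - f'(x_k) / f''(x_k),
    # stopping when the change between iterations is small.
    x = x0
    for _ in range(max_iter):
        x_new = x - df(x) / d2f(x)
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    return x

# Example: minimum of f(x) = (x - 3)^2 + 1, so f'(x) = 2(x - 3), f''(x) = 2
x_star = newton_raphson(lambda x: 2 * (x - 3), lambda x: 2.0, x0=0.0)
```

For a quadratic function the very first step already lands on the extremum, which mirrors the derivation above: the Taylor expansion (2.4.7) is then exact.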

The optimal gradient method. The idea of this method is as follows. In the usual steepest descent method the step is in general chosen arbitrarily, guided only by the fact that it should not exceed a certain value. In the optimal gradient method the step value is chosen from the requirement that, from the given point, one should move in the direction of the gradient (antigradient) as long as the objective function keeps increasing (decreasing). If this requirement stops being satisfied, the movement must be halted and a new direction of movement (direction of the gradient) determined, and so on, until the optimal point is found.

Thus, the optimal values of λ for the minimum and maximum search, respectively, are determined from the solution of equations (1) and (2), which require that the derivative of the objective function with respect to λ along the gradient direction vanish.

Consequently, determining λ at each step amounts to solving equation (1) or (2) for each point of the trajectory of movement along the gradient, starting from the initial one.
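The optimal gradient method can be sketched as follows. This is a hedged illustration, not the text's exact procedure: instead of solving equation (1) analytically, the optimal step λ at each point is found by a numerical one-dimensional golden-section search along the antigradient, with the search interval [0, 1] chosen for this particular example; all names are mine.

```python
def steepest_descent(f, grad, x0, eps=1e-8, max_iter=200):
    # "Optimal gradient" sketch: at each point move along the antigradient
    # and pick the step lam that minimizes f(x - lam * g).
    def line_min(phi, a=0.0, b=1.0, tol=1e-10):
        # golden-section search for the minimizing step on [a, b]
        r = 0.6180339887498949
        while b - a > tol:
            c, d = b - r * (b - a), a + r * (b - a)
            if phi(c) < phi(d):
                b = d
            else:
                a = c
        return (a + b) / 2

    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) < eps ** 2:   # |grad| ~ 0: optimum reached
            break
        lam = line_min(lambda t: f([xi - t * gi for xi, gi in zip(x, g)]))
        x = [xi - lam * gi for xi, gi in zip(x, g)]
    return x

f = lambda x: 2 * (x[0] - 4) ** 2 + 3 * (x[1] - 5) ** 2
grad = lambda x: [4 * (x[0] - 4), 6 * (x[1] - 5)]
x_min = steepest_descent(f, grad, [0.0, 0.0])
```

Compared with a fixed step λ = const, the per-step line search makes each iteration more expensive but reduces the number of iterations: movement along each gradient direction continues exactly until the function stops decreasing.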

Relaxation method

The algorithm of the method consists in finding the axial direction along which the objective function decreases most strongly (when searching for a minimum). Consider the unconstrained optimization problem

To determine the axial direction at the starting point of the search, the derivatives of the function with respect to all the independent variables are determined. The axial direction corresponds to the derivative with the largest modulus.

Let x_j be the axial direction, i.e. the one with the largest |∂f/∂x_j|.

If the sign of the derivative is negative, the function decreases in the direction of the axis; if it is positive, in the opposite direction:

At the chosen point the derivative is calculated. One step is made in the direction of decrease of the function, the new value is determined, and if the criterion improves, the steps are continued until the minimum value is found in the selected direction. At this point the derivatives with respect to all the variables are determined again, with the exception of the one along which the descent was carried out. The axial direction of fastest descent is found again, along which further steps are made, and so on.

This procedure is repeated until the optimum point is reached, from which no further decrease occurs along any axial direction. In practice, the criterion for terminating the search is the condition

which at the extremum point turns into the exact condition that the derivatives equal zero. Naturally, condition (3.7) can be used only if the optimum lies inside the admissible region of variation of the independent variables. If the optimum falls on the boundary of the region, a criterion of type (3.7) is unsuitable; instead, one should use the condition that all the derivatives with respect to the admissible axial directions are positive.

The descent algorithm for the selected axial direction can be written as

x_j^(k+1) = x_j^k ± h^(k+1) · sign[∂f(X^k)/∂x_j], (3.8)

where

x_j^k is the value of the variable being changed at each step of the descent;

h^(k+1) is the value of the step at the (k + 1)-th step, which may vary depending on the step number;

sign z is the sign function;

X^k is the vector of the point at which the derivatives were last calculated.

The sign "+" in algorithm (3.8) is taken when searching for max f, and the sign "−" when searching for min f. The smaller the step h, the greater the number of calculations on the way to the optimum; but with too large a value of h, hunting may occur near the optimum. Near the optimum it is necessary that the condition h < ε be satisfied.

The simplest algorithm for changing the step h is as follows. At the beginning of the descent, a step is set equal to, for example, 10% of the range D; the descent proceeds with this step in the selected direction as long as the condition f(X^(k+1)) < f(X^k) is satisfied for two successive calculations.

If the condition is violated at any step, the direction of descent along the axis is reversed, and the descent continues from the last point with the step halved.

The formal record of this algorithm is as follows:

(3.9)

As a result of this strategy, the descent step is reduced in the region of the optimum, and the search in a given direction can be stopped when h becomes less than ε.
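The descent along one axis with direction reversal and step halving can be sketched as follows (a minimal one-coordinate illustration; the function names and the test function are my own):

```python
def axis_descent(f, x, j, h0, eps=1e-8):
    # Relaxation-method descent along coordinate j (cf. (3.8)-(3.9)):
    # step with h while f decreases; on an increase, reverse the
    # direction and halve the step; stop once |h| < eps.
    h = h0
    fx = f(x)
    while abs(h) > eps:
        trial = list(x)
        trial[j] += h
        ft = f(trial)
        if ft < fx:
            x, fx = trial, ft    # improvement: keep moving this way
        else:
            h = -h / 2           # overshoot: reverse and halve the step
    return x

# One-dimensional example: minimum of (x - 1.25)^2 at x = 1.25,
# starting from 0 with an initial step of 0.4
f = lambda x: (x[0] - 1.25) ** 2
x = axis_descent(f, [0.0], 0, h0=0.4)
```

The step sequence 0.4, −0.2, 0.1, ... brackets the minimum ever more tightly, which is exactly the behavior described above: the step shrinks automatically in the region of the optimum.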

Then a new axial direction is found, with an initial step for further descent that is usually smaller than the step traveled along the previous axial direction. The nature of the movement toward the optimum in this method is shown in Figure 3.4.

Figure 3.5 - Trajectory of movement to the optimum in the relaxation method

An improvement of the search algorithm for this method can be achieved by applying single-parameter optimization methods. In this case, the following problem-solving scheme may be proposed:

Step 1. The axial direction is determined, and a one-dimensional minimization is carried out along it;

Step 2. A new axial direction is determined;

Gradient method

This method uses the gradient of the function. The gradient of a function at a point is the vector whose projections onto the coordinate axes are the partial derivatives of the function with respect to the coordinates (Fig. 3.6):

Figure 3.6 - Gradient Functions


The direction of the gradient is the direction of the fastest increase of the function (the steepest "slope" of the response surface). The direction opposite to it (the direction of the antigradient) is the direction of the fastest decrease (the steepest "descent").

The projection of the gradient onto the plane of the variables is perpendicular to the tangent to the level line; i.e., the gradient is orthogonal to the lines of constant level of the objective function (Fig. 3.6).

Figure 3.7 - Trajectory of movement to the optimum in the gradient method

In contrast to the relaxation method, in the gradient method steps are made in the direction of the fastest decrease (increase) of the function.

The search for the optimum is carried out in two stages. In the first stage, the values of the partial derivatives with respect to all the variables are found; they determine the direction of the gradient at the point under consideration. In the second stage, a step is made in the direction of the gradient when searching for a maximum, or in the opposite direction when searching for a minimum.

If the analytical expression is unknown, then the direction of the gradient is determined by trial movements on the object. Let a starting point be given. An increment is given to the first variable while the remaining ones are held constant; the corresponding increment of the function and the derivative are determined.

The derivatives with respect to the remaining variables are determined similarly. After the components of the gradient have been found, the trial movements stop and working steps in the selected direction begin. Moreover, the greater the absolute value of the gradient vector, the larger the step size.

When a step is performed, the values of all the independent variables change simultaneously. Each of them receives an increment proportional to the corresponding component of the gradient:

, (3.10)

or in vector form

, (3.11)

where the coefficient is a positive constant;

"+" — when searching for max f;

"−" — when searching for min f.

The gradient search algorithm with normalization of the gradient (division by its modulus) is written as

; (3.12)

(3.13)

Formula (3.13) determines the magnitude of the step in the direction of the gradient.

Algorithm (3.10) has the advantage that the step length is automatically reduced as the optimum is approached. With algorithm (3.12), the strategy for changing the step h can be constructed independently of the absolute value of the gradient.
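The difference between the two step rules can be sketched as follows (the gradient vector, the constant h, and the starting point are illustrative assumptions):

```python
import math

def step_plain(x, grad, h):
    """Rule (3.10): increment proportional to the gradient component, so the
    step shrinks automatically as the gradient vanishes near the optimum."""
    return [xi + h * gi for xi, gi in zip(x, grad)]     # '+' : maximizing

def step_normalized(x, grad, h):
    """Rule (3.12): gradient divided by its module (3.13), so h alone fixes
    the step length regardless of the gradient's magnitude."""
    norm = math.sqrt(sum(gi * gi for gi in grad))
    return [xi + h * gi / norm for xi, gi in zip(x, grad)]

grad = [3.0, 4.0]   # |grad| = 5
print([round(v, 3) for v in step_plain([0, 0], grad, 0.1)])       # [0.3, 0.4]  -> step length 0.5
print([round(v, 3) for v in step_normalized([0, 0], grad, 0.1)])  # [0.06, 0.08] -> step length 0.1
```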

In the gradient method, successive gradient evaluations are separated by one working step: after each step the derivatives are calculated again, the new direction of the gradient is determined, and the search continues (Fig. 3.5).

If the step size is chosen too small, movement to the optimum takes too long, since the derivatives must be calculated at very many points. If the step is chosen too large, looping (oscillation) may occur near the optimum.

The search process continues until the gradient is close to zero or until the boundary of the admissible region of the variables is reached.

In the algorithm with automatic step refinement, the value of h is chosen so that the change in the direction of the gradient at adjacent points remains small.

Criteria for ending the optimum search:

|x(k+1) − x(k)| ≤ ε; (3.16)

|grad I(x(k))| ≤ ε, (3.17)

where |·| is the vector norm.

The search is completed when any one of conditions (3.14)–(3.17) is satisfied.
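Putting the pieces together, a minimal gradient-ascent loop with the two stopping rules of the kind described above (small displacement of the point, near-zero gradient norm) might look as follows; the quadratic test function, the tolerances, and the iteration limit are assumptions for illustration:

```python
import math

def gradient_ascent(grad_f, x, h=0.1, eps=1e-6, max_iter=10_000):
    """Repeat working steps x <- x + h*grad until the gradient norm or the
    displacement of the point falls below the tolerance eps."""
    for _ in range(max_iter):
        g = grad_f(x)
        if math.sqrt(sum(gi * gi for gi in g)) <= eps:   # gradient near zero
            break
        x_new = [xi + h * gi for xi, gi in zip(x, g)]
        if math.sqrt(sum((a - b)**2 for a, b in zip(x_new, x))) <= eps:
            break                                        # point barely moved
        x = x_new
    return x

# Assumed example: maximum of I = -(x1-1)^2 - (x2-2)^2 is at (1, 2)
grad = lambda x: [-2 * (x[0] - 1), -2 * (x[1] - 2)]
opt = gradient_ascent(grad, [0.0, 0.0])
print([round(v, 3) for v in opt])  # [1.0, 2.0]
```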

The disadvantage of gradient search (shared by the methods considered above) is that it can locate only a local extremum of the function. To find other local extrema, the search must be repeated from other initial points.

As we have already noted, the optimization task is the task of finding values of the factors x1 = x1*, x2 = x2*, …, xk = xk* at which the response function (y) reaches its extreme value y = ext (the optimum).

Various methods for solving the optimization problem are known. One of the most widely used is the gradient method, also called the Box–Wilson method or the method of steepest ascent.

Let us consider the essence of the gradient method using the example of a two-factor response function y = f(x1, x2). In Fig. 4.3, curves of equal values of the response function (level curves) are drawn in the factor space. The point with coordinates x1*, x2* corresponds to the extreme value of the response function y_ext.

If we choose any point of the factor space as the initial one (x1^0, x2^0), the shortest path from this point to the top of the response function is the path along the curve whose tangent at each point coincides with the normal to the level curve, i.e. the path in the direction of the gradient of the response function.

The gradient of a continuous single-valued function y = f(x1, x2) is a vector pointing in the direction of the gradient, with coordinates

grad y = i·(∂y/∂x1) + j·(∂y/∂x2),

where i, j are unit vectors in the directions of the coordinate axes x1 and x2. The partial derivatives ∂y/∂x1 and ∂y/∂x2 characterize the direction of the vector.

Since the form of the dependence y = f(x1, x2) is unknown to us, we cannot find the partial derivatives and thus cannot determine the true direction of the gradient exactly.

According to the gradient method, an initial point (initial levels x1^0, x2^0) is selected in some part of the factor space. A symmetric two-level experiment plan is built around these initial levels, and the variation interval is chosen so small that the linear model turns out to be adequate. It is known that any curve on a sufficiently small section can be approximated by a linear model.

After the symmetric two-level plan has been constructed, the interpolation task is solved, i.e. a linear model

y = b0 + b1·x1 + b2·x2

is built and its adequacy is checked.

If the linear model is adequate for the selected variation interval, the direction of the gradient can be determined:

grad y = i·b1 + j·b2.
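A sketch of how a symmetric two-level (2²) plan yields the regression coefficients b1, b2 that define the gradient direction; the four response values y are illustrative assumptions, as if measured in four experiments:

```python
# Normalized factor levels of a 2^2 plan and assumed measured responses
x1 = [-1,  1, -1,  1]
x2 = [-1, -1,  1,  1]
y  = [2.0, 6.0, 4.0, 8.0]

n = len(y)
b0 = sum(y) / n                                   # free term of the linear model
b1 = sum(a * b for a, b in zip(x1, y)) / n        # estimate of dy/dx1
b2 = sum(a * b for a, b in zip(x2, y)) / n        # estimate of dy/dx2

print(b0, b1, b2)  # 5.0 2.0 1.0 -> gradient direction (b1, b2) = (2, 1)
```

With orthogonal normalized levels, each coefficient is just the correlation of y with the corresponding factor column, which is why the regression coefficients directly give the gradient components.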

Thus, the direction of the gradient of the response function is determined by the values of the regression coefficients. This means that we move in the direction of the gradient if from the point with coordinates (x1^0, x2^0) we pass to the point with coordinates

(x1^0 + m·b1, x2^0 + m·b2),

where m is a positive number that determines the step size in the direction of the gradient.

Since in normalized coordinates x1^0 = 0 and x2^0 = 0, the new point has coordinates (m·b1, m·b2).

Having determined the direction of the gradient (b1, b2) and chosen the step m, we first carry out an experiment at the initial level x1^0, x2^0.


Then a step is made in the direction of the gradient, i.e. an experiment is carried out at the point with coordinates (m·b1, m·b2). If the value of the response function has increased compared to its value at the initial level, another step is taken in the direction of the gradient, i.e. an experiment is carried out at the point with coordinates (2m·b1, 2m·b2).

Movement along the gradient continues until the response function begins to decrease. In Fig. 4.3, movement along the gradient corresponds to the straight line emerging from the point (x1^0, x2^0). It gradually deviates from the true direction of the gradient, shown by the dashed line, because of the nonlinearity of the response function.

As soon as the value of the response function decreases, movement along the gradient stops; the experiment with the maximum value of the response function is taken as the new initial level, a new symmetric two-level plan is made up, and the interpolation task is solved again.
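One such "movement along the gradient" cycle can be sketched as follows; the response function stands in for real experiments and, like the coefficients b and the step m, is an illustrative assumption:

```python
def climb(response, x0, b, m):
    """Step from the base point x0 by multiples of (m*b1, m*b2) until the
    measured response stops growing; return the best point found, which
    becomes the initial level for the next two-level plan."""
    best_x, best_y = x0, response(x0)
    k = 1
    while True:
        x = [xi + k * m * bi for xi, bi in zip(x0, b)]
        y = response(x)
        if y <= best_y:          # response decreased: stop the movement
            return best_x, best_y
        best_x, best_y = x, y
        k += 1

# Assumed response surface with optimum at (3, 2)
response = lambda x: -(x[0] - 3)**2 - (x[1] - 2)**2
b = [1.0, 0.5]                   # as if obtained from regression coefficients
x_best, y_best = climb(response, [0.0, 0.0], b, 0.5)
print(x_best)  # [3.0, 1.5] -> new initial level for the next plan
```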

Having built the new linear model, regression analysis is carried out. If the check of the significance of the factors shows that at least one coefficient is significant, the extremum region of the response function (the optimum region) has not yet been reached; a new direction of the gradient is determined and movement toward the optimum region begins.

Refinement of the gradient direction and movement along the gradient continue until the solution of the next interpolation problem shows that all the factors are insignificant, i.e. all bj ≈ 0. This means that the optimum region has been reached. The solution of the optimization task then stops, and the experiment with the maximum value of the response function is taken as the optimum.

In general, the sequence of actions required to solve the optimization problem by the gradient method can be represented as a flowchart (Fig. 4.4). The following practical recommendations should be kept in mind:

1) the initial levels of the factors (xj^0) should be selected as close as possible to the optimum point if some prior information about its position is available;

2) the variation intervals (Δxj) must be chosen so that the linear model is certain to be adequate; the lower bound for Δxj is the minimum value of the variation interval at which the response function remains significant;

3) the step value (m) when moving along the gradient is chosen so that the largest of the products m·bj does not exceed the difference between the upper and lower levels of the factors in normalized form:

max_j (m·bj) ≤ 2.

Hence m ≤ 2 / max_j bj. With a smaller value of m, the difference between the response function at the initial level and at the point with coordinates (m·b1, m·b2) may turn out to be insignificant. With a larger step, there is a danger of slipping past the optimum of the response function.