Very large coefficient of variation. Calculation of the coefficient of variation in Microsoft Excel

Researchers often have to deal with the variability of the trait under study across individual units of a population, that is, with its fluctuation around a certain value, or variation. It must be taken into account in order to obtain reliable information about the progress of a study.

When determining the interval over which a parameter changes, most researchers resort to absolute and relative indicators. Among the latter, the coefficient of variation is the most widely used; if the value under study is normally distributed, it serves as a criterion for the homogeneity of the population. This indicator shows the degree of dispersion of the parameter's values regardless of the scale and unit of measurement.

The coefficient of variation is calculated by dividing the standard deviation by the arithmetic mean of the variable and expressing the result as a percentage. The result can fall anywhere in the range from zero to infinity and grows as the variation of the trait increases. If the obtained value is less than 33.3%, the variation of the trait is weak; if it is greater, the variation is strong. In the latter case, the data set under study is heterogeneous, and its mean is atypical and therefore cannot serve as a generalizing indicator. For such a population it is worth using other indicators.
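The calculation and the 33.3% rule of thumb can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the two sample series are invented for the example, and the population form of the standard deviation is assumed.

```python
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent: population standard deviation / mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean is zero, the coefficient is undefined")
    return statistics.pstdev(values) / mean * 100


def is_homogeneous(values, threshold=33.3):
    """Apply the 33.3% rule of thumb described in the text."""
    return coefficient_of_variation(values) < threshold


# Hypothetical samples, invented for illustration only.
stable = [98, 100, 102, 101, 99]
scattered = [20, 150, 60, 310, 5]

for name, sample in (("stable", stable), ("scattered", scattered)):
    cv = coefficient_of_variation(sample)
    print(f"{name}: CV = {cv:.1f}%, homogeneous = {is_homogeneous(sample)}")
```

The first series stays far below the threshold, while the second exceeds it several times over, so its mean says little about the individual values.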

The coefficient of variation not only characterizes the homogeneity of a population but also serves for comparative assessment. For example, it is used to compare the fluctuation of an attribute in populations whose mean values differ. In that case, the absolute scatter of the data does not allow an objective comparison. The coefficient of variation characterizes the relative variability of the variable and can therefore serve as a relative measure of fluctuations in the parameter under study.
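A small sketch of why the absolute scatter alone is not enough for such comparisons. The two series are hypothetical (the second is simply the first shifted upward), and `cv_percent` is a helper name chosen for this example.

```python
import statistics


def cv_percent(values):
    """Population standard deviation as a percentage of the mean."""
    return statistics.pstdev(values) / statistics.fmean(values) * 100


# Hypothetical data: the second series is the first shifted up by 270.
small_mean = [28, 29, 30, 31, 32]
large_mean = [298, 299, 300, 301, 302]

for label, series in (("small mean", small_mean), ("large mean", large_mean)):
    print(label, round(statistics.pstdev(series), 3), round(cv_percent(series), 2))
```

Both series have the same standard deviation, yet relative to their means the first fluctuates ten times more strongly than the second.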

However, there are some limitations. In particular, the degree of fluctuation can be assessed only for a specific attribute and a population of a specific composition. Equal values of the coefficient can correspond to both strong and weak variation when the attributes differ or the studies are carried out on different populations. Such a result arises for entirely objective reasons, and this must be taken into account when processing experimental data.

The coefficient of variation is widely used in various branches of science and technology. In particular, it is actively used to assess parameter fluctuations in economics and sociology. However, the coefficient cannot be used to assess the variability of variables that can change sign. In that case the calculation yields misleading values: the mean may end up close to zero, which distorts the ratio, or the coefficient may come out negative. In the latter case, it is worth checking the correctness of the calculations performed.
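The problem with sign-changing data can be seen in a short sketch. The profit and loss figures are hypothetical, and the guard in `safe_cv_percent` is only one possible way to refuse such input, not an established convention.

```python
import statistics


def cv_percent(values):
    """Population standard deviation as a percentage of the mean (no safeguards)."""
    return statistics.pstdev(values) / statistics.fmean(values) * 100


def safe_cv_percent(values):
    """Same ratio, but refuse data that is not strictly positive."""
    if min(values) <= 0:
        raise ValueError("coefficient of variation needs strictly positive data")
    return cv_percent(values)


# Hypothetical monthly profit/loss figures that change sign.
near_zero_mean = [-40, 55, -30, 45, -20]   # mean = 2
negative_mean = [-40, 35, -30, 25, -20]    # mean = -6

print(round(cv_percent(near_zero_mean)))   # distorted by the tiny mean
print(round(cv_percent(negative_mean)))    # negative sign
```

In both cases the ratio no longer describes relative variability, which is why a check on the sign of the data before the division is a reasonable precaution.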

Thus, the coefficient of variation is a parameter that makes it possible to estimate the degree of dispersion and the relative variability of the mean value. Using this indicator helps to identify the most significant factors; focusing on them helps to achieve the goals set and solve the tasks at hand.

The coefficient of variation is one of the statistical coefficients most widely used in the financial sector. We explain how to calculate it and how it can be useful to a financial director.

What is the coefficient of variation and why is it needed

The coefficient of variation (CV) is a measure of the relative spread of a random variable. It shows what proportion of the variable's average value its average spread makes up.

In general, the coefficient of variation is used to determine the dispersion of values without reference to the scale of the measured value and units of measurement. It belongs to the relative indicators of statistics; it is measured as a percentage and can therefore be used to compare the variation of several unrelated processes and phenomena.

Using the coefficient of variation in financial modeling

The coefficient of variation is the leader among the variational statistical methods used by financial and investment analysts.

Analysts use the coefficient:

  1. To determine the stability of the predictive model.
  2. To compare several predictive models (mostly investment models) with different absolute levels of return and risk.
  3. For XYZ analysis.

The formula for calculating the coefficient of variation

The coefficient of variation is calculated by the formula:

$$CV = \frac{\sigma}{t_{av}} \times 100\%$$

where CV is the coefficient of variation,

$\sigma$ is the standard deviation of a random variable,

$t_{av}$ is the average value of a random variable.

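Variation coefficient formula for investment financial models:

$$CV = \frac{\sigma_{NPV}}{\overline{NPV}} \times 100\%$$

where NPV is net present value, $\sigma_{NPV}$ is its standard deviation and $\overline{NPV}$ is its average value.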

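The formula for the coefficient of variation for investments in securities:

$$CV = \frac{\sigma_{\text{\% year}}}{\overline{\text{\% year}}} \times 100\%$$

where % year is the yield on the security in % per annum, $\sigma_{\text{\% year}}$ is its standard deviation and $\overline{\text{\% year}}$ is its average value.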

Coefficient of variation in Excel

=STDEVPA(value range)/AVERAGE(value range)

Alternatively, use the built-in "Data Analysis" add-in.
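For readers who work outside Excel, the same ratio can be sketched in Python. The values are hypothetical stand-ins for the "value range". `statistics.pstdev` is the population standard deviation (what Excel's STDEV.P computes) and `statistics.stdev` is the sample version (STDEV.S); which one is appropriate depends on whether the range is the whole population or a sample.

```python
import statistics

# Hypothetical values standing in for the Excel "value range".
values = [12, 15, 11, 14, 13]

mean = statistics.fmean(values)
cv_population = statistics.pstdev(values) / mean * 100  # divides by n
cv_sample = statistics.stdev(values) / mean * 100       # divides by n - 1

print(f"population CV: {cv_population:.2f}%")
print(f"sample CV:     {cv_sample:.2f}%")
```

On short series the two versions differ noticeably, so it is worth stating which one was used when reporting the coefficient.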

Analysis of the coefficient of variation

The coefficient of variation is more versatile than the variance and the standard deviation because it allows you to compare the risk and return of two or more assets that may differ significantly. However, assessing the return/risk pair with the coefficient of variation has limitations. If the expected return tends to zero, the coefficient of variation tends to infinity, and even a slight change in the expected return of a project (or a security) leads to a significant change in the coefficient. This must be taken into account when justifying investment decisions.

If the value of the coefficient of variation is:

  • less than 10%, the degree of risk of the project is insignificant,
  • from 10% to 20%, medium,
  • more than 20%, significant,
  • more than 33%, the financial model is considered heterogeneous and unstable, and objective investment decisions cannot be based on it.
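The bands listed above can be written as a small lookup function. This is only a sketch: the function name and the returned labels are chosen for the example, and the handling of the exact boundary values (10%, 20%, 33%) is an assumption, since the text does not say which band they belong to.

```python
def risk_level(cv_percent):
    """Map a coefficient of variation (in percent) to the risk bands in the text."""
    if cv_percent < 0:
        raise ValueError("expected a non-negative coefficient of variation")
    if cv_percent < 10:
        return "insignificant"
    if cv_percent <= 20:
        return "medium"
    if cv_percent <= 33:
        return "significant"
    return "heterogeneous, unstable model"


for cv in (7.5, 12.0, 24.97, 40.0):
    print(cv, "->", risk_level(cv))
```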

Examples of calculating the coefficient of variation in Excel

Example 1

An enterprise is choosing between two investment projects. The first is the opening of a chain of jewelry retail outlets in Moscow and St. Petersburg.

The second is the opening of a chain of retail outlets across Russia, in cities with a population of over one million.

The enterprise's financial analyst built financial models of both projects in Excel and performed 5,000 Monte Carlo runs for the NPV of each project. Then, using the "Data Analysis" add-in, the analyst obtained the following statistical indicators (see Tables 1 and 2).

Table 1. Indicators for Project 1

The average estimated NPV of Project 1 is 14.05 thousand dollars, and the standard deviation is 1.72 thousand dollars.

The coefficient of variation for the first project is:

CV = 1.72/14.05 = 12%

The project is recognized as medium risk.

The average estimated NPV of Project 2 is $25.23 thousand, and the standard deviation is $6.30 thousand.

The coefficient of variation for the second project will be:

CV = 6.30/25.23 = 24.97%

The project is recognized as high-risk.

If we compare projects 1 and 2 by the coefficient of variation, then Project 1 should be chosen, since it has a better income / risk ratio.
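The comparison can be reproduced from the two pairs of figures quoted in the example. The sketch below only redoes the arithmetic; the dictionary layout and variable names are chosen for illustration.

```python
# (mean NPV, standard deviation of NPV) in thousand dollars, as quoted in the example.
projects = {
    "Project 1": (14.05, 1.72),
    "Project 2": (25.23, 6.30),
}

cv = {name: sd / mean * 100 for name, (mean, sd) in projects.items()}
preferred = min(cv, key=cv.get)  # lower CV = less risk per unit of expected NPV

for name, value in cv.items():
    print(f"{name}: CV = {value:.2f}%")
print("Preferred:", preferred)
```

Project 2 promises a higher average NPV, but its spread grows faster than its mean, which is exactly what the coefficient captures.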

Example 2

The company "Sigma" conducts XYZ analysis of the product range in terms of sales volatility. The company's product line is represented by five products: A, B, C, D and E.

Monthly sales statistics for the last year are available for each product (see the figure). In practice, it is better to have statistics covering more than three years.

Figure. Sales statistics for the last year for each product

The company's financial analyst calculated the coefficient of variation for each product. For product A, for example:

CVa = STDEVA(B2:B13)/AVERAGE(B2:B13) = 30%

The company has the following intervals for XYZ groups:

Z - 31–100%.

This means that goods B and D belong to category X. Demand for them is constant, stocks in warehouses for them must be closely monitored and constantly replenished.

Goods A and C belong to category Y. Demand for them deviates within 30% from month to month. Perhaps there is a seasonality in demand. It is necessary to analyze sales statistics more deeply and develop an optimal policy on stock balances for this group.

Product E has the most volatile demand and sells irregularly, so it might make sense to sell it on a pre-order basis.
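The grouping step can be sketched as follows. Several things here are assumptions and are marked as such: the boundary between X and Y (`X_UPPER`) is hypothetical, because the text only implies that Y ends at 30% and Z starts at 31%, and of the coefficients below only the 30% for product A comes from the example, while the rest are invented so that the result matches the grouping described above.

```python
# Upper bounds of the groups, in percent. Y_UPPER follows the text (Z starts at 31%);
# X_UPPER is a hypothetical boundary, because the X interval is not given.
X_UPPER = 10.0
Y_UPPER = 30.0


def xyz_group(cv_percent):
    """Assign an XYZ group from a coefficient of variation in percent."""
    if cv_percent <= X_UPPER:
        return "X"
    if cv_percent <= Y_UPPER:
        return "Y"
    return "Z"


# Only the 30% for product A comes from the text; the other values are invented.
cv_by_product = {"A": 30.0, "B": 6.0, "C": 22.0, "D": 8.0, "E": 64.0}

groups = {product: xyz_group(cv) for product, cv in cv_by_product.items()}
print(groups)
```

In a real analysis the boundaries would come from the company's own policy, and the coefficients from the monthly sales columns.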

Conclusions

It should be remembered that the coefficient of variation is not the only way to evaluate the effectiveness of an investment, as it does not take into account several important factors:

  1. Volumes of initial investment.
  2. Possible asymmetry of the distribution. The calculation of the coefficient of variation assumes that the values of the random variable are spread symmetrically around the mean (often following a normal distribution). This is not always true. For example, for options, whose yield cannot fall below zero, the distribution is asymmetric, and their coefficient of variation should be analyzed alongside other methods of statistical analysis.
  3. Investment policy of the subject of investment.
  4. Other non-numeric factors.

However, the method of estimating statistical, including financial, data by calculating the coefficient of variation is deservedly recognized as one of the most effective comparative methods of statistics.

Of all the measures of variation, the standard deviation is the one most often used in other types of statistical analysis. However, the standard deviation gives an absolute estimate of dispersion, and in order to understand how large it is relative to the values themselves, a relative indicator is required. This indicator is called the coefficient of variation.

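Variation coefficient formula:

$$CV = \frac{\sigma}{\bar{x}}$$

where $\sigma$ is the standard deviation and $\bar{x}$ is the arithmetic mean.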

This indicator is measured as a percentage (if multiplied by 100%).

It is accepted in statistics that if the coefficient of variation is:

  • less than 10%, the degree of data dispersion is considered insignificant,
  • from 10% to 20%, medium,
  • more than 20% and less than or equal to 33%, significant.

If the value of the coefficient of variation does not exceed 33%, the population is considered homogeneous; if it is more than 33%, heterogeneous.

Averages calculated for a homogeneous population are significant, i.e. they really characterize the population. For a heterogeneous population they are not significant: they do not characterize the population because of the large spread of the attribute's values.

Let's take the example used for calculating the average linear deviation.

And the chart, as a reminder.

Based on these data, we calculate: the mean value, the range of variation, the mean linear deviation, the variance, and the standard deviation.

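The mean is the usual arithmetic mean:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

The range of variation is the difference between the maximum and minimum:

$$R = x_{max} - x_{min}$$

The average linear deviation is calculated by the formula:

$$L = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$$

The variance is calculated by the formula:

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

The standard deviation is the square root of the variance:

$$\sigma = \sqrt{\sigma^2}$$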

We summarize the calculation in a table.
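The whole chain of indicators can be computed in a few lines. The observations below are hypothetical and were invented for this sketch; the formulas are the population (divide by n) versions.

```python
import math

# Hypothetical observations, invented for illustration.
data = [8, 12, 10, 14, 6]
n = len(data)

mean = sum(data) / n
value_range = max(data) - min(data)
mean_linear_deviation = sum(abs(x - mean) for x in data) / n
variance = sum((x - mean) ** 2 for x in data) / n
std_dev = math.sqrt(variance)
cv_percent = std_dev / mean * 100

print(f"mean = {mean}, range = {value_range}")
print(f"mean linear deviation = {mean_linear_deviation}")
print(f"variance = {variance}, standard deviation = {std_dev:.3f}")
print(f"coefficient of variation = {cv_percent:.2f}%")
```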

Variation of an indicator reflects the variability of a process or phenomenon. Its degree can be measured using several indicators.

    Range of variation - the difference between the maximum and minimum. Reflects the range of possible values.

    Average linear deviation - the average of the absolute deviations of all values of the analyzed population from their mean.

    Variance - the mean square of the deviations.

    Standard deviation - the square root of the variance (of the mean squared deviation).

    Coefficient of variation - the most universal indicator, reflecting the degree of dispersion of values regardless of their scale and units of measurement. It is measured as a percentage and can be used to compare the variation of different processes and phenomena.

Thus, in statistical analysis there is a system of indicators reflecting the homogeneity of phenomena and the stability of processes. Often, variation indicators have no independent meaning and are used for further data analysis. The exception is the coefficient of variation: it characterizes the homogeneity of the data and is a valuable statistical characteristic in its own right.

One of the key stages in the preparation of procurement documentation is the calculation of the initial maximum contract price (IMCP). Legislation provides for several ways in which the calculation can be made. The most commonly used is the method of comparable market prices. In this case, the final IMCP should be determined taking into account the coefficient of variation. Therefore, all customers need to understand what this indicator is and how to determine it correctly.

What is the coefficient of variation

The size of the IMCP is determined at the planning stage. This amount should be reflected in the plan and schedule. Immediately before the notice is prepared, it is adjusted to reflect the economic situation at that time. Issues related to the IMCP are covered in Article 22 of Law 44-FZ. Methods for calculating it are described in Order of the Ministry of Economic Development No. 567 dated October 2, 2013. The same document provides the rules for determining the coefficient of variation.

Several methods have been developed for determining the IMCP: the normative method, the tariff method, the design-estimate method and the cost method. The method of comparable market prices has the highest priority and is recommended for determining the starting price. It involves comparing the commercial offers provided by potential suppliers at the customer's request. The coefficient of variation is used for this analysis. It is expressed as a percentage.

The coefficient of variation is understood as a measure of the relative spread of offered prices. It shows what share of the average price the average spread of prices makes up. This indicator can take the following values:

  1. Less than 10%. In this case, the difference in prices is considered insignificant.
  2. From 10% to 20%. The spread is considered average.
  3. From 20% to 33%. The difference is considered significant, but acceptable.
  4. Over 33%. The data is heterogeneous. Data with a coefficient of variation over 33% may not be used when calculating the IMCP.

A special formula has been developed to determine the coefficient. It is easy to calculate the parameter using it by substituting the appropriate data. You can simplify your task by using calculators, which are widely represented on the Internet today.

What to do if the coefficient is too high

If the calculated coefficient of variation is less than 33%, the sample is recognized as homogeneous, and the obtained value can be used to determine the IMCP.

If the value of the coefficient is higher than 33%, the data used need to be adjusted. To do this, additional market research is carried out: commercial offers are collected from more suppliers and the calculation is repeated on the new data. If additional offers cannot be collected, information from previously concluded contracts, which are stored in the register of contracts, can be used.

In an extreme situation, when the desired coefficient of variation cannot be achieved, unsuitable offers can be excluded from the sample. You can also ask the supplier to indicate the amount you need in their offer.

Calculation rules

The method for calculating the coefficient of variation is prescribed in the order of the Ministry of Economic Development No. 567. According to current regulations, the customer must send at least five requests for commercial proposals to potential suppliers. For the calculation, at least three proposals are used that fully meet the requirements of the customer.

It should be noted that Order No. 567 is not a normative legal act, so compliance with it is not mandatory and there are no penalties for violating it. However, to avoid disputes, customers are advised to follow these calculation rules.

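The following formula is used to determine the coefficient of variation:

$$V = \frac{\sigma}{\langle p \rangle} \times 100$$

where $\sigma$ is the standard deviation of the prices and $\langle p \rangle$ is the arithmetic mean price.

The standard deviation shows the spread of the data. It is determined from the average price and the deviations of the individual prices from it, and can be calculated using the following formula:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (p_i - \langle p \rangle)^2}{n - 1}}$$

where $p_i$ is the price in offer $i$ and $n$ is the number of offers used in the calculation.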

In situations where the purchase includes several items at the same time, the calculation is carried out for each of them. This allows you to identify products with the largest price range.

Calculation example

Assume that a government agency is purchasing printers for its own needs. Requests were sent to potential suppliers, and four commercial offers were received with the following prices: 2500 rubles, 2800 rubles, 2450 rubles and 2600 rubles.

The next step is to calculate the standard deviation
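The remaining arithmetic for the four offers can be sketched as follows. The sample form of the standard deviation (division by n - 1) is assumed here; with the population form the coefficient comes out slightly lower, and in either case it stays well below 33%.

```python
import statistics

prices = [2500, 2800, 2450, 2600]  # rubles, from the four commercial offers

mean_price = statistics.fmean(prices)
std_dev = statistics.stdev(prices)          # sample form, divides by n - 1 (assumed)
cv_percent = std_dev / mean_price * 100

print(f"mean price: {mean_price:.2f} rubles")
print(f"standard deviation: {std_dev:.2f} rubles")
print(f"coefficient of variation: {cv_percent:.2f}%")
print("usable for the price calculation:", cv_percent <= 33)
```

The spread of these offers falls into the "insignificant" band, so the sample can be treated as homogeneous.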


Variation indicators. When studying a variable trait in the units of a population, one cannot limit oneself to calculating the average of the individual variants, since the same average may describe populations that are far from identical in composition.

The variation of a trait is the difference between the individual values of a trait within the studied population.

The term "variation" comes from the Latin variatio - change, fluctuation, difference. However, not all differences are commonly referred to as variation.

In statistics, variation means quantitative changes in the values of the trait under study within a homogeneous population that are caused by the intersecting influence of various factors. The fluctuation of individual values is characterized by variation indicators. The greater the variation, the further apart, on average, the individual values lie from one another.

Variation of a trait is measured with absolute and relative indicators.

Absolute indicators include the range of variation, the mean linear deviation, the standard (mean square) deviation and the variance. Apart from the variance, which is expressed in squared units, absolute indicators have the same dimension as the quantities under study.

Relative indicators include coefficients of oscillation, linear deviation and variation.

Absolute indicators. Let us calculate the absolute indicators characterizing the variation of the trait.

The range of variation is the difference between the maximum and minimum values of a trait.

R = Xmax – Xmin.

The range of variation is not always applicable, since it takes into account only the extreme values of the trait, which can differ greatly from all other units.

Variation in a series can be determined more accurately with indicators that take into account the deviations of all variants from the arithmetic mean.

There are two such indicators in statistics: the mean linear and the mean square deviation.

The average linear deviation (L) is the arithmetic mean of the absolute values of the deviations of individual variants from the average:

$$L = \frac{\sum |x_i - \bar{x}|}{n}$$

In practice, the average linear deviation is used to analyze the composition of the workforce, the rhythm of production and the uniformity of the supply of materials.

The disadvantage of this indicator is that it complicates probabilistic calculations and makes it difficult to apply the methods of mathematical statistics.

The standard deviation ($\sigma$) is the most common and accepted measure of variation. It is somewhat larger than the average linear deviation. For moderately asymmetric distributions, the following relation between them holds:

$$\sigma \approx 1.25\,L$$

To calculate it, each deviation from the average is squared, all the squares are summed (taking the weights into account), the sum of the squares is divided by the number of members of the series, and the square root of the quotient is taken.

All these steps are expressed by the following formula:

$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}}$$

where $f_i$ is the weight (frequency) of variant $x_i$; for ungrouped data every $f_i = 1$ and $\sum f_i = n$. That is, the standard deviation is the square root of the arithmetic mean of the squared deviations from the mean.

The standard deviation is a measure of the reliability of the mean. The smaller σ, the better the arithmetic mean reflects the entire represented population.

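The arithmetic mean of the squared deviations of the variants from the average value is called the variance ($\sigma^2$), which is calculated by the formulas

$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n} \quad \text{(simple)}, \qquad \sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i} \quad \text{(weighted)}$$

A distinctive feature of this indicator is that squaring the deviations $(x_i - \bar{x})$ reduces the relative weight of small deviations and increases that of large ones in the total sum of deviations.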

The variance has a number of properties, some of which make it easier to calculate:

1. The variance of a constant is 0.

If $x_i = c$, then $\bar{x} = c$ and $x_i - \bar{x} = 0$.

Then $\sigma^2 = 0$.

2. If all variants of the attribute values (x) are reduced by the same number, the variance does not change.

Let $y_i = x_i - A$; then, in accordance with the properties of the arithmetic mean, $\bar{y} = \bar{x} - A$ and $y_i - \bar{y} = x_i - \bar{x}$.

The variance in the new series will be equal to

$$\sigma_y^2 = \frac{\sum (y_i - \bar{y})^2}{n} = \frac{\sum (x_i - \bar{x})^2}{n} = \sigma_x^2$$

i.e. the variance of the new series is equal to the variance of the original series.

3. If all variants of the attribute values are reduced by the same factor (k times), the variance decreases by a factor of $k^2$.

Let $y_i = x_i / k$; then $\bar{y} = \bar{x} / k$ and $y_i - \bar{y} = (x_i - \bar{x}) / k$.

The variance of the new series will be equal to

$$\sigma_y^2 = \frac{\sum (x_i - \bar{x})^2}{k^2 n} = \frac{\sigma_x^2}{k^2}$$

4. The variance calculated relative to the arithmetic mean is minimal. The mean square of deviations calculated relative to an arbitrary number $A$ is greater than the variance calculated relative to the arithmetic mean by the square of the difference between the arithmetic mean and the number $A$, i.e. $\sigma_A^2 = \sigma^2 + (\bar{x} - A)^2$. The variance about the mean is therefore always smaller than the mean square of deviations calculated from any other value. When $A$ is set to 0, so that no deviations need to be calculated, the formula becomes:

$$\sigma^2 = \overline{x^2} - \bar{x}^2$$
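The four properties can be checked numerically. The series, the shift of 3, the factor k = 2 and the reference value A = 5 are all arbitrary choices made for this sketch; the population variance (`statistics.pvariance`) is used throughout.

```python
import statistics

x = [4, 8, 6, 10, 12]          # hypothetical series
n = len(x)
mean_x = statistics.fmean(x)
var_x = statistics.pvariance(x)

# 1. A constant series has zero variance.
var_constant = statistics.pvariance([7, 7, 7, 7])

# 2. Subtracting the same number from every value leaves the variance unchanged.
var_shifted = statistics.pvariance([v - 3 for v in x])

# 3. Dividing every value by k divides the variance by k squared.
k = 2
var_scaled = statistics.pvariance([v / k for v in x])

# 4. Mean square of deviations from an arbitrary A exceeds the variance by (mean - A)^2.
A = 5
msd_from_A = sum((v - A) ** 2 for v in x) / n

# With A = 0 this gives the shortcut: variance = mean of squares - square of the mean.
var_shortcut = sum(v * v for v in x) / n - mean_x ** 2

print(var_x, var_constant, var_shifted, var_scaled, msd_from_A, var_shortcut)
```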

Above, the calculation of variation indicators was considered for quantitative traits, but in economic calculations the task may also be to assess the variation of qualitative traits. For example, when studying the quality of manufactured products, products can be divided into high-quality and defective.

In this case we are talking about alternative features.

Alternative features are those that some units of the population have, while others do not. Examples are work experience among applicants or an academic degree among university teachers. The presence of a feature in a unit of the population is conventionally denoted by 1, and its absence by 0. Then, if the proportion of units with the feature (in the total number of units of the population) is denoted by p, and the proportion of units without it by q, the variance of an alternative feature can be calculated by the general rule. Moreover, p + q = 1, and hence q = 1 – p.

First, we calculate the average value of the alternative feature:

$$\bar{x} = \frac{1 \cdot p + 0 \cdot q}{p + q} = p,$$

i.e. the mean value of an alternative attribute is equal to the proportion of units that have this attribute.

The variance of the alternative attribute will be equal to:

$$\sigma^2 = \frac{(1 - p)^2 p + (0 - p)^2 q}{p + q} = q^2 p + p^2 q = pq(p + q) = pq$$

Thus, the variance of an alternative attribute is equal to the product of the proportion of units that have the attribute and the proportion of units that do not.

And the standard deviation will be equal to $\sigma = \sqrt{pq}$.
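A quick numerical check on a 0/1 series shows the same result. The batch of ten items is hypothetical; the point of the sketch is only that the ordinary mean and population variance of the coded data coincide with p and p times q.

```python
import math
import statistics

# Hypothetical batch: 1 = high-quality item, 0 = defective item.
items = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

p = sum(items) / len(items)   # share of units that have the feature
q = 1 - p                     # share of units that do not

mean = statistics.fmean(items)
variance = statistics.pvariance(items)
std_dev = math.sqrt(variance)

print(f"p = {p}, q = {q:.1f}")
print(f"mean = {mean}, variance = {variance:.2f}, standard deviation = {std_dev:.1f}")
```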

Relative indicators. For the purposes of comparing the fluctuations of different traits in the same population, or when comparing the fluctuation of the same trait in several populations, variation indicators expressed in relative terms are of interest. The basis for comparison is the arithmetic mean. These indicators are calculated as the ratio of the range of variation, the average linear deviation or the standard deviation to the arithmetic mean or median.

Most often they are expressed as a percentage and serve not only for a comparative assessment of variation but also to characterize the homogeneity of the population. The population is considered homogeneous if the coefficient of variation does not exceed 33%. There are the following relative indicators of variation:

1. The coefficient of oscillation reflects the relative fluctuation of the extreme values of the attribute around the average.

$$V_R = \frac{R}{\bar{x}} \times 100\%$$

2. The relative linear deviation reflects the share of the average absolute deviation in the average value.

$$V_L = \frac{L}{\bar{x}} \times 100\%$$

3. The coefficient of variation evaluates the typicality of the mean values.

$$V_\sigma = \frac{\sigma}{\bar{x}} \times 100\%$$

The smaller $V_\sigma$, the more homogeneous the population with respect to the trait under study and the more typical the average. If $V_\sigma \le 33\%$, the distribution is close to normal and the population is considered homogeneous. From the above example, the second set is homogeneous.
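The three relative indicators can be computed side by side. The data are hypothetical, and the variable names are descriptive labels chosen for this sketch; each indicator is simply the corresponding absolute indicator divided by the arithmetic mean.

```python
import math

data = [20, 25, 30, 35, 40]   # hypothetical values of the trait
n = len(data)
mean = sum(data) / n

value_range = max(data) - min(data)
mean_linear_deviation = sum(abs(x - mean) for x in data) / n
std_dev = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

oscillation_coeff = value_range / mean * 100
relative_linear_deviation = mean_linear_deviation / mean * 100
variation_coeff = std_dev / mean * 100

print(f"coefficient of oscillation: {oscillation_coeff:.2f}%")
print(f"relative linear deviation:  {relative_linear_deviation:.2f}%")
print(f"coefficient of variation:   {variation_coeff:.2f}%")
```

Because the range depends only on the two extreme values, the coefficient of oscillation is the largest of the three for the same data.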

Types of variances and the rule for adding variances. Along with studying the variation of a trait over the population as a whole, it is often necessary to trace the quantitative changes in the trait within the groups into which the population is divided, as well as between groups. This is done by calculating and analyzing different kinds of variance.

In this case, it is possible to determine three indicators of the variability of the trait in the aggregate:

1. The general variation of the population, which is the result of the action of all causes. This variation can be measured by the total variance ($\sigma^2$), which characterizes the deviations of the individual values of the trait from the overall average

$$\sigma^2 = \frac{\sum_j \sum_i (x_{ij} - \bar{x})^2}{\sum_j n_j}.$$

2. Variation of group averages, expressing the deviations of group averages from the overall average and reflecting the influence of the factor by which the grouping was made. This variation can be measured by the so-called intergroup variance ($\delta^2$)

$$\delta^2 = \frac{\sum_j (\bar{x}_j - \bar{x})^2 n_j}{\sum_j n_j},$$

where $\bar{x}_j$ are the group averages, $\bar{x}$ is the overall average for the entire population, and $n_j$ is the size of each group.

3. Residual (or intra-group) variation, which is expressed in the deviation of the individual values of the trait in each group from their group average and, therefore, reflects the influence of all other factors except the one underlying the grouping. Since the variation in each group is reflected by the group variance

$$\sigma_j^2 = \frac{\sum_i (x_{ij} - \bar{x}_j)^2}{n_j},$$

then for the entire population the residual variation is reflected by the average of the group variances. This variance is called the average of the intragroup variances ($\overline{\sigma^2}$) and is calculated by the formula

$$\overline{\sigma^2} = \frac{\sum_j \sigma_j^2 n_j}{\sum_j n_j}.$$

The three variances are related: the total variance equals the sum of the average of the intragroup variances and the intergroup variance,

$$\sigma^2 = \overline{\sigma^2} + \delta^2.$$

This equality, which has a strictly mathematical proof, is known as the rule for adding variances.

The rule for adding variances allows you to find the total variance by its components, when the individual values of the attribute are unknown, and only group indicators are available.
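The rule can be verified on a small grouped data set. The two groups below are hypothetical; the sketch computes the total variance directly and then rebuilds it from the group sizes, group means and group variances alone.

```python
import statistics

# Hypothetical grouping of seven observations into two groups.
groups = {"A": [2, 4, 6], "B": [8, 10, 12, 14]}

all_values = [x for values in groups.values() for x in values]
n_total = len(all_values)
overall_mean = statistics.fmean(all_values)

total_variance = statistics.pvariance(all_values)

# Average of the within-group variances, weighted by group size.
within = sum(statistics.pvariance(v) * len(v) for v in groups.values()) / n_total

# Between-group variance of the group means around the overall mean.
between = sum(
    (statistics.fmean(v) - overall_mean) ** 2 * len(v) for v in groups.values()
) / n_total

print(f"total = {total_variance:.2f}")
print(f"within + between = {within:.2f} + {between:.2f} = {within + between:.2f}")
```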

Determination coefficient. The variance addition rule makes it possible to quantify the dependence of the results on a given factor using the determination coefficient

$$\eta^2 = \frac{\delta^2}{\sigma^2}.$$

It characterizes the influence of the attribute underlying the grouping on the variation of the resulting attribute. The empirical correlation ratio $\eta = \sqrt{\eta^2}$ varies from 0 to 1. If $\eta = 0$, the grouping attribute does not affect the resulting one. If $\eta = 1$, the resulting attribute changes only depending on the attribute underlying the grouping, and the influence of other factor attributes is zero.
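The two limiting cases and an intermediate one can be illustrated with a short sketch. The function name and the three data sets are invented for the example; the coefficient is computed as the between-group variance divided by the total variance.

```python
import math
import statistics


def determination_coefficient(groups):
    """Share of total variance explained by the grouping: between / total."""
    all_values = [x for values in groups.values() for x in values]
    overall_mean = statistics.fmean(all_values)
    total = statistics.pvariance(all_values)
    between = sum(
        (statistics.fmean(v) - overall_mean) ** 2 * len(v) for v in groups.values()
    ) / len(all_values)
    return between / total


# Hypothetical data: the grouping explains part, none, or all of the variation.
partly = {"A": [2, 4, 6], "B": [8, 10, 12, 14]}
not_at_all = {"A": [1, 5, 9], "B": [3, 5, 7]}
fully = {"A": [3, 3, 3], "B": [9, 9, 9]}

for label, data in (("partly", partly), ("not at all", not_at_all), ("fully", fully)):
    eta_squared = determination_coefficient(data)
    print(f"{label}: eta^2 = {eta_squared:.2f}, eta = {math.sqrt(eta_squared):.2f}")
```

When the group means coincide, the grouping explains nothing; when all values within each group are identical, it explains everything.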

Indicators of asymmetry and kurtosis. In the field of economic phenomena, strictly symmetrical series are extremely rare; more often one has to deal with asymmetric series.

In statistics, several indicators are used to characterize asymmetry. If we take into account that in a symmetrical series the arithmetic mean coincides in value with the mode and median, then the simplest indicator of asymmetry ($As$) will be the difference between the arithmetic mean and the mode, i.e.

$$As = \bar{x} - Mo$$

The kurtosis value is calculated by the formula

$$Ex = \frac{\mu_4}{\sigma^4} - 3$$

where $\mu_4$ is the fourth central moment.

If $Ex > 0$, the kurtosis is considered positive (the distribution is peaked); if $Ex < 0$, the kurtosis is considered negative (the distribution is flat-topped).
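Both indicators can be computed directly from a data series. The series below is hypothetical, and the kurtosis is assumed to be the usual moment-based excess kurtosis (fourth central moment divided by the squared variance, minus 3), which is zero for a normal distribution.

```python
import statistics

data = [2, 3, 3, 3, 4, 5, 8]   # hypothetical right-skewed series
n = len(data)

mean = statistics.fmean(data)
mode = statistics.mode(data)
asymmetry = mean - mode                      # simplest indicator: mean minus mode

variance = sum((x - mean) ** 2 for x in data) / n
fourth_moment = sum((x - mean) ** 4 for x in data) / n
excess_kurtosis = fourth_moment / variance ** 2 - 3

print(f"mean = {mean}, mode = {mode}, asymmetry = {asymmetry}")
print(f"excess kurtosis = {excess_kurtosis:.4f}")
```

Here the mean lies to the right of the mode, so the series is right-skewed, and the positive kurtosis indicates a distribution more peaked than the normal one.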