Confidence level. Confidence intervals for frequencies and proportions

Appraisers often have to analyze the real estate market segment in which the appraised object is located. If the market is developed, analyzing the entire set of listed objects can be difficult, so a sample of objects is used instead. This sample is not always homogeneous, and sometimes it has to be cleared of extremes: market offers that are too high or too low. A confidence interval is applied for this purpose. The purpose of this study is to compare two methods of calculating the confidence interval and to choose the better one for working with different samples in the estimatica.pro system.

Confidence interval: an interval of feature values calculated from a sample which, with a known probability, contains the estimated parameter of the general population.

The point of calculating a confidence interval is to build, from the sample data, an interval for which it can be asserted with a given probability that the value of the estimated parameter lies inside it. In other words, the confidence interval contains the unknown value of the estimated quantity with a certain probability. The wider the interval, the less precise the estimate.

There are different methods for determining the confidence interval. In this article, we consider two:

  • through the median and standard deviation;
  • through the critical value of the t-statistic (Student's coefficient).

Stages of the comparative analysis of the two ways of calculating the CI:

1. form a data sample;

2. process it with statistical methods: calculate the mean, median, variance, etc.;

3. calculate the confidence interval in two ways;

4. analyze the cleaned samples and the resulting confidence intervals.

Stage 1. Data sampling

The sample was formed using the estimatica.pro system. It included 91 offers for the sale of one-room apartments in the 3rd price zone with the "Khrushchev" layout type.

Table 1. Initial sample

The price of 1 sq.m., c.u.

Fig.1. Initial sample



Stage 2. Processing of the initial sample

Processing the sample by statistical methods requires calculating the following values:

1. Arithmetic mean

2. Median: a number that characterizes the sample; exactly half of the sample elements are greater than the median and the other half are less than it (for a sample with an odd number of values, it is the middle element of the sorted sample)

3. Range: the difference between the maximum and minimum values in the sample

4. Variance: used to estimate the variation in the data more accurately

5. Standard deviation of the sample (hereinafter SD): the most common indicator of the dispersion of values around the arithmetic mean

6. Coefficient of variation: reflects the degree of dispersion of the values

7. Coefficient of oscillation: reflects the relative fluctuation of the extreme price values in the sample around the mean
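The formulas for these indicators did not survive in the text, so the following is only a minimal sketch using the common textbook definitions: the coefficient of variation is taken as SD divided by the mean, and the coefficient of oscillation as the range divided by the mean (both assumptions). The function name `describe` and the five prices are hypothetical and are not the article's sample.

```python
import statistics

def describe(prices):
    """Return the Stage 2 indicators for a list of prices per sq. m."""
    mean = statistics.mean(prices)
    value_range = max(prices) - min(prices)
    sd = statistics.stdev(prices)  # sample standard deviation (divisor n - 1)
    return {
        "mean": mean,
        "median": statistics.median(prices),
        "range": value_range,
        "variance": statistics.variance(prices),  # sample variance (divisor n - 1)
        "sd": sd,
        # Assumed definition: SD as a percentage of the mean
        "cv_percent": 100 * sd / mean,
        # Assumed definition: range as a percentage of the mean
        "oscillation_percent": 100 * value_range / mean,
    }

# Hypothetical prices, c.u. per sq. m.
sample = [48000, 52000, 53500, 55000, 61000]
stats = describe(sample)
print(stats)
```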

Table 2. Statistical indicators of the original sample

The coefficient of variation, which characterizes the homogeneity of the data, is 12.29%, but the coefficient of oscillation is too large. We can therefore state that the original sample is not homogeneous, so we move on to calculating the confidence interval.

Stage 3. Calculation of the confidence interval

Method 1. Calculation through the median and standard deviation.

The confidence interval is determined as follows: the lower bound is the median minus the standard deviation; the upper bound is the median plus the standard deviation.

Thus, the confidence interval is (47179 c.u.; 60689 c.u.).
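Method 1 can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `median_sd_interval` and `clean` and the seven prices are hypothetical, and the sample standard deviation (divisor n - 1) is assumed.

```python
import statistics

def median_sd_interval(prices):
    """Method 1: (median - SD, median + SD), SD being the sample standard deviation."""
    median = statistics.median(prices)
    sd = statistics.stdev(prices)
    return median - sd, median + sd

def clean(prices, low, high):
    """Keep only the offers whose price lies inside [low, high]."""
    return [p for p in prices if low <= p <= high]

# Hypothetical prices, c.u. per sq. m.
prices = [41000, 50000, 52000, 54000, 55000, 56000, 70000]
low, high = median_sd_interval(prices)
kept = clean(prices, low, high)
print(round(low), round(high), kept)
```

In this made-up sample the two extreme offers (41000 and 70000) fall outside the interval and are dropped.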

Fig. 2. Values within confidence interval 1.



Method 2. Building a confidence interval through the critical value of the t-statistic (Student's coefficient)

S.V. Gribovsky, in the book "Mathematical methods for assessing the value of property", describes how to calculate the confidence interval through Student's coefficient. With this method, the appraiser must set the significance level α, which determines the probability with which the confidence interval is built. Significance levels of 0.1, 0.05 and 0.01 are commonly used; they correspond to confidence probabilities of 0.9, 0.95 and 0.99. With this method, the true values of the mathematical expectation and variance are assumed to be unknown (which is almost always the case in practical appraisal problems).

Confidence interval formula:

n is the sample size;

t is the critical value of the t-statistic (Student's distribution) for significance level α and n-1 degrees of freedom, which is found from statistical tables or in MS Excel (→ "Statistical" → STUDRASPOBR, i.e. TINV in the English version);

α is the significance level; we take α = 0.01.
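The formula itself is missing from the text. The sketch below assumes the standard form of a t-based interval for the mean, mean ± t·s/√n, which matches the symbols listed above but may differ in detail from the formula in Gribovsky's book. The function name `t_interval` and the prices are hypothetical, and the critical value is passed in by hand, as if read from a statistical table.

```python
import math
import statistics

def t_interval(prices, t_crit):
    """Method 2 (assumed form): mean -/+ t_crit * s / sqrt(n)."""
    n = len(prices)
    mean = statistics.mean(prices)
    s = statistics.stdev(prices)  # sample standard deviation (divisor n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical sample of n = 7 prices; 3.707 is the tabulated two-sided
# critical value for alpha = 0.01 and n - 1 = 6 degrees of freedom.
prices = [41000, 50000, 52000, 54000, 55000, 56000, 70000]
low, high = t_interval(prices, t_crit=3.707)
print(round(low), round(high))
```

Because the half-width shrinks with √n, this interval becomes narrow for large samples, so more offers fall outside it than with Method 1.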

Fig. 3. Values within confidence interval 2.

Stage 4. Analysis of the different ways of calculating the confidence interval

The two ways of calculating the confidence interval, through the median and through Student's coefficient, produced different intervals. Accordingly, two different cleaned samples were obtained.

Table 3. Statistical indicators for three samples.

Indicator | Initial sample | Option 1 | Option 2

Mean

Variance

Coefficient of variation

Coefficient of oscillation

Number of removed objects, pcs.

Based on the calculations performed, we can say that the confidence intervals obtained by the two methods overlap, so either method can be used at the appraiser's discretion.

However, we believe that when working in the estimatica.pro system it is advisable to choose the method of calculating the confidence interval according to the degree of market development:

  • if the market is not developed, apply the calculation through the median and standard deviation, since the number of removed objects is small in this case;
  • if the market is developed, apply the calculation through the critical value of the t-statistic (Student's coefficient), since a large initial sample can be formed.

The following were used in preparing the article:

1. Gribovsky S.V., Sivets S.A., Levykina I.A. Mathematical methods for assessing the value of property. Moscow, 2014

2. Data from the estimatica.pro system

And others. All of them are estimates of their theoretical counterparts, which could be obtained if we had the general population rather than a sample. But alas, surveying the general population is very expensive and often impossible.

The concept of interval estimation

Any sample estimate has some scatter, because it is a random variable that depends on the values in a particular sample. Therefore, for more reliable statistical inference one should know not only the point estimate but also an interval that covers the estimated parameter θ (theta) with a high probability γ (gamma).

Formally, these are two values (statistics) $T_1(X)$ and $T_2(X)$ with $T_1 < T_2$ for which, at a given probability level γ, the following condition is met:

$$P\big(T_1(X) \le \theta \le T_2(X)\big) \ge \gamma$$

In short, with probability γ or more the true value lies between the points $T_1(X)$ and $T_2(X)$, which are called the lower and upper bounds of the confidence interval.

One of the requirements when constructing a confidence interval is that it be as narrow as possible, i.e. as short as possible. The desire is quite natural, because the researcher tries to localize the desired parameter as precisely as possible.

It follows that the confidence interval should cover the region where the distribution has the highest probability density, with the estimate itself at the center.

That is, the probability that the true value deviates from the estimate upwards is equal to the probability that it deviates downwards. It should also be noted that for skewed distributions the part of the interval to the right of the estimate is not equal to the part to the left.

The figure above clearly shows that the greater the confidence level, the wider the interval - a direct relationship.

This was a small introduction to the theory of interval estimation of unknown parameters. Let's move on to finding confidence limits for the mathematical expectation.

Confidence interval for mathematical expectation

If the original data are normally distributed, then the mean will also be a normal random variable. This follows from the rule that a linear combination of normal variables is also normally distributed. Therefore, to calculate the probabilities we could use the mathematical apparatus of the normal distribution law.

However, this requires knowing two parameters, the expected value and the variance, which are usually unknown. You can, of course, use estimates instead of the parameters (the arithmetic mean and the sample variance), but then the distribution of the mean will not be quite normal: it will be slightly flattened, with heavier tails. William Gosset, who worked in Ireland, astutely noted this fact when he published his discovery in the March 1908 issue of Biometrika. For secrecy, Gosset signed the paper "Student". This is how Student's t-distribution appeared.

However, normally distributed data, which C. F. Gauss used in analyzing errors of astronomical observations, are extremely rare in terrestrial life, and normality is quite difficult to establish (about 2,000 observations are needed for high accuracy). Therefore, it is best to drop the normality assumption and use methods that do not depend on the distribution of the original data.

The question arises: what is the distribution of the arithmetic mean if it is calculated from data with an unknown distribution? The answer is given by the central limit theorem (CLT), well known in probability theory. There are several versions of it in mathematics (the formulations have been refined over the years), but all of them, roughly speaking, boil down to the assertion that the sum of a large number of independent random variables follows the normal distribution law.

Calculating the arithmetic mean involves a sum of random variables. It follows that the arithmetic mean is approximately normally distributed, with an expected value equal to the expected value of the original data and a variance of $\sigma^2/n$, where $\sigma^2$ is the variance of the original data and n is the sample size.

Smart people know how to prove the CLT, but we will verify it with an experiment in Excel. Let's simulate a sample of 50 uniformly distributed random variables (using the Excel function RANDBETWEEN). Then we will generate 1000 such samples and calculate the arithmetic mean of each. Let's look at their distribution.

It can be seen that the distribution of the mean is close to the normal law. If the sample size and the number of samples are made even larger, the resemblance will be even closer.
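The same experiment can be repeated outside Excel. The sketch below is a rough Python analogue under stated assumptions: integers are drawn uniformly from 1 to 100 (the range used in the original spreadsheet is not given), and the seed is fixed only to make the run reproducible.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

SAMPLE_SIZE = 50
N_SAMPLES = 1000

# Each sample: 50 integers drawn uniformly from 1..100 (the range is an assumption)
means = [
    statistics.mean(random.randint(1, 100) for _ in range(SAMPLE_SIZE))
    for _ in range(N_SAMPLES)
]

grand_mean = statistics.mean(means)
sd_of_means = statistics.stdev(means)

# Theory for a discrete uniform law on 1..100: mean 50.5, variance (100**2 - 1) / 12,
# so the standard deviation of the sample mean is sqrt(variance / SAMPLE_SIZE)
expected_sd = ((100**2 - 1) / 12 / SAMPLE_SIZE) ** 0.5

# Share of sample means within 1.96 standard errors of 50.5 (about 0.95 under normality)
share = sum(abs(m - 50.5) <= 1.96 * expected_sd for m in means) / N_SAMPLES
print(grand_mean, sd_of_means, expected_sd, share)
```

If the CLT holds, the spread of the simulated means should be close to `expected_sd` and roughly 95% of them should fall within 1.96 standard errors of the true mean.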

Now that we have seen the validity of the CLT for ourselves, we can use the normal distribution to calculate confidence intervals for the arithmetic mean that cover the true mean, or expected value, with a given probability.

To set the upper and lower bounds, you need to know the parameters of the normal distribution. As a rule, they are unknown, so estimates are used: the arithmetic mean and the sample variance. Again, this method gives a good approximation only for large samples. When samples are small, Student's distribution is often recommended. Do not believe it! Student's distribution for the mean arises only when the original data are normally distributed, that is, almost never. Therefore, it is better to set a minimum bar for the amount of required data right away and use asymptotically correct methods. They say 30 observations are enough. Take 50 and you can't go wrong.

$$T_{1,2} = \bar{x} \mp c_\gamma \frac{s_0}{\sqrt{n}}$$

$T_{1,2}$ – lower and upper bounds of the confidence interval

$\bar{x}$ – sample arithmetic mean

$s_0$ – sample standard deviation (unbiased)

$n$ – sample size

$\gamma$ – confidence level (usually equal to 0.9, 0.95 or 0.99)

$c_\gamma = \Phi^{-1}((1+\gamma)/2)$ – the value of the inverse of the standard normal distribution function. In simple terms, this is the number of standard errors from the arithmetic mean to the lower or upper bound (the three probabilities above correspond to the values 1.64, 1.96 and 2.58).

The essence of the formula is that the arithmetic mean is taken and a certain number ($c_\gamma$) of standard errors ($s_0/\sqrt{n}$) is laid off on either side of it. Everything is known, so just take it and calculate.
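As a minimal sketch of this calculation, the quantile can be obtained from Python's standard library instead of a table. The function names `c_gamma` and `mean_interval` are hypothetical; `statistics.NormalDist().inv_cdf` is the standard-library inverse of the normal distribution function.

```python
import math
from statistics import NormalDist

def c_gamma(gamma):
    """Quantile c_gamma = Phi^{-1}((1 + gamma) / 2) of the standard normal law."""
    return NormalDist().inv_cdf((1 + gamma) / 2)

def mean_interval(mean, s0, n, gamma=0.95):
    """Asymptotic interval for the mean: mean -/+ c_gamma * s0 / sqrt(n)."""
    half_width = c_gamma(gamma) * s0 / math.sqrt(n)
    return mean - half_width, mean + half_width

# Quantiles for the three usual confidence levels, rounded to two decimals
quantiles = [round(c_gamma(g), 2) for g in (0.9, 0.95, 0.99)]
print(quantiles)  # [1.64, 1.96, 2.58]

# Hypothetical summary statistics: mean 10, s0 = 2, n = 100
low, high = mean_interval(10, 2, 100)
print(low, high)
```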

Before the mass use of PCs, the values of the normal distribution function and its inverse were obtained from statistical tables. They are still used, but it is more efficient to turn to ready-made Excel formulas. All elements of the formula above can easily be calculated in Excel. But there is also a ready-made function for calculating the confidence interval: CONFIDENCE.NORM. Its syntax is as follows.

CONFIDENCE.NORM(alpha, standard_dev, size)

alpha – the significance level, which in the notation above equals 1-γ, i.e. the probability that the mathematical expectation lies outside the confidence interval. With a confidence level of 0.95, alpha is 0.05, and so on.

standard_dev – the standard deviation of the sample data. You do not need to calculate the standard error; Excel divides by the square root of n itself.

size – the sample size (n).

The result of the CONFIDENCE.NORM function is the second term of the formula for the confidence interval, i.e. the half-width of the interval. Accordingly, the lower and upper bounds are the mean ± the obtained value.
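For readers without Excel, the behaviour described above can be approximated as follows. This is a sketch that mirrors the description of the function, not Excel's own implementation; the Python name `confidence_norm` is hypothetical.

```python
import math
from statistics import NormalDist

def confidence_norm(alpha, standard_dev, size):
    """Half-width of the interval: z(1 - alpha/2) * standard_dev / sqrt(size)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / math.sqrt(size)

# Hypothetical inputs: alpha = 0.05, standard deviation 2.5, 50 observations
half = confidence_norm(0.05, 2.5, 50)
print(round(half, 6))  # 0.692952
```

With a sample mean of, say, 30, the interval would then be 30 ± 0.692952.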

Thus, it is possible to build a universal algorithm for calculating confidence intervals for the arithmetic mean that does not depend on the distribution of the original data. The price of universality is its asymptotic nature, i.e. the need to use relatively large samples. However, in the age of modern technology, collecting the required amount of data is usually not difficult.

Testing Statistical Hypotheses Using a Confidence Interval


One of the main problems solved in statistics is the testing of hypotheses. In a nutshell, its essence is this. An assumption is made, for example, that the expectation of the general population is equal to some value. Then the distribution of sample means that could be observed under that expectation is constructed. Next, we look at where in this conditional distribution the actual sample mean falls. If it lies outside the allowable limits, then the appearance of such a mean is very unlikely, and in a single run of the experiment almost impossible, which contradicts the hypothesis, and the hypothesis is rejected. If the mean does not go beyond the critical level, the hypothesis is not rejected (but it is not proved either!).

So confidence intervals, in our case for the expectation, can also be used to test some hypotheses. It is very easy to do. Suppose the arithmetic mean of some sample is 100, and the hypothesis being tested is that the expected value is, say, 90. Put simply, the question is: can it be that, with a true mean of 90, the observed mean turned out to be 100?

To answer this question, additional information is needed: the standard deviation and the sample size. Let's say the standard deviation is 30 and the number of observations is 64 (so that the root is easy to extract). Then the standard error of the mean is 30/8, or 3.75. To calculate the 95% confidence interval, you need to lay off two standard errors on either side of the mean (more precisely, 1.96). The confidence interval will be approximately 100 ± 7.5, or from 92.5 to 107.5.

Further reasoning is as follows. If the tested value falls within the confidence interval, then it does not contradict the observed data, since it fits within the limits of random fluctuations (with a probability of 95%). If the tested value is outside the confidence interval, then the probability of such an event is very small, in any case below the acceptable level. Hence, the hypothesis is rejected as contradicting the observed data. In our case, the hypothesized expectation is outside the confidence interval (the tested value of 90 is not included in the interval 100 ± 7.5), so the hypothesis should be rejected. Answering the simple question above, one should say: no, it cannot, or at least this happens extremely rarely. Often the specific probability of erroneously rejecting the hypothesis (the p-value) is reported rather than the preset level at which the confidence interval was built, but more on that another time.
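The worked example can be checked with a short sketch. The function name `ci_rejects` is hypothetical, and the code uses the more precise multiplier 1.96 rather than 2, so the half-width comes out as 7.35 instead of the rounded 7.5.

```python
import math

def ci_rejects(hypothesized, mean, sd, n, z=1.96):
    """Return (low, high, rejected) for a two-sided check at the level implied by z."""
    se = sd / math.sqrt(n)  # standard error of the mean
    low, high = mean - z * se, mean + z * se
    return low, high, not (low <= hypothesized <= high)

low, high, rejected = ci_rejects(hypothesized=90, mean=100, sd=30, n=64)
print(round(low, 2), round(high, 2), rejected)  # 92.65 107.35 True
```

The hypothesized value 90 lies below the lower bound, so the conclusion is the same as with the rounded interval.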

As you can see, it is not difficult to build a confidence interval for the mean (or mathematical expectation). The main thing is to grasp the essence, and the rest will follow. In practice, most people use the 95% confidence interval, which extends about two standard errors on either side of the mean.

That's all for now. All the best!

The mind is not only in knowledge, but also in the ability to apply knowledge in practice. (Aristotle)

Confidence intervals

general review

Taking a sample from the population, we obtain a point estimate of the parameter of interest and calculate the standard error to indicate the precision of the estimate.

However, in most cases the standard error by itself is not particularly useful. It is much more helpful to incorporate this measure of precision into an interval estimate for the population parameter.

This can be done by using knowledge of the theoretical probability distribution of the sample statistic to calculate a confidence interval (CI) for the parameter.

In general, the confidence interval extends the estimate in both directions by some multiple of the standard error (of the given parameter); the two values (confidence limits) that define the interval are usually separated by a comma and enclosed in parentheses.

Confidence interval for mean

Using the normal distribution

The sample mean has a normal distribution if the sample size is large, so knowledge of the normal distribution can be applied when considering the sample mean.

In particular, 95% of the distribution of the sample means is within 1.96 standard deviations (SD) of the population mean.

When we have only one sample, we call this quantity the standard error of the mean (SEM) and calculate the 95% confidence interval for the mean as follows:

$$\left(\bar{x} - 1.96 \cdot SEM,\ \bar{x} + 1.96 \cdot SEM\right)$$

If the experiment is repeated many times, the interval will contain the true population mean 95% of the time.

This is usually interpreted as the range of values within which the true population mean (general mean) lies with 95% confidence.

Although it is not quite rigorous to interpret the confidence interval in this way (the population mean is a fixed value and therefore cannot have a probability attached to it), it is conceptually easier to understand.
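The repeated-experiment reading of the interval can be illustrated by simulation. In this sketch the population parameters, the sample size and the number of repetitions are all hypothetical, and the seed is fixed only for reproducibility.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_MEAN, TRUE_SD = 50.0, 10.0  # hypothetical population parameters
N, REPEATS = 100, 2000

covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(N)  # standard error of the mean
    # Does this sample's 95% interval contain the fixed true mean?
    if mean - 1.96 * sem <= TRUE_MEAN <= mean + 1.96 * sem:
        covered += 1

coverage = covered / REPEATS
print(coverage)
```

The share of intervals that contain the true mean should come out close to 0.95. The true mean never changes; what varies from run to run is the interval.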

Using the t-distribution

Strictly speaking, you can use the normal distribution only if you know the value of the variance in the population. Also, when the sample size is small, the sample mean follows a normal distribution only if the underlying population data are normally distributed.

If the population variance is unknown and is estimated from the sample, and/or the underlying data are not normally distributed, the sample mean is treated as following Student's t-distribution.

Calculate the 95% confidence interval for the population mean as follows:

$$\bar{x} \pm t_{0.05} \cdot \frac{s}{\sqrt{n}}$$

where $t_{0.05}$ is the percentage point (percentile) of Student's t-distribution with (n-1) degrees of freedom that gives a two-tailed probability of 0.05.

In general, it provides a wider interval than when using a normal distribution, because it takes into account the additional uncertainty that is introduced by estimating the population standard deviation and/or due to the small sample size.

When the sample size is large (of the order of 100 or more), the difference between the two distributions (Student's t and normal) is negligible. However, the t-distribution is always used when calculating confidence intervals, even if the sample size is large.

Usually the 95% CI is reported. Other confidence intervals can be calculated, such as the 99% CI for the mean.

Instead of multiplying the standard error by the tabulated value of the t-distribution that corresponds to a two-tailed probability of 0.05, multiply it by the value that corresponds to a two-tailed probability of 0.01. This gives a wider confidence interval than in the 95% case, because it reflects increased confidence that the interval does indeed include the population mean.

Confidence interval for proportion

The sampling distribution of a proportion is binomial. However, if the sample size n is reasonably large, the sampling distribution of the proportion is approximately normal with mean π.

We estimate π by the sample proportion p = r/n (where r is the number of individuals in the sample with the characteristic of interest), and the standard error is estimated as:

$$SE(p) = \sqrt{\frac{p(1-p)}{n}}$$

The 95% confidence interval for the proportion is estimated as:

$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$

If the sample size is small (usually when np or n(1-p) is less than 5), the binomial distribution must be used to calculate exact confidence intervals.

Note that if p is expressed as a percentage, then (1-p) is replaced by (100-p).
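A minimal sketch of the normal-approximation interval for a proportion, including the small-sample check mentioned above. The function name `proportion_interval` and the counts are hypothetical; the exact binomial interval for small samples is not implemented here.

```python
import math

def proportion_interval(r, n, z=1.96):
    """Normal-approximation interval for a proportion p = r / n."""
    p = r / n
    # Rule of thumb from the text: the approximation needs np and n(1-p) of at least 5
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("sample too small for the normal approximation")
    se = math.sqrt(p * (1 - p) / n)  # estimated standard error of p
    return p - z * se, p + z * se

# Hypothetical counts: 70 of 100 individuals have the characteristic
low, high = proportion_interval(r=70, n=100)
print(round(low, 3), round(high, 3))  # 0.61 0.79
```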

Interpretation of confidence intervals

When interpreting the confidence interval, we are interested in the following questions:

How wide is the confidence interval?

A wide confidence interval indicates that the estimate is imprecise; a narrow one indicates a precise estimate.

The width of the confidence interval depends on the size of the standard error, which in turn depends on the sample size and, for a numeric variable, on the variability of the data. Therefore, small studies of highly variable data give wider confidence intervals than large studies of less variable data.

Does the CI include any values of particular interest?

You can check whether a hypothesized value of a population parameter falls within the confidence interval. If it does, the results are consistent with this value. If not, it is unlikely (for a 95% confidence interval, the chance is at most 5%) that the parameter has this value.

Confidence intervals are one of the types of interval estimates used in statistics; they are calculated for a given level of significance. They allow us to state that the true value of an unknown statistical parameter of the general population lies in the obtained range of values with a probability determined by the chosen level of statistical significance.

Normal distribution

When the variance (σ²) of the population is known, a z-score can be used to calculate the confidence limits (the boundary points of the confidence interval). Compared to using a t-distribution, using a z-score gives a narrower confidence interval, because the population standard deviation (σ) is known rather than estimated and the z-score is based on the normal distribution.

Formula

To determine the boundary points of the confidence interval, provided that the standard deviation of the population of data is known, the following formula is used

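$$L = \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \qquad U = \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

where L and U are the lower and upper limits of the interval.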

Example

Assume that the sample size is 25 observations, the sample mean is 15, and the population standard deviation is 8. For a significance level of α=5%, the Z-score is Z α/2 =1.96. In this case, the lower and upper limits of the confidence interval will be

$$L = 15 - 1.96\frac{8}{\sqrt{25}} = 11.864$$
$$U = 15 + 1.96\frac{8}{\sqrt{25}} = 18.136$$

Thus, we can state that with a probability of 95% the mathematical expectation of the general population will fall in the range from 11.864 to 18.136.

Methods for narrowing the confidence interval

Let's say the interval is too wide for the purposes of our study. There are two ways to narrow the confidence interval.

  1. Lower the confidence level (that is, raise the significance level α).
  2. Increase the sample size.

Raising the significance level to α=10% (a 90% confidence level), we get a Z-score of $Z_{\alpha/2} = 1.64$. In this case, the lower and upper limits of the interval will be

$$L = 15 - 1.64\frac{8}{\sqrt{25}} = 12.376$$
$$U = 15 + 1.64\frac{8}{\sqrt{25}} = 17.624$$

And the confidence interval itself can be written as (12.376; 17.624).

In this case, we can assume that with a probability of 90% the mathematical expectation of the general population falls within this range.

If we want to keep the significance level α unchanged, the only alternative is to increase the sample size. Increasing it to 144 observations, we obtain the following confidence limits

$$L = 15 - 1.96\frac{8}{\sqrt{144}} = 13.693$$
$$U = 15 + 1.96\frac{8}{\sqrt{144}} = 16.307$$

The confidence interval itself will look like this: (13.693; 16.307).

Thus, narrowing the confidence interval without lowering the confidence level is possible only by increasing the sample size. If the sample size cannot be increased, the confidence interval can be narrowed only by lowering the confidence level.
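The three variants of the z-based example can be reproduced with one small function. The name `z_interval` is hypothetical, and the Z-scores are passed in rounded to two decimals, as in the text, so that the printed limits match the worked numbers.

```python
import math

def z_interval(mean, sigma, n, z):
    """Limits mean -/+ z * sigma / sqrt(n), rounded to three decimals."""
    half_width = z * sigma / math.sqrt(n)
    return round(mean - half_width, 3), round(mean + half_width, 3)

base = z_interval(15, 8, 25, z=1.96)        # 95% level, n = 25
lower_conf = z_interval(15, 8, 25, z=1.64)  # 90% level, n = 25
bigger_n = z_interval(15, 8, 144, z=1.96)   # 95% level, n = 144

print(base)        # (11.864, 18.136)
print(lower_conf)  # (12.376, 17.624)
print(bigger_n)    # (13.693, 16.307)
```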

Building a confidence interval for a non-normal distribution

If the standard deviation of the population is not known or the distribution is non-normal, the t-distribution is used to construct a confidence interval. This technique is more conservative, which is expressed in wider confidence intervals, compared to the technique based on the Z-score.

Formula

The following formulas are used to calculate the lower and upper limits of the confidence interval based on the t-distribution

$$L = \bar{X} - t_{\alpha}\frac{s}{\sqrt{n}}, \qquad U = \bar{X} + t_{\alpha}\frac{s}{\sqrt{n}}$$

where s is the sample standard deviation. Student's distribution, or t-distribution, depends on only one parameter, the number of degrees of freedom, which is equal to the number of observations in the sample minus one (n-1). The value of Student's t for a given number of degrees of freedom and level of statistical significance α can be found in lookup tables.

Example

Assume that the sample size is 25 individual values, the mean of the sample is 50, and the standard deviation of the sample is 28. You need to construct a confidence interval for the level of statistical significance α=5%.

In our case, the number of degrees of freedom is 24 (25-1), therefore, the corresponding tabular value of Student's t-test for the level of statistical significance α=5% is 2.064. Therefore, the lower and upper bounds of the confidence interval will be

$$L = 50 - 2.064\frac{28}{\sqrt{25}} = 38.442$$
$$U = 50 + 2.064\frac{28}{\sqrt{25}} = 61.558$$

And the interval itself can be written as (38.442; 61.558).

Thus, we can state that with a probability of 95% the mathematical expectation of the general population lies within this range.

With the t-distribution, too, the confidence interval can be narrowed either by lowering the confidence level or by increasing the sample size.

Lowering the confidence level from 95% to 90% in the conditions of our example, we get the corresponding tabular value of Student's t of 1.711.

$$L = 50 - 1.711\frac{28}{\sqrt{25}} = 40.418$$
$$U = 50 + 1.711\frac{28}{\sqrt{25}} = 59.582$$

In this case, we can say that with a probability of 90% the mathematical expectation of the general population lies in the range (40.418; 59.582).

If we do not want to lower the confidence level, the only alternative is to increase the sample size. Let's say the sample consists of 64 individual observations rather than 25 as in the initial condition of the example. The tabular value of Student's t for 63 degrees of freedom (64-1) and a significance level of α=5% is 1.998.

$$L = 50 - 1.998\frac{28}{\sqrt{64}} = 43.007$$
$$U = 50 + 1.998\frac{28}{\sqrt{64}} = 56.993$$

This allows us to assert that with a probability of 95% the mathematical expectation of the general population lies in the range (43.007; 56.993).
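The t-based examples can be checked the same way. In this sketch the critical values are the tabular ones quoted in the text and are passed in by hand; the function name `t_interval_from_summary` is hypothetical.

```python
import math

def t_interval_from_summary(mean, s, n, t_crit):
    """Limits mean -/+ t_crit * s / sqrt(n), rounded to three decimals."""
    half_width = t_crit * s / math.sqrt(n)
    return round(mean - half_width, 3), round(mean + half_width, 3)

ci_95 = t_interval_from_summary(50, 28, 25, t_crit=2.064)      # 24 degrees of freedom
ci_90 = t_interval_from_summary(50, 28, 25, t_crit=1.711)      # 24 degrees of freedom
ci_95_n64 = t_interval_from_summary(50, 28, 64, t_crit=1.998)  # 63 degrees of freedom

print(ci_95)      # (38.442, 61.558)
print(ci_90)      # (40.418, 59.582)
print(ci_95_n64)  # (43.007, 56.993)
```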

Large Samples

Large samples are samples with more than 100 individual observations. Statistical studies have shown that the means of large samples tend to be normally distributed, even if the distribution of the population is not normal. In addition, for such samples the z-score and the t-distribution give approximately the same results when constructing confidence intervals. Thus, for large samples it is acceptable to use a z-score for the normal distribution instead of the t-distribution.

Summing up

"Katren-Style" continues the publication of Konstantin Kravchik's series on medical statistics. In the two previous articles, the author explained other concepts of medical statistics.

Konstantin Kravchik

Mathematician-analyst. Specialist in the field of statistical research in medicine and humanities

Moscow

Very often in articles on clinical trials you can find a mysterious phrase: "confidence interval" (95% CI). For example, an article might say: "Student's t-test was used to assess the significance of differences, with a 95% confidence interval calculated."

What does the "95% confidence interval" mean, and why calculate it?

What is a confidence interval? It is the range in which the true mean value in the population falls. So are there "untrue" means? In a sense, yes. In a previous article we explained that it is impossible to measure the parameter of interest in the entire population, so researchers make do with a limited sample. In this sample (for example, for body weight) there is one mean value (a certain weight), from which we judge the mean value in the entire general population. However, it is unlikely that the mean weight in the sample (especially a small one) will coincide with the mean weight in the general population. Therefore, it is more correct to calculate and use a range of plausible values for the mean of the general population.

For example, suppose the 95% confidence interval (95% CI) for hemoglobin is between 110 and 122 g/L. This means that with a 95% probability the true mean hemoglobin in the general population lies in the range from 110 to 122 g/L. In other words, we do not know the mean hemoglobin in the general population, but we can indicate a range of values for it with 95% probability.

Confidence intervals are particularly relevant to the difference in means between groups, or what is called the effect size.

Suppose we compared the effectiveness of two iron preparations: one that has been on the market for a long time and one that has just been registered. After the course of therapy, the concentration of hemoglobin in the studied groups of patients was assessed, and the statistical program calculated for us that the difference between the average values ​​of the two groups with a probability of 95% is in the range from 1.72 to 14.36 g/l (Table 1).

Tab. 1. Criterion for independent samples
(groups are compared by hemoglobin level)

This should be interpreted as follows: in the part of the general population that takes the new drug, hemoglobin will be higher on average by 1.72-14.36 g/l than in those who take the already known drug.

In other words, in the general population the difference in mean hemoglobin between the groups lies within these limits with a 95% probability. It is up to the researcher to judge whether this is a lot or a little. The point of all this is that we are working not with a single mean value but with a range of values, and therefore estimate the difference in the parameter between groups more reliably.

In statistical packages, the researcher can narrow or widen the boundaries of the confidence interval at their discretion. By lowering the confidence probability, we narrow the range of means. For example, at 90% CI the range of means (or mean differences) will be narrower than at 95% CI.

Conversely, increasing the probability to 99% widens the range of values. When comparing groups, the lower limit of the CI may cross the zero mark. For example, if we widened the confidence interval to 99%, its boundaries might range from -1 to 16 g/l. This means that a zero difference between the group means for the studied trait (M=0) is among the plausible values.

Confidence intervals can be used to test statistical hypotheses. If the confidence interval includes zero, then the null hypothesis, which assumes that the groups do not differ in the studied parameter, cannot be rejected. An example is described above, when we widened the boundaries to 99%: at that confidence level the data are compatible with groups that do not differ at all.
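How widening the interval can bring zero inside it may be sketched as follows. The rescaling assumes a normal approximation with multipliers 1.96 and 2.576; a statistical package would more likely use Student's t quantiles, so the result only roughly matches the illustrative "-1 to 16" above. The function names are hypothetical.

```python
def rescale_interval(low, high, z_from=1.96, z_to=2.576):
    """Re-express an interval at another confidence level (normal approximation)."""
    center = (low + high) / 2
    se = (high - low) / 2 / z_from  # standard error implied by the original interval
    return center - z_to * se, center + z_to * se

def contains_zero(low, high):
    """True if a zero difference between the means lies inside the interval."""
    return low <= 0 <= high

ci95 = (1.72, 14.36)            # 95% CI for the difference, from the text
ci99 = rescale_interval(*ci95)  # approximate 99% CI

print(contains_zero(*ci95))     # False: zero difference is outside the 95% interval
print(contains_zero(*ci99))     # True: the wider 99% interval includes zero
```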

95% confidence interval of difference in hemoglobin, (g/l)


The figure shows the 95% confidence interval of the mean hemoglobin difference between the two groups as a line. The line crosses the zero mark, so a zero difference between the means is plausible, which is consistent with the null hypothesis that the groups do not differ. The difference between the groups ranges from -2 to 5 g/l, which means that hemoglobin may either decrease by 2 g/l or increase by 5 g/l.

The confidence interval is a very important indicator. It lets you see whether statistically significant differences between groups reflect a real difference in the means or are due to a large sample, because with a large sample the chances of finding differences are greater than with a small one.

In practice, it might look like this. We took a sample of 1000 people, measured the hemoglobin level and found that the confidence interval for the difference in the means lies from 1.2 to 1.5 g/L. The level of statistical significance in this case p

We see that the concentration of hemoglobin has increased, but almost imperceptibly, therefore, statistical significance appeared due to the sample size.

Confidence intervals can be calculated not only for means but also for proportions (and risk ratios). For example, we may be interested in the confidence interval for the proportion of patients who achieved remission while taking the drug under development. Assume that the 95% CI for the proportion of such patients is 0.60-0.80. Thus, we can say that our drug has a therapeutic effect in 60 to 80% of cases.