Measures of dispersion describe the scatter of the data, that is, how much the data vary around the central tendency, whereas measures of central tendency describe the typical values of the data. Two separate samples may have the same mean or median but different levels of variability, or vice versa. A proper description of a data set should therefore always report both properties. There are several ways to quantify dispersion, and each has its own benefits and drawbacks.
Range
The range is the difference between the largest and smallest values in the sample data set. It is one of the easiest measures of dispersion to calculate. Because it considers only the most extreme values, it is highly sensitive to outliers and unusual observations, and it does not represent the distribution in the middle of the data set well.
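To make the sensitivity to outliers concrete, here is a minimal Python sketch. The function name `data_range` and both samples are hypothetical, chosen only for illustration; the two samples are identical except for one unusually large observation.

```python
def data_range(values):
    """Return the range: largest value minus smallest value."""
    if not values:
        raise ValueError("values must not be empty")
    return max(values) - min(values)


# Hypothetical samples: identical except for one unusually large observation.
typical = [4, 5, 6, 7, 8]
with_outlier = [4, 5, 6, 7, 30]

print(data_range(typical))       # 4
print(data_range(with_outlier))  # 26
```

A single extreme value changes the range from 4 to 26, even though the other four observations are unchanged.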
Percentile
If the location k(n+1)/100 of the k-th percentile is not an integer, then the weighted average estimate makes use of simple interpolation between the two neighboring ordered values, using the formula below:
P_k = (1 - m) X_i + m X_{i+1}
Where:
m – is the fractional part of the location
i – is the integer part of the location
k – is the percentile wanted (for example, k = 95 for P_{95})
Decile
The deciles divide the ordered observations into ten equal parts. For example, the first decile, D_1, is the value that separates the bottom 10% of the data from the top 90%. To obtain the deciles, sort the data and find the values that mark each tenth. The location of the k-th decile is:
Location of D_k = k(n+1)/10 – Weighted average estimate method
If the location k(n+1)/10 is not an integer, then the weighted average estimate makes use of simple interpolation between the two neighboring ordered values, using the formula below:
D_k = (1 - m) X_i + m X_{i+1}
Where:
m – is the fractional part of the location
i – is the integer part of the location
k – is the decile wanted (for example, k = 6 for D_6)
Quartile
The quartiles divide the ordered observations into four (4) equal parts. The location of the k-th quartile is:
Location of Q_k = k(n+1)/4 – Weighted average estimate method
If the location k(n+1)/4 is not an integer, then the weighted average estimate makes use of simple interpolation between the two neighboring ordered values, using the formula below:
Q_k = (1 - m) X_i + m X_{i+1}
Where:
m – is the fractional part of the location
i – is the integer part of the location
k – is the quartile wanted (for example, k = 1 for Q_1)
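The percentile, decile, and quartile formulas above follow one pattern: compute the location k(n+1)/parts, split it into an integer part i and a fractional part m, then interpolate between the i-th and (i+1)-th ordered values. A minimal Python sketch of that pattern follows. The function name `weighted_average_quantile`, the `parts` parameter, and the sample are hypothetical. The text does not say what to do when the location falls below 1 or at or beyond n, so returning the smallest or largest value in those cases is an assumption of this sketch.

```python
def weighted_average_quantile(data, k, parts):
    """Weighted average estimate of the k-th quantile.

    parts = 100 gives percentiles, 10 gives deciles, 4 gives quartiles.
    """
    if not 0 < k < parts:
        raise ValueError("k must satisfy 0 < k < parts")
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")

    # Location k(n+1)/parts, split into integer part i and fractional part m.
    i, remainder = divmod(k * (n + 1), parts)
    m = remainder / parts

    # Assumption (not from the text): clamp locations outside 1..n
    # to the smallest or largest observation.
    if i < 1:
        return xs[0]
    if i >= n:
        return xs[-1]

    # X_i is xs[i - 1] because Python lists are 0-indexed.
    return (1 - m) * xs[i - 1] + m * xs[i]


# Hypothetical sample with n = 7 observations.
sample = [2, 4, 6, 8, 10, 12, 14]

# Q_1: location 1(7+1)/4 = 2, an integer, so Q_1 is the 2nd ordered value.
print(weighted_average_quantile(sample, 1, 4))   # 4.0

# D_3: location 3(7+1)/10 = 2.4, so interpolate between the 2nd and 3rd values.
print(weighted_average_quantile(sample, 3, 10))
```

Using `divmod` on integers keeps the integer and fractional parts of the location exact, which avoids rounding surprises when the location is close to a whole number.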
Example 1: The following data set gives the number of years of operation of 20 mining companies: 4, 6, 7, 5, 6, 30, 23, 25, 20, 21, 17, 18, 17, 19, 11, 10, 10, 8, 20, 16. Determine the 95th percentile (P_{95}), the sixth decile (D_6), and the first quartile (Q_1).
Solution: Arrange the data set in order.
4, 5, 6, 6, 7, 8, 10, 10, 11, 16, 17, 17, 18, 19, 20, 20, 21, 23, 25, 30
First compute the location of P_{95}: 95(20 + 1)/100 = 19.95. Since the location is not an integer, interpolate with i = 19 and m = 0.95: P_{95} = (1 - 0.95) X_{19} + 0.95 X_{20} = 0.05(25) + 0.95(30) = 1.25 + 28.5 = 29.75 years.
Therefore, we can say that 95 percent of the 20 mining companies have been operating for less than 29.75 years.
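The hand computation above can be cross-checked with Python's standard library. This is a sketch that assumes Python 3.8 or later, where `statistics.quantiles` with `method="exclusive"` places the k-th cut point at location k(n+1)/100 and interpolates linearly, which matches the weighted average estimate for this example. The variable names are hypothetical.

```python
import statistics

# Years of operation of the 20 mining companies from Example 1.
years = [4, 6, 7, 5, 6, 30, 23, 25, 20, 21,
         17, 18, 17, 19, 11, 10, 10, 8, 20, 16]

# n=100 requests the 99 percentile cut points; the function sorts the data itself.
percentiles = statistics.quantiles(years, n=100, method="exclusive")

# The returned list is 0-indexed, so the 95th percentile sits at index 94.
p95 = percentiles[94]
print(p95)  # 29.75
```

The same call with `n=10` or `n=4` returns the decile or quartile cut points, which is one way to check your own answers for the remaining parts of the example.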
The solutions for D_6 and Q_1 are left as exercises.
I would be happy if you shared your solution or answer in the chatbox.
Happy Reading!