Quartiles, Deciles and Percentiles
Quartiles divide a set of data into four equal parts.

The median divides a set of data into two parts, each with an equal number of items.

The first quartile, usually called the lower quartile, is the value below which 25% of the data items lie. It can be described as the median of the lower half of the data.

The second quartile is the median of the whole data set: 50% of the data items lie below it.

The third quartile, usually called the upper quartile, is the value below which 75% of the data items lie. It can be described as the median of the upper half of the data set.

Formula for the first quartile Q₁

Q₁ = L + ((n/4 - c) / f) × i

where
- L is the lower class boundary of the lower quartile class
- n is the total frequency
- c is the cumulative frequency of the classes before the lower quartile class (the row above it in the cumulative frequency table)
- i is the class interval
- f is the frequency of the lower quartile class
Formula for the second quartile Q₂

The second quartile Q₂ is the median of the data. It is calculated from:

Q₂ = L + ((n/2 - c) / f) × i

where
- L is the lower class boundary of the median class
- n is the total frequency
- c is the cumulative frequency of the classes before the median class (the row above it in the cumulative frequency table)
- i is the class interval
- f is the frequency of the median class
Formula for the third quartile Q₃

Q₃ = L + ((3n/4 - c) / f) × i

where
- L is the lower class boundary of the upper quartile class
- n is the total frequency
- c is the cumulative frequency of the classes before the upper quartile class (the row above it in the cumulative frequency table)
- i is the class interval of the upper quartile class
- f is the frequency of the upper quartile class
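
The three quartile formulas differ only in the position used (n/4, n/2 or 3n/4), so they can be sketched as one Python function. This is a minimal illustration rather than a definitive implementation: the function name `grouped_quantile` and the sample table (lower class boundaries 50.5, 55.5, 60.5, 65.5 with frequencies 8, 5, 5, 2 and a class interval of 5) are assumptions chosen for the example.

```python
def grouped_quantile(lower_boundaries, frequencies, interval, fraction):
    """Estimate a quantile of grouped data as L + ((fraction*n - c) / f) * i.

    fraction is 0.25 for Q1, 0.5 for Q2 (the median) and 0.75 for Q3.
    All classes are assumed to share the same class interval.
    """
    n = sum(frequencies)      # total frequency
    position = fraction * n   # n/4, n/2 or 3n/4
    c = 0                     # cumulative frequency before the current class
    for L, f in zip(lower_boundaries, frequencies):
        if f > 0 and c + f >= position:
            # The position falls in this class, so it is the quartile class.
            return L + (position - c) * interval / f
        c += f
    raise ValueError("fraction must lie between 0 and 1")


# Hypothetical grouped table, used only for illustration.
boundaries = [50.5, 55.5, 60.5, 65.5]
freqs = [8, 5, 5, 2]

q1 = grouped_quantile(boundaries, freqs, 5, 0.25)
q2 = grouped_quantile(boundaries, freqs, 5, 0.50)
q3 = grouped_quantile(boundaries, freqs, 5, 0.75)
print(q1, q2, q3)
```

With these assumed values the sketch gives Q₁ = 53.625, Q₂ = 57.5 and Q₃ = 62.5.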
Deciles

Deciles divide a set of data into ten equal parts.

The position of the first decile is found by dividing n by 10, that is, first decile position = n/10, where n is the total frequency of the data. In general, the k-th decile is at position kn/10.

Percentiles

Percentiles divide a set of data into a hundred equal parts.

The position of the first percentile is (1/100) × n, and the k-th percentile is at position (k/100) × n.

For quartiles, deciles and percentiles, the data must first be arranged in ascending order.
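
The positions used for the median, quartiles, deciles and percentiles all follow the same pattern, k × n divided by the number of equal parts. The short Python sketch below shows this; the helper name `quantile_position` is an assumption made for illustration, and n = 130 is the total frequency used in Example 5.

```python
def quantile_position(k, parts, n):
    """Position of the k-th of `parts` equal divisions in n ordered items."""
    return k * n / parts


n = 130  # total frequency used in Example 5

median_pos = quantile_position(1, 2, n)            # n/2
lower_quartile_pos = quantile_position(1, 4, n)    # n/4
upper_quartile_pos = quantile_position(3, 4, n)    # 3n/4
decile_1_pos = quantile_position(1, 10, n)         # n/10
percentile_80_pos = quantile_position(80, 100, n)  # (80/100) * n

print(median_pos, lower_quartile_pos, upper_quartile_pos,
      decile_1_pos, percentile_80_pos)
# 65.0 32.5 97.5 13.0 104.0
```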

Example 5

The table below shows the distribution of heights to the nearest cm of students in a school.

Table of heights of some students

Find (a) the median

(b)(i) lower quartile (ii) upper quartile (iii) 80th percentile.

Solution

(a) The new frequency table for the data is shown here

There are 130 students, so the median height is the height of the 65th student, since n/2 = 130/2 = 65.

The 65th student falls in the 150-159 class. This class is called the median class.

Using the formula for the median:

(b) (i)

Lower quartile Q₁ = L + ((n/4 - c) / f) × i, that is:

(ii)

Upper quartile Q₃ = L + ((3n/4 - 23) / 9) × 5

(iii)

The 80th percentile is at position (80/100) × 130 = 104, that is, the 104th value.

The 104th student falls in the 160-169 class.

80th percentile = L + (((80/100)n - c) / f) × i

The complete solution is as below:

Example

Determine the lower quartile and upper quartile for the following set of data

15, 20, 16, 15, 18, 17, 13, 9, 17, 18, 11

Solution

Arranging the data in ascending order:

9, 11, 13, 15, 15, 16, 17, 17, 18, 18, 20

The median is 16. There are 5 values to the left of 16 and 5 values to the right.

16 is at the center of the data list:

9, 11, 13, 15, 15 | 16 | 17, 17, 18, 18, 20

The lower half contains: 9, 11, 13, 15, 15

The central value of the lower half is 13, so 13 is the first quartile of the data.

The upper half contains: 17, 17, 18, 18, 20

The central value of the upper half is 18, so 18 is the upper quartile of the data.
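
The median-of-halves method used in this example can be sketched in Python as follows. The function name `quartiles_by_halves` is an assumption made for illustration. Other conventions for the quartiles of ungrouped data exist and can give slightly different values; this sketch only mirrors the steps above.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median-of-halves method.

    With an odd number of items the middle item is left out of both halves,
    as in the worked example.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # items to the left of the median
    upper = ordered[half + n % 2:]  # items to the right of the median
    return median(lower), median(ordered), median(upper)


data = [15, 20, 16, 15, 18, 17, 13, 9, 17, 18, 11]
q1, q2, q3 = quartiles_by_halves(data)
print(q1, q2, q3)  # 13 16 18
```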

January 31, 2024
Grouped and Ungrouped data
In statistics, data items can be considered in groups instead of individually, especially when the number of records is large.

In grouping, you take a few neighbouring items and put them in one group. For example, if you have the items 41, 42, 42, 43, 45, 46, you can place the values from 41 to 45 in a single group 41-45 instead of listing them individually.

Let us consider the data below, which represents the ages of 20 senior workers in a company:

63, 53, 58, 64, 54, 64, 58, 67, 54, 54, 56, 53, 51, 52, 58, 53, 63, 65, 67, 58.

We can make the frequency table as we discussed earlier.

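| Age | Tally | Frequency |
|---|---|---|
| 51 | / | 1 |
| 52 | / | 1 |
| 53 | /// | 3 |
| 54 | /// | 3 |
| 56 | / | 1 |
| 58 | //// | 4 |
| 63 | // | 2 |
| 64 | // | 2 |
| 65 | / | 1 |
| 67 | // | 2 |
| Total | | 20 |

Ungrouped data of senior workers in a company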
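
A frequency table like this one can be built by counting how often each value occurs. The following Python sketch uses `collections.Counter` from the standard library on the ages listed above; the variable names are chosen only for illustration.

```python
from collections import Counter

ages = [63, 53, 58, 64, 54, 64, 58, 67, 54, 54,
        56, 53, 51, 52, 58, 53, 63, 65, 67, 58]

frequency = Counter(ages)  # maps each age to the number of times it occurs

# Print one row per distinct age: the age, its tally marks and its frequency.
for age in sorted(frequency):
    tally = "/" * frequency[age]
    print(age, tally, frequency[age])

print("Total", sum(frequency.values()))
```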

We can reduce the size of the table by grouping the data into classes of 5 values each, as shown. Note that the first column has changed from Age to Class, because each row now represents a range of ages.

| Class | Tally | Frequency |
|---|---|---|
| 51-55 | ~~////~~ /// | 8 |
| 56-60 | ~~////~~ | 5 |
| 61-65 | ~~////~~ | 5 |
| 66-70 | // | 2 |
| Total | | 20 |

Grouped data for senior workers in a company
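
The grouping step can be sketched in Python as below. The function name `group_frequencies` and its parameters are assumptions made for illustration; the sketch assumes whole-number data and classes of equal size.

```python
ages = [63, 53, 58, 64, 54, 64, 58, 67, 54, 54,
        56, 53, 51, 52, 58, 53, 63, 65, 67, 58]


def group_frequencies(values, first_lower_limit, class_size, number_of_classes):
    """Count how many values fall in each class, e.g. 51-55, 56-60, ..."""
    table = {}
    for k in range(number_of_classes):
        lower = first_lower_limit + k * class_size
        upper = lower + class_size - 1  # upper class limit for whole-number data
        label = f"{lower}-{upper}"
        table[label] = sum(1 for v in values if lower <= v <= upper)
    return table


grouped = group_frequencies(ages, 51, 5, 4)
print(grouped)  # {'51-55': 8, '56-60': 5, '61-65': 5, '66-70': 2}
```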

Measurements such as height, mass, age and time are usually estimates of the actual values. Any value from 50.5 up to, but not including, 51.5 would be recorded as 51, so we can write the interval for x as 50.5 ≤ x < 51.5.

The class interval 51-55 therefore includes all values from 50.5 to 55.5.

The values 50.5 and 55.5 are called the class boundaries of the class 51-55.

50.5 is the lower class boundary in this case and 55.5 is the upper class boundary.

The difference between the class boundaries is the class width (class size). In the example above, class width = 55.5 - 50.5 = 5.
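
As a small illustration, the class boundaries and class width can be computed from the class limits. The sketch below assumes the values are recorded to the nearest whole unit, so each boundary lies half a unit beyond the limit; the function name `class_boundaries` and the `precision` parameter are hypothetical.

```python
def class_boundaries(lower_limit, upper_limit, precision=1):
    """Boundaries of a class whose limits are recorded to the nearest `precision`."""
    half = precision / 2
    return lower_limit - half, upper_limit + half


lower_boundary, upper_boundary = class_boundaries(51, 55)
class_width = upper_boundary - lower_boundary
print(lower_boundary, upper_boundary, class_width)  # 50.5 55.5 5.0
```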

When grouping data, ensure the groups are not too many; 5 to 12 groups is the usual recommendation.

Practice question

The data below shows the masses of 30 animals on a farm.

27, 28, 24, 25, 30, 40, 30, 28, 26, 43, 27, 28, 33, 35, 36, 27, 30, 28, 31, 30, 28, 29, 30, 35, 32, 26, 25, 42, 43, 27.

Required:

(a) Make a grouped frequency table for the data

(b) Represent the grouped data in a bar graph and then in a pie chart

Solution

The first step is determining the number of classes. We do this from the range and the size of each group. Let each group have n items and let the range be R.

The number of groups (classes) is R/n, rounded up to the next whole number.

The range is the difference between the highest value and the lowest value. In the above data, the range = 43 - 24 = 19.

If we want each class to have five items, then the number of classes = 19/5 = 3.8, which rounds up to 4. However, we said the best number is between 5 and 12, so we can reduce the number of items per group to 4.

Hence 19/4 = 4.75, which rounds up to 5 classes.

Five classes are better than four because fewer items in a group can increase accuracy when calculating the measures of central tendency.

The first group starts from the lowest value, and we add 3 to get the upper limit of that group. Note that we add 3 and not 4 because the lower limit itself counts as one of the 4 values in the group (24, 25, 26, 27).
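
The steps for choosing the classes can be sketched in Python as follows. The function name `class_limits` is an assumption made for illustration, and the sketch follows the rule described here: range divided by class size, rounded up.

```python
import math

masses = [27, 28, 24, 25, 30, 40, 30, 28, 26, 43, 27, 28, 33, 35, 36,
          27, 30, 28, 31, 30, 28, 29, 30, 35, 32, 26, 25, 42, 43, 27]


def class_limits(values, class_size):
    """Return (range, number of classes, list of (lower, upper) class limits)."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest
    number_of_classes = math.ceil(data_range / class_size)  # round up
    limits = []
    for k in range(number_of_classes):
        lower = lowest + k * class_size
        # Add class_size - 1 because the lower limit is itself in the class.
        limits.append((lower, lower + class_size - 1))
    return data_range, number_of_classes, limits


print(class_limits(masses, 5)[:2])  # (19, 4)
print(class_limits(masses, 4))
# (19, 5, [(24, 27), (28, 31), (32, 35), (36, 39), (40, 43)])
```

One caveat of this rule: when the range is an exact multiple of the class size, the classes produced stop just short of the highest value, so an extra class would be needed.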

The frequency table for the grouped data should be as follows:

| Class | Tally | Frequency |
|---|---|---|
| 24-27 | ~~////~~ //// | 9 |
| 28-31 | ~~////~~ ~~////~~ // | 12 |
| 32-35 | //// | 4 |
| 36-39 | / | 1 |
| 40-43 | //// | 4 |
| Total | | 30 |

Frequency table for masses of animals in a farm

The data can then be represented in a bar graph.

Practice question

The marks obtained by students in a Java test were recorded as follows:

71, 73, 64, 58, 49, 52, 62, 68, 52, 48, 55, 63, 60, 71, 66, 61, 58, 57, 65, 64, 49, 52, 59, 53, 59, 74, 56, 57, 59, 66.

Required:
- Make a frequency distribution table for the data
- Draw a histogram to show this information
January 17, 2024