  • Quartiles, Deciles and Percentiles

    Quartiles divide a set of data into four equal parts.

    The median divides a set of data into two parts, each with an equal number of items.

    The first quartile, usually called the lower quartile, is the value below which 25% of the data items lie. It can be described as the median of the lower half of the data.

    The second quartile is the median of the whole data set (50%).

    The third quartile, usually called the upper quartile, is the value below which 75% of the data items lie. It can be described as the median of the upper half of the data set.

    Formula for the first quartile, Q1

    Q1 = L + ((n/4 - c) / f) × i

    where

    • L is the lower class boundary of the lower quartile class
    • n is the total frequency
    • c is the cumulative frequency of the classes before (above, in the table) the lower quartile class
    • i is the class interval
    • f is the frequency of the lower quartile class

    Formula for the second quartile, Q2

    The second quartile Q2 is the median of the data. It is calculated from:

    Q2 = L + ((n/2 - c) / f) × i

    where

    • L is the lower class boundary of the median class
    • n is the total frequency
    • c is the cumulative frequency of the classes before (above, in the table) the median class
    • i is the class interval
    • f is the frequency of the median class

    Formula for the third quartile, Q3

    Q3 = L + ((3n/4 - c) / f) × i

    where

    • L is the lower class boundary of the upper quartile class
    • n is the total frequency
    • c is the cumulative frequency of the classes before (above, in the table) the upper quartile class
    • i is the class interval of the upper quartile class
    • f is the frequency of the upper quartile class
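
    The three quartile formulas share the form L + ((position - c) / f) × i and differ only in the position they target (n/4, n/2 or 3n/4), so they can be sketched as one Python function. This is a minimal illustration only: the function name grouped_quantile and the frequency table are hypothetical (the table is made up and is not the heights table of Example 5), and class boundaries are assumed to lie 0.5 below and above the whole-number class limits.

```python
from typing import List, Tuple

# Each class is (lower_limit, upper_limit, frequency) with whole-number limits.
Table = List[Tuple[int, int, int]]


def grouped_quantile(table: Table, fraction: float) -> float:
    """Estimate a quantile of grouped data as L + ((fraction*n - c) / f) * i."""
    n = sum(f for _, _, f in table)      # total frequency
    target = fraction * n                # position sought, e.g. n/4 for Q1
    c = 0                                # cumulative frequency before the class
    for lower, upper, f in table:
        if f > 0 and c + f >= target:    # this is the quantile class
            L = lower - 0.5              # lower class boundary (assumed)
            i = (upper + 0.5) - L        # class interval (width)
            return L + (target - c) / f * i
        c += f
    raise ValueError("fraction must be between 0 and 1")


# Hypothetical frequency table, invented for this sketch (n = 40).
hypothetical = [(10, 19, 4), (20, 29, 6), (30, 39, 10), (40, 49, 12), (50, 59, 8)]

q1 = grouped_quantile(hypothetical, 0.25)   # 19.5 + (10 - 4)/6 * 10 = 29.5
q2 = grouped_quantile(hypothetical, 0.50)   # 29.5 + (20 - 10)/10 * 10 = 39.5
q3 = grouped_quantile(hypothetical, 0.75)   # 39.5 + (30 - 20)/12 * 10
print(q1, q2, round(q3, 2))
```

    Passing 0.8 as the fraction would give the 80th percentile in the same way.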

    Deciles

    Deciles divide a set of data into ten equal parts.

    The first decile is the item at position n/10 in the ordered data, where n is the total frequency. In general, the kth decile is at position kn/10.

    Percentiles

    Percentiles divide a set of data into one hundred equal parts.

    The first percentile is the item at position (1/100) × n, and the pth percentile is at position (p/100) × n.

    For quartiles, deciles and percentiles, the data must first be arranged in ascending order.
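
    As a small illustration, these positions can be computed in Python. The function names are hypothetical, and n = 130 is the total frequency used in Example 5.

```python
def decile_position(k: int, n: int) -> float:
    """Position of the k-th decile among n items arranged in ascending order."""
    return k * n / 10


def percentile_position(p: int, n: int) -> float:
    """Position of the p-th percentile among n items arranged in ascending order."""
    return p * n / 100


n = 130  # total frequency from Example 5
print(decile_position(1, n))        # 13.0
print(percentile_position(80, n))   # 104.0
```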

    Example 5

    The table below shows the distribution of heights to the nearest cm of students in a school.

    Table of heights of some students

    Find (a) the median

    (b)(i) lower quartile (ii) upper quartile (iii) 80th percentile.

    Solution

    (a) The new frequency table for the data is shown here

    There are 130 students, so the median height is that of the 65th student (130/2 = 65).

    The 65th student falls in the 150-159 class. This class is called the median class.

    Using the formula for the median:

    (b) (i)

    Lower quartile Q1 = L + ((n/4 - c) / f) × i, that is:

    (ii)

    Upper quartile Q3 = L + ((3n/4 - 23) × 5) / 9

    (iii)

    The 80th percentile is the value at position (80/100) × 130 = 104, that is, the 104th value.

    The 104th student falls in the 160-169 class.

    80th percentile = L + ((80n/100 - c) / f) × i

    The complete solution is as below:

    Example

    Determine the lower quartile and upper quartile for the following set of data

    15, 20, 16, 15, 18, 17, 13, 9, 17, 18, 11

    Solution

    Arranging the data in ascending order:

    9, 11, 13, 15, 15, 16, 17, 17, 18, 18, 20

    The median is 16. There are 5 values to the left of 16 and 5 values to the right.

    16 is at the centre of the data list:

    9, 11, 13, 15, 15 | 16 | 17, 17, 18, 18, 20

    The lower half contains: 9, 11, 13, 15, 15

    The central value of the lower half is 13, so 13 is the lower quartile of the data.

    The upper half contains: 17, 17, 18, 18, 20

    The central value of the upper half is 18, so 18 is the upper quartile of the data.
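
    The median-of-halves method used in this example can be sketched in Python as follows. The function name quartiles_by_halves is an assumption made for illustration. The sketch leaves the median out of both halves when the number of values is odd, as the example does. Other conventions, such as the interpolation used by many software packages, can give slightly different quartiles.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    data = sorted(values)                 # quartiles need ascending order
    half = len(data) // 2
    lower = data[:half]                   # values below the median position
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles_by_halves([15, 20, 16, 15, 18, 17, 13, 9, 17, 18, 11])
print(q1, q2, q3)  # 13 16 18
```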

  • Grouped and Ungrouped data

    In statistics, data items can be treated as groups instead of individually, especially when the number of records is large.

    In grouping, you take a few neighbouring values and put them in one group. For example, given the values 41, 42, 42, 43, 45, 46, you can place those from 41 to 45 in a single group 41-45 instead of listing them individually.

    Let us consider the data provided below that represents ages of some 20 senior workers in a company:

    63, 53, 58, 64, 54, 64, 58, 67, 54, 54, 56, 53, 51, 52, 58, 53, 63, 65, 67, 58.

    We can make a frequency table, as discussed earlier:

    Age     Tally   Frequency
    51      /       1
    52      /       1
    53      ///     3
    54      ///     3
    56      /       1
    58      ////    4
    63      //      2
    64      //      2
    65      /       1
    67      //      2
    Total           20
    Ungrouped data: ages of senior workers in a company

    We can reduce the size of the table by grouping the data into classes of 5 values, as shown. Note that the first column has changed from Age to Class, because each row now represents a range of ages.

    Class   Tally       Frequency
    51-55   ///// ///   8
    56-60   /////       5
    61-65   /////       5
    66-70   //          2
    Total               20
    Grouped data: ages of senior workers in a company
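
    A rough Python sketch of how the ungrouped and grouped counts could be produced from the raw ages. The helper group_frequencies is hypothetical. It assumes whole-number values, none smaller than the first class limit, and equal class widths.

```python
from collections import Counter

ages = [63, 53, 58, 64, 54, 64, 58, 67, 54, 54,
        56, 53, 51, 52, 58, 53, 63, 65, 67, 58]

# Ungrouped frequency table: one entry per distinct age.
ungrouped = Counter(ages)


def group_frequencies(values, start, width):
    """Count values per class: start..start+width-1, then the next class, and so on."""
    counts = Counter((v - start) // width for v in values)
    table = {}
    for index in range(max(counts) + 1):
        lower = start + index * width
        table[f"{lower}-{lower + width - 1}"] = counts.get(index, 0)
    return table


grouped = group_frequencies(ages, start=51, width=5)
for label, frequency in grouped.items():
    print(label, frequency)
# 51-55 8
# 56-60 5
# 61-65 5
# 66-70 2
```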

    Measurements such as height, mass, age and time are usually estimates of the actual values, so any value from 50.5 up to, but not including, 51.5 is recorded as 51. We can therefore write the interval for x as 50.5 ≤ x < 51.5.

    The class interval 51-55 includes all values from 50.5 to 55.5.

    The values 50.5 and 55.5 are called the class boundaries of the class 51-55.

    50.5 is the lower class boundary and 55.5 is the upper class boundary.

    The difference between the class boundaries is the class width (class size). In the example above, class width = 55.5 - 50.5 = 5.
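
    The boundary and width calculation can be sketched in Python. The function name class_boundaries and its unit parameter (the precision of the measurements, 1 here) are assumptions for illustration.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary) for a class given its limits."""
    half = unit / 2                      # values are rounded to the nearest unit
    return lower_limit - half, upper_limit + half


lower, upper = class_boundaries(51, 55)
width = upper - lower                    # class width (class size)
print(lower, upper, width)               # 50.5 55.5 5.0
```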

    When grouping data, avoid using too many groups; 5 to 12 groups is the usual recommendation.

    Practice question

    The data below shows the masses of 30 animals on a farm.

    27, 28, 24, 25, 30, 40, 30, 28, 26, 43, 27, 28, 33, 35, 36, 27, 30, 28, 31, 30, 28, 29, 30, 35, 32, 26, 25, 42, 43, 27.

    Required:

    (a) Make a grouped frequency table for the data

    (b) Represent the grouped data in a bar graph and then in a pie chart

    Solution

    The first step is determining the number of classes. We do this from the range and the size of each group. Let each group hold n values and let the range be R.

    The number of classes is R/n, rounded up to the next whole number.

    The range is the difference between the highest and the lowest value. In the data above, the range = 43 - 24 = 19.

    If we want each class to hold five values, the number of classes = 19/5 = 3.8, which rounds up to 4. However, the recommended number of classes is between 5 and 12, so we reduce the number of values per class to 4.

    Hence 19/4 = 4.75, which rounds up to 5 classes.

    Five classes are better than four because having fewer values in each group can increase accuracy when calculating the measures of central tendency.

    The first class starts from the lowest value; add 3 to it to get the upper limit of that class. We add 3 and not 4 because the lower limit itself counts as one of the 4 values in the class, so the first class is 24-27.

    The frequency table for the grouped data should be as follows:

    Class   Tally               Frequency
    24-27   ///// ////          9
    28-31   ///// ///// //      12
    32-35   ////                4
    36-39   /                   1
    40-43   ////                4
    Total                       30
    Frequency table for masses of animals on a farm
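
    The steps of this solution can be sketched in Python. The variable names are illustrative, and the sketch assumes whole-number masses with the first class starting at the lowest value.

```python
import math
from collections import Counter

masses = [27, 28, 24, 25, 30, 40, 30, 28, 26, 43,
          27, 28, 33, 35, 36, 27, 30, 28, 31, 30,
          28, 29, 30, 35, 32, 26, 25, 42, 43, 27]

values_per_class = 4
data_range = max(masses) - min(masses)                    # 43 - 24 = 19
num_classes = math.ceil(data_range / values_per_class)    # 4.75 rounds up to 5

start = min(masses)
# Index 0 is the first class, index 1 the second, and so on.
counts = Counter((m - start) // values_per_class for m in masses)

table = []
for k in range(num_classes):
    lower = start + k * values_per_class
    upper = lower + values_per_class - 1   # add 3, not 4, to the lower limit
    table.append((f"{lower}-{upper}", counts.get(k, 0)))

for label, frequency in table:
    print(label, frequency)
# 24-27 9
# 28-31 12
# 32-35 4
# 36-39 1
# 40-43 4
```

    If the range were an exact multiple of the class size, rounding up R/n would leave the highest value outside the last class, so one more class would be needed in that case.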

    The data can be represented in a bar graph as shown.

    Practice question

    The marks obtained by students in a Java test were recorded as follows:

    71, 73, 64, 58, 49, 52, 62, 68, 52, 48, 55, 63, 60, 71, 66, 61, 58, 57, 65, 64, 49, 52, 59, 53, 59, 74, 56, 57, 59, 66.

    Required:

    • make a frequency distribution table for the data
    • Draw a histogram to show this information
