Tag: grouped Data

  • Grouped Data

    In grouped data, we need a single value to represent each class so that we can compute fx for each class and then add these products (∑fx) to estimate the mean. The value used to represent a class is its midpoint (x), obtained by adding the lower and upper class boundaries (or limits) and dividing the sum by two.

    Example

    The frequency table below shows masses in kilograms of some grade 10 students.

    Class (kg)    Frequency
    60-64         8
    65-69         7
    70-74         12
    75-79         9
    80-84         6
    85-89         5
    A table showing frequency against mass class for the students

    Required:

    • (a) State the modal class
    • (b) Estimate the mean
    • (c) Estimate the median

    Solution

    (a) The modal class is 70-74 because it has the highest frequency (12).
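
    As a small illustration, the modal class can be picked out by looking for the largest frequency. This is only a sketch; the variable names are chosen here and are not part of the original question.

```python
# Class labels and frequencies copied from the table of student masses.
classes = ["60-64", "65-69", "70-74", "75-79", "80-84", "85-89"]
frequencies = [8, 7, 12, 9, 6, 5]

# The modal class is the class paired with the largest frequency.
modal_frequency = max(frequencies)
modal_class = classes[frequencies.index(modal_frequency)]

print(modal_class, modal_frequency)  # 70-74 12
```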

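    (b) We redraw the table to include a column for the midpoint values (x) and a column for fx.

    The midpoint of class 60-64 is (60+64)/2 = 62. Following the same step, the midpoints of the other classes are 67, 72, 77, 82 and 87.

    Class (kg)    Midpoint (x)    Frequency (f)    fx
    60-64         62              8                496
    65-69         67              7                469
    70-74         72              12               864
    75-79         77              9                693
    80-84         82              6                492
    85-89         87              5                435
    Total                         47               3449

    Mean x̄ = (∑fx)/(∑f)

    hence Mean x̄ = 3449/47 ≈ 73.38 kg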
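
    The same calculation can be sketched in Python. The function name grouped_mean and its arguments are illustrative choices made here, not something from the original lesson; it simply applies Mean x̄ = (∑fx)/(∑f) to the class midpoints.

```python
def grouped_mean(class_limits, frequencies):
    """Estimate the mean of grouped data from class midpoints."""
    # Midpoint x of each class: (lower limit + upper limit) / 2.
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    # Sum of f*x over all classes, and the total frequency.
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_f = sum(frequencies)
    return midpoints, sum_fx, sum_f, sum_fx / sum_f


class_limits = [(60, 64), (65, 69), (70, 74), (75, 79), (80, 84), (85, 89)]
frequencies = [8, 7, 12, 9, 6, 5]

midpoints, sum_fx, sum_f, mean = grouped_mean(class_limits, frequencies)
print(midpoints)        # [62.0, 67.0, 72.0, 77.0, 82.0, 87.0]
print(sum_fx, sum_f)    # 3449.0 47
print(round(mean, 2))   # 73.38
```

    The result is in kilograms, and it is an estimate because every value in a class is treated as if it were equal to the class midpoint.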

    (c) The median is the value at the 24th position, since (47+1)/2 = 24. This is because if all the data were arranged in ascending order, the middle one would be at the 24th position: there would be 23 values to its left and 23 values to its right.

    If we add the frequencies cumulatively, we get 8, 15, 27, 36, 42 and 47 respectively. This means there are 27 values of 74 kg and below, so the middle value (the 24th value) is found in the class 70-74.

    The lower class boundary of the median class is 69.5 kg, which is the best estimate of the mass of the 15th person.

    To get the mass of the 24th person, we need 9 more people from the median class, which holds 12 people and has a class width of 5.

    So we get 69.5 + ((24-15)/12) × 5 = 69.5 + 3.75 = 73.25 kg
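
    The steps for part (c) can be sketched as follows. The helper value_at_position is a hypothetical name introduced here; it assumes the values are spread evenly across the median class, which is the assumption behind the interpolation used above.

```python
def value_at_position(position, lower_boundary, cumulative_before,
                      class_frequency, class_width):
    """Interpolate the value at a given position inside a class.

    Assumes the values are spread evenly across the class.
    """
    fraction = (position - cumulative_before) / class_frequency
    return lower_boundary + fraction * class_width


frequencies = [8, 7, 12, 9, 6, 5]
lower_boundaries = [59.5, 64.5, 69.5, 74.5, 79.5, 84.5]
position = (sum(frequencies) + 1) // 2   # the 24th value, as in the text

# Walk through the cumulative frequencies to find the median class.
cumulative = 0
for index, f in enumerate(frequencies):
    if cumulative + f >= position:
        break
    cumulative += f

median = value_at_position(position, lower_boundaries[index], cumulative,
                           frequencies[index], 5)
print(index, cumulative, median)  # 2 15 73.25
```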

    Exercise

    (a) The table below shows the masses of some people that visited a hospital

    Class    Mid-point (x)    Frequency (f)    fx
    51-55                     9
    56-60                     13
    61-65                     15
    66-70                     17
    71-75                     24
    A table of masses of people that visited a hospital

    (a) Copy and complete the table.

    (b) Find:

    1. The mean
    2. The median

    (b) The heights in centimeters of 25 people were measured as follows:

    156, 170, 185, 167, 179, 180, 174, 169, 169, 162, 159, 162, 165, 179, 174, 175, 184, 189, 183, 190, 165, 156, 158, 169, 162.

    (a) Using a class width of 5, make a grouped frequency table.

    (b) From the table, estimate:

    • The mean
    • The median

    Solution

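    (a) The lowest value is 156, so the first group will be 156-160. The highest value is 190, so the range is 190-156 = 34.

    34/5 = 6.8 ≈ 7.

    So there are about 7 groups. We develop a table as shown:

    Class      Mid-point (x)    Frequency (f)    fx      Cumulative frequency
    156-160    158              4                632     4
    161-165    163              5                815     9
    166-170    168              5                840     14
    171-175    173              3                519     17
    176-180    178              3                534     20
    181-185    183              3                549     23
    186-190    188              2                376     25
    Total                       25               4265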
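
    The grouping step can also be done programmatically. The sketch below is one possible way to do it, assuming whole-number data and classes that start at the lowest value; the variable names are chosen here for illustration.

```python
heights = [156, 170, 185, 167, 179, 180, 174, 169, 169, 162, 159, 162, 165,
           179, 174, 175, 184, 189, 183, 190, 165, 156, 158, 169, 162]

class_width = 5
lowest = min(heights)   # 156, the lower limit of the first class

# Count how many heights fall in each class of width 5.
counts = {}
for h in heights:
    lower = lowest + ((h - lowest) // class_width) * class_width
    label = f"{lower}-{lower + class_width - 1}"
    counts[label] = counts.get(label, 0) + 1

# All labels have three-digit limits, so sorting the strings sorts the classes.
for label in sorted(counts):
    print(label, counts[label])
```

    The printed frequencies add up to 25, the number of people measured.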

    (b) The formula for calculating the mean is

    Mean x̄ = (∑fx)/(∑f)

    ∑fx = 4265 as read from the table

    ∑f = 25

    hence Mean x̄ = 4265/25 = 170.6 cm

    The total number of items, which is the sum of the frequencies (∑f), is 25.

    The median, which is the value at the center, is the 13th value.

    The class that contains the 13th value is the one that starts with the 10th value and ends with the 14th value, as shown in the table. That class is 166-170, and it is the median class.

    The lower class boundary of the median class is 165.5

    The cumulative frequency of the classes before the median class is 9

    The frequency of the median class is 5

    The formula for the median is given as

    Median = L + ((n/2 - C)/f) × i

    where L is the lower class boundary of the median class, n the total frequency, C the cumulative frequency of the classes before the median class, i the class width and f the frequency of the median class

    Median = 165.5 + ((12.5 - 9)/5) × 5 = 165.5 + 3.5 = 169

    Hence the median of the data is 169 cm
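
    As a check, the median formula can be written as a short function. This is a sketch that assumes the n/2 form of the formula, which is the form that reproduces the answer of 169; the function name grouped_median is chosen here, and the parameter names follow the symbols L, n, C, f and i defined above.

```python
def grouped_median(L, n, C, f, i):
    """Median of grouped data: L + ((n/2 - C) / f) * i."""
    # Multiply by the class width before dividing by f; the value is the same.
    return L + (n / 2 - C) * i / f


# Values for the heights example, where the median class is 166-170.
median = grouped_median(L=165.5, n=25, C=9, f=5, i=5)
print(median)  # 169.0
```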


    (c) The average temperatures at a weather station were recorded for 30 days as follows.

    29, 23, 22, 26, 35, 38, 45, 42, 30, 22, 35, 33, 34, 29, 36, 27, 38, 40, 44, 39, 38, 43, 40, 35, 30, 25, 29, 26, 28, 33, 34

    Required:

    (a) Using a class width of 5, make a frequency table for the data

    (b) Find the mean and the median for the data


  • Grouped and Ungrouped data

    In statistics, data items can be considered in groups instead of individually, especially when the number of records is large.

    In grouping, you take a few neighboring values and put them in one group. For example, if you have items like 41, 42, 42, 43, 45, 46, you can record the values from 41 to 45 as a single group, 41-45, instead of listing the numbers individually.

    Let us consider the data provided below that represents ages of some 20 senior workers in a company:

    63, 53, 58, 64, 54, 64, 58, 67, 54, 54, 56, 53, 51, 52, 58, 53, 63, 65, 67, 58.

    We can make the frequency table as we discussed earlier.

    Age      Tally    Frequency
    51       /        1
    52       /        1
    53       ///      3
    54       ///      3
    56       /        1
    58       ////     4
    63       //       2
    64       //       2
    65       /        1
    67       //       2
    Total             20
    Ungrouped data for senior workers in a company

    We can reduce the size of the table by grouping the data into classes of 5 values, as shown. Please note that we have changed the first column from Age to Class, meaning that each row now represents a class covering a certain age group.

    Class    Tally       Frequency
    51-55    //// ///    8
    56-60    ////        5
    61-65    ////        5
    66-70    //          2
    Total                20
    Grouped data for senior workers in a company
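
    The grouped frequencies can be reproduced with a few lines of Python. This is a minimal sketch: the class limits are taken from the grouped table, and the variable names are chosen here for illustration.

```python
ages = [63, 53, 58, 64, 54, 64, 58, 67, 54, 54,
        56, 53, 51, 52, 58, 53, 63, 65, 67, 58]

# Class limits taken from the grouped table.
class_limits = [(51, 55), (56, 60), (61, 65), (66, 70)]

# Frequency of a class = number of ages between its limits, inclusive.
frequencies = [sum(lower <= age <= upper for age in ages)
               for lower, upper in class_limits]

for (lower, upper), f in zip(class_limits, frequencies):
    print(f"{lower}-{upper}", f)
# 51-55 8
# 56-60 5
# 61-65 5
# 66-70 2
```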

    Measurements such as height, mass, age and time are usually estimates of the actual values, so any value from 50.5 up to, but not including, 51.5 would be recorded as 51. Therefore we can write the interval for a recorded value of 51 as 50.5 ≤ x < 51.5.

    A class interval 51-55 includes all values from 50.5 to 55.5.

    The values 50.5 and 55.5 are called the class boundaries of the class 51-55.

    50.5 is the lower class boundary in this case and 55.5 is the upper class boundary.

    The difference between the class boundaries is the class width (class size). In the example above, class width = 55.5 - 50.5 = 5
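
    The boundary rule can be expressed as a small helper. The function class_boundaries and its precision parameter are assumptions introduced here: precision is the unit the data were rounded to (1 for whole numbers), so each boundary lies half a unit beyond the class limit.

```python
def class_boundaries(lower_limit, upper_limit, precision=1):
    """Return (lower boundary, upper boundary) of a class.

    precision is the rounding unit of the data; it is an illustrative
    parameter, with 1 meaning the values were recorded as whole numbers.
    """
    half = precision / 2
    return lower_limit - half, upper_limit + half


lower, upper = class_boundaries(51, 55)
print(lower, upper)    # 50.5 55.5
print(upper - lower)   # 5.0
```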

    When grouping data, ensure that there are not too many groups; 5 to 12 groups is generally recommended.

    Practice question

    The data below shows the masses of 30 animals on a farm.

    27, 28, 24, 25, 30, 40, 30, 28, 26, 43, 27, 28, 33, 35, 36, 27, 30, 28, 31, 30, 28, 29, 30, 35, 32, 26, 25, 42, 43, 27.

    Required:

    (a) Make a grouped frequency table for the data

    (b) Represent the grouped data in a bar graph and then in a pie chart

    Solution

    The first step is determining the number of classes. We do this by finding the range and deciding the size of each group. Let us say each group should have n items and the range is R.

    The number of groups (classes) = R/n, rounded up to the next whole number.

    The range is the difference between the highest value and the lowest value. In the above data, the range = 43-24 = 19

    Assuming we want each class to have five items, the number of classes = 19/5 = 3.8, which rounds up to 4. However, we said the best number of classes is between 5 and 12, so we can reduce the number of items per group, say to 4.

    Hence 19/4 = 4.75 ≈ 5 classes

    Five classes are better than four because having fewer items in each group can increase accuracy when calculating the measures of central tendency.

    The first group starts from the lowest value, and we add 3 to get the upper limit of that group. Note that we add 3 and not 4 because the lower limit itself counts as one item, so only 3 more are needed to make 4 items in the group.
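
    The procedure for choosing the classes can be sketched in Python. The function make_classes is a hypothetical helper written for this illustration, and the extra check for a left-out highest value is an assumption added here, not a step from the lesson.

```python
import math


def make_classes(lowest, highest, items_per_class):
    """Return (lower, upper) class limits for whole-number data."""
    data_range = highest - lowest
    # Round R/n up to the next whole number.
    number_of_classes = math.ceil(data_range / items_per_class)
    # Assumption (not in the text): add one more class if the highest value
    # would otherwise be left out, which happens when R is a multiple of n.
    if lowest + number_of_classes * items_per_class - 1 < highest:
        number_of_classes += 1
    classes = []
    for k in range(number_of_classes):
        lower = lowest + k * items_per_class
        upper = lower + items_per_class - 1   # n items means adding n - 1
        classes.append((lower, upper))
    return classes


print(len(make_classes(24, 43, 5)))  # 4
print(make_classes(24, 43, 4))
# [(24, 27), (28, 31), (32, 35), (36, 39), (40, 43)]
```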

    The frequency table for the grouped data is as follows:

    Class    Tally            Frequency
    24-27    //// ////        9
    28-31    //// //// //     12
    32-35    ////             4
    36-39    /                1
    40-43    ////             4
    Total                     30
    Frequency table for masses of animals on a farm

    The data can be represented in a bar graph, with the classes along the horizontal axis and the frequencies along the vertical axis.
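
    A quick way to preview the bar graph is to print one text bar per class. The sketch below also works out the pie chart sector angles, using angle = (frequency / total) × 360, which is the usual rule and is stated here as an assumption because the lesson does not spell it out.

```python
classes = ["24-27", "28-31", "32-35", "36-39", "40-43"]
frequencies = [9, 12, 4, 1, 4]

# One row per class; the bar length equals the class frequency.
rows = [f"{label} | {'#' * f} {f}" for label, f in zip(classes, frequencies)]
print("\n".join(rows))

# Pie chart: each class gets a share of the 360 degrees.
total = sum(frequencies)
angles = [f * 360 / total for f in frequencies]
print(angles)  # [108.0, 144.0, 48.0, 12.0, 48.0]
```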

    Practice question

    The marks obtained by students in a Java test were recorded as follows:

    71, 73, 64, 58, 49, 52, 62, 68, 52, 48, 55, 63, 60, 71, 66, 61, 58, 57, 65, 64, 49, 52, 59, 53, 59, 74, 56, 57, 59, 66.

    Required:

    • Make a frequency distribution table for the data
    • Draw a histogram to show this information
