
Chapter 2

Organizing Data

RAW DATA
Information obtained by observing values of a variable is called raw data. Data obtained by
observing values of a qualitative variable are referred to as qualitative data. Data obtained by
observing values of a quantitative variable are referred to as quantitative data. Quantitative data
obtained from a discrete variable are also referred to as discrete data and quantitative data obtained
from a continuous variable are called continuous data.

EXAMPLE 2.1 A study is conducted in which individuals are classified into one of sixteen personality types
using the Myers-Briggs type indicator. The resulting raw data would be classified as qualitative data.

EXAMPLE 2.2 The cardiac output in liters per minute is measured for the participants in a medical study. The
resulting data would be classified as quantitative data and continuous data.

EXAMPLE 2.3 The number of murders per 100,000 inhabitants is recorded for each of several large cities for
the year 1994. The resulting data would be classified as quantitative data and discrete data.

FREQUENCY DISTRIBUTION FOR QUALITATIVE DATA


A frequency distribution for qualitative data lists all categories and the number of elements that
belong to each of the categories.

EXAMPLE 2.4 A sample of rural county arrests gave the following set of offenses with which individuals were
charged:

rape          robbery    burglary   arson      murder   robbery   rape    manslaughter
arson         theft      arson      burglary   theft    robbery   theft   theft
theft         burglary   murder     murder     theft    theft     theft   manslaughter
manslaughter

The variable, type of offense, is classified into the categories: rape, robbery, burglary, arson, murder, theft, and
manslaughter. As shown in Table 2.1, the seven categories are listed under the column entitled Offense, and each
occurrence of a category is recorded by using the symbol / in order to tally the number of times each offense
occurs. The number of tallies for each offense is counted and listed under the column entitled Frequency.
Occasionally the term absolute frequency is used rather than frequency.

Table 2.1
Offense        Tally       Frequency
Rape           //          2
Robbery        ///         3
Burglary       ///         3
Arson          ///         3
Murder         ///         3
Theft          ///// ///   8
Manslaughter   ///         3
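
The tallying in Table 2.1 can also be done by machine. The following is a minimal Python sketch, assuming the raw data of Example 2.4 are typed in as a whitespace-separated string; the variable names offenses and frequency are illustrative and do not come from the text.

```python
from collections import Counter

# Raw data from Example 2.4: one offense per arrest (25 arrests in all).
offenses = """
rape robbery burglary arson murder robbery rape manslaughter
arson theft arson burglary theft robbery theft theft
theft burglary murder murder theft theft theft manslaughter
manslaughter
""".split()

# Counter counts the occurrences of each category, which is exactly
# what the Tally and Frequency columns of Table 2.1 record by hand.
frequency = Counter(offenses)

# Print one row per category: name, tally marks, frequency.
for offense, count in frequency.items():
    print(f"{offense.capitalize():<14}{'/' * count:<10}{count}")
```

Each row prints the category, one stroke per occurrence, and the count, so the output can be checked line by line against Table 2.1.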


RELATIVE FREQUENCY OF A CATEGORY

The relative frequency of a category is obtained by dividing the frequency for a category by the
sum of all the frequencies. The relative frequencies for the seven categories in Table 2.1 are shown in
Table 2.2. The sum of the relative frequencies will always equal one.

PERCENTAGE

The percentage for a category is obtained by multiplying the relative frequency for that category
by 100. The percentages for the seven categories in Table 2.1 are shown in Table 2.2. The sum of the
percentages for all the categories will always equal 100 percent.

Table 2.2
Offense        Relative frequency   Percentage
Rape           2/25 = .08           .08 x 100 = 8%
Robbery        3/25 = .12           .12 x 100 = 12%
Burglary       3/25 = .12           .12 x 100 = 12%
Arson          3/25 = .12           .12 x 100 = 12%
Murder         3/25 = .12           .12 x 100 = 12%
Theft          8/25 = .32           .32 x 100 = 32%
Manslaughter   3/25 = .12           .12 x 100 = 12%
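
The two definitions above translate directly into arithmetic. The following Python sketch, which assumes the frequencies of Table 2.1 are entered by hand into a dictionary (the names frequency, relative, and percentage are illustrative), computes both columns and confirms the two totals.

```python
# Frequencies taken from Table 2.1.
frequency = {
    "Rape": 2, "Robbery": 3, "Burglary": 3, "Arson": 3,
    "Murder": 3, "Theft": 8, "Manslaughter": 3,
}

# Sum of all the frequencies (the number of arrests in the sample).
total = sum(frequency.values())

# Relative frequency = frequency of the category / sum of all frequencies.
relative = {name: f / total for name, f in frequency.items()}

# Percentage = relative frequency x 100.
percentage = {name: 100 * r for name, r in relative.items()}

for name, f in frequency.items():
    print(f"{name:<14}{f}/{total} = {relative[name]:.2f}   {percentage[name]:.0f}%")

# Up to floating-point rounding, the totals are 1 and 100 percent.
print(f"{sum(relative.values()):.2f}  {sum(percentage.values()):.0f}%")
```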

BAR GRAPH

A bar graph is a graph composed of bars whose heights are the frequencies of the different
categories. A bar graph displays graphically the same information concerning qualitative data that a
frequency distribution shows in tabular form.

EXAMPLE 2.5 The distribution of the primary sites for cancer is given in Table 2.3 for the residents of Dalton
County.

Table 2.3
Primary site        Frequency
Digestive system    20
Respiratory         30
Breast              10
Genitals            5
Urinary tract       5
Other               5

To construct a bar graph, the categories are placed along the horizontal axis and frequencies are marked along
the vertical axis. A bar is drawn for each category such that the height of the bar is equal to the frequency for that
category. A small gap is left between the bars. The bar graph for Table 2.3 is shown in Fig. 2-1. Bar graphs can
also be constructed by placing the categories along the vertical axis and the frequencies along the horizontal axis.
See problem 2.5 for a bar graph of this type.

[Bar graph: frequency (vertical axis) against primary site (horizontal axis) for Digestive system, Respiratory system, Breast, Genitals, Urinary tract, and Other]

Fig. 2-1
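
A rough text version of a bar graph can be produced without any plotting software. The Python sketch below draws the horizontal variant mentioned in Example 2.5, with categories down the side and one # per unit of frequency. The function name bar_lines is illustrative, and the frequency of 5 for Other is an assumption inferred from its relative frequency of .07 in Table 2.4.

```python
# Frequencies from Table 2.3. The value for "Other" is an assumption
# inferred from its relative frequency of .07 in Table 2.4.
frequency = {
    "Digestive system": 20,
    "Respiratory": 30,
    "Breast": 10,
    "Genitals": 5,
    "Urinary tract": 5,
    "Other": 5,
}

def bar_lines(freq, symbol="#"):
    """Return one text line per category; bar length equals the frequency."""
    width = max(len(name) for name in freq)  # pad names so the bars line up
    return [f"{name:<{width}} | {symbol * count} {count}"
            for name, count in freq.items()]

# A blank line between bars plays the role of the small gap between bars.
print("\n\n".join(bar_lines(frequency)))
```

Because the length of each bar equals the frequency of its category, the picture carries the same information as Table 2.3.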

PIE CHART
A pie chart is also used to graphically display qualitative data. To construct a pie chart, a circle is
divided into portions that represent the relative frequencies or percentages belonging to different
categories.

EXAMPLE 2.6 To construct a pie chart for the frequency distribution in Table 2.3, construct a table that gives
angle sizes for each category. Table 2.4 shows the determination of the angle sizes for each of the categories in
Table 2.3. The 360° in a circle are divided into portions that are proportional to the category sizes. The pie chart
for the frequency distribution in Table 2.3 is shown in Fig. 2-2.
Table 2.4
Primary site        Relative frequency   Angle size
Digestive system    .26                  360 x .26 = 93.6°
Respiratory         .40                  360 x .40 = 144°
Breast              .13                  360 x .13 = 46.8°
Genitals            .07                  360 x .07 = 25.2°
Urinary tract       .07                  360 x .07 = 25.2°
Other               .07                  360 x .07 = 25.2°
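
The angle sizes can be checked with a few lines of Python. This sketch assumes the two-decimal relative frequencies are copied from Table 2.4 as printed; the names relative and angle are illustrative.

```python
# Relative frequencies as printed in Table 2.4 (rounded to two decimals).
relative = {
    "Digestive system": 0.26,
    "Respiratory": 0.40,
    "Breast": 0.13,
    "Genitals": 0.07,
    "Urinary tract": 0.07,
    "Other": 0.07,
}

# Angle size = 360 degrees x relative frequency, rounded to one decimal
# place to suppress floating-point noise.
angle = {site: round(360 * r, 1) for site, r in relative.items()}

for site, degrees in angle.items():
    print(f"{site:<18}360 x {relative[site]:.2f} = {degrees} degrees")

# The slices must fill the whole circle.
print("Total:", round(sum(angle.values()), 1))
```

Checking that the angles total 360 degrees is a quick way to catch a relative frequency that was copied or rounded incorrectly.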

[Pie chart titled "Primary cancer sites"; legible slice labels: Respiratory system (40.0%), 13.3%, and 6.7%]

Fig. 2-2

FREQUENCY DISTRIBUTION FOR QUANTITATIVE DATA

There are many similarities between frequency distributions for qualitative data and frequency
distributions for quantitative data. Terminology for frequency distributions of quantitative data is
discussed first, and then examples illustrating the construction of frequency distributions for
quantitative data are given. Table 2.5 gives a frequency distribution of the Stanford-Binet intelligence
test scores for 75 adults.

Table 2.5
IQ score    Frequency
80-94       8
95-109      14
110-124     24
125-139     16
140-154     13

IQ score is a quantitative variable and according to Table 2.5, eight of the individuals have an IQ
score between 80 and 94, fourteen have scores between 95 and 109, twenty-four have scores between
110 and 124, sixteen have scores between 125 and 139, and thirteen have scores between 140 and 154.

CLASS LIMITS, CLASS BOUNDARIES, CLASS MARKS, AND CLASS WIDTH

The frequency distribution given in Table 2.5 is composed of five classes. The classes are: 80-94,
95-109, 110-124, 125-139, and 140-154. Each class has a lower class limit and an upper class limit.
The lower class limits for this distribution are 80, 95, 110, 125, and 140. The upper class limits are 94,
109, 124, 139, and 154.
If the lower class limit for the second class, 95, is added to the upper class limit for the first class,
94, and the sum divided by 2, the upper boundary for the first class and the lower boundary for the
second class is determined. Table 2.6 gives all the boundaries for Table 2.5.
If the lower class limit is added to the upper class limit for any class and the sum divided by 2, the
class mark for that class is obtained. The class mark for a class is the midpoint of the class and is
sometimes called the class midpoint rather than the class mark. The class marks for Table 2.5 are
shown in Table 2.6.
The difference between the boundaries for any class gives the class width for a distribution. The
class width for the distribution in Table 2.5 is 15.

Table 2.6
Class limits   Class boundaries   Class width   Class marks
80-94          79.5-94.5          15            87.0
95-109         94.5-109.5         15            102.0
110-124        109.5-124.5        15            117.0
125-139        124.5-139.5        15            132.0
140-154        139.5-154.5        15            147.0
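
The entries of Table 2.6 follow mechanically from the class limits. The Python sketch below starts from the class limits of Table 2.5 and derives the boundaries, marks, and widths. It assumes that the lower boundary of the first class and the upper boundary of the last class lie the same half-gap beyond their limits as the interior boundaries do, which agrees with the 79.5 and 154.5 shown in the table; the variable names are illustrative.

```python
# Class limits from Table 2.5, as (lower limit, upper limit) pairs.
class_limits = [(80, 94), (95, 109), (110, 124), (125, 139), (140, 154)]

# Gap between the upper limit of one class and the lower limit of the next.
gap = class_limits[1][0] - class_limits[0][1]  # 95 - 94 = 1

# A boundary is the midpoint between adjacent limits, so each limit
# moves outward by half the gap. Assumption: the outermost boundaries
# use the same half-gap as the interior ones.
boundaries = [(lower - gap / 2, upper + gap / 2) for lower, upper in class_limits]

# Class mark = (lower class limit + upper class limit) / 2.
marks = [(lower + upper) / 2 for lower, upper in class_limits]

# Class width = upper boundary - lower boundary.
widths = [upper - lower for lower, upper in boundaries]

for limits, bounds, width, mark in zip(class_limits, boundaries, widths, marks):
    print(f"{limits[0]}-{limits[1]:<6}{bounds[0]}-{bounds[1]:<8}{width:<7}{mark}")
```

Note that the width is computed from the boundaries and not from the limits: 94 - 80 is 14, while 94.5 - 79.5 gives the correct class width of 15.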

When forming a frequency distribution, the following general guidelines should be followed:
1. The number of classes should be between 5 and 15.
2. Each data value must belong to one, and only one, class.
3. When possible, all classes should be of equal width.
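
The three guidelines can be checked in code when a frequency distribution is built. The following Python sketch uses the classes of Table 2.5. The raw IQ scores are not given in the text, so sample_scores is a small hypothetical data set made up for illustration, and the function names check_guidelines and build_distribution are likewise illustrative.

```python
# Classes from Table 2.5, as (lower limit, upper limit) pairs.
class_limits = [(80, 94), (95, 109), (110, 124), (125, 139), (140, 154)]

def check_guidelines(classes):
    """Return True if the classes satisfy guidelines 1 and 3."""
    enough_classes = 5 <= len(classes) <= 15
    # Equal spacing between limits implies equal class widths.
    equal_width = len({upper - lower for lower, upper in classes}) == 1
    return enough_classes and equal_width

def build_distribution(values, classes):
    """Count the values in each class, enforcing guideline 2."""
    counts = [0] * len(classes)
    for value in values:
        matches = [i for i, (lower, upper) in enumerate(classes)
                   if lower <= value <= upper]
        if len(matches) != 1:
            raise ValueError(f"{value} belongs to {len(matches)} classes")
        counts[matches[0]] += 1
    return counts

# Hypothetical scores, NOT the data behind Table 2.5.
sample_scores = [82, 97, 101, 110, 118, 124, 125, 139, 140, 154]

print(check_guidelines(class_limits))
for (lower, upper), count in zip(class_limits, build_distribution(sample_scores, class_limits)):
    print(f"{lower}-{upper:<6}{count}")
```

Raising an error for a value that falls in no class, or in more than one, makes a violation of guideline 2 visible immediately instead of silently losing or double-counting data.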
