
Summarizing and Describing

Numerical Data

Lectures 3+4+5 Topics


Measures of Central Tendency
Mean, Median, Mode

Measures of Variation
The Range, Variance and
Standard Deviation

Shape
Symmetric, Skewed, Skewness, Kurtosis

Summary Measures

Central Tendency: Mean, Median, Mode

Variation: Range, Variance, Standard Deviation, Coefficient of Variation

Measures of Central Tendency

Mean: \( \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \)

Median

Mode

The Mean (Arithmetic Mean, Average)
It is the Arithmetic Average of data values:

Sample Mean

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]

The Most Common Measure of Central Tendency


Affected by Extreme Values (Outliers)
[Figure: two dot plots on number lines. First data set: Mean = 5. Second data set, which includes an extreme value: Mean = 6.]

The Arithmetic Mean

This is the most popular and useful measure of central location.

\[ \text{Mean} = \frac{\text{Sum of the observations}}{\text{Number of observations}} \]

The Arithmetic Mean

Sample mean: \( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \), where \( n \) is the sample size.

Population mean: \( \mu = \frac{\sum_{i=1}^{N} x_i}{N} \), where \( N \) is the population size.

The Arithmetic Mean
Example 4.1

The reported times spent on the Internet by 10 adults are 0, 7, 12, 5, 33, 14, 8, 0, 9, 22 hours. Find the mean time spent on the Internet.

\[ \bar{x} = \frac{\sum_{i=1}^{10} x_i}{10} = \frac{0 + 7 + \cdots + 22}{10} = \frac{110}{10} = 11.0 \text{ hours} \]
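As a quick check of Example 4.1, the following minimal Python sketch computes the mean both directly from the definition and with the standard library. The variable names are illustrative only.

```python
from statistics import mean

# Hours spent on the Internet by the 10 adults in Example 4.1.
hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]

# Definition: sum of the observations divided by the number of observations.
manual_mean = sum(hours) / len(hours)

# The standard library gives the same value.
library_mean = mean(hours)

print(manual_mean, library_mean)
```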

Example 4.2
Suppose the telephone bills represent the population of measurements (N = 200). The population mean is

\[ \mu = \frac{\sum_{i=1}^{200} x_i}{200} = \frac{42.19 + 38.45 + \cdots + 45.77}{200} = 43.59 \]

Weighted mean for data grouped by categories or variants

\[ \bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} \]

When many of the measurements have the same value, the measurements can be summarized in a frequency table. Suppose the numbers of children in a sample of 16 families were recorded as follows:

NUMBER OF CHILDREN    NUMBER OF FAMILIES
0                     3
1                     4
2                     7
3                     2
Total                 16 families

There are k = 4 distinct values, so

\[ \bar{x} = \frac{\sum_{i=1}^{4} x_i f_i}{\sum_{i=1}^{4} f_i} = \frac{x_1 f_1 + x_2 f_2 + x_3 f_3 + x_4 f_4}{16} = \frac{3(0) + 4(1) + 7(2) + 2(3)}{16} = \frac{24}{16} = 1.5 \]
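The weighted mean of the children example can be reproduced with a short Python sketch. The function name `weighted_mean` and the dictionary layout are assumptions made for illustration.

```python
# Frequency table from the example: number of children -> number of families.
children_counts = {0: 3, 1: 4, 2: 7, 3: 2}

def weighted_mean(freq_table):
    """Return sum(x_i * f_i) / sum(f_i) for a {value: frequency} table."""
    total_weighted = sum(x * f for x, f in freq_table.items())
    total_freq = sum(freq_table.values())
    return total_weighted / total_freq

families = sum(children_counts.values())
mean_children = weighted_mean(children_counts)
print(families, mean_children)
```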

The Median
Important Measure of Central Tendency
In an ordered array, the median is the middle number.
If n is odd, the median is the middle number.
If n is even, the median is the average of the 2 middle numbers.

Not Affected by Extreme Values

[Figure: two dot plots on number lines. First data set: Median = 5. Second data set, which includes an extreme value: Median = 5.]
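The contrast between the mean and the median under extreme values can be sketched in Python. The data points behind the figures are not recoverable from the slides, so the two lists below are assumed, illustrative values chosen only to give the same means (5 and 6) and the same median (5).

```python
from statistics import mean, median

# Illustrative data (an assumption, not the slides' actual data).
without_outlier = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]  # largest value replaced by an extreme value

# The mean moves toward the extreme value...
mean_before = mean(without_outlier)
mean_after = mean(with_outlier)

# ...while the median stays where it was.
median_before = median(without_outlier)
median_after = median(with_outlier)

print(mean_before, mean_after, median_before, median_after)
```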

The Median
The Median of a set of observations is the value that falls in the middle when the observations are arranged in order of magnitude (ranked in increasing order).

Example 4.3

Find the median of the time spent on the Internet for the adults of Example 4.1.

Even number of observations (n = 10):
0, 0, 5, 7, 8, 9, 12, 14, 22, 33
Median = (8 + 9) / 2 = 8.5

Comment

Suppose only 9 adults were sampled (exclude, say, the longest time, 33).

Odd number of observations (n = 9):
0, 0, 5, 7, 8, 9, 12, 14, 22
Median = 8
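The odd and even cases of Example 4.3 can be sketched in Python as follows. The helper `median_by_rank` is a hypothetical name; it simply follows the rule stated for odd and even n, and the standard library's `median` is used as a cross-check.

```python
from statistics import median

hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]  # Example 4.1 data

def median_by_rank(values):
    """Middle value for odd n, average of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Even number of observations (n = 10).
median_all = median_by_rank(hours)

# Odd number of observations (n = 9): drop the longest time, 33.
nine_adults = [h for h in hours if h != 33]
median_nine = median_by_rank(nine_adults)

print(median_all, median(hours))
print(median_nine, median(nine_adults))
```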

The Mode
A Measure of Central Tendency
Value that Occurs Most Often
Not Affected by Extreme Values
There May Not be a Mode
There May be Several Modes
Used for Either Numerical or Categorical Data

[Figure: two dot plots on number lines. First data set: Mode = 9. Second data set: No Mode.]

The Mode
The Mode of a set of observations is the variable value that occurs most frequently.

A set of data may have one mode (or modal class), or two or more modes.

The modal class

For large data sets the modal class is much more relevant than a single-value mode.
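A small Python sketch can illustrate the three situations mentioned for the mode: one mode, several modes, and no mode. The function `modes` and all three data sets are illustrative assumptions; in particular, treating "every value occurs exactly once" as "no mode" is a convention adopted here, not something the slides define.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # Assumed convention: if nothing repeats, report no mode.
    if highest == 1 and len(values) > 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)

one_mode = modes([3, 5, 9, 9, 9, 12])         # a single mode
two_modes = modes([1, 1, 4, 6, 6])            # several modes
no_mode = modes([1, 2, 4, 6])                 # no value repeats
categorical = modes(["red", "blue", "red"])   # also works for categories

print(one_mode, two_modes, no_mode, categorical)
```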

Approximating Descriptive Measures for Grouped Data
Approximating descriptive measures for grouped data (data grouped by classes) may be needed in two cases:
when approximate values suffice,
when only secondary grouped data are available.

\[ \bar{x} \approx \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} \]

\( x_i \) = midpoint of class i
\( f_i \) (also written \( n_i \)) = frequency of class i

Example 4.13
Approximate the mean and standard deviation of the telephone call durations problem, as represented by the frequency distribution:

Class i   Class limits   Frequency n_i   Midpoint x_i   x_i n_i
1         2-5            3               3.5            10.5
2         5-8            6               6.5            39.0
3         8-11           8               9.5            76.0
...       ...            ...             ...            ...
6         17-20          2               18.5           37.0
Total                    n = 30                         312.0

\[ \bar{x} \approx \frac{\sum_{i=1}^{6} x_i n_i}{n} = \frac{312.0}{30} = 10.4 \]

Real value: \( \bar{x} = 10.26 \)

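Median and Mode

Median

\[ Me = x_0 + K \cdot \frac{\frac{1}{2}\left(\sum n_i + 1\right) - \sum_{i=1}^{Me-1} n_i}{n_{Me}} \]

Mode

\[ Mo = x_0 + K \cdot \frac{\Delta_1}{\Delta_1 + \Delta_2} \]

In the usual reading of these formulas, \( x_0 \) is the lower limit of the median (or modal) class, \( K \) is the class width, \( n_{Me} \) is the frequency of the median class, and \( \Delta_1 \), \( \Delta_2 \) are the differences between the modal class frequency and the frequencies of the preceding and following classes.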
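The grouped-data formulas for the mean, median and mode can be sketched in Python as below. This is only an illustration under stated assumptions: x_0 is read as the lower limit of the median or modal class, K as a common class width, and the two quantities in the mode formula as the differences between the modal class frequency and its neighbours. The function name, its parameters and the frequency distribution are hypothetical, not taken from the slides.

```python
def grouped_summary(lower_limits, width, freqs):
    """Approximate mean, median and mode from equal-width classes."""
    n = sum(freqs)

    # Mean: class midpoints weighted by class frequencies.
    midpoints = [lo + width / 2 for lo in lower_limits]
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / n

    # Median class: first class whose cumulative frequency reaches (n + 1) / 2.
    target = (n + 1) / 2
    cum_before = 0
    for lo, f in zip(lower_limits, freqs):
        if cum_before + f >= target:
            median = lo + width * (target - cum_before) / f
            break
        cum_before += f

    # Modal class: highest frequency (first one if tied); missing neighbours count as 0.
    m = freqs.index(max(freqs))
    prev_f = freqs[m - 1] if m > 0 else 0
    next_f = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1, d2 = freqs[m] - prev_f, freqs[m] - next_f
    mode = lower_limits[m] + width * d1 / (d1 + d2)
    return mean, median, mode

# Hypothetical distribution: classes 0-10, 10-20, 20-30, 30-40.
g_mean, g_median, g_mode = grouped_summary([0, 10, 20, 30], 10, [2, 5, 8, 4])
print(g_mean, g_median, g_mode)
```

With these assumed classes the median class and the modal class are both 20-30, so all three approximations fall inside that class.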

Relationship among Mean, Median, and Mode
If a distribution is symmetrical, the mean, median and mode coincide.

If a distribution is not symmetrical, and is skewed to the left or to the right, the three measures differ.

A positively skewed distribution (skewed to the right): Mode < Median < Mean

A negatively skewed distribution (skewed to the left): Mean < Median < Mode

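Summary Measures

Central Tendency: Mean \( \left( \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \right) \), Median, Mode

Variation: Range, Variance \( \left( s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \right) \), Standard Deviation, Coefficient of Variation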

Measures of Variation

Range
Variance: Population Variance, Sample Variance
Standard Deviation: Population Standard Deviation, Sample Standard Deviation
Coefficient of Variation: \( CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \)

The Range
Measure of Variation
Difference Between Largest & Smallest Observations:
Absolute Range = \( x_{Largest} - x_{Smallest} \)
Relative Range = \( (x_{Largest} - x_{Smallest}) / \text{mean} \)

Ignores How Data Are Distributed:

[Figure: two dot plots on an axis from 7 to 12 with differently distributed points. In both, Range = 12 - 7 = 5.]
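As an illustration, the absolute and relative range can be computed in Python for the Internet-hours data of Example 4.1. Applying the range to that data set is a choice made here for illustration; the slides do not do so.

```python
# Hours spent on the Internet by the 10 adults in Example 4.1.
hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]

# Absolute range: largest observation minus smallest observation.
absolute_range = max(hours) - min(hours)

# Relative range: absolute range divided by the mean.
mean_hours = sum(hours) / len(hours)
relative_range = absolute_range / mean_hours

print(absolute_range, relative_range)
```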

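Deviation
Individual deviation from the mean = \( x_i - \bar{x} \)

Overall deviation = 0, because \( \sum (X_i - \bar{X}) = 0 \)

Variation is therefore measured by summing squared deviations, \( \sum (X_i - \bar{X})^2 \), or absolute values of the deviations, \( \sum |x_i - \bar{x}| \)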
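A short Python sketch can show why the plain sum of deviations is not useful as a measure of variation, while the squared and absolute deviations are. It reuses the Example 4.1 data as an illustration.

```python
# Hours spent on the Internet by the 10 adults in Example 4.1.
hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]
x_bar = sum(hours) / len(hours)

# Individual deviations from the mean.
deviations = [x - x_bar for x in hours]

# Positive and negative deviations cancel out.
sum_of_deviations = sum(deviations)

# Squaring or taking absolute values removes the cancellation.
sum_of_squares = sum(d ** 2 for d in deviations)
sum_of_absolute = sum(abs(d) for d in deviations)

print(sum_of_deviations, sum_of_squares, sum_of_absolute)
```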

Variance
Important Measure of Variation
Shows Variation About the Mean
Computed as an arithmetic mean of squared deviations, i.e. the square of the quadratic mean of the individual deviations

For the Population: \( \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \)
For the Sample: \( s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \)

For the Population: use N in the denominator.

For the Sample: use n - 1 in the denominator.
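The difference between the two denominators can be sketched in Python. The two helper functions are hypothetical names written directly from the definitions, and the Example 4.1 data is reused purely as an illustration; the standard library's `pvariance` and `variance` serve as a cross-check.

```python
from statistics import pvariance, variance

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]  # Example 4.1 data

var_pop = population_variance(hours)
var_sample = sample_variance(hours)

print(var_pop, pvariance(hours))
print(var_sample, variance(hours))
```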

Standard Deviation
Most Important Measure of Variation
Shows Variation About the Mean:

For the Population: \( \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \)
For the Sample: \( s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \)

For the Population: use N in the denominator.

For the Sample: use n - 1 in the denominator.

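Sample Standard Deviation

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]

Data: \( X_i \): 10  12  14  15  17  18  18  24

n = 8, Mean = 16

\[ s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095 \]

Comparing Standard Deviations

Data: \( X_i \): 10  12  14  15  17  18  18  24

N = 8, Mean = 16

\[ s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095 \]

\[ \sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{130}{8}} = 4.0311 \]

Value for the Standard Deviation is larger for data considered as a Sample.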
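The comparison between treating the eight values as a sample and as a population can be checked with Python's standard library, where `stdev` divides by n - 1 and `pstdev` divides by N. The variable names are illustrative.

```python
from statistics import mean, pstdev, stdev

# The eight data values listed on the slide.
data = [10, 12, 14, 15, 17, 18, 18, 24]

data_mean = mean(data)

# Sum of squared deviations from the mean, shared by both formulas.
sum_sq = sum((x - data_mean) ** 2 for x in data)

# Sample standard deviation: n - 1 in the denominator.
s_sample = stdev(data)

# Population standard deviation: N in the denominator.
s_population = pstdev(data)

print(data_mean, sum_sq, round(s_sample, 4), round(s_population, 4))
```

Because the sample formula divides the same sum of squares by a smaller number, it always gives the larger of the two values.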

Comparing Standard Deviations

[Figure: three dot plots of age on a common axis from 11 to 21.]

Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57

Coefficient of Variation
Measure of Relative Variation
Always a % or coefficient
Shows Variation Relative to Mean
Used to Compare 2 or More Groups

Formula (for Sample): \( CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \)

Comparing Coefficient of Variation

Stock A: Average Price last year = $50, Standard Deviation (sd) = $5
Stock B: Average Price last year = $100, Standard Deviation (sd) = $5

Coefficient of Variation: \( CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \)
Stock A: CV = 10%
Stock B: CV = 5%
Both average prices are representative.
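The stock comparison can be reproduced with a few lines of Python. The function name `coefficient_of_variation` is an illustrative assumption; the prices and standard deviations are the ones given for Stock A and Stock B.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd * 100 / mean

# Stock A: average price $50, standard deviation $5.
cv_a = coefficient_of_variation(5, 50)

# Stock B: average price $100, standard deviation $5.
cv_b = coefficient_of_variation(5, 100)

print(cv_a, cv_b)
```

Both stocks have the same standard deviation in dollars, but relative to its average price Stock A varies twice as much as Stock B.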

Shape

Describes How Data Are Distributed between smallest and largest values
Measures of Shape:

Symmetric or skewed

Left-Skewed or Negative Skewness: Mean < Median < Mode

Symmetric: Mean = Median = Mode

Right-Skewed or Positively Skewed: Mode < Median < Mean
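The ordering of the three measures under each shape can be illustrated in Python. The three data sets are hypothetical, built only to show the pattern, and `describe_shape` is a rough heuristic that compares the mean with the median; it is not a skewness coefficient.

```python
from statistics import mean, median, mode

def describe_shape(values):
    """Rough heuristic: compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed"
    if m < med:
        return "left-skewed"
    return "mean equals median"

# Hypothetical data sets, one per shape.
right_skewed = [1, 2, 2, 3, 4, 5, 11]
left_skewed = [1, 7, 8, 9, 10, 10, 11]
symmetric = [1, 2, 3, 3, 3, 4, 5]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(name, mode(data), median(data), mean(data), describe_shape(data))
```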

Box plot: graphical presentation of CTM

Central tendency measures summary

Discussed Measures of Central Tendency: Mean, Median, Mode

Addressed Measures of Variation: The Range, Variance, Standard Deviation, Coefficient of Variation

Determined Shape of Distributions: Symmetric or Skewed

Coefficient of skewness

Left-skewed: Mean < Median < Mode

Symmetric: Mean = Median = Mode

Right-skewed: Mode < Median < Mean

FRACTILES (PERCENTILES)

VALUES OF THE VARIABLE BELOW WHICH A CERTAIN PERCENTAGE OF THE DATA FALL
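A percentile can be sketched in Python using the nearest-rank rule: order the data and take the value at position ceil(p · n / 100). The slides do not say which percentile convention they use, so the nearest-rank rule is an assumption here, and other conventions interpolate between neighbouring values and can give slightly different results. The function name is hypothetical and the data is reused from Example 4.1.

```python
import math

def percentile_nearest_rank(values, p):
    """Value at position ceil(p * n / 100) in the ordered data (nearest-rank rule)."""
    if not 0 < p <= 100:
        raise ValueError("p must be in the interval (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]  # Example 4.1 data

p25 = percentile_nearest_rank(hours, 25)
p50 = percentile_nearest_rank(hours, 50)
p90 = percentile_nearest_rank(hours, 90)

print(p25, p50, p90)
```

Under this rule the 50th percentile is always an actual data value, so for an even number of observations it can differ from the median computed as the average of the two middle numbers.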
