
# Descriptive Statistics

© 2015 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a
license distributed with a certain product or service or otherwise on a password-protected website for classroom use.

What is Statistics?

“Statistics is a way to get information from data.”

Data → Statistics → Information

Statistics is a tool for creating new understanding from a set of numbers.

Definitions: Oxford English Dictionary

“Statistical Techniques/Methods”

Formulate problem → Get some data → Visualize the data → Do some statistical calculations → Interpret results


Descriptive Statistics
Descriptive statistics deals with methods of organizing,
summarizing, and presenting data in a convenient and
informative way.

One form of descriptive statistics uses graphical techniques,
which allow statistics practitioners to present data in ways that
make it easy for the reader to extract useful information.

Chapter 2 introduces several graphical methods.


Descriptive Statistics
Another form of descriptive statistics uses numerical
techniques to summarize data.

The mean and median are popular numerical techniques to
describe the location of the data.

The range, variance, and standard deviation measure the
variability of the data.

Chapter 4 introduces several numerical statistical measures
that describe different features of the data.

Inferential statistics
The information we would like to acquire in the Case is an estimate of annual profits from the exclusivity agreement. The data are the numbers of cans of soft drinks consumed in 7 days by the 500 students in the sample. We want to know the mean number of soft drinks consumed by all 50,000 students on campus. To accomplish this goal we need another branch of statistics: inferential statistics.

Inferential statistics
Inferential statistics is a body of methods used to draw conclusions or inferences about characteristics of populations based on sample data. The population in question in this case is the soft drink consumption of the university's 50,000 students. The cost of interviewing each student would be prohibitive and extremely time consuming. Statistical techniques make such endeavors unnecessary. Instead, we can sample a much smaller number of students (the sample size is 500) and infer from the data the number of soft drinks consumed by all 50,000 students. We can then estimate annual profits for Cola.

Key Statistical Concepts
Population
— A population is the group of all items of interest to a statistics practitioner.
— Frequently very large, sometimes infinite.
E.g. all cola users
Sample
— A sample is a set of data drawn from the population.
— Potentially very large, but less than the population.
E.g. a sample of drinkers

Key Statistical Concepts
Parameter
— A descriptive measure of a population.
Statistic
— A descriptive measure of a sample.

Key Statistical Concepts
[Diagram: a Population (described by a Parameter) and a Sample, a subset of the population (described by a Statistic)]
Populations have Parameters, Samples have Statistics.

Descriptive Statistics
…are methods of organizing, summarizing, and presenting data in a convenient and informative way. These methods include: Graphical Techniques (Chapter 2), and Numerical Techniques (Chapter 4).
The actual method used depends on what information we would like to extract. Are we interested in…
• measure(s) of central location? and/or
• measure(s) of variability (dispersion)?
Descriptive Statistics helps to answer these questions…

Inferential Statistics
Descriptive statistics describes the data set that’s being analyzed, but doesn’t allow us to draw any conclusions or make any inferences about the data. Hence we need another branch of statistics: inferential statistics.
Inferential statistics is also a set of methods, but it is used to draw conclusions or inferences about characteristics of populations based on data from a sample.

Statistical Inference
Statistical inference is the process of making an estimate, prediction, or decision about a population based on a sample.
[Diagram: Population (Parameter) → Sample (Statistic) → Inference]
What can we infer about a Population’s Parameters based on a Sample’s Statistics?

Statistical Inference
We use statistics to make inferences about parameters.
Therefore, we can make an estimate, prediction, or decision about a population based on sample data.
Thus, we can apply what we know about a sample to the larger population from which it was drawn!

Types of Data
• Cross-sectional data
–Data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.
–Subjects might include individuals, households, firms, industries, regions, and countries.
LO 1.3

Types of Data
• Time series data
–Data collected by recording a characteristic of a subject over several time periods.
–Data can include daily, weekly, monthly, quarterly, or annual observations.
–This graph plots the U.S. GDP growth rate from 1980 to 2010; it is an example of time series data.
LO 1.3

Variables and Scales of Measurement
Describe variables and various types of measurement scales.
• A variable is the general characteristic being observed on an object of interest.
• Types of Variables
–Qualitative – gender, race, political affiliation
–Quantitative – test scores, age, weight
•Discrete
•Continuous

Variables and Scales of Measurement
• Types of Quantitative Variables
–Discrete
•A discrete variable assumes a countable number of distinct values.
•Examples: Number of children in a family, number of points scored in a basketball game.
LO 1.4

Variables and Scales of Measurement
• Types of Quantitative Variables
–Continuous
•A continuous variable can assume an infinite number of values within some interval.
•Examples: Weight, height, investment return.
LO 1.4

Variables and Scales of Measurement
• Scales of Measure
–Qualitative: Nominal, Ordinal
–Quantitative: Ratio
LO 1.4

Variables and Scales of Measurement
• The Nominal Scale
–Data are simply categories for grouping the data.
–Qualitative values may be converted to quantitative values for analysis purposes.
LO 1.4

Variables and Scales of Measurement
• The Ordinal Scale
–Ordinal data may be categorized and ranked with respect to some characteristic or trait.
•For example, instructors are often evaluated on an ordinal scale (excellent, good, fair, poor).
–Differences between categories are meaningless because the actual numbers used may be arbitrary.
•There is no objective way to interpret the difference between instructor quality.
LO 1.4

Variables and Scales of Measurement
• The Ratio Scale
–The strongest level of measurement.
–Differences between values are equal and meaningful.
–There is an “absolute 0” or defined starting point. “0” does mean “the absence of …” Thus, meaningful ratios may be obtained.
LO 1.4

Variables and Scales of Measurement
• The Ratio Scale
–The following variables are measured on a ratio scale:
•General Examples: Weight, Time, and Distance
•Business Examples: Sales, Profits, and Inventory Levels
LO 1.4

Hierarchy of Data…
Ratio
Values are real numbers. All calculations are valid. Data may be treated as ordinal or nominal.
Ordinal
Values must represent the ranked order of the data. Calculations based on an ordering process are valid.
Nominal
Values are the arbitrary numbers that represent categories. Only calculations based on the frequencies of occurrence are valid. Data cannot be treated as ordinal or ratio.

Graphical & Tabular Techniques for Nominal Data…
The only allowable calculation on nominal data is to count the frequency of each value of the variable.
We can summarize the data in a table, called a frequency distribution, that presents the categories and their counts.
A relative frequency distribution lists the categories and the proportion with which each occurs.

Example 2.1 Work Status in the GSS 2012 Survey [GSS2012*]
In Chapter 1 we briefly introduced the General Social Survey. In the 2012 survey respondents were asked the following question.
Last week were you working full time, part time, going to school, keeping house, or what?
The responses were
1. Working full time
2. Working part time
3. Temporarily not working
4. Unemployed, laid off
5. Retired
6. School
7. Keeping house
8. Other
The responses were recorded using the codes 1, 2, 3, 4, 5, 6, 7, and 8, respectively.

Frequency and Relative Frequency Distributions

| Work Status | Code | Frequency | Relative Frequency (%) |
|---|---|---|---|
| Working full-time | 1 | 912 | 46.2 |
| Working part-time | 2 | 226 | 11.5 |
| Temporarily not working | 3 | 40 | 2.0 |
| Unemployed, laid off | 4 | 104 | 5.3 |
| Retired | 5 | 357 | 18.1 |
| School | 6 | 70 | 3.5 |
| Keeping house | 7 | 210 | 10.6 |
| Other | 8 | 54 | 2.7 |
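The percentage column can be recomputed from the counts alone. The sketch below is only an illustration: the function name `relative_frequencies` is made up for this example, and the counts are the eight work-status frequencies from the table.

```python
# Frequencies from the work-status table (codes 1 to 8).
frequencies = {
    "Working full-time": 912,
    "Working part-time": 226,
    "Temporarily not working": 40,
    "Unemployed, laid off": 104,
    "Retired": 357,
    "School": 70,
    "Keeping house": 210,
    "Other": 54,
}


def relative_frequencies(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}


percentages = relative_frequencies(frequencies)
for category, pct in percentages.items():
    # One row per category: name, count, percentage to one decimal place.
    print(f"{category:<25} {frequencies[category]:>5} {pct:6.1f}")
```

Counting is the only arithmetic performed on the categories themselves, which is all that nominal data permits.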

Nominal Data (Frequency)
[Bar chart: frequency of each WRKSTAT code, 1 to 8]
Bar Charts are often used to display frequencies…

Nominal Data (Relative Frequency)
[Pie chart: relative frequency of each WRKSTAT code, 1 to 8]
Pie Charts show relative frequencies…

Nominal Data
[Bar chart and pie chart of WRKSTAT, shown side by side]
It's all the same information (based on the same data). Just different presentation.

Describing the Relationship between Two Nominal Variables
To describe the relationship between two nominal variables, we must remember that we are permitted only to determine the frequency of the values.
As a first step we need to produce a cross-classification table, which lists the frequency of each combination of the values of the two variables.
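A cross-classification table is just a count of each pair of values. The sketch below is a minimal illustration: the two variables (work status and preferred drink) and all six observations are hypothetical, invented only to show the counting step.

```python
from collections import Counter

# Hypothetical observations: one (work status, preferred drink) pair per respondent.
observations = [
    ("full-time", "cola"),
    ("part-time", "juice"),
    ("full-time", "cola"),
    ("retired", "juice"),
    ("part-time", "cola"),
    ("part-time", "juice"),
]


def cross_classify(pairs):
    """Count how often each combination of the two nominal values occurs."""
    return Counter(pairs)


table = cross_classify(observations)

rows = sorted({status for status, _ in observations})
columns = sorted({drink for _, drink in observations})

# Print the table with one row per value of the first variable.
print(f"{'':<12}" + "".join(f"{c:>8}" for c in columns))
for r in rows:
    print(f"{r:<12}" + "".join(f"{table[(r, c)]:>8}" for c in columns))
```

Combinations that never occur show a count of 0, because a `Counter` returns 0 for missing keys.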

Problem
One chocolate manufacturing company sells quality chocolate products at its plant and retail stores. Two years ago, the company developed a Web site and began selling its products over the Internet. Web site sales have exceeded the company’s expectations, and management is now considering strategies to increase sales even further. To learn more about the Web site customers, a sample of 50 chocolate transactions was selected from the previous month’s sales. The data show the day of the week each transaction was made, the type of browser the customer used, the time spent on the Web site, the number of Web site pages viewed, and the amount spent by each of the 50 customers.

Box Plot

Example 3.1
Following deregulation of telephone service, several new companies were created to compete in the business of providing long-distance telephone service. In almost all cases these companies competed on price since the service each offered is similar. Pricing a service or product in the face of stiff competition is very difficult. Factors to be considered include supply, demand, price elasticity, and the actions of competitors. Long-distance packages may employ per-minute charges, a flat monthly rate, or some combination of the two. Determining the appropriate rate structure is facilitated by acquiring information about the behaviors of customers and in particular the size of monthly long-distance bills.

Example 3.1
As part of a larger study, a long-distance company wanted to acquire information about the monthly bills of new subscribers in the first month after signing with the company. The company’s marketing manager conducted a survey of 200 new residential subscribers wherein the first month’s bills were recorded. These data are stored in file Xm03-01. The general manager planned to present his findings to senior executives. What information can be extracted from these data?

All Rights Reserved.1 We have chosen eight classes defined in such a way that each observation falls into one and only one class. except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. . May not be copied.Example 3. These classes are defined as follows: Classes Amounts that are less than or equal to 15 Amounts that are more than 15 but less than or equal to 30 Amounts that are more than 30 but less than or equal to 45 Amounts that are more than 45 but less than or equal to 60 Amounts that are more than 60 but less than or equal to 75 Amounts that are more than 75 but less than or equal to 90 Amounts that are more than 90 but less than or equal to 105 Amounts that are more than 105 but less than or equal to 120 © 2015 Cengage Learning. scanned. or duplicated. in whole or in part.

Example 3.1
[Histogram: Frequency (0 to 80) versus Bills, with class upper limits 15, 30, 45, 60, 75, 90, 105, 120]

Interpret…
About half of the bills (71+37=108) are “small”, i.e. less than \$30.
There are only a few telephone bills in the middle range.
(18+28+14=60)÷200 = 30%, i.e. nearly a third of the phone bills are \$90 or more.

Building a Histogram…
1) Collect the Data
2) Create a frequency distribution for the data… How?
a) Determine the number of classes to use… How?
Refer to table 3.2: With 200 observations, we should have between 7 & 10 classes…
Alternatively, we could use Sturges’ formula:
Number of class intervals = 1 + 3.3 log (n)
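As a quick check of Sturges’ formula for the 200 phone bills, here is a short sketch. It assumes the logarithm is base 10 (the slide does not state the base), and the function name is illustrative.

```python
import math


def sturges_classes(n):
    """Sturges' formula, assuming a base-10 logarithm: 1 + 3.3 * log10(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 + 3.3 * math.log10(n)


k = sturges_classes(200)
print(f"Suggested number of class intervals: {k:.2f}")
```

For n = 200 the result lies between 8 and 9, which agrees with the 7 to 10 classes suggested for 200 observations.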

Building a Histogram…
1) Collect the Data
2) Create a frequency distribution for the data… How?
a) Determine the number of classes to use, that is, [8]
b) Determine how large to make each class… How?
Look at the range of the data:
Range = Largest Observation – Smallest Observation
Range = \$119.63 – \$0 = \$119.63
Then each class width becomes:
Range ÷ (# classes) = 119.63 ÷ 8 ≈ 15
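The class-width arithmetic can be sketched as follows. Rounding the raw width to the nearest whole number is an assumption made here to arrive at the convenient width of 15; the function name is illustrative.

```python
def class_width(largest, smallest, num_classes):
    """Range of the data divided by the number of classes."""
    return (largest - smallest) / num_classes


width = class_width(119.63, 0.0, 8)  # raw width, just under 15
rounded_width = round(width)         # assumption: round to a convenient whole number

# Upper class limits implied by the rounded width.
upper_limits = [rounded_width * i for i in range(1, 9)]

print(width, rounded_width, upper_limits)
```

The resulting upper limits run from 15 to 120, so the largest bill (119.63) still falls inside the last class.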


Shapes of Histograms…
Symmetry
A histogram is said to be symmetric if, when we draw a vertical line down the center of the histogram, the two sides are identical in shape and size.
[Figure: three symmetric histograms, Frequency versus Variable]

Shapes of Histograms…
Skewness
A skewed histogram is one with a long tail extending to either the right or the left.
[Figure: two histograms, Frequency versus Variable, labeled Positively Skewed and Negatively Skewed]

Shapes of Histograms…
Modality
A unimodal histogram is one with a single peak, while a bimodal histogram is one with two peaks.
[Figure: two histograms, Frequency versus Variable, labeled Unimodal and Bimodal]
A modal class is the class with the largest number of observations.

Shapes of Histograms…
Bell Shape
A special type of symmetric unimodal histogram is one that is bell shaped.
[Figure: bell-shaped histogram, Frequency versus Variable]
Many statistical techniques require that the population be bell shaped.
Drawing the histogram helps verify the shape of the population in question.

Histogram Comparison…
Compare & contrast the following histograms based on data from Ex. 3.3 & Ex. 3.4:
The two courses, Business Statistics and Mathematical Statistics, have very different histograms… unimodal vs. bimodal

Ogive…
• (pronounced “Oh-jive”) is a graph of a cumulative frequency distribution.
• We create an ogive in three steps…
• First, from the frequency distribution created earlier, calculate relative frequencies:
• Relative Frequency = (# of observations in a class) ÷ (Total # of observations)

Relative Frequencies…
For example, we had 71 observations in our first class (telephone bills from \$0.00 to \$15.00). Thus, the relative frequency for this class is 71 ÷ 200 (the total # of phone bills) = 0.355 (or 35.5%)

May not be copied. its cumulative relative frequency is just its relative frequency) © 2015 Cengage Learning. .  2) Calculate cumulative relative frequencies by adding the current class’ relative frequency to the previous class’ cumulative relative frequency. (For the first class. We create an ogive in three steps… 1) Calculate relative frequencies. or duplicated. scanned.Ogive… Is a graph of a cumulative frequency distribution. All Rights Reserved. except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. in whole or in part.

Cumulative Relative Frequencies…
[Table: cumulative relative frequency of each class, starting with the first class]

Ogive…
Is a graph of a cumulative frequency distribution.
1) Calculate relative frequencies. ✓
2) Calculate cumulative relative frequencies. ✓
3) Graph the cumulative relative frequencies…
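Steps 1 and 2 can be sketched in a few lines; the graphing step is left out. The counts 71, 37, 18, 28 and 14 appear in the slides. The three middle counts (13, 9, 10) are not given there and are assumed here only so that the total comes to 200.

```python
from itertools import accumulate

upper_limits = [15, 30, 45, 60, 75, 90, 105, 120]

# Class counts; the three middle values (13, 9, 10) are assumptions, see above.
class_counts = [71, 37, 13, 9, 10, 18, 28, 14]
total = sum(class_counts)

# Step 1: relative frequency = observations in the class / total observations.
relative = [count / total for count in class_counts]

# Step 2: add each relative frequency to the previous cumulative value.
cumulative = list(accumulate(relative))

for upper, rel, cum in zip(upper_limits, relative, cumulative):
    print(f"<= {upper:>3}: relative {rel:.3f}, cumulative {cum:.3f}")
```

Plotting `cumulative` against `upper_limits` would give the ogive; the last cumulative value is 1 up to floating-point rounding.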

Ogive…
The ogive can be used to answer questions like:
What telephone bill value is at the 50th percentile?
“around \$35”
(Refer also to Fig. 2.13 in your textbook)

Numerical Descriptive Techniques…
Measures of Central Location
Mean, Median, Mode
Measures of Variability
Range, Standard Deviation, Variance, Coefficient of Variation
Measures of Relative Standing
Percentiles, Quartiles
Measures of Linear Relationship
Covariance, Correlation, Determination, Least Squares Line

Measures of Central Location…
The arithmetic mean, a.k.a. average, shortened to mean, is the most popular & useful measure of central location.
It is computed by simply adding up all the observations and dividing by the total number of observations:
Mean = (Sum of the observations) ÷ (Number of observations)

Notation…
When referring to the number of observations in a population, we use uppercase letter N
When referring to the number of observations in a sample, we use lower case letter n
The arithmetic mean for a population is denoted with Greek letter “mu”: $\mu$
The arithmetic mean for a sample is denoted with an “x-bar”: $\bar{x}$

Arithmetic Mean…
Sample Mean: $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
Population Mean: $\mu = \dfrac{\sum_{i=1}^{N} x_i}{N}$

The Arithmetic Mean…
…is appropriate for describing measurement data, e.g. heights of people, marks of student papers, etc.
…is seriously affected by extreme values called “outliers”. E.g. as soon as a billionaire moves into a neighborhood, the average household income increases beyond what it was previously!
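The effect of an outlier on the mean is easy to see with a small sketch. All income figures below are hypothetical, chosen only to mirror the billionaire example.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their number."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical household incomes in a neighborhood.
incomes = [50_000, 60_000, 55_000, 65_000, 70_000]
print(mean(incomes))

# The same neighborhood after one household with a very large income moves in.
incomes_with_outlier = incomes + [1_000_000_000]
print(mean(incomes_with_outlier))
```

One extreme value moves the mean far away from every other household's income, so the mean no longer describes a typical household.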

Measures of Central Location…
The median is calculated by placing all the observations in order;
the observation that falls in the middle is the median.

Data: {0, 7, 12, 5, 14, 8, 0, 9, 22} N=9 (odd)
Sort them bottom to top, find the middle:
0 0 5 7 8 9 12 14 22
median = 8

Data: {0, 7, 12, 5, 14, 8, 0, 9, 22, 33} N=10 (even)
Sort them bottom to top, the middle is the simple average between 8 & 9:
0 0 5 7 8 9 12 14 22 33
median = (8+9)÷2 = 8.5

Sample and population medians are computed the same way.
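
The sort-and-pick-the-middle procedure can be sketched as follows. This is a minimal illustration, not the textbook's own code; the function name is made up and the order of the raw values is assumed.

```python
def median(observations):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(observations)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: a single observation sits exactly in the middle.
        return ordered[middle]
    # Even count: average the two observations on either side of the middle.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_data = [0, 7, 12, 5, 14, 8, 0, 9, 22]   # N = 9
even_data = odd_data + [33]                  # N = 10

print(median(odd_data))   # 8
print(median(even_data))  # 8.5
```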

Measures of Central Location…
The mode of a set of observations is the value that occurs most
frequently.

A set of data may have one mode (or modal class), or two, or more
modes.

The mode is useful for all data types, though it is mainly used for
nominal data.

For large data sets the modal class is much more relevant than a
single-value mode.

Sample and population modes are computed the same way.

Mode…
E.g. Data: {0, 7, 12, 5, 14, 8, 0, 9, 22, 33} N=10

Which observation appears most often?
The mode for this data set is 0. How is this a measure of “central”
location?

[Figure: histogram of Frequency against Variable, with a modal class marked]

=MODE(range) in Excel…
Note: if you are using Excel for your data analysis and your data is
multi-modal (i.e. there is more than one mode), Excel's MODE function
returns only one of them. You will have to use other techniques (e.g.
a histogram) to determine if your data is bimodal, trimodal, etc.
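
Outside Excel, one way to see every mode at once is Python's standard-library `statistics.multimode` (available from Python 3.8). The sketch below is only an illustration; the second data set is hypothetical and is not taken from the slides.

```python
from statistics import multimode

# Example data from the mode slide: 0 occurs twice, every other value once.
hours = [0, 7, 12, 5, 14, 8, 0, 9, 22, 33]
print(multimode(hours))  # [0]

# Hypothetical bimodal data: 1 and 3 both occur twice.
bimodal = [1, 1, 2, 3, 3]
modes = multimode(bimodal)
print(modes)  # [1, 3]
```

A list with more than one entry signals that a single reported mode would hide part of the picture.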

Mean, Median, Mode…
If a distribution is symmetrical, the mean, median and mode may
coincide…

[Figure: symmetric distribution with the mean, median, and mode at the same point]

Mean, Median, Mode…
If a distribution is asymmetrical, say skewed to the left or to the
right, the three measures may differ. E.g.:

[Figure: skewed distribution with the mode, median, and mean at different points]

Mean, Median, Mode: Which Is Best?
With three measures from which to choose, which one should we use?

The mean is generally our first selection. However, there are several
circumstances when the median is better.

The mode is seldom the best measure of central location.

One advantage the median holds is that it is not as sensitive to
extreme values as is the mean.

Mean, Median, Mode: Which Is Best?
To illustrate, consider the data in Example 4.1.

The mean was 11.0 and the median was 8.5.

Now suppose that the respondent who reported 33 hours actually
reported 133 hours (obviously an Internet addict). The mean becomes

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{0 + 7 + 12 + 5 + 133 + 14 + 8 + 0 + 9 + 22}{10} = \frac{210}{10} = 21.0$

Mean, Median, Mode: Which Is Best?
This value is exceeded by only two of the ten observations in the
sample, making this statistic a poor measure of central location.

The median stays the same.

When there is a relatively small number of extreme observations
(either very small or very large, but not both), the median usually
produces a better measure of the center of the data.
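
The effect of the single extreme value can be checked numerically. The sketch below is an illustration only: it swaps 33 for 133 in the Example 4.1 data (list order assumed) and recomputes both measures.

```python
from statistics import median

original = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]
# Replace the 33-hour response with 133 hours, as in the slide.
altered = [133 if x == 33 else x for x in original]

mean_before = sum(original) / len(original)
mean_after = sum(altered) / len(altered)

print(mean_before, median(original))  # 11.0 8.5
print(mean_after, median(altered))    # 21.0 8.5

# How many observations exceed the new mean?
above_mean = [x for x in altered if x > mean_after]
print(above_mean)  # [133, 22]
```

The mean nearly doubles while the median does not move, which is the sense in which the median is less sensitive to outliers.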

Mean, Median, & Modes for Ordinal & Nominal Data
For ordinal and nominal data the calculation of the mean is not valid.

Median is appropriate for ordinal data.

For nominal data, a mode calculation is useful for determining highest
frequency but not “central location”.

Measures of Central Location • Summary…
Compute the Mean to
• Describe the central location of a single set of interval data

Compute the Median to
• Describe the central location of a single set of interval or ordinal data

Compute the Mode to
• Describe a single set of nominal data

Measures of Variability…
Measures of central location fail to tell the whole story about the
distribution; that is, how much are the observations spread out around
the mean value?

For example, two sets of class grades are shown. The mean (=50) is the
same in each case…

But, the red class has greater variability than the blue class.

Range…
The range is the simplest measure of variability, calculated as:

Range = Largest observation – Smallest observation

E.g.
Data: {4, 4, 4, 4, 50} Range = 46
Data: {4, 8, 15, 24, 39, 50} Range = 46

The range is the same in both cases, but the data sets have very
different distributions…
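
A short sketch of the range calculation on the two example data sets; the function name is illustrative.

```python
def value_range(observations):
    """Largest observation minus smallest observation."""
    return max(observations) - min(observations)


clustered = [4, 4, 4, 4, 50]
spread_out = [4, 8, 15, 24, 39, 50]

print(value_range(clustered))   # 46
print(value_range(spread_out))  # 46
```

Only the two end points enter the calculation, so everything between them is invisible to this measure.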

Range…
Its major advantage is the ease with which it can be computed.

Its major shortcoming is its failure to provide information on the
dispersion of the observations between the two end points.

Hence we need a measure of variability that incorporates all the data
and not just two observations. Hence…

Variance…
Variance and its related measure, standard deviation, are arguably the
most important statistics. Used to measure variability, they also play
a vital role in almost all statistical inference procedures.

Population variance is denoted by $\sigma^2$
(Lower case Greek letter “sigma” squared)

Sample variance is denoted by $s^2$
(Lower case “S” squared)

Variance…
The variance of a population is:

$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

where $\mu$ is the population mean and N is the population size.

The variance of a sample is:

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

where $\bar{x}$ is the sample mean.

Note: The denominator is sample size (n) minus one!

Variance…
As you can see, you have to calculate the sample mean (x-bar) in order
to calculate the sample variance.

Alternatively, there is a short-cut formulation to calculate sample
variance directly from the data without the intermediate step of
calculating the mean. It is given by:

$s^2 = \frac{1}{n - 1} \left[ \sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n} \right]$

Application…
Example 4.7. The following sample consists of the number of jobs six
students applied for: 17, 15, 23, 7, 9, 13.
Find its mean and variance.

What are we looking to calculate? The sample mean $\bar{x}$ and the
sample variance $s^2$, as opposed to $\mu$ or $\sigma^2$.

Sample Mean & Variance…
Sample Mean

$\bar{x} = \frac{17 + 15 + 23 + 7 + 9 + 13}{6} = \frac{84}{6} = 14$

Sample Variance

$s^2 = \frac{(17-14)^2 + (15-14)^2 + (23-14)^2 + (7-14)^2 + (9-14)^2 + (13-14)^2}{6 - 1} = \frac{166}{5} = 33.2$

Sample Variance (shortcut method)

$s^2 = \frac{1}{6 - 1} \left[ 1342 - \frac{84^2}{6} \right] = \frac{1342 - 1176}{5} = 33.2$
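
The two routes to the sample variance can be compared directly. The sketch below is illustrative (the variable names are not from the slides) and uses the six job-application counts of Example 4.7.

```python
from math import isclose

jobs = [17, 15, 23, 7, 9, 13]
n = len(jobs)

# Definition: squared deviations from the sample mean, divided by n - 1.
x_bar = sum(jobs) / n
s2_definition = sum((x - x_bar) ** 2 for x in jobs) / (n - 1)

# Shortcut: uses only the sum of squares and the sum of the observations.
sum_of_squares = sum(x * x for x in jobs)
s2_shortcut = (sum_of_squares - sum(jobs) ** 2 / n) / (n - 1)

print(x_bar)                                # 14.0
print(s2_definition)                        # 33.2
print(isclose(s2_definition, s2_shortcut))  # True
```

The shortcut avoids a separate pass to compute the mean, although with very large values it can lose precision in floating-point arithmetic, so the definitional form is often the safer default in code.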

Standard Deviation…
The standard deviation is simply the square root of the variance, thus:

Population standard deviation: $\sigma = \sqrt{\sigma^2}$

Sample standard deviation: $s = \sqrt{s^2}$
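
Python's standard library offers both versions: `statistics.stdev` divides by n - 1 (sample) and `statistics.pstdev` divides by N (population). The sketch below applies them to the Example 4.7 data purely as an illustration; treating the same six values as a population is an assumption made only to show the difference.

```python
from math import isclose, sqrt
from statistics import pstdev, stdev

jobs = [17, 15, 23, 7, 9, 13]

s = stdev(jobs)       # sample standard deviation, denominator n - 1
sigma = pstdev(jobs)  # population standard deviation, denominator N

# Each is the square root of the corresponding variance.
print(isclose(s, sqrt(166 / 5)))      # True
print(isclose(sigma, sqrt(166 / 6)))  # True
print(s > sigma)                      # True
```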

Standard Deviation…
Consider Example 4.8 [Xm04-08] where a golf club manufacturer has
designed a new club and wants to determine if it is hit more
consistently (i.e. with less variability) than with an old club.

Using Data > Data Analysis > Descriptive Statistics in Excel, we
produce the following tables for interpretation…

[Tables: Excel descriptive statistics output for the old and new clubs]

You get more consistent distance with the new club.

Interpreting Standard Deviation…
The standard deviation can be used to compare the variability of
several distributions and make a statement about the general shape
of a distribution. If the histogram is bell shaped, we can use the
Empirical Rule, which states:

1) Approximately 68% of all observations fall within one standard
deviation of the mean.
2) Approximately 95% of all observations fall within two standard
deviations of the mean.
3) Approximately 99.7% of all observations fall within three standard
deviations of the mean.
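
One way to get a feel for the Empirical Rule is to simulate bell-shaped data and count. The sketch below is an illustration under assumptions: the data are generated with `random.gauss`, and the mean of 70, standard deviation of 5, sample size, and seed are arbitrary choices, not values from the slides.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Simulated bell-shaped observations (parameters are arbitrary).
marks = [random.gauss(70, 5) for _ in range(10_000)]
centre, spread = mean(marks), stdev(marks)


def share_within(k):
    """Proportion of observations within k standard deviations of the mean."""
    inside = [x for x in marks if abs(x - centre) <= k * spread]
    return len(inside) / len(marks)


for k in (1, 2, 3):
    print(k, round(share_within(k), 3))
```

The printed proportions should land close to 68%, 95%, and 99.7%, though not exactly on them, since the rule is an approximation and the data are random.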


The Empirical Rule…
Approximately 68% of all observations fall
within one standard deviation of the mean.

Approximately 95% of all observations fall
within two standard deviations of the mean.

Approximately 99.7% of all observations fall
within three standard deviations of the mean.

Chebysheff’s Theorem…
A more general interpretation of the standard deviation is
derived from Chebysheff’s Theorem, which applies to all
shapes of histograms (not just bell shaped).

The proportion of observations in any sample that lie
within k standard deviations of the mean is at least:

$1 - \frac{1}{k^2}$ for $k > 1$

For k=2 (say), the theorem states
that at least 3/4 of all observations
lie within 2 standard deviations of
the mean. This is a “lower bound”
compared to Empirical Rule’s
approximation (95%).
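
The lower bound is simple enough to tabulate. The sketch below assumes the usual statement of the theorem, 1 - 1/k² for k > 1; the function name is illustrative.

```python
def chebysheff_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (1.5, 2, 3):
    print(k, round(chebysheff_lower_bound(k), 3))
# 1.5 0.556
# 2 0.75
# 3 0.889
```

Because k need not be an integer, the same function covers intervals such as one and a half standard deviations either side of the mean.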


Interpreting Standard Deviation
Suppose that the mean and standard deviation of last year’s midterm
test marks are 70 and 5, respectively.

If the histogram is bell-shaped then we know that approximately 68% of
the marks fell between 65 and 75, approximately 95% of the marks fell
between 60 and 80, and approximately 99.7% of the marks fell between
55 and 85.

If the histogram is not at all bell-shaped we can say that at least
75% of the marks fell between 60 and 80, and at least 88.9% of the
marks fell between 55 and 85. (We can use other values of k.)
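
The intervals quoted for the midterm marks follow from mean ± k standard deviations. A minimal sketch, with illustrative names, that rebuilds them:

```python
mean_mark, sd_mark = 70, 5


def interval(k):
    """Marks lying within k standard deviations of the mean."""
    return (mean_mark - k * sd_mark, mean_mark + k * sd_mark)


# Approximate shares for a bell-shaped histogram (Empirical Rule).
empirical_share = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (1, 2, 3):
    low, high = interval(k)
    line = f"k={k}: {low} to {high}, about {empirical_share[k]:.1%} if bell-shaped"
    if k > 1:
        # Guaranteed minimum share for a histogram of any shape.
        line += f", at least {1 - 1 / k ** 2:.1%} in any case"
    print(line)
```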

Coefficient of Variation…
The coefficient of variation of a set of observations is the standard
deviation of the observations divided by their mean, that is:

Population coefficient of variation = CV = $\frac{\sigma}{\mu}$

Sample coefficient of variation = cv = $\frac{s}{\bar{x}}$

Coefficient of Variation…
This coefficient provides a proportionate measure of variation, e.g.

A standard deviation of 10 may be perceived as large when the mean
value is 100, but only moderately large when the mean value is 500.
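
The comparison in this example can be put in numbers with a short sketch; the function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Standard deviation expressed as a proportion of the mean."""
    return std_dev / mean_value


print(coefficient_of_variation(10, 100))  # 0.1
print(coefficient_of_variation(10, 500))  # 0.02
```

The same standard deviation is 10% of the mean in the first case and only 2% in the second.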

Measures of Relative Standing & Box Plots
Measures of relative standing are designed to provide information
about the position of particular values relative to the entire data
set.

Percentile: the Pth percentile is the value for which P percent are
less than that value and (100-P)% are greater than that value.

Suppose you scored in the 60th percentile on the GMAT. That means 60%
of the other scores were below yours, while 40% of scores were above
yours.

Quartiles…
We have special names for the 25th, 50th, and 75th percentiles, namely
quartiles.

The first or lower quartile is labeled Q1 = 25th percentile.
The second quartile is Q2 = 50th percentile (which is also the median).
The third or upper quartile is Q3 = 75th percentile.

We can also convert percentiles into quintiles (fifths) and deciles
(tenths).

Commonly Used Percentiles…
First (lower) decile = 10th percentile
First (lower) quartile, Q1, = 25th percentile
Second (middle) quartile, Q2, = 50th percentile
Third quartile, Q3, = 75th percentile
Ninth (upper) decile = 90th percentile

Note: If your exam mark places you in the 80th percentile, that
doesn’t mean you scored 80% on the exam – it means that 80% of your
peers scored lower than you on the exam. It is about your position
relative to others.

Location of Percentiles…
The following formula allows us to approximate the location of any
percentile:

$L_P = (n + 1) \frac{P}{100}$

where $L_P$ is the location of the Pth percentile.

Location of Percentiles…
Recall the data from Example 4.1:
0 0 5 7 8 9 12 14 22 33

Where is the location of the 25th percentile? That is, at which point
are 25% of the values lower and 75% of the values higher?

L25 = (10+1)(25/100) = 2.75

The 25th percentile is three-quarters of the distance between the
second (which is 0) and the third (which is 5) observations.
Three-quarters of the distance is: (.75)(5 – 0) = 3.75

Because the second observation is 0, the 25th percentile is
0 + 3.75 = 3.75

Location of Percentiles…
What about the upper quartile?

L75 = (10+1)(75/100) = 8.25

0 0 5 7 8 9 12 14 22 33

It is located one-quarter of the distance between the eighth and the
ninth observations, which are 14 and 22, respectively. One-quarter of
the distance is: (.25)(22 - 14) = 2, which means the 75th percentile
is at: 14 + 2 = 16

Location of Percentiles…
Please remember…

0 0 | 5 7 8 9 12 14 | 22 33

position 2.75 gives the value 3.75 (25th percentile)
position 8.25 gives the value 16 (75th percentile)

Lp determines the position in the data set where the percentile value
lies, not the value of the percentile itself.
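
The two-step procedure (find the location, then interpolate between neighbouring observations) can be sketched as below. The function name is illustrative, and the handling of locations that fall before the first or after the last observation is an assumption, since the slides do not cover that case.

```python
def percentile(sorted_obs, p):
    """Value at location (n + 1) * p / 100, interpolating between neighbours."""
    n = len(sorted_obs)
    location = (n + 1) * p / 100   # 1-based position, may be fractional
    whole = int(location)          # position of the lower neighbour
    fraction = location - whole    # share of the distance to the next one
    # Assumption: clamp to the end points when the location is out of range.
    if whole < 1:
        return sorted_obs[0]
    if whole >= n:
        return sorted_obs[-1]
    lower = sorted_obs[whole - 1]
    upper = sorted_obs[whole]
    return lower + fraction * (upper - lower)


hours = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]  # already sorted

print(percentile(hours, 25))  # 3.75
print(percentile(hours, 50))  # 8.5
print(percentile(hours, 75))  # 16.0
```

Note that the location (2.75 or 8.25) is an intermediate quantity; the function returns the percentile value itself.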

Interquartile Range…
The quartiles can be used to create another measure of variability,
the interquartile range, which is defined as follows:

Interquartile Range = Q3 – Q1

The interquartile range measures the spread of the middle 50% of the
observations.

Large values of this statistic mean that the 1st and 3rd quartiles are
far apart indicating a high level of variability.
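
For the Example 4.1 data, the quartiles and the interquartile range can be obtained with the standard library. This sketch relies on `statistics.quantiles` (Python 3.8+), whose default "exclusive" method places the quartiles at positions based on n + 1, which agrees with the location formula used in these slides for this data set; other software may use different conventions and give slightly different quartiles.

```python
from statistics import quantiles

hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]  # order assumed; quantiles() sorts internally

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(hours, n=4)
iqr = q3 - q1

print(q1, q2, q3)  # 3.75 8.5 16.0
print(iqr)         # 12.25
```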

Box Plots…
The box plot is a technique that graphs five statistics:
• the minimum and maximum observations, and
• the first, second, and third quartiles.

[Figure: box plot with each whisker labeled (1.5*(Q3–Q1))]

The lines extending to the left and right are called whiskers. Any
points that lie outside the whiskers are called outliers. The whiskers
extend outward to the smaller of 1.5 times the interquartile range or
to the most extreme point that is not an outlier.
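
The whisker limits can be computed from the quartiles. The sketch below is illustrative: it takes Q1 = 3.75 and Q3 = 16 for the Example 4.1 data from the percentile slides, and it flags as outliers any points more than 1.5 interquartile ranges beyond the quartiles. The function names are made up for this example.

```python
def fences(q1, q3, multiplier=1.5):
    """Lower and upper limits beyond which a point is treated as an outlier."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


def outliers(observations, q1, q3):
    """Observations lying outside the fences."""
    low, high = fences(q1, q3)
    return [x for x in observations if x < low or x > high]


hours = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]
print(fences(3.75, 16))           # (-14.625, 34.375)
print(outliers(hours, 3.75, 16))  # []

# With 33 replaced by 133 the quartiles are unchanged (only the largest
# value moved), but the new value now lies beyond the upper fence.
altered = [0, 0, 5, 7, 8, 9, 12, 14, 22, 133]
print(outliers(altered, 3.75, 16))  # [133]
```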

Example 4.15
A large number of fast-food restaurants with drive-through windows
offer drivers and their passengers the advantages of quick service.

To measure how good the service is, an organization called QSR planned
a study wherein the amount of time taken by a sample of drive-through
customers at each of five restaurants was recorded.

Compare the five sets of data using a box plot and interpret the
results.

Box Plots…
These box plots are based on data in Xm04-15.

Wendy’s service time is shortest and least variable.

Hardee’s has the greatest variability, while Jack-in-the-Box has the
longest service times.

3.5 Mean-Variance Analysis and the Sharpe Ratio
LO 3.5: Explain mean-variance analysis and the Sharpe Ratio.

• Mean-variance analysis:
  - The performance of an asset is measured by its rate of return.
  - The rate of return may be evaluated in terms of its reward (mean)
    and risk (variance).
  - Higher average returns are often associated with higher risk.
• The Sharpe ratio uses the mean and variance to evaluate risk.

Mean-Variance Analysis and the Sharpe Ratio
• Sharpe Ratio
  - Measures the extra reward per unit of risk.
  - For an investment I, the Sharpe ratio is computed as:

    $\text{Sharpe Ratio} = \frac{\bar{x}_I - \bar{R}_f}{s_I}$

    where $\bar{x}_I$ is the mean return for the investment,
    $\bar{R}_f$ is the mean return for a risk-free asset, and
    $s_I$ is the standard deviation for the investment.
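
A minimal sketch of the Sharpe ratio calculation follows. The function and parameter names are illustrative, and the fund figures are hypothetical numbers chosen for easy arithmetic; they are not the Metals and Income fund data used in the slides.

```python
def sharpe_ratio(mean_return, risk_free_return, std_dev):
    """Extra reward (mean return above the risk-free return) per unit of risk."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_return - risk_free_return) / std_dev


# Hypothetical funds, returns in percent (not from the slides).
fund_a = sharpe_ratio(mean_return=10.0, risk_free_return=4.0, std_dev=12.0)
fund_b = sharpe_ratio(mean_return=7.0, risk_free_return=4.0, std_dev=10.0)

print(fund_a)  # 0.5
print(fund_b)  # 0.3
```

Under these made-up figures the first fund offers more reward per unit of risk, which is the same kind of comparison the slides make between two funds.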

Mean-Variance Analysis and the Sharpe Ratio
• Sharpe Ratio Example
  - Compute the Sharpe ratios for the Metals and Income funds given
    the risk-free return of 4%.
  - Sharpe ratio for the Metals fund = 0.56; Sharpe ratio for the
    Income fund = 0.41.
  - Since 0.56 > 0.41, the Metals fund offers more reward per unit of
    risk as compared to the Income fund.

Parameters and Statistics

                              Population      Sample
Size                          N               n
Mean                          $\mu$           $\bar{x}$
Variance                      $\sigma^2$      $s^2$
Standard Deviation            $\sigma$        $s$
Coefficient of Variation      CV              cv
Covariance                    $\sigma_{xy}$   $s_{xy}$
Coefficient of Correlation    $\rho$          $r$