1.0 INTRODUCTION
What is Statistics?
Statistics is the science that deals with the collection, classification, analysis and interpretation of
information/data in order to make decisions.
Statistics Definition
1. Population
A population is any entire collection of objects from which we may collect data.
This could be people, animals, microchips and so on.
3. Variable
A variable is a particular characteristic of the object being studied. This characteristic can
take on different values as we measure it from one object to another.
Example: The variables measured could be fuel consumption, car model and seating
capacity.
4. Random
Randomness means unpredictability. One of the requirements of the sampling process is that it
conforms to randomness. Hence, the variable being measured is called a random variable.
Example: A sample of 10 students who crossed the school gate between 7.30 and 7.35 am
was randomly selected to be interviewed by the prefect.
5. Data
Data are basically numbers derived from measuring a variable. However, data can also be
non-numeric.
Example: Height (153.4 cm, 141 cm) and favorite color (red, blue)
8. Parameter
A parameter is a value used to represent a certain population characteristic. Parameters
are assigned Greek letters.
Example: mean = $\mu$ and standard deviation = $\sigma$
9. Statistic
A statistic is a number that summarizes some aspect of the data, calculated from the data
collected from the sample. Statistics are assigned Roman letters.
Example: mean = $\bar{x}$ and standard deviation = $s$
Any data that you first gather is ungrouped data. Ungrouped data is data in the raw. An
example of ungrouped data is any list of numbers that you can think of.
Grouped data is data that has been organized into groups known as classes. Grouped
data has been 'classified' and thus some level of data analysis has taken place, which means that
the data is no longer raw.
Example: Data on Minutes Spent on the Phone by 30 Respondents (ungrouped data compared with grouped data)
Example A:
Suppose a researcher wished to do a study on the ages of the top 50 wealthiest people in the
world. The raw data collected was listed below:
57 90 81 73 61
59 57 69 65 60
56 85 78 68 85
85 81 61 69 52
81 43 43 37 78
82 68 67 64 48
56 49 79 77 65
40 69 80 59 54
71 76 69 61 74
83 35 74 87 49
Construct the frequency distribution with class limits 35-41, 42-48, and so on.
Solution:
Number of class = 8
Highest value = 90
Lowest value = 35
Size of class = $\frac{90 - 35}{8} = 6.875 \approx 7$
Frequency Table:
Age      Tally            Frequency
35-41    ///              3
42-48    ///              3
49-55    ////             4
56-62    ///// /////      10
63-69    ///// /////      10
70-76    /////            5
77-83    ///// /////      10
84-90    /////            5
                          Total = 50
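The tallying in Example A can be checked with a short script. This is only a sketch: the function name `frequency_table` and its parameters are illustrative choices, and it assumes integer class limits of equal width starting at the lowest value.

```python
# Ages of the 50 people from Example A.
ages = [
    57, 90, 81, 73, 61, 59, 57, 69, 65, 60,
    56, 85, 78, 68, 85, 85, 81, 61, 69, 52,
    81, 43, 43, 37, 78, 82, 68, 67, 64, 48,
    56, 49, 79, 77, 65, 40, 69, 80, 59, 54,
    71, 76, 69, 61, 74, 83, 35, 74, 87, 49,
]


def frequency_table(values, lowest, width, classes):
    """Count the values falling inside each class (limits are inclusive)."""
    table = []
    for k in range(classes):
        lower = lowest + k * width
        upper = lower + width - 1  # inclusive upper class limit
        count = sum(1 for v in values if lower <= v <= upper)
        table.append((lower, upper, count))
    return table


table = frequency_table(ages, lowest=35, width=7, classes=8)
for lower, upper, count in table:
    print(f"{lower}-{upper}: {count}")
```

The eight counts printed should agree with the Frequency column and add up to 50.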
1.2.1 Mean
Mean is the sum of the values, divided by the total number of values.
Example B:
The data represent the number of days off per year for a sample of individuals selected from
seven different countries. Find the mean:
15 21 16 17 25 30 27
Solution:
Mean, $\bar{x} = \frac{15 + 21 + 16 + 17 + 25 + 30 + 27}{7} = \frac{151}{7} = 21.57$
The mean of the number of days off is 21.57 days.
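As a quick check of the arithmetic, the mean of ungrouped data can be computed directly from its definition. The helper name `mean` is an illustrative choice; the data are the seven values of Example B.

```python
def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


days_off = [15, 21, 16, 17, 25, 30, 27]  # Example B
print(round(mean(days_off), 2))
```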
Example C:
Find the mean of the Science Test marks for the selected 10 students in Class Colorful below.
40 51 60.5 45 46
53 59 44 35 53
Solution:
Mean, $\bar{x} = \frac{40 + 51 + 60.5 + 45 + 46 + 53 + 59 + 44 + 35 + 53}{10} = \frac{486.5}{10} = 48.65$
Example D:
The numbers of students in all four Bachelor of Technology programmes were listed below
by an education officer from UTHM. Find the mean.
Multimedia : 140 Web Technology : 165
Internet Security : 210 Software Engineering : 200
Solution:
Mean, $\mu = \frac{140 + 165 + 210 + 200}{4} = \frac{715}{4} = 178.75$
The mean number of students across the Bachelor of Technology programmes is 178.75.
Example E:
A sample of 30 automobiles was tested for fuel efficiency (in miles per gallon). Find the mean, $\bar{x}$.
Fuel efficiency (in miles per gallon) Frequency
8-12 3
13-17 5
18-22 15
23-27 5
28-32 2
Solution:
Class    Class midpoint, $x_i$    Frequency, $f_i$    $f_i x_i$
8-12     10                       3                   30
13-17    15                       5                   75
18-22    20                       15                  300
23-27    25                       5                   125
28-32    30                       2                   60
                                  $\sum f_i = 30$     $\sum f_i x_i = 590$
$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{590}{30} = 19.67$
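The grouped-data mean of Example E can be sketched as follows. The function name `grouped_mean` and the `(lower limit, upper limit, frequency)` tuple layout are assumptions made for illustration; each class is represented by its midpoint, as in the solution.

```python
# (lower limit, upper limit, frequency) for each class in Example E.
fuel_classes = [(8, 12, 3), (13, 17, 5), (18, 22, 15), (23, 27, 5), (28, 32, 2)]


def grouped_mean(classes):
    """Mean of grouped data: sum of f*x over sum of f, with x the class midpoint."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f


print(round(grouped_mean(fuel_classes), 2))
```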
Example F:
Find the mean number of repetitions of the word "knowledge" per page in a story book.
Repetitions of the word "knowledge"    Number of pages
10 14
15 16
23 23
40 33
41 11
45 3
Solution:
Repetitions, $x_i$    Frequency, $f_i$    $f_i x_i$
10                    14                  140
15                    16                  240
23                    23                  529
40                    33                  1320
41                    11                  451
45                    3                   135
                      $\sum f_i = 100$    $\sum f_i x_i = 2815$
$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{2815}{100} = 28.15$
The mean number of repetitions of the word "knowledge" is 28.15 per page.
1.2.2 Median
Median is the most centrally located (middle) value.
Example G:
The number of rooms in the eight hotels in Malaysia is listed below. Find the median.
650 450 700 550 655 350 400 500
Solution:
Arrange the data in increasing order:
350 400 450 500 550 650 655 700
Select the middle values (there are eight values, so the two middle values are averaged):
350 400 450 [500 550] 650 655 700
Median $= \frac{500 + 550}{2} = \frac{1050}{2} = 525$
The median for the number of rooms in the eight hotels in Malaysia is 525 rooms.
Example H:
The number of children with asthma during a specific year in seven zones in Johor is shown.
Find the median.
253 125 328 417 201 70 90
Solution:
Arrange the data in increasing order and select the middle value:
70 90 125 [201] 253 328 417
The median for the number of children with asthma during a specific year in seven zones in
Johor is 201.
Example I:
Six customers purchased these numbers of books in a month. Find the median.
7 1 3 2 8 13
Solution:
Arrange the data in increasing order and select the middle values:
1 2 [3 7] 8 13
Median $= \frac{3 + 7}{2} = \frac{10}{2} = 5$
The median for the number of books that six customers purchased in a month is 5.
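The procedure used in Examples G, H and I (sort, then take the middle value, or the average of the two middle values when the count is even) can be sketched as below. The function name `median` is an illustrative choice.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


rooms = [650, 450, 700, 550, 655, 350, 400, 500]   # Example G
asthma = [253, 125, 328, 417, 201, 70, 90]         # Example H
books = [7, 1, 3, 2, 8, 13]                        # Example I

print(median(rooms), median(asthma), median(books))
```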
Example J:
Find the median for the data in the following frequency table
Class      Frequency
30-39      4
40-49      12
50-59      17
60-69      9
70-79      7
80-89      4
90-99      2
100-109    1
Solution:
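Class      Frequency, f    Cumulative frequency, F
30-39      4               4
40-49      12              16
50-59      17              33
60-69      9               42
70-79      7               49
80-89      4               53
90-99      2               55
100-109    1               56
$\sum f = 56$
Here $n = 56$, so $\frac{n}{2} = 28$. This value is located in the third interval, 50-59; we call this interval the median class.
Lower boundary for the median class, $L_m = \frac{49 + 50}{2} = 49.5$
Median $= L_m + c\left(\frac{\frac{n}{2} - F}{f_m}\right) = 49.5 + 10\left(\frac{28 - 16}{17}\right) = 56.56$
where $c = 10$ is the class size, $F = 16$ is the cumulative frequency before the median class and $f_m = 17$ is the frequency of the median class.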
Example K:
A sample of 30 automobiles was tested for fuel efficiency (in miles per gallon).
Fuel efficiency (in miles per gallon) Frequency
8-12 3
13-17 5
18-22 15
23-27 5
28-32 2
Find the median.
Solution:
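Class size, $c = 13 - 8 = 5$
Cumulative frequency before the median class, $F = 8$
Frequency of the median class, $f_m = 15$
$n = 30$, so $\frac{n}{2} = 15$; the median class is 18-22, with lower boundary $L_m = 17.5$
Median $= L_m + c\left(\frac{\frac{n}{2} - F}{f_m}\right) = 17.5 + 5\left(\frac{15 - 8}{15}\right) = 19.83$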
Example L:
A random sample of 35 states shows the number of specialty coffee shops for a specific
company. Find the median.
Class boundaries Frequency
0.5-19.5 13
19.5-38.5 8
38.5-57.5 6
57.5-76.5 5
76.5-95.5 3
Solution:
Class boundaries Frequency,f Cumulative frequency,F
0.5-19.5 13 13
19.5-38.5 8 21
38.5-57.5 6 27
57.5-76.5 5 32
76.5-95.5 3 35
$\sum f = 35$
$n = 35$, $\frac{n}{2} = 17.5$, so the median class is 19.5-38.5.
Median $= 19.5 + 19\left(\frac{17.5 - 13}{8}\right) = 30.19$
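The grouped-median calculation of Examples J, K and L can be sketched in a few lines. The function name `grouped_median` and the use of explicit class boundaries as input are assumptions for illustration; the median class is taken to be the first class whose cumulative frequency reaches n/2.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data.

    boundaries: list of (lower boundary, upper boundary), one pair per class.
    freqs: list of class frequencies in the same order.
    """
    half = sum(freqs) / 2          # n / 2
    cumulative = 0                 # cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= half:
            width = upper - lower  # class size
            return lower + width * (half - cumulative) / f
        cumulative += f
    raise ValueError("frequency table is empty")


# Example K: fuel efficiency, class limits 8-12, 13-17, ... turned into boundaries.
fuel_bounds = [(7.5, 12.5), (12.5, 17.5), (17.5, 22.5), (22.5, 27.5), (27.5, 32.5)]
fuel_freqs = [3, 5, 15, 5, 2]

# Example L: coffee shops, boundaries given directly.
coffee_bounds = [(0.5, 19.5), (19.5, 38.5), (38.5, 57.5), (57.5, 76.5), (76.5, 95.5)]
coffee_freqs = [13, 8, 6, 5, 3]

print(round(grouped_median(fuel_bounds, fuel_freqs), 2))
print(round(grouped_median(coffee_bounds, coffee_freqs), 2))
```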
1.2.3 Mode
The third measure of central tendency is mode. The mode is the value that occurs most
often in the data set. It is sometimes said to be the most typical case.
Example M:
Find the mode of the signing bonuses of eight football players for the year 2013. The bonuses in
millions of dollars are
17 12 34.5 11 11.3 11 13.5 11
Solution:
Mode = 11
unimodal
Example N:
Find the mode for the number of branches that six banks have:
Solution:
Mode = no mode
Example O:
Find the mode of the number of books purchased by twelve customers in a month;
7 7 4 4 4 9 7 7 4 7 4 9
Solution:
Mode = 4 and 7 (each occurs five times)
bimodal
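Counting occurrences makes the modes easy to verify. The sketch below uses `collections.Counter` from the Python standard library; the function name `modes` and the convention of returning an empty list when every value occurs only once are illustrative assumptions.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means there is no mode."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs exactly once
    return sorted(v for v, c in counts.items() if c == top)


bonuses = [17, 12, 34.5, 11, 11.3, 11, 13.5, 11]      # Example M
books = [7, 7, 4, 4, 4, 9, 7, 7, 4, 7, 4, 9]          # Example O

print(modes(bonuses))  # one mode: unimodal
print(modes(books))    # two modes: bimodal
```

For the Example O data, the values 4 and 7 each occur five times and 9 occurs twice.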
Example P:
Find the mode for the data in the following frequency table
Class      Frequency
30-39      4
40-49      12
50-59      17
60-69      9
70-79      7
80-89      4
90-99      2
100-109    1
Solution:
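Class      Frequency, f
30-39      4
40-49      12
50-59      17
60-69      9
70-79      7
80-89      4
90-99      2
100-109    1
$\sum f = 56$
The modal class is 50-59 (highest frequency, 17), with lower boundary 49.5 and class size 10. The differences between its frequency and the frequencies of the neighbouring classes are $17 - 12 = 5$ and $17 - 9 = 8$.
Mode $= 49.5 + 10\left(\frac{5}{5 + 8}\right) = 53.35$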
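Example Q:
A sample of 30 automobiles was tested for fuel efficiency (in miles per gallon). Find the mode.
Solution:
Class    Frequency, f
8-12     3
13-17    5
18-22    15
23-27    5
28-32    2
$\sum f = 30$
The modal class is 18-22, with lower boundary 17.5 and class size 5. The differences from the neighbouring class frequencies are $15 - 5 = 10$ and $15 - 5 = 10$.
Mode $= 17.5 + 5\left(\frac{10}{10 + 10}\right) = 20$
For the specialty coffee shop data of Example L ($\sum f = 35$), the modal class is 0.5-19.5, with differences $13 - 0 = 13$ and $13 - 8 = 5$:
Mode $= 0.5 + 19\left(\frac{13}{13 + 5}\right) = 14.22$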
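The grouped-mode calculations can be sketched as below. The function name `grouped_mode` is an illustrative choice, and the code assumes a single modal class; a missing neighbour (when the modal class is first or last) is treated as having frequency 0, which is how the value 14.22 for the coffee shop data arises.

```python
def grouped_mode(boundaries, freqs):
    """Mode of grouped data: L + c * d1 / (d1 + d2) for the modal class."""
    k = max(range(len(freqs)), key=lambda i: freqs[i])  # index of the modal class
    lower, upper = boundaries[k]
    before = freqs[k - 1] if k > 0 else 0               # frequency of the class before
    after = freqs[k + 1] if k + 1 < len(freqs) else 0   # frequency of the class after
    d1 = freqs[k] - before
    d2 = freqs[k] - after
    return lower + (upper - lower) * d1 / (d1 + d2)


# Example P: classes 30-39, 40-49, ..., 100-109 expressed as boundaries.
p_bounds = [(29.5 + 10 * k, 39.5 + 10 * k) for k in range(8)]
p_freqs = [4, 12, 17, 9, 7, 4, 2, 1]

# Coffee shop data (class boundaries 0.5-19.5, 19.5-38.5, ...).
coffee_bounds = [(0.5, 19.5), (19.5, 38.5), (38.5, 57.5), (57.5, 76.5), (76.5, 95.5)]
coffee_freqs = [13, 8, 6, 5, 3]

print(round(grouped_mode(p_bounds, p_freqs), 2))
print(round(grouped_mode(coffee_bounds, coffee_freqs), 2))
```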
Another important statistic for describing a data set is a measure of dispersion, or spread, in the
data. This measure tells you how much the values in the data set differ from the middle of
the data set.
Three measures of the spread or variability of a data set are commonly used: range, variance
and standard deviation.
1.3.1 Range
Example S:
Find the range of the data on mobile phone usage for calls and texting (in minutes) by eight
students in a month, listed below:
Solution:
Range = 3000 - 17
= 2983
Example T:
The salaries for the ABC Company's staff are listed below. Find the range.
Variance is the average of the squares of the distance of each value from the mean. The symbol for
the population variance is $\sigma^2$ (sigma squared), whereas the symbol for the sample variance is $s^2$.
Standard deviation measures how far the values lie from the mean. It is the square root of the
variance. The symbol for the population standard deviation is $\sigma$ (sigma) and for the sample standard
deviation is $s$.
ii) $\sigma^2 = \frac{N\sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2}{N^2}$
Example U:
A testing lab wishes to test two experimental brands of outdoor paint to see how long each will
last before fading (in months). The testing lab makes 4 gallons of each paint to test. Since
different chemical agents are added to each brand and only four cans are involved, these two
brands constitute small populations. Find the variance and standard deviation of each brand and
make a conclusion.
Brand ABC: 12 35 37 26
Brand XYZ: 17 32 37 24
Solution:
The sentence "... these two brands constitute small populations" shows that the data given are
population data.
Using Formula C
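Brand ABC

Formula C(i)
$\mu = \frac{12 + 35 + 37 + 26}{4} = \frac{110}{4} = 27.5$
$x_i$    $x_i - \mu$             $(x_i - \mu)^2$
12       12 - 27.5 = -15.5       240.25
35       7.5                     56.25
37       9.5                     90.25
26       -1.5                    2.25
                                 $\sum (x_i - \mu)^2 = 389$
$N = 4$
Therefore,
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{389}{4} = 97.25$
$\sigma = \sqrt{\sigma^2} = \sqrt{97.25} = 9.86$
Variance, $\sigma^2$ = 97.25 and standard deviation, $\sigma$ = 9.86

Formula C(ii)
$x_i$    $x_i^2$
12       144
35       1225
37       1369
26       676
$\sum x_i = 110$    $\sum x_i^2 = 3414$
$N = 4$
Therefore,
$\sigma^2 = \frac{N\sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2}{N^2} = \frac{4(3414) - (110)^2}{4^2} = 97.25$
$\sigma = \sqrt{\sigma^2} = \sqrt{97.25} = 9.86$
Variance, $\sigma^2$ = 97.25 and standard deviation, $\sigma$ = 9.86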
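Brand XYZ

Formula C(i)
$\mu = \frac{17 + 32 + 37 + 24}{4} = \frac{110}{4} = 27.5$
$x_i$    $x_i - \mu$             $(x_i - \mu)^2$
17       17 - 27.5 = -10.5       110.25
32       4.5                     20.25
37       9.5                     90.25
24       -3.5                    12.25
                                 $\sum (x_i - \mu)^2 = 233$
$N = 4$
Therefore,
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{233}{4} = 58.25$
$\sigma = \sqrt{\sigma^2} = \sqrt{58.25} = 7.63$
Variance, $\sigma^2$ = 58.25 and standard deviation, $\sigma$ = 7.63

Formula C(ii)
$x_i$    $x_i^2$
17       289
32       1024
37       1369
24       576
$\sum x_i = 110$    $\sum x_i^2 = 3258$
$N = 4$
Therefore,
$\sigma^2 = \frac{N\sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2}{N^2} = \frac{4(3258) - (110)^2}{4^2} = 58.25$
$\sigma = \sqrt{\sigma^2} = \sqrt{58.25} = 7.63$
Variance, $\sigma^2$ = 58.25 and standard deviation, $\sigma$ = 7.63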
Conclusion
Brand ABC : Variance, $\sigma^2$ = 97.25 and standard deviation, $\sigma$ = 9.86
Brand XYZ : Variance, $\sigma^2$ = 58.25 and standard deviation, $\sigma$ = 7.63
When the means are equal, the larger the variance or standard deviation, the more variable the data are. Since the standard
deviation of Brand ABC is 9.86 and the standard deviation of Brand XYZ is 7.63, the data are more variable for Brand ABC.
The data for Brand ABC are more widely spread than the data for Brand XYZ.
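Both population formulas for the two paint brands can be checked numerically. This is a sketch only: the function names `population_variance` and `population_variance_shortcut` are illustrative, the first following the definition (average squared deviation from the mean) and the second the sum-of-squares form.

```python
import math


def population_variance(values):
    """Average of the squared deviations from the population mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_variance_shortcut(values):
    """Same quantity from the sums: (N * sum(x^2) - (sum(x))^2) / N^2."""
    n = len(values)
    return (n * sum(x * x for x in values) - sum(values) ** 2) / n ** 2


brand_abc = [12, 35, 37, 26]
brand_xyz = [17, 32, 37, 24]

for name, data in (("ABC", brand_abc), ("XYZ", brand_xyz)):
    var = population_variance(data)
    print(name, var, round(math.sqrt(var), 2))
```

Both functions give 97.25 for Brand ABC and 58.25 for Brand XYZ, so the two brands have the same mean but Brand ABC has the larger spread.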
Example V:
A sample of six students in UTHM was selected by the accountant to examine their pocket
money in a month. Find the variance and standard deviation from the amounts listed below.
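Using Formula A
$\sum_{i=1}^{n} (x_i - \bar{x})^2 = 71083.33$
$n = 6$
Therefore,
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{71083.33}{5} = 14216.67$
Alternatively, from the sums $\sum x_i = 1310$ and $\sum x_i^2 = 357100$:
$s^2 = \frac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n - 1)} = \frac{6(357100) - 1310^2}{6(6 - 1)} = 14216.67$
$s = \sqrt{s^2} = \sqrt{14216.67} = 119.23$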
Example W:
A sample of seven zones in Johor was selected by the Ministry of Health to check the number
of children with asthma during a specific year in each zone. Find $s^2$ and $s$.
Solution:
$\bar{x} = \frac{253 + 125 + 328 + 417 + 201 + 70 + 90}{7} = \frac{1484}{7} = 212$
$x_i$    $x_i - \bar{x}$    $(x_i - \bar{x})^2$    $x_i^2$
253      41                 1681                   64009
125      -87                7569                   15625
328      116                13456                  107584
417      205                42025                  173889
201      -11                121                    40401
70       -142               20164                  4900
90       -122               14884                  8100
Sum 1484                    99900                  414508
$n = 7$
Therefore,
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{99900}{6} = 16650$
$s = \sqrt{s^2} = \sqrt{16650} = 129.03$
Alternatively, from the sums of $x_i$ and $x_i^2$:
$s^2 = \frac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n - 1)} = \frac{7(414508) - 1484^2}{7(7 - 1)} = 16650$
$s = \sqrt{s^2} = \sqrt{16650} = 129.03$
Variance, $s^2$ = 16650
Standard deviation, $s$ = 129.03
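The sample calculation for the seven Johor zones can be verified in the same way. The function names below are illustrative; note the divisor n - 1 (and n(n - 1) in the sum-of-squares form), which is what distinguishes the sample variance from the population variance.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Same quantity from the sums: (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(values)
    return (n * sum(x * x for x in values) - sum(values) ** 2) / (n * (n - 1))


asthma = [253, 125, 328, 417, 201, 70, 90]

s2 = sample_variance(asthma)
print(s2, round(math.sqrt(s2), 2))
```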
Example X:
A sample of 30 automobiles was tested for fuel efficiency (in miles per gallon), as in Example E. Find the variance and standard deviation.
Solution:
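Using Formula B
Formula B(i)
$x_i$    $f_i$    $f_i x_i$    $(x_i - \bar{x})^2$    $f_i (x_i - \bar{x})^2$
10       3        30           93.51                  280.53
15       5        75           21.81                  109.04
20       15       300          0.11                   1.63
25       5        125          28.41                  142.04
30       2        60           106.71                 213.42
Total    30       590                                 746.66
$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{590}{30} = 19.67$
$s^2 = \frac{1}{\sum f - 1} \sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \frac{1}{30 - 1}(746.66) = 25.75$
$s = \sqrt{s^2} = \sqrt{25.75} = 5.07$
Formula B(ii)
$x_i$    $f_i$    $f_i x_i$    $x_i^2$    $f_i x_i^2$
10       3        30           100        300
15       5        75           225        1125
20       15       300          400        6000
25       5        125          625        3125
30       2        60           900        1800
Total    30       590                     12350
$s^2 = \frac{1}{\sum f - 1}\left[\sum_{i=1}^{n} f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{\sum f}\right] = \frac{1}{29}\left[12350 - \frac{590^2}{30}\right] = 25.75$
$s = \sqrt{s^2} = \sqrt{25.75} = 5.07$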
Example Y:
A random sample of 35 states shows the number of specialty coffee shops for a specific
company, as in Example L. Find the variance and standard deviation.
Solution:
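$x_i$    $f_i$    $f_i x_i$    $(x_i - \bar{x})^2$    $f_i (x_i - \bar{x})^2$    or    $x_i^2$    $f_i x_i^2$
10       13       130          650.76                 8459.88                          100        1300
29       8        232          42.38                  339.04                           841        6728
48       6        288          156                    936                              2304       13824
67       5        335          991.62                 4958.1                           4489       22445
86       3        258          2549.24                7647.72                          7396       22188
Total    35       1243                                22340.74                                    66485
$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{1243}{35} = 35.51$
Using the deviations from the mean:
$s^2 = \frac{1}{\sum f - 1} \sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \frac{1}{35 - 1}(22340.74) = 657.08$
or, using the sums of squares:
$s^2 = \frac{1}{\sum f - 1}\left[\sum_{i=1}^{n} f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{\sum f}\right] = \frac{1}{34}\left[66485 - \frac{1243^2}{35}\right] = 657.08$
$s = \sqrt{s^2} = \sqrt{657.08} = 25.63$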
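The grouped-data variance for the fuel efficiency and coffee shop samples can be checked with the sum-of-squares form. This is a sketch under the same assumption as the worked solutions, namely that every observation in a class sits at the class midpoint; the function name `grouped_sample_variance` is illustrative.

```python
import math


def grouped_sample_variance(midpoints, freqs):
    """Sample variance of grouped data: (sum(f*x^2) - (sum(f*x))^2 / n) / (n - 1)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


# Fuel efficiency sample: class midpoints and frequencies.
fuel_var = grouped_sample_variance([10, 15, 20, 25, 30], [3, 5, 15, 5, 2])

# Coffee shop sample: class midpoints and frequencies.
coffee_var = grouped_sample_variance([10, 29, 48, 67, 86], [13, 8, 6, 5, 3])

print(round(fuel_var, 2), round(math.sqrt(fuel_var), 2))
print(round(coffee_var, 2), round(math.sqrt(coffee_var), 2))
```

Because the code works from unrounded sums, small differences in the last decimal place relative to a hand calculation with a rounded mean are to be expected.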