
Probability and Statistics

Chapter 10: Small-Sample Inference


Some graphic screen captures from Seeing Statistics Some images 2001-(current year) www.arttoday.com

Introduction
When the sample size is small, the estimation and testing procedures of Chapter 8 are not appropriate. There are equivalent small-sample test and estimation procedures for:
$\mu$, the mean of a normal population
$\mu_1 - \mu_2$, the difference between two population means
$\sigma^2$, the variance of a normal population
The ratio of two population variances.

The Sampling Distribution of the Sample Mean

When we take a sample from a normal population, the sample mean $\bar{x}$ has a normal distribution for any sample size n, and

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

has a standard normal distribution. But if $\sigma$ is unknown and we must estimate it with s, the resulting statistic

$\frac{\bar{x} - \mu}{s/\sqrt{n}}$

is not normal!

Fortunately, this statistic has a sampling distribution that is well known to statisticians, called Student's t distribution, with $n - 1$ degrees of freedom.

Student's t Distribution

$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
We can use this distribution to create testing procedures for the population mean $\mu$.
Copyright 2006 Brooks/Cole, a division of Thomson Learning, Inc.

Properties of Student's t
Mound-shaped and symmetric about 0.
More variable than z, with heavier tails.
Shape depends on the sample size n or the degrees of freedom, $n - 1$.
As n increases, the shapes of the t and z distributions become almost identical.
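The properties listed for Student's t can be checked numerically. The sketch below is only an illustration: the helper names `t_pdf` and `z_pdf` are hypothetical, and the densities are written out from their standard formulas using Python's standard library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (standard formula)."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def z_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Far from 0 the t density sits above the z density (heavier tails),
# and the gap shrinks as the degrees of freedom grow.
for df in (2, 5, 30, 200):
    print(df, round(t_pdf(3.0, df), 5), round(z_pdf(3.0), 5))
```

The printed t values at x = 3 should move toward the z value as df increases, which is the "almost identical" behavior described above.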

Using the t Table

Table 4 gives the values of t that cut off certain critical areas in the tail of the t distribution. Index df and the appropriate tail area a to find $t_a$, the value of t with area a to its right.
For a random sample of size n = 10, find the value of t that cuts off .025 in the right tail. Row = df = n - 1 = 9. Column subscript = a = .025. $t_{.025} = 2.262$
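A table entry such as $t_{.025} = 2.262$ for df = 9 can be sanity-checked by integrating the t density numerically. This is a rough sketch, not how the table was built: `t_pdf` and `t_upper_tail` are hypothetical helper names, and Simpson's rule is simply a convenient choice of integration method.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (standard formula)."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry, half of the area lies to the right of 0.
    return 0.5 - total * h / 3

area = t_upper_tail(2.262, 9)
print(round(area, 4))  # should be close to the tail area .025 used in the table
```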

Small-Sample Inference for a Population Mean


The basic procedures are the same as those used for large samples. For a test of hypothesis:
Test $H_0: \mu = \mu_0$ versus $H_a$: one or two tailed
using the test statistic $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, with p-values or a rejection region based on a t distribution with df = n - 1.

Small-Sample Inference for a Population Mean


For a $100(1-\alpha)\%$ confidence interval for the population mean $\mu$:

$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$

where $t_{\alpha/2}$ is the value of t that cuts off an area of $\alpha/2$ in the tail of a t distribution with df = n - 1.

Example
A sprinkler system is designed so that the average time for the sprinklers to activate after being turned on is no more than 15 seconds. A test of 6 systems gave the following times: 17, 31, 12, 17, 13, 25. Is the system working as specified? Test using $\alpha = .05$.

$H_0: \mu = 15$ (working as specified)
$H_a: \mu > 15$ (not working as specified)

Example
Data: 17, 31, 12, 17, 13, 25. First, calculate the sample mean and standard deviation:

$\bar{x} = \frac{\sum x_i}{n} = \frac{115}{6} = 19.167$

$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}} = \sqrt{\frac{2477 - \frac{115^2}{6}}{5}} = 7.387$
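The mean and standard deviation of the six activation times can be reproduced with a few lines of Python. This is a minimal sketch; the variable names are illustrative, and the shortcut formula is compared against `statistics.stdev` from the standard library.

```python
import math
import statistics

times = [17, 31, 12, 17, 13, 25]  # activation times from the example
n = len(times)

sum_x = sum(times)                   # sum of the observations
sum_x2 = sum(x * x for x in times)   # sum of the squared observations

mean = sum_x / n
# Computing (shortcut) formula for the sample standard deviation
s_shortcut = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))
# Library version, which uses the same n - 1 divisor
s_library = statistics.stdev(times)

print(n, sum_x, sum_x2)
print(round(mean, 3), round(s_shortcut, 3), round(s_library, 3))
```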

Example
Data: 17, 31, 12, 17, 13, 25. Calculate the test statistic and find the rejection region for $\alpha = .05$.
Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{19.167 - 15}{7.387/\sqrt{6}} = 1.38$
Degrees of freedom: df = n - 1 = 6 - 1 = 5
Rejection region: Reject $H_0$ if t > 2.015. If the test statistic falls in the rejection region, its p-value will be less than $\alpha = .05$.

Conclusion
Data: 17, 31, 12, 17, 13, 25. Compare the test statistic with the rejection region, and draw conclusions.

$H_0: \mu = 15$
$H_a: \mu > 15$

Test statistic: t = 1.38
Rejection region: Reject $H_0$ if t > 2.015

Conclusion: For our example, t = 1.38 does not fall in the rejection region and $H_0$ is not rejected. There is insufficient evidence to indicate that the average activation time is greater than 15.
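The test for the sprinkler data can be sketched as a small function. The name `one_sample_t` is hypothetical, and the critical value 2.015 is taken from the rejection region quoted in the example rather than computed.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)
    return (xbar - mu0) / (s / math.sqrt(n)), n - 1

times = [17, 31, 12, 17, 13, 25]
t_stat, df = one_sample_t(times, 15)

T_CRIT = 2.015             # t_.05 with 5 df, as quoted in the example
reject = t_stat > T_CRIT   # upper-tailed test: Ha is mu > 15

print(round(t_stat, 2), df, reject)
```

With these data the statistic stays below the critical value, which agrees with the decision not to reject $H_0$.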

Approximating the p-value

You can only approximate the p-value for the test using Table 4.

Since the observed value of t = 1.38 is smaller than $t_{.10} = 1.476$, p-value > .10.

The exact p-value

You can get the exact p-value using some calculators or a computer.

p-value = .113, which is greater than .10, as we approximated using Table 4.

One-Sample T: Times
Test of mu = 15 vs > 15
                                          95% Lower
Variable  N     Mean   StDev  SE Mean        Bound     T      P
Times     6  19.1667  7.3869   3.0157      13.0899  1.38  0.113
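Two columns of the computer output, SE Mean and the 95% lower bound, follow directly from the confidence-interval formula. The sketch below assumes the one-sided bound is $\bar{x} - t_{.05}\, s/\sqrt{n}$ and reuses the table value $t_{.05} = 2.015$ for 5 df from the example; the small difference in the last digit comes from rounding that table value.

```python
import math
import statistics

times = [17, 31, 12, 17, 13, 25]
n = len(times)

xbar = statistics.mean(times)
se_mean = statistics.stdev(times) / math.sqrt(n)  # standard error of the mean

T_05 = 2.015  # t_.05 with df = 5, from the example's rejection region
lower_bound = xbar - T_05 * se_mean  # one-sided 95% lower confidence bound

print(round(se_mean, 4), round(lower_bound, 2))
```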

Testing the Difference between Two Means


As in Chapter 9, independent random samples of size $n_1$ and $n_2$ are drawn from populations 1 and 2 with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$. Since the sample sizes are small, the two populations must be normal.

To test: $H_0: \mu_1 - \mu_2 = D_0$ versus $H_a$: one of three alternatives, where $D_0$ is some hypothesized difference, usually 0.

Testing the Difference between Two Means


The test statistic used in Chapter 9

$z \approx \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

does not have either a z or a t distribution, and cannot be used for small-sample inference. We need to make one more assumption, that the population variances, although unknown, are equal.

Testing the Difference between Two Means


Instead of estimating each population variance separately, we estimate the common variance with

$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

And the resulting test statistic,

$t = \frac{\bar{x}_1 - \bar{x}_2 - D_0}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

has a t distribution with $n_1 + n_2 - 2$ degrees of freedom.

Estimating the Difference between Two Means


You can also create a $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$:

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ with $s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

Remember the three assumptions:
1. Original populations normal
2. Samples random and independent
3. Equal population variances.

Example

Two training procedures are compared by measuring the time that it takes trainees to assemble a device. A different group of trainees is taught using each method. Is there a difference in the two methods? Use $\alpha = .01$.

Time to Assemble    Method 1    Method 2
Sample size         10          12
Sample mean         35          31
Sample Std Dev      4.9         4.5

$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$
Test statistic: $t = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

Example

Solve this problem by approximating the p-value using Table 4.

Time to Assemble    Method 1    Method 2
Sample size         10          12
Sample mean         35          31
Sample Std Dev      4.9         4.5

Calculate:
$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{9(4.9^2) + 11(4.5^2)}{20} = 21.942$

Test statistic:
$t = \frac{35 - 31}{\sqrt{21.942\left(\frac{1}{10} + \frac{1}{12}\right)}} = 1.99$

Example

p-value: $P(t > 1.99) + P(t < -1.99)$
$P(t > 1.99) = \frac{1}{2}(\text{p-value})$

df = $n_1 + n_2 - 2$ = 10 + 12 - 2 = 20
$.025 < \frac{1}{2}(\text{p-value}) < .05$
$.05 < \text{p-value} < .10$
Since the p-value is greater than $\alpha = .01$, $H_0$ is not rejected. There is insufficient evidence to indicate a difference in the population means.
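The pooled-variance calculation for the training example can be sketched as follows. The function name `pooled_t` is hypothetical, and the two bracketing values for df = 20 are standard t-table entries assumed here (they are not printed in the slides).

```python
import math

def pooled_t(n1, xbar1, s1, n2, xbar2, s2, d0=0.0):
    """Pooled variance, t statistic and df for two independent samples."""
    df = n1 + n2 - 2
    s2_pooled = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    t = (xbar1 - xbar2 - d0) / math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return s2_pooled, t, df

# Method 1: n = 10, mean 35, sd 4.9; Method 2: n = 12, mean 31, sd 4.5
s2_pooled, t_stat, df = pooled_t(10, 35, 4.9, 12, 31, 4.5)

# Assumed table values for df = 20: t_.05 = 1.725 and t_.025 = 2.086
T_05, T_025 = 1.725, 2.086
bracketed = T_05 < abs(t_stat) < T_025  # one-tail area between .025 and .05

print(round(s2_pooled, 3), round(t_stat, 2), df, bracketed)
```

If the statistic falls between the two table values, the one-tail area lies between .025 and .05, so the two-tailed p-value lies between .05 and .10, as stated in the example.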

Testing the Difference between Two Means


How can you tell if the equal variance assumption is reasonable?
Rule of Thumb:
If the ratio $\frac{\text{larger } s^2}{\text{smaller } s^2} \le 3$, the equal variance assumption is reasonable.
If the ratio $\frac{\text{larger } s^2}{\text{smaller } s^2} > 3$, use an alternative test statistic.
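The rule of thumb is easy to apply in code. The sketch below takes the cutoff to be 3, the usual textbook statement of this rule (an assumption, since the number is damaged in the extracted slide), and the function name `equal_variance_reasonable` is hypothetical.

```python
def equal_variance_reasonable(s_a, s_b, cutoff=3.0):
    """Return (ratio of larger to smaller sample variance, whether it passes)."""
    larger = max(s_a, s_b) ** 2
    smaller = min(s_a, s_b) ** 2
    ratio = larger / smaller
    return ratio, ratio <= cutoff

# Standard deviations from the training example: 4.9 and 4.5
ratio, ok = equal_variance_reasonable(4.9, 4.5)
print(round(ratio, 3), ok)
```

For the training data the ratio is well under the cutoff, so pooling the two sample variances looks reasonable there.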

Testing the Difference between Two Means


If the population variances cannot be assumed equal, the test statistic

$t \approx \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ with $df \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$

has an approximate t distribution with degrees of freedom given above. This is most easily done by computer.
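Since the unequal-variance statistic "is most easily done by computer", here is a minimal sketch of that computation. The function name `unpooled_t` is hypothetical; the summary statistics are the ones from the rat experiment that appears later in this chapter, used purely as sample input.

```python
import math

def unpooled_t(n1, xbar1, s1, n2, xbar2, s2):
    """t statistic and approximate df when variances are not assumed equal."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    t = (xbar1 - xbar2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Experimental (1): n = 11, mean 12.42, sd 5.8; Standard (2): n = 10, mean 13.64, sd 2.3
t_stat, df = unpooled_t(11, 12.42, 5.8, 10, 13.64, 2.3)
print(round(t_stat, 3), round(df, 1))
```

The degrees of freedom are generally not a whole number; when a table is used, a common convention is to round down to the nearest integer.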

The Paired-Difference Test


Sometimes the assumption of independent samples is intentionally violated, resulting in a matched-pairs or paired-difference test. By designing the experiment in this way, we can eliminate unwanted variability in the experiment by analyzing only the differences, $d_i = x_{1i} - x_{2i}$, to see if there is a difference in the two population means, $\mu_1 - \mu_2$.

Example
Car       1      2      3      4      5
Type A    10.6   9.8    12.3   9.7    8.8
Type B    10.2   9.4    11.8   9.1    8.3

One Type A and one Type B tire are randomly assigned to each of the rear wheels of five cars. Compare the average tire wear for types A and B using a test of hypothesis.
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$
But the samples are not independent. The pairs of responses are linked because measurements are taken on the same car.

The Paired-Difference Test


To test $H_0: \mu_1 - \mu_2 = 0$ we test $H_0: \mu_d = 0$ using the test statistic

$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}}$

where n = number of pairs, and $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences, $d_i$. Use the p-value or a rejection region based on a t distribution with df = n - 1.

Example
Car          1      2      3      4      5
Type A       10.6   9.8    12.3   9.7    8.8
Type B       10.2   9.4    11.8   9.1    8.3
Difference   .4     .4     .5     .6     .5

$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$

Calculate:
$\bar{d} = \frac{\sum d_i}{n} = .48$

$s_d = \sqrt{\frac{\sum d_i^2 - \frac{(\sum d_i)^2}{n}}{n - 1}} = .0837$

Test statistic:
$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{.48 - 0}{.0837/\sqrt{5}} = 12.8$

Example
Car          1      2      3      4      5
Type A       10.6   9.8    12.3   9.7    8.8
Type B       10.2   9.4    11.8   9.1    8.3
Difference   .4     .4     .5     .6     .5

Rejection region: Reject $H_0$ if t > 2.776 or t < -2.776.
Conclusion: Since t = 12.8, $H_0$ is rejected. There is a difference in the average tire wear for the two types of tires.

Some Notes
You can construct a $100(1-\alpha)\%$ confidence interval for a paired experiment using

$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}}$

Once you have designed the experiment by pairing, you MUST analyze it as a paired experiment. If the experiment is not designed as a paired experiment in advance, do not use this procedure.
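The tire-wear test and the paired confidence interval can both be worked from the raw data. This is a sketch with illustrative variable names; the value 2.776 is the critical value from the example's rejection region (df = 4, .025 in each tail), so the interval it produces is a 95% interval.

```python
import math
import statistics

type_a = [10.6, 9.8, 12.3, 9.7, 8.8]
type_b = [10.2, 9.4, 11.8, 9.1, 8.3]

# Analyze only the within-car differences d_i = A_i - B_i
diffs = [a - b for a, b in zip(type_a, type_b)]
n = len(diffs)

d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)
se = s_d / math.sqrt(n)

t_stat = d_bar / se   # tests H0: mu_d = 0

T_025 = 2.776         # t_.025 with df = 4, from the rejection region
margin = T_025 * se
ci = (d_bar - margin, d_bar + margin)

print(round(d_bar, 2), round(s_d, 4), round(t_stat, 1))
print(tuple(round(v, 3) for v in ci))
```

An interval that lies entirely above 0 tells the same story as the rejected null hypothesis: Type A shows more wear on average than Type B.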

Inference Concerning a Population Variance


Sometimes the primary parameter of interest is not the population mean $\mu$ but rather the population variance $\sigma^2$. We choose a random sample of size n from a normal distribution. The sample variance $s^2$ can be used in its standardized form:

$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$

which has a chi-square distribution with n - 1 degrees of freedom.

Inference Concerning a Population Variance

Table 5 gives both upper and lower critical values of the chi-square statistic for a given df.

For example, the value of chi-square that cuts off .05 in the upper tail of the distribution with df = 5 is $\chi^2 = 11.07$.

Inference Concerning a Population Variance


To test $H_0: \sigma^2 = \sigma_0^2$ versus $H_a$: one or two tailed, we use the test statistic

$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$

with a rejection region based on a chi-square distribution with df = n - 1.

Confidence interval:
$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{(1 - \alpha/2)}}$

Example
A cement manufacturer claims that his cement has a compressive strength with a standard deviation of 10 kg/cm2 or less. A sample of n = 10 measurements produced a mean and standard deviation of 312 and 13.96, respectively.
A test of hypothesis:
$H_0: \sigma^2 = 100$ (claim is correct)
$H_a: \sigma^2 > 100$ (claim is wrong)
uses the test statistic:

$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{9(13.96^2)}{100} = 17.5$

Example
Do these data produce sufficient evidence to reject the manufacturer's claim? Use $\alpha = .05$.
Rejection region: Reject $H_0$ if $\chi^2 > 16.919$ ($\alpha = .05$).
Conclusion: Since $\chi^2 = 17.5$, $H_0$ is rejected. The standard deviation of the cement strengths is more than 10.
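The cement calculation is a one-line formula, sketched below. The function name `chi_square_stat` is hypothetical; note that the claimed standard deviation of 10 enters the statistic as a variance, $\sigma_0^2 = 100$. The critical value 16.919 is the one quoted in the example.

```python
def chi_square_stat(n, s, sigma0):
    """Chi-square statistic (n - 1) s^2 / sigma0^2 for a test about a variance."""
    return (n - 1) * s ** 2 / sigma0 ** 2

# n = 10 measurements, sample sd 13.96, claimed sd 10
chi2 = chi_square_stat(10, 13.96, 10)

CHI2_05 = 16.919          # upper .05 critical value for df = 9, from the example
reject = chi2 > CHI2_05   # upper-tailed test: Ha is sigma^2 > 100

print(round(chi2, 2), reject)
```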

Approximating the p-value


p-value: $P(\chi^2 > 17.5)$ with df = n - 1 = 9

.025 < p-value < .05
Since the p-value is less than $\alpha = .05$, $H_0$ is rejected. There is sufficient evidence to reject the manufacturer's claim.

Inference Concerning Two Population Variances


We can make inferences about two population variances in the form of a ratio. We choose two independent random samples of size $n_1$ and $n_2$ from normal distributions. If the two population variances are equal, the statistic

$F = \frac{s_1^2}{s_2^2}$

has an F distribution with $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$ degrees of freedom.

Inference Concerning Two Population Variances


Table 6 gives only upper critical values of the F statistic for a given pair of $df_1$ and $df_2$.
For example, the value of F that cuts off .05 in the upper tail of the distribution with $df_1 = 5$ and $df_2 = 8$ is F = 3.69.

Inference Concerning Two Population Variances


To test $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a$: one or two tailed, we use the test statistic

$F = \frac{s_1^2}{s_2^2}$, where $s_1^2$ is the larger of the two sample variances,

with a rejection region based on an F distribution with $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$.

Confidence interval:
$\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{df_1, df_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2} F_{df_2, df_1}$

Example
An experimenter has performed a lab experiment using two groups of rats. He wants to test $H_0: \mu_1 = \mu_2$, but first he wants to make sure that the population variances are equal.

                  Standard (2)    Experimental (1)
Sample size       10              11
Sample mean       13.64           12.42
Sample Std Dev    2.3             5.8

Preliminary test:
$H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 \neq \sigma_2^2$

Example
                  Standard (2)    Experimental (1)
Sample size       10              11
Sample Std Dev    2.3             5.8

$H_0: \sigma_1^2 = \sigma_2^2$
$H_a: \sigma_1^2 \neq \sigma_2^2$

Test statistic:
$F = \frac{s_1^2}{s_2^2} = \frac{5.8^2}{2.3^2} = 6.36$

We designate the sample with the larger standard deviation as sample 1, to force the test statistic into the upper tail of the F distribution.

Example
$H_0: \sigma_1^2 = \sigma_2^2$
$H_a: \sigma_1^2 \neq \sigma_2^2$

Test statistic:
$F = \frac{s_1^2}{s_2^2} = \frac{5.8^2}{2.3^2} = 6.36$

The rejection region is two-tailed, with $\alpha = .05$, but we only need to find the upper critical value, which has $\alpha/2 = .025$ to its right.
From Table 6, with $df_1 = 10$ and $df_2 = 9$, we reject $H_0$ if F > 3.96.
CONCLUSION: Reject $H_0$. There is sufficient evidence to indicate that the variances are unequal. Do not rely on the assumption of equal variances for your t test!
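The preliminary F test for the rat experiment can be sketched as below. The function name `f_statistic` is hypothetical; it places the larger sample standard deviation in the numerator, as the example does, and the critical value 3.96 is the Table 6 entry quoted in the example.

```python
def f_statistic(s_a, n_a, s_b, n_b):
    """Return F with the larger sample variance on top, plus (df1, df2)."""
    # Sort so the sample with the larger standard deviation becomes sample 1.
    (s1, n1), (s2, n2) = sorted([(s_a, n_a), (s_b, n_b)], reverse=True)
    return s1 ** 2 / s2 ** 2, n1 - 1, n2 - 1

# Standard group: sd 2.3, n = 10; Experimental group: sd 5.8, n = 11
f_stat, df1, df2 = f_statistic(2.3, 10, 5.8, 11)

F_025 = 3.96             # upper .025 critical value for df1 = 10, df2 = 9
reject = f_stat > F_025  # two-tailed test at alpha = .05

print(round(f_stat, 2), df1, df2, reject)
```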
