1.) When and how do you use a chi-square distribution to test if two variables
are independent? Also, provide an example showing how to use the contingency
table to find expected frequencies.
You can use the chi-square test of independence to see whether two categorical
variables are related to each other. The null hypothesis is always that there is no
relationship between the two variables. Other conditions are that the data are counts
from a random sample and that each expected cell frequency is at least 5.
To calculate the expected frequency for a cell of the contingency table, you multiply
that cell's row total by its column total, and then divide by the grand total:
E = (row total × column total) / grand total
State H0 and Ha: the null hypothesis states that the two variables are
statistically independent. This means that knowing the value of one variable does
not help in predicting the other variable. H0 – the variables are independent (no
relationship). Ha – the variables are dependent (there is a relationship)
Select the level of significance (α)
State the decision rule by defining the rejection region.
You must know the level of significance (α) and the degrees of freedom (df) to
find the critical value from the χ²-table.
df = (number of rows – 1) × (number of columns – 1) = (r – 1)(k – 1),
taken from the contingency table
The top row of the χ²-table shows the significance level and the first column
contains the number of degrees of freedom. Because the χ²-distribution is
positively skewed, the critical value will always be positive and in the right-hand
tail of the curve. The non-rejection region for H0 goes from the left tail of the
curve to the χ² critical value. To the right lies the rejection region. We will reject
H0 if the calculated χ² exceeds the χ² table (critical) value.
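The steps above can be sketched in code. This is a minimal worked example with a hypothetical 2×2 contingency table (the data are invented for illustration); it computes each cell's expected frequency as (row total × column total) / grand total and then the χ² statistic:

```python
# Hypothetical 2x2 contingency table (illustrative data only).
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]          # [50, 50]
col_totals = [sum(col) for col in zip(*observed)]    # [50, 50]
grand_total = sum(row_totals)                        # 100

# Expected frequency for each cell: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
# expected = [[25.0, 25.0], [25.0, 25.0]]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))

# df = (r - 1)(k - 1) = 1; the table critical value at alpha = .05 is 3.841.
print(chi2)  # 4.0 > 3.841, so reject H0
```

With these made-up counts the calculated χ² (4.0) exceeds the critical value (3.841), so H0 would be rejected and the two variables judged dependent.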
2.) How do you use the chi-square distribution to test if a frequency distribution
fits a claimed distribution? Provide an example showing the steps of the chi-
square Goodness-of-Fit Test.
Goodness-of-fit tests can be used when there are more than two categories. Also,
goodness-of-fit tests can be used when you expect something other than equal
frequencies across the groups.
Suppose that we already know that in the US, 50% of the population has
brown/black hair, 40% has blonde hair, and 10% has red hair.
We want to see whether this is true in Europe too: we sample 80 Europeans and
record their hair color.
Calculated: χ² = Σ (O − E)² / E
If the calculated χ2 does NOT equal or exceed the critical value, then we fail to
reject H0
d.f. = 3 – 1 = 2, so the critical value at α = .05 is 5.991.
Our obtained χ² of 1.1 does not equal or exceed this value. Therefore we fail to
reject H0.
Our conclusion:
“The distribution of hair color in Europe is not different than the distribution of hair
color in the US, χ2(2, N = 80) = 1.1, p > .05.”
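The observed hair-color counts did not survive in this copy, so here is the same goodness-of-fit procedure with hypothetical observed counts for N = 80 (these particular counts are illustrative and do not reproduce the χ² of 1.1 quoted above):

```python
# Hypothetical observed hair-color counts for a sample of 80 Europeans.
observed = [44, 30, 6]            # brown/black, blonde, red
claimed  = [0.50, 0.40, 0.10]     # claimed US proportions (H0)
n = sum(observed)                 # 80

# Expected counts under H0: claimed proportion * sample size.
expected = [p * n for p in claimed]   # [40.0, 32.0, 8.0]

# Goodness-of-fit statistic: sum of (O - E)^2 / E over categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# chi2 = 0.4 + 0.125 + 0.5 = 1.025

# df = 3 categories - 1 = 2; critical value at alpha = .05 is 5.991.
print(chi2)  # 1.025 < 5.991, so fail to reject H0
```

As with the quoted result, the calculated χ² falls short of the critical value, so we would fail to reject H0 and conclude the European distribution does not differ from the claimed US distribution.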
The χ² distribution:
For example, it’s very probable that you could demonstrate a strong positive
correlation between the variable of average daily temperature and the variable
number of juvenile offenses committed in most communities in the U.S. Having
discovered such a strong positive correlation, however, does not mean that a
high temperature causes the commission of juvenile offenses. In fact, we know
from a broad array of sociological studies
4.) What is the difference between strong positive and strong negative
correlation? Is there value in a strong negative correlation? If so, what?
When a correlation coefficient is not statistically different from zero, the two
measures are said to be uncorrelated. Technically, this means that knowing the
value of one measure does not allow you to predict the value of the second
measure with accuracy greater than chance. If the correlation between two
variables is zero, no statistical relationship is present – a value on one measure
reveals nothing about the other measure.
A zero correlation tells us that the variables are not related at all and finding a
zero correlation would not be helpful for drawing any types of inferences. Such a
result could be the product of many influences, including the lack of reliability of
your assessment instrument. When the correlation coefficient is zero it
corresponds to cases where the two estimated regression lines are parallel to the
axes. Some examples of data having zero correlation are shown below:
[Three scatterplots of y against x showing zero-correlation data]
The last of these diagrams serves to emphasize that the correlation coefficient is
a statistic that is specifically concerned with linear relationships; zero correlation
does not necessarily imply that x and y are unrelated.
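That point can be shown numerically. In this small sketch, y = x² over a symmetric range of x values is a perfect (but nonlinear) relationship, yet the Pearson correlation coefficient comes out exactly zero:

```python
# A perfect nonlinear relationship with zero linear correlation.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]           # [4, 1, 0, 1, 4]

mean_x = sum(xs) / len(xs)          # 0.0
mean_y = sum(ys) / len(ys)          # 2.0

# Pearson r: covariance divided by the product of standard deviations
# (the common 1/n factors cancel, so raw sums suffice).
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
var_x = sum((x - mean_x) ** 2 for x in xs)
var_y = sum((y - mean_y) ** 2 for y in ys)
r = cov / (var_x * var_y) ** 0.5
print(r)  # 0.0 — yet y is completely determined by x
```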
Zero correlation means that there is no connection at all. If one correlated the
length of people's noses with their intelligence, the result would be neither
positive nor negative – an example of zero correlation. The connection is
meaningless, and the correlation would be zero, as intelligence and nose length
are not correlated.
7.) What is the relationship between the algebraic equation for a straight line
and the linear regression equation?
For example if there is zero correlation between SAT scores and GPA, then
knowing a person’s SAT score tells you nothing about that person’s GPA.
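As a sketch of the connection question 7 asks about: the linear regression equation ŷ = a + bx has exactly the form of the algebraic line y = mx + b, except that the slope b and intercept a are not given exactly but are estimated from data by least squares. The data points below are hypothetical:

```python
# Least-squares fit of yhat = a + b*x, the regression analogue of the
# algebraic line y = mx + b (hypothetical data points).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope b plays the role of m, and intercept a plays the role of b in
# y = mx + b, but both are estimated from the sample rather than known.
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
a = mean_y - b * mean_x

print(round(b, 3), round(a, 3))  # b ≈ 1.99, a ≈ 0.05
```

Unlike the algebraic line, which passes through every point exactly, the regression line is the line that minimizes the squared vertical distances to the observed points, so predictions from it carry error.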