sabinarodriguez
Statistics II
1st year, Degree in Psychology
Faculty of Psychology
Universitat de València
All rights reserved. Economic exploitation or transformation of this work is not permitted. Printing it in its entirety is permitted.
Index
Test 1. Causality
Test 1. Causality
1. Ivtzan, I., Young, T., Martman, J., Jeffrey, A., Lomas, T., Hart, R., & Eiroa-Orosa, F. J.
(2016). Integrating mindfulness into positive psychology: A randomized controlled trial of an
online positive mindfulness program. Mindfulness, 7(6), 1396-1407.
Search the article to answer the following questions. You will find most of the answers in the
method section but some of them will be in other parts of the article.
2. Ivtzan, I., Young, T., Martman, J., Jeffrey, A., Lomas, T., Hart, R., & Eiroa-Orosa, F. J.
(2016). Integrating mindfulness into positive psychology: A randomized controlled trial of an
online positive mindfulness program. Mindfulness, 7(6), 1396-1407.
Search the article to answer the following questions. You will find most of the answers in the
method section but some of them will be in other parts of the article.
a. The study does not find that the control group is different from the experimental
group in the sociodemographic variables.
b. As all participants are volunteers, the results may not apply to people who would
never want to participate in a mindfulness program.
c. References seem adequate.
d. The percentage of women participating was higher than that of men.
e. Most of the participants had university studies.
f. The effect on mindfulness was measured using the FMI scale (Freiburg mindfulness
inventory).
g. The effect on happiness was measured using the Pemberton Happiness Index (PHI).
h. The results may be confounded: perhaps those who dropped out of the program
were the most unsatisfied with it, and only those who were doing well remained, so
the evaluations are positively biased.
i. With these results, we can recommend this positive mindfulness program to
everyone.
j. With these results, it is not proven that this program works.
- FALSE: Actually, the study does not give information about that. This is probably the
biggest problem with this study: many subjects did not finish the program, perhaps
because they felt that it was not working. Consequently, those who stayed were most
likely the ones who were more satisfied with it, so the average score of the responses
was probably inflated.
- TRUE: Since all the participants are volunteers, we don’t know whether the program
would be of any use to someone who would not be interested in participating in it.
This is a problem with many therapies: if those who start them are not predisposed
enough, the therapies may be of no use to them and will not work.
- TRUE: The references seem adequate.
- TRUE: Indeed, the percentage of women who volunteered to participate was higher
than that of men.
- TRUE: Indeed, most of the participants had university studies.
- TRUE: The effect on mindfulness was measured using the FMI (Freiburg Mindfulness
Inventory) scale. This is stated at the bottom of page 1401, in the Measures section.
- TRUE: The effect on happiness was measured using the Pemberton Happiness
Index (PHI). This is mentioned at the bottom of page 1401, in the Measures section. By the
way, the authors of the referenced article (Hervás and Vázquez, 2013) are from the
Complutense University, so that scale is available in Spanish.
- TRUE: I am afraid that, if I myself have understood the article correctly, the results
may be muddled because those who dropped out of the program might be the ones
who did the worst and only those who did well remained. That would cast doubt on
the results unless we studied the reasons for abandoning the program earlier and
found that they were unrelated to the outcome.
- FALSE: This study did not handle well the people who dropped out of the program.
For this reason, it is not very clear whether the program works or not, since we only have
data from a part of the participants in the treatment group, which may be biased.
- TRUE: Indeed, although this study offers promising results, it is not conclusive (at
least this is my opinion, but let me know yours if you feel otherwise).
3. Mensink, M.C., & Dodge, L. (2014). Music and memory: Effects of listening to music while
studying in college students.
Search the article to answer the following questions, indicating if the answer is correct
(several alternatives may be correct). Most of the answers are in the method section but
there may be others in other parts of the article.
- TRUE: Yes. The study has a reference section that seems quite reasonable.
- TRUE: The study reports the mean effect of treatments on page 209.
- TRUE: The article makes two comparisons: between reading in silence and listening
to some kind of music, and then between the two styles of music and silence.
- FALSE: There are no significant differences in reading comprehension between the
music condition and the silence condition.
- TRUE: It sounds like a student’s report but her competence and professionalism
seem to be excellent.
- TRUE: The authors of the article say so in the discussion (p. 210).
- TRUE: The author is a student.
- TRUE: Indeed, it seems that pop music did not affect the participants as much as
classical music. Perhaps familiarity is an important factor that was not taken into
account in the study. The authors mention this factor too.
- FALSE: I wouldn’t say that.
- FALSE: They only read for five minutes, and the authors point out that this is a
limitation of the study. It would be interesting to do this study with longer stretches of
studying and music listening.
4. Mensink, M.C., & Dodge, L. (2014). Music and memory: Effects of listening to music while
studying in college students.
Search the article to answer the following questions, indicating if the answer is correct
(several alternatives may be correct). Most of the answers are in the method section but
there may be others in other parts of the article.
5. Click here and read the paper in order to answer the following questions, indicating if the
answer is correct (several alternatives may be correct).
- FALSE: Only people who listen to music are used in the study, so we have no
information on those who do not listen to music and, without comparison, no causal
inferences can be drawn.
- FALSE: I do not see treatment and control anywhere.
- FALSE: There is no control group that I know of.
- FALSE: There is no treatment group.
- TRUE: Indeed, the only thing that is done is to describe the academic performance of
those who listen to music.
- FALSE: There is no description of the differences between the control group and the
treatment group that I see.
Student’s great discovery was the formula $\sigma_{\bar{X}} = \frac{s_{n-1}}{\sqrt{n}}$, which is used to calculate the
standard error. Here $s_{n-1}$ is the estimate of the standard deviation in the population and n is the
sample size.
- False: The square root of the mean has nothing to do with the standard error.
- False: No, the mean is not part of the formula for calculating the standard error of the
mean (weird, isn’t it?).
- False: No. Whether the data was collected well or not has little to do with this matter.
- Correct: Indeed.
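As a quick check of the formula, here is a minimal Python sketch. The scores are made up for illustration and the function name `standard_error` is my own choice, not something from the course material. Note that `statistics.stdev` already divides by n-1.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: s_{n-1} / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divides by n - 1
    return s / math.sqrt(n)

scores = [4, 8, 6, 5, 7]  # hypothetical data, not taken from the course
se = standard_error(scores)
print(round(se, 3))
```

Notice that the mean of the scores never appears in the last step: only the standard deviation and the sample size do.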
The count of all individuals in a country with additional information such as social, economic,
or home conditions. You can read more about the Spanish census here (if you want).
- False: This is the electoral census.
- False: No, this is the censor.
- Correct: Congratulations!
- False: No, this is censorship.
Taking population samples is the most realistic way we have to study a population (unless
you have superpowers).
- False: “The best” is ambiguous: the best for what?
- False: Scientists often use samples, but non-scientists do too.
- Correct: Sampling is the realistic way to study populations.
- False: “Optimal” is ambiguous: optimal in which sense?
4. Does reducing the sample variance have an effect on the standard error?
a. Yes. But only when the sample is small.
b. Yes. The larger the variance the larger the standard error.
c. Yes. The larger the variance, the larger the standard error, so the smaller the
variance, the smaller the standard error.
d. Yes. But only when the sample is greater than 1,000 cases.
If you remember the standard error formula, you will see that we divide the standard
deviation of the sample (although using n-1) by the square root of the number of cases.
$\sigma_{\bar{X}} = \frac{s_{n-1}}{\sqrt{n}}$
The larger the variance, the larger the standard deviation, and since it is in the numerator of
the standard error formula, the standard error will also increase.
- False: It is true that, if the sample is small, a larger variance will have a higher impact
on the result of an experiment. However, although this has consequences for
experimental design, it is a topic for another course (one about Experimental
Design).
- False: see the explanation in the solution.
- Correct: As the solution indicates.
- False: Variance always has an effect, but if the sample is large, the effect of variance
diminishes, so the truth is the opposite of what is written in this alternative.
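A worked example with made-up numbers may help here. The values n = 25, s = 10 and s = 5 are assumptions chosen only to keep the arithmetic easy. With the same sample size, halving the standard deviation halves the standard error:

```latex
\sigma_{\bar{X}} = \frac{s_{n-1}}{\sqrt{n}}
\qquad\Rightarrow\qquad
\frac{10}{\sqrt{25}} = 2
\quad\text{versus}\quad
\frac{5}{\sqrt{25}} = 1
```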
Only one sample is taken in each study (although sometimes we speak of subsamples when
it is possible to distinguish between subgroups within the full sample). However, sometimes,
the same study or a similar one is replicated, but this is exceptional as there are always
small variations in the conditions, so we usually do not say that we are drawing a new
sample, but that we are undertaking a new study. Of course, this is in Psychology: It is
possible that drawing different samples from a population under comparable conditions is
carried out in other sciences (think of Chemistry or Biology).
6. To get “the probable error of a mean” you can draw many samples of a given size from a
population and:
a. Calculate the mean on each sample and then take the means of those means.
b. Calculate the mean in each sample and then take the standard deviation of those
means.
c. Calculate the probability of each mean in the sample.
d. Calculate the mean for each sample
Calculate the mean in each sample and then take the standard deviation of those means:
Student called it the empirical solution and carried it out with a table of correlations of the
measurements of the middle finger of 3000 criminals, which makes me wonder whether
someone at some point in history thought that this measurement might be related to being a
criminal.
- False: The mean of the means estimates the population mean; it does not tell us how much the means vary from sample to sample.
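The empirical solution can be sketched as a small simulation. Everything numeric here is an assumption for illustration: the population is 3000 simulated values (not Student's real finger measurements), and the sample size of 4 and the 2000 repetitions are my own choices. The sketch compares the standard deviation of the sample means with what the formula σ/√n predicts.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 3000 values, roughly normal (mean 100, SD 15).
population = [random.gauss(100, 15) for _ in range(3000)]

n = 4  # size of each sample (an assumption for this sketch)

# Draw many samples and keep the mean of each one.
means = [statistics.mean(random.sample(population, n)) for _ in range(2000)]

empirical_se = statistics.stdev(means)                 # SD of the sample means
formula_se = statistics.pstdev(population) / n ** 0.5  # sigma / sqrt(n)

print(round(empirical_se, 2), round(formula_se, 2))
```

The two printed values should be close to each other, which is the point of the exercise: the standard deviation of the means is what the standard error formula estimates.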
9. Those who make political predictions make up for the variability in the results in different
samples by:
a. Taking the average of several surveys.
b. Increasing the size of the sample until it is equal to that of the population.
c. Trusting experts who know how to interpret people’s feelings.
d. Making up the result (as everyone in the media usually does).
The (weighted) average of several polls has lately established itself as the method of making
electoral predictions in the media. These averages are more accurate than individual survey
results because they incorporate more information and make up for errors or biases in
individual surveys.
- Correct: Congratulations!
- False: Working with the entire population is generally unfeasible, although of course
there are always exceptions.
- False: Maybe yes, but I hope not.
- False: Hmm. Very conspiratorial, isn't it?
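To make the idea of a weighted average of polls concrete, here is a minimal sketch. The four polls are invented, and weighting each poll by its sample size is only one possible choice (an assumption of mine): the media averages mentioned above may weight polls differently.

```python
# Hypothetical polls: (estimated share of the vote, sample size).
polls = [(0.31, 800), (0.28, 1200), (0.33, 500), (0.29, 1500)]

total_n = sum(n for _, n in polls)

# Weighted average: each poll counts in proportion to its sample size.
weighted_avg = sum(share * n for share, n in polls) / total_n

print(round(weighted_avg, 4))
```

Larger polls pull the average towards their own estimate, so a single small poll with an unusual result has limited influence.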
Although the mean and standard deviation are the parameters that we will use the most, the
sampling error problem applies to any situation where we want to estimate something about
the population and we use samples: the correlation, the percentiles, the median.
- False: This is correct but there is an even better one.
- False: This is correct but there is an even better one.
- Correct: Bravo! Indeed we can calculate the standard error of many more things
besides the standard error of the mean, but this is the one we use to introduce the
concept.
- False: This is correct but there is an even better one.
11. To calculate a 95% confidence interval of the value of the mean in a population from the
mean in a sample:
a. We multiply 2.56 and -2.56 by the standard error and then add and subtract the
obtained value from the sample mean.
b. We calculate the standard error and multiply it by the values of t for the sample size
(minus 1) that leave 95% of the possible values in the middle (that value is usually
about 2). Then we add and subtract the obtained value from the sample mean.
c. We multiply 2.56 and -2.56 by the standard error and then add and subtract the value
obtained from the population mean.
d. I would request the statistical package to calculate it for me.
To make inferences about populations from values in samples, we use the value of t
multiplied by the standard error for a given level of confidence (usually 95%). Adding that
product to the sample value and subtracting it from the sample value, we obtain an interval
within which we are confident the population value lies.
- False: 2.56 and -2.56 are close to the values (about 2.58 and -2.58) that leave 99% of the
values within when using the z distribution. Sometimes we will use those values to calculate
confidence intervals because the t distribution with very large samples ends up being the z
distribution, but this is not the right answer in general.
- Correct: This is the correct answer.
- False: In no case can we subtract the values from the mean value in the population
because that value is always unknown.
- False: Nice try.
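The steps of the correct alternative can be sketched in a few lines of Python. The ten scores are invented, and the t value is typed in from a table rather than computed: 2.262 is the tabled value for 95% confidence with 9 degrees of freedom, so it only applies to a sample of 10 cases like this one.

```python
import math
import statistics

sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # hypothetical scores, n = 10
n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean

T_CRIT = 2.262  # tabled t for 95% confidence with n - 1 = 9 degrees of freedom

margin = T_CRIT * se
ci = (mean - margin, mean + margin)  # add and subtract around the SAMPLE mean
print(ci)
```

With a different sample size you would need to look up a different t value. As the alternative says, it is usually about 2.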
12. In a study on the effects of online positive psychology, volunteers were randomly
assigned either to treatment or to the waiting list (control group). The study found that
practically all of the two samples were women with a high educational level. This poses a
problem of:
a. Something, but I don’t know what.
b. Representativeness.
c. Confounding variables.
d. Concept.
13. The means obtained in small samples (say 4 cases) differ from the population mean
more than the means obtained in large samples (say 100 cases) because:
a. Small values are more likely when the sample is small than when it is large.
b. The differences between the value of the sample mean and that of the population
mean when the sample size is only 4 cases can be much larger than when the
sample size is 100 cases.
c. The differences between the value of the sample mean and that of the population
mean when the sample size is 4 cases are always larger than when the sample size is
100 cases.
d. A sample of 4 cases has more clustered values than a sample of 100 cases.
- False: No. Small values are not more likely when the sample is small than when it is
large. What made you think that this could be the case?
- Correct: When the sample is small, the differences from the population value can be
larger (although not necessarily) than when the sample is large.
- False: It says “always”, and that is certainly not correct. It may happen that, by chance,
the mean of a sample of 100 cases is far from the population’s mean, but that is
unusual.
- False: I don’t really know how “clustered” can be interpreted, so I hope you haven’t
answered this alternative.
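The difference between "can be larger" and "is always larger" can be seen in a small simulation. The population mean of 50, the standard deviation of 10 and the 2000 repetitions are all assumptions made for this sketch, and `sample_mean` is a helper name of my own.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible
MU, SIGMA = 50, 10  # hypothetical population mean and standard deviation

def sample_mean(n):
    """Mean of one random sample of n cases from the hypothetical population."""
    return statistics.mean(random.gauss(MU, SIGMA) for _ in range(n))

trials = 2000
err_small = [abs(sample_mean(4) - MU) for _ in range(trials)]
err_large = [abs(sample_mean(100) - MU) for _ in range(trials)]

# Typical distance from the population mean for each sample size.
print(statistics.mean(err_small), statistics.mean(err_large))

# Share of trials in which the small sample happened to land closer.
small_wins = sum(s < l for s, l in zip(err_small, err_large)) / trials
print(small_wins)
```

On average the samples of 4 cases land much farther from the population mean, but in a minority of trials the small sample is the closer one, which is why "always" is too strong.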
Student’s t distribution tells us what percentage of values will be below a value normally
expressed in standard scores. From that we can calculate the percentage of values that will
be between two standard (z) scores. Since the distribution of means calculated in samples of
a certain size drawn from a population follows the t distribution, we can say that it is the right
model
- False: 95% is the size of the confidence interval that we most commonly use, but it is
not the t distribution.
- Correct: If you have been led by the intuition that the longest alternative is the correct
one, this time it has worked out (but do not put too much trust in that rule because I
also know it).
- False: -2 and 2 are values that we often use for t because they allow us to calculate
an approximate 95% confidence interval without looking at tables. But they are not
the t distribution.
- False: We use the value of t (as an approximation we can use 2 and -2 for 95%
confidence) to calculate 95% confidence intervals.
The results part is where the statistical analyses performed on the data are explained. It is a
fairly technical part that often uses plenty of statistical analyses.
- Correct: The contents of the Statistics subject are usually applied in this section, both
to learn how to write it and to understand the analyses carried out by others in their
reports.
- False: The results should be nothing but brief comments about the results of the
statistical tests: the conclusions part is where you discuss your results in relation to
the theory presented in the introduction and even possible future work.
- False: Psychometrics is related to measurement and, unless the work is focused on
improving or developing a questionnaire, the part where measurement is described is
usually in the method section. And so it happens with Research Design.
- False: If you are a Statistics professor answering this exam, this answer is maybe
correct: for the rest of the world, the answer is false, the other parts of a study are
also important.
In many sciences, theories are not so well specified that a single study can completely refute
them. Therefore, the theory can always be adjusted somewhat to make the results fit into
it. However, if the negative results continue piling up, then it’s probably high time to start
afresh with a new theory.
- False: Although this alternative is marked as false, under certain conditions and on
certain occasions it could be correct. For example, a) if we have very little confidence
in our theory, or b) if the theory is very well detailed because then the test can be
more conclusive.
- False: This is very radical. You usually require more than one study to reject a theory
(if it is a serious one, of course).
- Correct: This is the correct alternative.
- False: That would be killing the messenger, wouldn’t it?
3. Research design:
a. It is usually taught in courses other than Statistics.
b. It is irrelevant as long as the data are well analyzed.
c. It is a part of the Statistics courses.
In Psychology, Statistics and Research Designs are regarded as different topics, although
closely related to each other. If an investigation is poorly designed, not even the most
sophisticated statistical analyses will lead to sound conclusions.
- Correct: Indeed, Statistics and Research Designs are usually two separate topics in
Psychology (although perhaps they should not be).
- False: If a research design is not correct, statistical analysis alone will seldom save
the day.
- False: Statistics courses usually do not deal with how to carry out
research designs in detail.
- False: Although the brilliance of some researchers often achieves wonderful research
designs that no professor could ever have taught them, there are aspects of design
that are well established and can be learned in the right books and courses.
4. In the material of the course, in the section about the scientific method, there are several
journalistic articles that refer to surveys or studies at a national or international level. Using
this link related to water El mito de los 8 vasos de agua al día (nationalgeographic.com.es),
answer the question below:
According to that article, there is a myth that it is necessary to drink 8 glasses of water daily.
Suppose that —although it is not true— we suspect that young people have bought it and
college students have developed the habit of drinking 8 or more glasses of water a day…
What null hypothesis would we use in this case to test that using a random sample of
students?
The question says “if they drink eight or more glasses of water”, so the null hypothesis to test
is that they drink 7 glasses of water or fewer.
- True: The question says “if they drink eight glasses of water or more”, so the null
hypothesis to test is that they drink 7 glasses of water or fewer.
- False: The null hypothesis to test is that they drink fewer than 8 glasses of water, so this
answer is not correct. To support the claim that they drink 8 or more, we should state in the
null hypothesis that they drink 7 glasses of water or fewer.
- False: This would be very wrong. We think they drink 8 or more, so the null
hypothesis should be 7 glasses of water or fewer.
5. In the material of the course, in the section about the scientific method, there are several
journalistic articles that refer to surveys or studies at a national or international level. Using
this link related to smoking El consumo de tabaco en España y el mundo, en datos y gráficos
(epdata.es), answer the following question:
If we believed that university students are heavier smokers than the normal population, what
value would we place as the null hypothesis of a hypothesis test about the percentage of
students who smoke daily?
The text indicates that 22% of the Spanish population claims to smoke daily (the source is
AECC/INE, so it seems reasonable to trust it). The null hypothesis is the opposite of what we
think so in this case it is 22% or less (since we think they smoke more and we would aim to
reject that).
- False: 25% are those who declare themselves ex-smokers, so you have looked at
the wrong percentage. Also, this would be a test of a specific value and not whether
the students smoke more than a particular value so this response is also wrong for
that.
- False: This would be the study hypothesis or alternative (what we want to confirm).
- False: 25% are those who declare themselves ex-smokers, so you have looked at
the wrong percentage.
- True: Well done.
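To see how a null value of 22% would be used, here is a sketch of a one-sided test for a proportion based on the normal approximation. The survey numbers (200 students, 56 daily smokers) are invented, and the normal-approximation z test is my choice for the illustration: the course may use a different procedure.

```python
import math
from statistics import NormalDist

P0 = 0.22     # null value taken from the text: 22% smoke daily
n = 200       # hypothetical number of students surveyed
smokers = 56  # hypothetical number of them who smoke daily

p_hat = smokers / n
se0 = math.sqrt(P0 * (1 - P0) / n)  # standard error assuming the null is true
z = (p_hat - P0) / se0
p_value = 1 - NormalDist().cdf(z)   # one-sided: only "more than 22%" counts
print(round(z, 2), round(p_value, 3))
```

Only results above 22% count as evidence against the null hypothesis, which is why the probability is taken from a single tail.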
6. In the material of the course, in the section about the scientific method, there are several
journalistic articles that refer to surveys or studies at a national or international level. Using
this link related to smoking El consumo de tabaco en España y el mundo, en datos y gráficos
(epdata.es), answer the following question:
If we believed that the difference between males and females studying at the university with
regard to smoking is higher than it is in the general population, what hypothesis of the study
could we use to test this?
a. That university male students smoke more than university female students in a
percentage greater than 25%.
b. That university male students smoke the same as university female students in a
percentage of 7% or less.
c. That university male students smoke more than university female students in a
percentage greater than 7%.
d. That university male students smoke less than university female students in a
percentage greater than 25%
As you can see in the text, a quarter of men in Spain declare themselves to be smokers,
versus 18% of women. This puts men 7 percentage points above women in smoking. If we
think that male university students smoke more, relative to female university students, than
that 7%, we should state as the study hypothesis that male university students smoke more
than female university students by a percentage greater than 7%, and as the null hypothesis
that the difference is 7% or less.
- False: 25% is the percentage of male smokers but the hypothesis is about the
difference with females.
- False: This would be the null hypothesis.
- True: Good answer!
- False: 25% is the percentage of male smokers but the hypothesis is about the
difference with females. Besides, “less” does not make much sense either.
7. As the teacher is a bit of a smartass, he has decided to change the name of one of the
hypotheses and use a different one in his classes. Do you remember which one he
renamed?
a. Everyone except the teacher calls the study hypothesis the alternative hypothesis.
b. The teacher would never rename the hypotheses because their names perfectly
reflect their intrinsic meanings.
c. Everyone except the teacher calls the null hypothesis the alternative hypothesis.
d. Everyone except the professor calls the study hypothesis the null hypothesis.
Everyone except the professor calls the study hypothesis the alternative hypothesis.
- Correct: This alternative would be correct.
- False: “Null hypothesis” more or less reflects the intrinsic sense of being the
hypothesis of no effect or no difference, but the teacher believes that “alternative
hypothesis”, the name traditionally used for the hypothesis with which we most agree,
is not quite right: it seems to imply that it is an alternative to the one we prefer
when in fact it is the one that we prefer.
- False: Everyone, including the professor, calls the null hypothesis the null hypothesis
(although it’s a name that he is not very happy with either).
- False: The study hypothesis is called the alternative hypothesis by everyone except
the professor.
8. Induction means:
a. Deducing the consequences of the laws of science.
b. Inferring theories from repeated observations of reality.
c. Proving that the theories are true.
d. Rejecting hypotheses that are not true.
IMRD are the initials of Introduction, Method, Results, and Discussion which is the basic
structure used in scientific-empirical articles in sciences such as biology, psychology, or
chemistry.
- False: Theoretical articles do not follow these steps.
- Correct: This is the correct alternative.
- False: Statistical papers don’t use these steps so often.
- False: Although IMRD is a scheme that has a lot to do with the scientific method
itself, they are not the same thing.
The results part is where the statistical analyses performed on the data are explained. It is a
fairly technical part that often uses plenty of statistical analyses.
- False: The results should be nothing but brief comments about the results of the
statistical tests: the conclusions part is where you discuss your results in relation to
the theory presented in the introduction and even possible future work.
- Correct: The contents of the Statistics subject are usually applied in this section, both
to learn how to write it and to understand the analyses carried out by others in their
reports.
- False: Psychometrics is related to measurement and, unless the work is focused on
improving or developing a questionnaire, the part where measurement is described is
usually in the method section. And so it happens with Research Design.
- False: If you are a Statistics professor answering this exam, this answer is maybe
correct: for the rest of the world, the answer is false, the other parts of a study are
also important.
The significance value is calculated by drawing samples from a population with
parameters equal to those of the null hypothesis. It therefore indicates the probability of
obtaining the results that have occurred in a sample if the null hypothesis (for the entire
population) is true. Note that we usually want that probability to be low because that way we
will reject the null hypothesis.
- False: In a study, the hypothesis we test is the null hypothesis, not the study
hypothesis.
- Correct: Indeed, the significance value is the probability of obtaining the results that
have occurred in a sample if the null hypothesis (for the entire population) is true.
- False: In the hypothesis testing procedure we use, we cannot calculate whether a
hypothesis, null or study, is true. What we can do is calculate the probability of
obtaining the results that have occurred in a sample if the null hypothesis (for the
entire population) is true.
- False: In the hypothesis testing procedure we use, we cannot calculate whether a
hypothesis, null or study, is true. What we can do is calculate the probability of
obtaining the results that have occurred in a sample if the null hypothesis (for the
entire population) were true.
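The idea of drawing samples from a population in which the null hypothesis is true can be sketched as a simulation. All the numbers are assumptions for illustration: a null mean of 100, a standard deviation of 15, a sample of 25 cases and an observed mean of 106. The sketch counts results at least as large as the observed one, which is the usual one-sided reading of "the results that have occurred".

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical set-up: the null hypothesis says the population mean is 100
# (SD 15); our single sample of n = 25 cases gave a mean of 106.
NULL_MEAN, SD, n = 100, 15, 25
observed_mean = 106

sims = 10000
# Draw many samples from a population in which the null hypothesis is true.
null_means = [
    statistics.fmean(random.gauss(NULL_MEAN, SD) for _ in range(n))
    for _ in range(sims)
]

# Proportion of those samples with a mean at least as large as the observed one.
p_value = sum(m >= observed_mean for m in null_means) / sims
print(p_value)
```

The printed proportion says how often a result like ours appears when the null hypothesis holds. It does not say how probable the null hypothesis itself is.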
The steps of the scientific method are theory, hypothesis, method, analysis of results, and
conclusions.
- The scientific method is not only applied within the laboratory. There is also science
beyond the laboratory.
- This alternative doesn’t really make much sense: proving the null and study
hypotheses is not the goal of the scientific method.
- This alternative is the correct one.
- This option is not complete: statistical analyses alone are usually not enough to
reach conclusions, and there is no mention of this last step. Also, hypotheses
are drawn from theory, so those two steps are not in the correct order.
Psychology is an empirical science because its theories need to be tested against reality to
determine if they are valid.
- True
- False
- False
- False
1. You have a file with data from a survey on social issues in the USA (GSS93) in the
course’s material. (There is a version in Spanish and one in English, if you can’t find the one
in your favorite language, please let me know). You can open the file with JASP or Jamovi,
but SPSS has an advantage because it allows you to see the information about the variables
in a more compact way.
Let’s suppose that you want to analyze if the zodiac sign (zodiac) influences the region in
which you live (region), what statistical test would you use?
a. Pearson’s correlation.
b. Pearson’s Chi-Square test.
c. Simple regression.
d. An analysis of variance test.
The two variables are categorical so, in this case, Pearson’s Chi-Square test is the
appropriate technique for testing the association between them.
- False: This alternative is for numerical variables and the two variables are
categorical.
- Correct: This would be the best.
- False: This alternative is for numerical variables and the two variables are
categorical.
- False: No. This alternative would not work
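To see what Pearson’s Chi-Square test compares, here is a minimal sketch that computes the statistic and its degrees of freedom from a table of counts. The counts are hypothetical (they are not taken from GSS93) and the function name is made up for this example; in practice JASP, Jamovi or SPSS will also give you the p-value.

```python
def chi_square_statistic(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows are two zodiac signs, columns are three regions.
counts = [[30, 25, 20],
          [28, 27, 22]]
chi2, df = chi_square_statistic(counts)
print(f"chi2 = {chi2:.3f}, df = {df}")
```

The larger the statistic for a given number of degrees of freedom, the further the observed counts are from what independence would predict.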
2. You have a file with data from a survey on social issues in the USA (GSS93) in the
course’s material. (There is a version in Spanish and one in English, if you can’t find the one
in your favorite language, please let me know). You can open the file with JASP or Jamovi,
but SPSS has an advantage because it allows you to see the information about the variables
in a more compact way.
Suppose that you want to analyze whether marital status (married, single, etc.) influences
the level of life satisfaction of the people (measured with a Likert scale). What statistical
technique would be appropriate to analyze the relationship between these two variables?
The marital status variable is categorical with several categories and the level of satisfaction
is ordinal. In this case, looking at the table in the theory chapter, the
Kruskal-Wallis test is the one usually recommended but, as we will see later, there is not
much difference in using a test for a numerical dependent variable, which would be the
analysis of variance, and using that instead of the Kruskal-Wallis test has some advantages.
For this reason, there are two alternatives that are acceptable, but the one that combines
both tests is the best.
- False: This alternative is not bad but there is a better one.
- False: This alternative is not bad but there is a better one.
- Correct: This would be the best.
- False: The Friedman test works for repeated measures (related samples), a concept that I
haven’t explained yet but that will come up later in the course.
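For the Kruskal-Wallis side of the answer, this is a rough sketch of how the H statistic is built from ranks, including the usual correction for ties (which matters a lot with Likert answers). The groups and their values are hypothetical and the function names are made up for this example; the p-value, which the software reports, is left out.

```python
from collections import Counter


def average_ranks(values):
    """Map each value to its rank (1..N), tied values sharing the mean rank."""
    ordered = sorted(values)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    counts = Counter(ordered)
    # Tied values occupy positions first..first+count-1; their mean is the rank.
    ranks = {v: first_position[v] + (counts[v] - 1) / 2 for v in counts}
    return ranks, counts


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H with tie correction (undefined if all values are equal)."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    rank_of, counts = average_ranks(pooled)
    sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * sum_term - 3 * (n_total + 1)
    ties = sum(t ** 3 - t for t in counts.values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    return h / correction


# Hypothetical satisfaction answers (1 to 5) for three marital status groups.
married = [4, 5, 3, 4, 5]
single = [3, 2, 4, 3, 3]
divorced = [2, 3, 1, 2, 3]
h = kruskal_wallis_h([married, single, divorced])
print(f"H = {h:.3f} with {3 - 1} degrees of freedom")
```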
3. You have a file with data from a survey on social issues in the USA (GSS93) in the
course’s material. (There is a version in Spanish and one in English, if you can’t find the one
in your favorite language, please let me know). You can open the file with JASP or Jamovi,
but SPSS has an advantage because it allows you to see the information about the variables
in a more compact way.
Let’s say that you want to analyze whether age makes people spend more hours watching
television. What statistical technique would be suitable for this goal?
We want to study how TV hours depend on age, so TV is the dependent variable (and the
other one is the independent variable).
The TV Hours variable is numerical, and so is the age of the respondents. In this case,
simple regression is the appropriate technique.
- False: No. This alternative would not work.
- False: This alternative would tell us the association, but we want to analyze the
dependency (although, to be honest, both techniques are closely linked).
- False: Multiple regression with one independent variable is simple regression after
all, but it is better to use the correct names.
- Correct: This would be the best
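As an illustration of what simple regression estimates, the sketch below fits a least-squares line predicting TV hours from age. The six respondents are hypothetical (they are not GSS93 rows) and the function name is made up for this example.

```python
import statistics


def simple_regression(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ages (independent variable) and daily TV hours (dependent variable).
age = [22, 35, 41, 56, 63, 70]
tv_hours = [2, 2, 3, 3, 4, 5]
slope, intercept = simple_regression(age, tv_hours)
print(f"predicted TV hours = {intercept:.2f} + {slope:.3f} * age")
```

The slope is read as the expected change in TV hours for each additional year of age, which is why the dependent variable has to be chosen before running the analysis.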
4. You have a file with data from a survey on social issues in the USA (GSS93) in the
course’s material. (There is a version in Spanish and one in English, if you can’t find the one
in your favorite language, please let me know). You can open the file with JASP or Jamovi,
but SPSS has an advantage because it allows you to see the information about the variables
in a more compact way.
Let’s suppose that you want to analyze whether the Academic level is related to the level of
satisfaction of people with their lives (life). What statistical technique would be appropriate to
analyze the relationship between these two variables?
a. Pearson correlation.
b. Simple regression.
c. Ordinal correlation (Spearman).
5. You have a file with data from a survey on social issues in the USA (GSS93) in the
course’s material. (There is a version in Spanish and one in English, if you can’t find the one
in your favorite language, please let me know). You can open the file with JASP or Jamovi,
but SPSS has an advantage because it allows you to see the information about the variables
in a more compact way.
Suppose that you want to test whether the average age of the respondents is 25 years. What
statistical test would be appropriate in this case?
a. Pearson’s correlation.
b. The two-sample t-test.
c. A binomial test.
d. The one-sample t-test.
We only have one sample on one variable so the proper technique is the one-sample t-test.
- False: This alternative would test the association between two variables and we only
have one.
- False: This alternative is not correct.
- False: No. This alternative would not work.
- Correct: This would be the correct one
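The one-sample t-test boils down to one number: how many standard errors the sample mean is away from the hypothesized value. A minimal sketch, with hypothetical ages and a made-up function name, could look like this (the p-value is left to the software).

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """Return the t statistic and its degrees of freedom."""
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (mean - hypothesized_mean) / standard_error, n - 1


# Hypothetical ages of eight respondents, tested against a mean of 25.
ages = [31, 45, 27, 52, 38, 24, 61, 33]
t, df = one_sample_t(ages, hypothesized_mean=25)
print(f"t = {t:.2f}, df = {df}")
```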
6. Congratulations, you have been hired as the person in charge of the students’ satisfaction
at a Valencian university. As a first step, you are planning to carry out a survey in which you
are going to ask a sample of 1000 students how satisfied they are with the vegetarian menu
in the cafeteria. Please, indicate which format would be the most suitable for this question.
a. An open question in which you would allow each student to answer verbally for a
maximum of 15 minutes, you will record what they say, and then you will listen to the
answers calmly afterward to draw your conclusions.
b. A numerical scale called a satisfaction-meter in which students must accurately
indicate their level of satisfaction on a scale of 1 to 100.
c. A series of questions using a Likert-type scale with five points of evaluation (from not
at all satisfied to very satisfied): For example, a question could be: how would you
rate the quality of the broccoli on the vegetarian menu from one to five?
d. A list of adjectives with different emotional content that you will then interpret
depending on whether the tone is positive or negative.
Unless your life is really boring and you think that listening to the recordings of 1000
students is the way to make it more interesting, or you want to develop a new psychological
theory about the relationship between words and food, I recommend that you use a Likert
scale. Using a numerical scale is not unreasonable, but do you think anyone can be that
accurate about broccoli?
- False: 15 minutes for each of a thousand students? By the time you have finished listening to
the recordings, the course will be over.
- False: This alternative is not unreasonable, but do you think that someone can be
very precise in something like this? It is better to ask with an ordinal scale.
- Correct: This is the most boring option of all, but it is the one that makes the most
sense.
- False: Inventing theories is what psychologists are most famous for, but that doesn’t
mean it’s a good idea to feed the flames.
7. In this link there is a compressed data file that has a series of files by country that come
from the European Social Survey and that you can use to write the report if you find
something that interests you. If you open any file and examine the variables you will see that
I have selected many that have to do with mood, life satisfaction, etc. In addition, starting
with one called ipcrtiv (Important to think of new ideas and being creative) you will find a
series of questions about how important the subject considers that value in their life. Those
questions correspond to the Schwartz values questionnaire, which is a well-known theory of
values that you may have seen in Social Psychology class but that is explained in many
places anyway.
By the way, if the file does not include anything that convinces you, you can see all the
topics covered in the survey at this link.
For this exercise I only ask you the following: In this document it is indicated that
questions about values number 3 (ipeqopt: Important that people are treated equally and
have equal opportunities), 8 (ipudrst: Important to understand different people) and 19
(impenv: Important to care for nature and environment) score on the universalism value. If
we calculate a sum of these three questions that indicates universalism and we want to see
how it depends on the age of the respondent, what statistical technique can we use?
When we add several questions from a Likert scale that are on the same scale, we can treat
it as if it were numerical. Since age is numerical we can calculate a Pearson correlation.
- Correct: Great!
- False: When we add several questions from a Likert scale that are on the same
scale, we can treat it as if it were numerical.
- False: When we add several questions from a Likert scale that are on the same
scale, we can treat it as if it were numerical.
- False: Well. Oops, this…
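The reasoning in the previous exercise (add the three items, treat the sum as numerical, correlate it with age) can be sketched as follows. The item names ipeqopt, ipudrst and impenv come from the exercise; the respondents, the column name "age" and the function name are hypothetical and only serve the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two lists of equal length."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical respondents; "age" is an assumed column name.
respondents = [
    {"age": 23, "ipeqopt": 2, "ipudrst": 1, "impenv": 2},
    {"age": 37, "ipeqopt": 1, "ipudrst": 2, "impenv": 1},
    {"age": 48, "ipeqopt": 3, "ipudrst": 2, "impenv": 2},
    {"age": 59, "ipeqopt": 2, "ipudrst": 3, "impenv": 3},
    {"age": 71, "ipeqopt": 4, "ipudrst": 3, "impenv": 2},
]
universalism_sum = [r["ipeqopt"] + r["ipudrst"] + r["impenv"] for r in respondents]
ages = [r["age"] for r in respondents]
r_value = pearson_r(ages, universalism_sum)
print(f"r = {r_value:.2f}")
```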
8. In this link, 2021-22 Estadística II Gr.B (36245) (uv.es), there is a compressed data file that
has a series of files by country that come from the European Social Survey (ESS) and that you
can use to write the report if you find something that interests you.
If you open any file and examine the variables, you will see that I have selected many that have
to do with mood, life satisfaction, etc. In addition, starting with one called ipcrtiv (Important to
think of new ideas and being creative) you will find a series of questions about how important
the subject considers that value in their life. Those questions correspond to the Schwartz
values questionnaire, which corresponds to a well-known theory of values that you may have
seen in Social Psychology class but that is explained in many places anyway.
By the way, if the file does not include anything that convinces you, you can see all the
topics covered in the survey at this link: Data and Documentation by Theme | European
Social Survey (ESS).
For this exercise I only ask you the following: In this document,
ESS_computing_human_values_scale.pdf (europeansocialsurvey.org), it is indicated that
questions about values number 3 (ipeqopt: Important that people are treated equally and
have equal opportunities), 8 (ipudrst: Important to understand different people) and 19
(impenv: Important to care for nature and environment) score on the universalism value.
How would we know if someone is considered high in that value? Note: In the previous
document there is a more complicated method to obtain these scores that we will not use for
now.
If you look at the response scale, you will see that answering 1 means feeling that you are
close to that value, while 5 means not feeling that you are as indicated. Adding the questions
will produce a score that summarizes the universalist value of people but higher scores
would mean lower universalism.
- False: Well, it’s not false, but you want to pass the course, not to go to Rome.
- False: Well. Yes but…
- False: The response categories are inverted, so high values indicate lower
universalism.
- Correct: Great!
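If reading "low sum means high universalism" feels awkward, one common option is to reverse each item before adding, so that higher scores mean more universalism. This is only a sketch: the scale limits follow the description above (1 to 5) but are parameters you should check against the data file, and the function name is made up for this example.

```python
ITEMS = ("ipeqopt", "ipudrst", "impenv")


def universalism_score(answers, scale_min=1, scale_max=5):
    """Sum the three items after reversing them, so higher = more universalism.

    scale_min and scale_max are assumptions taken from the text; verify them
    against the response scale in the actual file before using this.
    """
    return sum(scale_min + scale_max - answers[item] for item in ITEMS)


# Hypothetical respondents.
close_to_value = {"ipeqopt": 1, "ipudrst": 1, "impenv": 2}
far_from_value = {"ipeqopt": 5, "ipudrst": 4, "impenv": 5}
print(universalism_score(close_to_value), universalism_score(far_from_value))
```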
1. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Draw a bar chart of the life variable and select which alternative below is correct.
a. There must be many people that did not respond to this question.
b. The total number of responses is 400.
c. Many more people have exciting lives versus routine lives.
d. The number of people with Dull lives is higher than those with routine lives.
2. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Draw a boxplot of the income91 variable split by the life satisfaction categories to check if
there is something relevant about it.
a. There is absolutely no relationship between life and family income (but we need to
test for significance before claiming it).
b. It looks like people who rate their lives as less exciting are in families that have lower
incomes (but we need to test for significance before claiming it).
c. It looks like people who rate their lives as more exciting are in families that have lower
incomes (but we need to test for significance before claiming it).
d. The total number of responses is 400.
3. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Suppose that you have the theory that 50 years old is the limit to have an exciting life and
consequently you set the hypothesis that people who claim to have a Dull life are older than
those who have an Exciting life. Looking at the table below, do you think that the results
support it?
a. Looking at the second line, I see that I can reject the null hypothesis so the results do
support this theory.
b. Looking at the third line, I see that I cannot reject the null hypothesis so the results
do not support this theory.
c. Looking at the first line, I see that I can reject the null hypothesis so the results
support this theory.
d. Looking at the first line, I see that I can reject the null hypothesis so the results do not
support this theory.
The right line for this test is the second, because the study hypothesis
associated with our theory would be μDull − μExciting > 0 and consequently the null hypothesis
would be μDull − μExciting ≤ 0. In this case, the difference is 8 years and is significant, so
you can reject the null hypothesis and conclude that people who say they have a Dull
life are, on average, older than people who say they have an Exciting life.
- Correct: Yes, you are right!
- False:
- False:
- False
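To make the one-sided comparison of two groups concrete, here is a sketch of a Welch-type t statistic for μDull − μExciting together with an approximate one-sided p-value. Two things are assumptions of this example: the ages are hypothetical, and the p-value uses a normal approximation, which is only reasonable for large samples such as a full survey (the software uses the t distribution instead).

```python
import math
import statistics
from statistics import NormalDist


def welch_t(group_a, group_b):
    """t statistic for mean(group_a) - mean(group_b) without assuming equal variances."""
    mean_diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return mean_diff / standard_error


def one_sided_p_greater(t):
    """Approximate P(T >= t) using a standard normal (large samples only)."""
    return 1 - NormalDist().cdf(t)


# Hypothetical ages; the study hypothesis is that the Dull group is older.
dull_ages = [58, 63, 47, 71, 52, 66, 49, 60]
exciting_ages = [41, 36, 52, 44, 39, 57, 33, 48]
t = welch_t(dull_ages, exciting_ages)
print(f"t = {t:.2f}, approximate one-sided p = {one_sided_p_greater(t):.4f}")
```

Note that the order of the groups matters: swapping them changes the sign of t, and with it which one-sided line of the table answers the question.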
4. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Suppose that you have the theory that 50 years old is the limit to have an exciting life and
consequently you set the hypothesis that people who claim to have a Routine life are older
than those who have an Exciting life. Looking at the table below, do you think that the results
support it?
a. The differences are significant but the null hypothesis is not the right one.
b. The differences are significant, the null hypothesis is the right one but the effect size
is not high.
c. The differences are significant, the null hypothesis is the right one and the effect size
is very high.
d. The differences are significant, the study hypothesis is not the right one but the
effect size is not high.
The null hypothesis of the table is the right one according to the question and consequently
the study hypothesis too. The differences are significant but the d that measures effect size
is not very large (remember that, by Cohen’s usual benchmarks, a d of about 0.2 is regarded as small).
- False:
- Correct: Yes, you are right!
- False:
- False
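Since significance and effect size answer different questions, it may help to see how d is obtained. The sketch below computes Cohen’s d with a pooled standard deviation and labels it using the commonly cited benchmarks (0.2 small, 0.5 medium, 0.8 large). The ages are hypothetical, the function names are made up for this example, and the labels are conventions rather than strict rules.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    mean_diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    return mean_diff / math.sqrt(pooled_variance)


def effect_size_label(d):
    """Conventional reading of |d|; thresholds are rules of thumb."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical ages for the Routine and Exciting groups.
routine_ages = [44, 51, 39, 62, 47, 55, 36, 58]
exciting_ages = [41, 49, 38, 57, 45, 52, 35, 50]
d = cohens_d(routine_ages, exciting_ages)
print(f"d = {d:.2f} ({effect_size_label(d)})")
```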
5. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Suppose that you have the theory that people who have a Dull life have a very sad life in
general compared with people who have an Exciting life, and you want to show that this
holds in several aspects, namely, their income, family income, the number of hours they
watch television per day and the age at which they got married.
The differences are significant in all cases and they are in the direction that supports the
theory: people who label their lives as Dull have less income, live in families with less income,
watch more TV and get married a little bit younger (although you might dispute that this is bad,
of course). Besides, the effect sizes are large in general. Altogether, you might claim that
people who say that their lives are Dull may have a point.
- Correct: Yes, you are right
- False:
- False: Well…
- False
6. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Exploring the variable tv hours, you decide to test whether the average number of hours for the
people who say that their lives are Dull is five. Using the wondrous graphic that the teacher has
shown a couple of times in class, what could you say about the results of this test?
a. The effect size is very low so the results are not reliable.
b. The average number of hours of the sample is five because the red “thing” is just
over 5.
c. We cannot reject the null hypothesis that the average number of hours of the sample
is five. Also, there must be somebody who does not ever turn the TV off.
d. The sample size is 400
7. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
The variable looks asymmetric, with the peak on the right side of the data. This suggests that
there are fewer families earning low salaries than high salaries.
- False: I do not see that.
- False: The middle of the scale is ten but the bars there are not the tallest. More
families earn between 17 and 21.
- Correct: Yes, you are right!
- False: This answer was not right for the previous question and it is not for this one.
8. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Suppose that you have the theory that 50 years old is the limit to having an exciting life and
consequently that people who claim to have a Dull life are on average over 50 years old.
Looking at the table below, do you think that the results support it?
a. Looking at the first line, I see that I can reject the null hypothesis so the results do not
support this theory.
b. Looking at the second line, I see that I cannot reject the null hypothesis so the
results do not support this theory.
c. Looking at the first line, I see that I can reject the null hypothesis so the results
support this theory.
d. Looking at the third line, I see that I cannot reject the null hypothesis so the results
do not support this theory.
The right line for this test is the second, because the study hypothesis
associated with our theory would be H1: μ > 50 and consequently the null hypothesis would be
H0: μ ≤ 50. In this case, although the “Dull” people in the sample have an average age over 50,
you cannot reject the null hypothesis in any case (all the results are non-significant, so
this is the easiest part of this question).
- False:
- Correct: Yes, you are right!
- False:
- False
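The reason the table has three lines is that the same t statistic gives a different p-value for each alternative hypothesis. The sketch below shows this with a normal approximation (an assumption of this example; the software uses the t distribution, which gives similar values for large samples). The order of the three alternatives in the dictionary is assumed for illustration, and the t value is hypothetical.

```python
from statistics import NormalDist


def three_alternatives(t):
    """Approximate p-values for the two-sided and the two one-sided alternatives."""
    lower_tail = NormalDist().cdf(t)
    upper_tail = 1 - lower_tail
    return {
        "mean != value": 2 * min(lower_tail, upper_tail),
        "mean > value": upper_tail,
        "mean < value": lower_tail,
    }


# Hypothetical t statistic from a one-sample test against 50.
for alternative, p in three_alternatives(1.1).items():
    print(f"{alternative}: p = {p:.3f}")
```

The line to read is the one whose alternative matches the study hypothesis, which is why the hypothesis has to be fixed before looking at the results.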
9. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Let’s say that you have the theory that the people with Exciting lives are younger than those
with Dull lives
a. You are not sure of the differences, so your test should be bilateral.
b. The hypotheses should be unilateral because you believe that young people have
more exciting lives than older people (BTW, I beg to differ).
c. The total number of responses is 400.
d. There should be a hypothesis of the type null and another of the type alternative.
The hypothesis should be unilateral because you believe that young people have more
exciting lives than old people (but I do not).
- False: You may not be sure of the result, but you should be of your hypotheses.
- Correct: Yes, unilateral it is.
- False: Still not true.
- False: There will always be one null and one alternative hypothesis, but I am asking
specifically about this case.
10. The GSS93 is a survey we have used for previous exercises that you should be able to
find and open with your favorite data analysis software. One of the questions in this survey is
one called life that reads “Is your life Exciting or Dull?” and has three alternatives for
responding: Dull(1), Routine(2), and Exciting(3). The goal of this exercise is to explore this
variable and others related to it as the first step before performing statistical tests. This step
will be carried out with graphics that I assume you learned how to make in the first part of
the course, but if you do not know how or do not remember, let me know.
Suppose that you have the theory that 50 years old is the limit to have an exciting life and
you set the hypothesis that people who claim to have an Exciting life are on average under
50 years old. Looking at the table below, do you think that the results support your theory?
a. Looking at the second line, I see that I cannot reject the null hypothesis so the
results do not support this theory.
b. Looking at the second line, I see that I can reject the null hypothesis so the results do
not support this theory.
c. Looking at the third line, I see that I can reject the null hypothesis so the results do
support this theory.
d. Looking at the first line, I see that I can reject the null hypothesis so the results
support this theory.
The right line for this test is the third, because the study hypothesis associated
with our theory would be H1: μ < 50 and consequently the null hypothesis would be H0: μ ≥ 50. In
this case, you can reject the null hypothesis, which supports the claim that the average age of
people who say they have an Exciting life is under 50.
- False:
- False:
- Correct: Yes, you are right!
- False
1. In the Data section of the course, there is a data file called ESSEspanya.sav, which has
variables belonging to the European Social Survey referring to Spain. In it there are
variables related to the state of mind of the Spaniards, happiness, values, attitudes towards
social issues (gender, emigration, etc.), politics, and many others. That file should be a good
source for your course report.
Look at this news item and find out what percentage of young people between 15 and 19 years
old suffer from anxiety according to the WHO (if you do not know how to select those
younger than 19, it is time to ask). Check if that percentage is the same in Spain. For this, you
can use the question fltanx. I understand that if someone answers that question “Most of the
time” or “All, or almost all the time” then he/she is anxious. (NOTE: I have combined
these two responses into the “Most of the time” alternative in the data file.)
Check if the percentage of adolescents aged between 15 and 19 years who say in the link
that they are anxious coincides with the value that appears in the ESS for Spain.
a. The percentage of people aged 15-19 suffering anxiety in the article is 10% but that
of the sample is only 0.03, so the information is clearly wrong.
b. The percentage of people aged 15-19 suffering anxiety in the article is 4.6%, and in
the sample it is 3.4%. We cannot reject the null hypothesis of no difference.
c. There are only 3 subjects who claim to be anxious in the sample, so we cannot draw
valid conclusions.
d. The effect size is very large, so there must be differences between the two values.
As we can see in the table below, 3 out of 87 subjects under 19 years of age stated that they
had felt anxious in the previous week. That makes 3.4% of the total, which is close
to the WHO figure (4.6%). We cannot reject the null hypothesis, as p=0.8,
so our sample is compatible with the WHO’s value for the world.
- FALSE.
- TRUE. This is correct.
- FALSE.
- FALSE
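The comparison of 3 anxious respondents out of 87 against the WHO figure of 4.6% can be checked with an exact binomial test. The sketch below uses one common definition of the two-sided p-value (adding up the probability of every outcome that is no more likely than the observed one); the program used in the course may define it slightly differently, and the function names are made up for this example. With these numbers it should land close to the p = 0.8 quoted above.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def binomial_two_sided_p(successes, n, p_null):
    """Exact two-sided p-value: outcomes no more likely than the observed one."""
    observed = binomial_pmf(successes, n, p_null)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    return sum(
        prob
        for prob in (binomial_pmf(k, n, p_null) for k in range(n + 1))
        if prob <= observed * tolerance
    )


# Values from the exercise: 3 of 87 in the sample, 4.6% according to the WHO.
p_value = binomial_two_sided_p(successes=3, n=87, p_null=0.046)
print(f"sample proportion = {3 / 87:.3f}, two-sided p = {p_value:.2f}")
```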
2. One of the problems that the aliens have is that their cholesterol level goes up a lot when
they eat every day in the spaceship canteen because they can only eat processed food. To
solve this problem, a sample of aliens who had recently made interspatial trips in different
ships was put on a diet. This sample represented a random sample of all aliens doing this
type of trip. Blood samples were taken on days 2, 4, and 14 just after their trips to examine if
they exceeded the recommended maximum cholesterol level (which, interestingly enough,
matches the level for humans: 200 mgs/dL). Cholesterol was also measured before setting
off for the trip (CONTROL). In this case, our aim is to show that the cholesterol levels at the
CONTROL moment were correct (that is, to reject H0>=200) and the opposite in the rest of
the tests, that is, that their levels were too high. Cholesterol test results are found in the
course material (Alien Cholesterol).
Box plots give the impression that all variables are fairly
symmetric. There are some extreme values but they are
not extremely large and consequently we can be
confident in the results.
T-test table
For DAY2, DAY4, and DAY14, the alien sample’s cholesterol is above 200. Now, as our
questions are about the population from which those samples were drawn, we need
hypothesis tests to confirm that the differences are significant. The null hypothesis is that the
population mean is at or below 200 (H0<=200), but as can be seen, the sample mean is above
200 in all three cases. This suggests that the null hypothesis is not true,
since the probability of obtaining sample means like these if the null hypothesis were true
is very low (<0.05) in all the tests.
For CONTROL the result is different. First of all, the null hypothesis we test is that the
cholesterol level of the aliens is high. We would clearly reject that null hypothesis if the
sample’s mean were much lower than 200, but in this case, the result is not very conclusive.
The sample mean is 193.13, somewhat below 200 but not by much. Because the sample mean
is so close to 200, the p-value of the test is 0.051, which is quite close to the threshold that
we usually use (0.05) but does not fall below it. Furthermore, looking at the 95% confidence
interval, we can see that the cholesterol value of the alien population could even reach 200.05
according to this result. Thus, we cannot completely rule out that aliens as a whole have
slightly high cholesterol values even before space traveling, and it would therefore be a good
idea for all of them, and not just those who travel to space, to watch their diet a little bit more.
Finally, note that the cholesterol levels of those on a diet seem to go down progressively. It
looks like a diet free of processed foods has the desired effect. However, at this time it is not
possible to make a diagnosis of whether the change is really statistically significant. You will
learn about this type of analysis in other exercises of the course.
- This answer is FALSE since it is based on two-tailed hypothesis testing and the
statement of the exercise requires one-tailed testing.
- This answer is FALSE. The claim about strange values is overstated, although it is
true that the graph marks a few points as such. There are also
missing values but in this course we do not give them too much importance. In the
hypothesis tests, the cholesterol level on day 14 is still high (we reject H0<=200 with
p=0.022) and on the other days it is even worse.
- This answer is FALSE, although just barely: the cholesterol of the population
represented by the CONTROL sample borders on excessive
levels. Look at the solution of the exercise to see the explanation of the result in more
detail.
- This answer is FALSE. The results show that the aliens always had their cholesterol
too high, not too low, during the time they were dieting, and they were very close to
having it too high when they were measured for the CONTROL measure.
- This answer is CORRECT. The results show that the variables are sufficiently
symmetric and do not show very strange values. The results of the hypothesis tests
show that the alien population has high cholesterol on days 2, 4, and 14, although it
shows a decreasing trend.
3. In the Data section of the course, there is a data file called ESSEspanya.sav, which has
variables belonging to the European Social Survey referring to Spain. In it there are
variables related to the state of mind of the Spaniards, happiness, values, attitudes towards
social issues (gender, emigration, etc.), politics, and many others. That file should be a good
source for your course report.
Look at this piece of news and find out what percentage of young people below 19 years of
age suffer from depression. To make this calculation, keep in mind that you have to select
only adolescents (if you don’t know how to do it, it’s time to ask). I have selected those under
19 years old and I get 65. In that file there is a variable called fltdpr that corresponds to the
question: “Felt depressed, how often past week”. I understand that if someone answers that
question “Most of the time” or “All, or almost all the time” then he/she is depressed. NOTE: I
have combined the responses from the two previous responses into the “Most of the time”
alternative.
Check if the percentage of adolescents who say in the news that they are depressed in
Spain coincides with the value that appears in the ESS.
a. The percentage of depressed people in the article is 10% but that of the sample is
only 0.01, so it is very conspicuous that fewer persons are depressed in the sample
and, therefore the press should not be trusted at all.
b. Although the difference between the proportions is small, the effect size is very large,
so there must be differences between the two values.
c. There are only 6 subjects who claim to be depressed in the sample, so we cannot
draw valid conclusions.
d. The proportion of depressed people in the article is 0.1 and in the sample, it is 0.095.
Since the difference is very small, we cannot reject the null hypothesis that there is
no difference.
As we can see in the table below, 6 out of 63 subjects under 19 years of age stated that they
had feelings of depression in the previous week. That makes almost 10% (9.5% to be exact),
which almost completely coincides with the headline of the newspaper (1 out of 10).
Therefore, we cannot reject the null hypothesis, and we can say that our sample
coincides (as it should) with the estimate published in the newspaper. The relative risk is also
very close to one, which means that the proportion in the null hypothesis (symbolized by p0) is
very similar to the observed one (symbolized by p1).
- FALSE. This alternative mixes proportions with percentages. Do not forget that the
analyses are usually carried out with proportions, but percentages are often used to
communicate the results.
- FALSE. The size of the effect is measured by the relative risk. To that extent, a
relative risk of 1 means no effect. There is an effect when the relative risk is well above
or well below 1, and we do not have strict criteria about when that effect is large or
small (it depends on how good or bad the outcome we are talking about is).
- FALSE. I do not see why.
- TRUE. This statement is valid.
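The relative risk mentioned in the solution is just the ratio between the observed proportion (p1) and the proportion in the null hypothesis (p0). A minimal sketch using the figures quoted above; the function name is only illustrative.

```python
def relative_risk(observed_prop, reference_prop):
    """Ratio of the observed proportion (p1) to the reference proportion (p0)."""
    return observed_prop / reference_prop


p1 = 6 / 63   # depressed respondents under 19 in the sample
p0 = 0.10     # newspaper headline: 1 out of 10

rr = relative_risk(p1, p0)
print(f"p1 = {p1:.3f}, p0 = {p0:.3f}, relative risk = {rr:.2f}")
```

A value this close to 1 says that the observed proportion and the reference proportion are practically the same, which is what the solution means by "no effect".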
4. In the Data section of the course, there is a data file called ESSEspanya.sav, which has
variables belonging to the European Social Survey referring to Spain. In it there are
variables related to the state of mind of the Spaniards, happiness, values, attitudes towards
social issues (gender, emigration, etc.), politics, and many others. That file should be a good
source for your course report.
In this document from the WHO (Salud mental del adolescente (who.int)) there is a mention
of the percentage of depressed people between 15 and 19 years old. You can check if this
percentage is the same in the ESS for Spain (although we do not have 15-year-olds in our
sample, this is OK). There is a variable called fltdpr in that file that corresponds to the
question: “Felt depressed, how often past week”. I understand that if someone answers that
question “Most of the time” or “All, or almost all the time” then he or she is depressed.
NOTE: I have combined the responses from the two previous responses into the “Most of
the time” alternative. To make this calculation, keep in mind that
you have to select only those under 20 years of age (if you don’t know how to do it, it’s time
to ask).
Check if the percentage of depressed adolescents worldwide mentioned in the WHO document
coincides with the value that appears in the ESS.
a. The proportion of depressed people according to the WHO is 2.8% and in the sample it is
9.5%. The difference is not significant and Spain has a lower percentage of young
people suffering depression than mentioned by the WHO.
b. There are only 6 subjects who claim to be depressed in the sample, so we cannot
draw valid conclusions.
c. The proportion of depressed people in the article is 2.8% and in the sample it is
9.5%. The difference is not significant and in the Spanish sample the risk of
depression is similar to that indicated by the WHO.
d. The proportion of depressed people in the link is 2.8% and in the Spanish sample it is
9.5%. The difference is significant and in the Spanish sample the risk of depression
would be more than three times greater than that indicated by the WHO (ug!)
As we can see in the table below, 6 of 63 subjects under 19 years of age stated that they
had feelings of depression in the previous week. That makes practically 10% (9.5% to be
exact) which is quite different from what appears in the WHO link. In this case, we reject the
null hypothesis and see that the risk is much greater in the Spanish sample.
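To see why the same 6 out of 63 leads to rejection here, the following minimal sketch computes how likely it would be to observe 6 or more depressed respondents if the true proportion were the WHO value of 2.8%. It assumes an exact one-sided binomial calculation, which may differ from the test used in the course software; the function name is illustrative.

```python
from math import comb


def binom_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


depressed, n = 6, 63   # counts reported in the solution
p0 = 0.028             # WHO reference value (2.8%)

p_upper = binom_upper_tail(depressed, n, p0)
ratio = (depressed / n) / p0
print(f"P(X >= {depressed}) = {p_upper:.4f}, observed / reference = {ratio:.1f}")
```

With a reference value of 2.8% fewer than 2 cases would be expected among 63 respondents, so observing 6 is very unlikely under the null hypothesis, and the ratio between the two proportions is above three.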
5. For aliens, totamine is part of what is sometimes called the happiness quartet, mediating
feelings such as love, pleasure, and sexuality, though it may also have to do with addictions.
Low totamine levels can make aliens less likely to work for a purpose. After studying in depth
the work problems of a group of aliens who work mainly the night shift, psychologists wonder
if their average levels of totamine are altered due to their lifestyle. The average totamine level
of normal aliens is set at 10 mgs/dl.
Note that the purpose of this study is to test whether the night shift aliens can be considered
“normal” aliens or not: that is, whether they can be considered a random sample that could
have been drawn from a population of normal subjects. Would you say that is so?
a. The subjects that were captured are a biased sample that we can
say included those with less totamine.
b. The mean totamine of the subjects is 10.33 and a significance test
with H0:μ<=10 would not lead to rejecting this null hypothesis
(p=.190).
c. The mean of the sample is 10, but since there is a very high extreme value, we
should not trust that value since it may not generalize well to the population.
d. Since the sample only includes night shift subjects, this study cannot be performed.
e. The sample of subjects evaluated has a normal totamine since there are several
subjects who have less than 15.
T-test table
6. Returning to our planet, the data in this exercise correspond to one of the darkest
moments in human history. In 1945, in the city of Nuremberg, the first trial began against the
highest officials of the Nazi government (those who were captured alive, of course). This trial
had many ramifications, both legal and political, as such a trial posed challenges from the
point of view of international law… but it also aroused much interest among psychologists
struggling to explain behavior as malignant as that shown by those
leaders during that period. For this reason, two specialists in human behavior carried out
tests and interviews with these leaders to gain an in-depth understanding of their
psychological characteristics. A summary of these studies can be seen for example here but
if you do an internet search you will find many other sources.
The IQ_Nuremberg.sav file has the results of the IQ measurements carried out by two
specialists. In it you can see the names of the subjects and their level of intelligence (IQ). IQ
is a measure of general intelligence that can be measured with different tests and that,
despite all its caveats, is widely used in a variety of places. In general, an average IQ has a
value of 100, and the standard deviation is usually normalized to 15. Assuming that
intelligence follows the normal distribution, then 115 is high, 130 is very high, 145 is
extremely high, and 187 is Sheldon.
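To put those labels in perspective, here is a small sketch that computes the share of the population expected above each cut-off, assuming (as the text does) that IQ follows a normal distribution with mean 100 and standard deviation 15.

```python
from statistics import NormalDist

# Assumed population model from the text: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

for score in (115, 130, 145):
    share_above = 1 - iq.cdf(score)
    print(f"IQ above {score}: {share_above:.2%} of the population")
```

Each cut-off is one more standard deviation above the mean, so the share of people above it shrinks very quickly.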
One of the arguments used by some of the defendants in their defense was simply “playing
dumb” by saying that they did not know, that they did not suspect, that they had no
decision-making capacity, etc. One way to answer those arguments would be to study those
intelligence tests, but that raises some issues, and therefore not all answers in the list below
are equally correct, so do your analysis of the data in IQ_Nuremberg.sav and choose the
alternative or alternatives that seem most correct to you.
a. The mean intelligence of the subjects is 128 and a significance test with H0:μ<=100
would lead to rejecting the null hypothesis that they had low intelligence. However,
these results are based on considering that these data are a random sample drawn
from a population of Nazis with characteristics similar to those captured, which is a
questionable assumption.
b. The sample of subjects evaluated has a normal intelligence since there are several
subjects who have IQs lower than 115.
c. The mean of the sample is 128, but since there is a very low extreme value, we
should not trust that value since it may not generalize well to the population.
d. The subjects that were captured are a biased sample that we can say included the
least intelligent, since the most intelligent were surely able to escape and set up
secret organizations that are responsible for climate change and COVID.
The number of cases in the sample is 21. There is an extreme case with intelligence of
“only” 106 but it does not seem that this should affect the average too much. This can be
seen in the box plot below.
First of all, the sample mean is 128 and therefore it seems that the IQ they showed is quite
high. Considering them as “fools” does not seem to be right.
However, using inferential statistics in this case seems a bit complicated to justify since it is
not clear which population the sample represents. Does the sample refer to the entire
population of leading Nazis? Or only those who were initially selected to stand trial? Or since
there were many who died or escaped before, were those who were captured special in
some way? Keep in mind that all these considerations are very relevant when it comes to
assessing research that supports psychological theories: many times, the subjects to which
researchers or therapists have access are very limited (for example, patients with very
specific problems like Freud’s only treating aristocratic upper-class people) and therefore
their conclusions should be assessed within that context. For this reason, although studying
the type of Nazis who were tried in Nuremberg is not without interest, we must be cautious
and avoid making generalizations to populations that perhaps do not exist.
Anyway, below I have put the results of a hypothesis test with the null hypothesis that the
subjects were at or below average (H0:μ<=100). It is interesting to see that Cohen’s d is very
high, which shows that this group of subjects greatly exceeded the level of 100. I have also put
115 as the null hypothesis to test if they were more than one standard deviation above the mean,
and the results are still significant, as you will see. Finally, a test using 130 (two standard
deviations) gives non-significant results, which means that the data are compatible with a
population mean of 130, so we cannot claim that these subjects exceeded that very high level.
T-test table
- TRUE.
- FALSE. Just because there are some subjects below 115 does not mean that the
sample is normal as a whole.
- FALSE. The outlier value is not so large.
- FALSE. I can recommend you a good psychologist if you wish.
7. In the Data section of the course, there is a data file called ESSEspanya.sav, which has
variables belonging to the European Social Survey referring to Spain. In it there are
variables related to the state of mind of the Spaniards, happiness, values, attitudes towards
social issues (gender, emigration, etc.), politics, and many others. That file should be a good
source for your course report.
This article provides data on the percentage of people age 65 and older who reported having
chronic sleep problems. In the slprl question you can see similar information for Spain. I
understand that if someone answers that question “Most of the time” or “All, or almost all the
time” he or she has a chronic sleep problem. NOTE: I have combined the responses from
the two previous responses into the “Most of the time” alternative.
Check if the percentage of people over 65 years old who say in the article that they have
sleep problems coincides with the value that appears in the ESS. I hope you know how to
filter out the group of people over 65, but if you do not, let me know.
a. There are only 6 subjects who state that they have sleep problems in the sample, so
we cannot draw valid conclusions.
b. The size of the effect is small, so although the differences are significant, it must be
considered that they are not important.
c. The percentage of people who have sleep problems according to the article is 50%,
but in the sample, it is lower (22% rounded), and the differences are significant.
d. The percentage of people who have sleep problems according to the article is 50%
but in the sample, it is higher, although the differences are not significant.
As we see in the table below, the proportion of people who answered that they had sleep
problems in the sample was 22% compared to 50% in the article mentioned above. The
differences are significant and the size of the effect is quite large (the risk in the sample
would be more or less half of that indicated in the article).
8. This exercise uses some data about a race of aliens that, interestingly enough, have
many similarities to the beings that inhabit the planet Earth. So even though many of the
theories, hypotheses, and results mentioned might resemble what we have about human
beings, don’t trust your instincts as aliens may be different and therefore we need to do
statistical tests to make sure of that. Below is an example of the above.
For aliens, totamine is part of what is sometimes called the happiness quartet, mediating
feelings such as love, pleasure, and sexuality, though it may also have to do with addictions.
Low totamine levels can make aliens less likely to work for a purpose. An alien psychologist
believes that the inhabitants of the planet Terrum have totamine levels too low and that is
why their work performance is very low. To solve this problem, he has designed a treatment
based on meditation and concentration that, according to him, would increase the totamine
levels of the aliens, but before applying this method, he needs to demonstrate that the
problem actually exists, and for this he has measured the totamine of a sample of 10,000
subjects from that planet. The average totamine level of normal aliens is set to 10 mgs/dl.
Would you say that the aliens living on the planet Terrum have too low average totamine
levels from the results shown below?
The hypothesis test using the population value is below. Note that I have put all the possible
hypotheses that could be used in this case, so you must pay attention and choose the
appropriate one for the problem.
a. Mean totamine is NA mgs/dl. Using $H_0: \mu \geq 10$ we would reject the null hypothesis with
a significance value of < .001, which would mean that the mean totamine is indeed
too low. However, since we see that the sample is very large and that the difference
is actually very small (-0.11), it is appropriate to take a look at Cohen’s d to assess
this result. Since this value is very small, our conclusion should be that the difference,
although it exists, is too small to believe that it has consequences.
The mean totamine is NA mgs/dl. Using $H_0: \mu \geq 10$ we would reject the null hypothesis with a
significance value of < .001, which would mean that the mean totamine is indeed too low.
However, since we see that the sample is very large and that the difference is actually very
small (-0.11), it is appropriate to take a look at Cohen’s d to assess this result. Since that
value is very small, our conclusion should be that the difference, although it exists, is too
small to believe that it has consequences.
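The reason a tiny difference becomes significant with 10,000 subjects is that, in a one-sample t-test, the test statistic equals Cohen's d multiplied by the square root of the sample size. A minimal sketch of that relationship; the value of d used here is hypothetical and is not taken from the exercise output.

```python
from math import sqrt


def t_from_effect(d, n):
    """One-sample t statistic implied by Cohen's d and the sample size n,
    since t = (mean - mu0) / (sd / sqrt(n)) = d * sqrt(n)."""
    return d * sqrt(n)


d = -0.05   # hypothetical small standardized difference (NOT from the source)

for n in (50, 1_000, 10_000):
    print(f"n = {n:>6}: t = {t_from_effect(d, n):.2f}")
```

The same small effect gives a t value close to zero with 50 subjects and a very large one with 10,000, which is why the effect size has to be checked when the sample is very large.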
- TRUE. This statement is valid. Although there is a small difference that is significant
due to the large sample size, it is necessary to look at the effect indicators when the
sample is very large.
- FALSE. The mean is not correct if you look closely at the results.
- FALSE. The sample is very large. You have to look at the size of the effect to be able
to assess the result.
- FALSE. Although this answer is valid with respect to the hypothesis test, the effect
size part needs to be assessed: When the sample is very large, you have to look at
the effect size to be able to assess the result.
- FALSE. Although there are some extreme values, with a sample of that size there is
no need to worry that they will alter the result. The t-test is robust to this type of
deviation.
may also have to do with addictions. Low totamine levels can make aliens less likely to work
toward a goal. An alien psychologist believes that the inhabitants of the planet Terrum have
totamine levels too low and therefore their work performance is very low. To solve this
problem, he has designed a treatment based on meditation and concentration that,
according to him, would increase the totamine levels of the aliens, but before applying this
method, he needs to demonstrate that the problem actually exists, and for this he has
measured the totamine of a sample of 50 subjects of that planet. The average totamine level
of normal aliens is set to 10 mgs/dl.
T-test table
a. Mean totamine is 7.67 mgs/dl. Using $H_0: \mu \geq 10$ we would reject the null hypothesis
with a significance value of < .001, which would mean that the mean totamine is
indeed too low.
b. Mean totamine is 7.67 mgs/dl. Using $H_0: \mu \leq 10$ we see that we cannot reject the null
hypothesis.
c. The sample of aliens tested has a mean totamine of 17.67 which is clearly above
normal totamine levels. The conclusion is that they have too much totamine and they
should follow the recommended treatment.
d. The mean totamine of the aliens is NA but since there is a value highlighted in the
results, that value is not well estimated and we should not carry out the analysis.
e. The 95% confidence interval of the mean totamine of a supposed population from which the
sample would have been drawn is [7.4, 7.94]; this means that we are not sure that the
sample of aliens has less totamine than necessary.
The mean totamine is 7.67 mgs/dl. Using $H_0: \mu \geq 10$ we would reject the null hypothesis with a
significance value of < .001, which would mean that the mean totamine of this group of aliens
is indeed too low. The standardized effect size is also very high, which suggests that if the
only thing that differentiates this group of aliens from the rest of the aliens is the diet they
follow, they are really going to feel very sad about having such a low totamine.
- TRUE. This statement is valid.
- FALSE. The null hypothesis that we should reject is $H_0: \mu \geq 10$, not $H_0: \mu \leq 10$ as stated
in the alternative.
- FALSE. That average is not correct if you look closely at the results.
- FALSE. With 50 cases, the t-test is sufficiently robust if there are any extreme values,
which in this case may or may not be true.
- FALSE. The confidence interval for the mean does not include 10 so we are
confident that the subjects in the sample have less totamine than 10 overall.
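The confidence interval quoted in the alternatives also lets us recover, approximately, the numbers behind the t-test. The sketch below works backwards from the rounded values given in the exercise (mean 7.67, interval [7.4, 7.94], n = 50); the critical value 2.0096 for 49 degrees of freedom is an assumption taken from standard t tables, and the results are only as precise as the rounding allows.

```python
from math import sqrt

n = 50
mean, ci_low, ci_high = 7.67, 7.40, 7.94   # values quoted in the alternatives
mu0 = 10                                   # normal totamine level
t_crit = 2.0096   # assumed 97.5th percentile of Student's t with 49 df (tables)

half_width = (ci_high - ci_low) / 2   # half-width = t_crit * standard error
se = half_width / t_crit              # standard error of the mean
sd = se * sqrt(n)                     # implied sample standard deviation
t_stat = (mean - mu0) / se            # one-sample t statistic
cohens_d = (mean - mu0) / sd          # standardized effect size

print(f"SE = {se:.3f}, SD = {sd:.2f}, t = {t_stat:.1f}, d = {cohens_d:.2f}")
print("10 inside the interval:", ci_low <= mu0 <= ci_high)
```

Both the very negative t value and the fact that 10 falls well outside the interval point to the same conclusion as the answer key.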
10. In the GSS93sp.sav file there is a variable called zodiac that collects the zodiac sign of
the participants. Although my ignorance about astrology is total and my lack of interest in it is
so great that I haven’t even bothered to Google the subject, I believe that the distribution of
the signs of the zodiac is homogeneous so that in any random sample of people there
should be a similar percentage of people of the same zodiac sign. To check it, calculate a
binomial test using the corresponding test value.
a. The test value to use is 1/12=0.08. Using this value, it can be observed that there are
two signs of the zodiac that occur in a percentage significantly higher than expected
(Pisces and Leo) and another that occurs less than expected (Taurus).
b. The test value to use is 1/12=0.08. Using this value, it can be observed that there are
two signs of the zodiac that occur in a percentage significantly higher than expected
(Pisces and Leo) and another that occurs less than expected (Taurus), but I have a
feeling that this answer is not the correct one, although I don’t know why.
c. Looking at the size of the effect, it is straightforward to see that there are five signs
that occur in a percentage significantly higher than expected, and seven that occur
less.
Indeed, this question is tricky since it introduces a new concept that I have not presented in
class but that we will see on other occasions: When we make many comparisons with an
error level of 5% (it is called the error level because even if there were no differences
in the population there is still the possibility of erroneously rejecting the null hypothesis
5% of the time), the probability that we make such an error at least once goes up and
consequently the overall error level becomes higher than originally set.
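How much higher? A rough sketch, assuming for simplicity that the twelve tests are independent (they are not exactly so here, because the twelve signs share the same sample, so take the number only as an approximation):

```python
alpha = 0.05        # error level of each individual test
comparisons = 12    # one test per zodiac sign

# Probability of at least one false rejection across all the tests,
# under the simplifying assumption that the tests are independent.
fwer = 1 - (1 - alpha) ** comparisons
print(f"Probability of at least one false positive: {fwer:.2f}")
```

Under that assumption the chance of finding at least one "significant" sign by accident is close to one half, far above the 5% that was originally set.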
That effect is clearly seen in the table below. Despite the fact that in principle it is expected
that about 8% of the sample will be of a given sign, and, therefore there should be 124
people with each sign out of the total of 1500, we can observe that there are some signs
under/over that number. These variations are unsurprising since a sample will always
present variations on the ideal. Besides, some of the numbers exceed what is expected so
much that the differences appear as statistically significant.
One solution to this problem is to adjust the significance level based on the number of
comparisons (categories in this case). When the categories are few, this adjustment is not
very important, but, in this case, in which we have twelve signs, making this adjustment is
quite effective. There are several methods, but in this case I have used Holm’s method, which
is in the p.adj (adjusted p value) column. As you will see, none of the comparisons is
significant in that column, and, in fact, many of them give a value close to the maximum of
p=1.
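The adjustment itself is simple. Below is a minimal sketch of Holm's step-down method; the twelve p-values are made up for illustration and are NOT the ones in the zodiac table.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment: the smallest p-value is multiplied by m,
    the next by m - 1, and so on; adjusted values are capped at 1 and are
    never allowed to decrease as the raw p-values grow."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for twelve signs (illustrative only).
raw = [0.03, 0.20, 0.65, 0.04, 0.90, 0.45, 0.02, 0.75, 0.33, 0.58, 0.81, 0.12]
adjusted = holm_adjust(raw)

for p, p_adj in zip(raw, adjusted):
    print(f"p = {p:.2f}  ->  p.adj = {p_adj:.2f}")
```

In this made-up example three raw p-values are below 0.05, but after the adjustment none of them is, and most adjusted values are equal to 1, which is the same pattern described for the p.adj column.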
Zodiac signs
- FALSE. This alternative is not correct but it is true that until now I have not taught
how to do this analysis correctly.
- TRUE. This statement is valid. See the solution for an explanation.
- FALSE. Effect size is not used to test whether a difference is significant.
1. One study compared the cholesterol levels of two groups of people living in different parts
of a Central American country. One group lived in a rural setting and the other lived in the
city. The theory behind this comparison is that those who eat a more “natural” diet will tend
to have better health and consequently lower cholesterol levels. The data is on the website
in the file Cholesterol.sav. Note that the variable cholesterol appears as the logarithm of the
cholesterol value (using logarithms is a way to reduce the asymmetry of a variable that we
will not see in detail in class but that will occasionally appear in some examples). Apply the
appropriate statistical technique to compare the two groups and indicate whether the initial
hypothesis is true.
a. There are several outliers in the data so the differences, while significant, should not
be taken seriously.
b. The group that comes from the urban environment has higher cholesterol than the
rural group, also the effect size is quite large.
c. The average cholesterol of the two groups is similar, so the differences are not
significant.
d. The group that comes from the urban environment has higher cholesterol than the
rural group but the effect is very small since the difference is one-third of a point.
The two groups have similar variance and are of the same size as we see in the graphs. The
theory indicates that the cholesterol levels in the urban world will be higher than in the rural
world, and, indeed, the averages indicate that this is how it happens. The differences
between the means are significant and, furthermore, the effect size is very large. No doubt
the urban group would see some benefits from a changed diet.
T-test table
Note that the unstandardized effect size (that is, the difference between the means) may
seem small because it is expressed in logarithms, but it is actually larger than it seems. If you
apply the reverse transformation, 5.36 becomes 212.7249464 and 5.05 becomes 156.0224645.
The difference is 56.702482. Keep in mind that above 200 mg/dL doctors advise taking
medication, so in the city quite a few people would get this recommendation, as
you can see in the plot below.
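The back-transformation above can be checked in a couple of lines. This sketch assumes the variable was transformed with natural logarithms (the figures in the text are consistent with that), and the variable names are mine. Strictly speaking, the exponential of a mean of logarithms is a geometric mean, so these values describe typical cholesterol levels rather than arithmetic means.

```python
import math

# Group means on the logarithmic scale, as reported in the solution.
log_mean_urban = 5.36
log_mean_rural = 5.05

# Undo the natural logarithm to return to the original cholesterol scale.
urban_original = math.exp(log_mean_urban)
rural_original = math.exp(log_mean_rural)
difference = urban_original - rural_original

print(f"urban: {urban_original:.2f}")
print(f"rural: {rural_original:.2f}")
print(f"difference: {difference:.2f}")
```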
- FALSE. I don’t see many outliers and, besides, the t-test is robust with sample sizes
like the ones in this study.
- RIGHT. This statement is valid. See the solution for an explanation.
- FALSE. The differences are clearly significant. A look at the graph should have
shown this to you. Do not forget that cholesterol is measured on a logarithmic scale,
which leads to small values.
- FALSE. The difference may look small because of the logarithms, but if you convert
the values back to the original scale you will see that it is not: the difference is 56.702482.
2. Taking Ginkgo has been associated with cognitive improvements, and it is sold as a
“traditional” medicine (the quotes are because I refuse to believe that something that comes
out of a bottle can be called “traditional”). This experiment used a randomized
clinical trial to test whether this drug really has an effect.
The Memory variable is the difference in the recall of a series of elements before and after
the treatment, so both positive and negative values can appear.
T-test table
- FALSE. No. In this case the correct null hypothesis is H0:μ1−μ2=0, which is the one
that appears in the data table.
- TRUE. This statement is valid. See the solution for an explanation.
- TRUE. This statement is valid. See the solution for an explanation.
- FALSE. There are no outliers as far as I can see.
3. A series of physical and psychological tests was carried out in a hypothetical school to
improve the way in which certain topics are approached in the subject of sports. Specifically,
there was interest in physical performance and how students perceive it. Four variables
were collected in a sample of 100 12-year-old boys and girls.
BMI: Body Mass Index, a body mass indicator about which there is a lot of information on the
internet along with how to calculate it.
Body satisfaction: The results of a test in which questions about that topic are asked and
then added up.
Self-assessment of resistance: A questionnaire in which questions are asked about how
capable each one feels of enduring in effort and then added up.
PACER: A physical test that consists of going around a circuit for a given time. The
measurement is the number of turns.
It is a good idea to look at the statistical graphs before proceeding with the tests. There
are some outliers and some lack of homogeneity of variance, but in general neither is excessive.
The two variables in which the differences are significant are PACER, with a fairly high
standardized effect, and Resistencia (the self-assessment of resistance), with a somewhat
lower effect. In both cases, males have higher scores. In the other two variables the
differences are not significant.
T-test table
4. Congratulations, you have been hired as the personnel manager of a famous telephone
company and since the directors are interested in creating a system of incentives for workers
linked to productivity, they have decided to entrust you to design how these bonuses will be
distributed. In principle, the two fundamental criteria in which the company is interested are
the satisfaction of the customers with the service and the length of the calls. However, when
the proposal has been brought to the union representatives, they have found it inappropriate,
since it does not take into account the different working conditions, seniority, gender, the type
of call that workers receive, their qualification, if shifts vary, etc. In addition, they say that
using satisfaction as answered by those who have received the call is not a good indicator
and they demand that a more in-depth evaluation be performed based on the judgment of
experts assessing how well the phone calls are managed by the worker.
To study whether the unions are right, you decide to randomly sample 110 calls and evaluate
them anonymously by company experts to judge their quality (on a scale of 1 to 7).
Satisfaction with the call is collected by asking customers to provide a score of 1 to 10 when
ending the call. The duration of the call is recorded automatically (in minutes). The gender of
the client is deduced from his/her voice and the rest of the variables are information about
the workers you can obtain easily.
Your task is to decide which factors of those indicated by the unions affect the satisfaction
ratings, the quality scores, and the time of the call. From your statistical analysis, indicate if
there are arguments that support the thesis of the unions by demonstrating it with the
corresponding statistical results. The data is in the SatProducti.sav file. You have a link to
this data in section 9.3 of the course.
Analyze the effect of being a permanent or temporary worker in this case. Answer the
questions below after doing the analyses.
a. Temporary workers are more attentive and get better quality according to experts and
better satisfaction ratings from customers.
b. The results show that the resolution time and satisfaction according to the customer
ratings are greater in temporary workers than in permanent ones, but not in terms of
quality.
c. The satisfaction variable is highly asymmetric so no conclusions can be drawn from
the hypothesis tests.
d. The permanent workers are slower than the temporary workers, but they get better
satisfaction ratings. The experts do not find that either group has better
quality in the solutions they provide.
There are significant differences in time and satisfaction, but not in quality. The standardized
effect size is also quite large for the two variables with significant differences between
groups. The temporary workers are faster, but customers are less satisfied with them. The quality
according to the experts does not present differences between groups.
T-test table
- FALSE. Nope.
- FALSE. Nope.
- FALSE. There is some asymmetry in some variables but it is not a big deal.
- TRUE. This statement is valid. See the solution for an explanation.
5. Congratulations, you have been hired as the personnel manager of a famous telephone
company and since the directors are interested in creating a system of incentives for workers
linked to productivity, they have decided to entrust you to design how these bonuses will be
distributed. In principle, the two fundamental criteria in which the company is interested are
the satisfaction of the customers with the service and the length of the calls. However, when
the proposal has been brought to the union representatives, they have found it inappropriate,
since it does not take into account the different working conditions, seniority, gender, the type
of call that workers receive, their qualification, if shifts vary, etc. In addition, they say that
using satisfaction as answered by those who have received the call is not a good indicator
and they demand that a more in-depth evaluation be performed based on the judgment of
experts assessing how well the phone calls are managed by the worker.
Analyze the effect of gender of the workers in this case. Answer the questions below after
doing the analyses.
a. The results show that the resolution time and satisfaction according to the clients’
score are different between men and women. Women are rated higher than men.
b. The results show that the quality of care assessed by experts is different between
men and women. Women are rated higher than men.
c. The results show that there are no differences between men and women in any of
the three criteria.
d. The results show that there is a relationship between the quality evaluated by experts
and the satisfaction with the customer calls.
There are significant differences in quality but not in the other variables. The standardized
effect size is also quite large.
T-test table
- FALSE. Nope.
- TRUE. This statement is valid. See the solution for explanation.
- FALSE. Nope.
- FALSE. This alternative has nothing to do with the question, so even if it is true, which
I have not checked yet, it is beside the point.
6. Congratulations, you have been hired as the personnel manager of a famous telephone
company and since the directors are interested in creating a system of incentives for workers
linked to productivity, they have decided to entrust you to design how these bonuses will be
distributed. In principle, the two fundamental criteria in which the company is interested are
the satisfaction of the customers with the service and the length of the calls. However, when
the proposal has been brought to the union representatives, they have found it inappropriate,
since it does not take into account the different working conditions, seniority, gender, the type
of call that workers receive, their qualification, if shifts vary, etc. In addition, they say that
using satisfaction as answered by those who have received the call is not a good indicator
and they demand that a more in-depth evaluation be performed based on the judgment of
experts assessing how well the phone calls are managed by the worker.
To study whether the unions are right, you decide to randomly sample 110 calls and evaluate
them anonymously by company experts to judge their quality (on a scale of 1 to 7).
Satisfaction with the call is collected by asking customers to provide a score of 1 to 10 when
ending the call. The duration of the call is recorded automatically (in minutes). The gender of
the client is deduced from his/her voice and the rest of the variables are information about
the workers you can obtain easily.
Your task is to decide which factors of those indicated by the unions affect the satisfaction
ratings, the quality scores, and the time of the call. From your statistical analysis, indicate if
there are arguments that support the thesis of the unions by demonstrating it with the
corresponding statistical results. The data is in the SatProducti.sav file. You have a link to
this data in section 9.3 of the course.
Analyze the effect of having a university diploma (Licenciado2) in this case. Answer the
questions below after doing the analyses.
a. The results show that quality and satisfaction as rated by customers are not different
between those with a degree and those without.
b. The satisfaction produced by the graduates is greater than that of the non-graduates
and the size of the effect is quite good. The quality is also better and the
standardized effect is also quite good. There are no significant differences in
resolution time with a two-sided test.
c. The results show that the unions were wrong as qualification does not lead to
better work.
d. The results show that the graduates are better than the non-graduates in general and
that the effect is robust in all cases.
T-test table
- FALSE. Nope.
- TRUE. This statement is correct. See the solution for an explanation.
- FALSE. This alternative has nothing to do with the question, so even if it is true, which
has not been verified at the moment, it is beside the point.
- FALSE. In general they are quite good but not faster (which is an important point for
the company, of course).
7. One of the oldest psychological theories in history is the one that relates physical aspects
of people’s heads to their character, intelligence, or whatever. At the end of the 19th century,
phrenology became quite popular and more recently neuroscience has applied that same
idea by taking advantage of new technologies.
The following data test a specific hypothesis about the relationship between brain and
intelligence. The description of the experiment can be found here and, in summary, the data
allow us to test the hypothesis of whether being big-headed (pun intended) is related to
intelligence.
a. The results show that the height of the people is not related to their intelligence.
b. The results show that the brain size of the people is not related to their intelligence.
c. There are significant differences in intelligence and verbal intelligence.
d. There are no significant differences in intelligence and verbal intelligence.
The results show that there are no significant differences between genders in the variables
of intelligence and verbal intelligence, but there are in weight, height and brain size.
T-test table
1. An area of research and application in Psychology is the one related to perception through
our senses. Perception requires mental processing and many psychological disorders that
affect mental capacity alter perception. Likewise, having problems with perception and
sensation can have psychological effects of many kinds. As an example, it seems that
people’s personality can influence their perception of whether a hearing aid works well
(Personality, Hearing Problems, and Amplification Characteris... : Ear and Hearing
(lww.com))
Technological aids can be a lifesaver for many, but tuning them up is often not as easy as
desired.
Hearing-enhancing devices must be individually fitted. One way to check that a device is
working well for a patient is to have them listen to a recording with 25 words spoken clearly
but loudly. However, there are words that are easier to recognize than others, so it is
important that the lists have the same difficulty. Another problem is that hearing aids amplify
both the correct sound and background noise. Four lists were tested to see if they were
equally difficult to recognize when there is background noise. As having heard a list before
makes it easier to recognize the words, any time that adjustments are made to the
apparatus, the list must be changed to do the tests. In the experiment, 96 subjects with
normal hearing listened to lists of words in English to verify that they were of the same
difficulty. Each group of 24 subjects heard a different list.
Analyze the data and indicate what you would do with the word lists (there is more than
one possible solution).
The independent variable is the type of list, and the number of words recognized, a numeric
variable, is the dependent variable. The appropriate technique to see if there are differences
in recognition between lists is the analysis of variance.
The first step is to make a box plot to see if there are outliers, equality of variances,
asymmetry, etc.
The plot shows that list 1 seems to be easier than the other three since, on average, the
subjects who listened to it recognized more words. The second list is better than the third
and fourth, and the last two seem to be very similar.
It is interesting to see the descriptive statistics of the data. We see that the means follow the
order shown in the boxplots. The main part of each plot, the box, displays similar variability,
and both the standard deviation and the standard error in the descriptives table below confirm
this impression. The means of lists 3 and 4 are practically the same, the mean of list 2 is
somewhat higher, and the mean of list 1 is quite different from the others, with
7 more words recognized on average than for lists 3 and 4.
The plot shows that the differences are significant and what the distribution of F
and η² would be if the null hypothesis were true. The box plots are similar to the ones we have
seen before, but we also see a representation of the variability of the mean of all the data (the
red almond) and the position of the means of the lists. This plot illustrates that the lists are
further apart than might be expected by chance, but it’s best to refer to the value in the
analysis of variance table shown below to confirm it. The distributions of F and η² also tell
us the same thing: the observed values of F and η² are greater than 95% of the values of F or η²
that we could find at random if the differences between the groups were zero. This means that
there are indeed significant differences between the groups.
However, we still need to check the results in the tables, and we must also determine
between which groups the differences occur. Although it can be seen that both 1 and 2
could be different from the other two, we need to confirm this observation with hypothesis
tests.
The analysis of variance confirms that there are differences between the groups. This
appears both in the F test, which does not control for non-homogeneity of variance, and in
the Welch test, which corrects for non-homogeneity of variance. In this case, we can see
that both results are very similar.
The value of η² is 0.14. That means that, although there are differences, the effect size is
not very large.
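To make the link between the F test and η² concrete, here is a minimal sketch of a one-way analysis of variance computed by hand. It only illustrates the arithmetic: the scores are invented, not the real word-list data, and the function name is hypothetical. η² is the between-groups sum of squares divided by the total sum of squares, so it can be read as the proportion of variability explained by the lists.

```python
import statistics


def one_way_anova(groups):
    """Return F, eta squared, and the two degrees of freedom for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of each score around its own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_value, eta_squared, df_between, df_within


# Hypothetical numbers of words recognized for four lists (NOT the real data).
lists = [
    [34, 30, 36, 28, 33, 31],
    [30, 27, 32, 29, 26, 31],
    [25, 28, 24, 27, 22, 26],
    [26, 23, 27, 25, 24, 28],
]

f_value, eta_squared, df_between, df_within = one_way_anova(lists)
print(f"F({df_between}, {df_within}) = {f_value:.2f}, eta squared = {eta_squared:.2f}")
```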
Which group is different from the others is determined using the pairwise tests. We can
display the result on the boxplot to see it more clearly. The table shows that list 1 is different
from 3 and 4, but not from 2. List 2 is not different from any of the lists since it is in an
intermediate position. The boxplot shows only the differences that are significant.
IN SUMMARY, we could do two things: keep lists 2, 3, and 4 since there are no significant
differences between them, or, group 1 and 2 on the one hand since there are no differences
between them, and group 3 and 4 on the other.
OR ALSO, most likely in this case, we might exchange words from one list to another
seeking to make them more similar, although for this we would need a theory about which
words are the most difficult to understand (we would have to review scientific knowledge
about what makes some words more difficult than others).
2. One study tested the effectiveness of three drugs (labeled A, B, and C) in reducing headache
pain. High scores indicated greater pain. Twenty-seven patients, randomly
assigned to the drugs, took part. Subjects had to take the drug at the next migraine
attack and indicate their pain level 30 minutes later, from 1 to 10, where 1 is no pain and 10 is
extreme pain.
Drug A is the cheapest of all. B is second in price, and C is the most expensive.
What medication would you recommend that doctors in your autonomous community
prescribe taking into account the results that can be seen below?
The data is in the file DrogaPain.sav (Section 9.4 Pain and pills)
a. Medications are useless for removing the pain because they do not go to the root of the
problem, and if you hold out, the pain ends up going away by itself.
b. Drug A.
c. Drug C in first place and B in second place, as long as the money lasts.
d. Drug B in first place and C in second place, as long as the money lasts.
e. It is best to take one pill of each because this method never fails.
Looking at the box plots, drug A seems to work best. Drug B has an outlier. It could be
interesting to repeat the analysis without that subject to see what happens (I have done it but
have not shown it; I leave it as an individual exercise to verify that the results are still
similar).
The analysis of variance confirms that the differences are significant. The effect size
is large. It seems that the drugs really do have different effects.
The pairwise tests show that drug A produces lower levels of pain than the other two, and
that these differences are significant. The differences in the pain scale are two points
between drug A and the other two, which seems to be a fairly important difference.
3. The use of a new method of teaching mathematics in fourth grade is proposed in a school
district. There are fifteen schools, and you are interested in checking whether the schools have
different or similar levels before applying the teaching method. To verify this, a test was
given to 120 students (8 randomly selected students per school). Check whether the schools
have a similar level on that test.
There are some outliers in the boxplots. In addition, there are schools that have less
variability than others, so in this case it is especially interesting to use the formulas that
correct for the lack of equality of variances.
The analysis of variance shows that the differences between the schools are not significant.
Interestingly, the p-value calculated using Fisher’s method is different from Welch’s, but it
does not change the conclusions.
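For those curious about why the two results differ, here is a minimal sketch of Welch's version of the F statistic. Instead of pooling the variances, it weights each group by its size divided by its own variance, so groups with little variability count more. The data are invented (not the schools data) and the function name is hypothetical. Only the statistic and its degrees of freedom are computed; obtaining the p-value would require the F distribution from a statistics package.

```python
import statistics


def welch_anova(groups):
    """Welch's F statistic with its numerator and denominator degrees of freedom."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [statistics.mean(g) for g in groups]
    # Each group is weighted by n / variance: precise groups get more weight.
    weights = [n / statistics.variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight
    numerator = sum(w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_weight) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    denominator = 1 + 2 * (k - 2) * lam / (k ** 2 - 1)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return numerator / denominator, df1, df2


# Hypothetical test scores for three schools with clearly unequal variability.
schools = [
    [12, 14, 13, 15, 14, 13, 12, 15],
    [10, 18, 7, 20, 9, 17, 11, 16],
    [13, 13, 14, 12, 13, 14, 13, 12],
]

f_welch, df1, df2 = welch_anova(schools)
print(f"Welch F({df1}, {df2:.1f}) = {f_welch:.2f}")
```

Note that the denominator degrees of freedom are usually not a whole number, which is one quick way to recognize a Welch-corrected result in a table.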
Normally it would not be necessary to carry out pairwise tests when the analysis of variance
has given non-significant results, but I include them so that you can see that none of the
comparisons is significant (the table is very long).
- RIGHT. Bingo!
- FALSE: What a surprise, right? There is always one who is the best.
- FALSE: What a surprise, right? There is always one who is the worst.
- FALSE: That’s what they told Gates and they spent a lot of money for nothing.
- FALSE: I don’t know where you get that from.
4. In a study conducted at a university in the United States (Bauman and Jones, Purdue
University, cited in Moore and McCabe 1989), the effect of three study methods on improving
reading comprehension in children was studied. Participants were assessed on two
measures of reading comprehension before being trained on the methods (Pre1 and Pre2)
and three measures after being trained (Post1, Post2, Post3). In this case, we want to check
if the subjects in the groups that were going to be trained in each of the methods were
similar to each other before being trained with them, in order to be more certain of the effects
of the methods.
The three methods that were tested were called the Basal (Control group), the DRTA, and
the Strat. I have no idea what they consist of, but we will see which one works best
assuming that higher scores on the Pre1 and Pre2 measures are better than lower scores.
The exercise now is to test if Pre1 and Pre2 are similar in the three groups or not.
There are some outliers in the boxplots. In addition, there are groups that have less
variability than others, so in this case it is especially interesting to use the formulas that
correct for the lack of equality of variances.
The analysis of variance shows that the differences between the groups are not significant.
Interestingly, the p-value calculated using Fisher’s method is different from Welch’s, but it
does not change the conclusions.
Normally it would not be necessary to carry out pairwise tests when the analysis of variance
has given non-significant results, but I include them so that you can see that none of the
comparisons is significant (the table is very long).
- RIGHT. Bingo!
- FALSE: What a surprise, right? There is always one who is the best.
- FALSE: What a surprise, right? There is always one who is the worst.
- FALSE: That’s what they told Gates and they spent a lot of money for nothing.
- FALSE: I don’t know where you get that from.
5. There are three things in life…the song says…health, money and love. A popular theory
about happiness is that having these three aspects of life well covered is the best path to
happiness (and it is probably not far wrong).
This theory can be tested in Spain using data from the European Social Survey archive
(ESSEspanya.sav). In this survey there is a variable called “happy”, which holds the answer to
the question “How happy are you?” on a scale of 0 to 10. Although it is obviously an
ordinal variable, it is acceptable to analyze it as a numerical variable.
In this case I will analyze the variable marsts (Legal marital status) as an indicator of “love”,
or at least, as an indicator of sentimental relationship. There are five states in the database
and the hypothesis that could be tested is that those who are in a romantic relationship will
be happier than those whose romantic relationship has ended or failed. As for those who have
never been in any relationship…I don’t have a clear hypothesis drawn from my “theory”.
The data is in the file ESSEspanya.sav (Section 15 European Social Survey Spain).
a. It doesn’t matter if you’re married or single if you know how to manage yourself.
b. Singles are happier than married people.
c. Divorced people are the happiest.
d. Singles are happier than divorcees and widowers.
e. Being separated or divorced is worse than being married
The box plots show considerable similarity between the groups. Single and married people seem
to be a little above the others, although the differences do not stand out much.
The descriptives confirm that the means are quite similar, but we need to confirm it with the
significance tests. As a curiosity, there are very few legally married or separated people in the
sample and many who never married.
The analysis of variance shows that there are significant differences, although the
standardized effect size is very small. Since there are many cases in the survey, it is almost
inevitable that the value of p will be significant, so we will be better off paying attention to
the value of η² rather than to the value of p.
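One way to see why p is almost inevitably significant in a large survey: in a one-way analysis of variance, F can be written in terms of η² and the degrees of freedom, F = (η² / (1 − η²)) · (df_within / df_between). The sketch below keeps a small η² fixed and lets the sample size grow. The value η² = 0.01, the five groups, and the sample sizes are illustrative assumptions, not results from ESSEspanya.sav, and the function name is hypothetical.

```python
def f_from_eta_squared(eta_squared, n_groups, n_total):
    """F statistic implied by a given eta squared in a one-way ANOVA."""
    df_between = n_groups - 1
    df_within = n_total - n_groups
    return (eta_squared / (1 - eta_squared)) * (df_within / df_between)


# Same (small) effect size, increasingly large samples.
for n_total in (50, 500, 5000):
    f_value = f_from_eta_squared(0.01, n_groups=5, n_total=n_total)
    print(f"N = {n_total:>4}: F = {f_value:.2f}")
```

The effect size does not change across the three lines, but F keeps growing with N, and a larger F means a smaller p. That is why η² is the more informative number here.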
Pairwise comparisons show that the only significant differences are between those who are
divorced and those who are single, and between those who are widowed and those who are
single. Obviously, in the latter case there is an effect of age that should be taken into
account: for example, these analyses could be done by age group to compare the effect of
widowhood or other variables with people of a similar age who are still married. Similarly,
being single may be associated with happiness at certain ages and not at others.
Actually, as you can imagine, happiness is a somewhat more complicated matter than the
song suggests and therefore, in order to write your report on the subject, my suggestion is
that you document yourself more seriously than I have done in this case.
1. Nicotine binds to nicotinic receptors in the brain, increasing the release of numerous
neurotransmitters. Cotinine is a product of the body’s transformation of nicotine
that remains in the body for a long time, so it is used to measure exposure to nicotine, since
nicotine itself disappears more quickly from the body (that is why nicotine tests are usually
actually cotinine tests).
Nicotine may play a role in certain mental illnesses. For example, people with schizophrenia
have much higher rates of tobacco use than the normal population, and although the causes
of this are difficult to discern, there is some evidence of underlying biological factors.
The amount of nicotine that a smoker metabolizes can depend on their genetics and also
their diet. One factor that can affect it is menthol, which is added to certain cigarettes.
There are suspicions that menthol products can increase tobacco consumption, or make it
more addictive, so it is interesting to study their influence on the body.
In this study the effect of a mentholated drink on the amount of nicotine, cotinine, and their
ratio (nicotine/cotinine) in smokers was investigated. Subjects spent a week with three mint
drinks a day and then a week without any menthol (well, it’s a bit more complicated but you
can read the details in the article if you wish). The urine samples were analyzed and there
was interest in comparing whether there was a difference in the amount of nicotine, cotinine,
and the Nicotine/Cotinine ratio of the subjects.
a. Mint drinks lead to more nicotine, less cotinine, and a higher Nicotine/Cotinine ratio
than drinks without menthol.
b. As the samples are very large, you have to look at the size of the effect, which in this
case is negative, and therefore taking mint drinks does not affect the nicotine tests at
all.
c. Mint drinks lead to less nicotine, more cotinine, and a lower Nicotine/Cotinine ratio
than drinks without menthol.
d. Mint drinks do not affect the metabolism of nicotine.
e. Chewing gum is the best way to hide that you have smoked.
It is worth looking at how the data is organized. For analysis of paired, dependent,
or related measures (these names are equivalent), the data is more often than not organized
in columns, with one row per subject, as shown below.
To do the analysis, the first thing is to make a box plot to see if there are strange values,
equality of variances, asymmetry, etc. Note that each variable is plotted next to its pair since
we have three sets of variables.
The output below shows the descriptives for all the variables and the tests between the pairs
of variables. All comparisons are significant.
The cotinine level is higher in the no mint condition (which would mean that the body does
not metabolize nicotine as quickly when mint drinks have been taken as when they have not,
since the body converts nicotine into cotinine).
The level of nicotine is lower when you do not take mint drinks for the same reason that
cotinine is higher: without mint drinks, nicotine is metabolized faster.
The ratio between nicotine and cotinine is where the differences are largest (we see it in the
effect size which is 1.21, the largest of all). There is more nicotine relative to cotinine with
mint drinks than without them.
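A minimal sketch of the calculation behind a paired comparison like the ones above, assuming the data are stored in two columns with one row per subject. The function name paired_t and the values in ratio_mint and ratio_no_mint are made up for illustration and are not the study's data. The effect size here is Cohen's d computed on the differences, which may not be the exact variant reported in the output.

```python
import math
import statistics


def paired_t(first, second):
    """Return mean difference, t statistic, degrees of freedom and Cohen's d
    for two related measurements taken on the same subjects."""
    if len(first) != len(second):
        raise ValueError("paired data need one value per subject in each column")
    diffs = [a - b for a, b in zip(first, second)]  # one difference per subject
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)               # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))        # paired t statistic
    d = mean_diff / sd_diff                         # Cohen's d on the differences
    return mean_diff, t, n - 1, d


# Hypothetical values, one position per subject (NOT the study's data)
ratio_mint = [0.9, 1.4, 1.1, 1.6, 1.2]
ratio_no_mint = [0.6, 1.0, 1.0, 1.1, 0.8]

mean_diff, t, df, d = paired_t(ratio_mint, ratio_no_mint)
print(f"mean difference = {mean_diff:.2f}, t({df}) = {t:.2f}, d = {d:.2f}")
```

The p-value is left out because it needs the t distribution, which statistical software provides. A positive mean difference here means the first column (mint) is higher than the second.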
There are several types of therapies for anorexia, among which we will focus in this exercise
on family therapy.
In a study by Professor Brian Everitt* and described in Hand, DJ et al. (p. 229)** the
weights in kilograms of a group of young women who received three types of treatment for
anorexia were analyzed. Unfortunately, there is not much information about where these data
come from or the conditions of the study, apart from what has been said, but Professor B. Everitt
worked all his life at the Institute of Psychiatry at King’s College London and I suppose that
these data come from some study carried out in this center.
The file is in section 15 and is called Anorexia. The goal of this analysis is to compare the
weights of the patients before and after receiving family therapy.
The first step of the analysis is to make a box plot to check if there are strange values,
equality of variances, asymmetry, etc. In this case, it seems that the effect is important
although there are three or four patients who still have very low weight after the therapy.
A graph that is interesting to check is the so-called parallel coordinates graph. In this graph,
we link each patient with a line before and after the therapy to visualize the change.
In this plot, you can see that some of the subjects did not gain weight, but actually lost it, and
there are four particularly worrying cases that ended up with quite low weight. Although the
median weight after therapy is higher as we saw in the box plot, there are some cases where
it does not seem to work at all.
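The idea behind the parallel coordinates graph can be sketched as follows: each patient becomes one line running from the weight before therapy to the weight after it, so falling lines identify the patients who lost weight. The function name trajectories and the weights below are hypothetical and are not taken from the Anorexia file; any plotting tool can then draw the resulting line segments.

```python
def trajectories(before, after):
    """Pair each patient's two weights and label the direction of change."""
    lines = []
    for patient, (b, a) in enumerate(zip(before, after), start=1):
        if a > b:
            change = "gained"
        elif a < b:
            change = "lost"
        else:
            change = "stable"
        # Each line runs from (x=0, weight before) to (x=1, weight after)
        lines.append({"patient": patient, "points": [(0, b), (1, a)], "change": change})
    return lines


# Hypothetical weights in kilograms (NOT the Anorexia file)
before = [38.0, 37.5, 39.1, 36.4]
after = [42.3, 37.0, 43.0, 36.4]

lines = trajectories(before, after)
for line in lines:
    print(line["patient"], line["points"], line["change"])

# Patients whose line goes down, i.e. who lost weight
lost = [line["patient"] for line in lines if line["change"] == "lost"]
print("lost weight:", lost)
```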
After therapy, the patients were 3.27 kilograms heavier on average, and the differences were
significant. The effect is very large so we can be satisfied with the result.
positive. In this case, we reject the null hypothesis and also that the difference is on
the benefit side, so this statement is false.
- FALSE: See the solution for the explanation.
- RIGHT: This is correct. See the solution for the explanation.
There are several types of therapies for anorexia, among which we will focus in this exercise
on cognitive therapy.
In a study by Professor Brian Everitt* and described in Hand, DJ et al. (p. 229)** the
weights in kilograms of a group of young women who received three types of treatment for
anorexia were analyzed. Unfortunately, there is not much information about where these data
come from or the conditions of the study, apart from what has been said, but Professor B. Everitt
worked all his life at the Institute of Psychiatry at King’s College London and I suppose that
these data come from some study carried out in this center.
The file is in section 15 and is called Anorexia. The goal of this analysis is to compare the
weights of the patients before and after receiving cognitive therapy.
The first step is to make a box plot to see if there are strange values, equality of variances,
asymmetry, etc. In this case, it seems that the effect is not too large since the medians are
quite similar. In addition, there is a little more variance after therapy.
A graph that is interesting to check is the so-called parallel coordinates graph. In this graph,
we link each patient with a line before and after the therapy to visualize the change.
In this graph, it can be seen that several patients had a fairly notable increase in weight, but
many remained more or less stable and some even lost weight.
After therapy, the patients were 1.35 kilograms heavier on average, and the differences were
significant. The effect is moderate compared to the family therapy so it seems that family
therapy worked better.
There are several types of therapies for anorexia, among which we will focus in this exercise
on NO therapy.
In a study by Professor Brian Everitt* and described in Hand, DJ et al. (p. 229)** the
weights in kilograms of a group of young women who received three types of treatment for
anorexia were analyzed. Unfortunately, there is not much information about where these data
come from or the conditions of the study, apart from what has been said, but Professor B. Everitt
worked all his life at the Institute of Psychiatry at King’s College London and I suppose that
these data come from some study carried out in this center.
The file is in section 15 and is called Anorexia. Compare the weights of patients who did not
receive therapy (i.e., they were in the Control group) before and after the study period.
The first step is to make a box plot to see if there are strange values, equality of variances,
asymmetry, etc. In this case, it seems that the effect is null with similar weights in the
patients before and after.
A graph that is interesting to check is the so-called parallel coordinates graph. In this graph,
we link each patient with a line before and after the therapy to visualize the change.
Although the average effect is null, what we see in this graph is a large number of different
trajectories, with some subjects improving a lot and others getting much worse than they were
before.
At the end of the study, patients were 0.2 kilograms lighter on average in the sample and the
differences were not significant. The effect is null, so it seems that family and cognitive
therapy worked better than no therapy.
1. 12 aliens are given a list of 10 objects to memorize and are asked to repeat it in one
minute. The exercise is performed three times: in the morning, in the afternoon and then
again in the evening.
a. There are no significant differences between the times of the day in terms of memory.
b. There are differences, but they are only between morning and night.
c. The best time of day for intellectual work is in the morning, as everybody knows.
2. 10 aliens are offered a meditation course aimed at improving their reasoning abilities. This
is measured by a reasoning questionnaire that, as it is alien, we are not able to understand.
These measures are taken every two weeks until the sixth week (4 measures in total).
3. The aliens are quite sensitive to temperature so they often complain that when the
spacecraft has the thermostat turned down too low, “it’s hard for them to think.” An alien
psychologist wants to test this problem and has asked a random sample of aliens to perform
a series of mental tests under three conditions (hot, wet, and cold). The measurements are
taken with tests of similar difficulty and the subjects go through the three conditions in
different series to reduce the effect of the order in which the exercises are performed.
Contrary to what the aliens say, cold weather seems to be the best thing for them. Humid
weather on the other hand shows a lot of variability.
The results of the repeated measures ANOVA are shown below.
The results show that the differences are not significant, so there is no point in performing
post hoc tests.
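As a rough illustration of what a one-way repeated measures ANOVA computes, the sketch below partitions the total variability into conditions, subjects and error, and forms the F ratio from the condition and error mean squares. The function name rm_anova and the scores are hypothetical and are not the data of this exercise. The p-value and the sphericity check are left to statistical software.

```python
def rm_anova(rows):
    """One-way repeated measures ANOVA.
    rows: one list per subject, with one score per condition."""
    n = len(rows)       # number of subjects
    k = len(rows[0])    # number of conditions
    grand = sum(sum(r) for r in rows) / (n * k)
    cond_means = [sum(r[j] for r in rows) / n for j in range(k)]
    subj_means = [sum(r) / k for r in rows]

    ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_conditions - ss_subjects  # residual variability

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f = (ss_conditions / df_conditions) / (ss_error / df_error)
    return {
        "F": f,
        "df_conditions": df_conditions,
        "df_error": df_error,
        "partial_eta_squared": ss_conditions / (ss_conditions + ss_error),
    }


# Hypothetical scores: one row per subject, columns = hot, humid, cold
scores = [
    [5, 6, 7],
    [4, 6, 8],
    [6, 5, 7],
    [5, 7, 10],
]

result = rm_anova(scores)
print(f"F({result['df_conditions']}, {result['df_error']}) = {result['F']:.2f}")
print(f"partial eta squared = {result['partial_eta_squared']:.2f}")
```

Removing the subject variability from the error term is what distinguishes this analysis from an ANOVA for independent groups.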
4. The skin resistance of the aliens is associated with sweat, and sweat with states of
psychological activation. That is why this measure is used in Psychology to, for example,
teach subjects to control their anxiety. This study wanted to test whether five different types
of electrodes worked the same way or produced different measurements. The data is in the
Resistencia file (Resistencia.sav).
The results show that the differences are not significant, but it would be interesting to see if
they remain non-significant without subject number 15.
Without subject number 15 the differences are still not significant so, in principle, we cannot
say that the electrodes work differently. However, the effect size is medium, which suggests
that by increasing the sample size, the results could possibly become significant.
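A sketch of this kind of sensitivity check, assuming the data are keyed by subject number: the analysis is run once with everyone and once without the unusual subject, and the effect sizes are compared. The readings below are invented and are not the contents of Resistencia.sav. Partial eta squared is used here as the effect size, which is an assumption, since the output may report a different measure.

```python
def partial_eta_squared(rows):
    """Partial eta squared for a one-way repeated measures design.
    rows: one list of scores per subject, one score per condition."""
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    cond_means = [sum(r[j] for r in rows) / n for j in range(k)]
    subj_means = [sum(r) / k for r in rows]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_cond - ss_subj
    return ss_cond / (ss_cond + ss_error)


# Hypothetical readings for five electrode types (NOT Resistencia.sav)
data = {
    13: [12, 14, 11, 13, 12],
    14: [15, 13, 14, 16, 15],
    15: [40, 12, 38, 11, 13],  # subject with unusually extreme readings
    16: [10, 11, 12, 10, 11],
}

# Same data with subject number 15 excluded
without_15 = {sid: row for sid, row in data.items() if sid != 15}

eta_all = partial_eta_squared(list(data.values()))
eta_without = partial_eta_squared(list(without_15.values()))
print(f"all subjects: {eta_all:.2f}, without subject 15: {eta_without:.2f}")
```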
5. In a study of the effectiveness of training to improve the sexual attitudes and behavior of a
group of adolescent aliens, the frequency of unprotected sex was evaluated 6 months before
The variables that we will use are Pre (six months before), Post (six months after), FU6 (from
6 to 12 months), and FU12 (12 to 18 months).