
Chapter 9

Measurement and Scaling: Noncomparative Scaling Techniques


True/False Questions
1. Comparative techniques are comprised of continuous and itemized rating scales.
(False, moderate, page 255)
2. Non-comparative scales are often referred to as monadic scales.
(True, moderate, page 256)
3. Respondents using a non-comparative scale employ whatever rating standard seems
appropriate to them.
(True, moderate, page 256)
4. Burger King used the Perception Analyzer to measure responses to a series of slice-of-life commercials.
(False, moderate, page 257)
5. In an itemized rating scale, the respondents are provided with a scale that has a
number or brief description associated with each category.
(True, moderate, page 257)
6. Itemized rating scales are widely used in marketing research and form the basic
components of more complex scales.
(True, moderate, page 257)
7. Typically, each Likert scale item has seven response categories, ranging from
strongly disagree to strongly agree.
(False, moderate, page 258)
8. A total (summated) score can be calculated for each respondent by summing across
items.
(True, easy, page 258)
9. Profile analysis involves determining the average respondent ratings for each item.
(True, moderate, page 258)
10. The semantic differential scale is also referred to as a summated scale.
(False, moderate, page 258)
11. The semantic differential is a five-point rating scale with endpoints associated with
bipolar labels that have semantic meaning.
(False, moderate, page 259)

12. The Stapel scale is usually presented horizontally.
(False, easy, page 261)
13. An advantage of the Stapel scale is it can be administered over the telephone.
(True, difficult, page 261)
14. Of the three itemized rating scales considered, the semantic differential scale is used
the least.
(False, moderate, page 261)
15. The researcher must make four major decisions when constructing non-comparative
itemized rating scales.
(False, moderate, page 261)
16. The smaller the number of scale categories, the finer the discrimination among
stimulus objects that is possible.
(False, moderate, page 262)
17. When determining the number of scale categories to use in a non-comparative
itemized rating scale, the nature of the object is relevant.
(True, moderate, page 262)
18. When determining the number of scale categories to use in a non-comparative
itemized rating scale, if individual responses are of interest, or the data will be
analyzed by sophisticated statistical techniques, five or more scale categories may be
required.
(False, difficult, page 262)
19. The Likert scale is a balanced rating scale with an odd number of categories and a
neutral point.
(True, easy, page 262)
20. A forced rating scale forces the respondents to express an opinion because a no
opinion or no knowledge option is not provided.
(True, easy, page 263)
21. In situations where the respondents are expected to have no opinion, as opposed to
simply being reluctant to disclose it, the accuracy of data may be improved by a non-forced scale that includes a no opinion category.
(True, easy, page 263)
22. It has been found that providing a verbal description for each scale category
consistently improves the accuracy or reliability of the data.
(False, moderate, page 263)

23. Non-comparative itemized rating scales with weak adjectives as anchors (1=generally
disagree, 7=generally agree) result in less variable and more peaked response
distributions.
(False, difficult, page 263)
24. A construct is the theory being measured.
(False, moderate, page 263)
25. The scale development process is an iterative one.
(True, easy, page 265)
26. XO = XT + XS + XR represents the pure score model.
(False, difficult, page 266)
27. Reliability refers to the extent to which a scale produces valid results if repeated
measurements are made.
(False, moderate, page 267)
28. Systematic sources of error do not have an adverse impact on reliability because they
affect the measurement in a constant way and do not lead to inconsistency.
(True, moderate, page 267)
29. Reliability can be defined as the extent to which measures are free from random error,
XR.
(True, easy, page 267)
30. When assessing test-retest reliability, the higher the correlation coefficient
between the two measurements, the greater the reliability.
(True, moderate, page 267)
31. In alternative-forms reliability, the same respondents are measured at two different
times, usually one to three weeks apart, with a different scale form being administered
each time.
(False, difficult, page 268)
32. With alternative forms reliability, a low correlation may reflect either an unreliable
scale or nonequivalent forms.
(True, moderate, page 268)
33. An important property of coefficient alpha is that its value tends to increase with an
increase in the number of scale items.
(True, moderate, page 268)
34. Coefficient alpha assists in determining whether the averaging process used in
calculating coefficient beta is masking any inconsistent items.
(False, moderate, page 268)

35. Perfect validity requires that there be no measurement error; therefore XO = XT, XR = 0, and XS = 0.
(True, difficult, page 269)
36. Given its subjective nature, content validity alone is a sufficient measure of the
validity of a scale.
(False, moderate, page 269)
37. Construct validity is the simplest and easiest type of validity to establish.
(False, moderate, page 269)
38. Using several scale items to measure the characteristic of interest provides more
accurate measurement than a single-item scale.
(True, easy, page 271)
39. The semantic differential scale may be said to be pan-cultural or free of cultural bias.
(True, moderate, page 271)
40. The researcher can bias the scales by either biasing the wording of the statements
(Likert type scales), the scale descriptors, or other aspects of the scale.
(True, easy, page 272)
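
Note on Questions 8 and 9. The summated score and profile analysis described above can be sketched in a few lines of Python. This is a minimal illustration assuming a five-point Likert scale; the item names, the sample ratings, and the reverse-coding of a negatively worded item are assumptions made for the example and do not come from the chapter.

```python
from statistics import mean

def summated_score(ratings, reverse_items=(), n_categories=5):
    """Sum one respondent's ratings, reverse-coding the listed items."""
    total = 0
    for item, rating in ratings.items():
        if item in reverse_items:
            # Assumption: a negatively worded item is flipped (1 <-> 5, 2 <-> 4).
            rating = (n_categories + 1) - rating
        total += rating
    return total

def item_profile(respondents):
    """Profile analysis: mean rating on each item across respondents."""
    items = respondents[0].keys()
    return {item: mean(r[item] for r in respondents) for item in items}

# Hypothetical data: two respondents rating two statements.
sample = [
    {"poor_service": 2, "like_to_shop": 2},
    {"poor_service": 4, "like_to_shop": 4},
]
scores = [summated_score(r, reverse_items={"poor_service"}) for r in sample]
profile = item_profile(sample)
```

Reverse-coding keeps a high total consistently favorable when some statements are worded negatively; whether an item needs it is a judgment about the wording, not something the code can detect.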
Multiple Choice Questions
41. In a _____, respondents rate the objects by placing a mark at the appropriate position
on a line that runs from one extreme of the criterion variable to the other.
a. semantic differential scale
b. Likert scale
c. continuous rating scale
d. Stapel scale
(c, easy, page 256)

42. How would you rate Sears as a department store?
Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - Probably the best

Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - Probably the best
0 10 20 30 40 50 60 70 80 90 100
The above scales are all examples of a _____.
a. continuous rating scale
b. Stapel scale
c. semantic differential scale
d. Likert scale
(a, moderate, page 256)
43. Scores assigned to continuous rating scales by the researcher are typically treated as
_____ data.
a. nominal
b. ordinal
c. ratio
d. interval
(d, moderate, page 256)
44. Which of the following statements does not pertain to non-comparative scales?
a. Non-comparative scales are often referred to as monadic scales.
b. Respondents using a non-comparative scale employ whatever rating standard
seems appropriate.
c. Data must be interpreted in relative terms and have only ordinal or rank order
properties.
d. Non-comparative techniques consist of continuous and itemized rating scales.
(c, moderate, page 256)
45. Which non-comparative scale has the advantage of being easy to construct and the
disadvantage of cumbersome scoring unless the scoring is computerized (Table 9.1)?
a. semantic differential scale
b. Likert scale
c. continuous rating scale
d. Stapel scale
(c, difficult, page 257)

46. The Perception Analyzer utilizes a _____.
a. continuous rating scale
b. Stapel scale
c. semantic differential scale
d. Likert scale
(a, moderate, page 257)
47. Which scale is not an itemized rating scale?
a. Stapel scale
b. semantic differential scale
c. Likert scale
d. continuous rating scale
(d, easy, page 257)
48. A _____ is a measurement scale with five response categories ranging from strongly
disagree to strongly agree, which requires the respondents to indicate a degree of
agreement or disagreement with each of a series of statements related to the stimulus
objects.
a. semantic differential scale
b. Likert scale
c. continuous rating scale
d. Stapel scale
(b, easy, page 258)
49.
                                      Strongly              Neither agree              Strongly
                                      disagree   Disagree   nor disagree     Agree     agree
1. Sears has poor in-store service.      1          2X           3             4         5
2. I like to shop at Sears.              1          2X           3             4         5
The above scale is an example of a _____.
a. continuous rating scale
b. Stapel scale
c. semantic differential scale
d. Likert scale
(d, moderate, page 258)

50. Which non-comparative scale is analyzed using profile analysis?
a. Likert scale
b. semantic differential scale
c. Stapel scale
d. all of the above
(d, difficult, pages 258, 260, 261)

51. Which itemized rating scale takes longer to complete than other itemized rating scales
because respondents have to read each statement?
a. semantic differential scale
b. Likert scale
c. continuous rating scale
d. Stapel scale
(b, difficult, page 259)
52. A _____ is a seven point rating scale with endpoints associated with bi-polar labels
that have semantic meaning.
a. semantic differential scale
b. Likert scale
c. continuous rating scale
d. Stapel scale
(a, easy, page 259)
53. Sears is:
Powerful ::::-X-::: Weak
Unreliable :::::-X-:: Reliable
The above scale is an example of a _____ scale.
a. continuous rating
b. Stapel
c. semantic differential
d. Likert
(c, easy, page 260)
54. The _____ is known for its versatility and is very popular with marketing researchers.
a. continuous rating scale
b. Stapel scale
c. semantic differential scale
d. Likert scale
(c, difficult, page 260)
55. Which non-comparative scale is widely used in comparing brand, product, and
company images?
a. semantic differential scale
b. Likert scale
c. continuous rating scale
d. Stapel scale
(a, moderate, page 260)

56. A _____ is a scale for measuring attitudes that consists of a single adjective in the
middle of an even-numbered range of values, from -5 to +5, without a neutral point
(zero).
a. semantic differential scale
b. Likert scale
c. continuous rating scale
d. Stapel scale
(d, moderate, page 261)
57. Which scale asks the respondent to indicate how accurately or inaccurately each term
describes the object by selecting an appropriate numerical response category?
a. continuous rating scale
b. Stapel scale
c. semantic differential scale
d. Likert scale
(b, difficult, page 261)
58. The data obtained by using a Stapel scale can be analyzed in the same way as a _____.
a. continuous rating scale
b. Stapel scale
c. semantic differential scale
d. Likert scale
(c, difficult, page 261)
59. The _____ is confusing and difficult to apply. It is the least used of the itemized
scales.
a. continuous rating scale
b. Stapel scale
c. semantic differential scale
d. Likert scale
(b, difficult, page 261)
60. Which of the following statements is not a consideration when making non-comparative itemized rating scale decisions?
a. the number of scale categories to use
b. forced versus non-forced choice
c. balanced versus unbalanced scales
d. all are considerations
(d, moderate, page 261)

61. Which statement is not true when deciding the number of scale categories to use in a non-comparative itemized rating scale?
a. Traditional guidelines suggest that the appropriate number of categories should be
seven plus or minus two: between five and nine.
b. The smaller the number of scale categories, the finer the discrimination among
stimulus objects that is possible.
c. If the respondents are not very knowledgeable or involved with the task, fewer
categories should be used.
d. How the data are to be analyzed and used should also influence the number of
categories.
(b, difficult, page 262)
62. Which statement is not true when deciding on whether to use balanced or unbalanced
scales when developing a non-comparative itemized rating scale?
a. The scale should be balanced to obtain objective data.
b. In a balanced scale, the numbers of favorable and unfavorable categories are equal.
c. If the distribution of responses is likely to be skewed, either positively or
negatively, a balanced scale with more categories in the direction of skewness
may be appropriate.
d. If an unbalanced scale is used, the nature and degree of unbalance in the scale
should be taken into account in data analysis.
(c, difficult, page 262)
63. Which statement is not true when deciding on whether to use an odd or even number
of categories when developing a non-comparative itemized rating scale?
a. With an odd number of categories, the middle scale position is generally
designated neutral or impartial.
b. The decision to use an odd or even number of categories depends on whether
some of the respondents may be neutral on the response being measured.
c. A rating scale with an even number of categories should be used if the researcher
wants to force a response.
d. All of the above statements are true.
(d, easy, pages 262-263)
64. Deciding whether to present scales as vertical or horizontal is related to which of the
non-comparative itemized rating scale decisions?
a. number of scale categories
b. physical form or configuration
c. odd or even number of categories
d. nature and degree of verbal description
(b, easy, page 263)

65. _____ is the first step in developing a multi-item scale. _____ is the last step.
a. Generate an initial pool of items; Prepare a final scale
b. Develop a theory; Prepare a final scale
c. Develop a theory; Develop a purified scale
d. Generate an initial pool of items; Develop a purified scale
(b, moderate, page 265)
66. Validity can be assessed by examining all of the following except:
a. item validity
b. content validity
c. criterion validity
d. construct validity
(a, moderate, page 266)
67. Which of the following is not an approach to assess multi-item scale reliability?
a. test-retest reliability
b. construct reliability
c. alternative forms reliability
d. internal consistency reliability
(b, moderate, page 266)
68. _____ is the variation in the information sought by the researcher and the information
generated by the measurement process employed.
a. Systematic error
b. Measurement error
c. Random error
d. Variable error
(b, difficult, page 266)
69. XO = XT + XS + XR
In the true score model shown above, XT represents:
a. random error
b. the observed score or measurement
c. the true score of the characteristic
d. systematic error
(c, moderate, page 266)

70. Situational factors, such as the presence of other people, noise, and distractions, and
mechanical factors, such as poor printing, overcrowding of items in the questionnaire,
and poor design are both _____ in measurement.
a. random error
b. potential sources of reliability
c. potential sources of error
d. systematic error
(c, difficult, page 267)

71. _____ represents stable factors that affect the observed score in the same way each
time the measurement is made, such as mechanical factors (see Fig. 9.6).
a. Systematic error
b. Measurement error
c. Random error
d. Variable error
(a, moderate, page 267)
72. _____ is not constant. It represents transient factors that affect the observed score in
different ways each time the measurement is made, such as transient personal or
situational factors.
a. Systematic error
b. Measurement error
c. Random error
d. Variable error
(c, moderate, page 267)
73. _____ is the extent to which a scale produces consistent results if repeated
measurements are made on the characteristic.
a. validity
b. generalizability
c. reliability
d. none of the above
(c, difficult, page 267)
74. A measure is perfectly reliable if:
a. XO = 0
b. XT = 0
c. XS = 0
d. XR = 0
(d, moderate, page 267)
75. _____ is an approach for assessing reliability in which respondents are administered
identical sets of scale items at two different times under as nearly equivalent
conditions as possible.
a. Internal consistency reliability
b. Split-half reliability
c. Test-retest reliability
d. Alternative-forms reliability
(c, moderate, page 267)

76. There are several problems associated with the test-retest approach to determining
reliability. If measuring respondents' attitudes toward low-fat milk may cause them to
become more health conscious and develop a more positive attitude toward low-fat
milk, then there is a problem with:
a. the time interval between testing.
b. the initial measurement altering the characteristic being measured.
c. it being impossible to make repeated measurements.
d. the first measurement having a carryover effect to the second or subsequent
measurements.
(b, difficult, page 267)
77. _____ is an approach for assessing reliability that requires two equivalent forms of
the scale to be constructed and then the same respondents are measured at two
different times.
a. Internal consistency reliability
b. Split-half reliability
c. Test-retest reliability
d. Alternative-forms reliability
(d, moderate, page 268)
78. Which of the following is not a problem with alternative-forms reliability?
a. The results will depend on how the scale items are split.
b. It is time consuming and expensive to construct an equivalent form of the scale.
c. It is difficult to construct two equivalent forms of a scale.
d. Both b and c are correct.
(a, difficult, page 268)
79. _____ is an approach for assessing the internal consistency of the set of items when
several items are summated in order to form a total score for the scale.
a. Internal consistency reliability
b. Split-half reliability
c. Test-retest reliability
d. Alternative-forms reliability
(a, easy, page 268)
80. _____ is a form of internal consistency reliability in which the items constituting the
scale are divided into two halves and the resulting half scores are correlated.
a. Internal consistency reliability
b. Split-half reliability
c. Test-retest reliability
d. Alternative-forms reliability
(b, easy, page 268)

81. _____ is a measure of internal consistency reliability that is the average of all possible
split-half coefficients resulting from different splittings of the scale items.
a. Coefficient delta
b. Coefficient alpha
c. Coefficient beta
d. Coefficient eta
(b, moderate, page 268)
82. _____ is the extent to which differences in observed scale scores reflect true
differences among objects on the characteristics being measured, rather than
systematic or random errors.
a. Validity
b. Generalizability
c. Reliability
d. None of the above
(a, difficult, page 269)
83. _____ is a type of validity, sometimes called face validity, that consists of a subjective
but systematic evaluation of the representativeness of the content of a scale for the
measuring task at hand.
a. Construct validity
b. Content validity
c. Criterion validity
d. Internal consistency validity
(b, difficult, page 269)
84. A scale designed to measure store image would be considered inadequate if it omitted
any of the major dimensions (quality, variety, assortment of merchandise, etc.). This
inadequacy would be reflected in the _____ of the scale.
a. construct validity
b. content validity
c. criterion validity
d. internal consistency validity
(b, difficult, page 269)
85. _____ is a type of validity that examines whether the measurement scale performs as
expected in relation to other variables selected as meaningful criteria.
a. Construct validity
b. Content validity
c. Criterion validity
d. Internal consistency validity
(c, difficult, page 269)

86. _____ is assessed when the data on the scale being evaluated and on the criterion
variables are collected at the same time.
a. Convergent validity
b. Predictive validity
c. Concurrent validity
d. Discriminant validity
(c, moderate, page 269)
87. _____ is a type of validity that addresses the question of what construct or
characteristic the scale is measuring. An attempt is made to answer theoretical
questions of why a scale works and what deductions can be made concerning the
theory underlying the scale.
a. Construct validity
b. Content validity
c. Criterion validity
d. Internal consistency validity
(a, easy, page 269)
88. _____ is a measure of construct validity that measures the extent to which the scale
correlates positively with other measures of the same construct.
a. Convergent validity
b. Discriminant validity
c. Nomological validity
d. Concurrent validity
(a, difficult, page 269)
89. _____ is a type of construct validity that assesses the extent to which a measure does
not correlate with other constructs from which it is supposed to differ.
a. Convergent validity
b. Discriminant validity
c. Nomological validity
d. Concurrent validity
(b, difficult, page 269)
90. _____ is the extent to which the scale correlates in theoretically predicted ways with
measures of different but related constructs.
a. Convergent validity
b. Discriminant validity
c. Nomological validity
d. Concurrent validity
(c, difficult, page 269)

91. Which statement is not true regarding the relationship between reliability and
validity?
a. If a measure is perfectly valid, it is also perfectly reliable.
b. Unreliability implies invalidity.
c. If a measure is perfectly reliable, it is perfectly valid.
d. Reliability is a necessary, but not sufficient, condition for validity.
(c, moderate, page 270)
92. _____ is the degree to which a study based on a sample applies to a universe of
generalizations.
a. Validity
b. Generalizability
c. Reliability
d. None of the above
(b, easy, page 270)
93. Which statement about generalizability is not true?
a. The set of all conditions of measurement over which the investigator wishes to
generalize is the universe of generalizations.
b. In generalizability studies, measurement procedures are designed to investigate
the universes of interest by sampling conditions of measurement from each of
them.
c. To generalize to other universes, facet theory procedures must be employed.
d. Traditional reliability methods can be viewed as single-facet generalizability
studies.
(c, difficult, page 270)
94. When choosing a scaling technique, which of the following factors should be
considered?
a. the capabilities of the respondents
b. the characteristics of the stimulus objects
c. the method of administration
d. all of the above
(d, easy, page 270)

95. When developing scales for international research, the researcher must pay special
attention to details that can make the measurement instrument specific to the country
in which the instrument will be used. Which of the following should be of concern to
the marketing researcher when developing scales for international research?
a. Special attention should be devoted to determining equivalent verbal descriptors
in different languages and cultures.
b. Scale endpoints and the verbal descriptors should be employed in a manner that is
consistent with the culture.
c. It is critical to establish the equivalence of scales and measures used to obtain data
from different countries.
d. All of the above are correct.
(d, moderate, page 272)
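
Note on Questions 75 and 80. Test-retest and split-half reliability both reduce to a correlation between two sets of scores. The sketch below is one possible illustration, not a prescribed procedure: the helper names, the sample data, and the odd/even split of items are assumptions chosen for the example. As Question 78 points out, a split-half result depends on how the items are split.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def retest_reliability(first, second):
    """Correlate total scores from two administrations of the same scale."""
    return pearson(first, second)

def split_half_reliability(item_matrix):
    """Correlate half scores; rows are respondents, columns are items.

    Assumption: items in positions 1, 3, ... form one half and items in
    positions 2, 4, ... form the other. Other splits give other values.
    """
    first_half = [sum(row[0::2]) for row in item_matrix]
    second_half = [sum(row[1::2]) for row in item_matrix]
    return pearson(first_half, second_half)

# Hypothetical total scores for four respondents at two points in time.
time_1 = [10, 14, 18, 22]
time_2 = [11, 13, 19, 21]
r_retest = retest_reliability(time_1, time_2)

# Hypothetical ratings: four respondents, four items.
items = [[1, 2, 1, 2], [3, 3, 4, 3], [5, 4, 5, 5], [2, 2, 3, 2]]
r_split = split_half_reliability(items)
```

In both cases a higher correlation indicates greater reliability.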
Essay Questions
96. What six major decisions must the researcher make when constructing non-comparative itemized rating scales?
Answer
1. the number of scale categories to use
2. balanced versus unbalanced scales
3. odd or even number of categories
4. forced versus non-forced choice
5. the nature and degree of the verbal description
6. the physical form of the scale
(moderate, page 261)
97. Figure 9.4 showed the development of a multi-item scale. Discuss the development
process.
Answer
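Scale development begins with a theory of the construct being measured, from which an
initial pool of items is generated and then reduced. Data are collected on the reduced
set of potential scale items from a large pretest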
sample of respondents. The data are analyzed using techniques such as correlations,
factor analysis, cluster analysis, discriminant analysis, and statistical tests. As a result
of these statistical analyses, several more items are eliminated, resulting in a purified
scale. The purified scale is evaluated for reliability and validity by collecting more
data from a different sample. On the basis of these assessments, a final set of scale
items is selected. As can be seen from Figure 9.4, the scale development process is an
iterative one with several feedback loops.
(difficult, page 265)

98. Discuss coefficient alpha and how its value might be inflated.
Answer
The coefficient alpha, or Cronbach's alpha, is the average of all possible split-half
coefficients resulting from different ways of splitting the scale items. This coefficient
varies from 0 to 1, and a value of 0.6 or less generally indicates unsatisfactory
internal consistency reliability. An important property of coefficient alpha is that its
value tends to increase with an increase in the number of scale items. Therefore,
coefficient alpha may be artificially, and inappropriately, inflated by including several
redundant scale items.
(moderate, page 268)
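
Illustration for Question 98. The inflation effect can be seen numerically. The sketch below assumes the usual variance-based formula for Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), rather than literally averaging split-half coefficients; the ratings are made up, and the redundant item is simply a copy of the first item.

```python
from statistics import variance

def coefficient_alpha(item_matrix):
    """Cronbach's alpha; rows are respondents, columns are scale items."""
    k = len(item_matrix[0])
    item_vars = [variance([row[j] for row in item_matrix]) for j in range(k)]
    total_var = variance([sum(row) for row in item_matrix])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: five respondents, three items.
base_items = [
    [2, 3, 1],
    [3, 2, 4],
    [4, 4, 3],
    [5, 3, 5],
    [1, 2, 2],
]
alpha_base = coefficient_alpha(base_items)

# Add a redundant fourth item that duplicates the first one.
padded_items = [row + [row[0]] for row in base_items]
alpha_padded = coefficient_alpha(padded_items)
```

With these numbers the padded scale scores a higher alpha than the original even though the duplicate item adds no new information, which is the artificial inflation the answer warns about.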
99. What is the appropriate way to assess the internal consistency of a multi-item scale
with sets of items designed to measure different aspects of a multi-dimensional
construct?
Answer
Some multi-item scales include several sets of items designed to measure different
aspects of a multidimensional construct. For example, store image is a
multidimensional construct that includes quality of merchandise, variety and
assortment of merchandise, layout of the store, and credit and billing policies. Hence,
a scale designed to measure store image would contain items measuring each of these
dimensions. Because these dimensions are somewhat independent, a measure of
internal consistency computed across dimensions would be inappropriate. However, if
several items are used to measure each dimension, internal consistency reliability can
be computed for each dimension.
(moderate, page 268)
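
Illustration for Question 99. Computing internal consistency separately for each dimension might look like the sketch below. It again assumes the variance-based formula for coefficient alpha; the dimension names, item names, and ratings are hypothetical.

```python
from statistics import variance

def coefficient_alpha(rows):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def alpha_by_dimension(responses, dimensions):
    """Return one alpha per dimension instead of one alpha for the whole scale.

    responses: list of dicts mapping item name -> rating
    dimensions: dict mapping dimension name -> list of item names
    """
    result = {}
    for name, item_names in dimensions.items():
        rows = [[r[item] for item in item_names] for r in responses]
        result[name] = coefficient_alpha(rows)
    return result

# Hypothetical store-image data: two items per dimension, four respondents.
dimensions = {"quality": ["q1", "q2"], "layout": ["l1", "l2"]}
responses = [
    {"q1": 5, "q2": 4, "l1": 1, "l2": 2},
    {"q1": 4, "q2": 4, "l1": 3, "l2": 3},
    {"q1": 2, "q2": 1, "l1": 5, "l2": 4},
    {"q1": 1, "q2": 2, "l1": 2, "l2": 2},
]
alphas = alpha_by_dimension(responses, dimensions)
```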
100. Discuss construct validity and the types of construct validity.
Answer
Construct validity addresses the question of what construct or characteristic the scale
is, in fact, measuring. When assessing construct validity, the researcher attempts to
answer theoretical questions about why the scale works and what deductions can be
made concerning the underlying theory. Thus, construct validity requires a sound
theory of the nature of the construct being measured and how it relates to other
constructs. Construct validity is the most sophisticated and difficult type of validity to
establish. As Figure 9.5 shows, construct validity includes convergent, discriminant,
and nomological validity.
Convergent validity is the extent to which the scale correlates positively with other
measures of the same construct. It is not necessary that all these measures be obtained
by using conventional scaling techniques. Discriminant validity is the extent to which
a measure does not correlate with other constructs from which it is supposed to differ.
It involves demonstrating a lack of correlation among differing constructs.
Nomological validity is the extent to which the scale correlates in theoretically
predicted ways with measures of different but related constructs. A theoretical model
is formulated that leads to further deductions, tests, and inferences. Gradually, a
nomological net is built in which several constructs are systematically interrelated.
(difficult, page 269)
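
Illustration for Question 100. Convergent and discriminant validity are often examined by comparing correlations. The sketch below is only an illustration of that idea: the three score lists are invented, and the variable names are hypothetical. A scale with evidence of both kinds of validity should correlate strongly with another measure of the same construct and weakly with a measure of a construct from which it is supposed to differ.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for five respondents.
new_scale = [1, 2, 3, 4, 5]            # the scale being validated
same_construct = [2, 2, 3, 5, 5]       # another measure of the same construct
different_construct = [3, 1, 4, 1, 3]  # a construct it should differ from

convergent_r = pearson(new_scale, same_construct)
discriminant_r = pearson(new_scale, different_construct)
```

Nomological validity would extend the same comparison to related constructs, checking that the signs and sizes of the correlations match what the theory predicts.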
101. What differentiates mathematically derived scales from the other non-comparative
scaling techniques?
Answer
All the scaling techniques discussed in the chapter required the respondents to
evaluate directly various characteristics of the stimulus objects. In contrast,
mathematical scaling techniques allow researchers to infer respondents' evaluations
of characteristics of stimulus objects. These evaluations are inferred from the
respondents' overall judgments of the objects.
(moderate, page 271)
