
International Journal of Accounting Information Systems 12 (2011) 161–167


Experimental methods in decision aid research


Patrick Wheeler a,⁎,1, Uday Murthy b,1,2

a School of Accountancy, University of Missouri, 441 Cornell Hall, Columbia, MO 65211, USA
b School of Accountancy, University of South Florida, 4202 East Fowler Avenue, BSN3403, Tampa, FL 33620-5500, USA

Article info

Article history:
Received 26 July 2010
Received in revised form 14 November 2010
Accepted 22 December 2010
Keywords:
Decision aids
Experimental methods
Information technology

Abstract
In this overview of experimental decision aid research, we examine
some of the basic considerations necessary when conducting this type
of research. We next look at several specific lessons we have learned
over our combined careers in this research stream. Whether dealing
with the basics or specific lessons learned, we provide the reader a
foundational understanding of the problems involved and suggestions
toward solutions. We conclude by discussing some of the unique
advantages and disadvantages of doing decision aid research using
experimental methods. Specifically, we note that experimental
decision aid research spans a wide range of human experience, from
psychology to technology, and that it is well placed to get to the "why"
behind the "what" experiments find.
© 2010 Elsevier Inc. All rights reserved.

1. Introduction
One of the exciting aspects of doing experimental research on decision aids3 is that the researchers are
able to look at some of the extremes of human activity. On the one hand, the experiment will be
investigating the capabilities of computers, certainly an item on the short list of the most recent and
revolutionary of human inventions. On the other hand, the experiment will also be examining (human)
users of decision aids who bring into the experimental setting ways of thinking and acting that are as old as

⁎ Corresponding author. Tel.: +1 573 882 6056; fax: +1 573 882 2437.
E-mail addresses: wheelerp@missouri.edu (P. Wheeler), umurthy@usf.edu (U. Murthy).
1 Authors contributed equally to the project; names are listed in reverse alphabetical order.
2 Tel.: +1 813 974 6523; fax: +1 813 974 6528.
3 A working definition of "decision aid" must recognize that decision aids range in complexity and technology. In essence, a
decision aid is a tool for helping the decision aid user solve a problem by presenting the user with some type of embedded
information. The tool may be as simple as a formula to be memorized or a paper checklist. However, since almost all current real
world decision aids are computerized, most decision aid research is conducted using computerized decision aids with algorithms for
transforming data (e.g., doing mathematical computations like regression analysis or identifying cases in a database similar to the
task at hand).
1467-0895/$ – see front matter © 2010 Elsevier Inc. All rights reserved.
doi:10.1016/j.accinf.2010.12.004


humankind itself, and thus, fundamental and basic to many types of human behavior (see discussion by
Wheeler and Jones, 2003).
Our primary goal for this paper is to convey information useful to any researcher doing experiments
involving decision aids, whether the researcher is new to this type of research or highly experienced.
Accordingly, we present two types of material in the following discussion. First, we provide a brief
overview of the basics of doing experimental decision aid research, primarily oriented toward those new to
this field. Second, we look in more detail at some specific lessons we have learned in our own careers doing
decision aid research. This latter discussion should be of interest to both novice and experienced
researchers. Nevertheless, it should be noted that due to the requisite brevity of this note this discussion
cannot be exhaustive in accomplishing either of these two goals.4
2. Basics of experimental decision aid research
Probably the most basic consideration in doing experimental decision aid research concerns what is
needed to get published. Or, conversely, why do decision aid papers using experiments get rejected? In
general, we can state that such papers get rejected for the following reasons, in increasing order of severity
and decreasing order of solvability: poor writing; statistical problems; design/methodological flaws; lack of
theory; lack of fit for the journal; uninteresting results; insufficient contributions; and bad research
questions.
Some of these problems can be dealt with prior to doing the experimental study or submitting it for
review, e.g., poor writing, design flaws, lack of theory, lack of fit for the journal, and bad research
questions. These problems can be addressed through due diligence, preparatory work, a thorough
literature review, and soliciting feedback in the early stages of the project. However, some of the problems
(e.g., statistical problems, uninteresting results, and insufficient contribution) are less foreseeable and
addressable, primarily because they often depend on how the experiment turns out, i.e., the results of the
study. Furthermore, once the experiment has been run, you have available for analysis all of the data you
will ever have from that particular experiment. If the results are not as predicted (and in experimental
research, they rarely come out exactly as expected), then you will often have problems with analysis and
statistical testing, and may have uninteresting or confusing results that lead to insufficient contributions to
the decision aid research stream. Thus, it is critical to do all that is possible to ensure the results will be as
close as possible to what is expected. To do so, the researchers need to be well grounded in the related
research streams. Being so grounded lets one see how results turned out in other experiments similar to yours,
which in turn should be useful in determining what results may be reasonably expected from your own
experiment.
Determining which theories to use is one of the most critical steps in experimental decision aid research
design. First, it must be a theory which allows us to understand the "why" behind the "what" one expects to
find in the study. Without this increased understanding, it is unlikely that the results will be generalizable
beyond the current study, thereby leading to a lack of contribution from the study. Also, when sound
theory is employed in a well-controlled experiment free from threats to validity, the ensuing results will
likely have implications for practice as well as for theory, thereby enhancing the paper's
contribution.
Deciding on which theories to use is especially challenging in experimental decision aid research
because, as noted above, the researchers are dealing with two of the extremes of the human situation:
technology and psychology. Thus, most decision aid experimental studies need two types of theories:
theories dealing with technology and theories dealing with the users of the technology. For example, one
might start with technology theory concerning the different types of predictions that decision aids can
make (e.g., predictions based on regression analysis versus those based on cases similar to the task being
presented to the experimental participant). One would then need to combine this technology theory with
psychology theory about how the human mind reacts to these different types of predictions, e.g., whether
humans are more comfortable with cases than calculations (see Wheeler and Jones, 2008). Or, as a second
example, researchers might begin with theory concerning different types of group decision support
4 For more complete discussions (although not dealing specifically with decision aid research), see Cook and Campbell (1979),
Kerlinger and Lee (1999), Martin (2007), and Trochim and Donnelly (2006).


systems (e.g., concerning degrees of media richness and non-verbal communication) along with theory
about how groups interact (e.g., small versus large group interactions and polarization of opinions in
groups) (see Landis et al., 2010).
Theory selection is also critical for avoiding "bad" research questions. In general, bad research
questions for conducting accounting decision aid research are questions that do not in some manner relate
uniquely and specifically to accounting. For example, doing an experiment solely to demonstrate that tax
professionals using tax software exhibit a confirmation bias when doing tax information searches is a bad
research question in and of itself, because there is ample evidence from the psychology literature that
individuals are prone to such a confirmation bias. As such, the contribution from demonstrating the
existence of confirmation bias in yet another setting (tax) would be minimal. However, if one can draw
upon theory or prior research to hypothesize that some unique characteristic of tax professionals or tax
software or tax information searches would attenuate or even eliminate the confirmation bias, then one
might have a "good" research question (see Wheeler and Arunachalam, 2008). Alternatively, one might
turn the above bad research question into a good one by showing how a decision aid might decrease the
confirmation bias in tax professionals, having first shown its presence. This approach aims primarily at
making a contribution to accounting practice. Similarly, doing a decision aid experiment to examine how
expense data affects financial decision making is likely to be a bad research question because one is not
comparing or contrasting unique accounting elements. Accordingly, doing the same experiment using
expense data in some conditions and revenue data in others is probably on the way to becoming good
research, particularly if the researcher can draw on theory to hypothesize that different types of data
(revenue versus expenses) would have differential effects on the efficacy of the decision aid.
Next, in our brief consideration of the basics of experimental decision aid research, let us discuss how to
measure the effect of decision aid use on user behavior, i.e., the dependent variable in decision aid
experiments. This has long been recognized as one of the major problems when conducting decision aid
research (see Rose, 2002), but there is still no consensus on how to best solve the problem. Here, because of
limited space, we wish merely to make the reader aware of the fundamentals of the problem and offer
some suggestions for dealing with them. Researchers should be familiar with these various options and
should think carefully about which one or ones are best suited to their current study.
On a conceptual level, there are three main types of effects of decision aid use on user behavior:
learning, task performance, and decision aid reliance. Learning-oriented studies investigate using decision
aids to acquire knowledge and/or develop professional expertise, as is often done in practice. Such studies
tend to employ a multiple-task or multiple-session design in order for learning to occur over a period of time (see
Eining and Dorr, 1991; Rose, 2002). Another motivation of such studies is to examine whether the use of a
decision aid can help novice decision makers acquire expert-like schemata (Rose, Rose and McKay,
2007; Rose and Wolfe, 2000). Alternatively, decision aid reliance and task performance accuracy focus on
the outcome of the decision making process. The former is probably used more frequently than the latter
because accuracy usually requires some type of normative benchmark (i.e., the correct solution to the task),
which is not always available. Even reliance is not as straightforward a measure as it might first appear,
since one cannot directly observe whether or not participants are relying on the decision aid's
recommendation (an unobservable psychological occurrence) but only how close participants' answers are
to those offered by the decision aids. One then assumes that the closer the two, the greater the reliance.
One suggestion for dealing with the reliance issue is to ask the decision maker for a decision before the
decision aid's advice is shown. If this is done, it should be easier to measure the degree of reliance since one
can now compare a before and after situation.
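The before/after suggestion lends itself to a simple quantitative operationalization: the fraction of the gap between the participant's unaided judgment and the aid's recommendation that is closed once the advice is shown (a "weight of advice"-style measure). The sketch below is our own illustration, not a measure prescribed by the studies cited; the function name, the continuous judgment scale, and the clipping to [0, 1] are all assumptions.

```python
def reliance_score(pre, post, advice):
    """Fraction of the gap between the unaided (pre-advice) judgment
    and the decision aid's recommendation that the participant closed
    in the final (post-advice) judgment: 0 = ignored the aid,
    1 = fully adopted its recommendation.

    Assumes judgments on a continuous scale; returns None when the
    aid's advice coincides with the pre-advice answer (no gap exists,
    so reliance is undefined for that observation).
    """
    if advice == pre:                     # no gap to close
        return None
    shift = (post - pre) / (advice - pre)
    return max(0.0, min(1.0, shift))      # clip over/under-shooting into [0, 1]

# Hypothetical bank-loan risk ratings on a 0-100 scale: the participant
# first answers 40, the aid recommends 70, and the final answer is 61.
print(reliance_score(40, 61, 70))         # closed 21 of the 30-point gap -> 0.7
```

A usage note: averaging this score across trials per participant gives a per-subject reliance measure that can then serve as the dependent variable in the analyses discussed below.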
Ashton's (1990) seminal bank loan decision aid task can be used to illustrate these three types of
decision aid measures and some of the factors involved in choosing which one to use for a particular
experiment. A learning dependent variable would be best if investigating the use of different decision aids
for training users to make bank loan decisions. Choosing between reliance and accuracy is more difficult.
Clearly the two should be highly correlated, and one often finds both measures used in studies.
Nevertheless, there are differences. In a bank loan task, one would probably use a reliance measure if the
task dataset was weak in external validity (e.g., simulated data with only a few variables) or if the decision
aid was weak in external validity (e.g., simplified to capture a type of decision but lacking many features
found on commercially available decision aids). Conversely, if the dataset and decision aid are strong in
terms of external validity, then an accuracy dependent variable would be preferred. Ultimately, one is


concerned about improving task performance (i.e., accuracy), not merely increasing reliance on the
decision aid.
3. Some specic lessons learned
We now wish to discuss some specific issues when conducting experiments using decision aids. These
are issues that, in our own experience, do not necessarily appear obvious when going through the basics
of designing a decision aid experiment, although they are usually implied in the basics.
3.1. Operationalization
One of the more critical steps in designing a decision aid experiment is to first identify the theoretical
variables of interest (using various technology and psychology theories, as discussed above) and then to
decide how to operationalize these theoretical variables so that they can be observed or measured. As
noted previously, this is often dually challenging in decision aid research since one is frequently working
with a range of theories relating to decision aid technology and, at the same time, the psychology of
decision aid users. Libby boxes can be a useful tool for working through these design issues because
they provide researchers with a simple model for identifying theoretical variables and operationalized
variables (see Libby, 1981).
3.2. Decision aid design
When designing the decision aid for the experiment, it is important to know what kinds of decision aids
are currently being used in the accounting profession. Such knowledge comes from interacting with
accountants and accounting firms. Research articles can also be an important source of this information. For
example, Dowling and Leech (2007) provide an excellent overview of the types of decision aids used in
auditing firms. There must be a degree of similarity between the experimental decision aid and those being
used in accounting practice. Otherwise, it is difficult to justify the research as accounting research.
Determining this degree of similarity is addressed next.
When designing decision aids, it is critical to avoid the two extremes of being too generic and too
specific. If a decision aid is designed in too generic a manner, it will bear so little resemblance to decision
aids currently used in accounting practice that one may doubt that the results of the study have any
generalizability. Thus, the findings might appear to have no real world implications or application. For
example, a decision aid consisting of nothing but regression analysis would be too generic for any results
from its use in an experiment to be accepted as generalizable. On the other hand, if the decision aid is too
specific, then it will tend to resemble an existing decision aid to such an extent that one may doubt the
experiment's results can be generalized beyond that particular decision aid design. The findings might
accordingly be seen as too strongly practitioner-oriented and therefore lacking in theoretical contribution.
This is an example of what Peecher and Solomon (2001) call the "mundane realism trap," i.e., achieving
external validity by exactly copying the real world. The problem with this approach is that it usually has a
detrimental effect on internal validity, in addition to its harmful effect on contribution and generalizability.
3.3. Experimental instrument
Because in an experiment (unlike archival research) you only get one chance at collecting the data,
researchers must be as careful as possible in designing the experimental instrument. The
instrument in the experiment will determine, among other things, for which variables data are collected,
how the variables are operationalized, and how they are manipulated. Also, fatal design flaws, such as
demand effects, can all too easily creep into the experimental instrument. There is a fine line between
making the treatment salient enough to foster the desired effect and making it so salient that it evokes
(undesirable) demand effects. Thus, not surprisingly, the experimental instrument should be well thought
out, and pilot tested, maybe repeatedly, to ensure that it will perform as desired.


3.4. Running the experiment


The mechanics of running the experiment are another area in which problems are encountered.
Generally, one can run an experiment with a group of subjects only once, so it is critical not to waste the
opportunity on problems relating to conducting the experiment. Pilot runs are important in this regard
and, if possible, should be run in the same situation in which the experiment will be run, e.g., computer lab,
classroom, hotel training facility, over the Internet, etc. This approach will help the researcher detect any
software and hardware problems in advance.
Time constraints are another common problem in running a behavioral decision aid experiment. The
usual upper end of the time constraint is the length of the session, e.g., one class period. Additionally, one
must balance the time required to do the experiment: too long invites boredom and
fatigue, while too short is not representative of the task as performed in practice, which
would threaten external validity. While time constraints are a general issue in behavioral experimentation,
decision aid experiments have some unique advantages and disadvantages in relation to time. Decision aid
experiments are at an advantage in this regard since by their nature decision aids are frequently time-saving
devices. A disadvantage is the fact that, unlike paper and pencil experiments, decision aid experiments
may require additional time for training on how to use the computerized decision aid. As a rule of thumb,
aim at (and pilot test) the experiment to be 45–60 min long.
3.5. Post-experimental survey
Since most decision aid research involves psychology theories, as noted above, it is generally necessary
to follow the main experiment with a post-experimental survey or questionnaire that attempts to get at
what participants were thinking during the experiment. One of the strengths of experimentation (versus
archival research) is its ability to more directly get at the "why" behind the "what" (i.e., the results).
Accordingly, knowing what participants were thinking while doing the experiment can be very valuable in
this regard. However, one needs to be somewhat cautious in accepting that participants necessarily know
clearly what they were thinking during the experiment or that they are completely honest in their
reporting of what they were thinking. Another use of the post-experiment survey is to rule out alternative
explanations for the findings (i.e., establish internal validity) by asking questions that elicit responses
directed at potential alternative explanations. In this regard, it is important to include manipulation check
questions and demographic questions in the post-experimental instrument. The former are needed to
eliminate the possibility that the participants misunderstood some critical aspect of the experiment (one
alternative explanation of results). The latter (demographic questions) are also vital for testing for internal
validity. The experimental variables should be analyzed against the demographic variables to ensure that
results were not being driven by some characteristic of participants that did not get randomly distributed
across conditions. Thus, even if the results do not turn out as expected, answers to the post-experiment
survey questions could be illuminating for the next iteration of the experiment, should the researchers
choose to continue with the project. The post-experimental survey is a critical part of most experimental
decision aid studies and, like the experimental instrument, should be rigorously thought through and pilot
tested.
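The demographic check described above, verifying that no participant characteristic is unevenly distributed across conditions, can be sketched with a simple two-sample statistic. The sketch below is our own illustration under stated assumptions: a continuous demographic variable (hypothetical months of work experience), two conditions, and Welch's t statistic used only as a rough screen rather than a formal hypothesis test.

```python
from math import sqrt

def welch_t(group_a, group_b):
    """Welch's two-sample t statistic comparing a demographic variable
    across two experimental conditions. No p-value is computed; as a
    rough screen, a |t| well above ~2 suggests randomization left the
    conditions unbalanced on this characteristic, which would need to
    be addressed (e.g., as a covariate) in the main analysis."""
    def mean(x):
        return sum(x) / len(x)
    def var(x):  # sample variance (n - 1 denominator)
        m = mean(x)
        return sum((v - m) ** 2 for v in x) / (len(x) - 1)
    se = sqrt(var(group_a) / len(group_a) + var(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Hypothetical months of work experience per participant, by condition:
aided   = [6, 12, 3, 9, 24, 6, 18, 12]
unaided = [9, 6, 12, 15, 6, 3, 9, 12]
t = welch_t(aided, unaided)
print(round(t, 2))  # modest imbalance here, |t| well below 2
```

In practice one would repeat this screen for each demographic variable collected in the post-experimental survey, and use a categorical test (e.g., chi-square) for variables such as gender or major.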
3.6. Control groups
As a general rule, it is important to have control groups in experiments. The situation, however, is a bit
more complicated in decision aid experiments because the control in such an experiment is usually a group
of participants without access to a decision aid, i.e., unaided. This is obviously of limited value in itself since
it often means having the participants make decisions without certain vital information. However, it is
nevertheless generally best to include such control groups in decision aid experiments in order to establish
that the decision aid is actually beneficial to users, i.e., that it is aiding decision making. One option is to
consider a pre/post design, in which the participant performs a task unaided and then performs a variant
of the task using a decision aid. Such a within-subjects design has both advantages (each subject acts as his/
her own control) and disadvantages (greater potential for demand effects), but is an alternative worth
considering.


3.7. Participants
Another critically important decision in designing a decision aid experiment is whether to use students
or professionals as participants. Of course, it is almost always easier to acquire students than
professionals as participants, but this consideration should not be the primary basis for this decision. One
needs to carefully think about who in the real world will be using the decision aid being investigated,
along with the nature of the experimental task. Will real world users need expertise to effectively use the
decision aid or solve the task? If so, then students are probably not the right participants. If not, then
students are probably good proxies. It is only inappropriate to use students when theory or prior research
suggests that experience interacts with a factor of interest in the study, causing a threat to internal validity
(see Peecher and Solomon, 2001). For example, Wheeler and Arunachalam (2008) used tax professionals
because there was prerequisite knowledge of how to conduct tax information searches that undergraduate
students could not be expected to possess or be quickly trained in. Note that even if students are not the
best participants for the main experiment, they may be acceptable from considerations of convenience for
the pilot tests of the experiment. However, such a situation would lessen one's ability to rely on the results
of the pilot tests. One should also be sure to randomly distribute participants across conditions to help
ensure internal validity, as discussed above in relation to demographic variables. As a final piece of advice
in this area, one should count on using 16–20 participants for each experimental condition. Statistical
software can help refine the number of participants needed through a power analysis based on the number
of cells in the experimental design.
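The last two pieces of advice, random distribution across conditions and a power-based check on cell sizes, can be sketched as follows. The helper names and the normal-approximation sample size formula n = 2((z_{1−α/2} + z_{power})/d)² are our own illustration, assuming a simple two-group comparison with the effect size given as Cohen's d; dedicated statistical software will give slightly more precise numbers.

```python
import math
import random

def assign_balanced(participant_ids, conditions, seed=None):
    """Blocked random assignment: shuffle participants, then deal them
    round-robin into the conditions so cell sizes stay as equal as possible."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {c: ids[i::len(conditions)] for i, c in enumerate(conditions)}

def n_per_cell(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per cell for a two-group comparison,
    via the normal approximation n = 2 * ((z_{1-alpha/2} + z_power) / d)^2.
    effect_size is Cohen's d (roughly 0.2 small, 0.5 medium, 0.8 large)."""
    def z(p):  # inverse standard normal CDF by bisection on the erf-based CDF
        lo, hi = -10.0, 10.0
        for _ in range(200):
            mid = (lo + hi) / 2
            if (1 + math.erf(mid / math.sqrt(2))) / 2 < p:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    return math.ceil(2 * ((z(1 - alpha / 2) + z(power)) / effect_size) ** 2)

# 48 participants dealt into 3 conditions -> 16 per cell,
# at the low end of the 16-20 rule of thumb noted above.
cells = assign_balanced(range(1, 49), ["unaided", "aid_A", "aid_B"], seed=1)
print({c: len(ids) for c, ids in cells.items()})
print(n_per_cell(0.8))  # a large effect needs roughly 25 per cell at 80% power
```

Note the tension the calculation makes visible: the 16–20 rule of thumb is adequate only for fairly large expected effects, so for a medium effect (d ≈ 0.5) this approximation asks for substantially more participants per cell.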
4. Conclusions
To summarize our discussion of how to do experiment-based decision aid research, we would
emphasize the following points:
• Know the research stream and real world area of interest
• Start with research questions that are interesting both to other researchers and to real world users of decision aids
• Ground these questions in good theory, both technological and psychological, and prior research so that you can get to the "why" behind the "what"
• Do not neglect those potential problems that one can more easily address, e.g., poor writing or experimental design flaws
• Avoid extremes when developing the decision aid: not too generic, not too specific
• You cannot spend too much time developing and piloting the experimental instrument
• You need a good post-experiment questionnaire that gets at what the users were thinking about, measures likely covariates, and rules out alternative explanations
• Consider the various ways decision aid performance can be measured and which of these is best for your particular study
• Pick the type of participants to be included in the study carefully
In conclusion, we would like to state what we believe to be two of the main advantages of doing
experimental decision aid research. First, the inclusion of advanced information technologies (i.e., decision
aids) in an experiment helps ensure the real world relevance of the study and that an interesting research
question with external validity is likely to be addressed. However, it should be noted that the use of such
technologies is a mixed blessing. These technologies are frequently seen as merely the "delivery vehicles"
of the real variables of interest. Also, research involving advanced technologies may be viewed as more
practitioner oriented and too transitory for serious academic research. This is one reason why it is
important to design the decision aid so that it is not too generic or too specific, as discussed above.
Second, of the various research methodologies available, experiments come closest to establishing
causality. This is true because experiments allow the researcher to directly manipulate the variables of
interest and thus to have greater control over internal validity, i.e., to establish that the measured outcomes
of the experiment result from the variables in the experiment that are measured, manipulated, or
controlled. By contrast, archival research deals with historical data and therefore cannot directly


manipulate variables or control the context of the study to the same degree as in an experiment. Thus, we
believe that experimental decision aid research allows researchers a unique opportunity for understanding
the "why" behind the "what" observed in the research.
References
Ashton RH. Pressure and performance in accounting decision settings: paradoxical effects of incentives, feedback, and justification. J Acc Res Suppl 1990;28:148–80.
Cook T, Campbell D. Quasi-experimentation: design & analysis issues for field settings. Chicago, IL: Rand McNally College Publishing Company; 1979.
Dowling C, Leech S. Audit support systems and decision aids: current practice and opportunities for future research. Int J Acc Inf Syst 2007;8(2):92–116.
Eining MM, Dorr PB. The impact of expert system usage on experiential learning in an auditing setting. J Inf Syst 1991;5(1):1–16.
Kerlinger F, Lee H. Foundations of behavioral research. 4th edition. Boston, MA: Wadsworth Publishing; 1999.
Landis M, Arunachalam V, Wheeler P. Choice shifts and group polarization in dyads. Working paper; 2010.
Libby R. Accounting and human information processing: theory and applications. Englewood Cliffs, NJ: Prentice-Hall; 1981.
Martin DW. Doing psychology experiments. 7th edition. Boston, MA: Wadsworth Publishing; 2007.
Peecher M, Solomon I. Theory and experimentation in studies of audit judgment and decisions: avoiding common research traps. Int J Auditing 2001;5:193–203.
Rose J. Behavioral decision aid research: decision aid use and effects. In: Arnold V, Sutton S, editors. Researching accounting as an information systems discipline. Sarasota, FL: American Accounting Association; 2002.
Rose J, Wolfe C. The effects of system design alternatives on the acquisition of tax knowledge from a computerized tax decision aid. Acc Organ Soc 2000;25:285–306.
Rose J, Rose A, McKay B. Measurement of knowledge structures acquired through instruction, training, and decision aid use. Int J Acc Inf Syst 2007;8(2):117–37.
Trochim W, Donnelly J. The research methods knowledge base. 3rd edition. Mason, OH: Atomic Dog Publishing; 2006.
Wheeler P, Arunachalam V. The effects of decision aid design on the information search strategies and confirmation bias of tax professionals. Behav Res Acc 2008;20(1):131–45.
Wheeler P, Jones D. The effects of exclusive user choice of decision aid features on decision making. J Inf Syst 2003;17(1):63–83.
Wheeler P, Jones D. The psychology of case-based reasoning: how information produced from case-matching methods facilitates the use of statistically generated information. J Inf Syst 2008;22(1):1–26.