
Empirical Accounting Research Design for Ph.D. Students
Author(s): William R. Kinney, Jr.
Source: The Accounting Review, Vol. 61, No. 2 (Apr., 1986), pp. 338-350
Published by: American Accounting Association
Stable URL: http://www.jstor.org/stable/247264 .
Accessed: 09/05/2014 16:35

This content downloaded from 169.229.32.138 on Fri, 9 May 2014 16:35:44 PM


All use subject to JSTOR Terms and Conditions

THE ACCOUNTING REVIEW


Vol. LXI, No. 2
April 1986

EDUCATION RESEARCH
Frank H. Selto, Editor

Empirical Accounting Research Design For Ph.D. Students

William R. Kinney, Jr.


ABSTRACT: This paper discusses an approach to introducing empirical accounting research
design to Ph.D. students. The approach includes a framework for evaluating accounting experiments as well as studies based on passive observation of subjects or data. Alternative
methods of isolating the effect of the "independent" variable of interest from effects of prior-to-the-study-period variables and contemporaneous variables are discussed along with the
advantages and limitations of each method. Also discussed is the relationship between type I
and type II error risks, sample size, and research design. The importance of research design,
including theory development and means for mitigating the effects of extraneous variables, is
emphasized as perhaps the only practical way to achieve research objectives in empirical research in accounting.

A frequently encountered problem in accounting Ph.D. programs is that first-year students do not have background in empirical research in accounting. Few B.B.A., M.B.A. or M.Acc. programs include courses in empirical research and many students have not seriously considered its nature. Yet, such an introduction is necessary if Ph.D. students are to efficiently relate other courses to substantive problems in accounting and be able to take full advantage of accounting workshops.
The purpose of this paper is to show how a basic framework for evaluating empirical research in accounting can be obtained in a short introduction. This can be done at the start of the first term course and provides a context for further work in philosophy of science and statistical design as well as substantive areas of accounting. The approach is generic; it is not tied to an area of accounting and doesn't depend on prior knowledge of a particular paradigm.1
1 Illustrations and extensions from applied areas of accounting are also helpful. Good sources for financial accounting are Ball and Foster [1982], Lev and Ohlson [1982], and Abdel-khalik and Ajinkya [1979]. Good sources for behavioral work are Ashton [1982] and Libby [1981].
I would like to acknowledge the helpful comments and suggestions of Vic Bernard, Dan Collins, Grant Clowery, Bob Libby, Jerry Salamon and two anonymous reviewers. An earlier version of this paper was presented at the American Accounting Association's Doctoral Consortium in Toronto, Ontario, in August 1984.

William R. Kinney, Jr. is Price Waterhouse Auditing Professor at the University of Michigan.
Manuscript received September 1984.
Revisions received April 1985 and August 1985.
Accepted September 1985.


The generic approach focuses attention on the essence of scientific inquiry in accounting. Many of the problems faced by
accounting experimenters who can manipulate some (but not all) of the levels of
variables to be studied are similar to those
faced by "passive observers" of the levels
of all variables as set by Nature.2 Thus,
the generic approach may help to avoid
premature specialization [Boulding,
1956, p. 199].
Section I presents a definition of empirical accounting research and theory, hypothesis, and fact. It also defines "dependent," "independent," and prior and
contemporaneous influence variables.
Section II discusses alternative means for
separating the effects of prior and contemporaneous influence variables from
the independent variable(s), and in Section
III the interrelationships among significance, power, and research design are
explored. Section IV gives a summary and
conclusions.
I. A FRAMEWORK FOR EMPIRICAL RESEARCH IN ACCOUNTING

Research is a purposive activity and its purpose is to allow us to understand, predict, or control some aspect of the environment. Research will be defined here
as the development and testing of new
theories of 'how the world works' or refutation of widely held existing theories.
For accounting research, the theories concern how the world works with respect to
accounting practices. Watts and Zimmerman [1984, p.1], state: "The objective of
accounting theory is to explain and predict
accounting practice." This positive, how-the-world-is approach is in contrast with
the more traditional normative view that
accounting "theory" is concerned with
what accounting practices ought to be.
Empirical accounting research (broadly
considered) addresses the question: "Does
how we as a firm or as a society account for things make a difference?"3 Clearly, the accounting for items affecting tax payments makes a difference in our individual and collective lives. But does the accounting for, say, depreciation in internal
or external reports affect decisions within
a business firm or affect stock prices? If
it does, then the size of the effect and why
it occurs are important follow-up questions. Additionally, the accounting researcher must separate the underlying
economic event (or state) from the accounting report of the event. Thus, while a finance researcher may be concerned only
with firm characteristics, the accounting
researcher must also be concerned with
the costs and benefits of alternative accounting reports of those characteristics.4
In essence, empirical research involves theory, hypothesis, and fact. "Facts" are states or events that are observable in the real world. A "theory" offers a tentative explanation of the relationship between or among groups of facts in general. "Hypotheses" are predictions (or assertions) about the "facts" that will occur in a particular instance assuming that the theory is valid. Finally, observing "facts" consistent with the prediction or assertion made in the hypothesis lends credibility to the theory.
2 The problems are not identical. For example, while experimenters have the advantage of being able to specify the values of some variables, they also face the risk of choosing values that are too close together (or too far apart) to allow precise estimates of treatment effects or to allow generalization of conclusions to the real world.
3 Within this definition, relevant questions for auditing include, "Does how precisely we audit and report the state of a firm make a difference?" and, "How can audits of a given precision be conducted efficiently?" The first auditing question is related to the accounting question through the concept of materiality. Parallel questions involving the design of accounting systems also could be developed.
4 Accounting professors may conduct research in finance, economics, behavioral science, or statistics. If the accounting question is not addressed, however, the accounting professor may face the disadvantage of being undertrained relative to other researchers. Also, he or she ignores a comparative advantage in the knowledge of accounting institutions and the sometimes subtle role of information.
Ordinarily, research begins with a real-world problem or question. One thinks
about or studies the problem, reads about
seemingly similar problems in other areas
or disciplines such as economics, psychology, organizational behavior, or political science. By immersing himself or
herself in the problem, the researcher
may, either by genius or by adapting a
solution from another area, develop a
general theory to explain relationships
among facts [Simon, 1976, Chapter 7 and
Boulding, 1956]. From this statement of
the general relationship among facts, hypotheses about what should be observed
in a particular situation can be derived.
An experiment or passive observation
study then can be designed to support or
deny the hypotheses.
For example, suppose it is observed that a stock price increase usually follows the announcement that a firm has changed from straight-line to accelerated depreciation. Why should a mere bookkeeping change seem to lead to an increase in firms' values? An explanation might be that market participants believe that events leading to such an accounting change also typically lead to better prospects for the firm in the future. With development and elaboration of such a theory, the researcher might develop a passive observation study of past changes or an experiment to test hypotheses based on the proposed explanation.
Theories are usually stated in terms
of theoretical variables or "principals"
while empirical measurement requires observable variables. The difference between
the principal and real-world observable
variables presents difficulties for accounting researchers since accounting measurements may be either surrogates for
some underlying principal of interest or
may be the principal itself. For example, if firm "performance" is the theoretical principal and earnings is chosen as the
surrogate measure of firm performance,
then straight-line depreciation is a component of the surrogate.
As a surrogate, straight-line depreciation contains two sources of potential
error that may require consideration by
the researcher. One is the surrogation
error due to the fact that straight-line depreciation does not "correctly" reflect
the relevant performance of the firm for
the purpose at hand. The other is application error due to mistakes or imprecision
in applying straight-line depreciation.
On the other hand, in evaluating possible determinants of managerial behavior,
audited earnings using straight-line depreciation may be specified in a contract and
may serve as a principal. For example, if a
manager is to receive a bonus or profit
share of one percent of audited earnings,
then audited earnings is the principal.
Surrogation error, and any application
error not detected and corrected by the
auditor, is ignored for contract purposes.
The same number used as a measure of
firm performance will likely contain both
surrogation and application error.5
To add credibility to a theory, one must not only be able to show hypothesis test results that are consistent with the theory's predictions, but also have a basis to rule out alternative explanations of the observed facts. This requires consideration of a reasonably comprehensive list of alternative explanations. Again, knowledge of related disciplines is useful in generating alternative explanations for accounting-related "facts." Some possible explanations can, of course, easily be ruled out as being of likely negligible effect, but others will require attention.
5 Accounting systems designers and financial accounting standards setters can control the first type of measurement error, while auditors and auditing standards setters can control the second. Problems relating to the interaction of accounting and auditing standards setters, users, and auditors are, of course, a matter for accounting research.


To be more specific, let Y denote the "dependent" variable to be understood, explained, or predicted. Variables causing Y (or at least related to Y) can be classified into three broad groups as follows:
X = the "independent" variable that the proposed theory states should affect Y,
Vs = prior-influence (prior-to-study period) factors that may affect Y, and
Zs = contemporaneous factors (other than X) that may affect Y.6
That is, $Y = f(X, V_1, V_2, \ldots, Z_1, Z_2, \ldots)$.

A common Y for addressing an accounting question is the change in a firm's stock price. Others are a manager's act or decision. A common X is a change or difference in accounting method, whether by management's choice or by a regulatory directive. Prevalent Vs are the firm's prior period state variables such as profitability,
leverage, liquidity, and size. For tests of
theories about decision-making behavior
of particular human subjects, relevant
Vs often include the subject's personality
traits, mathematical ability, education,
training, age, firm association, and experience.
The most common Z factor in accounting research studies involving stock prices
is the market return (Rm). Another common group of Zs for external reporting
and managerial performance studies is
the unexpected portion of contemporaneous accounting measures for other firms
or other divisions. Finally, since the accounting researcher is concerned with the
effect of accounting reports, Zs may be
underlying characteristics of the "true
state" of the firm at the time of the study
as measured by contemporaneous nonaccounting reports about the firm.

II. DISENTANGLING THE EFFECTS OF Vs AND Zs FROM THE EFFECT OF X

For simplicity, assume that X is measured at only two levels. Either the observed Y is from the "control" group that receives no "treatment" or from the treatment group that receives the treatment. Alternatively, the two groups could simply be different on some relevant dimension (e.g., to test theories about accelerated depreciation, the control group might be defined as those firms that use straight-line and the treatment group as those that use accelerated).
Also for simplicity, assume that there is a single prior-influence factor V that affects Y and V has the same effect on Y whether the subject is from the control or treatment group. Furthermore, there are no contemporaneous Zs that affect Y and the model determining Y is:
$$Y_{ij} = B_0 + B_1 X_i + B_2 V_{ij} + e_{ij}, \qquad (1)$$
where Yij is the value of the dependent variable for the "j"th subject in the "i"th treatment, B0 is the intercept for the control group, Xi is an indicator variable (zero for the control group and one for treatment), B0 + B1 is the intercept for the treatment group (that is, B1 is the effect of treatment), B2 is the regression coefficient relating V to Y, and eij is a random error term. The eij term will include the effects of other Vs and Zs that are here assumed to be negligible and randomly distributed between the two groups, and eij is assumed to have expectation zero and be uncorrelated with either X or V.
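One consequence of equation 1 can be written out as a short worked step. The notation $\bar{V}_0$ and $\bar{V}_1$ for the sample means of V in the control and treatment groups, and $\bar{Y}_0$ and $\bar{Y}_1$ for the corresponding means of Y, is assumed here for illustration. Averaging equation 1 within each group and taking expectations given the observed V values suggests:

```latex
\begin{aligned}
E[\bar{Y}_0] &= B_0 + B_2 \bar{V}_0, \\
E[\bar{Y}_1] &= B_0 + B_1 + B_2 \bar{V}_1, \\
E[\bar{Y}_1 - \bar{Y}_0] &= B_1 + B_2 (\bar{V}_1 - \bar{V}_0).
\end{aligned}
```

Under these assumptions, the simple difference in group means equals the treatment effect B1 only if the groups are equivalent on V or B2 is zero.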
For the simple model of equation 1, a plot of the expected values of Y given V for both groups will be parallel lines with possibly different intercepts. The difference in intercepts is the effect of the treatment (B1). Figure 1 shows the components of equation 1 along with ellipses that approximate the locus of members of the two groups.
6 Some Zs may be expectations, at the time of the study, of still future values of X, Y, V, and other Zs.

FIGURE 1
Y VALUES FOR NONEQUIVALENT GROUPS UNDER ANOVA, ANCOVA, AND MATCHING
[Figure: Y plotted against V for the two groups, showing the regression lines Y = B0 + B1X + B2V, the ANOVA distributions, the ANCOVA and matching distributions, and the range of matches on V.]

An experimenter may ignore V and may randomly assign subjects to groups. On average, the groups will be equivalent on V. For small samples, however, there is nontrivial risk that such a procedure may assign to the treatment group, say, those with high values of V and to the control group those with low V. In analyzing results, the effect of the high V values (i.e., B2Vij) will be mixed with the treatment effect. An experimenter may rule out the possible effect of V by random assignment of subjects measured on V between groups. The sample subjects are measured on V, matched into pairs according to their V values, and one from each pair is randomly assigned to treatment. Thus, even for small samples the two groups will be approximately equivalent on V.
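The matched-pair assignment just described can be sketched in a few lines. This is a minimal illustration and not a procedure taken from the paper: the function name `matched_pair_assignment`, the rule of pairing subjects with adjacent V values, and the sample V values are all assumptions made for the example.

```python
import random

def matched_pair_assignment(v_values, seed=0):
    """Pair subjects with adjacent V values, then randomize within each pair.

    Returns (control, treatment) lists of subject indices. With an odd
    number of subjects, the one with the largest V is left unassigned.
    """
    rng = random.Random(seed)
    # Sort subject indices by their measured V (assumed pairing rule).
    order = sorted(range(len(v_values)), key=lambda i: v_values[i])
    control, treatment = [], []
    for k in range(0, len(order) - 1, 2):
        pair = [order[k], order[k + 1]]
        rng.shuffle(pair)              # random assignment within the pair
        control.append(pair[0])
        treatment.append(pair[1])
    return control, treatment

# Hypothetical V measurements for eight subjects.
v = [3.1, 9.8, 2.9, 10.1, 6.0, 5.8, 7.7, 8.0]
control, treatment = matched_pair_assignment(v)

def mean_v(indices):
    return sum(v[i] for i in indices) / len(indices)

# The gap in mean V between groups is bounded by the within-pair gaps.
gap = abs(mean_v(treatment) - mean_v(control))
print(control, treatment, round(gap, 3))
```

Because randomization happens only within pairs, the difference in mean V between the groups cannot exceed the average within-pair difference, which is the sense in which the groups are approximately equivalent on V even in small samples.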
The passive observer has no opportunity to randomly assign sample subjects to treatment. Even experimenters may have difficulty developing a satisfactory randomized design due to having too many potentially important Vs and Zs that must be simultaneously matched. Thus, in general, researchers face the problem of treatment groups that are not equivalent with respect to V.
For nonequivalent groups, there are
three basic ways in which the researcher
can mitigate the possible effects of the V
factor in the model of equation 1. These
are:
1. ignoring V (i.e., assuming or hoping that V is randomized with respect to X),
2. matching on V ex post (i.e., matching after X has been chosen by the subject or assigned by Nature), and
3. using covariance analysis to statistically estimate and remove the effect of V.
The first approach ignores V, and results
can be analyzed with a single-factor analysis of variance (ANOVA). The second
approach physically equates the treatment
groups with respect to V, and results can
be analyzed using a randomized block design. The third approach "statistically"
equates the groups, and results can be
analyzed using analysis of covariance
(ANCOVA).
Each of these approaches is discussed in
turn, along with some of the advantages
and limitations of each for experimenters
and passive observers.
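A small simulation may help fix ideas about the three approaches before they are discussed. The sketch below draws two nonequivalent groups from the model of equation 1 and computes an estimate of the treatment effect under each approach. All parameter values, the caliper of 0.1, the nearest-neighbor matching rule, and the function names are assumptions chosen for illustration, not values from the paper.

```python
import random
from statistics import mean

def simulate(n=200, b0=1.0, b1=2.0, b2=3.0, seed=1):
    """Draw (x, v, y) from equation (1); treated subjects have higher V on average."""
    rng = random.Random(seed)
    data = []
    for x in (0, 1):
        for _ in range(n):
            v = rng.gauss(5.0 + 2.0 * x, 1.0)   # group means of V differ by 2
            y = b0 + b1 * x + b2 * v + rng.gauss(0.0, 1.0)
            data.append((x, v, y))
    return data

def anova_estimate(data):
    """Approach 1: difference in group means of Y, ignoring V."""
    y1 = [y for x, _, y in data if x == 1]
    y0 = [y for x, _, y in data if x == 0]
    return mean(y1) - mean(y0)

def matched_estimate(data, caliper=0.1):
    """Approach 2: pair each treated subject with the nearest unused control within the caliper."""
    controls = [(v, y) for x, v, y in data if x == 0]
    diffs = []
    for x, v, y in data:
        if x == 1 and controls:
            j = min(range(len(controls)), key=lambda k: abs(controls[k][0] - v))
            if abs(controls[j][0] - v) <= caliper:
                diffs.append(y - controls.pop(j)[1])
    return (mean(diffs) if diffs else float("nan")), len(diffs)

def ancova_estimate(data):
    """Approach 3: mean difference in Y adjusted by the pooled within-group slope on V."""
    sxy = sxx = 0.0
    group_means = {}
    for g in (0, 1):
        vs = [v for x, v, _ in data if x == g]
        ys = [y for x, _, y in data if x == g]
        mv, my = mean(vs), mean(ys)
        group_means[g] = (mv, my)
        sxy += sum((v - mv) * (y - my) for v, y in zip(vs, ys))
        sxx += sum((v - mv) ** 2 for v in vs)
    b2_hat = sxy / sxx
    (mv0, my0), (mv1, my1) = group_means[0], group_means[1]
    return (my1 - my0) - b2_hat * (mv1 - mv0)

data = simulate()
naive = anova_estimate(data)
matched, n_pairs = matched_estimate(data)
adjusted = ancova_estimate(data)
print(round(naive, 2), round(matched, 2), n_pairs, round(adjusted, 2))
```

With these assumed parameters the unadjusted difference should land near B1 + 2·B2 = 8 and not near the true effect of 2, while the matched and covariance-adjusted estimates should land near 2. The matched estimate uses only the subjects for which a match exists.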

Ignoring V
As discussed above, ignoring a potential V is generally inadvisable due to the unknown effect of V. A negligible effect is the hoped-for result for any unmatched, unmeasured, or unknown Vs or Zs. However, most real-world events have multiple causes and a negligible overall effect is unlikely. Furthermore, larger samples will not help in research designs that ignore systematic effects of V.
While ex post matching and covariance analysis can't account for all possible Vs and Zs, they can reduce the risk that some potentially important Vs and Zs disguise the true effect of X. Figure 1 shows the relevant sampling distributions for the three approaches applied to the example. As shown in the relatively flat distributions on the left margin, ANOVA is based on the marginal distribution of Y with no consideration of V.
Ex Post Matching
In many situations, the researcher selects a sample after the phenomenon of
interest has taken place. Often, the researcher selects a sample of subjects from
the treatment group and then selects a
subject from the control group with V
equal or similar to V for each treatment
subject. This ex post matching on V is
probably the most commonly used design
for passive observational studies in accounting.
For ex post matching, the model assumed to determine Y is:
$$Y_{ij} = B_0 + B_1 X_i + \sum_{j=2}^{m} B_j M_j + e_{ij}, \qquad (2)$$
where B0 is the overall mean of Y plus the effect of (arbitrarily designated) match 1, Bj (for j > 1) is the differential effect of match j compared to match 1, Mj is an indicator variable (equal to one if the sample subject is a member of match j and zero otherwise), and m is the number of matches.7
For passive observational studies, it is
impossible to randomly assign subjects to
treatments since by definition the subjects
have already either "self-selected" into
treatment groups, or have been so selected
by Nature. It is possible that there will be
few or even no matches. For example, all
the firms using straight-line depreciation
may be small firms and all firms using accelerated depreciation may be large. A
firm's choice of accounting method (or
decision to change methods) may merely
reflect its V value. Figure 1 illustrates
such a possibility in that only the bottom
half of group 1 can be matched with a
member of group 0 due to the difference
in V for the groups.
Even experimenters using ex ante
matching with random assignment of
subjects often face the lack-of-matches
problem for at least some Vs and Zs. Suppose, for example, that a researcher believes
that an auditor's response in a professional task experiment may be related to the
auditor's professional training (X) after
accounting for his or her mathematical
abilities ( V). It may be difficult to match
subjects from different firms based on
mathematical abilities. This is because
firms may hire and thus train (or students
may choose firms and be trained) on the
basis of mathematical ability.8
As will be discussed below, the efficiency
of matching may be less than that for covariance analysis. However, ex post matching is likely to be superior to covariance
analysis when the functional form of the
Y|V relationship is nonlinear or unknown.
Given that the treatment effect is not correlated with V, matching can be used for
any functional form of Y and V (known
or not) and analyzed using a blocked design.
Covariance Analysis
A researcher using analysis of covariance (ANCOVA) statistically estimates the effect of V on Y and removes it. ANCOVA can be viewed as the result of projecting observed Y values along the regression line to a common point on the V axis, such as V̄, to yield the conditional distributions as shown in Figure 1.9
Figure 1 shows that part of the difference between the marginal distributions of Y0 and Y1 (as estimated using ANOVA) is due to a larger V for the treatment group. Matching (equation 2) accounts for the difference by subtracting B0 + Σ BjMj from Y for each subject, and ANCOVA accounts for the difference by subtracting B0 + B2Vij from Y for each subject. Thus, both matching and ANCOVA are seen to mitigate the differential effect of V. Figure 1 also shows that control group subjects with relatively high V for the control group are matched to subjects with relatively low V for the treatment group. For matched designs, all other potential sample subjects must be omitted due to lack of matches.
Matching and ANCOVA yield more
efficient (more precise) estimates of the
treatment effect than does ANOVA. In
general, however, it is unclear which of
the two will be more precise. This is due to
the fact that while the difference in V̄0 and
V̄1 reduces the precision of ANCOVA,
the reduction in sample size due to lack of
matches reduces precision for matching.
It is often difficult to predict which will be the greater problem.10
7 The matches may be by individual subject ("precision" or "caliper" matching) as discussed above, or by frequency distribution (e.g., equal mean and variance with respect to V for both groups). A test of equality on V is often used as a justification for ignoring V in the statistical analysis.
8 An alternative design is to limit all subjects in an experiment to a fixed level of V. This equalizes the effect of V but greatly reduces the generalizability of results over the range of reasonable V values that might occur.
9 The sampling distributions for matching and ANCOVA are shown as the same in Figure 1 since the expectations of estimates of the treatment effects are the same for both. As discussed, however, their standard errors will differ.
Equation 1 and Figure 1 present a very simple situation even for a single V. For example, the Y|V relationship may differ depending on whether X is at level zero or one. Furthermore, the occurrence of a given level of V at time t-1 may have a direct effect on Y at time t but may also affect the level of X at time t which, in turn, affects Y at t. Thus, there may be two paths by which V affects Y. A complete approach would include a model of the "selection" process by which V affects X as well as the direct effects of X and V on Y.11

III. ALPHA, BETA, SAMPLE SIZE AND RESEARCH DESIGN

Planning research to disentangle X, Vs, and Zs involves four related factors. These are alpha (α), beta (β), sample size, and what will be called the "research design" factor (denoted D). In a given situation, setting any three of them sets the fourth. The statistical factors of α (the probability of a type I error or incorrectly rejecting a true null hypothesis of no treatment effect), β (the probability of a type II error or not rejecting the null hypothesis when, in fact, there is a treatment effect), and sample size are well known. The research design factor is the ratio of two subfactors. Its numerator is the hypothesized magnitude of the (X) treatment effect (denoted δ), and its denominator is the standard deviation of the residuals in the equation used to estimate B1 (denoted σ). Thus, D = δ/σ. The numerator depends on the researcher's theory, and the denominator depends on how the researcher disentangles the Vs and Zs and the inherent variability in the phenomenon under examination.
The required sample size is a decreasing function of α, β, δ and an increasing function of σ. Therefore, for a given α and β, the required sample size will be small if the proposed theory implies a large effect on Y and/or the researcher is clever in designing a plan to disentangle the effects of the Vs and Zs. For example, using ANOVA (no matching) in a single test with target α = .05, β = .1, and D = δ/σ = .5, the required sample size is 70 for each of the two groups.12 If the researcher has a theory yielding a larger δ that would increase D to .75, then the sample would be 32 each, and if D is 1.0 then the sample size is 18 per group. Alternatively, for D = .5 and holding δ constant, if matching is used and the Y0, Y1 correlation is .25 then the required sample is approximately 57 pairs.13 If the Y0, Y1 correlation is .5 (implying a reduction in σ of about 18 percent), then the required sample is 36 pairs.
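The unmatched sample sizes quoted above can be roughly checked with a normal approximation. The sketch below is an approximation under stated assumptions, not the tabled calculation the paper cites: it assumes a one-sided test and replaces the t quantiles with normal quantiles, so it gives slightly smaller values than the tables. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, beta=0.10):
    """Normal-approximation sample size per group for a one-sided two-group test.

    d is the research design factor D = delta / sigma.
    """
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha) + z(1 - beta)) / d) ** 2)

sizes = {d: n_per_group(d) for d in (0.5, 0.75, 1.0)}
print(sizes)  # {0.5: 69, 0.75: 31, 1.0: 18}
```

The approximation gives 69, 31, and 18 per group, close to the 70, 32, and 18 reported in the text. The small gaps are what one would expect from using normal quantiles where the tables use t quantiles.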
The four factors and their implications
for accounting research will be discussed
through two subtopics. These are: 1) power,
and 2) prejudice against the null hypothesis.
Power
Consider a researcher who has a theory that the treatment effect (B1) is positive, and who therefore is interested in testing the (null) hypothesis that the true effect of treatment is less than or equal to zero against the alternative that the true effect is greater than zero. Assume that a sample of treatment subjects has been matched or "paired" on V with control subjects. Also, based on an assessment of the appropriate significance level for the issue at hand, the researcher has set α at .05, and the researcher has a research design in mind.
10 ANCOVA will usually be less biased, however (see Cook and Campbell [1979, pp. 177-182], and Cochran [1983, pp. 127-128]).
11 See Cochran's comments on R. A. Fisher's advice to "Make your theories elaborate." According to Cochran, Fisher meant that when "constructing a causal hypothesis one should envisage as many different consequences of its truth as possible, and plan observational studies to discover whether each of these consequences is found to hold" [Rosenbaum, 1984, p. 43]. This advice is consistent with Boulding's exhortation to develop and test theories that are at the level of the real world.
12 The required sample size is:
$$n = 2\left[(t_{\alpha,\,2n-2} + t_{\beta,\,2n-2})(1/D)\right]^{2}$$
See Ostle [1963, p. 553] for a table.
13 The required sample size is:
$$n = \left[(t_{\alpha,\,n-1} + t_{\beta,\,n-1})(\sigma_D/\delta)\right]^{2}$$
See Ostle [1963, p. 551] for a table.
What the researcher may not consider
at the planning stage is the magnitude of
the hypothesized effect (i.e., a particular
δ for the alternative hypothesis) and the
allowable β for that δ.14 A δ may not be
considered since most theories suggest
only the direction of an effect and not its
magnitude, and β is not considered because no particular δ is specified. The researcher may proceed to testing with little
consideration of whether the planned test
has an adequate chance to reject the null
hypothesis even if it is false.
To illustrate, consider the sampling distributions in Figure 2. For both panels of
Figure 2, the left-hand distribution is for
the mean of the paired differences if the
null hypothesis (B1 = 0) is true, and the
right-hand distribution applies if the particular alternative (B1 = δ) is true. Also for
both panels, k is the point that yields
α = .05 or five percent of the area to the
right of the point under the left-hand distribution (i.e., H0). In panel a, the research design and sample size yield a
sampling distribution with a large area
(1 - β) to the right of k under the alternative hypothesis. Thus, there is high probability of rejecting H0 when the alternative
hypothesis is true. In other words, the
power of the test (1 - β) is high.
In Figure 2, panel b, δ is the same as in panel a, but the sampling distributions are much flatter due either to small sample sizes or a large standard deviation due to remaining effects of Vs and Zs. Rather than the relatively high power test of panel a, the researcher faces a low power test. At α = .05, β for the simple alternative hypothesis is greater than .5, and power is less than .5. Even if H0 is false (i.e., B1 = δ), and thus the researcher's theory of a positive treatment effect is correct, the researcher has a less than even chance of rejecting it!
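The contrast between the two panels can be made concrete with a short power calculation. This is a sketch under a normal approximation, not the paper's computation: `d` is assumed to be the hypothesized effect δ divided by the standard deviation of the paired differences, and the sample sizes of 36 and 8 pairs are illustrative choices. The function name `power_one_sided` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided(d, n, alpha=0.05):
    """Approximate power of a one-sided z-test on the mean of n paired differences.

    d is the hypothesized effect in units of the standard deviation of
    the paired differences (an assumption of this sketch).
    """
    nd = NormalDist()
    k = nd.inv_cdf(1 - alpha)             # critical point k, in standard-error units
    return 1 - nd.cdf(k - d * sqrt(n))    # area to the right of k under the alternative

high = power_one_sided(d=0.5, n=36)   # many pairs: sharply peaked sampling distributions
low = power_one_sided(d=0.5, n=8)     # few pairs: flat sampling distributions
print(round(high, 3), round(low, 3))
```

With the same standardized effect, the larger sample gives power above .9 while the smaller one gives power below .5, which is the low-power situation described for panel b.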
Suppose that the researcher in panel b
observes a test statistic that is almost significant, and the sample estimate of the
treatment effect is equal to δ. He or she
then decides to take a follow-up sample.
The follow-up sample is also likely to indicate nonrejection due to its small size
[Tversky and Kahneman, 1971, p. 107].
The real culprit, of course, is the low
power of the test. If the low power is anticipated at the planning stage, an attempt
can be made to mitigate its negative effects
or else abort the project. In general, power
can be increased by 1) increasing the sample size, or 2) increasing the design factor
D by developing better theories (yielding
larger δ) or by making better use of a given
sample size and theory by careful attention to the Vs and Zs (yielding smaller σ).
As a practical matter, improved design
is often the only alternative in accounting
research since the size of samples in accounting frequently is effectively fixed.
For experiments, the pool of available auditors, accountants, financial statement
users, and even students is effectively limited to fairly small numbers. Subjects' time
is not free, and the supply is not inexhaustible. For passive observation, the number
of firms for which particular accounting
and other required economic data are
available may be relatively small. Thus,
accounting researchers need to be aware of a variety of analytical techniques applicable to a variety of research problems.15
14 See Tversky and Kahneman [1971]. This is in contrast to classical or normal distribution theory-based audit sampling where, in addition to setting α to control the risk of incorrect rejection, the auditor sets β to control the risk of incorrect acceptance and sets δ based on "intolerable" error (materiality). The auditor then selects an estimator and calculates the minimum sample size subject to the target α, β, and δ.

FIGURE 2
SAMPLING DISTRIBUTIONS OF d FOR HIGH AND LOW POWER TESTS
[Figure: two panels, a. High Power and b. Low Power, each showing the sampling distributions under H0 and under the alternative, with the "Reject H0" region to the right of k.]



Furthermore, for a given research paradigm, the problem of low power is likely
to become more difficult over time. Other
things equal, as knowledge of the effects
of accounting expands, the effect of
each new or more refined
theory (β1) will tend to be smaller and to add less explanatory power. As knowledge expands, the best potential Xs are investigated and become Vs or Zs. For example,
early studies tested hypotheses about the
degree of owner versus manager control
as an X that affected accounting choices.
Later studies have used the same variable
as a V or Z. Absent developments that restructure the way a particular problem
is addressed, future researchers will be
faced with discovering new Xs that have
less potential explanatory power.
15 In debate on the preferability of parametric vs. nonparametric statistics in research, the ability of parametric
methods to accommodate more Vs and Zs through covariance analysis is an often overlooked advantage.


In fact, it may be unreasonable to expect
that a particular theory based on accounting methods will yield a true differential
effect that is very large relative to the variance of Y. How things are accounted for
simply can't be expected to explain a large
portion of stock price variability or managerial or investor behavior. Under some
conditions, the sample size required to
yield reasonable power exceeds the size of
the known population!
A researcher can get some protection
by making a tentative calculation of power
before investing in expensive data collection or in experimentation. For example,
passive observers of accounting changes
and stock returns may be able to make
reasonable estimates of the standard deviation of return residuals and might make
power estimates for various levels of δ.16
If the estimated power is inadequate even
for the maximum δ that might reasonably
exist, then the research can be redesigned
or aborted. Experimenters are perhaps
more familiar with prospective power calculations and frequently use a pilot sample to assist with the sample size and research design development.
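A tentative power calculation of the kind described above might look like the following Python sketch. It is an illustration under stated assumptions, not a procedure taken from the paper: it uses a one-sided z-test with a normally distributed estimator whose standard error is σ/√n, and the residual standard deviation, sample size, population size, and candidate values of δ are all hypothetical numbers chosen for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def power_one_sided(delta, sigma, n, alpha=0.05):
    """Approximate power of a one-sided z-test when the true effect is delta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)  # critical value of the test
    return STD_NORMAL.cdf(delta / sigma * sqrt(n) - z_alpha)


def required_n(delta, sigma, alpha=0.05, target_power=0.80):
    """Smallest n that gives at least target_power for the same test."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(target_power)
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


if __name__ == "__main__":
    sigma = 0.08      # hypothetical standard deviation of return residuals
    n = 40            # hypothetical number of firms with usable data
    population = 150  # hypothetical size of the known population
    for delta in (0.01, 0.02, 0.04):  # hypothetical plausible effect sizes
        p = power_one_sided(delta, sigma, n)
        need = required_n(delta, sigma)
        note = "exceeds population" if need > population else "feasible"
        print(f"delta={delta:.2f}  power={p:.2f}  required n={need} ({note})")
```

A researcher would substitute his or her own estimate of σ and range of plausible δ values. If power is inadequate even at the largest plausible δ, or if the required n exceeds the population, the design can be reconsidered before any data are collected.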
Prejudice Against the Null Hypothesis
A theory usually specifies the direction
of the treatment effect and a researcher
generally sets out to reject hypotheses
based on the assumption that the treatment
effect is zero or in the opposite direction
from what the theory predicts. The focus
on rejecting the null hypothesis is the
source of a number of "biases" against
the null that may lead to dysfunctional
consequences. Greenwald [1975] lists
eight such consequences; the four that seem
most important for accounting researchers are discussed below so that they can be
avoided.
1. A paper will not be submitted for
publication consideration unless the
results against the null are "significant." Especially interesting or innovative results may be submitted
at significance levels higher than .05 (or,
alternatively, the probability at
which the results are significant is
reported), but rarely does an editor
see results with significance levels
above .15. This prejudice need not
exist if not rejecting the null gives
reasonable credibility to the null.17
2. Ancillary hypotheses will be elevated in the exposition of results.
Secondary hypotheses that are significant ex post will receive more
attention than other secondary hypotheses and perhaps the primary
hypotheses. Suggestions will be
made that these results warrant
further study, when in fact one
would expect about one in ten nonsense relationships to be significant
at the .10 level.
3. A search for alternative operationalizations of
variables or alternative functional forms
will be conducted only if "preliminary" results are insignificant. The
extent to which this search activity is
justified is open to debate since
most theories don't imply a single
measurement or functional form.
4. The search for errors will be asymmetric. Outliers that impede rejection of the null hypothesis will
tend to receive more diligent attention than those that favor rejecting
the null. If significant results are
16 The choice of δ is somewhat arbitrary, but in planning it is useful to consider reasonable or plausible values
for the true treatment effect of X. Alternatively, one
might choose the smallest effect that informed persons
would agree is empirically "important" and therefore
worth knowing about if it exists, or the largest amount
that one could reasonably expect.
17 In classical statistics, not rejecting the null is not
equivalent to accepting the null. However, non-rejection
by a reasonably powerful test or series of tests does
increase one's subjective degree of belief in the null.


obtained on the first analysis of a
problem, the neophyte researcher
may not consider a search for outliers or for other violations of statistical assumptions underlying the
analysis. Nonrejection may lead one
to consider such explanations and
to search for programming errors
and data coding errors.
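The "one in ten" figure in item 2 above follows from simple arithmetic, sketched below in Python. The sketch assumes that every one of the k secondary hypotheses is a true null and, for the second function, that the k tests are independent, which secondary tests run on a single sample rarely are. The function names are illustrative.

```python
def expected_false_rejections(k, alpha):
    """Expected number of true nulls rejected among k tests at level alpha."""
    return k * alpha


def prob_at_least_one_rejection(k, alpha):
    """Chance that at least one of k independent true nulls is rejected."""
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    alpha = 0.10
    for k in (1, 5, 10, 20):
        e = expected_false_rejections(k, alpha)
        p = prob_at_least_one_rejection(k, alpha)
        print(f"k={k:2d}  expected false rejections={e:.1f}  P(at least one)={p:.2f}")
```

Under these assumptions, ten nonsense relationships tested at the .10 level produce one "significant" result on average, and the chance of at least one is about .65. A significant secondary hypothesis found ex post is therefore weak grounds for a call for further study.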
V. SUMMARY AND CONCLUSIONS

In this paper we have stressed consideration of planning for Vs and Zs to be able
to isolate the effect of X as a potential explanation of differences in Y. This consideration may allow increased power in tests
and may allow more to be learned from a
given sample. Planning for Vs and Zs can
reduce the risk of not rejecting the null
hypothesis when it is false. Such planning
may also allow a basis to argue that nonrejection of the null hypothesis may support acceptance of the null. That is, if the
treatment has an important effect, then it
should be revealed by the test. Thus, something may be learned whether results are
statistically significant or not. This should
increase the objectivity of the researcher,
since the work is valuable whatever the
empirical results.
There are at least two ways in which the
approach discussed in this paper can be
useful to Ph.D. students. One is in evaluating the research design of others, and the
other is in planning the student's dissertation.18 Students must evaluate the work
of others whether in published articles,
working papers, or accounting research
workshops. A student applying the approach to the work of others might try to
answer the following questions: What is
the Y and what is the X? What Vs and Zs
are considered? Are there better ways to
account for the effects of Vs and Zs?
What are other Vs and Zs that might have
important effects?
The same approach can be applied by
the student to his or her own dissertation
proposal. While the basic development of
a research proposal is the responsibility of
the student, there is much to warrant early
faculty discussion of planned dissertation
research. That is, the faculty can evaluate
a proposal by considering the reasonableness of the theory and the adequacy of
control of potential Vs and Zs. The
faculty should be asked: Is the magnitude
of the hypothesized effect plausible? Are
all important competing explanations listed
and adequately dealt with in the plan?
Will the proposed tests likely uncover evidence of a difference equal to δ if it exists?
Will nonrejection lend credibility to the
null?
Faculty approval of planned dissertation research reduces the student's risk by
1) ruling out potential topics that have
little chance of successful completion, 2)
gathering the right data on the first attempt, 3) eliminating outcome dependence
(thus reducing moral hazard for the student), and 4) reducing the temptation of
the student (and faculty) to pursue numerous tangents that may come to light as
the research progresses.
18 In planning research or evaluating the research of
others, a useful practice is to give early attention to the
purpose of the research through preparation of a three-short-paragraph abstract, synopsis, or working model of
the research. The first paragraph answers the question
"What is the problem?" The second asks, "Why is it an
important problem?" and the third, "How will it be
solved?" Alternatively, the questions might be: "What
are you (or the researcher) trying to find out?",
"Why?", and "How will it be done?"


REFERENCES
Abdel-khalik, A. R. and B. B. Ajinkya, Empirical Research in Accounting: A Methodological Viewpoint,
Accounting Education Series No. 4 (American Accounting Association, 1979).
Ashton, R. H., Human Information Processing in Accounting, Accounting Research Study, No. 17
(American Accounting Association, 1982).
Ball, R. and G. Foster, "Corporate Financial Reporting: A Methodological Review of Empirical
Research," "Studies in Current Research Methodologies in Accounting: A Critical Evaluation," Journal
of Accounting Research (Supplement 1982), pp. 161-234.
Boulding, K. E., "General Systems Theory-The Skeleton of Science," Management Science (April 1956),
pp. 197-208.
Cochran, W. G., Planning and Analysis of Observational Studies, edited by L. E. Moses and F. Mosteller
(John Wiley & Sons, Inc. 1983).
Cook, T. D. and D. T. Campbell, Quasi-Experimentation: Design & Analysis Issues for Field Settings
(Houghton-Mifflin Company, 1979) especially chapters 3 and 4.
Greenwald, A. G., "Consequences of Prejudice Against the Null Hypothesis," Psychological Bulletin
(January 1975), pp. 1-20.
Lev, B. and J. A. Ohlson, "Market-Based Empirical Research in Accounting: A Review, Interpretation,
and Extension," "Studies in Current Research Methodologies in Accounting: A Critical Evaluation,"
Journal of Accounting Research (Supplement 1982), pp. 249-322.
Libby, R., Accounting and Human Information Processing: Theory and Applications (Prentice-Hall,
Inc., 1981).
Ostle, B., Statistics in Research, 2nd Edition (The Iowa State University Press, 1963).
Rosenbaum, P. R., "From Association to Causation in Observational Studies: The Role of Tests of
Strongly Ignorable Treatment Assignment," Journal of the American Statistical Association (March
1984), pp. 41-48.
Simon, J. L., Basic Research Methods in Social Science: The Art of Empirical Investigation, 2nd Edition
(Random House, Inc., 1978) especially chapters 3, 7, and 11.
Tversky, A. and D. Kahneman, "Belief in the Law of Small Numbers," Psychological Bulletin (August
1971), pp. 105-110.
Watts, R. L. and J. L. Zimmerman, Positive Accounting Theory (Prentice-Hall, 1986).
