
University of California

Los Angeles

Essays on Return Predictability and Volatility Estimation

A dissertation submitted in partial satisfaction


of the requirements for the degree
Doctor of Philosophy in Management

by

Yuzhao Zhang

2008
© Copyright by

Yuzhao Zhang
2008
The dissertation of Yuzhao Zhang is approved.

Michael Brennan

Bryan Ellickson

Robert Geske

Mark Grinblatt

Rossen Valkanov

Richard Roll, Committee Chair

University of California, Los Angeles


2008

to my parents, my aunt, Ming Wang and my girlfriend, Rui Wu

Table of Contents

1 Contrarian Investors and Stock Returns
1.1 Introduction
1.2 Model
1.2.1 The Economy
1.2.2 The Equilibrium and Excess Returns
1.2.3 Contrarian Flows
1.3 Data
1.4 Empirical Results
1.4.1 Summary Statistics
1.4.2 Predictability Results
1.4.3 Disagreement or Risk
1.5 Robustness
1.5.1 Out of Sample Tests
1.6 Conclusion
1.7 Proofs
1.7.1 Proof of Lemma 1 and Proposition 1
1.7.2 Proof of Proposition 2
1.7.3 Proof of Proposition 3
1.7.4 Proof of Proposition 4

2 Does the Early Exercise Premium Contain Information about Future Underlying Returns
2.1 Introduction
2.2 Data
2.3 The EEP: Magnitude and Dynamics
2.3.1 Summary Statistics
2.3.2 Explaining the Magnitude and Dynamics of the EEP
2.4 The Information Content of the EEP
2.4.1 Predictive Regressions
2.4.2 The Provenance of the Predictability
2.4.3 Discussion and Related Literature
2.5 Robustness
2.5.1 Subsamples: Pre- and Post-1994
2.5.2 Put EEP Results
2.5.3 Alternative EEP Aggregation Methods
2.6 Conclusion

3 Alternative Variance Estimators for Pricing Options
3.1 Introduction
3.2 Data and Test Calculations
3.3 Alternative Variance Estimators
3.4 Experimental Results
3.5 Conclusion

References

List of Figures

1.1 Coefficients for 25 Fama-French Portfolios
2.1 Variation in the level of the EEP
2.2 MIDAS Weights

List of Tables

1.1 Summary Statistics
1.2 Predictive Regressions of Index Excess Returns
1.3 Predictive Regressions of HML
1.4 Predictive Regressions of the Fama-French Portfolios
1.5 Contemporaneous Returns
1.6 Contrarian Flows and Analyst Dispersions
1.7 Contrarian Likelihood of All Investors
1.8 Out of Sample Tests
2.1 Summary Statistics
2.2 Simulating the Dynamics of the EEP
2.3 Predictive Regressions of Index Excess Returns
2.4 Predictive Regressions of Index Futures Returns
2.5 VAR Results
2.6 Identifying the Source of the Forecastability
2.7 Dividend Growth Forecast: MIDAS Regression
2.8 Predictive Regression in Subsamples
2.9 Robustness Check: Put Results
2.10 Predictive Regression under Alternative EEP Aggregation
3.1 Summary Statistics of Annualized Volatility Estimates
3.2 Volatility Biases
3.3 Prediction Errors of Volatility Estimators
3.4 Shrinkage and Systematic Risk
3.5 Control for Kurtosis
3.6 Prediction Errors for Low and High Kurtosis Stocks

Acknowledgments

I am deeply indebted to my committee for the guidance and full support they
offered me throughout the Ph.D. program. Richard Roll, my committee chair, was
always encouraging and provided countless constructive comments and insights
on my dissertation. Michael Brennan led our weekly reading group, gave many
helpful suggestions, and patiently listened to my babblings. Bob Geske was
there whenever I needed help, and I benefited greatly from conversations with
him on finance and life in general. Ross Valkanov first introduced me to
empirical finance and showed me how to write an academic paper. Mark Grinblatt
helped me differentiate important issues from the rest. Bryan Ellickson told the
asset pricing story from a whole new perspective. Many other faculty members
helped me during my studies and made helpful comments, including Jun Liu,
Francis Longstaff, and Avanidhar Subrahmanyam.

I would also like to thank my fellow students. Feifei Li, Jason Hsu, Max
Moroz and Ashley Wang familiarized me with Westwood and the program when
I first arrived. The afternoon office hoops (a big paper recycle bin) with Juhani
Linnainmaa, Alessio Saretto and Duke Whang became one of my favorite sports.
I also enjoyed and learned a lot through discussions with Brett Myers, Udi Peleg,
Al Sheen, and Cesare Fracassi.

My grandma, Yayu Guan, taught me to read and count and had always wanted
to see me in the gown. My grandpa (my grandma’s brother), Xunfu Guan, took
me everywhere on his vintage black bicycle when I was a kid. I hope they
are both watching me from up there. My girlfriend, Rui Wu, brought me a lot of
happiness and laughter (sometimes by impersonating me), and I enjoyed every
minute of her presence. My parents and my mom-like aunt, Ming Wang, have
supported and inspired me over these years. Words cannot express my feelings
for their unconditional love.

Vita

1977 Born in Beijing, China, to Xiaoxian Zhang and Jing Wang

1994 First Class Award in Chinese Mathematics Olympiad

1999 B.S., Biochemistry and Molecular Biology
Peking University, Beijing, China

2001 Summer Intern, Quantitative Derivatives
Empirica Capital, LLC, Greenwich, CT

2002 M.S., Mathematical Finance
Courant Institute of Mathematical Sciences
New York University, New York, NY

2002–2008 Teaching Assistant
UCLA Anderson School
University of California, Los Angeles, CA

2005 Consultant, Research and Investment Management
Research Affiliates, LLC, Pasadena, CA

2006 Consultant, Investment Management
TCW Group, Inc., Los Angeles, CA

2007 Center of Finance and Investments Dissertation Fellowship
UCLA Anderson School
University of California, Los Angeles, CA

2008 Ph.D., Management
UCLA Anderson School
University of California, Los Angeles, CA

Presentations

Zhang, Y. (2008). Contrarian Investors and Stock Returns. Paper presented at
University of California, Los Angeles, University of Illinois at Chicago, University
of Hong Kong, Singapore Management University, Peking University, and Temple
University.

Zhang, Y. (2006). Does the Early Exercise Premium Contain Information about
Future Underlying Returns? Paper presented at European Finance Association
2006 Meetings.

Abstract of the Dissertation

Essays on Return Predictability and Volatility Estimation


by

Yuzhao Zhang
Doctor of Philosophy in Management
University of California, Los Angeles, 2008
Professor Richard Roll, Chair

The three chapters of this dissertation examine return predictability and volatility
estimation. The first chapter shows that flows from contrarian investors reveal
information about the representative agent’s risk aversion in a model in which
the representative agent displays time-varying risk aversion and investors have
heterogeneous preferences. The key result, consistent with theory, is that the
flows of contrarian investors predict market returns. The second chapter studies
the information content of the call (put) Early Exercise Premium, or EEP. The
call EEP specifically captures investors’ expectations about future lump sum
dividend payments as well as other state variables such as conditional volatility
and interest rates. From that perspective, the EEP should also be related
to future returns of the underlying security. We find that the EEP is a good
forecaster of returns at daily horizons, and that the predictability stems
primarily from the ability of the EEP to forecast innovations in dividend growth,
rather than other components of unexpected returns. The third chapter examines
a volatility estimation bias that may be commonly exhibited by all option pricing
models. Black and Scholes (1972) were the first to illustrate the bias by showing
that their model underpriced options on relatively low variance stocks and
overpriced options on relatively high variance stocks. The bias is always observed
in cross section among individual stocks. We show that alternative variance
estimators that use “shrinkage” techniques can eliminate the bias.

CHAPTER 1

Contrarian Investors and Stock Returns

We investigate the relation between aggregate stock returns and the capital
flows to the stock market from contrarian investors, defined as those whose
capital flows are in the opposite direction from those of the representative agent.
It is shown that flows from contrarian investors reveal information about the
representative agent’s risk aversion in a model in which the representative agent
displays time varying risk aversion and investors have heterogeneous preferences.
The key implication is that the flows of contrarian investors predict market
returns. We construct a contrarian flow measure using the capital flows from all
major market participants and find, consistent with our theoretical prediction,
that the contrarian flow is a good forecaster of returns at quarterly horizons.
The predictability is robust to other known return predictors. Moreover, the
predictability is stronger for growth stocks than for value stocks which, together
with the notion that growth stocks bear more discount rate risk, supports our
hypothesis that the contrarian flow measures the market risk premium.

1.1 Introduction

A large empirical literature documents evidence of a countercyclical risk premium
in aggregate stock returns (Campbell and Shiller (1988a, 1988b), Fama and
French (1989) and many others). Campbell and Cochrane (1999) show that
such a pattern can be explained by a model in which the representative agent’s
utility function contains time-varying habit and therefore displays countercyclical
variation in risk aversion. The same pattern in the risk premium appears in
an economy in which agents have heterogeneous risk preferences and the wealth
distribution fluctuates over time, as in Chan and Kogan (2002). In their model, the
weights of agents are determined by their respective levels of wealth, so
the time-varying wealth distribution generates the variation in the risk premium.
In the first model the subsistence level of consumption, and in the second
the wealth distribution, is crucial in determining the representative investor’s
displayed risk aversion and hence the risk premium.

Unfortunately, from an empirical perspective, neither the subsistence consumption
level nor the wealth distribution is directly observable. Therefore, extant research
on predictability focuses on variables that relate to the business cycle, like
the short-term interest rate, or variables that reveal information about the
representative agent’s risk aversion, like the dividend-price ratio. Dichev (2007)
calculates the internal rate of return (IRR) experienced by the aggregate investor
using the capital flows to the market and shows that the representative agent’s
IRR is significantly lower than the historical average return of the stock market.
This result suggests that the representative investor’s high flows into the market
precede low market returns. This is consistent with the findings of Baker and
Wurgler (2000) and Boudoukh, Michaely, Richardson, and Roberts (2007) that
high net equity issuance from all companies predicts low market returns. Most of
these studies focus on the link between aggregate investor flows or issues and
market returns. If investors are heterogeneous, some valuable information will be
lost in the aggregation process.

In this paper we develop and test the implications of heterogeneity in investor
risk preferences for the market risk premium. The model we develop abstracts
from both the mechanism that causes the risk aversion of the representative
agent to vary over time and the determination of the supply of risky assets. We
concentrate on the implications of heterogeneous risk aversion when the supply of
the risky asset is assumed to be negatively correlated with the representative
agent’s risk aversion. The model is set up with heterogeneous preferences and
homogeneous information. Investors have exponential utility and differ in their
risk aversion in a way similar to that of Campbell, Grossman, and Wang (1993).
Under the assumption that the supply of the risky asset is inversely related to
the risk premium, some investors are contrarian: their flows are in the opposite
direction to the flows of the representative agent. The major testable hypothesis
of the model is that the contrarian investors’ flows are positively related to the
representative agent’s risk aversion and therefore predict market returns.

The assumption that the supply of the risky asset is inversely related to the risk
premium is a key component of the model.

The intuition for the results is as follows. Consider an economy in which the
representative agent has time-varying risk aversion and the change in the supply
of the risky asset is negatively correlated with the revision of the representative
agent’s risk aversion. Time-varying risk aversion induces a time-varying risk
premium. The representative agent is composed of two types of investors. The
risk preference of the first type of investors does not vary over time, while
the other type of investors exhibits time-varying preferences, which dictate the
representative agent’s risk aversion. As the latter group of investors becomes more
risk averse, they reduce their equity exposures and the former group of investors
increases its equity holdings. Under this scenario, the former group of investors
shows a positive capital flow to the market while the latter group shows a negative
flow. At the same time, the supply of risky assets decreases because the change
in total share supply is assumed to be negatively related to the change in the
representative agent’s risk aversion. The representative agent shows a negative
capital flow because of the reduction in share supply. By our definition, the former
group of investors is categorized as contrarian. The contrarian investors increase
their holdings of risky assets and are compensated with a higher expected return
because the representative agent also turns more risk averse together with the
latter group of investors. The same logic applies when the representative agent,
along with the latter group of investors, becomes more risk tolerant.

Guided by the intuition derived from the model, we examine the capital
flows of all major market participants. Specifically, each quarter an aggregate
participant is classified as contrarian if its flow is in the opposite direction to
that of the representative agent. For example, in a quarter in which the
market receives a positive capital flow from the representative agent, participants
that reduce their equity holdings are labeled contrarian investors. We sum
the normalized flows of these contrarian investors to construct the contrarian
flow (CTF) measure. Cohen (1999) aggregates all major investor groups
into households and institutions and studies the asset allocation decisions
of individual and institutional investors by analyzing their portfolio flows.
Households are found to reduce their equity exposures during business cycle
troughs. This result suggests that the risk aversion of households is countercyclical.
Wang (2003) proposes the equity flows from all institutional investors as a proxy
for market liquidity and finds that the flow-based liquidity measure commands a
premium in the cross section. Rather than focusing on the representative agent as
in Dichev (2007) or breaking down all participants into households and institutions
as in Cohen (1999) and Wang (2003), we follow the classification principle derived
from the theory and categorize major market participants into two groups every
quarter by their flows to the stock market.
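
The classification rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant names and flow values are hypothetical, and the representative agent's flow is assumed here to be the sum of all participants' normalized flows, which may differ from the construction used in the data section.

```python
def contrarian_flow(flows):
    """Return (CTF, contrarian participants) for one quarter.

    flows: dict mapping participant name -> normalized net equity flow.
    Assumption (not from the text): the representative agent's flow is
    the sum of all participants' flows.
    """
    aggregate = sum(flows.values())
    # A participant is contrarian when its flow has the opposite sign
    # to the aggregate flow.
    contrarians = sorted(name for name, f in flows.items() if f * aggregate < 0)
    ctf = sum(flows[name] for name in contrarians)
    return ctf, contrarians


# Hypothetical quarter: the aggregate flow is positive, so the
# participants that sell equity are the contrarians.
quarter = {"households": 0.5, "mutual_funds": 0.75, "pension_funds": -0.25,
           "insurers": -0.5, "foreign": 0.25}
ctf, who = contrarian_flow(quarter)
print(ctf, who)
```

In this hypothetical quarter the CTF has the opposite sign to the aggregate flow, which is the mechanical property the introduction later discusses in connection with net issues.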

We do not distinguish households from institutions as in previous studies, for
several reasons. First, our theoretical framework suggests a clear classification of
major market participants by their flows. Second, one objective of this study is to
uncover information through disaggregation, so we break up all institutions and
examine each market participant separately. Third, although one can argue that
individuals and institutions differ because of their utility functions and possible
agency problems arising inside institutions, there is no doubt that institutions
are heterogeneous, and differences between various types of institutions could
potentially be as large as the difference between individuals and the whole universe
of institutions. Finally, to the extent that our way of separating institutions and
households provides a measure of the differences between market participants, it
is interesting to see how much institutions and households differ from each other
in the data.

The contrarian flow reveals the representative agent’s risk preference in an
economy with heterogeneous agents. Alternatively, it might be argued that the
contrarian flow is associated with net issues for the following reason: if we look at
the signs of the two variables, the contrarian flow is negative (positive) whenever
net issues are positive (negative), by definition. This might be of some
concern because many studies show that issues predict returns both in the
time series (Baker and Wurgler (2000) and Boudoukh, Michaely, Richardson,
and Roberts (2007)) and in the cross section (Ritter (1991), Loughran and Ritter
(1995) and many others). Nevertheless, the two measures differ substantially in
principle because the contrarian flow can fluctuate when net issues are fixed and
vice versa. Hence, neither is a perfect substitute for the other. This point is
further supported by the data. Ultimately, it is an empirical question whether
the information in the contrarian flow is fully contained in net issues.

To test the implications of the model, we investigate the empirical
relation between the contrarian flows and the CRSP value-weighted index returns
using quarterly data from January 1960 to December 2005. First, we describe
the statistical properties of the contrarian flow. We document that the average
contrarian flow is negative and its magnitude is economically and statistically
significant. The contrarian flow also shows a negative and significant
correlation with net issues, but the magnitude is rather modest. These results
are consistent with how the contrarian flow variable is constructed, because the
representative investor has provided a positive flow to the stock market over the
last half century.

Second, we find that the CTF does forecast the index returns at one-quarter
to intermediate horizons. The predictive relation is robust to the addition of other
control variables such as the dividend yield, the short interest rate, net issues, the
term spread, and the default spread. The forecastability disappears at longer
horizons. The ability of the CTF to forecast returns is markedly different from that
of the dividend yield and interest rates in two important respects. From a statistical
perspective, the CTF contains information about the market risk premium but does
not suffer from the well-known statistical problems (such as extreme persistence
and low volatility) that have rendered forecasting with these predictors quite
problematic (Stambaugh (1999), Ferson, Sarkissian, and Simin (2003), Torous,
Valkanov, and Yan (2005), Boudoukh, Richardson, and Whitelaw (2005)). The
autocorrelation of the CTF, while significantly different from zero, is not near
the boundary of non-stationarity and will not significantly bias the estimates in
forecasting regressions. Also, the volatility of the CTF is actually larger than
that of the CRSP index returns. This is in contrast to the volatilities of other
predictors, which are at least an order of magnitude lower than that of the returns.
These appealing statistical properties of the CTF make it a suitable forecaster
of returns.
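
To make the two statistics in this paragraph concrete, the sketch below shows a one-period-ahead predictive regression and a first-order persistence estimate in plain Python. It is a minimal illustration, not the estimation code behind the reported results: the function names and the toy series are hypothetical, and standard errors and control variables are omitted.

```python
def ols_intercept_slope(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predictive_regression(predictor, returns):
    """Regress returns[t+1] on predictor[t] (one-period-ahead forecast)."""
    return ols_intercept_slope(predictor[:-1], returns[1:])


def ar1_coefficient(series):
    """Persistence of a predictor: slope of series[t+1] on series[t]."""
    return ols_intercept_slope(series[:-1], series[1:])[1]


# Hypothetical toy data in which next-period returns equal 1 + 2 * predictor.
predictor = [1.0, 2.0, 4.0, 3.0, 5.0]
returns = [0.0, 3.0, 5.0, 9.0, 7.0]
intercept, slope = predictive_regression(predictor, returns)
print(intercept, slope, ar1_coefficient(predictor))
```

An AR(1) coefficient close to one is the "extreme persistence" case the cited literature warns about; the text's point is that the CTF is not in that region.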

From an economic perspective, the source of the CTF predictability also
differs from that of other widely used conditioning variables. Campbell and
Shiller (1988b), Fama and French (1989), Campbell (1991) and others argue that
macro variables like the dividend yield and interest rates capture time variation in
expected returns. In contrast, the CTF goes one step further and is constructed
to take full advantage of investor heterogeneity. To the extent that investors or
investor groups have different preferences (Chan and Kogan (2002) and Cohen
(1999)), the predictability may be due to the CTF containing information about
the representative agent’s risk aversion and hence the market risk premium.

We analyze two other implications of the model. First, we investigate the
empirical relation between the CTF and the returns of the 25 size and
book-to-market sorted benchmark portfolios. The predictability of the growth firms
should be stronger than that of the value firms because the CTF captures
the market risk premium and growth firms bear more discount rate risk than
value firms (Brennan, Wang, and Xia (2004) and Campbell and Vuolteenaho
(2004)). We find that the CTF forecastability for growth companies is indeed
more significant than for value companies. Second, we test whether the CTF
is related to contemporaneous market returns. As stated in the model, a high
CTF should be associated with a low contemporaneous market return because a
high CTF signals an upward revision of the representative agent’s risk aversion
and prices adjust downward. Consistent with this conjecture, we show that the
CTF is negatively correlated with concurrent market returns controlling for other
variables.

Finally, we investigate the CTF from two empirical perspectives. First, we
try to distinguish the CTF from investor disagreement in the data. While our
model abstracts from differential information among investors, it is possible that
the CTF actually measures disagreement among investors, and it is well known
that high disagreement predicts low returns (Miller (1977) and Park (2005)).
The CTF could relate to disagreement if the contrarian investors act against the
direction of the representative agent because their views on fundamentals differ
from the consensus. We confront this hypothesis by adding earnings forecast
dispersion as an additional control. We find that the CTF displays low correlation
with the proxy for disagreement and that its forecasting power is not affected.
Second, we examine who acts contrarian more frequently, because contrarian
investors should demonstrate a more stable risk tolerance according to theory. We
show that flows from households are negatively correlated with the CTF, which is
consistent with households being less likely to act contrarian because
habit induces time-varying preferences.

The paper is structured as follows. We lay out our model in Section 2. In
Section 3, we describe the data set. In Section 4, we present the predictability
results of the CTF. We conduct out-of-sample robustness checks in Section 5
and conclude in Section 6 with some final remarks.

1.2 Model

We consider an economy in which investors have heterogeneous preferences but
homogeneous information. The economy contains two assets, a risky asset (stock)
and a risk-free asset (bond). The supply of the risk-free asset is assumed to
be perfectly elastic. The per capita supply of the risky asset, however, is time
varying and corresponds to share issuance and repurchase by the corporation.
Trading is driven by the time-varying preferences of a group of investors as well as
the changing supply of the risky asset. The setup extends the model in Campbell,
Grossman, and Wang (1993) with a key innovation. Campbell, Grossman, and
Wang (1993) assume that the supply of the risky asset is constant. When the
net supply of the risky stock is fixed, every buy is matched with a sale and prices
adjust. When we consider the more realistic and more interesting situation in
which the per capita stock supply fluctuates, we can distinguish the prevailing
investors¹ from the contrarian investors. More importantly, if the change in
the supply is negatively related to the representative agent’s risk aversion, the
contrarian investors’ holdings will be informative about the market risk premium.
This is the key innovation of the model.

1.2.1 The Economy

The economy is further specified as follows. There are two groups of investors,
A and B, with weights of $w$ and $1 - w$ respectively. Both groups have constant
absolute risk aversion. Group A investors’ coefficient of absolute risk aversion
equals $a$, and group B’s risk aversion $b_t$ varies over time.

¹ The name reflects the fact that the change in the representative agent’s preference
is driven by this group of investors.

The return on the risk-free asset is constant, $R = 1 + r$. The stock pays a
dividend $D_t$ per share at time $t$, where $D_t = \bar{D} + \hat{D}_t$ and
$\hat{D}_t$ follows the process

\[
\hat{D}_t = \alpha_D \hat{D}_{t-1} + u_{D,t}, \qquad u_{D,t} \sim N(0, \sigma_D^2), \tag{1.1}
\]

where $0 \le \alpha_D < 1$ and $u_{D,t}$ is i.i.d. normal. All investors receive a
public signal $S_t$ about next period’s dividend,

\[
u_{D,t+1} = S_t + \varepsilon_{D,t+1}, \qquad S_t \sim N(0, \sigma_S^2), \quad \varepsilon_{D,t+1} \sim N(0, \sigma_\varepsilon^2), \tag{1.2}
\]

where $S_t$ and $\varepsilon_{D,t+1}$ are uncorrelated.

We assume that investors are myopic and in each trading period $t$ they maximize
next period’s expected utility,

\[
\max_{H_t} \; E_t\left[-\exp(-\gamma W_{t+1})\right], \qquad \gamma = a, b_t, \tag{1.3}
\]

subject to

\[
W_{t+1} = R W_t + H_t (P_{t+1} + D_{t+1} - R P_t), \tag{1.4}
\]

where $W_t$ is wealth, $H_t$ is the stock holding, and $P_t$ is the share price.

Lemma 1. There exists a representative agent with the coefficient of absolute
risk aversion $\gamma_t$ given by

\[
\gamma_t = \frac{a b_t}{(1 - w) a + w b_t}. \tag{1.5}
\]

Proof of Lemma 1 is in Appendix A. It is a standard result that the
representative agent’s risk tolerance (the inverse of risk aversion) is the weighted
average of the risk tolerances of all investors. When $w = 1$ and group A investors
dominate, $\gamma_t = a$. Conversely, when $w = 0$ and all investors are type B,
$\gamma_t = b_t$. We assume that $\gamma_t = \bar{\gamma} + \hat{\gamma}_t$ and
$\hat{\gamma}_t$ follows the process

\[
\hat{\gamma}_t = \alpha_\gamma \hat{\gamma}_{t-1} + u_{\gamma,t}, \qquad u_{\gamma,t} \sim N(0, \sigma_\gamma^2), \tag{1.6}
\]

where $0 \le \alpha_\gamma < 1$ and $u_{\gamma,t}$ is i.i.d. normal.
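
Lemma 1 and its two limiting cases are easy to check numerically. The snippet below is a small sketch with hypothetical parameter values; the function name is illustrative and does not come from the text.

```python
def representative_risk_aversion(a, b, w):
    """Equation (1.5): gamma = a * b / ((1 - w) * a + w * b)."""
    return a * b / ((1 - w) * a + w * b)


# Hypothetical values: group A has risk aversion 2, group B has 4,
# and the two groups have equal weight.
a, b, w = 2.0, 4.0, 0.5
gamma = representative_risk_aversion(a, b, w)

# Risk tolerance (1 / risk aversion) aggregates as a weighted average.
weighted_tolerance = w / a + (1 - w) / b
print(gamma, 1 / gamma, weighted_tolerance)

# Limiting cases discussed in the text: w = 1 gives a, w = 0 gives b.
print(representative_risk_aversion(a, b, 1.0), representative_risk_aversion(a, b, 0.0))
```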

The net supply of the risky asset at time $t$ is $1 + \hat{X}_t$, where
$\hat{X}_t$ is assumed to follow the process

\[
\hat{X}_t = \alpha_X \hat{X}_{t-1} + u_{X,t}, \qquad u_{X,t} \sim N(0, \sigma_X^2), \tag{1.7}
\]

where $0 \le \alpha_X < 1$ and $u_{X,t}$ is i.i.d. normal. If $u_{X,t}$ is
independent of $u_{\gamma,t}$, the supply shock is simply mean-reverting noise and
the model becomes the conventional rational expectations model. However, there are
reasons to believe that the supply of the risky asset (share issues) is not
independent of the representative agent’s risk preference. Theoretically, a lower
discount rate renders projects more appealing and leads to more share issues.
Empirically, the risky asset supply is shown to correlate with subsequent stock
returns in Baker and Wurgler (2000) and Boudoukh, Michaely, Richardson, and
Roberts (2007). Therefore, we assume that $u_{X,t}$ is negatively correlated with
$u_{\gamma,t}$, $\rho_{X,\gamma} < 0$, and derive testable implications in a
reduced-form model.

1.2.2 The Equilibrium and Excess Returns

The price of the stock is the discounted sum of expected future dividends minus
a risk premium that depends on the risk aversions of all investors and the total
supply of the risky asset. We denote the discounted sum of all future
(including today’s) dividends by $F_t$, where

\[
F_t = \frac{R \bar{D}}{r} + \frac{R}{R - \alpha_D} \hat{D}_t + \frac{1}{R - \alpha_D} S_t,
\]

and the variance of $F_t$ by

\[
\sigma_F^2 = \frac{R^2}{(R - \alpha_D)^2} \sigma_\varepsilon^2 + \frac{1}{(R - \alpha_D)^2} \sigma_S^2.
\]
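
As a sanity check on the closed form for the discounted dividend sum, the sketch below compares it with a brute-force summation of expected dividends discounted at R, using the dividend process (1.1) and the signal structure (1.2). Parameter values, function names, and the truncation horizon are hypothetical choices for illustration.

```python
def discounted_dividends_closed_form(d_bar, d_hat, s, r, alpha_d):
    """Closed form: R*D_bar/r + R/(R - alpha_D)*D_hat + S/(R - alpha_D)."""
    R = 1 + r
    return R * d_bar / r + R / (R - alpha_d) * d_hat + s / (R - alpha_d)


def discounted_dividends_by_summation(d_bar, d_hat, s, r, alpha_d, horizon=2000):
    """Sum E_t[D_{t+k}] / R**k for k = 0, ..., horizon - 1."""
    R = 1 + r
    total = d_bar + d_hat                    # today's dividend, k = 0
    expected_d_hat = alpha_d * d_hat + s     # E_t[D_hat_{t+1}] from (1.1)-(1.2)
    for k in range(1, horizon):
        total += (d_bar + expected_d_hat) / R ** k
        expected_d_hat *= alpha_d            # future shocks have zero mean
    return total


params = dict(d_bar=1.0, d_hat=0.2, s=0.1, r=0.05, alpha_d=0.5)
print(discounted_dividends_closed_form(**params))
print(discounted_dividends_by_summation(**params))
```

The two numbers should agree up to truncation and rounding error, since the signal only shifts the conditional mean of next period's dividend shock.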

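Proposition 1. There exists an equilibrium price of the stock with the following
form,

\[
P_t = F_t - D_t + p_0 + p_1 \gamma_t + p_2 X_t, \tag{1.8}
\]

where

\[
p_0 = -\frac{(\sigma_F^2 + \sigma_Q^2)\,\bar{\gamma}}{r} - p_1 \bar{\gamma}, \qquad
p_2 = \frac{R - \alpha_\gamma}{R - \alpha_X}\, p_1 \bar{\gamma}, \qquad
p_1 = \frac{\alpha_\gamma - R + \sqrt{(\alpha_\gamma - R)^2 - 4 \sigma_p^2 \sigma_F^2}}{2 \sigma_p^2} < 0,
\]

and $\sigma_p^2$ is a positive constant whose precise expression is given in the
appendix. The excess return of the risky asset is

\[
Q_{t+1} = \varepsilon_{F,t+1} - r (p_0 + p_1 \bar{\gamma}) + p_1 (\alpha_\gamma - R) \hat{\gamma}_t + p_1 u_{\gamma,t+1} + p_2 (\alpha_X - R) \hat{X}_t + p_2 u_{X,t+1}, \tag{1.9}
\]

where $\varepsilon_{F,t+1} = \frac{R}{R - \alpha_D} \varepsilon_{D,t+1} + \frac{1}{R - \alpha_D} S_{t+1}$.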

Proof of Proposition 1 is standard. See Appendix A. Both coefficients $p_1$ and
$p_2$ are negative, consistent with the intuition that a high risk aversion or a
high supply depresses the stock price. The last term in the price function,
$p_2 X_t$, is new compared to that in Campbell, Grossman, and Wang (1993), and that
term is crucial in helping to distinguish the two types of investors by their
capital flows, as shown next.
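
The sign of the price loading on risk aversion can be illustrated numerically. The sketch below assumes that the expression for p1 in Proposition 1 is the larger root of the quadratic sigma_p^2 p^2 - (alpha_gamma - R) p + sigma_F^2 = 0; the value of sigma_p^2 is defined only in the appendix, so the number used for it here, like the other inputs, is purely hypothetical.

```python
import math


def price_loading_p1(alpha_gamma, R, sigma_p2, sigma_f2):
    """Larger root of sigma_p2 * p**2 - (alpha_gamma - R) * p + sigma_f2 = 0."""
    discriminant = (alpha_gamma - R) ** 2 - 4 * sigma_p2 * sigma_f2
    if discriminant < 0:
        raise ValueError("no real root: an equilibrium of this form does not exist")
    return (alpha_gamma - R + math.sqrt(discriminant)) / (2 * sigma_p2)


# Hypothetical inputs (sigma_p2 stands in for the appendix constant).
alpha_gamma, R, sigma_p2, sigma_f2 = 0.5, 1.05, 0.5, 0.1
p1 = price_loading_p1(alpha_gamma, R, sigma_p2, sigma_f2)
residual = sigma_p2 * p1 ** 2 - (alpha_gamma - R) * p1 + sigma_f2
print(p1, residual)
```

Because alpha_gamma - R is negative and the square root is smaller in magnitude than R - alpha_gamma whenever both variances are positive, the root is negative, in line with the statement that higher risk aversion depresses the price.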

Given the return process above, the conditional expected excess return of the
stock is

\[
E_t[Q_{t+1}] = -r (p_0 + p_1 \bar{\gamma} + p_2 \bar{X}) + p_1 (\alpha_\gamma - R) \hat{\gamma}_t + p_2 (\alpha_X - R) \hat{X}_t. \tag{1.10}
\]

Because $p_1$, $\alpha_\gamma - R$, $p_2$ and $\alpha_X - R$ are all negative, the
expected excess return is high when the representative agent is more risk averse
or the supply of the stock is high.

1.2.3 Contrarian Flows

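The holdings of the risky asset by the two groups of investors at $t$ are

\[
H^a_t = \frac{w}{a} \gamma_t (1 + X_t), \tag{1.11}
\]
\[
H^b_t = \frac{1 - w}{b_t} \gamma_t (1 + X_t), \tag{1.12}
\]

and the net purchases by the groups are

\[
\Delta H^a_t = \frac{w}{a} \left[ \gamma_t (1 + X_t) - \gamma_{t-1} (1 + X_{t-1}) \right], \tag{1.13}
\]
\[
\Delta H^b_t = \frac{1 - w}{b_t} \gamma_t (1 + X_t) - \frac{1 - w}{b_{t-1}} \gamma_{t-1} (1 + X_{t-1}). \tag{1.14}
\]

It is straightforward to verify that $H^a_t + H^b_t = 1 + X_t$ and
$\Delta H^a_t + \Delta H^b_t = \Delta X_t$.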
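
The adding-up identities can be confirmed with a short numerical check. The values of a, b_t, w and X_t below are hypothetical, and the helper name is illustrative.

```python
def holdings(a, b_t, w, x_t):
    """Group holdings from equations (1.11) and (1.12)."""
    gamma_t = a * b_t / ((1 - w) * a + w * b_t)   # equation (1.5)
    h_a = w / a * gamma_t * (1 + x_t)             # equation (1.11)
    h_b = (1 - w) / b_t * gamma_t * (1 + x_t)     # equation (1.12)
    return h_a, h_b


# Two consecutive periods with hypothetical risk aversion and supply.
h_a0, h_b0 = holdings(a=2.0, b_t=4.0, w=0.5, x_t=0.10)
h_a1, h_b1 = holdings(a=2.0, b_t=3.0, w=0.5, x_t=0.05)

print(h_a0 + h_b0)                      # total holdings equal 1 + X_t
print((h_a1 - h_a0) + (h_b1 - h_b0))    # net purchases sum to the change in X
```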

The net purchase of group A investors corresponds approximately to the
change in the representative agent’s risk aversion. Note that the change in the
representative agent’s risk aversion is driven by the group B investors only. From
equation (1.5) we can derive the following,

$$b_t = \frac{a\gamma_t}{a - w\gamma_t}(1 - w), \tag{1.15}$$

and the group B investors’ trading becomes,

$$\Delta H_t^b = \frac{a - w\gamma_t}{a}(1+X_t) - \frac{a - w\gamma_{t-1}}{a}(1+X_{t-1}). \tag{1.16}$$
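
As a quick numerical check of equations (1.11) to (1.16), the sketch below
evaluates both groups’ holdings and verifies the adding-up identities. The
function names and the values chosen for a, w, γ and X are assumptions made
purely for illustration; they are not calibrated values from the model.

```python
def group_b_risk_aversion(a, w, gamma):
    """Group B risk aversion b_t implied by eq. (1.15)."""
    return a * gamma * (1.0 - w) / (a - w * gamma)


def holdings(a, w, gamma, x):
    """Return (H_a, H_b) from eqs. (1.11) and (1.12)."""
    b = group_b_risk_aversion(a, w, gamma)
    h_a = (w / a) * gamma * (1.0 + x)
    h_b = ((1.0 - w) / b) * gamma * (1.0 + x)
    return h_a, h_b


# Hypothetical parameter values, chosen only to exercise the formulas.
a, w = 2.0, 0.3
gamma_prev, x_prev = 1.4, 0.05
gamma_now, x_now = 1.5, 0.10

ha_prev, hb_prev = holdings(a, w, gamma_prev, x_prev)
ha_now, hb_now = holdings(a, w, gamma_now, x_now)

# Market clearing: H_a + H_b = 1 + X_t.
assert abs(ha_now + hb_now - (1.0 + x_now)) < 1e-12
# Eq. (1.16): group B holdings can be written without b_t.
assert abs(hb_now - (a - w * gamma_now) / a * (1.0 + x_now)) < 1e-12
# Net purchases add up to the supply change: dH_a + dH_b = dX_t.
d_ha, d_hb = ha_now - ha_prev, hb_now - hb_prev
assert abs(d_ha + d_hb - (x_now - x_prev)) < 1e-12
print(d_ha, d_hb)
```

The check only confirms the accounting identities; the signs of the
covariances in the propositions that follow depend on the shock processes
and are not tested here.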

Proposition 2. The net purchase of group B investors is positively related to the
aggregate supply shock, $\mathrm{Cov}(\Delta H_t^b, \Delta X_t) > 0$. If
$-1 < \rho_{X,\gamma} < -\frac{\bar\gamma(1-\alpha_X\alpha_\gamma)}{1-\alpha_X}\cdot\frac{\sigma_X}{\sigma_\gamma}$,
we have the following results. The net purchase of group A investors is negatively
correlated with the supply shock, $\mathrm{Cov}(\Delta H_t^a, \Delta X_t) < 0$, and the supply shock
negatively predicts next period’s excess return, $\mathrm{Cov}(Q_{t+1}, \Delta X_t) < 0$.

Proof of Proposition 2 is presented in Appendix B. The change in share
supply $\Delta X_t$ is absorbed by the representative agent. This proposition states
that the flows from group B investors move in the same direction as those of the
representative agent. Moreover, if the correlation between the supply shock and
the risk aversion shock is negative enough, the share supply is negatively related
to the next period’s expected return of the risky asset, and group A investors’
flows run against those of the representative agent. Admittedly, the condition
requires the correlation to be more negative than a critical value. However, it is
proved in the appendix that the parameter condition required for the negative
covariance between the share supply shock and the expected risk premium is more
stringent than that required for the negative covariance between the supply shock
and group A investors’ flows. Therefore, to the extent that the aggregate change
in risky asset supply predicts a negative return, as documented in the literature
and tested further in the empirical part of this study, group A investors’ flows
should also run against the direction of the representative agent’s flows.

The proposition provides a foundation for us to distinguish between market
participants by how their flows relate to the representative agent’s flows. Since
the net purchases of group A investors are negatively related to those of the
representative agent, we designate them “contrarian investors”. We will further
focus on the relation between the flows from the contrarian investors and market
excess returns.

Proposition 3. The net purchase of group A investors predicts next period’s risk
premium, $\mathrm{Cov}(Q_{t+1}, \Delta H_t^a) > 0$.

Proof of Proposition 3. See Appendix C. The flow from group A investors
correlates positively with the expected risk premium because group A investors’
flows are mainly driven by the change in the representative agent’s risk aversion.
Note that this relation does not require any restrictions on the parameters.
Intuitively, when group B investors become more risk averse and ask for a higher
compensation for risk, they unload part of the shares they hold. Part of the
shares sold by group B investors is absorbed by group A investors, who are also
compensated by a higher risk premium. As a result, a high capital flow from
group A investors reveals a large upward revision in the representative agent’s
risk aversion and therefore a high risk premium. As seen in Proposition 2, group A
investors are the contrarian investors under certain conditions. Therefore, flows
from contrarian investors are revealing about the market risk premium. In the
empirical part, we will directly test this conclusion and other implications derived
from it.

We have established the link between group A investors’ flows and the
representative agent’s changing risk preference. An immediate implication is that
group A investors’ flows are negatively related to the contemporaneous returns of
the risky asset. The intuition is formalized in the following proposition.

Proposition 4. The net purchase of group A investors is negatively correlated
with the contemporaneous excess return, $\mathrm{Cov}(Q_t - E_{t-1}[Q_t], \Delta H_t^a) < 0$.

Proof of Proposition 4 is shown in Appendix D. The rationale is as follows.
A high flow from group A investors indicates an upward revision in the
representative investor’s risk aversion, which naturally depresses the current
share price. Hence, the flows from the group A investors are negatively related to
contemporaneous returns. This generates another testable implication for our
empirical exercise.

1.3 Data

We have three types of time series: portfolio flows into or out of the stock market
by different types of market participants; returns of the aggregate stock market
and other stock market based variables, such as the dividend price ratio; and
bond market variables, such as the risk-free interest rate, the default spread and
the term spread. We describe each type of time series separately for clarity.

Flow of Funds Data: The quarterly aggregate corporate equity holdings
and flows of each major investor class are recorded by the Flow of
Funds Accounts, published by the Federal Reserve Board. The major market
participants include life insurance companies, property insurance companies,
mutual funds, security brokers, households, private, federal and state pension
funds, foreigners, commercial banks and saving institutions. The holdings and
flows are reported in dollar amounts. Note that the flows of all investors do not
necessarily cancel out: the gap between inflows and outflows equals the net
issuance of all firms.

Under the representative agent framework, the net flow from all investors is
the flow of the representative agent. In practice, different groups of investors
have diverse endowments in both wealth and information and face various
institutional restrictions, transaction costs or tax treatments. Consequently,
their investment decisions are heterogeneous. Every quarter, we
distinguish between investors by whether their flows are in the same direction
as that of the representative investor. More specifically, we have the following
definition.

Definition 1. Each period, an aggregate investor type is classified as contrarian
if this type of investor’s flow is in the opposite direction from the flow of the
representative agent. Conversely, an investor type is classified as a prevailing
investor in this period if its flow is in the same direction as the
representative agent’s flow.

We characterize the trading activity of the contrarian investors. For each
quarter from the first quarter of 1960 to the last quarter of 2005, we aggregate
the flows of the contrarian investors and compute the contrarian flow (CTF) as
follows,

$$\mathrm{CTF}_t = \sum_i \frac{\mathrm{Flow}_{t-1\to t,i}}{\mathrm{Hold}_{t,i}}, \quad \text{for all contrarian investors.} \tag{1.17}$$

This measure corresponds to group A investors’ capital flow in the model
except for the scaling. We normalize the flow for each type of investor by their
total holdings so that no type will dominate the measure just by sheer size.
The scaling also ensures that the flows are stationary. This measure captures
the capital flow of the contrarian investors. Note that the composition of the
contrarian group is not fixed: different investor types may act as contrarians
at different times. The results are not sensitive to the choice of the scaling
variable. We could normalize the flows by last period’s holdings and the results
are virtually unchanged because the holdings of the participants are all very
stable.
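
The construction in Definition 1 and equation (1.17) can be sketched for a
single quarter as follows. The investor labels, dollar figures and function
name are hypothetical, and the treatment of zero flows (never classified as
contrarian) is an assumption that the text does not address.

```python
def contrarian_flow(flows, holdings):
    """Compute CTF_t for one quarter, following eq. (1.17).

    flows:    dict mapping investor type -> net flow over the quarter
    holdings: dict mapping investor type -> end-of-quarter holdings
    """
    # The net flow of all investors is the representative agent's flow.
    representative_flow = sum(flows.values())
    ctf = 0.0
    for investor, flow in flows.items():
        # Contrarian: flow has the opposite sign to the representative flow.
        # Zero flows are treated as non-contrarian (assumption).
        if flow * representative_flow < 0:
            ctf += flow / holdings[investor]
    return ctf


# Made-up numbers for illustration only.
flows = {"households": -30.0, "mutual_funds": 50.0, "pensions": -5.0}
holdings = {"households": 1000.0, "mutual_funds": 500.0, "pensions": 250.0}
print(contrarian_flow(flows, holdings))
```

Here the representative flow is positive, so the two selling groups are the
contrarians in this quarter and their normalized flows are summed.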

Return Series and Other Stock Market Variables: We have quarterly returns
of the CRSP value weighted index from January 1960 to December 2005. The
index return is in excess of the one-month Treasury Bill rate, denoted by $R_t^e$. The
quarterly returns of the Fama-French 25 size and book-to-market portfolios as
well as those of the benchmark factor HML are aggregated from monthly returns
obtained from Ken French’s website. We also construct the log dividend price
ratio of the CRSP value weighted index ($DP_t$) using the moving sum of dividends
over the past 4 quarters. We measure the net issues by NYSE listed companies
normalized by the total market capitalization of all NYSE stocks ($NTIS_t$), similar
to Dichev (2007). The net issues measure, which takes into account IPOs, SEOs and
share repurchases, is computed as follows,

$$NTIS_t = \frac{MV_t - MV_{t-4}\cdot(1 + vwretx_{t-4,t})}{MV_t}, \tag{1.18}$$

where $MV_t$ is the total market cap of all NYSE listed stocks and $vwretx_t$ is the
value weighted return of all NYSE stocks excluding distributions. It measures
the growth in total market value not attributable to capital gains.
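
A minimal sketch of equation (1.18), with made-up market values, shows how
the measure strips out price appreciation. The function and argument names
are illustrative assumptions.

```python
def net_issues(mv_now, mv_four_quarters_ago, vwretx):
    """NTIS_t from eq. (1.18).

    mv_now:               total NYSE market cap at t
    mv_four_quarters_ago: total NYSE market cap at t-4
    vwretx:               value weighted return excluding distributions
                          over t-4 to t, as a decimal
    """
    capital_gain_value = mv_four_quarters_ago * (1.0 + vwretx)
    return (mv_now - capital_gain_value) / mv_now


# Hypothetical numbers: the market grew from 1000 to 1100 while prices
# rose 8 percent, so only part of the growth reflects new issuance.
print(net_issues(1100.0, 1000.0, 0.08))
```

A negative value would indicate that repurchases exceeded issuance over the
four quarters.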

Bond Market Variables: We also have the 3-month Treasury Bill rate ($Tbl_t$),
the term spread ($TMS_t$), calculated as the difference between the yield on
long term government bonds and the T-bill rate as in Campbell (1987), and the
default spread ($DEF_t$), computed as the difference between BAA and AAA-rated
corporate bond yields, as in Fama and French (1989).

1.4 Empirical Results

In this section we describe the data and test the major implications of the model.

1.4.1 Summary Statistics

In this section, we describe the statistical properties of the CTF. To facilitate
comparison, all numbers are annualized except the CTF. Table 1.1 Panel A
presents summary statistics of the main variable CTF, returns of the CRSP
value weighted index and other control variables. The average CTF is −15.4
percent with a standard deviation of 27.5 percent. The average CTF is negative
because by definition contrarian flows have the opposite sign to the flows of
the representative agent, and the representative agent has been buying risky
assets from corporations over the last several decades. The CTF is also much more
volatile than the dividend price ratio and therefore the CTF could potentially
explain more variation in returns than the dividend price ratio, as the latter is
often regarded as too smooth.² It is also worth noting that the autocorrelation
of CTF is 0.21, far from unit-root territory, and therefore it suffers little from
the spurious regression problem created by very persistent or even near unit
root regressors, as pointed out by Stambaugh (1999). The autocorrelation of the
CTF is significant but rather modest compared with other commonly used return
predictors, all of which show very high autocorrelations. The persistence of those
variables is also well documented in previous studies.

In Panel B of Table 1.1 we display the contemporaneous correlations between
the CTF and all other variables. The CTF shows a statistically significant −0.371
correlation with net issuance. This shows that the CTF is not a substitute for
the net issuance although they demonstrate a high correlation by construction.

1.4.2 Predictability Results

1.4.2.1 Excess Market Returns

To investigate whether the contrarian investors’ flows (CTF) predict the
aggregate market returns, we run the quarterly predictive regression

$$R^e_{t+1} = \alpha + \beta\,\mathrm{CTF}_t + \varepsilon_{t+1}, \tag{1.19}$$

where $R^e_{t+1}$ is the excess return of the CRSP value weighted index over
the Treasury Bill rate from $t$ to $t+1$, and $\mathrm{CTF}_t$ is the quarterly normalized
flow from the contrarian investors. This is a direct test of Proposition 3, which
links the contrarian investors’ flows to the market risk premium. The predictive
regression is performed with a list of additional forecasting variables observable
at $t$, such as the log dividend price ratio ($DP_t$), the short term (3-month) T-bill
rate ($Tbl_t$), the net issues by NYSE listed companies ($NTIS_t$), the term spread
($TMS_t$) and the default spread ($DEF_t$). The dividend yield along with other
variables might not act as perfect predictors, especially over the most recent period
(Boudoukh, Michaely, Richardson, and Roberts (2007), Goyal and Welch (2006)),
but they help us understand whether the CTF contains additional information
and provide us a yardstick to measure the information content of the CTF.

² The standard deviation of the dividend price ratio is less than 1 percent, while the
standard deviation of the log dividend price ratio shown in Table 1.1 is much larger.

Table 1.2 Panel A presents various specifications of the regressions. The
t-statistics reported in parentheses below the estimates are computed using Newey
and West (1987) heteroscedasticity and autocorrelation robust standard errors. In
column 1, we display the univariate case of the CTF as the only return predictor.
Its coefficient is positive and statistically significant as predicted by Proposition
3. This is one of the key results in this paper. The contrarian flow explains a
modest part of the quarterly return variations and the 1.6% $R^2$ measures up well
against the dividend price ratio results in the next column. The sign is in line
with what we expect from economic intuition and the model laid out above. The
CTF measures the flows into or out of the market by the contrarian investors.
A high CTF reveals that the representative agent is becoming more risk averse
and sells stocks, while the contrarian investors are net buyers of the market.
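
For readers who want to see the mechanics of the univariate regression with
Newey and West (1987) standard errors, a minimal pure-Python sketch follows.
It uses Bartlett kernel weights with no small-sample correction; the function
name, the lag length and the toy data are assumptions for illustration, and
the dissertation does not state the lag choice behind the reported
t-statistics.

```python
import math


def ols_newey_west(y, x, lags):
    """OLS of y on a constant and x, with a Newey-West slope standard error.

    Returns (alpha, beta, se_beta). Bartlett weights 1 - l/(lags+1) are
    applied to the autocovariances of the scores (x_t - mean(x)) * e_t.
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    alpha = mean_y - beta * mean_x
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]

    # Scores whose long-run variance drives the slope variance.
    g = [(xi - mean_x) * e for xi, e in zip(x, resid)]
    long_run = sum(gi * gi for gi in g)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)
        long_run += 2.0 * weight * sum(g[t] * g[t - lag] for t in range(lag, n))

    se_beta = math.sqrt(max(long_run, 0.0)) / sxx
    return alpha, beta, se_beta


# Toy data, not the dissertation's sample.
ctf = [0.0, 1.0, 2.0, 3.0, 4.0]
next_excess_return = [1.0, 3.5, 4.5, 7.5, 8.5]
alpha, beta, se = ols_newey_west(next_excess_return, ctf, lags=1)
print(alpha, beta, beta / se)
```

Setting `lags=0` reduces the estimator to heteroscedasticity-robust (White)
standard errors.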

In column 2, we show the benchmark case of the dividend price ratio
as the single predictor, because it has been used as a return predictor in numerous
empirical studies, for example Campbell and Shiller (1988a, 1988b) and Fama and
French (1988). The coefficient on the dividend yield is positive but not significant
and the variable explains little variation in quarterly excess returns because the
dividend yield is very smooth and has become insignificant in the most recent
history, as discussed in Valkanov (2003) and Goyal and Welch (2006). In column 3,
we regress returns on both the contrarian flow and the dividend price ratio side
by side. The point estimates of the two coefficients remain almost unchanged
compared with those in the simple regressions. The model fit appears to improve
on that obtained by the dividend price ratio alone as the adjusted $R^2$ increases
to 1.8%.

In columns 4 to 8, we incrementally add other variables that are known to
predict returns. In column 4, we include the lagged market excess return to test
whether the observed predictability is due to a mechanical autocorrelation in
returns. The coefficient of the lagged return is insignificant and does not help
to improve the model fit. Its addition does not affect the predictive power of
the contrarian flow. In column 5, we add the short term T-bill rate because
Fama and Schwert (1977), Campbell (1987), Hodrick (1992) and Ang and Bekaert
(2007) show that the short term risk-free rate is negatively related to future stock
returns. The coefficient of the short term risk-free rate is significantly negative,
which agrees with the extant literature. It is also interesting to note that the
predictive power of the dividend price ratio is greatly enhanced by the addition of
the T-bill rate and its coefficient turns significant, which is in sharp contrast
with the case where the dividend price ratio acts as the single predictor. The
same phenomenon is documented in Ang and Bekaert (2007). In column 6, we
further control for the net issues of NYSE listed companies, which measure the
change in the aggregate supply of risky assets. Baker and Wurgler (2000) and
Boudoukh, Michaely, Richardson, and Roberts (2007) employ related measures
for predictability tests and find that the change in equity supply negatively
forecasts returns. Controlling for the net issues is crucial for two reasons.
Theoretically, under the economic framework presented in the last section the two
measures both contain information about the representative agent’s risk aversion.
Empirically, the contrarian flows are by definition in the same direction as the
corporate issues and therefore the two are significantly correlated as shown in
Table 1.1. The coefficient of the net issues is negative and marginally significant,
consistent with the previous studies. More importantly, the addition of these two
variables does not diminish the forecast power of the contrarian flow, whose point
estimates and statistical significance remain almost unchanged.

Finally, in columns 7 and 8, we control for the term spread and the default
spread, which have been studied by Keim and Stambaugh (1986) and Fama and
French (1989) as proxies for business conditions. The coefficient for the term
spread is insignificant and that for the default spread is positive and marginally
significant. The results for both of them are consistent with previous findings.
Furthermore, the inclusion of the business cycle variables
does not alter the point estimate or the significance of the contrarian flow.
Therefore, the contrarian flow captures information about future risk premia
that is not subsumed by the other predictors.

Although the coefficient of the contrarian flow is statistically significant, we
have yet to compare its magnitude to the benchmark predictor, the dividend price
ratio. To understand the economic magnitude of the predictability, it is helpful
to compare the impact on excess returns of a one-standard-deviation shock to each
of these variables. A one-standard-deviation shock to the dividend yield
results in a 37 basis point ($0.019 \times \frac{0.393}{\sqrt{4}}$) increase in next quarter’s
CRSP index return. A similar shock to the contrarian flow produces a 115 basis
point change ($0.083 \times \frac{0.275}{\sqrt{4}}$) in next period’s returns.³ Hence, on a
quarterly basis, the economic significance of the contrarian flow is about 3 times
that of the dividend yield.
The coefficients on the T-bill rate, the net issues and the default spread are also
significant or marginally significant. The impacts of a one-standard-deviation
shock for these three variables are 113 bps, 48 bps and 79 bps respectively. Only
the short term T-bill rate shows an economic significance comparable to that of
the contrarian flow.
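
The arithmetic behind these magnitudes can be reproduced with a short
sketch. It assumes, as the expressions in the text suggest, that each
annualized standard deviation is scaled to a quarterly figure by dividing by
the square root of 4; the function name is illustrative. Small differences
from the basis points quoted in the text would reflect rounding in the
reported coefficients and standard deviations.

```python
import math


def quarterly_impact_bps(coefficient, annualized_std, periods_per_year=4):
    """Return impact, in basis points, of a one-standard-deviation shock."""
    quarterly_std = annualized_std / math.sqrt(periods_per_year)
    return coefficient * quarterly_std * 1e4


dividend_yield_impact = quarterly_impact_bps(0.019, 0.393)
contrarian_flow_impact = quarterly_impact_bps(0.083, 0.275)
print(round(dividend_yield_impact, 1), round(contrarian_flow_impact, 1))
print(round(contrarian_flow_impact / dividend_yield_impact, 1))
```

The last line gives the ratio of the two impacts, which is the basis for the
statement that the contrarian flow is roughly three times as economically
significant as the dividend yield.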

To summarize the findings in Table 1.2, all variables enter with the
economically expected sign in predicting the CRSP value weighted index return
and replicate existing studies. However, the most significant predictors are the
contrarian flow and the short term T-bill rate. Whether the forecasting ability of
the contrarian flow is consistent with the economic intuition explained in Section
2 is something we investigate extensively below.

1.4.2.2 Returns at Long Horizons

We have shown that the contrarian flow predicts the market excess returns of
the following quarter. Here we investigate longer horizon predictability for two
reasons. First, under the economic framework we presented in Section xx, the
change in the representative agent’s preference could carry its impact for an
extended period of time. Second, aggregating returns helps to eliminate the
noise in quarterly returns. In this section, we address these issues by examining
whether the contrarian flow predicts the excess market returns at horizons of two
quarters, three quarters, and up to three years.

In Panel B of Table 1.2 we display the long horizon regression results. The
dependent variables are $k$-period excess returns of the CRSP value weighted
index, $R^e_{t,t+k}$. The first row shows that the contrarian flow forecasts returns
significantly at all horizons up to 12 quarters. The predictive power of the
contrarian flow becomes stronger as we move from short to intermediate horizons.
However, the coefficients do not increase mechanically with the horizon: they peak
at 6 quarters and decrease gradually at longer horizons.
This hump-shaped pattern provides reassurance that the long horizon predictability
is not an artifact created by persistent regressors, as suggested by Boudoukh,
Richardson, and Whitelaw (2005). The rest of the rows present the forecasting power
of other variables at long horizons. The second and third rows show that the
T-bill rate and the dividend yield are significant only at short horizons, up to
3 quarters. The findings are consistent with those in Ang and Bekaert (2007),
who conclude that the predictability of the short rate is only significant at short
horizons.

³ The standard deviations of the variables are in Table 1.1.

The longer horizon results suggest that the forecastability of the contrarian
flow is mostly observable at horizons up to 12 quarters. The magnitude of the
contrarian flow coefficients increases first and decreases gradually and becomes
insignificant after 12 quarters or so. The fact that the coefficients form a hump
shape excludes the concern of spurious regressions produced by persistent return
predictors.
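
The long horizon dependent variable can be built from the quarterly series as
sketched below. The text does not say whether the $k$-period return is a sum
or a compounded return, so the simple sum used here is an assumption, as are
the function name and the toy numbers.

```python
def long_horizon_pairs(ctf, next_returns, k):
    """Pair CTF_t with the k-quarter excess return that starts after t.

    next_returns[t] is the excess return earned from t to t+1, so it is
    already aligned with ctf[t] for a one-quarter-ahead regression.
    """
    n = min(len(ctf), len(next_returns))
    pairs = []
    for t in range(n - k + 1):
        # Simple sum of quarterly excess returns (assumption).
        horizon_return = sum(next_returns[t:t + k])
        pairs.append((ctf[t], horizon_return))
    return pairs


# Toy series, for illustration only.
ctf = [0.1, -0.2, 0.3, 0.0]
next_returns = [0.01, 0.02, -0.01, 0.03]
for predictor, outcome in long_horizon_pairs(ctf, next_returns, k=2):
    print(predictor, outcome)
```

Adjacent observations share $k-1$ quarters of returns, so the regression
errors are serially correlated by construction, which is why
autocorrelation-robust standard errors matter at long horizons.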

1.4.2.3 Predicting Value and Growth Returns

We have shown that the contrarian flow predicts the CRSP value weighted
index at quarterly and longer horizons, as a high contrarian flow reveals a high
discount rate, consistent with the theoretical prediction in Proposition 3. One
important and testable implication of Proposition 3 is that the contrarian flow
should exhibit stronger predictive power for growth stocks than for value stocks
because growth stock returns have high covariances with market discount rates.
Brennan, Wang, and Xia (2004) and Campbell and Vuolteenaho (2004) both
establish that growth stocks are more correlated with proxies for discount rates
or investment opportunities and therefore growth stocks are less risky in the long
term and earn lower returns than value stocks. Although our goal is not to
explain the value premium in the cross-section, the fact that growth stocks are
more affected by discount rates provides us another angle to test our theory by
examining whether the contrarian flow predicts growth stocks more than value
stocks and hence could forecast the value premium.

To investigate whether the contrarian flow can forecast the value premium,
we run the quarterly predictive regression

$$R^{HML}_{t+1} = \alpha + \rho\,\mathrm{CTF}_t + \gamma Z_t + e_{t+1}, \tag{1.20}$$

where $R^{HML}_{t+1}$ is the return of the zero investment portfolio that is long value
stocks and short growth stocks, as proposed by Fama and French (1993), and $Z_t$ is a
vector of control variables as discussed above.

Table 1.3 Panel A displays various specifications of the regressions. In column
1 we display the univariate regression using the contrarian flow as the sole predictor.
The coefficient is negative and statistically significant. This confirms our theory
that high contrarian flow predicts high discount rates and as a result growth
stocks outperform value stocks in the future. In column 2, we control for
the dividend price ratio because it has been shown to predict excess market
returns. The coefficient for the dividend price ratio is negative, close to 0
and insignificant, and including it actually decreases the adjusted $R^2$, suggesting
it is not informative about the future value premium. Column 3 further controls for
the lagged value premium to exclude the chance that the predictability is due to
the autocorrelation in the HML. The lagged value premium appears to improve
the model fit although the coefficient itself is insignificant. Furthermore, the
point estimates of our main variable are barely altered in both magnitude and
statistical significance. From column 4 to 7, we incrementally add the T-bill rate,
the net issues, the term spread and the default spread. None of the controls
is statistically significant, with the possible exception of the T-bill rate, which
is marginally significant under certain specifications. All but the T-bill rate
also impair the model fit, judged by the diminished adjusted $R^2$, and the slope
coefficients of the contrarian flow actually become stronger, if anything. The
increase in the magnitude and statistical significance is undoubtedly due to the
addition of the T-bill rate, which has a significant positive correlation with the
main variable of interest.

For the same reason as in the last section, the change in the representative
agent’s risk price should affect the discount rates at intermediate horizons.
We examine the long horizon regressions for the value premium and the results
are displayed in Panel B of Table 1.3. Similar to the one quarter results, the
contrarian flow is the only significant variable at any horizon. The adjusted $R^2$
peaks at the one year horizon and the coefficients gradually decrease and become
insignificant at horizons longer than two years. This confirms the results in
Table 1.2 that the predictability is at short to intermediate intervals. The fact
that our test statistics do not grow with the horizon rules out the possibility of
a mechanical relation produced by overlapping returns and persistent regressors.

To recap the findings in Table 1.3, the contrarian flow is a significant predictor
for the value premium at short to intermediate horizons while other common
return predictors could not forecast the value premium.

It is natural to ask whether this predictability comes from the long side,
the short side, or both. To answer this question, we run the following tests,

$$R^{ff}_{i,t+1} = \alpha_i + \phi_i\,\mathrm{CTF}_t + e_{i,t+1}, \quad i = 1, 2, \ldots, 25, \tag{1.21}$$

where $R^{ff}_{i,t+1}$ represents the excess returns of the 25 Fama-French size and
book-to-market ratio sorted portfolios.

Table 1.4 reports the regression statistics for the CTF. First of all, all
$\phi$ coefficients are positive and 19 out of 25 are statistically significant at the
5% level. These are consistent with the notion that CTF predicts the aggregate
market with a positive coefficient. Secondly, holding size constant, the coefficients
decline from low book-to-market (growth) stocks to high book-to-market
(value) stocks. Figure 1.1 plots each portfolio’s coefficient against its own size and
book-to-market group. The declining pattern from growth to value is rather clear.
When CTF is low and contrarian investors are selling more stocks, both value
and growth stocks will underperform, as will the aggregate market. However,
the effect on growth stocks is almost twice as large as that on value stocks.

1.4.2.4 Contemporaneous Returns

We have shown that high contrarian investors’ flows reflect high risk aversion of
the representative agent and therefore the contrarian flow predicts the market
discount rate, and that the predictability is stronger for growth stocks than for
value stocks. For the same reason, prices and the contemporaneous returns should
move against the contrarian flow. This exact point is formulated in Proposition 4.
Therefore, we can directly test Proposition 4 by examining whether the contrarian
flow is negatively related to the contemporaneous market return. This provides
us with one straightforward yet important cross check of our theory.

In order to examine this implication, we regress the contemporaneous market
return on the contrarian flow with other control variables for the entire sample
period and the results are presented in Table 1.5. Column 1 displays the
univariate regression of the market returns on the current quarter contrarian flow.
The coefficient for the contrarian flow is negative and significant. Consistent with
the theory, high contrarian flow indicates high risk aversion of the representative
investor and the price goes down to adjust for the increase in the risk aversion.
In column 2, we add the net issues to control for the fact that the contrarian
flow is related to the corporate issues. The coefficient on the net issues is also
negative and statistically significant because the increase in supply exerts an
immediate price impact. Importantly, the point estimate and the statistical power
of the contrarian flow are not weakened by the addition of the net issues. In columns
3 to 5, we gradually add controls for the lagged market return, the dividend
price ratio, the T-bill rate, the term spread and the default spread. Only the
T-bill rate and the default spread enter the regression significantly and the finding
that the short term T-bill rates negatively correlate with the contemporaneous
market returns is consistent with the existing literature, dating back to Fama
and Schwert (1977). Nonetheless, none of these controls diminishes the negative
coefficients of the contrarian flow.

In sum, as our theory predicts, high contrarian flows reveal an upward
revision of the representative agent’s risk aversion and therefore prices and
contemporaneous market returns go down to reflect the change.

1.4.3 Disagreement or Risk

Not only heterogeneity in preferences but also disagreement about future earnings
could lead to contrarian flows. If some investors forecast future earnings
differently than the representative agent, they could trade as contrarian investors.
If those investors are much more optimistic in their earnings forecasts than the
representative investor, they would buy from the other investors, while if they
have a much lower estimate of future earnings than the prevailing investors
they would sell to the other investors. It is important to test whether our
contrarian flows are driven by changing risk aversion of the representative agent
or by differences in opinion about the fundamentals. In this section, we try to
disentangle the two possibilities and provide more evidence that the
contrarian flows reflect the risk tolerance of the prevailing investors.

1.4.3.1 Analyst Dispersion

Miller (1977) conjectures that high disagreement leads to low future returns
because the price reflects the most optimistic investors’ valuation in the presence
of short-sale constraints. Park (2005) tests the conjecture for the aggregate
stock market (S&P 500 index) and finds that the dispersion in analysts’ earnings
forecasts negatively predicts the market returns at intermediate horizons. The
findings are consistent with Miller’s hypothesis that high dispersion precedes
lower market returns. We use the analyst forecast dispersion as a proxy for the
difference in beliefs to establish that the contrarian flows are not generated by
disagreement about fundamentals and that the predictability from contrarian flows
is unaffected by it.

The analyst earnings forecasts for the S&P 500 index are extracted from
the Institutional Brokers Estimate System (I/B/E/S) database. We follow Park
(2005) to construct the dispersion ($DISP_t$) variable as follows,

$$DISP_t = \frac{w_q\sigma_{1,t} + (1 - w_q)\sigma_{2,t}}{e_t}, \quad \text{for the first three quarters} \tag{1.22}$$

$$DISP_t = \frac{\sigma_{2,t}}{e_t}, \quad \text{for the last quarter of the year} \tag{1.23}$$

where $\sigma_{1,t}$ and $\sigma_{2,t}$ are the standard deviations of the current year earnings
forecasts and the subsequent year earnings forecasts made in the current quarter,
$e_t$ is the trailing 12-month actual earnings of the S&P 500 index, and $w_q$ is
$\frac{3}{4}$, $\frac{1}{2}$ and $\frac{1}{4}$ respectively for the first three quarters of the year.
The normalization by the
actual earnings is to ensure that the dispersion does not grow mechanically with
the magnitude of the actual earnings. The weighting scheme between the current
year and the following year is to keep the forecast horizon relatively stable as the
calendar time progresses. I/B/E/S started reporting forecasts of the S&P 500
earnings per share for the current and the following year in 1982 but there
were not enough analyst estimates to calculate the standard
deviation until 1984.
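
The dispersion measure in equations (1.22) and (1.23) can be sketched as
below. The sketch reads the first term of the weighted average as the
current year dispersion $\sigma_{1,t}$, which is an interpretation based on the
definitions that accompany the equations; the function name and the sample
inputs are hypothetical.

```python
# Weight on the current-year forecast dispersion in quarters 1 to 3.
CURRENT_YEAR_WEIGHT = {1: 0.75, 2: 0.50, 3: 0.25}


def analyst_dispersion(sigma_current, sigma_next, trailing_earnings, quarter):
    """DISP_t following eqs. (1.22) and (1.23).

    sigma_current:     std. dev. of current-year earnings forecasts
    sigma_next:        std. dev. of next-year earnings forecasts
    trailing_earnings: trailing 12-month actual index earnings
    quarter:           calendar quarter, 1 to 4
    """
    if quarter == 4:
        # Last quarter: only the next-year forecasts are used.
        return sigma_next / trailing_earnings
    weight = CURRENT_YEAR_WEIGHT[quarter]
    blended = weight * sigma_current + (1.0 - weight) * sigma_next
    return blended / trailing_earnings


# Made-up inputs for illustration.
print(analyst_dispersion(2.0, 4.0, 50.0, quarter=1))
print(analyst_dispersion(2.0, 4.0, 50.0, quarter=4))
```

As the year progresses the weight shifts toward next year’s forecasts, which
keeps the effective forecast horizon roughly constant.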

Table 1.6 compares the forecasts made by the analyst dispersion and the
contrarian flow with other control variables for the sample of 1984 to 2005. Panel
A displays the one quarter predictability tests of both variables. We observe that
the contrarian flow predicts future CRSP returns even after controlling for analyst
dispersion and other macro variables. The point estimate of the contrarian flow’s
coefficient in this subsample, 0.093, is very similar to that of the full sample in
Table 1.2, 0.085. As expected, $DISP_t$ predicts next quarter’s market return
negatively and it is statistically significant as a single predictor. The predictive
power of the disagreement measure weakens as we add in the contrarian flow and
finally becomes insignificant as we control for a full set of macro variables. The
results are consistent with Park (2005) in that the predictive power of $DISP_t$
is stronger at intermediate horizons. Panel B presents the long horizon results.
The contrarian flow remains significant at almost all horizons in the presence
of the disagreement measure. The point estimates for $DISP_t$ are significant at
intermediate horizons, from 2 to 7 quarters, which is qualitatively very similar to
the results in Park (2005).

The comparisons between the contrarian flow and the analyst dispersion are
important because differences in belief could serve as one primary motive for
trading. Indeed, it could be argued that contrarian investors trade a lot when they
disagree most with the representative agent. The fact that the contrarian flow is
not affected by the disagreement proxy excludes the possibility that differential
information is the explanation for our main results.

1.4.3.2 Who Acts Contrarian?

It is useful to dissect the investor groups by the direction in which they adjust
their risky asset holdings, so that each investor group may be labeled differently
through time. However, it is also interesting to examine how each group’s flow
correlates with the contrarian flow and who makes these contrarian trades more
often. To answer this question, we look at two measures.

The first measure we adopt is the linear correlation between each investor
group’s normalized flow and the contrarian flow. The correlations are presented
in Table 1.7 column 1 with p-values in parentheses. Out of 11 investor
groups, the flows of 5 groups (households, life insurance companies, private
pensions, saving institutions and foreigners) correlate negatively with the
contrarian flow and the flow of 1 group (federal pension funds, which first
appeared in the data in 1987) correlates positively with the contrarian flow.
The rest do not show significant correlations with the contrarian
flow. Although it is difficult to state the exact reason why each group has a positive
or negative correlation with the contrarian flow, it is certainly plausible that
households avoid risky assets in bad times, which makes them trade in the opposite
direction to the contrarian investors. Similarly, federal pensions could adopt an
investment plan that contributes to the stock market constantly regardless of
market conditions and prevailing risk appetites.

The second measure we adopt is the frequency with which each investor group acts
as a contrarian investor. However, the raw frequency measure does not take
historical trends into account. On the one hand, corporations issue shares to the
representative agent more than half of the time and therefore the contrarian
investors have negative flows in those periods. On the other hand, households’
direct holdings of the stock market have declined dramatically since the 1960s and
institutional investors’ holdings increased over the same time, as documented by
Gompers and Metrick (2002) and Cohen (1999). Therefore, if we only look at
the raw frequency, households trade as contrarian investors about 67% of the
time, which simply reflects institutionalization and the downward trend in
household direct holdings. As a result, we adjust the frequency measure for the
part that is created only by the trend. Take households as
an example. The contrarian flow is negative about three quarters of the time
and positive the remaining quarter of the time. At the same time, the household
flow is negative about 90% of the time and positive 10% of the time,
which is the historical trend. Therefore, households have a probability of
0.70 = 0.75 · 0.90 + 0.25 · 0.10 of being labeled as contrarian just by random
chance. Considering this trend, the adjusted frequency of being a contrarian
investor is 0.67 − 0.70 = −0.03. The adjusted frequency reveals that the actual
tendency of households to act as contrarians is negative even though their raw
frequency of being a contrarian investor is the highest. Column 2 of Table 1.7
reports the raw frequency. Households have the highest raw frequency
of being contrarian and all other groups are contrarian less than half of
the time. The pattern simply reflects that households have reduced their direct
stock holdings and institutions have increased theirs. Column 3
displays the adjusted frequency with p-values in parentheses. The adjusted
frequency corrects the bias created by the historical trend and describes the
propensity of being contrarian more accurately. The adjusted frequencies are
generally in line with the correlation measure. There are 4 groups (households,
life insurance companies, private pensions, and foreigners) that have significantly
negative adjusted frequencies and 1 group (federal pensions) that has a positive
adjusted frequency.
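
The trend adjustment in the household example can be written out as a short Python sketch. It assumes, as the worked example does, that chance agreement is computed as if the sign of the group's flow and the sign of the contrarian flow were independent; the function names are illustrative only.

```python
def chance_agreement(p_contrarian_negative, p_group_negative):
    """Probability that a group's flow shares the sign of the contrarian flow
    purely by chance, treating the two signs as independent (assumption)."""
    both_negative = p_contrarian_negative * p_group_negative
    both_positive = (1 - p_contrarian_negative) * (1 - p_group_negative)
    return both_negative + both_positive


def adjusted_frequency(raw_frequency, p_contrarian_negative, p_group_negative):
    """Raw contrarian frequency minus the part explained by trends alone."""
    return raw_frequency - chance_agreement(p_contrarian_negative, p_group_negative)


# Household example from the text: raw frequency 0.67, contrarian flow
# negative 75% of the time, household flow negative 90% of the time.
household_adjusted = adjusted_frequency(0.67, 0.75, 0.90)
```

With the household inputs, the chance agreement is 0.70 and the adjusted frequency is approximately −0.03, matching the numbers above.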

1.5 Robustness

In this section, we provide several robustness checks of the main results in Table
1.2. We address the in-sample and out-of-sample predictability issue discussed in
Goyal and Welch (2006) and further provide subsample results.

1.5.1 Out of Sample Tests

Goyal and Welch (2006) demonstrate that most known return predictors perform
poorly when judged by out-of-sample tests and conclude that those predictors would
hardly help an investor to time the market. Therefore it is of both economic and
statistical importance to examine the out-of-sample performance of the
ContrarianFlow variable.

We adopt the out-of-sample test methodology proposed by Goyal and Welch
(2006). We first choose an initial estimation period and run simple regressions
using the data within the initial estimation period to obtain an initial estimate
of the model. Then we perform a one-quarter out-of-sample forecast based on the
initial estimate. We reestimate the model adding one data point at a time and
predict returns with the rolling estimates. Finally, we compare the conditional
forecast with the unconditional forecast based only on the historical mean. The
benchmarks we employ are based on the mean square error (MSE) and the root
mean square error (RMSE). We denote the mean square error from the rolling
historical mean $MSE_U$ (unconditional) and that from the rolling regression $MSE_C$
(conditional). The out-of-sample statistics are calculated as follows,

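\[
\begin{aligned}
R^2 &= 1 - \frac{MSE_C}{MSE_U}, \\
\Delta RMSE &= \sqrt{MSE_U} - \sqrt{MSE_C}, \\
MSE\text{-}F &= (T - h + 1)\cdot\left(\frac{MSE_U - MSE_C}{MSE_C}\right),
\end{aligned} \tag{1.24}
\]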

where MSE-F is the McCracken (2004) F-statistic, which tests the null hypothesis
that ∆MSE = 0, and h is the degree of overlap (h = 1 for no overlap). The MSE-F
statistic follows a non-standard distribution, so we bootstrap to compute the
critical values. Under the null of no predictability we first estimate the following
specification using the whole sample,

\[
\begin{aligned}
R^e_{t+1} &= \alpha + u_{1,t+1}, \\
CTF_{t+1} &= \kappa + \rho\, CTF_t + u_{2,t+1},
\end{aligned} \tag{1.25}
\]

and keep the residuals. We then resample the residuals to generate 10,000 paths
using the data generating process above. For each of the 10,000 time series, we
perform the same out-of-sample analysis as we did with the actual data and calculate
the distributions of the OOS $R^2$, the ∆RMSE and the MSE-F statistics. In this
way, we obtain the p-values of the three benchmarks.
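
The procedure described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the code used for Table 1.8: the function names are hypothetical, the number of forecasts plays the role of T, the two residual series are resampled jointly (the text does not say how they are drawn), and p-values are one-sided shares of simulated statistics at least as large as the actual ones.

```python
import numpy as np


def oos_statistics(returns, predictor, n_initial, h=1):
    """Expanding-window out-of-sample test of r[t] = a + b * x[t-1] + u[t]."""
    r = np.asarray(returns, dtype=float)
    x = np.asarray(predictor, dtype=float)
    err_c, err_u = [], []
    for t in range(n_initial, len(r)):
        # Estimate only on pairs (x[s-1], r[s]) with s < t, then forecast r[t].
        slope, intercept = np.polyfit(x[:t - 1], r[1:t], 1)
        err_c.append(r[t] - (intercept + slope * x[t - 1]))
        err_u.append(r[t] - r[:t].mean())  # historical-mean benchmark
    mse_c = np.mean(np.square(err_c))
    mse_u = np.mean(np.square(err_u))
    n_forecasts = len(err_c)  # assumption: this is T in equation (1.24)
    return {
        "oos_r2": 1.0 - mse_c / mse_u,
        "delta_rmse": np.sqrt(mse_u) - np.sqrt(mse_c),
        "mse_f": (n_forecasts - h + 1) * (mse_u - mse_c) / mse_c,
    }


def bootstrap_p_values(returns, predictor, n_initial, n_paths=1000, seed=0):
    """Bootstrap p-values under the no-predictability null of equation (1.25)."""
    r = np.asarray(returns, dtype=float)
    x = np.asarray(predictor, dtype=float)
    rng = np.random.default_rng(seed)
    alpha = r[1:].mean()
    u1 = r[1:] - alpha                         # return residuals
    rho, kappa = np.polyfit(x[:-1], x[1:], 1)  # AR(1) for the predictor
    u2 = x[1:] - (kappa + rho * x[:-1])        # predictor residuals
    actual = oos_statistics(r, x, n_initial)
    exceed = {key: 0 for key in actual}
    for _ in range(n_paths):
        # Joint draw keeps the contemporaneous link between u1 and u2 (assumption).
        draw = rng.integers(0, len(u1), size=len(r) - 1)
        sim_r = np.empty_like(r)
        sim_x = np.empty_like(x)
        sim_r[0], sim_x[0] = r[0], x[0]
        sim_r[1:] = alpha + u1[draw]
        for t in range(1, len(x)):
            sim_x[t] = kappa + rho * sim_x[t - 1] + u2[draw[t - 1]]
        simulated = oos_statistics(sim_r, sim_x, n_initial)
        for key in actual:
            exceed[key] += int(simulated[key] >= actual[key])
    return {key: exceed[key] / n_paths for key in actual}
```

The text uses 10,000 simulated paths; the default of 1,000 here only keeps the sketch quick to run.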

There is little theoretical guidance in choosing the initial estimation period
except that we consider two basic and intuitive principles. The first criterion is
that we need to have enough data to estimate the model reliably. The second is
that we need a long enough out-of-sample evaluation period. Note that these
two criteria are motivated by both economic and statistical concerns. It is apparent
that a large sample helps to eliminate noise and improve statistical power.
Moreover, it is preferable to establish stable economic relations over a long
period of time. Weighing the two principles, we select the first half of the sample
(23 years) as the estimation period (1960-1982) and evaluate the model with the
second half of the sample (1983-2005).

The out-of-sample results are presented in Table 1.8. The one quarter
non-overlapping regression achieves an out-of-sample adjusted $R^2$ of 2.4%, which
is remarkably similar to the in-sample adjusted $R^2$ of the second half
of the sample (shown in Table ...). Judging by the MSE and RMSE,
we can reject the null hypothesis that the CTF variable contains no more
information about expected returns than the unconditional mean. In this
benchmark case, the out-of-sample test substantiates the in-sample regression
statistics and the predictability is economically meaningful for outside investors.
Moreover, at all horizons our conditional estimates improve on their unconditional
counterparts. The fact that our out-of-sample estimates improve on the
unconditional forecasts at every horizon demonstrates the economic significance of
our predictability results.

1.6 Conclusion

In this paper, we use the capital flow of contrarian investors to study the
implication for the market risk premium of heterogeneity in investor risk
preferences. We first show that the flow of contrarian investors contains
information about the risk aversion of the representative agent in a model in
which investors have heterogeneous risk preferences. A high contrarian flow
reveals to the econometrician that the representative agent has become more risk averse.
Therefore, it is reasonable to suspect that the CTF reflects the market risk premium
required by the representative agent, and we explore this link by asking whether
the CTF forecasts market excess returns.

Our results show that in a time-series regression, the CTF predicts the
CRSP value weighted index at the quarterly horizon. Economically, this forecasting
relationship is about 3 times as strong as that of the widely used benchmark, the
dividend yield. The economic and statistical significance is robust to the addition of
other control variables. We conjecture that the CTF predicts market risk premia
because our theory suggests that it reflects the risk aversion of the representative
agent. We verify this hypothesis by examining the predictability for value and
growth stocks separately and find that the forecastability is stronger for growth
stocks than for value stocks. The difference in forecastability, together with the
notion that growth stocks bear more discount rate risk, confirms our hypothesis
that the contrarian flow measures the market risk premium.

The traditional literature has studied time-varying risk premia through
aggregate flow or issuance. Our study introduces the CTF as a measure of the
risk premia required by the market, especially when investors are heterogeneous in
their risk preferences. This measure provides a new way to uncover information
lost in the aggregate variables.

1.7 Proofs

1.7.1 Proof of Lemma 1 and Proposition 1

The proof of Lemma 1 is as follows. We first show that there exists a
representative agent through aggregation. Because both groups of investors are
assumed to be myopic, their demands for the risky asset are

\[
\begin{aligned}
H^a_t &= w\,\frac{E_t[Q_{t+1}]}{a\,\mathrm{Var}_t[Q_{t+1}]}, \\
H^b_t &= (1 - w)\,\frac{E_t[Q_{t+1}]}{b_t\,\mathrm{Var}_t[Q_{t+1}]},
\end{aligned} \tag{1.26}
\]

where $Q_{t+1} = P_{t+1} + D_{t+1} - R P_t$ is the excess return of the stock. The demand
functions are standard results of mean-variance analysis.

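The sum of the two groups’ demands is

\[
\begin{aligned}
H^a_t + H^b_t &= \frac{w}{a}\,\frac{E_t[Q_{t+1}]}{\mathrm{Var}_t[Q_{t+1}]} + \frac{1 - w}{b_t}\,\frac{E_t[Q_{t+1}]}{\mathrm{Var}_t[Q_{t+1}]} \\
&= \left(\frac{w}{a} + \frac{1 - w}{b_t}\right)\frac{E_t[Q_{t+1}]}{\mathrm{Var}_t[Q_{t+1}]} \\
&= \frac{w b_t + (1 - w)a}{a b_t}\,\frac{E_t[Q_{t+1}]}{\mathrm{Var}_t[Q_{t+1}]} \\
&= \frac{E_t[Q_{t+1}]}{\gamma_t\,\mathrm{Var}_t[Q_{t+1}]},
\end{aligned} \tag{1.27}
\]

where $\gamma_t = \frac{a b_t}{w b_t + (1 - w)a}$. It is clear that the aggregate demand of all
the investors is equivalent to that of a representative agent with risk aversion
$\gamma_t$ defined as above.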
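
As a quick numerical sanity check of the aggregation step, the following Python sketch compares the sum of the two groups' demands with the demand of a single agent whose risk aversion is $\gamma_t$. The parameter values are hypothetical and chosen only for illustration.

```python
def group_demands(w, a, b_t, expected_excess_return, variance):
    """Mean-variance demands of the two investor groups, as in equation (1.26)."""
    ratio = expected_excess_return / variance
    demand_a = w * ratio / a
    demand_b = (1 - w) * ratio / b_t
    return demand_a, demand_b


def representative_risk_aversion(w, a, b_t):
    """gamma_t = a * b_t / (w * b_t + (1 - w) * a), as in equation (1.27)."""
    return a * b_t / (w * b_t + (1 - w) * a)


# Hypothetical inputs: 40% of investors in group A, risk aversions 2 and 5,
# expected excess return 0.06 and conditional variance 0.04.
w, a, b_t = 0.4, 2.0, 5.0
mu, var = 0.06, 0.04
h_a, h_b = group_demands(w, a, b_t, mu, var)
gamma_t = representative_risk_aversion(w, a, b_t)
aggregate_demand = mu / (gamma_t * var)
# h_a + h_b and aggregate_demand agree up to floating-point error.
```

Note that $\gamma_t$ is a weighted harmonic mean of the two risk aversions, so it moves with $b_t$ even though $a$ is constant.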

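In order to prove Proposition 1, we guess the form of the price function,

\[
P_t = F_t - D_t + p_0 + p_1\gamma_t + p_2 X_t, \tag{1.28}
\]

where $F_t = \frac{R\bar{D}}{r} + \frac{R}{R - \alpha_D}\hat{D}_t + \frac{1}{R - \alpha_D}S_t$. Therefore, the excess return of the risky
asset is

\[
\begin{aligned}
Q_{t+1} &= P_{t+1} + D_{t+1} - R P_t \\
&= \frac{R}{R - \alpha_D}\varepsilon_{t+1} + \frac{1}{R - \alpha_D}S_{t+1} - r(p_0 + p_1\bar{\gamma}) \\
&\quad + p_1(\alpha_\gamma - R)\hat{\gamma}_t + p_1 u_{\gamma,t+1} \\
&\quad + p_2(\alpha_X - R)\hat{X}_t + p_2 u_{X,t+1}.
\end{aligned} \tag{1.29}
\]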

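The conditional expected return is

\[
E_t[Q_{t+1}] = -r(p_0 + p_1\bar{\gamma}) + p_1(\alpha_\gamma - R)\hat{\gamma}_t + p_2(\alpha_X - R)\hat{X}_t, \tag{1.30}
\]

and the conditional variance of the excess return is

\[
\mathrm{Var}_t[Q_{t+1}] = \underbrace{\left(\frac{R}{R - \alpha_D}\right)^2\sigma_\varepsilon^2 + \left(\frac{1}{R - \alpha_D}\right)^2\sigma_S^2}_{\sigma_F^2} + \underbrace{p_1^2\sigma_\gamma^2 + p_2^2\sigma_X^2 + 2p_1p_2\sigma_{X,\gamma}}_{\sigma_Q^2}, \tag{1.31}
\]

where $\sigma_{X,\gamma} = \rho_{\gamma,X}\sigma_X\sigma_\gamma$.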

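To solve for the equilibrium price function, we impose the market clearing
condition $H^a_t + H^b_t = 1 + X_t$, so that

\[
\begin{aligned}
\frac{E_t[Q_{t+1}]}{\gamma_t\,\mathrm{Var}_t[Q_{t+1}]} &= 1 + X_t, \\
E_t[Q_{t+1}] &= (\gamma_t\,\mathrm{Var}_t[Q_{t+1}])(1 + X_t) \\
&= \gamma_t(\sigma_F^2 + \sigma_Q^2)(1 + X_t) \\
&= (\sigma_F^2 + \sigma_Q^2)(\bar{\gamma} + \bar{\gamma}\hat{X}_t + \hat{\gamma}_t + \hat{\gamma}_t\hat{X}_t).
\end{aligned} \tag{1.32}
\]

Substituting equation (1.30) into equation (1.32), we have

\[
-r(p_0 + p_1\bar{\gamma}) + p_1(\alpha_\gamma - R)\hat{\gamma}_t + p_2(\alpha_X - R)\hat{X}_t = (\sigma_F^2 + \sigma_Q^2)(\bar{\gamma} + \bar{\gamma}\hat{X}_t + \hat{\gamma}_t + \hat{\gamma}_t\hat{X}_t) \tag{1.33}
\]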
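Matching terms, we obtain the following equations,

\[
\begin{aligned}
(\sigma_F^2 + \sigma_Q^2)\bar{\gamma} &= -r(p_0 + p_1\bar{\gamma}), \\
(\sigma_F^2 + \sigma_Q^2)\hat{\gamma}_t &= p_1(\alpha_\gamma - R)\hat{\gamma}_t, \\
(\sigma_F^2 + \sigma_Q^2)\bar{\gamma}\hat{X}_t &= p_2(\alpha_X - R)\hat{X}_t,
\end{aligned} \tag{1.34}
\]

and $p_1$ solves the following quadratic equation,

\[
\sigma_P^2 p_1^2 - (\alpha_\gamma - R)p_1 + \sigma_F^2 = 0, \tag{1.35}
\]

where $\sigma_P^2 = \sigma_\gamma^2 + \left(\frac{\alpha_\gamma - R}{\alpha_X - R}\right)^2\bar{\gamma}^2\sigma_X^2 + 2\left(\frac{\alpha_\gamma - R}{\alpha_X - R}\right)\bar{\gamma}\sigma_{X,\gamma} > 0$. Both $p_0$ and $p_2$ are linear
functions of $p_1$. Both roots of the quadratic equation above are negative and we
choose the one that goes to 0 as $\sigma_F^2$ goes to 0. Therefore,

\[
\begin{aligned}
p_0 &= -\frac{(\sigma_F^2 + \sigma_Q^2)\bar{\gamma}}{r} - p_1\bar{\gamma}, \\
p_1 &= \frac{\alpha_\gamma - R + \sqrt{(\alpha_\gamma - R)^2 - 4\sigma_P^2\sigma_F^2}}{2\sigma_P^2}, \\
p_2 &= p_1\bar{\gamma}\,\frac{\alpha_\gamma - R}{\alpha_X - R}.
\end{aligned} \tag{1.36}
\]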
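
The root selection in equation (1.36) can be checked numerically with a short Python sketch. The parameter values used with it are hypothetical, and the function names are introduced here only for illustration; the code simply evaluates $\sigma_P^2$, picks the root of (1.35) that vanishes as $\sigma_F^2$ goes to zero, and derives $p_2$ from $p_1$.

```python
import math


def sigma_p_squared(alpha_gamma, alpha_x, R, sigma_gamma2, sigma_x2,
                    sigma_xgamma, gamma_bar):
    """sigma_P^2 as defined below equation (1.35)."""
    ratio = (alpha_gamma - R) / (alpha_x - R)
    return (sigma_gamma2
            + ratio ** 2 * gamma_bar ** 2 * sigma_x2
            + 2 * ratio * gamma_bar * sigma_xgamma)


def solve_p1(alpha_gamma, R, sigma_p2, sigma_f2):
    """Root of sigma_P^2 p^2 - (alpha_gamma - R) p + sigma_F^2 = 0 that
    tends to zero as sigma_F^2 tends to zero (the '+' root in (1.36))."""
    b = alpha_gamma - R
    discriminant = b * b - 4 * sigma_p2 * sigma_f2
    if discriminant < 0:
        raise ValueError("no real root: fundamental variance too large")
    return (b + math.sqrt(discriminant)) / (2 * sigma_p2)


def solve_p2(p1, alpha_gamma, alpha_x, R, gamma_bar):
    """p2 = p1 * gamma_bar * (alpha_gamma - R) / (alpha_X - R)."""
    return p1 * gamma_bar * (alpha_gamma - R) / (alpha_x - R)
```

Because $\alpha_\gamma - R < 0$, the sum of the two roots is negative and their product is positive, which is why both roots are negative whenever the discriminant is non-negative.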

1.7.2 Proof of Proposition 2

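The proof of Proposition 2 is as follows. From the proof of Proposition 1 above, we
can derive the flows of both groups of investors,

\[
\begin{aligned}
\Delta H^a_t &= \frac{w}{a}\left[\gamma_t(1 + X_t) - \gamma_{t-1}(1 + X_{t-1})\right] \\
&= \frac{w}{a}\left[(\alpha_\gamma - 1)\hat{\gamma}_{t-1} + u_{\gamma,t} + \bar{\gamma}(\hat{X}_t - \hat{X}_{t-1})\right], \\
\Delta H^b_t &= \frac{a - w\gamma_t}{a}(1 + X_t) - \frac{a - w\gamma_{t-1}}{a}(1 + X_{t-1}) \\
&= -\frac{w}{a}(\gamma_t - \gamma_{t-1}) + X_t - X_{t-1} - \frac{w}{a}(\gamma_t X_t - \gamma_{t-1}X_{t-1}) \\
&= -\frac{w}{a}\left[(\alpha_\gamma - 1)\hat{\gamma}_{t-1} + u_{\gamma,t}\right] + \left(1 - \frac{w}{a}\bar{\gamma}\right)(\hat{X}_t - \hat{X}_{t-1}).
\end{aligned} \tag{1.37}
\]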

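Therefore,

\[
\begin{aligned}
\mathrm{Cov}(\Delta H^a_t, \Delta X_t) &= \frac{w}{a}\,\mathrm{Cov}\left(\hat{\gamma}_t - \hat{\gamma}_{t-1} + \bar{\gamma}(\hat{X}_t - \hat{X}_{t-1}),\; \hat{X}_t - \hat{X}_{t-1}\right) \\
&= \frac{w}{a}\left[\mathrm{Cov}(\hat{X}_t, \hat{\gamma}_t) - \mathrm{Cov}(\hat{X}_t, \hat{\gamma}_{t-1}) - \mathrm{Cov}(\hat{X}_{t-1}, \hat{\gamma}_t) + \mathrm{Cov}(\hat{X}_{t-1}, \hat{\gamma}_{t-1}) + \bar{\gamma}\,\mathrm{Var}(\hat{X}_t - \hat{X}_{t-1})\right] \\
&= \frac{w}{a}\left[\frac{1}{1 - \alpha_X\alpha_\gamma}\sigma_{X,\gamma} - \frac{\alpha_X}{1 - \alpha_X\alpha_\gamma}\sigma_{X,\gamma} - \frac{\alpha_\gamma}{1 - \alpha_X\alpha_\gamma}\sigma_{X,\gamma} + \frac{1}{1 - \alpha_X\alpha_\gamma}\sigma_{X,\gamma} + \bar{\gamma}(2 - \alpha_X)\sigma_X^2\right] \\
&= \frac{w}{a}\left[\frac{2 - \alpha_X - \alpha_\gamma}{1 - \alpha_X\alpha_\gamma}\sigma_{X,\gamma} + \bar{\gamma}(2 - \alpha_X)\sigma_X^2\right],
\end{aligned} \tag{1.38}
\]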

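and

\[
\begin{aligned}
\mathrm{Cov}(\Delta H^b_t, \Delta X_t) &= -\frac{w}{a}\,\mathrm{Cov}(\hat{\gamma}_t - \hat{\gamma}_{t-1},\; \hat{X}_t - \hat{X}_{t-1}) + \left(1 - \frac{w}{a}\bar{\gamma}\right)\mathrm{Var}(\hat{X}_t - \hat{X}_{t-1}) \\
&= -\frac{w}{a}\cdot\frac{2 - \alpha_X - \alpha_\gamma}{1 - \alpha_X\alpha_\gamma}\sigma_{X,\gamma} + \left(1 - \frac{w}{a}\bar{\gamma}\right)(2 - \alpha_X)\sigma_X^2.
\end{aligned} \tag{1.39}
\]

$\mathrm{Cov}(\Delta H^b_t, \Delta X_t) > 0$ as $\sigma_{X,\gamma}$ is assumed to be negative.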

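We can denote $\rho^* = -\frac{\bar{\gamma}(2 - \alpha_X)(1 - \alpha_X\alpha_\gamma)}{2 - \alpha_X - \alpha_\gamma}\cdot\frac{\sigma_X}{\sigma_\gamma}$. It is clear from equation (1.38)
that if $\rho_{X,\gamma} < \rho^*$, then $\mathrm{Cov}(\Delta H^a_t, \Delta X_t) < 0$.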
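
The threshold $\rho^*$ can be verified numerically. The Python sketch below evaluates the last line of equation (1.38) as a function of the correlation and checks that the covariance changes sign at $\rho^*$; the parameter values passed to it are hypothetical, and $w/a$ is normalized to one by default because it only scales the covariance.

```python
def cov_flow_a_supply(rho, alpha_x, alpha_gamma, gamma_bar,
                      sigma_x, sigma_gamma, w_over_a=1.0):
    """Last line of equation (1.38) with sigma_{X,gamma} = rho * sigma_X * sigma_gamma."""
    sigma_xgamma = rho * sigma_x * sigma_gamma
    k = (2 - alpha_x - alpha_gamma) / (1 - alpha_x * alpha_gamma)
    return w_over_a * (k * sigma_xgamma + gamma_bar * (2 - alpha_x) * sigma_x ** 2)


def rho_star(alpha_x, alpha_gamma, gamma_bar, sigma_x, sigma_gamma):
    """Correlation at which the covariance in (1.38) is exactly zero."""
    k = (2 - alpha_x - alpha_gamma) / (1 - alpha_x * alpha_gamma)
    return -gamma_bar * (2 - alpha_x) / k * sigma_x / sigma_gamma


# Hypothetical parameters: equal persistence of 0.9, mean risk aversion 2,
# supply volatility 0.01 and risk-aversion volatility 0.1.
params = dict(alpha_x=0.9, alpha_gamma=0.9, gamma_bar=2.0,
              sigma_x=0.01, sigma_gamma=0.1)
threshold = rho_star(**params)
below = cov_flow_a_supply(threshold - 0.1, **params)  # negative covariance
above = cov_flow_a_supply(threshold + 0.1, **params)  # positive covariance
```

For the threshold to be meaningful it has to lie inside the admissible range of a correlation, which holds when supply volatility is small relative to the volatility of risk aversion, as in the illustrative parameters.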

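\[
\begin{aligned}
\mathrm{Cov}(\Delta X_t, E_t[Q_{t+1}]) &= \mathrm{Cov}\left(\Delta\hat{X}_t,\; p_1(\alpha_\gamma - R)\hat{\gamma}_t + p_2(\alpha_X - R)\hat{X}_t\right) \\
&= p_1(\alpha_\gamma - R)\frac{1 - \alpha_\gamma}{1 - \alpha_\gamma\alpha_X}\sigma_{X,\gamma} + p_2(\alpha_X - R)\sigma_X^2 \\
&= p_1(\alpha_\gamma - R)\frac{1 - \alpha_\gamma}{1 - \alpha_\gamma\alpha_X}\sigma_{X,\gamma} + p_1(\alpha_\gamma - R)\bar{\gamma}\sigma_X^2 \\
&= \underbrace{p_1(\alpha_\gamma - R)}_{>0}\left(\frac{1 - \alpha_\gamma}{1 - \alpha_\gamma\alpha_X}\sigma_{X,\gamma} + \bar{\gamma}\sigma_X^2\right).
\end{aligned} \tag{1.40}
\]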

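From equation (1.40), if $\rho_{X,\gamma} < \rho^{**}$, then $\mathrm{Cov}(\Delta X_t, E_t[Q_{t+1}]) < 0$, where $\rho^{**} =
-\frac{\bar{\gamma}(1 - \alpha_X\alpha_\gamma)}{1 - \alpha_X}\cdot\frac{\sigma_X}{\sigma_\gamma}$. $\rho^{**} < \rho^*$ as long as $2 - \alpha_X > \frac{\alpha_\gamma}{\alpha_X}$. That is satisfied if $\alpha_X$ is not
too small compared with $\alpha_\gamma$. The two autocorrelation coefficients, $\alpha_X$ and $\alpha_\gamma$,
are approximately the same in the data.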

1.7.3 Proof of Proposition 3

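The covariance of expected returns and group A’s flows can be simplified
using the fact that $p_2 = p_1\bar{\gamma}\,\frac{\alpha_\gamma - R}{\alpha_X - R}$,

\[
\begin{aligned}
\mathrm{Cov}(\Delta H^a_t, E_t[Q_{t+1}]) &= \frac{w}{a}\,\mathrm{Cov}\left(\Delta\hat{\gamma}_t + \bar{\gamma}\Delta\hat{X}_t,\; p_1(\alpha_\gamma - R)\hat{\gamma}_t + p_2(\alpha_X - R)\hat{X}_t\right) \\
&= \frac{w}{a}\left[p_1(\alpha_\gamma - R)\mathrm{Cov}(\Delta\hat{\gamma}_t, \hat{\gamma}_t) + p_1(\alpha_\gamma - R)\bar{\gamma}\,\mathrm{Cov}(\Delta\hat{X}_t, \hat{\gamma}_t)\right. \\
&\qquad \left. + p_2(\alpha_X - R)\mathrm{Cov}(\Delta\hat{\gamma}_t, \hat{X}_t) + p_2(\alpha_X - R)\bar{\gamma}\,\mathrm{Cov}(\Delta\hat{X}_t, \hat{X}_t)\right] \\
&= \frac{w}{a}\left[p_1(\alpha_\gamma - R)\sigma_\gamma^2 + p_1(\alpha_\gamma - R)\bar{\gamma}\frac{1 - \alpha_\gamma}{1 - \alpha_\gamma\alpha_X}\sigma_{X,\gamma}\right. \\
&\qquad \left. + p_2(\alpha_X - R)\frac{1 - \alpha_X}{1 - \alpha_\gamma\alpha_X}\sigma_{X,\gamma} + p_2(\alpha_X - R)\bar{\gamma}\sigma_X^2\right] \\
&= \frac{w}{a}\left[p_1(\alpha_\gamma - R)\sigma_\gamma^2 + p_1(\alpha_\gamma - R)\bar{\gamma}\frac{1 - \alpha_\gamma}{1 - \alpha_\gamma\alpha_X}\sigma_{X,\gamma}\right. \\
&\qquad \left. + p_1(\alpha_\gamma - R)\bar{\gamma}\frac{1 - \alpha_X}{1 - \alpha_\gamma\alpha_X}\sigma_{X,\gamma} + p_1(\alpha_\gamma - R)\bar{\gamma}^2\sigma_X^2\right] \\
&= \frac{w}{a}\,p_1(\alpha_\gamma - R)\left(\sigma_\gamma^2 + \bar{\gamma}\cdot\frac{2 - \alpha_\gamma - \alpha_X}{1 - \alpha_\gamma\alpha_X}\cdot\sigma_{X,\gamma} + \bar{\gamma}^2\sigma_X^2\right).
\end{aligned} \tag{1.41}
\]

$\sigma_\gamma^2 + \bar{\gamma}\cdot\frac{2 - \alpha_\gamma - \alpha_X}{1 - \alpha_\gamma\alpha_X}\cdot\sigma_{X,\gamma} + \bar{\gamma}^2\sigma_X^2 > 0$, because the discriminant of the quadratic
form is less than 0 as $\frac{2 - \alpha_\gamma - \alpha_X}{1 - \alpha_\gamma\alpha_X} < 2$. We also have $p_1 < 0$ and $\alpha_\gamma - R < 0$.
Therefore,

\[
\mathrm{Cov}(\Delta H^a_t, E_t[Q_{t+1}]) > 0. \tag{1.42}
\]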

1.7.4 Proof of Proposition 4

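\[
\begin{aligned}
\mathrm{Cov}(\Delta H^a_t, Q_t - E_{t-1}[Q_t]) &= \frac{w}{a}\,\mathrm{Cov}\left(\Delta\hat{\gamma}_t + \bar{\gamma}\Delta\hat{X}_t,\; p_1 u_{\gamma,t} + p_2 u_{X,t}\right) \\
&= \frac{w}{a}\,\mathrm{Cov}\left(u_{\gamma,t} + \bar{\gamma}u_{X,t},\; p_1 u_{\gamma,t} + p_2 u_{X,t}\right) \\
&= \frac{w}{a}\left[p_1\sigma_\gamma^2 + (p_1\bar{\gamma} + p_2)\sigma_{X,\gamma} + p_2\bar{\gamma}\sigma_X^2\right] \\
&= \frac{w}{a}\,p_1\left[\sigma_\gamma^2 + \left(1 + \frac{\alpha_\gamma - R}{\alpha_X - R}\right)\bar{\gamma}\sigma_{X,\gamma} + \bar{\gamma}^2\sigma_X^2\right],
\end{aligned} \tag{1.43}
\]

where $p_1 < 0$. The quadratic form $\sigma_\gamma^2 + \left(1 + \frac{\alpha_\gamma - R}{\alpha_X - R}\right)\bar{\gamma}\sigma_{X,\gamma} + \bar{\gamma}^2\sigma_X^2$ is positive
because its discriminant is negative, as the two autocorrelation coefficients,
$\alpha_X$ and $\alpha_\gamma$, are approximately the same. So,

\[
\mathrm{Cov}(\Delta H^a_t, Q_t - E_{t-1}[Q_t]) < 0. \tag{1.44}
\]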

Table 1.1: Summary Statistics

Panel A reports the summary statistics for all the variables. CTF represents the flows from the
contrarian investors defined in equation (1.17). R e is the quarterly excess return of the CRSP
value weighted index. NTIS is the twelve-month moving sums of net issues by NYSE stocks
divided by the year end total market capitalization. Tbl is the three-month Treasury Bill rate.
DP is the log dividend price ratio of the CRSP index. TMS is the term spread. DEF is the
default spread. Panel B reports the correlation coefficients of the CTF with other variables
and the p-values are displayed in the parentheses.

Panel A: Summary Statistics

Mean Std Skewness Kurtosis AR1


CTF -0.154 0.275 0.850 3.656 0.210
Re 0.059 0.170 -0.500 0.760 0.027
NTIS 0.016 0.016 -0.698 0.537 0.909
Tbl 0.056 0.028 1.097 1.788 0.920
DP -3.478 0.393 -0.756 -0.102 0.982
TMS 0.016 0.015 -0.308 0.082 0.811
DEF 0.010 0.004 1.324 1.778 0.904

Panel B: Correlations

CTF NTIS Tbl DP TMS


NTIS -0.371
(0.000)
Tbl 0.169 -0.251
(0.022) (0.001)
DP 0.042 -0.069 0.689
(0.574) (0.350) (0.000)
TMS 0.055 -0.003 -0.469 -0.194
(0.462) (0.974) (0.000) (0.008)
DEF 0.076 -0.223 0.538 0.540 0.175
(0.303) (0.002) (0.000) (0.000) (0.017)

Table 1.2: Predictive Regressions of Index Excess Returns
This table reports predictive regressions of CRSP value weighted index returns by the CTF
variable and other control variables at different horizons. The dependent variables are the
excess returns of the CRSP value weighted index. CTFt represents the flows from the contrarian
investors defined in equation (1.17). Rte is the CRSP index excess return (lagged). DPt is the
log dividend price ratio of the CRSP index. Tblt is the three-month Treasury Bill rate. NTISt
is the twelve-month moving sums of net issues by NYSE stocks divided by the year end total
market capitalization. TMSt is the term spread. DEFt is the default spread. The t -statistics
in parentheses are corrected for heteroscedasticity and autocorrelation. Panel A compares the
predictive regression of the next quarter’s index excess returns under different specifications.
Panel B examines the return predictability at longer horizons of up to twelve quarters using all
the predictors (most exhaustive specification from Panel A). The numbers indicate the number
of quarters by which the returns lead all the explanatory variables. For example, the column
labeled 4 shows the regression of $R^e_{t,t+4}$ on all the explanatory variables at time t.

Panel A: Index Excess Returns

CTFt 0.083 0.081 0.085 0.105 0.082 0.079 0.085


(2.415) (2.295) (2.444) (3.171) (2.148) (2.096) (2.334)
DPt 0.019 0.017 0.018 0.052 0.055 0.053 0.049
(1.041) (0.969) (0.984) (2.490) (2.578) (2.455) (2.238)
Rte 0.055 0.031 0.006 0.002 -0.027
(0.695) (0.402) (0.070) (0.024) (-0.347)
Tblt -0.795 -0.899 -0.819 -1.275
(-2.861) (-3.164) (-2.651) (-3.173)
NTISt -0.644 -0.625 -0.596
(-1.675) (-1.635) (-1.573)
TMSt 0.242 -0.362
(0.525) (-0.687)
DEFt 3.966
(1.713)
Const 0.018 0.080 0.078 0.079 0.243 0.268 0.253 0.235
(2.868) (1.291) (1.255) (1.263) (3.004) (3.209) (2.878) (2.623)

Adj R 2 0.016 0.002 0.018 0.015 0.051 0.058 0.054 0.067

Panel B: Long Horizons


1 2 3 4 5 6 7 8 12
CTFt 0.107 0.111 0.154 0.220 0.270 0.303 0.268 0.246 0.276
(3.369) (1.983) (2.493) (3.417) (3.024) (2.826) (2.491) (2.407) (2.485)
Tblt -1.168 -1.637 -1.674 -1.703 -1.498 -1.255 -0.189 0.606 1.678
(-2.941) (-2.515) (-1.989) (-1.648) (-1.232) (-0.914) (-0.117) (0.341) (0.676)
DPt 0.046 0.083 0.111 0.127 0.141 0.157 0.154 0.154 0.171
(2.142) (2.258) (2.093) (1.890) (1.729) (1.654) (1.401) (1.257) (1.224)
TMSt -0.318 -0.070 0.735 1.189 1.726 2.696 4.188 5.485 9.052
(-0.595) (-0.077) (0.627) (0.820) (1.040) (1.533) (2.167) (2.610) (2.922)
DEFt 4.042 5.582 4.410 3.802 1.886 -0.395 -6.267 -11.149 -17.163
(1.778) (1.513) (1.001) (0.756) (0.311) (-0.061) (-0.862) (-1.355) (-1.354)

Adj R 2 0.067 0.073 0.092 0.109 0.112 0.131 0.122 0.134 0.190

Table 1.3: Predictive Regressions of HML

This table reports predictive regressions of HML by the CTF variable and other control
variables at different horizons. The dependent variables are the excess returns of the long
the value stocks and short the growth stocks. CTFt represents the flows from the contrarian
investors defined in equation (1.17). HMLt is the return of the zero investment portfolio of
long value stocks and short growth stocks (lagged). DPt is the log dividend price ratio of the
CRSP index. Tblt is the three-month Treasury Bill rate. NTISt is the twelve-month moving
sums of net issues by NYSE stocks divided by the year end total market capitalization. TMSt
is the term spread. DEFt is the default spread. The t -statistics in parentheses are corrected
for heteroscedasticity and autocorrelation. Panel A compares the predictive regression of the
next quarter’s HML under different specifications. Panel B examines the return predictability
at longer horizons of up to eight quarters using the optimal specification (most exhaustive
specification from Panel A). The numbers in the square brackets indicate the number of quarters
the returns leading all the explanatory variables. The dependent variables are overlapping. For
example, the column labeled 4 shows the regression of HMLt,t+4 on all the explanatory variables
at time t .

Panel A: HML Prediction

CTFt -0.052 -0.051 -0.050 -0.060 -0.050


(-2.115) (-2.056) (-2.126) (-2.617) (-2.212)
DPt -0.007 -0.008 -0.023 -0.009
(-0.442) (-0.505) (-1.042) (-0.440)
HMLt 0.134 0.131 0.134
(1.316) (1.280) (1.487)
Tblt 0.352 0.222
(1.700) (1.186)
VSt -0.063
(-2.347)
Const 0.012 -0.013 -0.015 -0.088 (0.127)
(2.841) (-0.236) (-0.306) (-1.046) (1.650)

Adj R 2 0.014 0.011 0.023 0.036 0.079

Panel B: HML Long Horizons

1 2 3 4 5 6 7 8
CTFt -0.050 -0.057 -0.085 -0.180 -0.174 -0.169 -0.188 -0.215
(-2.081) (-1.689) (-2.003) (-3.450) (-2.769) (-2.021) (-2.205) (-2.193)
Tblt 0.150 0.171 0.129 0.128 0.072 0.065 0.000 0.058
(1.184) (0.645) (0.319) (0.260) (0.126) (0.102) (0.000) (0.078)
VSt -0.067 -0.132 -0.151 -0.161 -0.173 -0.226 -0.264 -0.274
(-2.144) (-2.215) (-1.848) (-1.635) (-1.679) (-2.066) (-2.228) (-2.210)
Const -0.097 -0.184 -0.194 -0.198 -0.196 -0.261 -0.301 -0.305
(-2.019) (-1.994) (-1.552) (-1.308) (-1.240) (-1.548) (-1.652) (-1.623)

Adj R 2 0.069 0.100 0.081 0.100 0.081 0.101 0.117 0.115

Table 1.4: Predictive Regressions of the Fama-French Portfolios

This table reports the coefficients from predictive regressions of the excess returns of the 25
Fama-French portfolios by the CTF variable. The dependent variables are the 25 Fama-
French benchmark portfolios. CTF represents the flows from the contrarian investors defined in
equation (1.17). This table compares the univariate predictive regressions of the next quarter’s
25 Fama-French portfolios. The t -statistics in parentheses are corrected for heteroscedasticity
and autocorrelation.

Small 2 3 4 Big
Low 0.169 0.171 0.145 0.133 0.125
(2.272) (2.585) (2.396) (2.510) (3.476)
2 0.151 0.130 0.109 0.091 0.085
(2.534) (2.678) (2.451) (2.170) (2.628)
3 0.153 0.103 0.084 0.074 0.068
(2.581) (2.151) (2.188) (1.742) (2.374)
4 0.109 0.090 0.068 0.066 0.059
(1.952) (2.071) (1.440) (1.801) (2.004)
High 0.112 0.102 0.089 0.088 0.063
(1.740) (1.668) (2.037) (2.428) (2.067)

Table 1.5: Contemporaneous Returns

This table reports the contemporaneous regressions of CRSP value weighted index returns on
the CTF variable under control variables. The dependent variables are the excess returns
of the CRSP value weighted index. CTFt represents the flows from the contrarian investors
defined in equation (1.17). DPt is the log dividend price ratio of the CRSP index. $R^e_{t-1}$ is
the lagged CRSP index excess return. Tblt is the three-month Treasury Bill rate. NTISt is
the twelve-month moving sums of net issues by NYSE stocks divided by the year end total
market capitalization. TMSt is the term spread. DEFt is the default spread. The t -statistics
in parentheses are corrected for heteroscedasticity and autocorrelation.

CTFt -0.082 -0.123 -0.115 -0.122 -0.106


(-2.086) (-2.707) (-2.582) (-2.685) (-2.540)
DPt 0.015 0.010 0.004
(0.623) (0.421) (0.168)
$R^e_{t-1}$ -0.021 -0.027 -0.010
(-0.282) (-0.372) (-0.134)
Tblt -0.636 -0.421 -1.053
(-2.194) (-1.317) (-2.640)
NTISt -1.081 -1.306 -1.247 -1.142
(-2.809) (-3.335) (-3.129) (-2.817)
TMSt 0.640 -0.256
(1.479) (-0.573)
DEFt 5.657
(2.456)
Const 0.013 0.029 0.121 0.080 0.052
(1.975) (3.230) (1.271) (0.827) (0.512)

Adj R 2 0.016 0.047 0.060 0.064 0.097

Table 1.6: Contrarian Flows and Analyst Dispersions
This table compares the predictive regressions of CRSP value weighted index returns by the
CTF variable and the DISP under control variables at different horizons. The dependent
variables are the excess returns of the CRSP value weighted index. CTFt represents the flows
from the contrarian investors defined in equation (1.17). DISPt is the analyst dispersion of
the S &P 500 earning forecast. DPt is the log dividend price ratio of the CRSP index. Tblt is
the three-month Treasury Bill rate. NTISt is the twelve-month moving sums of net issues by
NYSE stocks divided by the year end total market capitalization. TMSt is the term spread.
DEFt is the default spread. The t -statistics in parentheses are corrected for heteroscedasticity
and autocorrelation. Panel A compares the predictive regression of the next quarter’s index
excess returns. Panel B examines the return predictability at longer horizons of up to twelve
quarters using all the predictors (most exhaustive specification from Panel A). The numbers
indicate the number of quarters the returns leading all the explanatory variables. The sample
period is from 1984 to 2005.

Panel A: Analyst Dispersion and Contrarian Trades

CTFt 0.099 0.093 0.118


(3.717) (3.436) (3.674)
DISPt -0.272 -0.239 -0.264
(-2.229) (-1.836) (-1.359)
DPt 0.060
(1.484)
Tblt -1.593
(-2.221)
NTISt -0.448
(-1.027)
TMSt -0.381
(-0.317)
DEFt -1.286
(-0.367)

Adj R 2 0.012 0.036 0.042 0.054

Panel B: Long Horizons


1 2 3 4 5 6 7 8 12
CTFt 0.118 0.101 0.101 0.140 0.104 0.138 0.143 0.156 0.240
(3.674) (1.903) (2.466) (2.982) (2.358) (2.474) (1.994) (1.652) (3.014)
DISPt -0.264 -0.606 -0.915 -1.181 -1.238 -1.216 -1.323 -0.762 -1.244
(-1.359) (-2.024) (-2.346) (-3.165) (-4.239) (-3.589) (-2.554) (-1.356) (-1.635)
DPt 0.060 0.092 0.112 0.136 0.188 0.212 0.207 0.250 0.401
(1.484) (1.407) (1.305) (1.289) (1.413) (1.347) (1.090) (1.165) (1.838)
Tblt -1.593 -1.850 -1.986 -2.269 -2.723 -2.921 -2.029 -1.757 -4.199
(-2.221) (-1.679) (-1.371) (-1.279) (-1.214) (-1.188) (-0.669) (-0.468) (-0.916)
TMSt -0.381 -0.119 0.441 0.086 -1.194 -0.210 2.351 4.314 4.034
(-0.317) (-0.070) (0.215) (0.034) (-0.378) (-0.056) (0.548) (0.822) (1.331)
DEFt -1.286 0.709 2.255 7.984 12.330 12.377 8.045 -8.632 -12.242
(-0.367) (0.119) (0.304) (0.781) (1.005) (0.917) (0.540) (-0.492) (-0.470)

Adj R 2 0.054 0.071 0.112 0.162 0.176 0.191 0.227 0.221 0.277

Table 1.7: Contrarian Likelihood of All Investors

This table reports the likelihood of being a contrarian for all types of investors through three
measures. Column Correlation is the linear correlation of each type of investor’s flows with the
contrarian flows, with p-values in parentheses, and column Frequency is the frequency of being a
contrarian for each type of investor. Column Adj. Freq. is the frequency of being a contrarian
for each type of investor adjusted for the time trend in each type of investor’s flows, with
p-values in parentheses.

Correlation Frequency Adj . Freq.


Commercial Banks -0.096 0.419 -0.045
(0.194) (0.108)
Saving Institutions -0.171 0.359 -0.053
(0.020) (0.060)
Life Insurance -0.196 0.255 -0.060
(0.008) (0.005)
Property Insurance -0.121 0.370 -0.042
(0.103) (0.110)
Mutual Funds -0.048 0.353 -0.004
(0.519) (0.449)
Brokerage 0.109 0.429 -0.047
(0.142) (0.099)
State Pensions -0.118 0.294 -0.030
(0.111) (0.115)
Federal Pensions 0.336 0.343 0.068
(0.004) (0.000)
Private Pensions -0.187 0.304 -0.087
(0.011) (0.004)
Foreigners -0.317 0.310 -0.068
(0.000) (0.016)
Households -0.394 0.674 -0.034
(0.000) (0.031)

Table 1.8: Out of Sample Tests

This table presents the out of sample tests of the CTF variable in a simple regression. OOS
R 2 is the out of sample adjusted R 2 for the CTF variable. ∆RMSE is the improvement of
the root-mean-square-error of the conditional model over the unconditional mean. MSE -F is
the McCracken (2004) F -statistics testing the null hypothesis that there is no improvement
in the MSE . Horizon is in quarters. The p-values in the parentheses and the critical values
(not reported here) are obtained by bootstrap using 10000 simulated time series. The initial
estimation period is from 1960 to 1982 and the evaluation period is from 1983 to 2005

Horizon OOS R 2 ∆RMSE MSE -F


1 0.024 0.002 3.334
(0.028) (0.029) (0.035)
2 0.022 0.002 3.083
(0.026) (0.030) (0.026)
3 0.063 0.005 7.017
(0.003) (0.004) (0.003)
4 0.101 0.010 10.819
(0.001) (0.001) (0.001)
5 0.068 0.008 7.147
(0.013) (0.019) (0.013)
6 0.066 0.009 6.834
(0.033) (0.044) (0.033)
7 0.110 0.016 10.962
(0.009) (0.011) (0.009)
8 0.146 0.023 14.407
(0.005) (0.006) (0.005)

Figure 1.1: Coefficients for 25 Fama-French Portfolios

This graph pictures the shape of the regression coefficients of the 25 Fama-French portfolios on
lagged CTF variable.

[Figure: surface plot titled "Regression Coefficients of the Fama-French Portfolios".
Vertical axis: Coef (approximately 0.04 to 0.18). Horizontal axes: BM (Low to High)
and Size (Small to Big).]

CHAPTER 2

Does the Early Exercise Premium Contain
Information about Future Underlying Returns

We investigate the information content of the call (put) Early Exercise Premium,
or EEP , defined as the normalized difference in prices between otherwise
comparable American and European call (put) options. The call EEP specifically
captures investors’ expectations about future lump sum dividend payments as
well as other state variables such as conditional volatility and interest rates.
From that perspective, the EEP should also be related to future returns of
the underlying security. Little is known about the EEP , largely because it is
usually unobservable for most underlying securities. The FTSE 100 index is an
exception in that regard, because it has both American and European options
contracts that are traded in large volumes. We use data of the FTSE 100 index,
and its American and European options contracts, from which we compute a
time series of the EEP . Interestingly, we find that the EEP is a good forecaster
of returns at daily horizons. This forecastability is not due to time-variation in
market risk premia or liquidity. Importantly, we find that the predictability stems
primarily from the ability of the EEP to forecast innovations in dividend growth,
rather than other components of unexpected returns. Overall, we use several
empirical and simulation methods to establish predictability of the underlying
with an options market variable, link this predictability to information about
cash flow fundamentals, and thereby provide clear support for Black’s (1975)
conjecture that informed investors prefer to trade on their superior information
about fundamentals in the options market relative to the underlying.

2.1 Introduction

Black (1975) was one of the first to suggest that informed investors prefer to
trade on their superior information about fundamentals in the options market
rather than in the underlying asset market because they can easily take on more
leveraged positions. An important implication of this argument is that, over
short horizons, option prices will reflect news about fundamentals that is yet to
be incorporated into prices of the underlying security. Since the transmission
of information across markets, and particularly between the options and the
underlying market, is of central importance in finance, it is not surprising that
this conjecture has generated a lot of theoretical and empirical interest. On the
theoretical side, Biais and Hillion (1994), Easley, O’Hara, and Srinivas (1998)
and others elaborate and formalize the theoretical conditions under which Black’s
(1975) conjecture will hold.

Unfortunately, the empirical literature on this topic has yet to reach a
consensus. The debates revolve around two related points. First, from an
empirical perspective, to what extent do option prices as well as other information
in the option market predict movements in the underlying security? Amin
and Lee (1997), Anthony (1988), Chakravarty, Gulen, and Mayhew (2004), and
Pan and Poteshman (2004) find that options market variables (such as changes
in option prices, implied volatility, and option volume) predict returns of the
underlying security at short horizons. In contrast, Stephan and Whaley (1990)
and Chan, Chung, and Johnson (1993) fail to find such predictive relations.
Second, if predictability is observed, is it due to traders’ information about
fundamentals? This second point is particularly important in order to establish
whether the predictability supports Black’s (1975) conjecture or whether it is due
to some other deviations from perfect markets.

In this paper, we take an altogether new look at the connection between
prices of options and the underlying security, and its link with information about
fundamentals. More specifically, we focus on the difference in prices between
otherwise comparable European and American call options, which is known as the
call early exercise premium, or call EEP . Merton (1973) was the first to show that
the call EEP must be zero if the underlying asset pays no dividends.1 Roll (1977),
Geske (1979), and Whaley (1981) prove that in the presence of a known lump
sum dividend, prices of European and American calls are not necessarily equal,
because American option holders might want to exercise the option right before
the ex-dividend date. In the more realistic case of multiple dividends that are
not known with certainty, the EEP will depend on both the expected magnitude
and the lumpiness of these dividends.2 Conditional on dividends being non-zero,
the EEP will also depend on other factors that affect option prices (volatility,
interest rates, etc.). Therefore, when dividends are non-zero and lumpy, changes
in expectations about future cash flows and discount rates of the underlying
asset ought to be reflected in a non-zero mean and variations of the early exercise
premium.

We focus on the call EEP rather than on other option market predictors,
because of its close and unambiguous connection with lump sum dividends. The
arguments in Roll (1977), Geske (1979), and Whaley (1981) suggest that the call
EEP is very sensitive to changes in expectations about future lumpy dividends.
Furthermore, based on an empirical exercise of S&P 100 index options, Whaley

1
The intuition is that, in the absence of dividends, the intrinsic value obtained from
exercising an American call option is always less than the value of the option. An investor
would therefore rather sell the option in the open market rather than exercise it early.
2
When dividends are lumpy and non-uniform, the probability of early exercise is higher, and
the magnitude of the early exercise premium is dependent not just on the last dividend before
option maturity, but also on the other (lumpy) dividends over the life of the option.

(1982) and Harvey and Whaley (1992a, 1992b) conclude that the magnitude
and timing of dividends is critically important in determining the early exercise
premium. More specifically, Whaley (1982) remarks that the “magnitude of the
early exercise premium is importantly influenced by the amount of the dividend
payment.” Harvey and Whaley (1992a) provide additional evidence and make
this point even more forcefully by concluding that: “From a practical standpoint
of pricing (or trading) S&P 100 index options, knowing the amount and timing of
S&P 100 index cash dividends appears to be critical.” In the context of the FTSE
100 index options, the dependence of the EEP on dividends should be even more
important since FTSE 100 index dividends have been clustered once every two
weeks, and hence have been highly lumpy and non-uniform. In contrast, predictors
such as the change in prices of American options, volume, open interest, and
even the put EEP depend not only on the lumpy dividends but on all the other
factors that affect option prices. For simplicity, by “EEP ” we refer to the call
EEP unless otherwise specified.

Clearly, analytic arguments and extant empirical findings suggest that the
EEP is very sensitive to fluctuations about future dividend payments.3 We
conjecture that, to the extent that dividend expectations influence returns and
to the extent that Black’s (1975) argument holds, the EEP should also be a
particularly good predictor of underlying returns at short horizons of a few days,
that is, horizons that lie within the period that it takes for information to be impounded
into underlying prices. In the context of Black (1975), if informed investors
trade primarily in the options market, their information will be incorporated
first in the markets for European and American options rather than in the market

3
Even though ex-post dividends tend to be highly persistent, there is a great degree of ex-
ante uncertainty about their innovations, and the importance of these is recognized both by
academics and practitioners.

for the underlying. Since dividend information specifically has a different effect
on American relative to European options, the EEP should capture dividend
information faster than the underlying.4

In practice, the EEP is rarely directly observable, because virtually all
underlying assets have either American or European options, but not both.5
One notable exception is the FTSE 100 index, which has both American and
European contracts that are traded in large volumes. Both types of contracts
have co-existed on the London International Financial Futures Exchange (LIFFE)
from 1990 until the present and have high liquidity, very similar
maturities, and other similar characteristics. This presents us with a unique opportunity to
directly observe the EEP and to revisit the debate about the information flow
between the options and the underlying security markets, and, more importantly,
to associate the information flow to information about fundamentals.

In this paper, we investigate the empirical relation between the early exercise
premium of call options and the returns of the FTSE 100 index using daily data
from June 1992 to January 1996. First, we describe the statistical properties of
the early exercise premium. We document that, for calls and puts, the average
EEP is non-zero and its magnitude is economically and statistically significant.
The time series of the EEP also exhibits significant serial correlation at horizons
up to one week. This finding suggests that the call EEP might be related to
lumpy dividend payments of the underlying index. We use the Longstaff and
Schwartz (2001) simulation approach to show that, indeed, a simple model which

4
The private information that can be differentially incorporated into prices is not necessarily
market-wide, but also firm-specific since only a handful of stocks go ex-dividend on any
particular ex-dividend date and the dividend expectations that influence the early exercise
premium relate to these firm-specific dividends.
5
The S&P 500 index had both American and European contracts from April 2, 1986 through
June 20, 1986.

calibrates the lump sum dividends, the conditional volatility, and the interest
rate processes has little trouble replicating the magnitudes of the average EEP
and its serial correlation that are found in the data.

Second, we investigate the information content in the EEP. We do that by
first looking at whether the EEP can forecast subsequent FTSE 100 returns. This
forecasting relation is motivated by the fact that since FTSE 100 index dividends
are paid approximately once every two weeks in lump sums,6 the call EEP ought
to contain information, most importantly about future dividend yields, but also
about future volatility, and future interest rates, all of which have been used
extensively as forecasters of returns (Campbell and Shiller (1988a), Ghysels,
Santa-Clara, and Valkanov (2005b), and Campbell (1991)). Hence, since the EEP
captures fluctuations in these variables, it should arguably be a useful forecaster
of FTSE 100 returns. We find that the EEP does actually forecast the index
returns at one- and two-day horizons. This predictive relation is robust to the
addition of other control variables such as dividend yield, short interest rate,
implied volatility, and changes in volume. The forecasting relation also persists
in subsamples. The forecastability disappears at longer horizons.

The ability of the EEP to forecast returns is markedly different from that
of the dividend yield, the volatility, and interest rates in two important aspects.
From a statistical perspective, the EEP contains information about dividend
yields, volatility, and interest rates but does not suffer from the well-known
statistical problems (such as extreme persistence and low volatility) which have
rendered forecasting with these predictors quite problematic (Stambaugh (1999),
Ferson, Sarkissian, and Simin (2003), Torous, Valkanov, and Yan (2005)). The
autocorrelation of the EEP , while significantly different from zero, is not near

6
We will explain the institutional requirements in the data section.

the boundary of non-stationarity and will not significantly bias the estimates in
forecasting regressions. Also, the volatility of the EEP is actually larger than that
of the FTSE 100 returns. This is in contrast to the volatilities of other predictors,
which are at least an order of magnitude lower than that of the returns. These
appealing statistical properties of the EEP make it a suitable forecaster of returns,
especially at short horizons.

From an economic perspective, the source of the EEP predictability also
differs from that of other widely used conditioning variables. Campbell and
Shiller (1988b), Fama and French (1989), Campbell (1991), Ghysels, Santa-Clara,
and Valkanov (2005b) and others argue that the dividend yield, volatility, and
interest rates capture the time variation in expected returns. In support of that
assertion, these conditioning variables have been related to longer horizon returns
of monthly, quarterly, or annual frequencies. In contrast, the EEP forecasts
returns at daily horizons when expected returns are unlikely to vary significantly.
Moreover, to the extent that informed investors trade in the options market
(Black (1975)) and that information does not diffuse instantaneously across
markets (Shiller (2000), Sims (2001)), the predictability may be due to the EEP
containing information about changes in expectations about future fundamentals
rather than time-varying expected returns.

Third, we analyze the source of this forecasting relation using two alternative
approaches. First, we use Campbell’s (1991) VAR framework and decompose
realized returns into expected returns and shocks to dividend growth, excess
returns, and interest rates. We find that the EEP predicts mainly the dividend
shock component of the underlying index return. This result supports the
conjecture that the call EEP captures changes in expectations about future cash
flows of the underlying index. In a second approach, rather than relying on a
return decomposition, we test directly whether the EEP captures fluctuations in
future dividend growth. The results not only corroborate the VAR findings that
the EEP forecasts fluctuations in future dividend growth but they also suggest
that the predictive signal in the EEP is concentrated right before dividends are
announced. In sum, both the VAR and the direct regression approaches link the
predictability of the underlying returns by an option market variable to cash flow
news. To our knowledge, this link has not been previously established.

The last two results lead us to the conclusion that the call EEP is positively
related with subsequent underlying index returns at daily frequency mainly
because it contains information about future cash flows. These findings support
Black’s (1975) conjecture that informed investors prefer to trade in the options
market. The fact that we don’t observe predictability beyond two-day horizons
implies that the options and the underlying markets are reasonably well-
integrated. The last two findings are also consistent with the first empirical and
simulations results, which suggested that the EEP is sensitive to changes in cash
flows. Our results support the claims of Kothari and Shanken (1992) and Torous,
Valkanov, and Yan (2005), who argue that the commonly used proxies for expected
future dividends may contain measurement error and are too smooth to forecast
future returns at short horizons. The EEP responds rapidly to changes in cash
flows and is thus more suitable to detect short horizon predictability. In sum, the
novel contributions of this paper are to propose the call EEP as a short horizon
predictor of the underlying return, to argue for the economic reasons behind the
predictability, and to provide supporting empirical evidence. More broadly, we
establish predictability of the underlying returns with an options market variable,
and link this predictability to information about cash flow fundamentals, and
thereby provide clear support for the Black (1975) conjecture.

The paper is structured as follows. We describe the dataset in Section 2.
In Section 3, we present summary statistics of the EEP and use simulations to
show that important statistical properties of the EEP can be replicated when
dividends, volatility, and interest rates are calibrated to the data. In Section 4,
we present the predictability results and link them to the ability of the EEP to
forecast changes in future dividend growth. We conduct a series of robustness
checks in Section 5 and conclude in Section 6 with some final remarks.

2.2 Data

We have three types of time series: values of the FTSE 100 index, prices of
European and American calls and puts on the FTSE 100 index, and other
variables, such as short interest rate, the lumpy dividend stream of the FTSE
100 index and the volume of its shares traded, and the implied volatility of the
index. We describe each time series separately for clarity.

FTSE 100 Index and Index Futures: We compute daily log returns of the
FTSE 100 cash index over a four-year period from June 1992 to January 1996.
All the stocks are traded at the London Stock Exchange (LSE). The log returns
exclude dividend payments. The index return in excess of the one-month (risk-free)
UK interest rate is denoted by $R_t$. We also compute the daily log return of the
FTSE 100 index futures,7 which are traded at the London International Financial
Futures Exchange (LIFFE). The index futures return is denoted by $R_t^{fut}$.

FTSE 100 Index Options: We have all the bid-ask quotes recorded for
all European and American FTSE 100 index options traded on the London
International Financial Futures Exchange from June 1992 to January 1996.

7
The returns are computed from the “on-the-run” contracts. We roll over to the next on-
the-run contract one month before the current on-the-run contract expires.

Since these contracts were heavily traded, there were no designated market-
makers obliged to stand ready to buy and sell. Liquidity in these markets was
generated in a CBOT-style auction hand-signal pit-trading environment with
voluntary dealers and direct interaction of buyer’s and seller’s agents. Nothing
in this process should generate any microstructure-related systematic differences
between European and American prices. To minimize possible data errors and
to make all contracts comparable, we apply several filters. First, we use
only synchronous quotes of the European and American options, that is, quotes that are
posted within 60 seconds of each other. This results in a sample of 47960 matched
quotes for call options and 41270 matched quotes for put options. We also exclude
prices lower than the intrinsic values. Furthermore, defining moneyness as the
spot price divided by the strike price, S/K, we use only options with moneyness
between 0.9 and 1.1. After these filters, our data include 41891 matched pairs of calls and
35961 matched pairs of puts.
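
The three filters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout and every field name (`t_american`, `t_european`, and so on) are hypothetical, timestamps are assumed to be in seconds, each record is assumed to hold an already matched American/European pair, and the intrinsic-value filter is assumed to apply to both prices.

```python
from dataclasses import dataclass


@dataclass
class MatchedQuote:
    """One American/European quote pair; all field names are hypothetical."""
    t_american: float   # quote time of the American option, in seconds
    t_european: float   # quote time of the European option, in seconds
    american: float     # American option price
    european: float     # European option price
    spot: float         # index level S
    strike: float       # strike price K
    is_call: bool


def intrinsic(q: MatchedQuote) -> float:
    """Intrinsic value of the option at the quoted index level."""
    payoff = q.spot - q.strike if q.is_call else q.strike - q.spot
    return max(payoff, 0.0)


def keep(q: MatchedQuote, max_gap: float = 60.0,
         lo: float = 0.9, hi: float = 1.1) -> bool:
    """Apply the synchronicity, intrinsic-value, and moneyness filters."""
    synchronous = abs(q.t_american - q.t_european) <= max_gap
    # Assumption: a pair is dropped if either price is below intrinsic value.
    above_intrinsic = min(q.american, q.european) >= intrinsic(q)
    moneyness = q.spot / q.strike
    return synchronous and above_intrinsic and lo <= moneyness <= hi
```

A filtered sample would then be `[q for q in quotes if keep(q)]`, with `quotes` standing for whatever list of matched pairs is available.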

The European and American index option contracts have identical exercise
dates. At any time there are five different maturities of both types of options, one
month, two months, three months, four months and a long-dated one. For a given
maturity, American option exercise prices are multiples of 50 while European
option exercise prices are multiples of 25 but not 50. In order to directly compare
the prices of the American and the European option, we linearly interpolate
the prices volatilities of the two adjacent American options whose strike prices
straddle the strike price of the European option that we are trying to estimate.
With this method, we obtain the synchronous prices of American and European
contracts with the same strike price and maturity.
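
The strike matching amounts to a linear interpolation between the two neighbouring American strikes. The sketch below assumes the interpolation is applied directly to option prices; the function and argument names are illustrative, and the same weights could be applied to another quantity (implied volatilities, say) before converting back to a price.

```python
def interpolate_american_price(k_target: float,
                               k_low: float, p_low: float,
                               k_high: float, p_high: float) -> float:
    """Linearly interpolate an American price at the European strike.

    k_low and k_high are the adjacent American strikes that straddle
    k_target; p_low and p_high are the corresponding American prices.
    """
    if not k_low < k_target < k_high:
        raise ValueError("American strikes must straddle the target strike")
    weight = (k_target - k_low) / (k_high - k_low)
    return (1.0 - weight) * p_low + weight * p_high
```

For a European strike of 3025 between American strikes of 3000 and 3050, the weight is one half, so the interpolated value is the simple average of the two American prices.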

The early exercise premium is computed as the difference in prices between
two otherwise identical American and European options, normalized by the price
of the European contract. We use the normalization mainly because the non-
normalized difference is affected by the index level, the volatility, and other
variables that enter the option pricing formula. Through the normalization, we
control for the level of these variables and measure the premium of the American
relative to the European contract. We are careful not to normalize by the level
of the index itself, because doing so would induce an automatic correlation with
next-period index returns.
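
As a minimal sketch, the normalized premium for one matched pair can be written as follows. The function name is illustrative; the only inputs are the two option prices, and the index level does not enter the denominator.

```python
def early_exercise_premium(american: float, european: float) -> float:
    """Return (American price - European price) / European price."""
    if european <= 0.0:
        raise ValueError("European price must be positive")
    return (american - european) / european
```

For example, an American price of 103.5 against a European price of 100 gives a premium of 3.5 percent of the European price.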

Instead of tracking every contract every day, it is more appropriate to
aggregate the information according to moneyness and maturity. Since
no single option has the same moneyness on every single day, we compute the
EEP for standardized at-the-money options, which is a natural benchmark. For
every trading day, we linearly interpolate the EEPs of all the matched pairs on
that day in the moneyness space and use the fitted value of the EEP at a
moneyness of 1 as the EEP measure for at-the-money options on that
day. In this way, we construct a time series of 894 daily EEPs, which explicitly
use information from options that are in and out of the money.8 It might also
be tempting to interpolate the data along the maturity space and construct a
constant maturity EEP. Unfortunately, this is not a straightforward exercise,
because in the presence of lumpy dividends, the relation between maturity and
the EEP depends crucially (and non-linearly) on the timing of dividends. Since
a simple interpolation procedure cannot take this dependence into account, we
present most of the results without interpolating along the maturity space.
Results from a linearly-interpolated, constant maturity EEP and from an average
EEP of all contracts in a day, presented in the robustness section, produce very

8
While this interpolation introduces a bias in the EEP measure (because the EEP is not
exactly linear in the moneyness space), the bias is not large in economic or statistical terms.
We also provide further robustness check for this interpolation method.

similar (and sometimes more significant) results.
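
One way to read the daily at-the-money construction is as a least-squares line through that day's (moneyness, EEP) pairs, evaluated at a moneyness of 1. The sketch below takes that reading as an assumption; a piecewise-linear interpolation between the two pairs nearest to the money would be an equally plausible alternative. Function and argument names are illustrative.

```python
def atm_eep(moneyness: list[float], eeps: list[float]) -> float:
    """Fitted EEP at moneyness 1 from a least-squares line.

    moneyness[i] is S/K and eeps[i] the EEP of the i-th matched pair
    observed on a single trading day.
    """
    n = len(moneyness)
    if n < 2 or n != len(eeps):
        raise ValueError("need at least two matched pairs of equal length")
    mean_x = sum(moneyness) / n
    mean_y = sum(eeps) / n
    sxx = sum((x - mean_x) ** 2 for x in moneyness)
    if sxx == 0.0:
        raise ValueError("moneyness values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(moneyness, eeps))
    slope = sxy / sxx
    # Evaluate the fitted line at S/K = 1.
    return mean_y + slope * (1.0 - mean_x)
```

Applying `atm_eep` to each trading day's matched pairs in turn would produce one at-the-money EEP observation per day.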

The American index option holders also have a wildcard option. During
the sample period, the London options market closed at 4:10pm and the London stock
market closed at 4:30pm. American index option holders have the right to
exercise (but not trade) the option up to 4:31pm at a settlement price based on
the index level either at 4:10pm or later.

Dividends and Other Variables:

We have daily data on the one-month stochastically detrended U.K. interest
rate ($Rf_t$) (following Campbell (1991)), the dividend yield distributed to index
holders ($DY_t$), the implied variance of the index portfolio return ($Var_t$), and
the changes in volume of FTSE 100 shares traded ($\Delta Vlm_t$). The stochastically
detrended short rate is obtained by subtracting a lagged 3-month moving average
from the raw one-month interest rate, similarly to Campbell (1991). $Var_t$ is
implied daily from the closest-to-the-money European call options.9
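
The stochastic detrending can be sketched as below. The window of 63 trading days is an assumption standing in for "3 months", and "lagged" is read here as using only rates up to the previous day; neither choice is stated in the text.

```python
def detrend_short_rate(rates: list[float], window: int = 63) -> list[float]:
    """Subtract a lagged moving average from the raw short rate.

    For day t the average covers days t-window, ..., t-1, so the first
    `window` observations have no detrended value and are dropped.
    """
    detrended = []
    for t in range(window, len(rates)):
        lagged_mean = sum(rates[t - window:t]) / window
        detrended.append(rates[t] - lagged_mean)
    return detrended
```

With a daily series of length T, the function returns T minus `window` detrended observations.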

To understand the source of the lumpiness and uncertainty in dividends,
it is necessary to understand the LSE regulations. Over our sample period,
the LSE synchronized the ex-dividend dates across all firms to fall on the first
trading day of the week. Any company that planned to go ex-dividend had to
declare that via a Regulatory Information Service no later than four business
days before the ex-dividend date, otherwise the ex-dividend date had to be
deferred until the following week. From the dividend stream, we construct a
one-month moving average of the dividend and then compute the dividend price
ratio. The lumpiness in the dividends series is due to two factors. First, in
our sample, ex-dividend dates occurred typically every other week. Because of

9
Easley, O’Hara, and Srinivas (1998) provide a theoretical model and convincing evidence
that options volume is related to future underlying returns. Unfortunately, we do not have
historical FTSE 100 option volume data.

this mandated synchronization, dividends are distributed about once
every two weeks rather than uniformly. Second, U.K. companies typically pay
out semi-annually (rather than quarterly, as in the U.S.). This creates additional
lumpiness as well as uncertainty in dividends. Finally, while the dividend series
are ex-post persistent, there is significant ex-ante uncertainty about their actual
realizations, which is a source of concern for many financial analysts who follow
LSE-listed firms. In that respect, the U.K. and the U.S. stock markets are quite
similar.

2.3 The EEP: Magnitude and Dynamics

2.3.1 Summary Statistics

In this section, we describe the statistical properties of the early exercise premium.
Table 2.1 presents the summary statistics of the EEP for calls and puts. To
facilitate comparison, all numbers are expressed in annualized percent except
the EEPs and the trading volume.10 We see that the average EEP of calls and
puts is very different from zero. For calls, the EEP is 3.5 percent with a standard
deviation of 1.8 percent. The EEP of puts is even higher at 7.6 percent with a
standard deviation of 4.3 percent. In comparison, the average annualized return
of the FTSE 100 index is 9.6 percent with a standard deviation of 12.5 percent
and the average annualized dividend yield is 4 percent with a standard deviation
of 2.2 percent. The EEPs are more volatile, very skewed and exhibit significant
kurtosis compared to the returns of the underlying asset. This is due to the
convex payoff of options and to their natural leverage.

In Panel B of Table 2.1, we display the partial autocorrelation functions of

10
We multiply the daily returns by the number of trading days, 252.

the EEPs, the index return, and the index futures return. Column 1 shows the
autocorrelations of the call EEP and column 2 shows those of the put EEP. For
calls and puts, the EEPs are positively serially correlated and the correlations
are significant at up to 5 daily lags. In contrast, the FTSE 100 returns are
uncorrelated at all but the one-day lag, as can be seen from their autocorrelations,
which are displayed in column 3. Finally, the last column shows that the
partial autocorrelations of the index futures returns are close to zero at all lags.
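
For readers who want to reproduce statistics of this kind, partial autocorrelations can be obtained from sample autocorrelations with the Durbin-Levinson recursion. The sketch below is one standard way to do so and is not necessarily the estimator behind Table 2.1; the function names are illustrative.

```python
def autocorrelations(x: list[float], max_lag: int) -> list[float]:
    """Sample autocorrelations r[0], ..., r[max_lag] of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def partial_autocorrelations(r: list[float]) -> list[float]:
    """Durbin-Levinson recursion.

    r[0] must equal 1 and r[k] is the lag-k autocorrelation. The result
    holds the partial autocorrelations at lags 1, ..., len(r) - 1.
    """
    pacf = []
    prev = []  # AR coefficients of the order k-1 model
    for k in range(1, len(r)):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        prev.append(phi_kk)
        pacf.append(phi_kk)
    return pacf
```

A call such as `partial_autocorrelations(autocorrelations(series, 5))` returns the first five partial autocorrelations of a daily series.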

The observed serial correlation in the EEP s implies that the difference in
European and American prices is not white noise. Since the EEP will not be
zero when the dividends are non-zero or are paid in lump sums, we conjecture
that the persistence might be due to dividend shocks. Whether the empirically
observed serial correlations can be generated by dividends is an issue that we
tackle next.

2.3.2 Explaining the Magnitude and Dynamics of the EEP

It is interesting to know whether EEPs of such magnitudes and with such
statistical properties can be obtained using calibrated option pricing models.
Geske and Johnson (1984), Kim (1990), Carr, Jarrow, and Myneni (1992), Geske
and Roll (1984b), Barone-Adesi and Whaley (1987), Gukhal (2001) and several
others study the theoretical pricing of the American call and put options.11 The
first three papers mainly focus on the pricing of American put options and almost
all of them assume that the dividend yield is continuous because this assumption
simplifies the modeling of the option prices. Allowing for lump-sum
dividends leads to a problem that generally does not have a closed-form solution.

11
These papers attack the problem by solving a partial differential equation with a moving
boundary which is a problem that generally does not have a closed-form solution.

So does allowing for realistic fluctuations in the conditional variance and risk-free
interest rate.

In this paper, we take a different approach. We use the Longstaff and
Schwartz (2001) simulation method to value American and European options,
and then compute the EEP. Besides its simplicity, the Longstaff and Schwartz
(2001) approach is appropriate in our study for two reasons. First, it allows
for flexible and realistic calibration of the conditional variance and dividends of
the underlying security as well as the risk-free interest rate. Second, our goal
is to understand how the EEP varies with respect to the underlying parameters
rather than to find the exact solution to the American option pricing problem.
Therefore, calculating the EEP numerically and drawing comparative statics
serve our goal well.

We conduct two simulations. First, we examine how the dividend yield and
volatility affect the EEP in a constant dividend yield Black-Scholes model. The
goal of this exercise is to see whether realistic magnitudes of the lumpy dividends
and volatility can generate the empirically observed magnitudes of the average
EEP. In this simulation, we assume that the stock price follows a geometric
Brownian motion and the interest rate, the dividend yield and the volatility are
all constant. The underlying stock will pay a lumpy dividend in two weeks. Under
this setting, we calculate the price of a one-month at-the-money American option
with the Longstaff-Schwartz simulation. Then we compare it with the price of
a European option with the same contract details and obtain the EEP.
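
To make the mechanics concrete, the following is a minimal least-squares Monte Carlo sketch in the spirit of Longstaff and Schwartz (2001) for a one-month call with a single lump dividend. It is not the calibration used here. Everything about the setup is an assumption made for illustration: daily time steps, a quadratic regression basis, a proportional dividend `lump` paid at step `div_step`, exercise on the dividend day at the cum-dividend price, and all parameter values (in particular the 2 percent lump and the 5 percent interest rate).

```python
import numpy as np


def simulate_eep(s0=100.0, strike=100.0, r=0.05, sigma=0.12,
                 lump=0.02, steps=21, div_step=10,
                 n_paths=20000, seed=0):
    """Return (american, european, eep) for a call with one lump dividend."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / 252.0
    disc = np.exp(-r * dt)
    z = rng.standard_normal((n_paths, steps))
    growth = np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)

    s = np.full(n_paths, s0)
    exercise_price = np.empty((n_paths, steps))  # price available for exercise
    for t in range(steps):
        s = s * growth[:, t]
        exercise_price[:, t] = s           # cum-dividend on the dividend day
        if t + 1 == div_step:
            s = s * (1.0 - lump)           # the price then goes ex-dividend

    payoff = np.maximum(exercise_price - strike, 0.0)
    european = np.exp(-r * dt * steps) * payoff[:, -1].mean()

    # Backward induction: `cash` is each path's realized value under the
    # current exercise rule, discounted to the date being examined.
    cash = payoff[:, -1].copy()
    for t in range(steps - 2, -1, -1):
        cash *= disc
        itm = payoff[:, t] > 0.0
        if itm.sum() < 10:
            continue                       # too few paths for a regression
        x = exercise_price[itm, t] / strike
        basis = np.column_stack([np.ones_like(x), x, x**2])
        coef, *_ = np.linalg.lstsq(basis, cash[itm], rcond=None)
        continuation = basis @ coef
        exercise = payoff[itm, t] > continuation
        idx = np.flatnonzero(itm)[exercise]
        cash[idx] = payoff[idx, t]
    american = disc * cash.mean()
    return american, european, (american - european) / european
```

Calling `simulate_eep()` over a grid of `lump` and `sigma` values would trace out a surface comparable in spirit to the one discussed next, although the levels depend entirely on the assumed dividend specification.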

Figure 2.1 plots the magnitude of the EEP for different magnitudes of the
dividend yield and volatility. The top panel shows that the levels of the simulated
call EEP are close to the average EEP that we observe in the data. For instance,
when we set the dividend yield and the volatility to their average values from
the data (4% and 12%, respectively), our simulation generates an EEP of 0.028.
Recall that, in Table 2.1, the mean of the call EEP is 0.035. Similar results
obtain for the put EEP, shown in the bottom panel of the figure.

Figure 2.1 also illustrates the sensitivity of the EEP to changes in the dividend
yield and volatility. In the top panel, the call EEP surface is monotonic in both
directions. The call EEP increases with the dividend yield. This is intuitive
because, as the dividend yield increases, American option holders are more
incentivized to exercise before ex-dividend date in order to profit from the high
dividends, which raises the premium. Importantly, when volatility is in the 0.10
to 0.15 range, the call EEP is more convex in the level of the dividend yield.
Hence, in normal volatility regimes, the EEP is very sensitive to changes in the
dividend yield.

The call EEP is quite flat in the volatility space when the dividend yield is low
and decreases with volatility when the dividend yield is high. This latter effect
is due to two factors. First, volatility boosts the option value when American
option holders face the exercise decision and reduces the chance of exercising
early. Second, our EEP measure scales the absolute premium by the price of the
European contract which increases with volatility. For put options, the EEP is
decreasing in the dividend yield as higher dividends will reduce the incentives for
early exercise. The put EEP is also decreasing in volatility for the same reasons
as calls.

In a second simulation, we examine the dynamics of the EEP under the risk-
neutral measure when the interest rate and dividend yield follow an AR(1) process
and the volatility follows a GARCH(1,1) process. More specifically, we investigate
whether we can reproduce the serial correlation in the EEP observed in the data.
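To do so, we generate the risk-free rate, dividend yield, and conditional volatility
from

$$
\begin{aligned}
Rf_{t+1} &= \phi\, Rf_t + v_{t+1} \\
DY_{t+1} &= \rho\, DY_t + u_{t+1} \\
\sigma^2_{t+1} &= \kappa + \alpha \varepsilon_t^2 + \beta \sigma_t^2 ,
\end{aligned}
$$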

where $Rf_{t+1}$ is the risk-free rate, $DY_{t+1}$ is the dividend yield, and $\sigma_t^2$ is the
variance of excess returns. $\alpha$ and $\beta$ are the GARCH(1,1) coefficients for $\sigma_t^2$,
and $\phi$ and $\rho$ are the AR(1) coefficients for the risk-free rate and the dividend
yield, respectively. There are no risk premia for the state variables. Under this
dynamic setting, we simulate the underlying asset for 1000 steps, re-price the
same American option as above, and calculate the EEP. From the simulations
we obtain a time series of the EEP and calculate the AR(1) coefficients of the
call and the put EEP, respectively.
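
The state-variable dynamics and the AR(1) estimate can be sketched as follows. This covers only the simulation of the three processes and the estimation of a first-order autoregressive coefficient, not the option re-pricing step. The shock standard deviations `sd_v` and `sd_u`, the Gaussian shocks, the zero starting values, and the start at the unconditional variance are all assumptions, and the sketch requires that alpha plus beta is below one.

```python
import random


def simulate_state_variables(n_steps, phi, rho, kappa, alpha, beta,
                             sd_v, sd_u, seed=0):
    """Simulate Rf and DY as AR(1) processes and sigma^2 as GARCH(1,1)."""
    rng = random.Random(seed)
    rf, dy = [0.0], [0.0]                    # no intercepts, as in the equations
    var = [kappa / (1.0 - alpha - beta)]     # unconditional variance
    for _ in range(n_steps - 1):
        eps = (var[-1] ** 0.5) * rng.gauss(0.0, 1.0)   # return shock eps_t
        rf.append(phi * rf[-1] + rng.gauss(0.0, sd_v))
        dy.append(rho * dy[-1] + rng.gauss(0.0, sd_u))
        var.append(kappa + alpha * eps ** 2 + beta * var[-1])
    return rf, dy, var


def ar1_coefficient(x):
    """OLS slope from regressing x[t+1] on x[t] with an intercept."""
    lag, lead = x[:-1], x[1:]
    mean_lag = sum(lag) / len(lag)
    mean_lead = sum(lead) / len(lead)
    cov = sum((a - mean_lag) * (b - mean_lead) for a, b in zip(lag, lead))
    var_lag = sum((a - mean_lag) ** 2 for a in lag)
    return cov / var_lag
```

Applying `ar1_coefficient` to a simulated EEP series, once the option has been re-priced along the path, would give the statistic reported in the simulation table.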

Table 2.2 shows the simulation results of the AR(1) coefficients of the call and
put EEP for a set of different parameters of the data generating processes. The
first row uses the parameters that are estimated from our data. The call EEP
has an AR(1) coefficient of 0.361 which is remarkably similar to the one from the
data (0.377). In the second set of rows, we vary the persistence of the volatility.
The third and fourth set of rows show similar results for various persistence levels
of the risk-free rate and the dividend yield, respectively. These simulations show
that the more persistent the volatility, the dividend yield, and the interest
rate are, the higher is the AR(1) coefficient of the call EEP.

In particular, the serial correlation of the call EEP is very sensitive to the
persistence of the dividend yield process. A change in ρ from 0.906 to 0.800
results in a drastic reduction of its AR(1) coefficient from 0.361 to 0.084, which
represents a decrease of 76.7 percent. The persistence of the risk-free rate and of
the volatility have a more significant impact on the observed serial correlation of
the put EEP than on that of the call EEP. For instance, a decrease in the GARCH
parameter β from 0.890 to 0.800 results in a reduction of the AR(1) coefficient of the
put EEP of 32.1 percent ((.171 − .252)/.252) and of that of the call EEP of
22.1 percent ((.281 − .361)/.361). Since the φ and ρ parameters are likely to
be downward biased (Andrews (1993)), using higher values of these parameters
results in even higher serial correlation of both the call and the put EEPs.

These simulations suggest that the call EEP is particularly sensitive to the
level and the serial correlation of the lump sum dividends. In particular, the
ability to simulate an EEP process that, in the presence of lumpy dividends, is
very similar to the data leads us to conjecture that unexpected fluctuations in
the dividend yield process might be captured by the EEP . This is a hypothesis
that we investigate in the next section.

2.4 The Information Content of the EEP

We have so far established the dependence and sensitivity of the EEP to
variations in dividend yield, volatility, and interest rates. In this section, we
address two natural questions. First, does the EEP contain information related
to future returns of the underlying asset? Second, if such a relation exists, what
is its provenance?

2.4.1 Predictive Regressions

2.4.1.1 Excess Market Returns

To investigate whether the EEP contains information about future stock returns,
we run the daily predictive regression

\[
R_{t+1} = \alpha + \beta\, EEP_t + \gamma X_t + \varepsilon_{t+1} \tag{2.1}
\]

where $R_{t+1}$ is the excess return of the FTSE 100 index over the one-month
UK Treasury rate from t to t + 1, $EEP_t$ is the daily call EEP, and $X_t$ is a
vector of additional predictive variables observable at t, such as the dividend
yield ($DY_t$), the risk-free rate ($Rf_t$), the implied variance of the
at-the-money option ($Var_t$), and the change in its volume ($\Delta Vlm_t$).
The dividend yield and the other variables might not be perfect predictors,
especially at short horizons, but they help us understand whether the EEP
contains additional information and provide us with a yardstick to measure the
information content of the EEP.

The results from different specifications of the regressions are shown in Table
2.3 Panel A. In column 1, we display the benchmark case of the dividend
yield as the only predictor of returns, because it has been used as a return
predictor in numerous studies (e.g., Campbell and Shiller (1988b)) with U.S.
data. The coefficient on the dividend yield is positive but not significant and
the variable explains little variation in daily excess returns because the dividend
yield is very smooth as discussed in Valkanov (2003). The t-statistics reported
in parentheses below the estimates are computed using Newey and West (1987)
heteroskedasticity and autocorrelation robust standard errors.
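
The sketch below illustrates a one-predictor version of regression (2.1) with Newey and West (1987) standard errors based on the Bartlett kernel. It is a minimal illustration under stated assumptions: the function name, the lag length of 5, and the simulated data are hypothetical, no small-sample correction is applied, and the regressions in Table 2.3 include additional controls.

```python
import math
import random


def predictive_regression_nw(y_next, x, lags):
    """OLS of y_{t+1} on a constant and x_t with Newey-West errors.

    y_next[i] is the return realized one period after x[i] is observed.
    Returns (alpha, beta, t-statistic of beta).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y_next) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y_next)) / sxx
    alpha = mean_y - beta * mean_x
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y_next)]

    # Scores for the slope: demeaned regressor times residual.
    g = [(xi - mean_x) * ui for xi, ui in zip(x, resid)]

    # Long-run variance of the scores with Bartlett weights 1 - l / (lags + 1).
    s = sum(gi * gi for gi in g)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)
        s += 2.0 * weight * sum(g[t] * g[t - lag] for t in range(lag, n))

    se_beta = math.sqrt(s) / sxx
    return alpha, beta, beta / se_beta


if __name__ == "__main__":
    rng = random.Random(1)
    x = [rng.gauss(0.0, 1.0) for _ in range(2000)]          # hypothetical predictor
    y_next = [0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]   # hypothetical returns
    print(predictive_regression_nw(y_next, x, lags=5))
```

Setting `lags=0` reduces the calculation to heteroskedasticity-robust (White) standard errors for the slope.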

In column 2 of Table 2.3 Panel A, we add the EEP, which measures the premium
of American relative to European call options. Its coefficient is positive
and statistically significant. This is one of the main results of the paper.
Adding the EEP also appears to improve the model fit, as the $R^2$ increases to
0.7 percent, a modest level for a daily predictive regression. The sign is in
line with what we expect from economic intuition and from the simulations
displayed in Figure 2.1 and Table 2.2. A higher and persistent EEP implies that
investors expect higher lump sum dividend payments and are ready to pay a
higher premium for American relative to European options.

In columns 3 to 5, we incrementally add other variables that are known to
predict returns. In column 3, we include the lagged daily index excess return to
test whether the observed predictability is due to a mechanical serial correlation
in returns. The coefficient of the lagged return is statistically insignificant,
but its inclusion increases the $R^2$ of the regression. More importantly, its
addition does not diminish the predictive power of the EEP, whose point estimate
and statistical significance are almost unchanged. In column 4, we add the
at-the-money European option implied variance and the risk-free rate. We include
a measure of the conditional variance because Ghysels, Santa-Clara, and Valkanov
(2005b) show that it is positively related to future returns for the US stock
market. The implied variance can also be interpreted as a proxy for the wildcard
option that is included in the EEP. Fleming and Whaley (1994) model this wildcard
premium explicitly and find that it is mostly affected by the volatility of the
at-the-money option. Campbell (1991) argues that the risk-free rate is also a
good predictor of excess returns because it is a proxy for variations in the
investment opportunity set. In Table 2.3, the estimated coefficients on the
volatility and the risk-free rate have signs that agree with studies of the US
stock market, but they are not statistically significant. The lack of
significance of these variables and of the dividend yield at the one-day horizon
is not surprising: they are persistent and have low variance, and are thus
better at predicting returns at monthly, quarterly, or annual horizons.

Finally, in column 5 we control for the lagged change in FTSE 100 share volume
over the previous day as a proxy for liquidity in the FTSE 100 market at day
t. Amihud and Mendelson (1986) and Pastor and Stambaugh (2003) show that
liquidity has a large impact on future returns. In Table 2.3, high liquidity
precedes lower future returns, which is consistent with previous findings. The
change in volume is the only significant variable other than the EEP. However,
its inclusion does not alter the point estimate or the significance of the EEP.
Hence, the EEP appears to capture information about future returns that is
orthogonal to that in the other predictors.

Although the coefficient of the EEP is statistically significant, its magnitude
might appear small when compared to that of the benchmark forecaster, the
dividend yield. To understand the economic magnitude of the predictability, it
is helpful to compare the impact on excess returns of a one-standard-deviation
shock to each of these variables. A one-standard-deviation shock to the dividend
yield results in a 4.9 basis point ($0.350 \times 0.022/\sqrt{252}$) increase in
the next day's index return. A similar shock to the EEP produces a 6.5 basis
point change ($0.036 \times 0.018$) in next period's returns.¹² Hence, on a
daily basis, the economic significance of the EEP is roughly one-third higher
than that of the dividend yield (6.5 versus 4.9 basis points). The coefficient
on the change in volume is difficult to compare to that of the EEP because it is
not in percent.
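
The two magnitudes can be reproduced with a few lines of arithmetic. The snippet below is only a check of the figures quoted in the paragraph above; the variable names are illustrative, and the division by the square root of 252 follows the scaling shown in the text.

```python
import math

TRADING_DAYS = 252  # scaling factor that appears in the text

# Coefficients and standard deviations as quoted in the text.
dy_coef, dy_sd = 0.350, 0.022
eep_coef, eep_sd = 0.036, 0.018

# Effect of a one-standard-deviation shock, in basis points.
dy_effect_bp = 1e4 * dy_coef * dy_sd / math.sqrt(TRADING_DAYS)
eep_effect_bp = 1e4 * eep_coef * eep_sd

print(f"dividend yield: {dy_effect_bp:.1f} bp")  # 4.9 bp
print(f"EEP:            {eep_effect_bp:.1f} bp")  # 6.5 bp
```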

To summarize the findings in Table 2.3, all variables enter with the
economically expected sign in predicting the FTSE 100 index return and replicate
studies for the US stock market. However, the only significant predictors at the
daily frequency are the EEP and the change in trading volume. Whether the
forecasting ability of the EEP is spurious or the result of various
microstructure issues is something we investigate extensively below.

¹² The standard deviations of the variables are in Table 2.1.

2.4.1.2 Index Futures Returns

We have shown that the EEP predicts the market excess returns at a daily
horizon. However, there are two potential issues with the FTSE 100 stock index
results. First, they may be due to non-synchronous trading. Indeed, some stocks
in the index may not trade in the closing hour of the market and therefore our
results from the cash index may be due to stale quotes. Second, in our sample
period, stock market trading ceases at 4:30pm and the options market closes
at 4:10pm. Although the stock market closes later than the derivatives market
and we are not using any future information when conducting our prediction
study, it is interesting to investigate whether the return predictability we
found is caused by the movement of the stock market between 4:10pm and 4:30pm.
Since this 20-minute window is also the period when the wildcard option can
be exercised, investigating the exact timing gives us one more way to control
for the effects of the wildcard option.

We address both of these issues by using the returns of the FTSE 100
index futures. The futures contract is not subject to the non-synchronous
trading problem. Table 2.1 Panel B indicates that the index futures returns
exhibit little autocorrelation even at a one-day lag, which suggests that
futures returns serve our goal well in mitigating the problem of stale prices.
This is not surprising because the futures contracts are actively traded and
highly liquid. Moreover, the futures market closes at 4:10pm, at the same time
as the options market.

Table 2.4 Panel A shows the same predictive regressions as Table 2.3, but the
forecasted variable is the FTSE 100 futures return instead of the spot return.
In all specifications, our main predictive variable, the EEP, is economically
and statistically significant. The coefficients of the EEP equal 0.043 in all
specifications and are about 15% higher than those in the market excess
return prediction. Economically, the EEP is more important in predicting the
returns of the index futures, and the statistical significance is comparable
with the previous case. All other predictive variables remain insignificant
except the change in trading volume. One thing worth noting is that the point
estimate of the lagged return is negative and very close to zero. In other
words, using the index futures returns does help us eliminate the
non-synchronous trading problem. More importantly, we clearly see that our
predictability is not due to the 20-minute return before the stock market close.
This rules out the possibility that the documented predictability is due to the
wildcard option.

In summary, the predictability of the EEP is strengthened in forecasting the
index futures returns. This predictive power is not caused by the wildcard option
or potential non-synchronous trading in the cash index.

2.4.1.3 Returns at Longer Horizons

We have shown that the EEP predicts the market excess returns and the
index futures returns of the following day. Here we investigate longer horizon
predictability for two reasons. First, it is interesting to see how rapidly
information diffuses from the options market to the underlying asset. Second,
microstructure-related dynamics could generate spurious predictability. In this
section, we address these issues by examining whether the EEP predicts the
excess market and the index futures returns at horizons of two days, three days,
and up to two weeks.

In Panel B of Tables 2.3 and 2.4 we display the results for excess market
returns and index futures returns. Since the two sets of results are virtually
the same, we focus on the case of excess market returns in Panel B of
Table 2.3. Column 1 in that panel contains the results from a forecasting
regression of two-period returns, $R_{t+1,t+2}$. The point estimate of the EEP
coefficient is 0.035, slightly lower than the estimate of 0.036 obtained in the
one-period regressions. The t-statistics and the $R^2$ are also lower. In
columns 2 to 4, we present the results where the forecasted variable is
$R_{t+2,t+3}$, $R_{t+3,t+4}$, and $R_{t+4,t+5}$, respectively. The coefficients
of the EEP in these regressions are even lower and become statistically
insignificant. The explanatory power also declines. The forecastability
completely disappears at horizons longer than three days. Finally, in column 5,
in the forecast of $R_{t+6,t+10}$, the coefficient of the EEP goes down further
and the EEP does not carry any predictive power for the next week's weekly
return. The high $R^2$ is evidently due to the overlapping of the weekly
returns.

The longer horizon results suggest that the forecastability of the EEP is
mostly observable at horizons of one and two days. At longer horizons, the
magnitude of the EEP coefficients decreases gradually and becomes insignificant
after two days or so. This pattern suggests that it takes about one to two days
for the stock market to digest the information in the EEP. The fact that our
findings are robust even when we include the lagged returns makes it unlikely
that the result is due to market microstructure effects. Given the results in this
section, from now on we concentrate on one-day returns, $R_{t+1}$.

2.4.2 The Provenance of the Predictability

We conjecture that the predictive power of the EEP, documented in the
previous sub-section, is due to its sensitivity to changes in expectations about
future lump sum dividend payments. We test this conjecture below using two
alternative approaches. The approaches differ with respect to the identification
and frequency of the dividend growth shocks. However, the empirical results are
remarkably similar, which indicates that our findings are a robust feature of
the data.

2.4.2.1 The Campbell-Shiller (1988) Decomposition

The EEP predictability of FTSE 100 returns is at short horizons. In order to
identify the source of this predictability, we use the structural VAR approach
of Campbell and Shiller (1988a) and Campbell (1991). More specifically, we
specify a vector $z_{t+1}$ which contains the (demeaned) excess FTSE 100
returns, $DY_{t+1}$, $\sigma^2_{t+1}$, $rf_{t+1}$, and $\Delta V_{t+1}$. Then we
estimate the VAR, $z_{t+1} = A(L) z_t + w_{t+1}$, where
$A(L) = A_1 + A_2 L + A_3 L^2 + \dots + A_p L^{p-1}$. The residuals in the
vector $w_{t+1}$ are the one-period-ahead forecasting errors. More specifically,
the first term in $w_{t+1}$, $w^{(R)}_{t+1}$, is the difference between the
realized and the forecasted return, or
$w^{(R)}_{t+1} = R_{t+1} - E_t(R_{t+1})$, where $E_t$ denotes the conditional
expectation formed from the VAR at the end of period t. However, $w^{(R)}_{t+1}$
has no economic interpretation.

The VAR results are shown in Table 2.5, where p = 1; the order of the VAR
was chosen by sequential pre-testing. The first column is similar to Table 2.3
with the exception that the EEP is omitted from the system. We do not include
the EEP in the VAR because our goal is to understand whether it forecasts
$E_t(R_{t+1})$ or some components of $w_{t+1}$. Including the EEP in the VAR
would imply that, by construction, it would be uncorrelated with the residuals
$w_{t+1}$ and it would be correlated with $E_t(R_{t+1})$. There are also
economic reasons for not including the EEP in the set of conditioning
information. First, the EEP is not directly observable for most assets. Unlike
the dividend yield, the short rate, or the conditional variance, it is virtually
inaccessible to investors. Second, as argued above, the EEP is unlikely to be as
good a proxy for expected returns as the other variables in the VAR.

Following Campbell and Shiller (1988b) and Campbell (1991), we use a
linearized version of the dynamic Gordon growth model to decompose unexpected
returns into innovations due to changes in dividend growth, changes in discount
rates, and changes in interest rates. If $rf_t$ is the log risk-free rate and
$d$ is the log dividend, then we can write

\[
\begin{aligned}
w^{(R)}_{t+1} &= R_{t+1} - E_t(R_{t+1}) \\
&= (E_{t+1} - E_t) \sum_{j=0}^{\infty} \rho^j \Delta d_{t+1+j}
 - (E_{t+1} - E_t) \sum_{j=0}^{\infty} \rho^j rf_{t+1+j}
 - (E_{t+1} - E_t) \sum_{j=0}^{\infty} \rho^j R_{t+1+j} \\
&= \eta_{d,t+1} - \eta_{Rf,t+1} - \eta_{R,t+1},
\end{aligned}
\tag{2.2}
\]

where $\Delta$ denotes a one-period difference, and the linearization parameter
$\rho$ is a constant related to the long-run dividend yield and it is smaller
than 1. This equation has the following economic interpretation. If the
unexpected return is positive, then either expected future dividend growth
$\eta_{d,t+1}$ must be higher than previously expected, or the risk-free rate
$\eta_{Rf,t+1}$ must be lower than expected, or the excess future returns
$\eta_{R,t+1}$ must be lower than expected, or any combination of these three
must hold true.

After estimating the VAR, the ex-post return is
$R_{t+1} = \hat{E}_t(R_{t+1}) + \hat\eta_{d,t+1} - \hat\eta_{Rf,t+1} - \hat\eta_{R,t+1}$,
where “ˆ” denotes the estimated values. Since $R_{t+1}$ is forecastable
by the EEP, it is interesting to investigate where the forecastability is coming
from. We note that while $\hat{E}_t(R_{t+1})$ is uncorrelated by construction
with $\hat\eta_{d,t+1}$, $\hat\eta_{Rf,t+1}$, and $\hat\eta_{R,t+1}$, the latter
three shocks to returns are correlated.
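
For a first-order VAR (p = 1), the discounted revisions in expectations that appear in (2.2) have a closed form. The following is a sketch of that standard calculation, assuming that all eigenvalues of $\rho A$ lie inside the unit circle; the selection vector $e_i$, which picks the i-th element of $z$, is notation introduced here for illustration and does not appear in the text.

```latex
% Assumes z_{t+1} = A z_t + w_{t+1} with demeaned z; e_i selects element i of z.
\begin{align*}
(E_{t+1} - E_t)\, z_{t+1+j} &= A^{j} w_{t+1}, \qquad j \ge 0, \\
(E_{t+1} - E_t) \sum_{j=0}^{\infty} \rho^{j}\, e_i' z_{t+1+j}
  &= e_i' \sum_{j=0}^{\infty} (\rho A)^{j} w_{t+1}
   = e_i' (I - \rho A)^{-1} w_{t+1}.
\end{align*}
```

With $e_i$ selecting the excess return or the risk-free rate, this expression yields the return and interest rate terms from the fitted $A$ and the residuals. Because dividends are not part of the VAR, the dividend growth term can then be backed out as the remainder in (2.2), $\hat\eta_{d,t+1} = \hat{w}^{(R)}_{t+1} + \hat\eta_{Rf,t+1} + \hat\eta_{R,t+1}$. If a sum starts at j = 1 rather than j = 0, $(I - \rho A)^{-1}$ is replaced by $\rho A (I - \rho A)^{-1}$.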

To disentangle the source of the predictability, we regress the four components
of the realized return, $\hat{E}_t(R_{t+1})$, $\hat\eta_{d,t+1}$,
$\hat\eta_{Rf,t+1}$, and $\hat\eta_{R,t+1}$, on the previous day's EEP.
The results from these regressions are reported in Table 2.6. The EEP forecasts
changes in the dividend growth process: the coefficient on $\hat\eta_{d,t+1}$ is
positive and significant. These results are in agreement with economic intuition
and the simulation results in section 3. Higher unexpected lump sum dividends
lead to a larger American option premium and a larger EEP, as the American
contract is more likely to be exercised prior to the ex-dividend date in order to
take advantage of the larger dividend payout. As we will see below, this result
is a robust feature of the data. This finding is also consistent with the findings
in Amin and Lee (1997), who document that option traders initiate a greater
proportion of long (short) positions a few days before good (bad) earnings news.

The EEP does not forecast changes in expected excess returns. The coefficient
on $\hat\eta_{R,t+1}$ in Table 2.6 is positive but insignificant. The
coefficient on $\hat\eta_{Rf,t+1}$ is negative but also insignificant, which
implies that an increase in the EEP leads to an (insignificant) increase in
future returns through an unexpected lowering of interest rates.¹³ Finally, the
EEP is negatively correlated with forecasted next-day returns,
$\hat{E}_t(R_{t+1})$. The coefficient is negative and significant only at the
ten percent level. The negative estimate is probably due to the fact that the
forecasted daily returns are a noisy proxy for expected returns, which are
better estimated at longer horizons. The sub-sample findings presented in
Table 2.6 will be discussed in the robustness section.

If we take the results from Table 2.6 at face value, the forecasting ability of the
EEP is due to its significant correlation with future changes in dividend growth.
To understand the economic significance of this correlation, we compute the effect
of a one-standard-deviation shock to the EEP on subsequent returns. For the
dividend growth, this is 7.2 basis points ($0.040 \times 0.018$). The standard
deviation of $\hat\eta_{d,t+1}$ in the VAR is 97 basis points. In other words, a
one-standard-deviation shock to the EEP moves $\hat\eta_{d,t+1}$ by roughly 7
percent (7.2/97) of its standard deviation.

¹³ Lower unexpected interest rates lead to higher returns. Note that
$\hat\eta_{Rf,t+1}$ enters with a negative sign in equation (2.2).

2.4.2.2 An Alternative Method

We have thus far shown that the EEP forecasts future changes in the dividend
growth process, where the innovations in dividend growth were obtained using
the Campbell and Shiller (1988a) VAR decomposition. It is reasonable to ask
whether this decomposition accurately identifies the dividend growth innovations
in returns. To answer this question, we take a more direct empirical approach
to isolating dividend growth shocks. Using the FTSE 100 dividends, $D_t$, we
construct a dividend growth rate series, $DG_t = \log(D_t) - \log(D_{t-1})$. The
dividends are available on the ex-dividend date for the FTSE 100 index. If the
hypothesis that the EEP contains information about future dividend growth rates
is correct, then the daily EEP series must forecast fluctuations in the $DG_t$
series. The advantage of this approach is that the $DG_t$ series is directly
observable and does not have to be identified from the returns series.
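
The construction of the growth series can be sketched in a few lines. The function name and the dividend amounts below are hypothetical and serve only to illustrate the log-difference definition given above.

```python
import math


def dividend_growth(dividends):
    """Log growth rates DG_t = log(D_t) - log(D_{t-1}) for consecutive dates."""
    return [math.log(cur) - math.log(prev)
            for prev, cur in zip(dividends[:-1], dividends[1:])]


# Hypothetical index dividends on three successive ex-dividend dates.
example_dividends = [10.0, 11.0, 9.9]
growth = dividend_growth(example_dividends)
print([round(g, 4) for g in growth])
```

Because the series is built from log differences, the growth rates sum to the log change between the first and the last observation.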

Two obstacles stand in the way of investigating more directly whether the
EEP forecasts fluctuations in dividend growth rates. First, as mentioned above,
the lumpy $DG_t$ series is available at a bi-weekly frequency, whereas the EEP
series is daily. Aggregating the EEP to a bi-weekly horizon is not suitable in
this case because, as shown in Table 2.3, the forecasting relation occurs at
horizons of no more than a couple of days. In other words, running the
forecasting regressions at a bi-weekly frequency would obscure the daily
lead-lag effect. Second, there are seasonalities in the dividend and dividend
growth processes, which might produce spurious correlation in a forecasting
relation.

To address both of these concerns, we use the following mixed data sampling
(MIDAS) regression (Ghysels, Santa-Clara, and Valkanov (2005b)):

\[
DG_{Ht} = \alpha + \phi(L)\, DG_{H(t-1)}
 + \gamma \sum_{k=1}^{K} \beta(k, \theta)\, EEP_{t-k} + e_t \tag{2.3}
\]

where $\phi(L) = 1 + \phi_1 L + \phi_2 L^2 + \dots + \phi_p L^p$ is a polynomial
in $L$ of order $p$. The subscript of $DG_{Ht}$ reflects the fact that the
dividend growth rate is available only once every $H$ periods. In our case,
$H = 10$, that is, the dividend growth rate is observable once every two weeks.
The AR($p$) component captures seasonal components in $DG_{Ht}$. The second part
of the regression, $\gamma \sum_{k=1}^{K} \beta(k, \theta)\, EEP_{t-k}$, is the
MIDAS term. In that expression, we use lagged daily EEPs to forecast the
bi-weekly dividend growth rates. In other words, the dividend growth rate of,
say, July 1st, 1995 will be regressed on $p$ own lags as well as on lagged daily
EEPs starting June 30th and going back $K$ days.

Since the number of lagged daily EEPs needed to capture the dynamics of the
dividend growth rate might be large, an unrestricted specification of the weights
results in a large number of parameters to estimate. The cost of parameter
proliferation is that the coefficients will be estimated imprecisely and the
regression will produce poor out-of-sample forecasts. To reduce the number of
coefficients to estimate, we follow the MIDAS regression approach and
parameterize the coefficients on the $EEP_{t-k}$ using a function
$\beta(k, \theta)$. The lag function is parsimoniously parameterized and its
parameters are collected in a vector $\theta$. Ghysels, Santa-Clara, and
Valkanov (2005a) show that a suitable parameterization of $\beta(k, \theta)$
circumvents the problems of parameter proliferation and of choosing the
truncation point $K$. We also normalize the weights $\beta(k, \theta)$ to add up
to one, which allows us to estimate a scale parameter $\gamma$. The
normalization is useful because $\gamma$ captures the overall predictive power
of lagged EEPs, while the dynamics of the EEPs are captured by the weights.

In general, there are many ways of parameterizing $\beta(k, \theta)$. We focus
on the Beta function specification (also used by Ghysels, Santa-Clara, and
Valkanov (2005a)), which has only two parameters, or
$\theta = [\theta_1; \theta_2]$:

\[
\beta(k, \theta) =
 \frac{f\left(\frac{k}{K}, \theta_1; \theta_2\right)}
      {\sum_{j=1}^{K} f\left(\frac{j}{K}, \theta_1; \theta_2\right)} \tag{2.4}
\]

where $f(z, a, b) = z^{a-1} (1 - z)^{b-1} / \beta(a, b)$ and $\beta(a, b)$ is
based on the Gamma function, or
$\beta(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$. The flexibility of the Beta
function is well known. The function is often used in Bayesian econometrics to
impose flexible, yet parsimonious prior distributions. It can take many shapes,
including flat weights, gradually declining weights as well as hump-shaped
patterns. While MIDAS regressions are not limited to Beta distributed lag
schemes, for our purpose we focus our attention on this specification. We refer
to Ghysels, Santa-Clara, and Valkanov (2004, 2005b) for alternative weight
specifications.
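
The weights in (2.4) and the MIDAS term in (2.3) can be sketched as follows. This is a minimal illustration, not the estimation code behind Table 2.7: the function names are hypothetical, the parameter values in the demonstration are arbitrary rather than estimated, and the sketch requires $\theta_2 \geq 1$ and $K \geq 2$ so that the kernel stays finite at $k = K$ and the weights do not all vanish.

```python
def beta_lag_weights(K, theta1, theta2):
    """Normalized weights beta(k, theta) for k = 1..K, as in equation (2.4).

    The constant beta(a, b) in f cancels between numerator and denominator,
    so only the kernel z**(a - 1) * (1 - z)**(b - 1) is evaluated.
    """
    if K < 2 or theta2 < 1:
        raise ValueError("this sketch requires K >= 2 and theta2 >= 1")
    kernel = [(k / K) ** (theta1 - 1.0) * (1.0 - k / K) ** (theta2 - 1.0)
              for k in range(1, K + 1)]
    total = sum(kernel)
    return [value / total for value in kernel]


def midas_term(gamma, weights, lagged_eep):
    """gamma * sum_k beta(k, theta) * EEP_{t-k}; lagged_eep[0] is EEP_{t-1}."""
    return gamma * sum(w * e for w, e in zip(weights, lagged_eep))


if __name__ == "__main__":
    flat = beta_lag_weights(45, 1.0, 1.0)       # equal weights of 1/45
    declining = beta_lag_weights(45, 1.0, 5.0)  # weights decay with the lag
    humped = beta_lag_weights(45, 3.0, 5.0)     # hump-shaped weights
    print(humped.index(max(humped)) + 1)        # lag carrying the largest weight
```

Because the weights sum to one, a constant EEP of c over all K lags contributes exactly γc, which is why γ can be read as the total impact of the lagged EEPs.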

The predictive MIDAS regression (2.3) and (2.4) is estimated by
quasi-maximum likelihood and the results are reported in Table 2.7. In the first
column, we report the baseline case of regressing the dividend growth rate on
three of its lags with no lagged EEPs (no MIDAS terms). The order of the lags
was selected by sequential pre-testing. The estimated lag coefficients are all
significant, which confirms our assertion that the dividend series contain
seasonal components. The $R^2$ of this regression is 0.297.

In the second column of Table 2.7, we add the lagged daily EEPs, where $K$
is set to 45 days, or two months' worth of daily observations.¹⁴ If the
conjecture that the early exercise premium contains information about future
dividend growth is correct, then we expect the $\gamma$ coefficient to be
positive and statistically different from zero. Consistent with this conjecture,
we obtain a $\gamma$ estimate of 3.093, which is statistically significant at
the 1 percent level. A joint F-test of the hypothesis that all the MIDAS
parameters ($\gamma$, $\theta_1$, and $\theta_2$) are equal to zero also rejects
at the 1 percent level. Since the $\beta(k, \theta)$ function is normalized to
sum to one, we can interpret the estimate of $\gamma$ as the total impact of the
lagged EEPs on future dividend growth.

The parameter estimates $\theta_1$ and $\theta_2$ are difficult to interpret
because they have no economic meaning. In contrast, the shape of the polynomial
$\beta(k, \theta)$ has a clear economic interpretation. It captures the rate at
which information is incorporated from the EEPs into the dividend growth
component. The shape can be interpreted as the impulse response of the dividend
growth rate to EEP fluctuations. The estimated $\beta(k, \theta)$ plotted as a
function of the daily lags is displayed in Figure 2.2 using the estimates of
$\theta_1$ and $\theta_2$ from Table 2.7. A few interesting findings emerge.

First, most of the mass is concentrated on only four to five daily EEPs, which
suggests that the predictability is at short horizons. Otherwise, we would expect
to see more weight on a larger fraction of lagged EEPs. Second, the mass is
located on EEPs between 15 and 18 days before the ex-dividend date.
Dividend payments for the FTSE 100 index stocks are announced between 10
and 15 days before the ex-dividend date. That period is shaded in the figure.
The shape of the estimated weights clearly shows that most of the predictability
occurs right before the announcement period, which suggests that our MIDAS
procedure accurately captures the timing of when information is incorporated
into prices. To summarize the findings in the figure, the concentration of the
mass and the location of the weights corroborate the evidence from the previous
section that the predictability is at short horizons and is due to news about
dividend growth rates.

¹⁴ We experimented with K as large as 130 days (about 6 months) and the results
were almost identical.

2.4.3 Discussion and Related Literature

The EEP forecasts the underlying returns at short horizons. Moreover, its
predictive ability is related mostly to innovations to the dividend growth
component of returns rather than discount rates or expected returns. Both
of these findings are consistent with the view that information about future
cash flows is first revealed in option prices rather than in the price of the
underlying security. This result is consistent with Black’s (1975) view that
informed investors prefer to trade in the options market. Moreover, the very
short horizon nature of the predictability indicates that while information does
not flow instantaneously between the options and the underlying markets, it is
incorporated quite efficiently.

Informed trading might be even more prominent in individual stocks than in
the stock index. However, on any ex-dividend date, at most a fraction of the
companies go ex-dividend. From that perspective, trading in the index is a close
but not perfect substitute for trading in the individual stock, with a lower
price impact. Also, the predictability that we observe at the index level is
likely to be an attenuated version of the information diffusion we would observe
if we had data for individual stock options. Unfortunately, as mentioned above,
the FTSE 100 index is the only asset with comparable European and American
option contracts.

The daily lag in the information flow is consistent with Sims (2001) and Shiller
(2000), who explore the implications of limited information-processing capacity
for asset prices. These authors argue that investors, rather than possessing
unlimited processing capacity, are better characterized as being only boundedly
rational. The inability of investors to immediately incorporate all relevant
information into prices gives rise to short horizon predictability across markets.
Hong, Torous, and Valkanov (2004) make a similar point by linking the slow
diffusion of cash flow information across industries to short horizon cross-asset
return predictability. We are the first to document a similar phenomenon in
the options market, by linking the underlying return predictability to the EEP’s
ability to forecast mainly innovations to dividend growth.

Our paper is related to several others that use option market information to
forecast underlying returns. Manaster and Rendleman (1982) show that if we take
the volatility as given and impute the implied stock prices from the options, this
implied stock price predicts the future stock return one day ahead. Anthony
(1988) shows that shocks to option trading volume lead shocks to stock trading
volume by one day. However, Stephan and Whaley (1990), Chan, Chung, and Johnson
(1993), and others find no evidence that price changes in option markets lead
price changes in the underlying. Easley, O’Hara, and Srinivas (1998) find that
option market volume predicts underlying returns, which is consistent with the
view that informed investors trade in the options market. Pan and Poteshman
(2004) also find that option trading volume contains information about future
stock price movements and argue that the source of the predictability is
non-public information possessed by option traders. In relation to previous
work, the novel contributions of this paper are: (i) the introduction of the
call EEP as a short horizon predictor of the underlying return; (ii) an argument
for the economic reasons behind the predictability; and (iii) supporting
empirical evidence that links the predictability to news about future cash
flows.

The presented evidence may also be used to sharpen theoretical discussions
that have followed Black (1975). For instance, Biais and Hillion (1994) show
that theoretically the option market can be more or less informative about the
underlying asset’s payoff depending on the modeling assumptions. Moreover,
Easley, O’Hara, and Srinivas (1998) show that, under certain conditions, a
pooling equilibrium exists where some informed traders choose to trade in both
the options market and the underlying market. Their theoretical results in
general support Black’s intuition that the options market might convey some
distinctive information. However, as Back (1993) pointed out, if the options
market and the underlying market both work as in Kyle (1984), trades in the
options will move the underlying market as well. Our specific trading strategy,
in contrast, has the property that it is neutral in the option market: it is
long an American call and short a European call. If the American and European
contracts have similar price impacts on the underlying market, then the long and
short positions will have virtually no price impact on the underlying index.
Therefore the underlying market will not react immediately, which induces the
lead-lag relationship.

An alternative explanation for our findings might be that the early exercise
premium is purely driven by irrational financial market behavior which also has
an impact on underlying returns. While there is some evidence that individual
customers engage in irrational exercising of options, Poteshman and Serbin (2003)
show that larger traders exhibit no irrational exercise behavior. Hence, this is
not a compelling explanation for the FTSE 100 index options, which are widely
held and traded by both individual and institutional investors.

2.5 Robustness

In this section, we provide several robustness checks of the main results in Tables
2.3 and 2.6. Some of these tests are motivated by economic theory, while others
address statistical concerns.

2.5.1 Subsamples: Pre- and Post-1994

To investigate the stability of our predictability results, we break the 1992-1996
sample into two sub-samples: June 1992 to July 18, 1994 and July 19, 1994 to
January 12, 1996. The July 19, 1994 break date was chosen for two reasons. First,
the exchange changed its settlement system on that date, which resulted in a
substantial change in the effect of short-selling restrictions. This change might
affect the hedging and therefore the trading of options.¹⁵ Second, this
date splits our sample into sub-samples with an approximately equal number of
observations.

Table 2.8 presents the predictive regression results in the two sub-sample
periods. The entire sample results (from Table 2.3) are also displayed in the
first column for convenience. We observe that the EEP predicts future FTSE
100 returns in both sub-samples even after controlling for all other commonly
used predictors. Interestingly, the point estimates of the EEP coefficient in the
sub-samples, 0.038 and 0.048, are very similar to that of the entire sample, 0.036.
Importantly, the estimates remain statistically significant despite the short
sub-samples. The slight reduction in the t-statistics is undoubtedly due to the
fact that we have fewer observations in the two sub-periods, which decreases the
power of our tests.

¹⁵ Prior to July 18, 1994, the LSE followed a fixed date (rather than fixed
period) settlement regulation, in which all transactions within a two or three
week “account settlement period” were settled on the second Monday of the
following account settlement period, making ex-dividend dates two or three weeks
apart. After July 18, 1994, even though the settlement system changed to settle
5 trading days after a transaction, ex-dividend dates have largely continued the
historical practice of being only on the first day of the week, and typically
every two weeks.

The stability of the forecasting relation that we document is quite remarkable,
especially at short horizons. For instance, none of the other variables predicts
the FTSE 100 in both sub-samples. The coefficient on the change in volume, which
was significant in the entire sample, is also significant in the first
sub-sample, but not in the second one.

To investigate whether the predictability in the sub-samples is due to unexpected
fluctuations in dividend growth, we re-do the VAR decompositions and present
the results in the bottom two rows of Table 2.6. In both sub-periods, a higher
EEP predicts higher future dividends, and the EEP does not predict any of the
other components. These results agree with the findings from the whole sample
and with economic intuition. Remarkably, the point estimates in the
sub-samples, 0.058 and 0.058, are identical to each other (to the third digit)
and are similar to the estimate from the entire sample, 0.040. In the first
sub-sample, the coefficient in the η̂_{d,t+1} regression is significant only at
the 10 percent level, probably because of the lack of power of the test in the
short sample.

The sub-sample results in Tables 2.6 and 2.8 lead us to conclude that the
predictive ability of the EEP is a robust feature of the data.
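
The sub-sample exercise can be illustrated with a minimal sketch on simulated data. This is not the estimation code behind Table 2.8: it uses a single predictor, assumes a Newey-West (Bartlett kernel) correction with a hypothetical lag length of 5 for the heteroskedasticity- and autocorrelation-consistent t-statistic, and the names `hac_slope` and `split`, the sample sizes, and the simulated series are all assumptions made for illustration.

```python
import random


def hac_slope(x, y, lags=5):
    """OLS slope of y on x (with intercept) and its Newey-West t-statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xd = [xi - mx for xi in x]                      # demeaned predictor
    sxx = sum(v * v for v in xd)
    b = sum(v * (yi - my) for v, yi in zip(xd, y)) / sxx
    a = my - b * mx
    # Score terms: demeaned predictor times the regression residual.
    g = [v * (yi - a - b * xi) for v, xi, yi in zip(xd, x, y)]
    s = sum(v * v for v in g)                       # lag-0 (White) term
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)                  # Bartlett weight
        s += 2.0 * w * sum(g[t] * g[t - l] for t in range(l, n))
    se = (s ** 0.5) / sxx
    return b, b / se


# Simulated data: next-day return loads on today's predictor (assumed slope 0.3).
random.seed(0)
n, split = 894, 528                                 # hypothetical sizes
eep = [random.gauss(0.0, 1.0) for _ in range(n + 1)]
ret = [0.0] + [0.3 * eep[t] + random.gauss(0.0, 1.0) for t in range(n)]
x, y = eep[:-1], ret[1:]                            # pair eep[t] with ret[t+1]

results = {
    "full": hac_slope(x, y),
    "first": hac_slope(x[:split], y[:split]),
    "second": hac_slope(x[split:], y[split:]),
}
for name, (b, t) in results.items():
    print(f"{name:>6}: slope={b:.3f}  t={t:.2f}")
```

Comparing the slope and t-statistic across the three entries mirrors the comparison of columns in the sub-sample table: a stable slope with a somewhat smaller t-statistic in the shorter samples is what reduced power alone would produce.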

2.5.2 Put EEP Results

Thus far, our focus has mainly been on the call EEP, even though the dividend
yield also has an impact on the EEP of put options. The concentration on the
call EEP was guided by two main reasons. First, the EEP of put options is
positive even when the underlying security does not pay dividends or when the
dividend stream is continuous. In contrast, a positive call EEP can arise only
when dividends are paid in lump sums. To put it differently, American put
options can be optimally exercised earlier than maturity for reasons other than
lumpy dividends. Therefore, the put EEP is not as sensitive and unambiguous
an indicator of expected future dividends as is the call EEP . The second reason
for not including put EEP in our main analysis is that higher dividend yields
increase the cost of early exercising put options, all else equal. Indeed, we have
seen in Figure 2.1 that the dividend yield impacts the EEP of call and put options
in opposite directions.

With these arguments in mind, we expect that the predictive ability of the
put EEP will be lower than that of the call EEP and the sign on the predictor
will be reversed. In Table 2.9, we use the put EEP to run the same predictive
regression as we did with the call EEP (Table 2.3). As expected, the coefficient
on the put EEP is negative, because higher put EEP indicates lower expected
future dividends, everything else equal. Also expected is the fact that the put
EEP coefficient is not statistically significant. While the point estimates are
stable in the sample and across sub-samples, the t-statistics are never above one.
As anticipated, the put EEP is a much noisier predictor of the underlying stock’s
returns, because it is a function of many other variables in addition to dividends.

The put EEP results serve as an additional robustness check that our findings
are not spurious. Indeed, it may be argued that the predictability is due
to market micro-structure differences between the options and the underlying
market. The fact that we do not observe the predictability with the put EEP is a
clear demonstration that our results are not due to such automatic correlations
and indirectly supports our main premise.

2.5.3 Alternative EEP Aggregation Methods

In the construction of our EEP variable, we do not control for the time-to-
maturity of each contract. While the EEP certainly depends on the time span of
the options, this dependence is complicated by the timing and lumpiness of the
dividend payments and is therefore highly non-linear.

As an attempt to investigate the impact of the time-to-maturity on our results,
we provide the following simple, albeit not fully satisfactory, robustness check.
For every trading day, we linearly interpolate the EEPs of all the matched pairs
on that day in the moneyness and time-to-maturity space and use the fitted value
of the EEP at moneyness equal to 1 and time-to-maturity equal to one month as
the EEP measure for an at-the-money, constant-maturity option on that day. With
this interpolation, we construct a time series of 894 daily EEPs. The daily
interpolated, constant-maturity EEPs are denoted by EEP_t^Mat. As an alternative
measure, we construct a daily EEP by averaging the EEP of all contracts on each
day. This daily averaged EEP, denoted by EEP_t^Avg, serves as yet another
robustness check.
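
The two aggregation schemes can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction used for Table 2.10: "linear interpolation" is read here as a least-squares plane fitted through one day's (moneyness, maturity, EEP) observations, maturity is measured in years so that one month is 1/12, and the function names and the sample data are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, 3):
            f = m[r][i] / m[i][i]
            for c in range(i, 4):
                m[r][c] -= f * m[i][c]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][c] * x[c] for c in range(i + 1, 3))) / m[i][i]
    return x


def eep_constant_maturity(pairs, m0=1.0, tau0=1.0 / 12.0):
    """Fit EEP = c0 + c1*(m - m0) + c2*(tau - tau0); c0 is the fitted value at (m0, tau0)."""
    rows = [(1.0, m - m0, tau - tau0) for m, tau, _ in pairs]
    ys = [e for _, _, e in pairs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve3(xtx, xty)[0]


def eep_average(pairs):
    """Equal-weighted average of the day's EEPs."""
    return sum(e for _, _, e in pairs) / len(pairs)


# Hypothetical matched pairs for one trading day, generated from an exact plane.
grid = [(0.95, 0.05), (1.00, 0.10), (1.05, 0.20), (1.02, 0.15), (0.98, 0.25)]
pairs = [(m, tau, 0.03 + 0.1 * (m - 1.0) + 0.2 * (tau - 1.0 / 12.0)) for m, tau in grid]

eep_mat = eep_constant_maturity(pairs)   # constant-maturity, at-the-money EEP
eep_avg = eep_average(pairs)             # simple daily average
print(eep_mat, eep_avg)
```

Centering the regressors at the target point means the intercept is itself the fitted at-the-money, one-month value, so no separate prediction step is needed. The two measures differ whenever the day's contracts are not centered on that point, which is why the averaged series is a distinct robustness check.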

Panels A and B in Table 2.10 contain the results from the predictive
regressions with these two new EEP forecasters. The predictive regressions also
include the other forecasting variables. We provide the results for the entire
sample as well as for the two sub-samples. The results in Table 2.10 are very
similar to those of the previous tables in terms of magnitudes of the estimates as
well as statistical significance. These additional robustness checks are reassuring
that our results are not driven by the particular construction of the EEP .

2.6 Conclusion

In this paper, we use the call EEP to examine the information flow between
the stock market and the derivative market. We first show that the empirically
observed level and serial correlation of the EEP can be reproduced when the
dividend yield process is lumpy and persistent. Therefore, it is reasonable to
suspect that the observed EEP reflects the market participants’ expectation
about future dividends. We explore the information content of the EEP by
asking whether the EEP forecasts the FTSE 100 index return. Based on the
estimation results of time-series regression models, we further identify the source
of this predictability.

Our results show that in a time-series regression the EEP predicts the
underlying FTSE 100 index at the daily horizon. Economically, this forecasting
relationship is about 50% stronger than that of the widely used benchmark, the
dividend yield. The economic and statistical significance is robust to the
addition of other control variables. We conjecture that the EEP predicts the
underlying asset’s
return because it is a forward-looking variable that contains the information
about expected future dividends. We verify this hypothesis by decomposing
the realized returns into expected returns and three different components of
unexpected returns and find that the EEP indeed predicts the dividend shock
component of the index return. This result confirms our hypothesis that the EEP
reflects the option market’s expectation about future fundamentals.

Traditional literature has studied the information flow between the options
market and underlying asset market through option prices, volume and signed
volume. Our study introduces the EEP as a short-horizon predictor of underlying
returns, and links this forecastability to the fundamentals of the underlying
market. This link provides clear support for the Black (1975) conjecture that
informed investors prefer to trade on their information about fundamentals in
the options market rather than the underlying market.

Table 2.1: Summary Statistics

Panel A reports the summary statistics for all the variables. EEP^Call and EEP^Put represent
the EEP of call and put options. R_t is the FTSE 100 index return. R_t^fut is the FTSE 100
index futures return. DY_t is the one-month moving average of the dividend yield. Rf_t is the
one-month stochastically detrended risk-free rate. Var_t is the implied variance of the
closest-to-the-money European call option. ∆Vlm_t is the change in share volume (in million
shares). All variables except ∆Vlm_t are annualized. Panel B shows the partial autocorrelations
of the call EEP, the put EEP, the index return, and the index futures return. The t-statistics
are in parentheses.

Panel A: Summary Statistics

              Mean     Std   Skewness   Kurtosis   Observations
EEP^Call     0.035   0.018      2.052      9.120            894
EEP^Put      0.076   0.043      2.384     10.491            894
R_t          0.096   0.125      0.210      2.663            916
R_t^fut      0.055   0.146     -3.082      3.892            916
DY_t         0.040   0.022      0.806      0.401            916
Rf_t        -0.001   0.004     -1.818      4.108            916
Var_t        0.025   0.011      1.442      5.911            894
∆Vlm_t       0.030   0.248      3.852     22.219            916

Panel B: Partial Autocorrelations of EEPs and Returns

Lag   EEP^Call    EEP^Put        R_t    R_t^fut
  1      0.377      0.250      0.059     -0.018
       (3.840)    (4.213)    (1.889)   (-0.540)
  2      0.184      0.223      0.045     -0.007
       (3.232)    (4.825)    (0.964)   (-0.198)
  3      0.128      0.089     -0.022     -0.034
       (3.905)    (4.100)   (-1.073)   (-1.000)
  4      0.009      0.165      0.032      0.005
       (0.152)    (4.633)    (0.628)    (0.135)
  5      0.006      0.096      0.021      0.023
       (0.482)    (3.569)    (0.542)    (0.695)
 10      0.074      0.043      0.004     -0.005
       (3.512)    (1.000)    (0.033)   (-0.159)
 20     -0.020      0.025      0.017      0.023
      (-0.494)    (0.747)    (0.555)    (0.676)

Table 2.2: Simulating the Dynamics of the EEP

This table reports the dynamics of the EEP for calls and puts when the options are priced using
numerical valuation. Its purpose is to illustrate the serial correlation in the EEP for various
persistence levels of the underlying dividend yield, interest rate, and volatility processes. The
parameters α and β are the GARCH(1,1) coefficients for the underlying asset’s volatility. φ
and ρ are the AR(1) coefficients for the risk-free rate and the dividend yield, respectively. Each
sample simulates 1000 steps of the underlying asset. The EEP s for calls and puts are calculated
and the AR(1) coefficients are obtained from the calculated results.

      σ_t^2           Rf_t     DY_t    EEP^Call   EEP^Put
    α        β         φ        ρ       AR(1)      AR(1)
  0.104    0.890     0.992    0.906     0.361      0.252
  0.104    0.800     0.992    0.906     0.281      0.171
  0.104    0.700     0.992    0.906     0.240      0.070
  0.104    0.890     0.900    0.906     0.117      0.020
  0.104    0.890     0.800    0.906     0.095      0.016
  0.104    0.890     0.992    0.800     0.084      0.099
  0.104    0.890     0.992    0.700     0.073      0.036
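
The state processes behind this kind of simulation can be sketched as below. The sketch covers only the GARCH(1,1) variance and the two AR(1) states, plus the AR(1) coefficient estimate applied to a simulated series; the numerical American option valuation that produces the EEP is not reproduced. The GARCH intercept `omega`, the unit innovation scales, the zero means, and all function names are assumptions, since the table reports only α, β, φ, and ρ.

```python
import random


def ar1_coefficient(series):
    """OLS slope of the demeaned series on its own first lag."""
    n = len(series)
    mean = sum(series) / n
    d = [v - mean for v in series]
    return sum(d[t] * d[t - 1] for t in range(1, n)) / sum(v * v for v in d[:-1])


def simulate_states(steps=1000, alpha=0.104, beta=0.890, phi=0.992, rho=0.906, seed=0):
    """Simulate (variance, risk-free rate, dividend yield) for a number of steps."""
    rng = random.Random(seed)
    omega = 1e-6                           # assumed GARCH intercept (not in the table)
    var = omega / (1.0 - alpha - beta)     # start at the unconditional variance
    rf, dy = 0.0, 0.0                      # demeaned AR(1) states
    path = []
    for _ in range(steps):
        shock = rng.gauss(0.0, 1.0) * var ** 0.5
        var = omega + alpha * shock ** 2 + beta * var   # GARCH(1,1) update
        rf = phi * rf + rng.gauss(0.0, 1.0)             # AR(1) risk-free rate
        dy = rho * dy + rng.gauss(0.0, 1.0)             # AR(1) dividend yield
        path.append((var, rf, dy))
    return path


path = simulate_states()
dy_ar1 = ar1_coefficient([p[2] for p in path])
print(f"estimated AR(1) of simulated dividend yield: {dy_ar1:.3f}")
```

In the table, the same AR(1) estimate is applied to the simulated call and put EEP series rather than to the dividend yield itself; lowering β, φ, or ρ in `simulate_states` corresponds to moving down the rows.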

Table 2.3: Predictive Regressions of Index Excess Returns

This table reports predictive regressions of excess returns by the early exercise premium and
other forecasters for different horizons. The dependent variables are the excess return of the
FTSE 100 index. DYt is the one-month moving average of the dividend yield. EEPtCall is the
early exercise premium of call options. Rt is the FTSE 100 index excess return (lagged). Vart
is the implied variance of the closest to the money European call option. Rft is the one-month
stochastically detrended risk free rate. ∆Vlm is the change in share volume (in million shares).
The t-statistics in parentheses are corrected for heteroskedasticity and autocorrelation. Panel
A compares the predictive regression of the next day’s index excess returns under different
specifications. Panel B examines the return predictability at longer horizons of up to two weeks
using all the predictors (the most exhaustive specification from Panel A). The numbers in the
square brackets indicate the number of days by which the returns lead the explanatory variables.
For example, the column labeled [2, 3] shows the regression of R_{t+2,t+3} on all the explanatory
variables at time t.

Panel A: Index Excess Returns

                 (1)       (2)       (3)       (4)       (5)
DY_t           0.511     0.350     0.492     0.313     0.490
             (0.162)   (0.111)   (0.165)   (0.104)   (0.163)
EEP_t^Call               0.036     0.038     0.039     0.036
                       (2.623)   (2.783)   (2.872)   (2.890)
R_t                                0.074     0.071     0.078
                                 (1.432)   (1.345)   (1.476)
Var_t                                        5.717     5.716
                                           (0.830)   (0.827)
Rf_t                                        -5.076    -3.630
                                          (-0.321)  (-0.229)
∆Vlm_t                                                -0.512
                                                    (-2.239)
R^2            0.000     0.007     0.012     0.014     0.018

Panel B: Longer Horizons

               [1, 2]    [2, 3]    [3, 4]    [4, 5]   [6, 10]
DY_t            0.917     0.249    -3.234    -2.075     1.437
              (0.300)   (0.079)  (-0.993)  (-0.702)   (0.743)
EEP_t^Call      0.035     0.021     0.011     0.028     0.011
              (2.623)   (1.528)   (0.826)   (1.740)   (1.224)
R_t             0.043    -0.023     0.019     0.014    -0.035
              (1.208)  (-0.616)   (0.519)   (0.388)  (-1.950)
Var_t          -1.219     4.780     6.527    16.699     7.847
             (-0.140)   (0.553)   (0.753)   (1.724)   (1.112)
Rf_t          -15.267    -5.602    -3.371     8.125    -5.128
             (-0.947)  (-0.350)  (-0.203)   (0.489)  (-0.476)
∆Vlm_t          0.275     0.119     0.348     0.015    -0.103
              (0.959)   (0.598)   (1.680)   (0.064)  (-1.182)
R^2             0.010     0.004     0.006     0.012     0.020

Table 2.4: Predictive Regressions of Index Futures Returns

This table reports predictive regressions of excess futures returns by the early exercise premium
and other forecasters for different horizons. This table is similar to Table 2.3 above with the
exception that the dependent variable is the return of FTSE 100 futures contracts rather than
the index itself. DYt is the one-month moving average of the dividend yield. EEPtCall is the
early exercise premium of call options. R_t^fut is the (lagged) return of the FTSE 100 futures
contract. Var_t is the implied variance of the closest-to-the-money European call option. Rf_t
is the one-month stochastically detrended risk-free rate. ∆Vlm is the change in share volume
(in million shares). The t-statistics in parentheses are corrected for heteroskedasticity and
autocorrelation. Panel A compares the predictive regression of the next day’s index futures
returns under different specifications. Panel B examines the return predictability at longer
horizons of up to two weeks using all the predictors (the most exhaustive specification from
Panel A). The numbers in the square brackets indicate the number of days by which the returns
lead the explanatory variables. For example, the column labeled [2, 3] shows the regression of
R_{t+2,t+3} on all the explanatory variables at time t.

Panel A: Returns of the Index Futures

                 (1)       (2)       (3)       (4)       (5)
DY_t           1.972     1.777     1.768     1.819     2.032
             (0.543)   (0.493)   (0.488)   (0.504)   (0.563)
EEP_t^Call               0.043     0.043     0.043     0.043
                       (2.659)   (2.623)   (2.663)   (2.679)
R_t^fut                           -0.013    -0.012    -0.005
                                (-0.363)  (-0.331)  (-0.129)
Var_t                                       -2.135    -2.205
                                          (-0.277)  (-0.286)
Rf_t                                       -10.175    -8.431
                                          (-0.550)  (-0.455)
∆Vlm_t                                                -0.644
                                                    (-2.385)
R^2            0.000     0.008     0.008     0.008     0.013

Panel B: Longer Horizons

               [1, 2]    [2, 3]    [3, 4]    [4, 5]   [6, 10]
DY_t            1.701     1.274    -3.080    -1.989     0.910
              (0.490)   (0.354)  (-0.853)  (-0.583)   (0.450)
EEP_t^Call      0.036     0.017     0.010     0.025     0.009
              (2.379)   (1.099)   (0.585)   (1.516)   (1.050)
R_t^fut        -0.016    -0.035    -0.005     0.024    -0.038
             (-0.429)  (-0.965)  (-0.140)   (0.676)  (-2.474)
Var_t          -2.025     1.377     5.480    10.206     4.901
             (-0.215)   (0.156)   (0.624)   (1.097)   (0.811)
Rf_t          -14.135    -5.043    -1.478     7.417    -6.021
             (-0.781)  (-0.288)  (-0.081)   (0.412)  (-0.549)
∆Vlm_t          0.329     0.260     0.391    -0.021    -0.111
              (0.950)   (1.018)   (1.661)  (-0.074)  (-1.156)
R^2             0.008     0.003     0.004     0.005     0.015

Table 2.5: VAR Results

The table contains the vector autoregression (VAR) estimates of $A$ in $z_{t+1} = A z_t + w_{t+1}$, where
the vector $z_t$ is demeaned and is defined as $z_t = [R_t, DY_t, Var_t, Rf_t, \Delta Vlm_t]'$. R_t is the
FTSE 100 index excess return. DY_t is the one-month moving average of the dividend yield. Var_t is
the implied variance of the closest-to-the-money European call option. Rf_t is the one-month
stochastically detrended risk-free rate. ∆Vlm_t is the change in share volume (in million shares).
The order of the VAR was chosen with sequential pretesting. The t-statistics in parentheses are
corrected for heteroskedasticity and autocorrelation.

            R_{t+1}    DY_{t+1}       Var_{t+1}         Rf_{t+1}   ∆Vlm_{t+1}
R_t           0.071     -0.0001   4.832 × 10^-5   -3.805 × 10^-6        0.003
            (1.341)    (-0.316)         (0.167)         (-0.102)      (0.688)
DY_t          0.536       0.907           0.007           -0.001        0.713
            (0.176)    (51.877)         (0.681)         (-0.172)      (2.041)
Var_t         5.512       0.035           0.708           -0.012       -1.989
            (0.781)     (0.892)        (14.852)         (-1.477)     (-2.812)
Rf_t         -1.306       0.061          -0.206            0.965       -0.089
           (-0.082)     (0.823)        (-1.522)         (48.303)     (-0.073)
∆Vlm_t       -0.506      -0.001          -0.002          -0.0003       -0.102
           (-2.210)    (-1.482)        (-2.791)         (-0.897)     (-3.683)
R^2           0.010       0.824           0.555            0.877        0.020
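
A first-order VAR of this form can be estimated equation by equation with OLS. The sketch below is a minimal illustration on a simulated two-variable system, not the five-variable estimation reported in the table; the matrix `a_true`, the sample length, and the helper names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def estimate_var1(z):
    """Equation-by-equation OLS estimate of A in z[t+1] = A z[t] + w[t+1]."""
    n, k = len(z), len(z[0])
    means = [sum(row[j] for row in z) / n for j in range(k)]
    d = [[row[j] - means[j] for j in range(k)] for row in z]   # demeaned data
    lag, lead = d[:-1], d[1:]
    xtx = [[sum(r[i] * r[j] for r in lag) for j in range(k)] for i in range(k)]
    a_hat = []
    for eq in range(k):
        xty = [sum(r[i] * nxt[eq] for r, nxt in zip(lag, lead)) for i in range(k)]
        a_hat.append(solve(xtx, xty))                          # row `eq` of A
    return a_hat


# Simulated two-variable example with a known (hypothetical) coefficient matrix.
rng = random.Random(1)
a_true = [[0.1, 0.5], [0.0, 0.9]]
z = [[0.0, 0.0]]
for _ in range(2000):
    prev = z[-1]
    z.append([sum(a_true[i][j] * prev[j] for j in range(2)) + rng.gauss(0.0, 1.0)
              for i in range(2)])

a_hat = estimate_var1(z)
print(a_hat)
```

Each row of `a_hat` holds the coefficients of one equation. The table appears to list regressors in rows and equations in columns, which would be the transpose of this layout.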

Table 2.6: Identifying the Source of the Forecastability

This table reports the regressions of different components of returns on the EEP. Ê_t(R_{t+1}) is
the fitted value from the VAR regression. η̂_{R,t+1} represents news about future excess returns.
η̂_{d,t+1} represents news about cash flows. η̂_{Rf,t+1} represents news about the risk-free rate.
All four components are regressed on the call option EEP at time t. The t-statistics in
parentheses are corrected for heteroskedasticity and serial correlation.

                                 ŵ^R_{t+1}
Sample period        η̂_{d,t+1}   η̂_{R,t+1}   η̂_{Rf,t+1}   Ê_t(R_{t+1})
6/1/92 to 1/12/96        0.040       -0.000         0.001         -0.003
                       (2.642)     (-0.394)       (0.460)       (-1.714)
R^2                      0.007        0.000         0.000          0.005

6/1/92 to 7/18/94        0.058        0.001          0.02         -0.004
                       (1.752)      (0.730)       (0.879)       (-1.232)
R^2                      0.005        0.000         0.001          0.004

7/19/94 to 1/12/96       0.058       -0.000         0.012         -0.001
                       (2.376)     (-1.564)       (1.793)       (-0.734)
R^2                      0.016        0.005         0.004          0.002

Table 2.7: Dividend Growth Forecast: MIDAS Regression

This table shows results for the following mixed-data sampling (MIDAS) regression of the bi-weekly
dividend growth rate ($DG_t$) on its own lags and lags of daily call EEPs:
$DG_t = \alpha + \phi(L) DG_{t-1} + \gamma \sum_{k=1}^{K} \beta(k, \theta)\, EEP_{t-k/14} + e_t$.
Details about the MIDAS regression are in the text. The t-statistics are in parentheses. The
F-statistic tests the MIDAS model against the benchmark model in column 1 under the null
hypothesis that lagged call EEPs do not forecast the dividend growth rate. The F-statistic is
shown and its p-value is in parentheses.

                    (1)         (2)
DG_{t-1}         -0.286      -0.289
               (-3.019)    (-3.056)
DG_{t-2}         -0.534      -0.540
               (-4.886)    (-4.942)
DG_{t-3}         -0.606      -0.617
               (-6.877)    (-6.995)

γ                             3.093
                           (21.525)
θ_1                         289.014
                            (1.385)
θ_2                         500.000
                            (1.406)

R^2               0.297       0.326
F                             4.766
                            (0.004)
Sample Size         116         116
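
The lag weights β(k, θ) in a MIDAS regression are typically generated by a low-dimensional parametric function. The functional form is described in the text rather than in this table, so the sketch below simply assumes a normalized Beta-polynomial weighting scheme with two parameters, which is a common choice in the MIDAS literature; the number of lags (45) and the mapping of lag k to k/(K+1) are likewise assumptions made for illustration.

```python
import math


def beta_lag_weights(theta1, theta2, n_lags):
    """Normalized Beta-polynomial weights on lags 1..n_lags (they sum to one)."""
    logs = []
    for k in range(1, n_lags + 1):
        x = k / (n_lags + 1.0)          # map lag k into the open interval (0, 1)
        logs.append((theta1 - 1.0) * math.log(x) + (theta2 - 1.0) * math.log(1.0 - x))
    top = max(logs)                     # subtract the max before exponentiating
    raw = [math.exp(v - top) for v in logs]
    total = sum(raw)
    return [v / total for v in raw]


# Point estimates of theta_1 and theta_2 from the table; K = 45 is hypothetical.
weights = beta_lag_weights(289.014, 500.000, n_lags=45)
peak_lag = 1 + max(range(len(weights)), key=weights.__getitem__)
print(peak_lag, max(weights))
```

Working in logs keeps the computation stable when the shape parameters are in the hundreds, as they are here. Large values of both parameters concentrate almost all of the weight on a narrow band of lags, so the regression effectively picks out the EEP observed a fixed number of days before the dividend growth observation.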

Table 2.8: Predictive Regression in Subsamples
This table reports predictive regressions of returns by the early exercise premium
and other forecasters for different sub-sample periods. In Panel A, the forecasted variable is
the index excess return, whereas in Panel B, it is the index futures contract return. DY_t is
the one-month moving average of the dividend yield. EEP_t^Call is the early exercise premium
of call options. Var_t is the implied variance of the closest-to-the-money European call option.
Rf_t is the one-month stochastically detrended risk-free rate. ∆Vlm is the change in share volume
(in million shares). The first column shows the whole sample period and the next two columns
display the results for the two sub-samples. The choice of the sub-samples is explained in the
text. The t-statistics in parentheses are corrected for heteroskedasticity and autocorrelation.
Panel A: Index Excess Return

Sample period   6/1/92 to 1/12/96   6/1/92 to 7/18/94   7/19/94 to 1/12/96
DY_t                        0.490              -1.350                1.480
                          (0.163)            (-0.294)              (0.349)
EEP_t^Call                  0.036               0.038                0.048
                          (2.890)             (2.117)              (2.118)
R_t                         0.078               0.107                0.020
                          (1.476)             (1.489)              (0.359)
Var_t                       5.716               6.646               11.849
                          (0.827)             (0.742)              (0.892)
Rf_t                       -3.630               5.273              -32.259
                         (-0.229)             (0.221)             (-0.703)
∆Vlm_t                     -0.512              -0.898               -0.042
                         (-2.239)            (-3.085)             (-0.130)
R^2                         0.018               0.022                0.016
Sample Size                   894                 528                  376

Panel B: Index Futures Return

Sample period   6/1/92 to 1/12/96   6/1/92 to 7/18/94   7/19/94 to 1/12/96
DY_t                        2.032               1.336                2.924
                          (0.563)             (0.245)              (0.574)
EEP_t^Call                  0.043               0.041                0.056
                          (2.679)             (2.026)              (1.903)
R_t^fut                    -0.005               0.028               -0.059
                         (-0.129)             (0.582)             (-1.039)
Var_t                      -2.205              -3.866                4.643
                         (-0.286)            (-0.415)              (0.288)
Rf_t                       -8.431              -7.553               -9.087
                         (-0.455)            (-0.293)             (-0.161)
∆Vlm_t                     -0.644              -1.051              -0.0148
                         (-2.385)            (-3.158)             (-0.364)
R^2                         0.013               0.017                0.016
Sample Size                   894                 528                  376

Table 2.9: Robustness Check: Put Results

This table reports the predictive regression of return using the put EEP (instead of call EEP ).
The dependent variable is the excess return of the FTSE 100 index. DYt is the one-month
moving average of the dividend yield. EEPtPut is the early exercise premium of put options.
Rt is the (lagged) FTSE 100 index excess return. Vart is the implied variance of the closest to
the money European call option. Rft is the one-month stochastically detrended risk free rate.
∆Vlm is the change in share volume in million. The first column shows the whole sample period
and the next two columns show the results for the two sub-samples. The t-statistics in parentheses
are corrected for heteroskedasticity and autocorrelation.

Sample period   6/1/92 to 1/12/96   6/1/92 to 7/18/94   7/19/94 to 1/12/96
DY_t                        1.813               0.452                1.668
                          (0.614)             (0.102)              (0.400)
EEP_t^Put                  -0.007              -0.008               -0.006
                         (-0.926)            (-0.939)             (-0.440)
R_t                         0.063               0.085                0.020
                          (1.276)             (1.285)              (0.365)
Var_t                      10.870              12.771               12.301
                          (0.949)             (1.020)              (0.411)
Rf_t                       -2.312              -0.008              -33.606
                         (-0.120)             (0.000)             (-0.722)
∆Vlm_t                     -0.474              -0.798               -0.069
                         (-2.081)            (-2.751)             (-0.216)
R^2                         0.012               0.020                0.004
Sample Size                   894                 528                  376

Table 2.10: Predictive Regression under Alternative EEP Aggregation
This table presents the predictive regression of the future return under alternative EEP
aggregation methods. The dependent variable is the excess return of the FTSE 100 index.
DY_t is the one-month moving average of the dividend yield. EEP_t^Mat is the early exercise
premium of call options aggregated daily by interpolating both moneyness and time to maturity.
EEP_t^Avg is the early exercise premium of call options aggregated daily by averaging the EEP of
all contracts. R_t is the FTSE 100 index excess return. Var_t is the implied variance of the
closest-to-the-money European call option. Rf_t is the one-month stochastically detrended
risk-free rate. ∆Vlm is the change in share volume (in million shares). The first column shows
the whole sample period and the next two columns show the results for the two sub-samples. The
t-statistics in parentheses are corrected for heteroskedasticity and autocorrelation.
Panel A: EEP Controlled for Maturity
Sample period   6/1/92 to 1/12/96   6/1/92 to 7/18/94   7/19/94 to 1/12/96
DY_t                        0.877              -0.945                1.761
                          (0.291)            (-0.204)              (0.410)
EEP_t^Mat                   0.026               0.029                0.036
                          (2.529)             (2.300)              (2.008)
R_t                         0.079               0.109                0.017
                          (1.485)             (1.515)              (0.325)
Var_t                       4.767               6.374               11.322
                          (0.687)             (0.703)              (0.967)
Rf_t                       -4.717               5.150              -25.770
                         (-0.262)             (0.216)             (-0.553)
∆Vlm_t                     -0.508              -0.916               -0.161
                         (-2.198)            (-3.148)             (-0.457)
R^2                         0.015               0.026                0.014
Sample Size                   894                 528                  376

Panel B: Average EEP


Sample period   6/1/92 to 1/12/96   6/1/92 to 7/18/94   7/19/94 to 1/12/96
DY_t                       -0.327              -1.463               -0.189
                         (-0.109)            (-0.319)             (-0.044)
EEP_t^Avg                   0.042               0.046                0.045
                          (3.115)             (2.469)              (1.976)
R_t                         0.087               0.122                0.024
                          (1.638)             (1.668)              (0.422)
Var_t                       5.183               5.495               10.323
                          (0.752)             (0.621)              (0.784)
Rf_t                       -4.405               4.925              -31.313
                         (-0.279)             (0.208)             (-0.684)
∆Vlm_t                     -0.505              -0.904               -0.023
                         (-2.187)            (-3.078)             (-0.072)
R^2                         0.019               0.029                0.014
Sample Size                   894                 528                  376
Figure 2.1: Variation in the level of the EEP

The two graphs display the magnitudes of the call and put EEPs computed using numerical
valuations of at-the-money, one-month American option contracts. The risk-free rate is 8%.
The top (bottom) graph displays the call (put) EEP for various levels of the dividend yield
and volatility.

[Figure: two surface plots. Top panel: Call EEP. Bottom panel: Put EEP. Vertical axis: EEP.
Horizontal axes: Dividend Yield (0 to 0.08) and Volatility (0.1 to 0.35).]
Figure 2.2: MIDAS Weights

This graph pictures the shape of the β coefficients against the lagged days in the following
mixed-data sampling (MIDAS) regression:
$DG_t = \alpha + \phi(L) DG_{t-1} + \gamma \sum_{k=1}^{K} \beta(k, \theta)\, EEP_{t-k/14} + e_t$.
Dividend payments for the FTSE 100 index stocks are announced 10 to 15 days before
the ex-dividend date. The shaded pattern denotes that period. The daily EEPs contain
information about future dividends 2 to 3 weeks before the ex-dividend date, right before
the announcement of the dividend payments.

[Figure: MIDAS Weights. Vertical axis: Weight (0 to 0.5). Horizontal axis: Daily Lags (0 to 45).]

CHAPTER 3

Alternative Variance Estimators for Pricing Options

This paper examines a volatility estimation bias that may be common to all
option pricing models. Black and Scholes (1972) were the first to illustrate
the bias by showing that their model under-priced options on relatively low
variance stocks and over-priced options on relatively high variance stocks. The
bias is always observed in cross-section among individual stocks. Thus, we
think this bias might have nothing to do with Black-Scholes or any other option
pricing model but instead might be attributable to sampling error. If it is,
the bias should be observed with any option pricing model on any underlying,
not just equity, but also fixed income securities, mortgages, foreign exchange,
and commodities. To test this idea, we collect 100 months of call and put
equity option prices spanning 8 1/3 years from January, 1996 through April,
2004. The bias is indeed present and very significant in this sample. Alternative
variance estimators that use “shrinkage” techniques might be able to eliminate
the bias. We use the James-Stein shrinkage estimator detailed in Efron and Morris
(1976) and the shrinkage estimator of Ledoit and Wolf (2004b). While both
estimators utilize the covariance matrix, Ledoit-Wolf (LW hereafter) is unique
because it does not require matrix inversion. Thus the number of stocks can
exceed the number of observations, which is extremely advantageous for large
portfolios. We show that the variance bias can be eliminated using these
improved estimators. Using Theil’s decomposition, we study whether eliminating
the volatility bias with the “corrected” shrinkage estimates comes at the cost
of a larger prediction error. We find that Stein estimators do increase the
prediction error but the LW estimator does not. We also find that the LW optimum
shrinkage dominates randomly chosen shrinkage factors. Finally, we find that the
optimal shrinkage differs for stocks with different proportions of systematic
and idiosyncratic risk.

3.1 Introduction

The Black and Scholes (1973) option pricing model exhibits systematic mis-
pricing of options on individual stocks and options on indexes of stocks.
This mis-pricing has been related to moneyness (S/K), time to expiration,
and volatility. The mis-pricing has also been related to the Black-Scholes
distributional assumption, to their assumption of no dividend payouts, and to
the model’s European rather than American nature.1

This paper’s concern is the volatility bias observed in cross-section when
pricing options on individual stocks. Black and Scholes (1972) were the first to
report that their model under-priced options on low variance stocks and
over-priced options on high variance stocks. Black and Scholes used
over-the-counter (OTC) option data when they reported this variance bias because
listed options did not commence trading until April, 1973. OTC options are
quasi-European because OTC dividend protection eliminates the probability of
early exercise.2 Black (1975) later reported that the model also under-priced
out-of-the-money options and near maturity options, while it over-priced
in-the-money options on individual stocks. Macbeth and Merville (1979),
Rubinstein (1985), Whaley (1982), Sterk (1982), Geske and Roll (1984b), and
others discuss these biases but do not focus on the volatility bias.

There have been many theoretical papers concerned with the Black-Scholes
assumption of constant or deterministic stock return volatility. (Cf. Merton
(1976), Cox and Ross (1976), Geske (1979), Hull and White (1987), Heston
(1993), Bakshi, Cao, and Chen (1997), Heston and Nandi (2000).) However,
these papers would potentially alter the prices of options on all individual stocks
without a particular focus on the observed cross-sectional mis-pricing of options
on low and high variance stocks. Thus, in this paper we examine whether the
variance bias observed in the cross-section of individual option prices
can be attributed to estimation error in the sample variance.

1 Black and Scholes (1972), Black (1975), Macbeth and Merville (1979), Rubinstein (1985),
Whaley (1982), Sterk (1982), Geske and Roll (1984b).
2 See Geske, Roll, and Shastri (1983).

There is some a priori reason to suspect estimation error in the sample variance
rather than the model as the source of this particular mis-pricing. The reason is
that this variance-related mis-pricing always arises in the context of an
inter-stock comparison. This is in contrast to other biases (moneyness, time to
expiration), which can be detected in an inter-option comparison. Unlike the
strike price and time until expiration, the true variance is the same for all
options with the same expiration on the same stock on a given date. Thus,
investigation of the variance-related mis-pricing cannot rely on either the
implied variance or other more sophisticated option pricing models, but must
instead be based on historical estimates of actual stock return volatility.

There are many techniques to improve the accuracy of the volatility estimate
for individual stocks. (Cf. Boyle and Ananthanarayanan (1977), Parkinson
(1980), Garman and Klass (1980), and Butler and Schachter (1986), as well as
ARCH and GARCH models.) However, the essence of the present problem is that a
number of variances are estimated simultaneously, one for each stock, and then
option mis-pricing is related cross-sectionally to these many estimates.

The problem of simultaneously estimating multiple parameters is well known in
statistical theory. The cross-sectional sampling distribution
consists of two parts, variability in the true underlying population parameters and
variability in the estimation error. In any sample, larger estimates relative to the
cross-sectional mean are more likely to contain positive sampling errors and vice
versa for relatively smaller estimates. Thus, in a cross-sectional comparison of
option mis-pricing, estimation error alone will cause the model to over-price,
relative to the market, options on stocks with larger estimated variances and to
under-price options on stocks with smaller estimated variances. The Black-Scholes
model price, being an increasing function of the sample variance, should therefore
display mis-pricing that is positively related to the estimated variance in
cross-section. This is exactly the observed mis-pricing phenomenon.
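
The selection effect described above is easy to reproduce in a small simulation. The following sketch is illustrative only: the number of stocks, the sample length, the lognormal distribution of true variances, and the normal returns are all assumptions, not features of the option data used in this chapter.

```python
import math
import random

rng = random.Random(42)
n_stocks, n_obs = 1000, 60          # hypothetical cross-section and sample length

errors = []                         # (estimated variance, estimation error) per stock
for _ in range(n_stocks):
    # Assumed lognormal cross-section of true variances.
    true_var = math.exp(rng.gauss(math.log(0.09), 0.3))
    rets = [rng.gauss(0.0, math.sqrt(true_var)) for _ in range(n_obs)]
    mean = sum(rets) / n_obs
    est_var = sum((r - mean) ** 2 for r in rets) / (n_obs - 1)
    errors.append((est_var, est_var - true_var))

errors.sort()                       # order stocks by estimated variance
fifth = n_stocks // 5
low_err = sum(e for _, e in errors[:fifth]) / fifth      # lowest-estimate quintile
high_err = sum(e for _, e in errors[-fifth:]) / fifth    # highest-estimate quintile
print(f"mean error, low group: {low_err:.4f}; high group: {high_err:.4f}")
```

Each individual sample variance is unbiased, yet sorting on the estimates produces a negative average error in the low group and a positive one in the high group. A model price that increases in the variance input inherits the same pattern.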

When many variances are being estimated, one for each stock, a James and
Stein (1961) estimator is unambiguously superior to the standard univariate
estimator. The James-Stein estimator reduces estimation risk on average over
all stocks. Such an estimator “shrinks” each individual variance estimate toward
a target such as the grand mean of all estimates. Since the variance bias is
characterized by over-pricing options on high volatility stocks and under-pricing
options on low volatility stocks, adjusting each estimated volatility toward the
average volatility for all stocks obviously has the potential to reduce the observed
variance bias. In the multiple variance estimation setting, the superior James-
Stein estimation technique has the potential to eliminate this problem.
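
A positive-part, James-Stein-type shrinkage toward the grand mean can be sketched as follows. This is an illustration under stated assumptions rather than the estimator implemented in this chapter: shrinkage is applied to log sample variances, whose sampling variance is approximated by 2/(T-1) under normal returns, and the simulated data and all variable names are hypothetical.

```python
import math
import random

rng = random.Random(7)
n_stocks, n_obs = 500, 60            # hypothetical sizes

true_log, est_log = [], []
for _ in range(n_stocks):
    lv = rng.gauss(math.log(0.09), 0.3)              # assumed true log variance
    rets = [rng.gauss(0.0, math.exp(0.5 * lv)) for _ in range(n_obs)]
    m = sum(rets) / n_obs
    true_log.append(lv)
    est_log.append(math.log(sum((r - m) ** 2 for r in rets) / (n_obs - 1)))

grand_mean = sum(est_log) / n_stocks
spread = sum((x - grand_mean) ** 2 for x in est_log)
# Approximate sampling variance of a log sample variance under normality.
noise_var = 2.0 / (n_obs - 1)
# Positive-part factor: the fraction of each deviation from the grand mean that is kept.
keep = max(0.0, 1.0 - (n_stocks - 3) * noise_var / spread)
shrunk = [grand_mean + keep * (x - grand_mean) for x in est_log]


def mse(est):
    """Mean squared error against the (simulated) true log variances."""
    return sum((e - t) ** 2 for e, t in zip(est, true_log)) / n_stocks


mse_raw, mse_shrunk = mse(est_log), mse(shrunk)
print(f"keep={keep:.3f}  mse_raw={mse_raw:.4f}  mse_shrunk={mse_shrunk:.4f}")
```

The factor `keep` is close to one when the cross-sectional spread of the estimates is large relative to the sampling noise, and it falls toward zero when most of the spread is noise. Pulling high estimates down and low estimates up is exactly the adjustment that works against the cross-sectional variance bias.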

Geske and Roll (1984a) observed the variance bias and were the first to
attempt a correction based on a version of Stein’s technique described in Efron
and Morris (1976).3 However, this particular “shrinkage” technique involves two
difficult questions. First, how much historical data should be used to estimate
individual stock variances? Second, toward what target should individual stock

3
Subsequent to Geske and Roll (1984a), several other papers confront the same volatility
problem Karolyi (1993) uses a Bayesian approach. He describes the difference (p. 583) as
follows: “What distinguishes the Bayesian estimator of volatility from the “shrinkage” estimator
is in the adjustment process.” Karolyi considers only call options and he reports that the
Bayesian approach eliminates the volatility bias for high volatility stocks but there remains a
statistically significant but small bias for the low volatility stocks. Karolyi also reports that
the Bayesian estimator creates an under pricing bias in all the call options. Geske and Torous
(1990, 1991) use robust techniques to treat outliers when estimating volatility, and they also
examine the effects of a non-normal skewness and kurtosis on option prices.

variance estimates be shrunk? Until recently, the first question was usually
resolved by constraints on matrix inversion. The sample covariance matrix is
non-singular only when the time series sample size, N , exceeds the number of
stocks, k .4 Because of this requirement, smaller groups of stocks are often formed
to estimate parameters, and then results from the smaller groups are combined
and analyzed.

The second question of shrinkage target is more complex. The target should
have minimal free parameters (a lot of structure), should have less estimation
error, and should somewhat reflect the characteristics of the quantity to be
estimated. In three recent papers Ledoit and Wolf (2003, 2004b, 2004a) have
introduced techniques that provide solutions to these requirements.

Ledoit and Wolf start with the sample covariance matrix because it is unbiased
and easy to calculate. They recognize that it is subject to estimation error,
especially when there are fewer time series observations than individual stocks,
which is often the case in financial applications. They also recognize that
an estimator with more structure would have less estimation error, but would
likely be mis-specified and biased. Thus, they find a compromise by computing
an optimal linear convex combination of the sample covariance matrix and a
structured target. They provide results for three targets, the Sharpe single index
model, the identity matrix, and a constant correlation model. Herein we compare
a version of the James-Stein estimator to the Ledoit-Wolf technique. For Ledoit-
Wolf we shrink toward the simplest target, the identity matrix, which is well
conditioned, structured, and parsimonious.

Section 3.2 describes the data and test calculations. Section 3.3 describes
alternative variance estimators. Section 3.4 reports the results and shows that
the shrinkage techniques of Stein and Ledoit-Wolf both eliminate the variance
bias, but that the Ledoit-Wolf technique is superior with respect to prediction
error. Section 3.5 concludes.

4
In addition, no two stocks can be perfectly correlated in sample. Although perfect
correlation is rarely an issue, very high correlation can cause instabilities in the resulting
shrinkage estimates.

3.2 Data and Test Calculations

The data come from CRSP for daily stock returns and from Option Metrics (OM)
for call and put option prices, dividend distributions, and implied volatilities. The
OM data span the 100 months from January, 1996 through April, 2004 inclusive.

Stocks are screened one way and options are screened five ways. To assure
that stocks are actively traded, we use only the 500 largest stocks by market
capitalization on the last trading day of the previous year. Stocks are limited
to common shares with share codes 10 or 11. For options, the first screen limits
observations to the first trading day of each calendar month. This potentially
provides 100 monthly observations of options on 500 individual stocks and allows
estimators of volatility to be computed with return observations through the end
of each preceding month. The second screen limits options to being near-the-
money, which we define as 0.95 < K /S < 1.05 (with K the strike price and
S the stock price on the first day of the month.) Near-the-money options are
the most actively traded of all options with different times to expiration, and
since these are options on large companies they usually trade many times every
day. Also, near-the-money options should exhibit less moneyness bias. The third
option screen restricts the sample to options expiring on the third Friday of the
next month. Thus, all options have the same short time to expiration, which
should control somewhat for any time bias. Short-maturity options are also the
most actively traded of all options with different strike prices. Thus, near-the-money,
short-maturity options on large stocks should trade many times every
day. The fourth screen restricts options to those that actually did trade on each
day. The fifth option screen eliminates any detectable arbitrage violations (e.g.,
$C > S - Ke^{-rT}$; $P > S - K$). After these screens, the sample has on average
about 300 call options and 200 put options per month.
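
As a rough illustration, three of the option screens (near-the-money, expiry on the third Friday of the next month, and actually traded) might be coded as follows. The record layout `OptionQuote` and both function names are hypothetical, using trading volume as the "did trade" indicator is an assumption, and the first-trading-day and arbitrage screens are left out of this sketch.

```python
from dataclasses import dataclass
from datetime import date, timedelta


def third_friday(year: int, month: int) -> date:
    """Return the third Friday of the given calendar month."""
    first = date(year, month, 1)
    # weekday(): Monday = 0, ..., Friday = 4
    days_to_first_friday = (4 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_friday + 14)


@dataclass
class OptionQuote:  # hypothetical record layout, not the Option Metrics schema
    trade_date: date
    expiry: date
    strike: float
    spot: float
    volume: int


def passes_screens(q: OptionQuote) -> bool:
    """Near-the-money, next-month expiry, and traded-today screens."""
    near_the_money = 0.95 < q.strike / q.spot < 1.05
    # Next calendar month, rolling December over to January.
    next_year = q.trade_date.year + 1 if q.trade_date.month == 12 else q.trade_date.year
    next_month = q.trade_date.month % 12 + 1
    expires_next_month = q.expiry == third_friday(next_year, next_month)
    traded = q.volume > 0  # assumption: positive volume means the option traded
    return near_the_money and expires_next_month and traded
```

A quote with K/S = 1.02 that expires on the third Friday of the following month and shows positive volume passes; the same quote with K/S = 1.12 does not.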

Historical volatilities are computed for each individual stock using 126 days
(approximately 6 months) of previous CRSP daily data preceding each of the 100
first day of month observations for the stock price and option prices. Stock βs are
calculated using 504 days (approximately 2 years) of daily data preceding each of
the 100 first day of month observations, using the CRSP value weighted return as
the market index. (βs are inputs for the particular Stein estimator that assumes
a one-factor structure for the covariance matrix.) We also compute the sample
covariance matrix for all stocks in the sample using the preceding 6 months of
CRSP daily data; this is an input for the Ledoit-Wolf estimator.
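
The two inputs described above can be sketched as follows. Annualizing with 252 trading days per year is an assumption (it is consistent with 126 days being roughly six months), the function names are hypothetical, and the stock and market return series are assumed to be aligned and of equal length.

```python
import math
import statistics


def annualized_volatility(daily_returns, window=126, trading_days=252):
    """Sample standard deviation of the last `window` daily returns, annualized."""
    recent = daily_returns[-window:]
    return statistics.stdev(recent) * math.sqrt(trading_days)


def market_beta(stock_returns, market_returns, window=504):
    """Slope of stock returns on market returns over the last `window` days."""
    s = stock_returns[-window:]
    m = market_returns[-window:]
    mean_s = statistics.fmean(s)
    mean_m = statistics.fmean(m)
    # Sample covariance with the same n - 1 divisor used by statistics.variance.
    cov = sum((a - mean_s) * (b - mean_m) for a, b in zip(s, m)) / (len(s) - 1)
    return cov / statistics.variance(m)
```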

3.3 Alternative Variance Estimators

The variance estimate should be forward-looking. An obvious choice for the
estimate of an expectation is the average from historical data. Stein (1955)
showed that when the number of expectations being estimated exceeds two,
estimating each of them by its own historical average is an inadmissible procedure.
In other words, no matter what the true values, there are estimation methods with
smaller total risk, where risk is defined as the expected value of the squared error
of the estimator. Stein and James provided an example of such an estimator.
The James and Stein (1961) estimator is given by an equation similar to the
following:

$$\hat{\sigma}^2_{sj} = \bar{\sigma}^2 + \gamma_j \left( \hat{\sigma}^2_{Hj} - \bar{\sigma}^2 \right) \qquad (3.1)$$

where $\hat{\sigma}^2_{sj}$ is the Stein estimator for stock $j$, $\hat{\sigma}^2_{Hj}$ is a historical estimate for the
same stock, $\bar{\sigma}^2$ is the grand cross-sectional average of all the historical estimates,
and $\gamma_j$ is a shrinking intensity factor bounded between zero and one.
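
The shrinkage step toward the grand mean can be illustrated with a few lines of code. This is only a sketch of the linear form above: the intensities are passed in as given numbers rather than estimated from the data, and the function name is hypothetical.

```python
def shrink_toward_grand_mean(historical_variances, gammas):
    """Shrink each historical variance toward the cross-sectional grand mean.

    gammas[j] is the weight kept on stock j's own historical estimate,
    so gamma = 1 leaves it unchanged and gamma = 0 replaces it by the mean.
    """
    if any(g < 0.0 or g > 1.0 for g in gammas):
        raise ValueError("each shrinkage intensity must lie between zero and one")
    grand_mean = sum(historical_variances) / len(historical_variances)
    return [
        grand_mean + g * (v - grand_mean)
        for v, g in zip(historical_variances, gammas)
    ]
```

With a common intensity of 0.5, the cross-sectional range of the variances is halved while their average is unchanged.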

Stein estimators are reminiscent of Bayesian methods. In the limit, as the
number of estimates becomes very large, Stein and Bayes estimators converge. In practice,
the James-Stein estimator is often referred to as an “empirical Bayes” rule.5 The
shrinkage intensity factor is potentially a function of many things, including the
sample averages, the number of stocks in the sample, the number of observations
for each stock, the estimated historical volatility of each stock, and the grand
mean of all stock volatilities. For example, the covariance estimator provided by
Efron and Morris (1976) is given by

$$\hat{S}_s = \left[ \frac{N-k-2}{N-1}\,\hat{S}_H^{-1} + \frac{k+1-2/k}{N-1}\,(\bar{\sigma}^2 I)^{-1} \right]^{-1} \qquad (3.2)$$

where Ŝ denotes the covariance matrix estimate, with subscripts H and s denoting historical and
Stein, respectively, N is the time series sample size, k is the number of securities, I
is the (k × k) identity matrix, and σ̄² is the grand sample mean of historical variances.
In this case, shrinkage produces an estimate of the inverse covariance matrix with
shrinkage intensity approximately (N − k − 2)/(N − 1).
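
A minimal numerical sketch of this kind of inverse-covariance shrinkage is given below. It assumes weights of (N − k − 2)/(N − 1) on the inverse sample covariance and (k + 1 − 2/k)/(N − 1) on the inverse of the scaled identity; the second weight is our reading of the Efron-Morris form and should be treated as an assumption, as should the function name.

```python
import numpy as np


def efron_morris_shrink(S_hist, n_obs):
    """Shrink a (k x k) sample covariance by combining inverse matrices."""
    k = S_hist.shape[0]
    if n_obs <= k + 2:
        raise ValueError("need N > k + 2 so the weight on the sample inverse is positive")
    sigma_bar2 = np.trace(S_hist) / k            # grand mean of historical variances
    w_sample = (n_obs - k - 2) / (n_obs - 1)     # approximate shrinkage intensity
    w_target = (k + 1 - 2 / k) / (n_obs - 1)     # assumed weight on the target
    # (sigma_bar2 * I)^{-1} is simply I / sigma_bar2.
    inverse_estimate = w_sample * np.linalg.inv(S_hist) + (w_target / sigma_bar2) * np.eye(k)
    return np.linalg.inv(inverse_estimate)
```

Because the sample covariance must be inverted, the sketch fails when N does not exceed k + 2, which is why the stocks are estimated in smaller groups.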

A major limitation of generalized Stein techniques for financial applications
is that the sample covariance matrix has too little structure. If, for example,
it is beneficial to use the sample covariance matrix of stock returns, but the
number of historical returns per stock, N , is of the same order of magnitude as
the number of stocks, k , then the total number of parameters to be estimated is

5
See Efron and Morris (1975). They discuss Stein’s rule as an empirical Bayes rule,
and present applications such as predicting baseball batting averages, estimating toxoplasmosis
prevalence rates, and estimating the exact size of Pearson’s chi-square test.

of the same order as the total size of the data available. When k is larger than
N , the sample covariance matrix is always singular, even if the true covariance
matrix is known to be non-singular. Muirhead (1987) reviews the literature
on shrinkage estimators of the covariance matrix and shows that they all suffer
from two major limitations: (i) they break down when k > N and the matrix
cannot be inverted; (ii) they do not utilize a priori knowledge about correlations
between stock returns. We can circumvent the second limitation by assuming
that asset returns follow a factor model, say the single-factor market model akin
to the CAPM. Therefore the off-diagonal entry $(i, j)$ of $\hat{S}_s$ is simply $\hat{\beta}_i \hat{\beta}_j \hat{\sigma}^2_m$. By
imposing more structure in this fashion, one can make the sample covariance
matrix behave. Ledoit-Wolf techniques circumvent both of these problems.
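
The one-factor structure can be sketched in a few lines. Keeping each stock's own historical variance on the diagonal is an assumption of this illustration (the text specifies only the off-diagonal entries), and the function name is hypothetical.

```python
def one_factor_covariance(betas, market_variance, own_variances):
    """Covariance matrix with off-diagonal entries beta_i * beta_j * market variance."""
    k = len(betas)
    return [
        [
            own_variances[i] if i == j else betas[i] * betas[j] * market_variance
            for j in range(k)
        ]
        for i in range(k)
    ]
```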

For a given ($k \times N$) matrix $X$ of de-meaned observations, Ledoit-Wolf derive
an “optimal” estimator, $S^*$, that is a linear combination of the sample covariance
matrix, $S = XX'/N$, and a target matrix, whose expected quadratic loss
$E[\|S^* - \Sigma\|^2]$ is a minimum, where $\Sigma$ is the true covariance matrix. When the target is the identity matrix, they show
that $S^* = \gamma\mu I + (1 - \gamma)S$ depends on four scalars that, while unobservable, can be
consistently estimated from their sample counterparts. These four scalars are:6

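$$m \equiv \langle S, I \rangle \xrightarrow{qm} \mu \qquad (3.3)$$

$$d^2 \equiv \|S - mI\|^2 \xrightarrow{qm} \xi^2 \qquad (3.4)$$

$$b^2 = \min(\bar{b}^2, d^2), \quad \text{where } \bar{b}^2 = \frac{1}{N^2} \sum_{i=1}^{N} \|X_{\cdot i} X_{\cdot i}^T - S\|^2, \quad b \xrightarrow{qm} \beta \qquad (3.5)$$

$$a^2 \equiv d^2 - b^2; \quad a \xrightarrow{qm} \alpha \qquad (3.6)$$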

6
See Ledoit and Wolf (2004b). The squared Frobenius norm $\|\cdot\|^2$ is a quadratic form whose
inner product is $\langle X, X \rangle = \mathrm{tr}(XX')/N$, and the four unobservable scalars are $\mu = \langle \Sigma, I \rangle$,
$\alpha^2 = \|\Sigma - \mu I\|^2$, $\beta^2 = E[\|S - \Sigma\|^2]$, and $\xi^2 = E[\|S - \mu I\|^2]$, where $\Sigma$ is the true covariance
matrix and convergence is in quadratic mean, qm.

Their linear combination of $S$ and $I$ that minimizes the expected quadratic loss
is:

$$S^* = \frac{b^2}{d^2}\, m I + \frac{a^2}{d^2}\, S \qquad (3.7)$$

Now if $\gamma$ is defined as $\gamma \equiv b^2/d^2$, then $S^* = \gamma m I + (1 - \gamma) S$.
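
The sample quantities m, d², b² and γ can be computed directly, as in the following sketch. It assumes X holds k stocks in rows and N daily observations in columns, and it scales the squared Frobenius norm by the number of stocks k, following the convention in Ledoit and Wolf (2004b); that scaling, like the function name, is an assumption of this illustration and cancels out of γ.

```python
import numpy as np


def ledoit_wolf_identity(X):
    """Shrink the sample covariance of X (k stocks x N days) toward m * I."""
    k, N = X.shape
    X = X - X.mean(axis=1, keepdims=True)       # de-mean each stock's returns
    S = X @ X.T / N                             # sample covariance, S = XX'/N

    def sq_norm(A):
        # Squared Frobenius norm divided by the dimension k (assumed scaling).
        return np.sum(A * A) / k

    m = np.trace(S) / k                         # average sample variance
    d2 = sq_norm(S - m * np.eye(k))             # dispersion of S around m * I
    b_bar2 = sum(sq_norm(np.outer(x, x) - S) for x in X.T) / N**2
    b2 = min(b_bar2, d2)                        # estimation error, capped at d2
    gamma = b2 / d2 if d2 > 0 else 1.0          # shrinkage intensity in [0, 1]
    S_star = gamma * m * np.eye(k) + (1.0 - gamma) * S
    return S_star, gamma
```

No matrix inversion appears anywhere in the computation, so the sketch also runs when the number of stocks exceeds the number of observations.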

3.4 Experimental Results

We shrink the standard historical volatility estimates in three ways (two Stein
and one Ledoit-Wolf) and then compare the four estimators (including the
historical). We form groups of 50 stocks each for Stein because it requires
the number of stocks in the cross-section, k, to be smaller than the number of time series
observations, N (herein N = 126). The two Stein estimators differ because
the first groups stocks randomly while the second groups stocks to
maximize the volatility dispersion within each group. To achieve the volatility
dispersion, we first sort all 500 stocks by their historical volatilities and
allocate the stocks ranked 1, 11, ..., 481, 491 to the first group, 2, 12, ... to the
second group, and similarly for all 10 groups.
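
The two grouping rules can be sketched as follows. The function names, the fixed random seed, and the dictionary mapping tickers to historical volatilities are hypothetical choices made for the illustration.

```python
import random


def random_groups(tickers, n_groups=10, seed=0):
    """Assign stocks to groups of equal size at random."""
    shuffled = list(tickers)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[g::n_groups] for g in range(n_groups)]


def high_dispersion_groups(vol_by_ticker, n_groups=10):
    """Sort by volatility and deal ranks 1, 11, 21, ... to the first group, and so on."""
    ranked = sorted(vol_by_ticker, key=vol_by_ticker.get)
    return [ranked[g::n_groups] for g in range(n_groups)]
```

With 500 stocks and 10 groups, each group holds 50 stocks, and under the second rule every group spans the full range of volatility ranks.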

At the beginning of each month, the various volatility estimates are matched with
near-the-money implied volatilities for options that expire the next month. For
example, on January 4, 1996, we choose the 500 largest stocks by market
capitalization at the end of 1995 and compute their 6-month historical volatilities
using the previous 126 days of daily data. Three shrunk volatilities are then
computed for each stock, and the implied volatility of the near-the-money (call
and put) option expiring the next month (February 20, 1996) is computed.

Table 3.1 provides summary statistics for all the volatility estimators.
The statistics presented are time-series means of the cross-sectional summary
statistics. On average, all Stein-type estimators have a lower mean than the
original historical estimates as well as the Ledoit-Wolf estimator. The reason is
that all the Stein-type estimators involve matrix inversions, which decrease the average
due to Jensen’s inequality. Moreover, all shrinkage estimators exhibit lower
cross-sectional dispersion, as expected from the shrinkage process. The
LW estimator still preserves more cross-sectional variation than
the Stein estimators, which suggests that the LW shrinkage intensity is
effectively smaller.

Bias can be measured by examining the log ratio of implied to estimated
volatility. Thus, for each stock we first compute four bias ratios, one for the
historical volatility and three for the shrinkage estimators, and then regress
each bias ratio on the volatility estimator that generated the bias, and on the
moneyness of the option as a control. Specifically, we define:

$$\mathrm{Error}_{estimator} = \ln\left( \frac{\sigma_{imp}}{\sigma_{estimator}} \right) \qquad (3.8)$$

where the estimator is Historical, James-Stein with random groups, James-Stein with
large dispersion groups, or Ledoit-Wolf.

Then the following cross-sectional regressions are computed for each month:

$$\log\left( \frac{\sigma_{imp,i,t}}{\hat{\sigma}_{i,t}} \right) = \alpha + \beta \hat{\sigma}_{i,t} + \gamma X_{i,t} + \epsilon_{i,t} \qquad (3.9)$$

where $\hat{\sigma}_{i,t}$ is the volatility estimate and $X_{i,t}$ is the moneyness control.

Following Fama and MacBeth (1973), time series means of the cross-sectional
coefficients are compared against time series standard errors computed using a
Newey-West autocorrelation correction with 8 lags.
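
The test statistic for a time series of monthly cross-sectional coefficients can be sketched as below. The sketch assumes Bartlett kernel weights for the Newey-West correction and takes the monthly coefficient series as given; the function name is hypothetical.

```python
import math


def newey_west_tstat(series, lags=8):
    """Return the time series mean and its Newey-West t-statistic."""
    T = len(series)
    mean = sum(series) / T
    e = [x - mean for x in series]

    def autocov(j):
        # j-th sample autocovariance of the de-meaned series
        return sum(e[t] * e[t - j] for t in range(j, T)) / T

    long_run_var = autocov(0)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1)          # Bartlett kernel weight
        long_run_var += 2.0 * weight * autocov(j)
    std_error = math.sqrt(long_run_var / T)
    return mean, mean / std_error
```

With `lags=0` the statistic reduces to the ordinary t-statistic for a mean with no autocorrelation adjustment.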

Table 3.2 presents the main results from these regressions. The historical
volatility column reports the coefficients and test statistics for equation (3.9)

when volatility is computed with the standard historical method; it shows clearly
the extent of the previously-observed volatility bias. The coefficient is large and
negative, −0.390, and very significant, (t = −8.686.) This is consistent with the
finding in Black and Scholes (1972) that in the cross-section options of low (high)
volatility stocks are under priced (over priced) by their model.

In the next three columns of the panel for calls, the coefficients
(−0.057, 0.119, −0.029) for the three shrinkage estimators and their test statistics
(−0.443, 0.883, −0.869) for James-Stein Random, James-Stein High Dispersion,
and Ledoit-Wolf show that the volatility bias has been eliminated. The control
for moneyness reveals that the moneyness bias is significant and is independent
of the volatility bias. Puts display the same pattern. Thus,
we conclude that both Stein and Ledoit-Wolf shrinkage techniques are able to
eliminate this volatility bias of under pricing options on low volatility stocks and
over pricing options on high volatility stocks.7

In Table 3.3 we present further analysis and comparisons of the historical
and shrinkage estimators. This table shows the average prediction errors of the
uncorrected historical estimator and of the corrected shrinkage estimators. We
wanted to see if the process of shrinking the volatility estimators increased the
prediction errors even though it eliminated the volatility bias. Row 1 for call
options shows that the uncorrected historical volatility estimator has the smallest
prediction error of 0.042. The prediction errors for both Stein 1 (random) and

7
In unreported results, we also examine other variants of the James-Stein estimators
with differing assumptions about the covariance matrix target. One target assumes that all
covariances are the same and equal to the average sample covariance. The other target assumes
that all covariances are zero. These calculations were again carried out with randomly sorted
groups and with groups organized to maximize intra-group volatility dispersion. In all cases,
the results are essentially the same as those reported in all tables (3.1 through 3.6 inclusive), for
the James-Stein estimators. The authors will be happy to provide detailed results to interested
readers.

Stein 2 (disperse) are larger (0.057 and 0.054) and very significantly different from
the historical estimator (t-stats of 3.473 and 3.163). However, the prediction error
for the Ledoit-Wolf estimator, 0.043, is almost the same size as the uncorrected
historical prediction error, 0.042, and the two are not significantly different. For
put options the results are very similar, with the only difference being that the
Ledoit-Wolf estimator now has the lowest prediction error, 0.037, but it is not
statistically different from the uncorrected historical estimator prediction error,
0.039. Thus, we see that while the Stein shrinking does eliminate the volatility
bias, it also increases the prediction error, and this increased error is statistically
significant. The Ledoit-Wolf estimator does not have this problem.

The prediction errors can be elucidated by using Theil’s decomposition, which
separates the error into three components: (i) the error attributable to bias in the
forecasts (UM); (ii) the error attributable to low correlation between the actual
and the forecast (UR); and (iii) the remaining prediction error (UD). This analysis
for call options shows that the portion of the prediction error attributable to bias
in the forecast (UM) is not significantly different from the historical estimator for
any of the shrinkage estimators. The portion of the prediction error attributable
to low correlation between the actual and the forecast (UR) is lowest for the
Ledoit-Wolf estimator, 0.001, and LW is very significantly different from the
uncorrected historical estimator. The remaining portion (UD) of the larger prediction
errors for the Stein estimators is significantly different from the uncorrected
historical estimate, while the Ledoit-Wolf is not different. These results are similar
for put options, where Table 3.3 shows the Ledoit-Wolf estimator exhibits the
lowest prediction error, 0.037.
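
The three components can be computed from paired forecasts and realizations, as in the sketch below, which uses the standard identity MSE = UM + UR + UD. Treating the volatility estimate as the forecast and the implied volatility as the actual value is an assumption of this illustration, and the function name is hypothetical.

```python
import math


def theil_decomposition(forecast, actual):
    """Split the mean squared error into bias, regression and disturbance parts."""
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_a = sum(actual) / n
    var_f = sum((f - mean_f) ** 2 for f in forecast) / n
    var_a = sum((a - mean_a) ** 2 for a in actual) / n
    cov = sum((f - mean_f) * (a - mean_a) for f, a in zip(forecast, actual)) / n
    sd_f = math.sqrt(var_f)
    sd_a = math.sqrt(var_a)
    r = cov / (sd_f * sd_a)                    # correlation of forecast and actual
    mse = sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n
    um = (mean_f - mean_a) ** 2                # bias component
    ur = (sd_f - r * sd_a) ** 2                # regression component
    ud = (1.0 - r * r) * var_a                 # remaining disturbance component
    return {"MSE": mse, "UM": um, "UR": ur, "UD": ud}
```

The components are returned in levels rather than as proportions, so that they add up to the mean squared error.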

It could be illuminating to examine whether stocks with different percentages
of systematic and idiosyncratic components of risk are shrunk differently. Thus,
we define the following relative indicator of systematic risk for each stock $i$ at the
beginning of month $t$, based on approximately two prior years of daily returns:

$$\mathrm{Sys}_{i,t} = 1 - \left( \frac{\sigma_{idiosyncratic,i,t}}{\sigma_{historical,i,t}} \right)^2 \qquad (3.10)$$

and the following indicator of relative shrinkage:

$$\mathrm{Shrinkage}_{i,t} = \frac{\sigma^2_{shrunk,i,t} - \sigma^2_{historical,i,t}}{\sigma^2_{historical,i,t}} \qquad (3.11)$$

for each shrinkage estimator.

Then the following regression is calculated within each monthly cross-section,
and test statistics are taken from the time series of cross-sectional coefficients:

$$\mathrm{Shrinkage}_{i,t} = \alpha + \omega\, \mathrm{Sys}_{i,t} + \varepsilon_{i,t} \qquad (3.12)$$
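
The two indicators and one month's cross-sectional regression can be sketched as follows. All function names are hypothetical, and the regression is an ordinary least squares fit with a single regressor.

```python
def systematic_share(sigma_idiosyncratic, sigma_historical):
    """Fraction of historical variance that is systematic."""
    return 1.0 - (sigma_idiosyncratic / sigma_historical) ** 2


def relative_shrinkage(sigma_shrunk, sigma_historical):
    """Proportional change in variance produced by the shrinkage estimator."""
    return (sigma_shrunk ** 2 - sigma_historical ** 2) / sigma_historical ** 2


def ols_intercept_slope(x, y):
    """Intercept (alpha) and slope (omega) from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

A negative slope means that stocks with a larger systematic share have a more negative value of the shrinkage indicator.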

Table 3.4 presents the results. For the Stein estimators, as the systematic
portion of the risk increases, the negative coefficient indicates that the difference
between the corrected and uncorrected estimators decreases. Thus, it appears
that Stein’s shrinkage percentage decreases (increases) as the portion of
systematic risk increases (decreases), or alternatively, Stein shrinks more if the
proportion of idiosyncratic risk is larger. The Ledoit-Wolf estimator appears to behave
opposite to Stein in this respect. As the proportion of systematic risk increases,
the Ledoit-Wolf shrinkage percentage increases.

The higher prediction errors of the Stein-type estimators might arise because
the Efron-Morris method assumes normality while stock returns distributions
are leptokurtic. In order to investigate whether the leptokurtosis increases the
prediction errors, we first insert each stock’s 6-month kurtosis into equation (3.9).
The regression results are displayed in Table 3.5. Higher kurtosis induces an
upward bias in all the volatility estimates as evidenced by the significant negative
coefficients for all estimators. The same results hold in both call and put options.
However, the coefficients for this kurtosis variable are virtually the same across
all the estimators, implying that the impact of kurtosis is about the same across
all estimators and not likely to be the reason for the higher prediction errors of
the Stein-type estimators.

Table 3.6 examines the prediction errors in low and high kurtosis stocks. Each
month, we sort all the stocks into two halves by their previous 6-month kurtosis
and look at the prediction errors of all the estimators in each half. T -statistics
for the difference between shrinkage estimators and the historical estimator are
computed from the 100-month time series of each monthly difference in errors
with a Newey-West correction for autocorrelation using eight lags.
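
The monthly sort into low and high kurtosis halves can be sketched as follows. The sketch uses the raw fourth-moment ratio (not excess kurtosis) and a median split, both of which are assumptions, and the function names are hypothetical.

```python
import statistics


def sample_kurtosis(returns):
    """Fourth central moment divided by the squared second central moment."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    return m4 / m2 ** 2


def split_by_kurtosis(kurtosis_by_ticker):
    """Return (low, high) lists of tickers split at the cross-sectional median."""
    median = statistics.median(kurtosis_by_ticker.values())
    low = [t for t, k in kurtosis_by_ticker.items() if k <= median]
    high = [t for t, k in kurtosis_by_ticker.items() if k > median]
    return low, high
```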

For call options, the prediction errors using the historical and Ledoit-Wolf
estimators are significantly lower in the low kurtosis group. For Stein-type
estimators, the gaps between low and high kurtosis groups are much smaller.
Therefore, among low kurtosis stocks, the prediction errors for the Stein-type
estimators remain significantly higher than for the historical and Ledoit-Wolf
estimators. In the high kurtosis group, all estimators have similar prediction
errors and the differences are not significant statistically. Hence, the generally
higher prediction errors exhibited by the Stein estimators can be attributed
mostly to low kurtosis stocks.

The results for put options are quite similar except that the Stein-type
estimators produce errors that are also significantly greater than the historical
estimator even for high kurtosis stocks, though the significance level is higher for
low kurtosis stocks as it is for calls.

We also examine whether an optimal shrinkage estimator that minimizes the
sum of squared errors is important for the particular application of volatility
estimation. To do this, we compare LW’s optimal shrinkage to a random weighted average
of the historical product moment sample matrix and the identity
matrix. In a similar comparison for estimation of the covariance matrix, both
Jagannathan and Ma (2003) and Disatnik and Benninga (2007) report that
optimal shrinkage is no better than a random average of the sample matrix
and the identity matrix, and thus optimality is not worth the effort.8 We find
that the LW optimal shrinkage estimator is much better than the random average
of the sample matrix and the identity target.

3.5 Conclusion

A volatility bias in option prices was first uncovered by Black and Scholes
(1972). They demonstrated that their model over-priced options on relatively
high volatility stocks and under-priced options on relatively low volatility stocks. We
thought that this bias might have nothing to do with the Black-Scholes model
but instead could be attributable to sampling error because it is always observed
in cross section with inter stock differences. If this is true, this bias would be
observed with any option pricing model on any underlying, not just equity, but
also fixed income securities, mortgages, foreign exchange, and commodities.

To investigate this issue, we implemented the alternative variance estimators

8
Jagannathan and Ma (2003), p. 1667, and Disatnik and Benninga (2007), p. 60 report a
random average does as well as optimal shrinkage. Disatnik and Benninga state, “Theoretically,
the shrinkage estimator should perform better than any other weighted average of the two
estimators, as the proportions in the weighted average of the shrinkage estimator are obtained
from minimizing the quadratic risk (of error) function of the combined estimator. Yet it seems
that, in practice, estimating these specific proportions gives rise to a new type of error, and
overall the shrinkage estimator does not perform better than the random average.”

of James-Stein and Ledoit-Wolf, which correct historical volatility estimates by
shrinking them toward a central value, thereby reducing their cross-sectional
dispersion. While both shrinkage estimators utilize the covariance matrix, Ledoit-
Wolf is unique because it does not require matrix inversion and thus the number
of stocks can exceed the number of observations.

First, we verify that the same bias Black-Scholes originally observed was
present and very significant in both put and call option prices for the 100 months
during the period January, 1996 through April, 2004. Second, we find that
shrinkage variance estimators can eliminate this volatility bias, independent of
the presence of the moneyness bias. Third, we uncover a difference between the
Ledoit-Wolf and Stein estimators; the former does not increase the prediction
error, but the latter significantly increases prediction error for stocks with low
kurtosis. Fourth, we demonstrate that the Stein estimator’s shrinkage percentage
is greater the lower (higher) the proportion of a stock’s systematic (idiosyncratic)
risk. The Ledoit-Wolf estimator behaves oppositely, and has a greater (lower)
shrinkage percentage the higher (lower) the portion of systematic (idiosyncratic)
risk. Finally, we show that the optimal shrinkage estimator of Ledoit-Wolf is
superior to a random combination of the sample matrix and the target for this
volatility estimation problem.

Table 3.1: Summary Statistics of Annualized Volatility Estimates

This table shows time series averages of cross-sectional summary statistics of the historical
volatility estimates with 6-months of daily returns and corresponding shrinkage estimators.
Stein Random is the volatility shrunk by the Efron-Morris formula in random groups. Stein
High Dispersion is the volatility shrunk by Efron-Morris formula in groups formed to have larger
volatility dispersion. Ledoit-Wolf is the volatility shrunk by the Ledoit-Wolf method. The mean
is the average across all time series and cross-sections. The std is the time series average of
the cross-sectional standard deviation for each sample month. Minimum and maximum are the
time series averages of, respectively, the cross-sectional minimum and maximum in each sample
month.

        Historical   Stein Random   Stein High Disp   Ledoit-Wolf
Mean    0.41         0.368          0.367             0.421
Std     0.186        0.113          0.107             0.159
Min     0.131        0.085          0.092             0.213
Max     1.301        0.714          0.663             1.237

Table 3.2: Volatility Biases

Fama-MacBeth type tests were conducted for the following cross-sectional specification,
$\log(\sigma_{imp,i,t}/\hat{\sigma}_{i,t}) = \alpha + \beta \hat{\sigma}_{i,t} + \gamma X_{i,t} + \epsilon_{i,t}$. The upper panel is for call options, the lower panel
for put options. All t -statistics are computed from the 100-month time series of cross-sectional
coefficients with a Newey/West correction for autocorrelation using eight lags and are reported
below the corresponding coefficient means. The four columns correspond to different volatility
estimators. Historical is the standard estimator. James-Stein Random is an estimator shrunk
by the Efron-Morris formula in random groups. James-Stein High Dispersion is an estimator
shrunk by the Efron-Morris formula in groups with large volatility dispersion. Ledoit-Wolf is
an estimator shrunk by the Ledoit-Wolf method.

Panel A: Call Options

                     Historical   Stein Random   Stein High Disp   Ledoit-Wolf
Const                -0.345       -0.536         -0.566            -0.542
                     (-3.086)     (-4.773)       (-5.118)          (-5.350)
σ̂                    -0.39        -0.057         0.119             -0.029
                     (-8.686)     (-0.443)       (0.883)           (-0.869)
Moneyness            0.464        0.624          0.579             0.485
                     (4.443)      (5.085)        (4.684)           (4.835)

Ave. R²              0.158        0.065          0.08              0.05
Ave. Cross-section   302          302            302               302

Panel B: Put Options

                     Historical   Stein Random   Stein High Disp   Ledoit-Wolf
Const                -0.356       -0.239         -0.383            -0.501
                     (-3.798)     (-3.567)       (-5.592)          (-5.772)
σ̂                    -0.389       -0.057         0.137             -0.052
                     (-9.957)     (-0.404)       (0.930)           (-1.301)
Moneyness            0.497        0.367          0.425             0.477
                     (5.289)      (4.274)        (5.056)           (5.093)

Ave. R²              0.193        0.075          0.091             0.066
Ave. Cross-section   205          205            205               205

Table 3.3: Prediction Errors of Volatility Estimators

Average prediction errors are computed for the historical volatility estimator and all three
shrinkage estimators, measured by the root mean square of $\log(\sigma_{imp,i,t}/\hat{\sigma}_{i,t})$. T-statistics for the
difference between shrinkage estimators and the historical estimator are computed from the
100-month time series of each monthly difference in errors with a Newey-West correction for
autocorrelation using eight lags, and are reported below the corresponding coefficient means.
The four columns correspond to different volatility estimators. Historical is the standard
estimator. James-Stein Random is an estimator shrunk by the Efron-Morris formula in random
groups. James-Stein High Dispersion is an estimator shrunk by the Efron-Morris formula in
groups with large volatility dispersion. Ledoit-Wolf is an estimator shrunk by the Ledoit-Wolf
method. In Theil’s decomposition, UM is the proportion due to bias in the forecasts. UR is
the error due to a low correlation between the actual and the forecast. UD is the remaining
part. T -statistics in the parentheses are computed using Newey-West with 8 lags.

Panel A: Call Options

        Historical   Stein Random   Stein High Disp   Ledoit-Wolf
MSE     0.042        0.057          0.054             0.043
                     (3.473)        (3.163)           (0.238)
UM      0.010        0.012          0.012             0.014
                     (0.608)        (0.376)           (1.003)
UR      0.005        0.003          0.003             0.001
                     (-1.543)       (-2.375)          (-6.994)
UD      0.027        0.041          0.040             0.027
                     (6.760)        (6.241)           (0.481)

Panel B: Put Options

        Historical   Stein Random   Stein High Disp   Ledoit-Wolf
MSE     0.039        0.056          0.054             0.037
                     (3.872)        (3.573)           (-0.637)
UM      0.010        0.015          0.014             0.012
                     (1.648)        (1.367)           (0.497)
UR      0.006        0.004          0.003             0.001
                     (-1.974)       (-2.979)          (-7.789)
UD      0.024        0.037          0.036             0.024
                     (7.989)        (7.664)           (0.569)

Table 3.4: Shrinkage and Systematic Risk

For each of the 100 months, cross-sectional regressions were computed to explain the shrinkage
proportion as a function of the systematic risk estimated over the previous two years
(approximately.) T -statistics, in parentheses, are computed from the time series of cross-
sectional coefficients using a Newey-West correction for autocorrelation with 8 lags. The
columns correspond to three alternative shrinkage estimators. James-Stein Random is an
estimator shrunk by the Efron-Morris formula in random groups. James-Stein High Dispersion
is an estimator shrunk by the Efron-Morris formula in groups with large volatility dispersion.
Ledoit-Wolf is an estimator shrunk by the Ledoit-Wolf method.

            Stein Random   Stein High Disp   Ledoit-Wolf
Sys(i,t)    -0.516         -0.462            0.478
            (-8.417)       (-5.642)          (-7.033)
Const       0.047          0.04              0.021
            (-3.234)       (-2.104)          (-2.774)
Ave. R²     0.178          0.167             0.064

Table 3.5: Control for Kurtosis

Fama-MacBeth type tests were conducted for the following cross-sectional specification,
$\log(\sigma_{imp,i,t}/\hat{\sigma}_{i,t}) = \alpha + \beta \hat{\sigma}_{i,t} + \gamma X_{i,t} + \epsilon_{i,t}$. The upper panel is for call options, the lower panel
for put options. All t -statistics are computed from the 100-month time series of cross-sectional
coefficients with a Newey/West correction for autocorrelation using eight lags and are reported
below the corresponding coefficient means. The four columns correspond to different volatility
estimators. Historical is the standard estimator. James-Stein Random is an estimator shrunk
by the Efron-Morris formula in random groups. James-Stein High Dispersion is an estimator
shrunk by the Efron-Morris formula in groups with large volatility dispersion. Ledoit-Wolf is
an estimator shrunk by the Ledoit-Wolf method.

Panel A: Call Options

                     Historical   Stein Random   Stein High Disp   Ledoit-Wolf
Const                -0.306       -0.507         -0.539            -0.514
                     (-2.564)     (-4.204)       (-4.539)          (-4.746)
σ̂                    -0.313       0.064          0.263             0.06
                     (-8.372)     (0.496)        (1.901)           (2.092)
Moneyness            0.447        0.608          0.56              0.472
                     (4.032)      (4.597)        (4.201)           (4.427)
Kurtosis             -0.006       -0.006         -0.006            -0.005
                     (-22.052)    (-18.672)      (-18.241)         (-18.704)

Ave. R²              0.244        0.131          0.153             0.141
Ave. Cross-section   302          302            302               302

Panel B: Put Options

                     Historical   Stein Random   Stein High Disp   Ledoit-Wolf
Const                -0.334       -0.224         -0.373            -0.491
                     (-3.198)     (-2.913)       (-4.740)          (-5.002)
σ̂                    -0.323       0.054          0.267             0.025
                     (-11.081)    (0.385)        (1.762)           (0.738)
Moneyness            0.501        0.371          0.43              0.484
                     (4.849)      (3.894)        (4.593)           (4.694)
Kurtosis             -0.006       -0.006         -0.007            -0.005
                     (-18.978)    (-14.684)      (-14.113)         (-20.238)

Ave. R²              0.283        0.149          0.172             0.161
Ave. Cross-section   205          205            205               205

Table 3.6: Prediction Errors for Low and High Kurtosis Stocks

Average prediction errors are computed for the historical volatility estimator and all three
shrinkage estimators, measured by the root mean square of $\log(\sigma_{imp,i,t}/\hat{\sigma}_{i,t})$ for stocks grouped
by kurtosis over the previous six months. T -statistics for the difference between shrinkage
estimators and the historical estimator are computed from the 100-month time series of each
monthly difference in errors with a Newey-West correction for autocorrelation using eight lags.
These t -statistics are given in parentheses below each mean prediction error. The four columns
correspond to different volatility estimators. Historical is the standard estimator. James-Stein
Random is an estimator shrunk by the Efron-Morris formula in random groups. James-Stein
High Dispersion is an estimator shrunk by the Efron-Morris formula in groups with large
volatility dispersion. Ledoit-Wolf is an estimator shrunk by the Ledoit-Wolf method. In Theil’s
decomposition, UM is the proportion due to bias in the forecasts. UR is the error due to a low
correlation between the actual and the forecast. UD is the remaining part. T -statistic in the
parentheses are computed using Newey-West with 8 lags.

Panel A: Call Options

             Historical   Stein       Stein       Ledoit
                          Random      High Disp
Low-Kurt     0.032        0.061       0.054       0.031
                          (4.715)     (4.342)     (0.166)
High-Kurt    0.049        0.056       0.055       0.049
                          (1.622)     (1.496)     (0.065)

Panel B: Put Options

             Historical   Stein       Stein       Ledoit
                          Random      High Disp
Low-Kurt     0.029        0.070       0.060       0.026
                          (4.971)     (4.715)     (0.913)
High-Kurt    0.044        0.058       0.056       0.043
                          (2.984)     (2.806)     (0.363)
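
The prediction-error measure used in this table, the root mean square of the log ratio of implied to estimated volatility, can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions: the function names and the sample volatilities are hypothetical, and the error is pooled across the stocks in one group for one month.

```python
import math


def rms_log_error(implied_vols, estimated_vols):
    """Root mean square of log(implied / estimated) across stocks."""
    log_ratios = [math.log(imp / est)
                  for imp, est in zip(implied_vols, estimated_vols)]
    return math.sqrt(sum(e * e for e in log_ratios) / len(log_ratios))


def monthly_error_differences(implied, shrunk, historical):
    """For each month, the shrinkage estimator's error minus the historical
    estimator's error. Each argument is a list of per-month lists."""
    return [rms_log_error(imp, shr) - rms_log_error(imp, hist)
            for imp, shr, hist in zip(implied, shrunk, historical)]


# Hypothetical two-month example with three stocks per month.
implied = [[0.30, 0.45, 0.25], [0.32, 0.40, 0.27]]
historical = [[0.28, 0.50, 0.25], [0.35, 0.38, 0.30]]
shrunk = [[0.31, 0.44, 0.29], [0.34, 0.37, 0.31]]
print(monthly_error_differences(implied, shrunk, historical))
```

The resulting monthly series of differences is the kind of input on which a Newey-West t-statistic with eight lags would then be computed.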

References

Amihud, Yakov, and Haim Mendelson, 1986, Asset pricing and the bid-ask
spread, Journal of Financial Economics 17, 223–249.

Amin, Kaushik I., and Charles M. Lee, 1997, Option trading, price discovery and
earnings news dissemination, Contemporary Accounting Research 14, 153–192.

Andrews, Donald, 1993, Exactly Median-Unbiased Estimation of First Order
Autoregressive/Unit Root Models, Econometrica 61, 139–165.

Ang, Andrew, and Geert Bekaert, 2007, Stock return predictability: is it there?,
Review of Financial Studies forthcoming.

Anthony, Joseph H., 1988, The interrelation of stock and options market trading-
volume data, Journal of Finance 43, 949–964.

Back, Kerry, 1993, Asymmetric information and options, Review of Financial
Studies 6, 435–472.

Baker, Malcolm, and Jeffrey Wurgler, 2000, The equity share in new issues and
aggregate stock returns, Journal of Finance 55, 2219–2257.

Bakshi, Gurdip, Charles Cao, and Zhiwu Chen, 1997, Empirical performance of
alternative option pricing models, Journal of Finance 52, 2003–2049.

Barone-Adesi, Giovanni, and Robert E. Whaley, 1987, Efficient analytic
approximation of American option values, Journal of Finance 42, 301–320.

Biais, Bruno, and Pierre Hillion, 1994, Insider and liquidity trading in stock and
options markets, Review of Financial Studies 7, 743–780.

Black, Fischer, 1975, Fact and fantasy in use of options, Financial Analysts
Journal 31, 36–41, 61–72.

Black, Fischer, and Myron Scholes, 1972, The valuation of option contracts and
a test of market efficiency, Journal of Finance 27, 399–417.

Black, Fischer, and Myron Scholes, 1973, The pricing of options and corporate
liabilities, Journal of Political Economy 81, 637–659.

Boudoukh, Jacob, Roni Michaely, Matthew Richardson, and Michael R. Roberts,
2007, On the importance of measuring payout yield: implications for empirical
asset pricing, Journal of Finance 62, 877–915.

Boudoukh, Jacob, Matthew Richardson, and Robert F. Whitelaw, 2005, The
myth of long-horizon predictability, Working paper.

Boyle, Phelim P., and A. L. Ananthanarayanan, 1977, The impact of variance
estimation in option valuation models, Journal of Financial Economics 5,
375–387.

Brennan, Michael J., Ashley W. Wang, and Yihong Xia, 2004, Estimation and
test of a simple model of intertemporal capital asset pricing, Journal of Finance
59, 1743–1776.

Butler, J. S., and Barry Schachter, 1986, Unbiased estimation of the Black-Scholes
formula, Journal of Financial Economics 15, 341–357.

Campbell, John Y., 1987, Stock returns and the term structure, Journal of
Financial Economics 18, 373–399.

Campbell, John Y., 1991, A variance decomposition for stock returns, Economic
Journal 101, 157–179.

Campbell, John Y., and John H. Cochrane, 1999, By force of habit: a
consumption-based explanation of aggregate stock market behavior, Journal
of Political Economy 107, 205–251.

Campbell, John Y., Sanford J. Grossman, and Jiang Wang, 1993, Trading volume
and serial correlation in stock returns, Quarterly Journal of Economics 108,
905–939.

Campbell, John Y., and Robert J. Shiller, 1988a, Stock prices, earnings, and
expected dividends, Journal of Finance 43, 661–676.

Campbell, John Y., and Robert J. Shiller, 1988b, The dividend-price ratio and
expectations of future dividends and discount factors, Review of Financial
Studies 1, 195–228.

Campbell, John Y., and Tuomo Vuolteenaho, 2004, Bad beta, good beta,
American Economic Review 94, 1249–1275.

Carr, Peter, Robert Jarrow, and Ravi Myneni, 1992, Alternative
characterizations of American put options, Mathematical Finance 2, 87–106.

Chakravarty, Sugato, Huseyin Gulen, and Stewart Mayhew, 2004, Informed
trading in stock and option markets, Journal of Finance 59, 1235–1257.

Chan, Kalok, Y. Peter Chung, and Herb Johnson, 1993, Why option prices lag
stock prices: A trading based explanation, Journal of Finance 48, 1957–1967.

Chan, Yeung L., and Leonid Kogan, 2002, Catching up with the Joneses:
heterogeneous preferences and the dynamics of asset prices, Journal of Political
Economy 110, 1255–1285.

Cohen, Randolph B., 1999, Asset allocation decision of individuals and
institutions, Working paper.

Cox, John C., and Stephen A. Ross, 1976, The valuation of options for alternative
stochastic processes, Journal of Financial Economics 3, 145–166.

Dichev, Ilia, 2007, What are stock investors’ actual historical returns? evidence
from dollar-weighted returns, American Economic Review 97, 386–401.

Disatnik, David J., and Simon Benninga, 2007, Shrinking the covariance matrix,
Journal of Portfolio Management pp. 55–63.

Easley, David, Maureen O’Hara, and P. S. Srinivas, 1998, Option volume and
stock prices: Evidence on where informed traders trade, Journal of Finance
53, 431–465.

Efron, Bradley, and Carl Morris, 1975, Data analysis using Stein’s estimator and
its generalizations, Journal of the American Statistical Association 70, 311–319.

Efron, Bradley, and Carl Morris, 1976, Multivariate empirical Bayes and
estimation of covariance matrices, Annals of Statistics 4, 22–32.

Fama, Eugene, and J. D. MacBeth, 1973, Risk, return, and equilibrium: empirical
tests, Journal of Political Economy 81, 607–636.

Fama, Eugene F., and Kenneth R. French, 1988, Dividend yields and expected
stock returns, Journal of Financial Economics 22, 3–25.

Fama, Eugene F., and Kenneth R. French, 1989, Business conditions and expected
returns on stocks and bonds, Journal of Financial Economics 25, 23–49.

Fama, Eugene F., and Kenneth R. French, 1993, Common risk factors in the
returns on stocks and bonds, Journal of Financial Economics 33, 3–56.

Fama, Eugene F., and William G. Schwert, 1977, Asset returns and inflation,
Journal of Financial Economics 5, 115–146.

Ferson, Wayne, Sergei Sarkissian, and Tim Simin, 2003, Spurious regression in
financial economics?, Journal of Finance 58, 1393–1414.

Fleming, Jeff, and Robert E. Whaley, 1994, The value of wildcard options,
Journal of Finance 49, 215–236.

Garman, Mark B., and Michael J. Klass, 1980, On the estimation of security
price volatilities from historical data, Journal of Business 53, 67–79.

Geske, Robert, 1979, A note on an analytical valuation formula for unprotected
American call options on stocks with known dividends, Journal of Financial
Economics 7, 375–380.

Geske, Robert, and H. E. Johnson, 1984, The American put option valued
analytically, Journal of Finance 39, 1511–1524.

Geske, Robert, and Richard Roll, 1984a, Isolating the observed biases in
American call option pricing: an alternative variance estimator, Anderson
UCLA Working Paper.

Geske, Robert, and Richard Roll, 1984b, On valuing American call options with
the Black-Scholes European formula, Journal of Finance 39, 443–455.

Geske, Robert, Richard Roll, and Kuldeep Shastri, 1983, Over-the-Counter
Option Market Dividend Protection and “Biases” in the Black-Scholes Model:
A Note, Journal of Finance 38, 1271–1277.

Geske, Robert, and Walter Torous, 1990, Black-Scholes option pricing and robust
variance estimation, in Options: Recent Advances in Theory and Practice, ed.
by Stuart Hodges. Manchester University Press.

Geske, Robert, and Walter Torous, 1991, Skewness, kurtosis, and Black-Scholes
option mis-pricing, Statistical Papers 32, 299–309.

Ghysels, Eric, Pedro Santa-Clara, and Rossen Valkanov, 2004, The MIDAS
touch: Mixed data sampling regression models, UCLA Working paper.

Ghysels, Eric, Pedro Santa-Clara, and Rossen Valkanov, 2005a, Predicting
volatility: Getting the most out of return data sampled at different frequencies,
Journal of Econometrics forthcoming.

Ghysels, Eric, Pedro Santa-Clara, and Rossen Valkanov, 2005b, There is a risk-
return tradeoff after all, Journal of Financial Economics 76, 509–548.

Gompers, Paul A., and Andrew Metrick, 2002, Institutional investors and equity
prices, Quarterly Journal of Economics 116, 229–259.

Goyal, Amit, and Ivo Welch, 2006, A comprehensive look at the empirical
performance of equity premium prediction, Review of Financial Studies
forthcoming.

Gukhal, Chandrasekhar Reddy, 2001, Analytical valuation of American options
on jump-diffusion processes, Mathematical Finance 11, 97–115.

Harvey, Campbell, and Robert E. Whaley, 1992a, Dividends and S&P Index
Option Valuation, Journal of Futures Markets 12, 123–137.

Harvey, Campbell, and Robert E. Whaley, 1992b, Market volatility prediction
and the efficiency of the S&P 100 index option market, Journal of Financial
Economics 30, 33–73.

Heston, Steven L., 1993, A closed form solution for options with stochastic
volatility with applications to bond and currency options, Review of Financial
Studies 6, 327–343.

Heston, Steven L., and Saikat Nandi, 2000, A closed-form GARCH option
valuation model, Review of Financial Studies 13, 585–625.

Hodrick, Robert J., 1992, Dividend yields and expected stock returns: Alternative
procedures for inference and measurement, Review of Financial Studies 5,
257–286.

Hong, Harrison, Walter Torous, and Rossen Valkanov, 2004, Do industries lead
stock markets, UCLA Working paper.

Hull, John, and Alan White, 1987, A closed form solution for options with
stochastic volatility, Journal of Finance 42, 281–300.

Jagannathan, Ravi, and Tongshu Ma, 2003, Risk reduction in large portfolios:
Why imposing the wrong constraints helps, Journal of Finance 58, 1651–1683.

James, W., and C. Stein, 1961, Estimation with Quadratic Loss, in Proceedings of
the Fourth Berkeley Symposium on Mathematical Statistics and Probability vol. 1
pp. 361–379 Berkeley. University of California Press.

Karolyi, G. Andrew, 1993, A Bayesian approach to modeling stock return
volatility for option valuation, Journal of Financial and Quantitative Analysis
28, 579–594.

Keim, Donald B., and Robert F. Stambaugh, 1986, Predicting returns in the
stock and bond markets, Journal of Financial Economics 17, 357–390.

Kim, In Joon, 1990, The Analytical valuation of American options, Review of
Financial Studies 3, 547–572.

Kothari, S. P., and Jay Shanken, 1992, Stock return variation and expected
dividends, Journal of Financial Economics 31, 177–210.

Kyle, Albert S., 1984, Continuous auctions and insider trading, Econometrica 53,
1315–1336.

Ledoit, Olivier, and Michael Wolf, 2003, Improved estimation of the covariance
matrix of stock returns with an application to portfolio selection, Journal of
Empirical Finance 10, 603–621.

Ledoit, Olivier, and Michael Wolf, 2004a, Honey, I Shrunk the Covariance Matrix,
Journal of Portfolio Management Summer, 110–119.

Ledoit, Olivier, and Michael Wolf, 2004b, A Well-Conditioned Estimator for
Large Dimensional Covariance Matrices, Journal of Multivariate Analysis 88,
365–411.

Longstaff, Francis A., and Eduardo S. Schwartz, 2001, Valuing American options
by simulation: a simple least-squares approach, Review of Financial Studies
14, 113–147.

Loughran, Tim, and Jay R. Ritter, 1995, The new issues puzzle, Journal of
Finance 50, 23–51.

MacBeth, James D., and Larry J. Merville, 1979, An empirical examination of
the Black-Scholes call option pricing model, Journal of Finance 34, 1173–1186.

Manaster, Stephen, and Richard J. Rendleman, 1982, Option prices as predictors
of equilibrium stock prices, Journal of Finance 37, 1043–1057.

McCracken, Michael W., 2004, Asymptotics for out-of-sample tests of Granger
causality, Working paper.

Merton, Robert C., 1973, Theory of rational option pricing, Bell Journal of
Economics and Management Science 4, 141–183.

Merton, Robert C., 1976, Option pricing when underlying stock returns are
discontinuous, Journal of Financial Economics 3, 125–144.

Miller, Edward M., 1977, Risk, uncertainty, and divergence of opinion, Journal
of Finance 32, 1151–1168.

Muirhead, R. J., 1987, Developments in Eigenvalue Estimation, in Advances in
Multivariate Statistical Analysis pp. 277–288 Dordrecht. D. Reidel Publishing
Company.

Newey, Whitney K., and Kenneth D. West, 1987, A simple, positive
semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix,
Econometrica 55, 703–708.

Pan, Jun, and Allen Poteshman, 2004, The information in option volume for
future stock prices, Working paper.

Park, Cheolbeom, 2005, Stock return predictability and the dispersion in earnings
forecasts, Journal of Business 78, 2351–2375.

Parkinson, Michael, 1980, The extreme value method for estimating the variance
of the rate of return, Journal of Business 53, 61–77.

Pastor, Lubos, and Robert F. Stambaugh, 2003, Liquidity risk and expected stock
returns, Journal of Political Economy 111, 642–685.

Poteshman, Allen M., and Vitaly Serbin, 2003, Clearly irrational financial market
behavior: Evidence from the early exercise of exchange traded stock options,
Journal of Finance 58, 37–70.

Ritter, Jay R., 1991, The long-run performance of initial public offerings, Journal
of Finance 46, 3–27.

Roll, Richard, 1977, An analytic valuation formula for unprotected American call
options on stocks with known dividends, Journal of Financial Economics 5,
251–258.

Rubinstein, Mark, 1985, Nonparametric tests of alternative option pricing models
using all reported trades and quotes on the 30 most active CBOE option classes
from August 2, Journal of Finance 40, 455–480.

Shiller, Robert J., 2000, Irrational Exuberance. (Broadway Books New York).

Sims, Christopher, 2001, Rational inattention, Working paper.

Stambaugh, Robert F., 1999, Predictive regressions, Journal of Financial
Economics 54, 375–421.

Stein, Charles, 1955, Inadmissibility of the Usual Estimator for the Mean of
a Multivariate Normal Distribution, in Proceedings of the Third Berkeley
Symposium on Mathematical Statistics and Probability vol. 1 pp. 197–206
Berkeley. University of California Press.

Stephan, Jens A., and Robert E. Whaley, 1990, Intraday price change and trading
volume relations in the stock and stock option markets, Journal of Finance 45,
191–220.

Sterk, William, 1982, Tests of two models for valuing call options on stocks with
dividends, Journal of Finance 37, 1229–1237.

Torous, Walter, Rossen Valkanov, and Shu Yan, 2005, On predicting stock returns
with nearly integrated explanatory variables, Journal of Business 77,
937–966.

Valkanov, Rossen, 2003, Long-horizon regressions: theoretical results and
applications, Journal of Financial Economics 68, 201–232.

Wang, Ashley W., 2003, Institutional equity flows, liquidity risk and asset pricing,
Working paper.

Whaley, Robert E., 1981, On the valuation of American call options on stocks
with known dividends, Journal of Financial Economics 9, 207–211.

Whaley, Robert E., 1982, Valuation of American call options on dividend-paying
stocks: Empirical tests, Journal of Financial Economics 10, 29–58.
