Los Angeles
by
Yuzhao Zhang
2008
© Copyright by
Yuzhao Zhang
2008
The dissertation of Yuzhao Zhang is approved.
Michael Brennan
Bryan Ellickson
Robert Geske
Mark Grinblatt
Rossen Valkanov
to my parents, my aunt, Ming Wang and my girlfriend, Rui Wu
Table of Contents
1.1 Introduction
1.2 Model
1.3 Data
1.5 Robustness
1.6 Conclusion
1.7 Proofs
Future Underlying Returns
2.1 Introduction
2.2 Data
2.5 Robustness
2.6 Conclusion
References
List of Figures
List of Tables
3.3 Prediction Errors of Volatility Estimators
3.6 Prediction Errors for Low and High Kurtosis Stocks
Acknowledgments
I am deeply indebted to my committee for the guidance and full support they
offered me throughout the Ph.D. program. Richard Roll, my committee chair, was
always encouraging and provided countless constructive comments and insights
on my dissertation. Michael Brennan led our weekly reading group, gave many
helpful suggestions, and patiently listened to my babbling. Bob Geske was
there whenever I needed help, and I benefited greatly from conversations with
him on finance and life in general. Ross Valkanov first showed me what
empirical finance is and how to write an academic paper. Mark Grinblatt helped
me differentiate important issues from the rest. Bryan Ellickson told the asset
pricing story from a whole new perspective. Many other faculty members helped
me during my study and made helpful comments, including Jun Liu, Francis
Longstaff and Avanidhar Subrahmanyam.
I would also like to thank my fellow students. Feifei Li, Jason Hsu, Max
Moroz and Ashley Wang familiarized me with Westwood and the program when
I first arrived. The afternoon office hoops (a big paper recycle bin) with Juhani
Linnainmaa, Alessio Saretto and Duke Whang became one of my favorite sports.
I also enjoyed and learned a lot through discussions with Brett Myers, Udi Peleg,
Al Sheen, and Cesare Fracassi.
My grandma, Yayu Guan, taught me to read and count and had always wanted
to see me in the gown. My grandpa (my grandma’s brother), Xunfu Guan, took
me everywhere on his vintage black bicycle when I was a kid. I hope they
are both watching me from up there. My girlfriend Rui Wu brought me a lot of
happiness and laughter (sometimes by impersonating me), and I enjoyed every
minute of her presence. My parents and my mom-like aunt, Ming Wang, have
supported and inspired me over these years. Words cannot express my feelings
for their unconditional love.
Vita
2008 Ph.D., Management
UCLA Anderson School
University of California, Los Angeles, CA
Presentations
Zhang, Y. (2006). Does the Early Exercise Premium Contain Information about
Future Underlying Returns? Paper presented at European Finance Association
2006 Meetings.
Abstract of the Dissertation
Yuzhao Zhang
Doctor of Philosophy in Management
University of California, Los Angeles, 2008
Professor Richard Roll, Chair
The three chapters of this dissertation examine return predictability and volatility
estimation. The first chapter shows that flows from contrarian investors reveal
information about the representative agent’s risk aversion in a model in which
the representative agent displays time varying risk aversion and investors have
heterogeneous preferences. The key result, consistent with theory, is that the
flows of contrarian investors predict market returns. The second chapter studies
the information content of the call (put) Early Exercise Premium, or EEP. The
call EEP specifically captures investors’ expectations about future lump sum
dividend payments as well as other state variables such as conditional volatility
and interest rates. From that perspective, the EEP should also be related
to future returns of the underlying security. Interestingly, we find that the
EEP is a good forecaster of returns at daily horizons. Importantly, we find
that the predictability stems primarily from the ability of the EEP to forecast
innovations in dividend growth, rather than other components of unexpected
returns. The third chapter examines a volatility estimation bias that may be
commonly exhibited by all option pricing models. Black and Scholes (1972) were
the first to illustrate the bias by showing that their model underpriced options on
relatively low variance stocks and overpriced options on relatively high variance
stocks. The bias is always observed in the cross section among individual stocks. We
show that alternative variance estimators that use “shrinkage” techniques can
eliminate the bias.
CHAPTER 1
We investigate the relation between aggregate stock returns and the capital
flows to the stock market from contrarian investors, defined as those whose
capital flows are in the opposite direction from those of the representative agent.
It is shown that flows from contrarian investors reveal information about the
representative agent’s risk aversion in a model in which the representative agent
displays time varying risk aversion and investors have heterogeneous preferences.
The key implication is that the flows of contrarian investors predict market
returns. We construct a contrarian flow measure using the capital flows from all
major market participants and find, consistent with our theoretical prediction,
that the contrarian flow is a good forecaster of returns at quarterly horizons.
The predictability is robust to other known return predictors. Moreover, the
predictability is stronger for growth stocks than for value stocks which, together
with the notion that growth stocks bear more discount rate risk, supports our
hypothesis that the contrarian flow measures the market risk premium.
1.1 Introduction
the studies focus on the link between aggregate investor flows or issues and the
market returns. If investors are heterogeneous, some valuable information will be
lost in the aggregation process.
In this paper we develop and test the implications for the market risk premium
of heterogeneity in investor risk preferences. The model we develop abstracts
from both the mechanism which causes the risk aversion of the representative
agent to vary over time and the determination of the supply of risky assets. We
concentrate on the implication of heterogeneous risk aversions when the supply of
the risky asset is assumed to be negatively correlated with the representative agent’s
risk aversion. The model is set up with heterogeneous preferences and homogeneous
information. Investors have exponential utility and differ in their risk aversions
in a way similar to that of Campbell, Grossman, and Wang (1993). Under the
assumption that the supply of the risky asset is inversely related to the risk premium,
some investors are contrarian: their flows run in the opposite direction to the
flows of the representative agent. The major testable hypothesis of the model
is that the contrarian investors’ flows are positively related to the representative
agent’s risk aversion and therefore predict market returns.
The assumption that the supply of the risky asset is inversely related to the risk
premium is a key component of the model.
The intuition for the results is as follows. Consider an economy in which the
representative agent has time varying risk aversion and the change in the supply
of the risky asset is negatively correlated with the revision of the representative
agent’s risk aversion. Time varying risk aversion induces a time varying risk
premium. The representative agent is composed of two types of investors. The
risk preference of the first type of investors does not vary over time, while
the other type of investors exhibits time-varying preferences, which dictate the
representative agent’s risk aversion. As the latter group of investors becomes more
risk averse, it reduces its equity exposure and the former group of investors
increases its equity holdings. Under this scenario, the former group of investors
shows positive capital flow to the market while the latter group shows negative
flow. At the same time, the supply of risky assets decreases because the change
in total share supply is assumed to be negatively related to the change in the
representative agent’s risk aversion. The representative agent shows a negative
capital flow because of the reduction in share supply. By our definition, the former
group of investors is categorized as contrarian. The contrarian investors increase
their holdings of risky assets and are compensated with a higher expected return
because the representative agent also turns more risk averse together with the
latter group of investors. The same logic applies when the representative agent
along with the latter group of investors becomes more risk tolerant.
Guided by the intuition derived from the model, we examine the capital
flows of all major market participants. Specifically, each quarter an aggregate
participant is classified as contrarian if its flow is in the opposite direction to
that of the representative agent. For example, in a quarter in which the
market receives a positive capital flow from the representative agent, participants
that reduce their equity holdings are labeled contrarian investors. We sum
the normalized flows of these contrarian investors to construct the contrarian
flow (CTF) measure. Cohen (1999) aggregates all major investor groups
into households and institutions and studies the asset allocation decisions
of individual and institutional investors by analyzing their portfolio flows.
Households are found to reduce their equity exposures during business cycle
troughs. This result suggests that the risk aversion of households is countercyclical.
Wang (2003) proposes the equity flows from all institutional investors as a proxy
for market liquidity and finds that the flow-based liquidity measure commands a
premium in the cross section. Rather than focusing on the representative agent as
in Dichev (2007) or dividing all participants into households and institutions
as in Cohen (1999) and Wang (2003), we adopt a different approach: following
the classification principle derived from the theory, we categorize major market
participants into two groups every quarter by their flows to the stock market.
(1995) and many others). Nevertheless, the two measures differ substantially in
principle because the contrarian flow can fluctuate when net issues are fixed and
vice versa. Hence, neither is a perfect substitute for the other. This point is
further supported by the data. Ultimately, it is an empirical question whether
the information in the contrarian flow is fully contained in the net issues.
Second, we find that the CTF does forecast the index returns at one-quarter
to intermediate horizons. The predictive relation is robust to the addition of other
control variables such as dividend yield, short interest rate, net issues, term spread
and default spread. The forecastability disappears at longer horizons. The ability
of the CTF to forecast returns is markedly different from that of the dividend
yield and interest rates in two important aspects. From a statistical perspective,
the CTF contains information about the market risk premium but does not
suffer from the well-known statistical problems (such as extreme persistence
and low volatility) which have rendered forecasting with these predictors quite
problematic (Stambaugh (1999), Ferson, Sarkissian, and Simin (2003), Torous,
Valkanov, and Yan (2005), Boudoukh, Richardson, and Whitelaw (2005)). The
autocorrelation of the CTF, while significantly different from zero, is not near
the boundary of non-stationarity and will not significantly bias the estimates in
forecasting regressions. Also, the volatility of the CTF is actually larger than
that of the CRSP index returns. This is in contrast to the volatilities of other
predictors, which are at least an order of magnitude lower than that of the returns.
These appealing statistical properties of the CTF make it a suitable forecaster
of returns.
high CTF signals an upward revision of the representative agent’s risk aversion
and prices adjust downward. Consistent with this conjecture, we show that the
CTF is negatively correlated with concurrent market returns controlling for other
variables.
1.2 Model
The economy is further specified as follows. There are two groups of investors,
A and B, with weights of $w$ and $1 - w$ respectively. Both groups have constant
absolute risk aversion. The group A investors’ coefficient of absolute risk aversion
equals $a$ and the group B’s risk aversion $b_t$ varies over time.
1 The name reflects the fact that the change in the representative agent’s preference is driven by
this group of investors.
The return on the risk-free asset is constant, $R = 1 + r$. The stock pays a
dividend $D_t$ per share at time $t$, with $D_t = \bar{D} + \hat{D}_t$, where $\hat{D}_t$ follows the process
$$\hat{D}_t = \alpha_D \hat{D}_{t-1} + u_{D,t}, \qquad u_{D,t} \sim N(0, \sigma_D^2), \tag{1.1}$$
where $0 \le \alpha_D < 1$ and $u_{D,t}$ is i.i.d. normal. All investors receive a public signal
$S_t$ about next period’s dividend,
We assume that investors are myopic and each trading period t they maximize
next period’s expected utility,
subject to
$$W_{t+1} = R W_t + H_t (P_{t+1} + D_{t+1} - R P_t), \tag{1.4}$$
that $\gamma_t = \bar{\gamma} + \hat{\gamma}_t$ and $\hat{\gamma}_t$ follows the process,
The net supply of the risky asset at time $t$ is $1 + \hat{X}_t$, where $\hat{X}_t$ is assumed to follow
the process
$$\hat{X}_t = \alpha_X \hat{X}_{t-1} + u_{X,t}, \qquad u_{X,t} \sim N(0, \sigma_X^2). \tag{1.7}$$
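The dividend and supply shocks in (1.1) and (1.7) are both first-order autoregressive processes. The following is a minimal sketch of such a process, not part of the original model code; the parameter values, the seed, and the function names `simulate_ar1` and `stationary_variance` are illustrative assumptions. It compares the sample variance of a simulated path with the unconditional variance $\sigma^2 / (1 - \alpha^2)$ that holds for a stationary AR(1).

```python
import random


def simulate_ar1(alpha, sigma, n, seed=0):
    """Simulate x_t = alpha * x_{t-1} + u_t with u_t ~ N(0, sigma^2), x_0 = 0."""
    rng = random.Random(seed)
    x = 0.0
    path = []
    for _ in range(n):
        x = alpha * x + rng.gauss(0.0, sigma)
        path.append(x)
    return path


def stationary_variance(alpha, sigma):
    """Unconditional variance of a stationary AR(1): sigma^2 / (1 - alpha^2)."""
    return sigma ** 2 / (1.0 - alpha ** 2)


# Hypothetical parameter values, chosen only for illustration.
path = simulate_ar1(alpha=0.5, sigma=0.1, n=20000)
sample_var = sum(v * v for v in path) / len(path)  # the process has mean zero
print(sample_var, stationary_variance(0.5, 0.1))
```

With $0 \le \alpha < 1$ the process is mean reverting, so a shock to supply or dividends decays geometrically rather than persisting forever.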
The price of the stock is the discounted sum of expected future dividends minus
a risk premium that depends on the risk aversions of all investors and the total
supply of the risky asset. We denote the discounted sum of all future
(including today’s) dividends by $F_t$, where
$$F_t = \frac{R\bar{D}}{r} + \frac{R}{R-\alpha_D}\hat{D}_t + \frac{1}{R-\alpha_D} S_t,$$
and the variance of $F_t$ by
$$\sigma_F^2 = \frac{R^2}{(R-\alpha_D)^2}\sigma_\varepsilon^2 + \frac{1}{(R-\alpha_D)^2}\sigma_S^2.$$
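Proposition 1. There exists an equilibrium price of the stock with the following
form,
$$P_t = F_t - D_t + p_0 + p_1 \gamma_t + p_2 X_t, \tag{1.8}$$
where
$$p_0 = -\frac{(\sigma_F^2 + \sigma_Q^2)\bar{\gamma}}{r} - p_1\bar{\gamma}, \qquad p_2 = \frac{R-\alpha_\gamma}{R-\alpha_X}\, p_1 \bar{\gamma}, \qquad p_1 = \frac{\alpha_\gamma - R + \sqrt{(\alpha_\gamma - R)^2 - 4\sigma_p^2\sigma_F^2}}{2\sigma_p^2} < 0,$$
and $\sigma_p^2$ is a positive constant with its precise expression in the appendix. The
excess return of the risky asset is,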
excess return of the risky asset is,
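$$Q_{t+1} = \varepsilon_{F,t+1} - r(p_0 + p_1\bar{\gamma}) + p_1(\alpha_\gamma - R)\hat{\gamma}_t + p_1 u_{\gamma,t+1} + p_2(\alpha_X - R)\hat{X}_t + p_2 u_{X,t+1}, \tag{1.9}$$
where $\varepsilon_{F,t+1} = \frac{R}{R-\alpha_D}\varepsilon_{t+1} + \frac{1}{R-\alpha_D}S_{t+1}$.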
Given the return process above, the conditional expected excess return of the
stock is as follows,
1.2.3 Contrarian Flows
The holdings of the risky asset by the two groups of investors at t are,
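$$H_t^a = \frac{w}{a}\gamma_t (1 + X_t) \tag{1.11}$$
$$H_t^b = \frac{1-w}{b_t}\gamma_t (1 + X_t) \tag{1.12}$$
$$\Delta H_t^a = \frac{w}{a}\left[\gamma_t(1+X_t) - \gamma_{t-1}(1+X_{t-1})\right] \tag{1.13}$$
$$\Delta H_t^b = \frac{1-w}{b_t}\gamma_t(1+X_t) - \frac{1-w}{b_{t-1}}\gamma_{t-1}(1+X_{t-1}) \tag{1.14}$$
$$b_t = \frac{a\gamma_t}{a - w\gamma_t}(1-w), \tag{1.15}$$
$$\Delta H_t^b = \frac{a - w\gamma_t}{a}(1+X_t) - \frac{a - w\gamma_{t-1}}{a}(1+X_{t-1}). \tag{1.16}$$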
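As a numerical check on the holdings equations, the sketch below evaluates (1.11), (1.12) and (1.15) at two dates and confirms that the two groups' holdings add up to the net supply $1 + X_t$. All parameter values and the function names `group_b_risk_aversion` and `holdings` are illustrative assumptions, not values from the dissertation. The example raises the representative agent's risk aversion while supply contracts, which is the scenario described in the introduction.

```python
def group_b_risk_aversion(a, w, gamma):
    """Equation (1.15): group B's risk aversion implied by gamma_t."""
    return (1.0 - w) * a * gamma / (a - w * gamma)


def holdings(a, w, gamma, x):
    """Equations (1.11) and (1.12): holdings of groups A and B."""
    b = group_b_risk_aversion(a, w, gamma)
    h_a = w / a * gamma * (1.0 + x)
    h_b = (1.0 - w) / b * gamma * (1.0 + x)
    return h_a, h_b


# Hypothetical parameters: a = 4, w = 0.5 (requires a - w * gamma > 0).
a, w = 4.0, 0.5
gamma_prev, x_prev = 2.0, 0.00   # date t-1
gamma_now, x_now = 2.4, -0.02    # date t: higher risk aversion, lower supply

ha_prev, hb_prev = holdings(a, w, gamma_prev, x_prev)
ha_now, hb_now = holdings(a, w, gamma_now, x_now)

flow_a = ha_now - ha_prev          # Delta H^a, equation (1.13)
flow_b = hb_now - hb_prev          # Delta H^b, equation (1.14)
flow_rep = x_now - x_prev          # representative agent's flow

print(ha_now + hb_now, 1.0 + x_now)  # market clearing
print(flow_a, flow_b, flow_rep)
```

In this example group A buys while group B and the representative agent sell, so group A is the contrarian group under the stated assumptions.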
negatively predicts next period’s excess return, $\mathrm{Cov}(Q_{t+1}, \Delta X_t) < 0$.
Proposition 3. The net purchase of group A investors predicts next period’s risk
premium, $\mathrm{Cov}(Q_{t+1}, \Delta H_t^a) > 0$.
are mainly driven by the change in the representative agent’s risk aversion.
Note that this relation does not require any restrictions on the parameters.
Intuitively, when group B investors become more risk averse and ask for a higher
compensation for risk, they unload part of the shares they hold. Part of the
shares sold by group B investors are absorbed by group A investors who are also
compensated by a higher risk premium. As a result, a high capital flow from
group A investors reveals a large upward revision in the representative agent’s risk
aversion and therefore a high risk premium. As seen in Proposition 2, group A
investors are the contrarian investors under certain conditions. Therefore, flows
from contrarian investors are revealing about the market risk premium. In the
empirical part, we will directly test this conclusion and other implications derived
from this conclusion.
We have established the link between group A investors’ flows and the
representative agent’s changing risk preference. An immediate implication is that
group A investors’ flows are negatively related to the contemporaneous returns of the
risky asset. The intuition is formalized in the following proposition.
1.3 Data
We have three types of time series: portfolio flows into or out of the stock market
by different types of market participants; returns of the aggregate stock market
and other stock-market-based variables, such as the dividend price ratio; and
bond market variables, such as the risk-free interest rate, the default spread and
the term spread. We describe each type of time series separately for clarity.
Under the representative agent framework, the net flow from all investors is
the flow of the representative agent. In practice, different (groups of) investors
have diverse endowments in both wealth and information and face various
institutional restrictions, transaction costs or tax treatments. Consequently,
their investment decisions are heterogeneous. Every quarter, we
distinguish between investors by whether their flows are in the same direction
as that of the representative investor. More specifically, we have the following
definitions.
representative agent. Conversely, an investor is classified as a prevailing investor
in this period if this type of investor’s flow is in the same direction as the
representative agent’s flow.
$$\mathrm{CTF}_t = \sum_i \frac{\mathrm{Flow}_{t-1\to t,\,i}}{\mathrm{Hold}_{t,i}}, \quad \text{for all contrarian investors } i. \tag{1.17}$$
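The classification rule and equation (1.17) can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure used to build the data: the investor group names and the dollar amounts are hypothetical, and `contrarian_flow` is a made-up helper name. The representative agent's flow is taken to be the net flow of all groups, as described above.

```python
def contrarian_flow(flows, holdings):
    """Return (CTF, contrarian group names) for one quarter.

    flows[name]    : net equity purchases of a group over the quarter
    holdings[name] : the group's equity holdings used for normalization
    """
    # The representative agent's flow is the net flow of all investors.
    representative_flow = sum(flows.values())
    contrarians = [
        name for name, flow in flows.items()
        if flow * representative_flow < 0  # opposite sign to the aggregate
    ]
    # Equation (1.17): sum of normalized flows over contrarian groups only.
    ctf = sum(flows[name] / holdings[name] for name in contrarians)
    return ctf, contrarians


# Hypothetical numbers for one quarter (not taken from the data).
flows = {"households": -30.0, "mutual_funds": 50.0,
         "pension_funds": -5.0, "foreign": 10.0}
holdings = {"households": 3000.0, "mutual_funds": 2000.0,
            "pension_funds": 1000.0, "foreign": 500.0}

ctf, contrarians = contrarian_flow(flows, holdings)
print(contrarians, ctf)
```

Because the aggregate flow is positive in this example, the groups that sell are the contrarians and the resulting CTF is negative.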
Return Series and Other Stock Market Variables: We have quarterly returns
of the CRSP value weighted index from January 1960 to December 2005. The
index return is in excess of the one-month Treasury Bill rate, denoted by $R^e_t$. The
quarterly returns of the Fama-French 25 size and book-to-market portfolios as
well as those of the benchmark factor HML are aggregated from monthly returns
obtained from Ken French’s website. We also construct the log dividend price
ratio of the CRSP value weighted index ($DP_t$) using the moving sum of dividends
over the past 4 quarters. We measure the net issues by NYSE listed companies
normalized by the total market capitalization of all NYSE stocks ($NTIS_t$), similar
to Dichev (2007). Net issues, which take into account IPOs, SEOs and
share repurchases, are computed as follows,
where $MV_t$ is the total market cap of all NYSE listed stocks and $vwretx_t$ is the
value weighted return of all NYSE stocks excluding distributions. It measures
the growth in total market value not attributable to capital gains.
Bond Market Variables: We also have the 3-month Treasury Bill rate ($Tbl_t$),
the term spread ($TMS_t$), calculated as the difference between the yield on
long term government bonds and the T-bill as in Campbell (1987), and the
default spread ($DEF_t$) computed as the difference between BAA and AAA-rated
corporate bond yields, as in Fama and French (1989).
In this section we describe the data and test the major implications of the model.
because by definition contrarian flows are of the opposite sign to the flows from
the representative agent and the representative agent has been buying the risky asset
from corporations over the last several decades. The CTF is also much more
volatile than the dividend price ratio and therefore the CTF could potentially
explain more variation in returns than the dividend price ratio as the latter is
often regarded as too smooth.2 It is also worth noting that the autocorrelation
of CTF is 0.21, far from unit-root territory and therefore it suffers little from
the spurious regression problem created by very persistent or even near unit
root regressors, as pointed out by Stambaugh (1999). The autocorrelation of the
CTF is significant but rather modest compared with other commonly used return
predictors, all of which show very high autocorrelations. The persistence of those
variables is also well documented in previous studies.
2 The standard deviation of the dividend price ratio is less than 1 percent, while the standard
deviation of the log dividend price ratio shown in Table 1.1 is much larger.
$$R^e_{t+1} = \alpha + \beta\,\mathrm{CTF}_t + \varepsilon_{t+1} \tag{1.19}$$
where $R^e_{t+1}$ is the excess return of the CRSP value weighted index over
the Treasury Bill rate from $t$ to $t+1$, and $\mathrm{CTF}_t$ is the quarterly measure of normalized
flows from the contrarian investors. This is a direct test of Proposition 3, which
links the contrarian investors’ flows to the market risk premium. The predictive
regression is also performed with a list of additional forecasting variables observable
at $t$, such as the log dividend price ratio ($DP_t$), the short term (3-month) T-bill
rate ($Tbl_t$), the net issues by NYSE listed companies ($NTIS_t$), the term spread
($TMS_t$) and the default spread ($DEF_t$). The dividend yield along with other
variables might not act as perfect predictors, especially in the most recent period
(Boudoukh, Michaely, Richardson, and Roberts (2007), Goyal and Welch (2006)),
but they help us understand whether the CTF contains additional information
and provide a yardstick to measure the information content of the CTF.
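The timing in regression (1.19) is the part that is easiest to get wrong: the predictor observed at $t$ is paired with the return realized over $t$ to $t+1$. The sketch below shows that alignment with a plain univariate OLS on simulated data. It is only an illustration; the data are synthetic, the coefficient values 0.01 and 0.05 are arbitrary assumptions, and the helper names are hypothetical. It does not reproduce the estimates in Table 1.2 or the standard errors used there.

```python
import random


def ols_intercept_slope(x, y):
    """Univariate OLS of y on a constant and x; returns (alpha, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def predictive_regression(excess_returns, predictor):
    """Regress R^e_{t+1} on predictor_t (both lists share the same dates)."""
    y = excess_returns[1:]   # returns realized at t+1
    x = predictor[:-1]       # predictor observed at t
    return ols_intercept_slope(x, y)


# Synthetic quarterly data: 184 quarters, true alpha = 0.01, true beta = 0.05.
rng = random.Random(0)
ctf = [rng.gauss(0.0, 1.0) for _ in range(184)]
ret = [0.0] + [0.01 + 0.05 * ctf[t] + rng.gauss(0.0, 0.02) for t in range(183)]

alpha_hat, beta_hat = predictive_regression(ret, ctf)
print(alpha_hat, beta_hat)
```

Adding the control variables would turn this into a multiple regression, which needs a matrix solver rather than the closed-form slope used here.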
empirical studies, for example Campbell and Shiller (1988a, 1988b) and Fama and
French (1988). The coefficient on the dividend yield is positive but not significant
and the variable explains little variation in quarterly excess returns because the
dividend yield is very smooth and has become insignificant in the most recent
period, as discussed in Valkanov (2003) and Goyal and Welch (2006). In column 3,
we regress returns on both the contrarian flow and the dividend price ratio side
by side. The point estimates of the two coefficients remain almost unchanged
compared with those in the simple regressions. The model fit appears to improve
on that obtained by the dividend price ratio alone as the adjusted $R^2$ increases
to 1.8%.
for predictability tests and find that the change in equity supply negatively
forecasts returns. Controlling for the net issues is crucial for two reasons.
Theoretically, under the economic framework presented in the last section the two
measures both contain information about the representative agent’s risk aversion.
Empirically, the contrarian flows are by definition in the same direction as the
corporate issues and therefore the two are significantly correlated as shown in
Table 1.1. The coefficient of the net issues is negative and marginally significant,
consistent with the previous studies. More importantly, the addition of the two
variables does not diminish the forecasting power of the contrarian flow, whose point
estimates and statistical significance remain almost unchanged.
Finally, in columns 7 and 8, we control for the term spread and the default
spread, which have been studied by Keim and Stambaugh (1986) and Fama and
French (1989) as proxies for business conditions. The coefficient for the term
spread is insignificant and that for the default spread is positive and marginally
significant. The results for both of them are consistent with previous findings.
Furthermore, it must be noted that the inclusion of the business cycle variables
does not alter the point estimate or the significance of the contrarian flow.
Therefore, the contrarian flow captures information about future risk premia
that is orthogonal to that in other predictors.
($0.083 \times \frac{0.275}{\sqrt{4}}$) in next period’s returns.3 Hence, on a quarterly basis, the economic
significance of the contrarian flow is 3 times as high as that of the dividend yield.
The coefficients on the T-bill rate, the net issues and the default spread are also
significant or marginally significant. The impacts of a one-standard-deviation
shock for these three variables are 113 bps, 48 bps and 79 bps respectively. Only
the short term T-bill rate shows an economic significance comparable to that of the
contrarian flow.
To summarize the findings in Table 1.2, all variables enter with the
economically expected sign in predicting the CRSP value weighted index return
and replicate existing studies. However, the most significant predictors are the
contrarian flow and the short term T-bill rate. Whether the forecasting ability of
the contrarian flow is consistent with the economic intuition explained in Section
2 is something we investigate extensively below.
We have shown that the contrarian flow predicts the market excess returns of
the following quarter. Here we investigate longer horizon predictability for two
reasons. First, under the economic framework we presented in Section xx, the
change in the representative agent’s preference could carry its impact for an
extended period of time. Second, aggregating returns helps to eliminate the
noise in quarterly returns. In this section, we address these issues by examining
whether the contrarian flow predicts the excess market returns at horizons of two
quarters, three quarters, and up to three years.
3 The standard deviations of the variables are in Table 1.1.
In Panel B of Table 1.2 we display the long horizon regression results. The
dependent variables are $k$-period excess returns of the CRSP value weighted
index, $R^e_{t,t+k}$. The first row shows that the contrarian flow forecasts returns
significantly at all horizons up to 12 quarters. The predictive power of the
contrarian flow becomes stronger as we move from short to intermediate horizons.
However, the coefficients do not increase mechanically with the horizon; they peak at
6 quarters and decrease gradually at horizons longer than 6 quarters.
This hump-shaped pattern reassures us that the long horizon predictability is not an
artifact created by persistent regressors, as suggested by Boudoukh, Richardson,
and Whitelaw (2005). The rest of the rows present the forecasting power of other
variables at long horizons. The second row and the third row display that the
T-bill rate and the dividend yield are significant only at short horizons, up to
3 quarters. The findings are consistent with those in Ang and Bekaert (2007)
who conclude that the predictability of the short rate is only significant at short
horizons.
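The long horizon regressions use overlapping $k$-period returns as the dependent variable. The sketch below shows one way to build them and to collect the slope at each horizon. It assumes that the $k$-period return is the sum of the next $k$ quarterly excess returns, which the text does not specify, and the function names and toy numbers are hypothetical. Inference with overlapping observations requires standard errors that account for the induced serial correlation; that step is omitted here.

```python
def k_period_returns(returns, k):
    """R_{t,t+k} = r_{t+1} + ... + r_{t+k} for every t with a full window."""
    return [sum(returns[t + 1:t + 1 + k]) for t in range(len(returns) - k)]


def ols_slope(x, y):
    """Slope from a univariate OLS of y on a constant and x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def long_horizon_slopes(returns, predictor, horizons):
    """Slope of R_{t,t+k} on predictor_t for each horizon k."""
    slopes = {}
    for k in horizons:
        y = k_period_returns(returns, k)
        x = predictor[:len(y)]  # predictor_t aligned with R_{t,t+k}
        slopes[k] = ols_slope(x, y)
    return slopes


# Toy data in which next quarter's return is exactly twice the predictor.
predictor = [1.0, 2.0, 4.0, 3.0, 5.0]
returns = [0.0] + [2.0 * p for p in predictor[:-1]]

print(k_period_returns([1, 2, 3, 4], 2))
print(long_horizon_slopes(returns, predictor, horizons=[1, 2]))
```

Plotting the slopes against $k$ is how a hump-shaped pattern like the one described above would show up.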
The longer horizon results suggest that the forecastability of the contrarian
flow is mostly observable at horizons up to 12 quarters. The magnitude of the
contrarian flow coefficients increases first and decreases gradually and becomes
insignificant after 12 quarters or so. The fact that the coefficients form a hump
shape excludes the concern of spurious regressions produced by persistent return
predictors.
We have shown that the contrarian flow predicts the CRSP value weighted
index at quarterly and longer horizons as a high contrarian flow reveals a high
discount rate, consistent with the theoretical prediction in Proposition 3. One
important and testable implication of Proposition 3 is that the contrarian flow
should exhibit stronger predictive power for growth stocks than for value stocks
because growth stock returns have high covariances with market discount rates,
as in Brennan, Wang, and Xia (2004) and Campbell and Vuolteenaho (2004).
Brennan, Wang, and Xia (2004) and Campbell and Vuolteenaho (2004) both
establish that growth stocks are more correlated with proxies for discount rates
or investment opportunities and therefore growth stocks are less risky in the long
term and earn lower returns than value stocks. Although our goal is not to
explain the value premium in the cross-section, the fact that growth stocks are
more affected by discount rates gives us another angle from which to test our theory:
examining whether the contrarian flow predicts growth stocks more strongly than value
stocks and hence could forecast the value premium.
To investigate whether the contrarian flow can forecast the value premium,
we run the quarterly predictive regression
$$R^{HML}_{t+1} = \alpha + \rho\,\mathrm{CTF}_t + \gamma Z_t + e_{t+1} \tag{1.20}$$
where $R^{HML}_{t+1}$ is the return of the zero investment portfolio of long value stocks
and short growth stocks as proposed by Fama and French (1993) and $Z_t$ is a
vector of control variables as discussed above.
lagged value premium to exclude the chance that the predictability is due to
the autocorrelation in the HML. The lagged value premium appears to improve
the model fit, although the coefficient itself is insignificant. Furthermore, the
point estimates of our main variable are barely altered in both magnitude and
statistical significance. From column 4 to 7, we incrementally add the T-bill rate,
the net issues, the term spread and the default spread. None of the controls
is statistically significant, with the possible exception of the T-bill rate, which
is marginally significant under certain specifications. All but the T-bill rate
also impair the model fit, judged by the diminished adjusted $R^2$, and the slope
coefficients of the contrarian flow actually become stronger, if anything. The
increase in the magnitude and statistical significance is undoubtedly due to the
addition of the T-bill rate which has a significant positive correlation with the
main variable of interest.
For the same reason as in the last section, the change in the representative
agent’s risk price should affect the discount rates in the intermediate horizons.
We examine the long horizon regressions for the value premium and the results
are displayed in Panel B of Table 1.3. Similar to the one-quarter results, the
contrarian flow is the only significant variable at any horizon. The adjusted $R^2$
peaks at the one-year horizon and the coefficients gradually decrease and become
insignificant at horizons longer than two years. This confirms the results in
Table 1.2 that the predictability is at short to intermediate intervals, and the fact that our test
statistics do not grow with the horizon rules out the possibility of a mechanical relation
produced by overlapping returns and persistent regressors.
To recap the findings in Table 1.3, the contrarian flow is a significant predictor
for the value premium at short to intermediate horizons while other common
return predictors could not forecast the value premium. The
It is natural to ask whether this predictability comes from the long side,
the short side, or both. To answer this question, we run the following tests
$$R^{ff}_{i,t+1} = \alpha_i + \phi_i\,\mathrm{CTF}_t + e_{i,t+1}, \quad i = 1, 2, \ldots, 25 \tag{1.21}$$
where $R^{ff}_{i,t+1}$ represents the excess returns of the 25 Fama-French size and
book-to-market ratio sorted portfolios.
Table 1.4 reports the regression statistics for the CTF. First of all, all
φ coefficients are positive and 19 out of 25 are statistically significant at the 5%
level. This is consistent with the notion that the CTF predicts the aggregate
market with a positive coefficient. Secondly, holding size constant, the coefficients
decline from low book-to-market (growth) stocks to high book-to-market
(value) stocks. Figure 1.1 plots each portfolio’s coefficient against its own size and
book-to-market group. The declining pattern from growth to value is rather clear.
When the CTF is low and contrarian investors are selling more stocks, both value
and growth stocks will underperform, as will the aggregate market. However,
the effect on growth stocks is almost twice as large as that on value stocks.
We have shown that high contrarian investors' flows reflect high risk aversion of the representative agent. The contrarian flow therefore predicts the market discount rate, and the predictability is stronger for growth stocks than for value stocks. For the same reason, prices and contemporaneous returns should move against the contrarian flow. This exact point is formulated in Proposition 4. Therefore, we can directly test Proposition 4 by examining whether the contrarian flow is negatively related to the contemporaneous market return. This provides us with one straightforward yet important cross-check of our theory.
1.4.3 Disagreement or Risk
Miller (1977) conjectures that high disagreement leads to low future returns because, in the presence of short-sale constraints, the price reflects the valuation of the most optimistic investors. Park (2005) tests the conjecture for the aggregate stock market (S&P 500 index) and finds that the dispersion in analysts' earnings forecasts negatively predicts market returns at intermediate horizons. The findings are consistent with Miller's hypothesis that high dispersion precedes lower market returns. We use the analyst forecast dispersion as a proxy for differences in beliefs to establish that the contrarian flows are not generated by disagreement about fundamentals and that the predictability from contrarian flows is not affected by controlling for disagreement.
The analyst earnings forecasts for the S&P 500 index are extracted from the Institutional Brokers Estimate System (I/B/E/S) database. We follow Park (2005) to construct the dispersion variable ($DISP_t$) as follows,
$$DISP_t = \frac{w_q \sigma_{1,t} + (1 - w_q)\sigma_{2,t}}{e_t}, \quad \text{for the first three quarters} \tag{1.22}$$
$$DISP_t = \frac{\sigma_{2,t}}{e_t}, \quad \text{for the last quarter of the year} \tag{1.23}$$
where $\sigma_{1,t}$ and $\sigma_{2,t}$ are the standard deviations of the current-year earnings forecasts and the subsequent-year earnings forecasts made in the current quarter, $e_t$ is the trailing 12-month actual earnings of the S&P 500 index, and $w_q$ is $\frac{3}{4}$, $\frac{1}{2}$ and $\frac{1}{4}$ respectively for the first three quarters of the year. The normalization by actual earnings ensures that the dispersion does not grow mechanically with the magnitude of actual earnings. The weighting scheme between the current year and the following year keeps the forecast horizon relatively stable as calendar time progresses. I/B/E/S started reporting forecasts of S&P 500 earnings per share for the current and the following year in 1982, but there were not enough analyst estimates to calculate the standard deviation until 1984.
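The weighting scheme in equations (1.22) and (1.23) can be illustrated with a short sketch. The function name, argument names, and the numbers in the usage note are hypothetical; only the weights and the normalization by trailing earnings come from the text.

```python
# Weight on the current-year forecast dispersion for quarters 1-3,
# as stated in the text (3/4, 1/2, 1/4).
QUARTER_WEIGHTS = {1: 0.75, 2: 0.50, 3: 0.25}


def dispersion(quarter, sigma_current, sigma_next, trailing_earnings):
    """Normalized analyst forecast dispersion for one calendar quarter.

    sigma_current, sigma_next: cross-analyst standard deviations of the
        current-year and next-year earnings forecasts made in this quarter.
    trailing_earnings: trailing 12-month actual earnings (assumed positive).
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    if trailing_earnings <= 0:
        raise ValueError("trailing earnings must be positive")
    if quarter == 4:
        # Last quarter: only the next-year forecasts are used (eq. 1.23).
        blended = sigma_next
    else:
        # First three quarters: weighted average of the two dispersions (eq. 1.22).
        w = QUARTER_WEIGHTS[quarter]
        blended = w * sigma_current + (1.0 - w) * sigma_next
    return blended / trailing_earnings
```

For example, with made-up inputs `dispersion(1, 2.0, 4.0, 50.0)` weights the current-year dispersion by 3/4, while `dispersion(4, 2.0, 4.0, 50.0)` ignores it entirely.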
Table 1.6 compares the forecasts made by the analyst dispersion and the contrarian flow, together with other control variables, for the sample from 1984 to 2005. Panel A displays the one-quarter predictability tests for both variables. We observe that the contrarian flow predicts future CRSP returns even after controlling for analyst dispersion and other macro variables. The point estimate of the contrarian flow's coefficient in this subsample, 0.093, is very similar to the full-sample estimate in Table 1.2, 0.085. As expected, $DISP_t$ predicts next quarter's market return negatively, and it is statistically significant as a single predictor. The predictive power of the disagreement measure weakens as we add the contrarian flow and finally becomes insignificant once we control for the full set of macro variables. The results are consistent with Park (2005) in that the predictive power of $DISP_t$ is stronger at intermediate horizons. Panel B presents the long-horizon results. The contrarian flow remains significant at almost all horizons in the presence of the disagreement measure. The point estimates for $DISP_t$ are significant at intermediate horizons, from 2 to 7 quarters, which is qualitatively very similar to the results in Park (2005).
The comparisons between the contrarian flow and the analyst dispersion are
important because differences in belief could serve as one primary motive for
trading. Indeed, it could be argued that contrarian investors trade a lot when they
disagree most with the representative agent. The fact that the contrarian flow is
not affected by the disagreement proxy excludes the possibility that differential
information is the explanation for our main results.
It is useful to dissect the investor groups by the direction in which they adjust their risky asset holdings, so that each investor group may be labeled differently through time. However, it is also interesting to examine how each group's flow correlates with the contrarian flow and which groups make these contrarian trades more often. To answer this question, we look at two measures.
The first measure we adopt is the linear correlation between each investor group's normalized flow and the contrarian flow. The correlations are presented in column 1 of Table 1.7 with p-values in parentheses. Out of 11 investor groups, the flows of 5 groups (households, life insurance companies, private pensions, saving institutions and foreigners) correlate negatively with the contrarian flow, and the flow of 1 group (federal pension funds) correlates positively with the contrarian flow.4 The rest do not show significant correlations with the contrarian flow. Although it is difficult to state the exact reason why each group has a positive or negative correlation with the contrarian flow, it is certainly plausible that households avoid risky assets in bad times, which makes them trade in the opposite direction to the contrarian investors. Similarly, federal pensions could adopt an investment plan that contributes to the stock market constantly regardless of market conditions and the prevailing risk appetite.
4
Federal pension funds first appeared in the data in 1987.
The second measure we adopt is the frequency with which each investor group acts as a contrarian investor. However, the raw frequency does not take historical trends into account. On the one hand, corporations issue shares to the representative agent more than half of the time, and therefore the contrarian investors have negative flows in those periods. On the other hand, households' direct holdings of the stock market have declined dramatically since the 1960s, and institutional investors' holdings have increased over the same period, as documented by Gompers and Metrick (2002) and Cohen (1999). Therefore, if we only look at the raw frequency, households trade as contrarian investors about 67% of the time, which simply reflects institutionalization and the downward trend in household direct holdings. As a result, we adjust the frequency measure for the part that is created only by the trend. Take households as an example. The contrarian flow is negative about three quarters of the time and positive the remaining one quarter of the time. At the same time, the households' flow is negative about 90% of the time and positive 10% of the time, which is the historical trend. Therefore, the households have a probability of 0.70 = 0.75 · 0.90 + 0.25 · 0.10 of being labeled as contrarian just by random chance. Considering this trend, the adjusted frequency of being a contrarian investor is 0.67 − 0.70 = −0.03. The adjusted frequency reveals that the households' actual tendency to act as contrarians is negative even though their raw frequency of being a contrarian investor is high. Column 2 of Table 1.7 reports the raw frequency. The households have the highest raw frequency of being contrarian, and all other groups are contrarian less than half of the time. The pattern simply reflects that households have reduced their direct stock holdings and institutions have increased theirs. Column 3 displays the adjusted frequency with t-statistics in parentheses. The adjusted frequency corrects the bias created by the historical trend and describes the propensity to be contrarian more accurately. The adjusted frequencies are generally in line with the correlation measure. There are 4 groups (households, life insurance companies, private pensions, and foreigners) that have significantly negative adjusted frequencies and 1 group (federal pensions) that has a positive adjusted frequency.
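The trend adjustment in the household example can be written out as a small calculation. The sketch below assumes, as the "random chance" argument implicitly does, that the sign of a group's flow and the sign of the contrarian flow are independent under the benchmark. The function names are illustrative.

```python
def chance_frequency(p_contrarian_negative, p_group_negative):
    """Probability that a group's flow has the same sign as the contrarian
    flow if the two signs were drawn independently (the trend-only benchmark)."""
    both_negative = p_contrarian_negative * p_group_negative
    both_positive = (1.0 - p_contrarian_negative) * (1.0 - p_group_negative)
    return both_negative + both_positive


def adjusted_frequency(raw_frequency, p_contrarian_negative, p_group_negative):
    """Raw frequency of being contrarian minus the part explained by trends."""
    return raw_frequency - chance_frequency(p_contrarian_negative, p_group_negative)


# Household example from the text: contrarian flow negative 75% of the time,
# household flow negative 90% of the time, raw contrarian frequency 67%.
household_chance = chance_frequency(0.75, 0.90)
household_adjusted = adjusted_frequency(0.67, 0.75, 0.90)
print(round(household_chance, 2), round(household_adjusted, 2))
```

The same two functions apply to any other investor group once its raw frequency and the share of periods with negative flows are known.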
1.5 Robustness
In this section, we provide several robustness checks of the main results in Table
1.2. We address the in-sample and out-of-sample predictability issue discussed in
Goyal and Welch (2006) and further provide subsample results.
Goyal and Welch (2006) demonstrate that most known return predictors perform poorly when judged by out-of-sample tests and conclude that those predictors would hardly help an investor to time the market. It is therefore of both economic and statistical importance to examine the out-of-sample performance of the contrarian flow (CTF) variable.
using the data within the initial estimation period to obtain an initial estimate of the model. Then we perform a one-quarter-ahead out-of-sample forecast based on the initial estimate. We reestimate the model adding one data point at a time and predict returns with the rolling estimates. Finally, we compare the conditional forecast with the unconditional forecast based only on the historical mean. The benchmarks we employ are based on the mean square error (MSE) and the root mean square error (RMSE). We denote the mean square error from the rolling historical mean by $MSE_U$ (unconditional) and that from the rolling regression by $MSE_C$ (conditional). The out-of-sample statistics are calculated as follows,
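$$\begin{aligned}
R^2 &= 1 - \frac{MSE_C}{MSE_U}, \\
\Delta RMSE &= \sqrt{MSE_U} - \sqrt{MSE_C}, \\
MSE\text{-}F &= (T - h + 1) \cdot \left(\frac{MSE_U - MSE_C}{MSE_C}\right),
\end{aligned} \tag{1.24}$$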
where MSE-F is the McCracken (2004) F-statistic, which tests the null hypothesis that ∆MSE = 0, and h is the degree of overlap (h = 1 for no overlap). The MSE-F statistic follows a non-standard distribution, so we bootstrap to compute the critical values. Under the null of no predictability, we first estimate the following specification using the whole sample,
$$R^e_{t+1} = \alpha + u_{1,t+1}$$
and keep the residuals. We then resample the residuals to generate 10,000 paths using the data generating process above. For each of the 10,000 time series, we perform the same out-of-sample analysis as for the actual data and calculate the distributions of the OOS $R^2$, the ∆RMSE and the MSE-F statistics. In this way, we obtain the p-values of the three benchmarks.
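The procedure described above (expanding-window forecasts, the three statistics in (1.24), and a residual bootstrap under the null) can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's code: T is taken to be the number of out-of-sample forecasts because the text does not define it, the predictor series is held fixed across bootstrap paths, and all function names are hypothetical.

```python
import numpy as np


def oos_metrics(mse_u, mse_c, n_forecasts, h=1):
    """Out-of-sample R^2, Delta-RMSE and MSE-F as in equation (1.24).
    n_forecasts plays the role of T (an assumption; T is not defined in the text)."""
    r2 = 1.0 - mse_c / mse_u
    d_rmse = np.sqrt(mse_u) - np.sqrt(mse_c)
    mse_f = (n_forecasts - h + 1) * (mse_u - mse_c) / mse_c
    return r2, d_rmse, mse_f


def expanding_window_mse(y, x, n_init):
    """y[t] is the return realized one period after the predictor x[t].
    At each date t >= n_init, the model is re-estimated on observations 0..t-1."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    err_u, err_c = [], []
    for t in range(n_init, len(y)):
        design = np.column_stack([np.ones(t), x[:t]])
        beta, *_ = np.linalg.lstsq(design, y[:t], rcond=None)
        err_c.append(y[t] - (beta[0] + beta[1] * x[t]))  # conditional forecast error
        err_u.append(y[t] - y[:t].mean())                # historical-mean forecast error
    mse_u = float(np.mean(np.square(err_u)))
    mse_c = float(np.mean(np.square(err_c)))
    return mse_u, mse_c, len(err_u)


def bootstrap_pvalue(y, x, n_init, n_paths=1000, seed=0):
    """Bootstrap p-value of MSE-F under the null R_{t+1} = alpha + u_{t+1}.
    Residuals are resampled with replacement; x is held fixed (an assumption)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    alpha = y.mean()
    resid = y - alpha
    actual = oos_metrics(*expanding_window_mse(y, x, n_init))[2]
    sims = np.empty(n_paths)
    for i in range(n_paths):
        y_sim = alpha + rng.choice(resid, size=len(y), replace=True)
        sims[i] = oos_metrics(*expanding_window_mse(y_sim, x, n_init))[2]
    # Share of simulated statistics at least as large as the observed one.
    return float(np.mean(sims >= actual))
```

The same bootstrap loop could collect the simulated OOS $R^2$ and ∆RMSE values to obtain p-values for those two statistics as well.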
The out-of-sample results are presented in Table 1.8. The one-quarter non-overlapping regression achieves an out-of-sample adjusted $R^2$ of 2.4%, which is remarkably similar to the in-sample adjusted $R^2$ of the second half of the sample (shown in Table ...). Judging by the MSE and RMSE, we can reject the null hypothesis that the CTF variable contains no more information about expected returns than the unconditional mean. In this benchmark case, the out-of-sample test substantiates the in-sample regression statistics, and the predictability is economically meaningful for outside investors. Moreover, at all horizons our conditional estimates improve on their unconditional counterparts. The fact that our out-of-sample estimates improve greatly on the unconditional forecasts is a clear demonstration of the economic significance of our predictability results.
1.6 Conclusion
In this paper, we use the capital flow of contrarian investors to study the implications of heterogeneity in investor risk preferences for the market risk premium. We first show that, in a model in which investors have heterogeneous risk preferences, the flow of contrarian investors contains information about the risk aversion of the representative agent. A high contrarian flow reveals to the econometrician that the representative agent has become more risk averse. Therefore, it is reasonable to suspect that the CTF reflects the market risk premium required by the representative agent, and we explore this link by asking whether the CTF forecasts market excess returns.
Our results show that in a time-series regression, the CTF predicts the CRSP value weighted index at the quarterly horizon. Economically, this forecasting relationship is about three times as strong as that of the widely used benchmark, the dividend yield. The economic and statistical significance is robust to the addition of other control variables. We conjecture that the CTF predicts market risk premia because our theory suggests that it reflects the risk aversion of the representative agent. We verify this hypothesis by examining the predictability for value and growth stocks separately and find that the forecastability is stronger for growth stocks than for value stocks. The difference in forecastability, together with the notion that growth stocks bear more discount rate risk, confirms our hypothesis that the contrarian flow measures the market risk premium.
Traditional literature has studied the time varying risk premia through
aggregate flow or issuance. Our study introduces the CTF as a measure of the
risk premia required by the market especially when investors are heterogeneous in
their risk preferences. This measure provides a new way to uncover information
lost in the aggregate variables.
1.7 Proofs
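$$H^a_t = w \, \frac{E_t[Q_{t+1}]}{a \, \mathrm{Var}_t[Q_{t+1}]}$$
$$H^b_t = (1 - w) \, \frac{E_t[Q_{t+1}]}{b_t \, \mathrm{Var}_t[Q_{t+1}]}, \tag{1.26}$$
where $Q_{t+1} = P_{t+1} + D_{t+1} - R P_t$ is the excess return of the stock. The demand functions are standard results following the mean-variance analysis.
$$\begin{aligned}
H^a_t + H^b_t &= \frac{w}{a} \frac{E_t[Q_{t+1}]}{\mathrm{Var}_t[Q_{t+1}]} + \frac{1 - w}{b_t} \frac{E_t[Q_{t+1}]}{\mathrm{Var}_t[Q_{t+1}]} \\
&= \left(\frac{w}{a} + \frac{1 - w}{b_t}\right) \frac{E_t[Q_{t+1}]}{\mathrm{Var}_t[Q_{t+1}]} \\
&= \frac{w b_t + (1 - w) a}{a b_t} \, \frac{E_t[Q_{t+1}]}{\mathrm{Var}_t[Q_{t+1}]} \\
&= \frac{E_t[Q_{t+1}]}{\gamma_t \mathrm{Var}_t[Q_{t+1}]},
\end{aligned} \tag{1.27}$$
where $\gamma_t = \frac{a b_t}{w b_t + (1 - w) a}$. It is clear that the aggregate demand of all the investors is equivalent to that of a representative agent with risk aversion $\gamma_t$ defined as above.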
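The aggregation result can be verified numerically. The following sketch uses made-up parameter values (they are not calibrated to anything in the text) to check that the two mean-variance demands sum to the demand of a single agent whose risk aversion is the weighted harmonic mean of a and b_t.

```python
def demand_a(w, a, mean_q, var_q):
    """Mean-variance demand of group A (constant risk aversion a)."""
    return w * mean_q / (a * var_q)


def demand_b(w, b, mean_q, var_q):
    """Mean-variance demand of group B (time-varying risk aversion b_t)."""
    return (1.0 - w) * mean_q / (b * var_q)


def representative_risk_aversion(w, a, b):
    """gamma_t = a*b_t / (w*b_t + (1-w)*a), i.e. 1/gamma_t = w/a + (1-w)/b_t."""
    return a * b / (w * b + (1.0 - w) * a)


# Illustrative (hypothetical) inputs.
w, a, b = 0.4, 2.0, 5.0
mean_q, var_q = 0.06, 0.04

total_demand = demand_a(w, a, mean_q, var_q) + demand_b(w, b, mean_q, var_q)
gamma = representative_risk_aversion(w, a, b)
representative_demand = mean_q / (gamma * var_q)

print(total_demand, representative_demand)
```

Because $1/\gamma_t$ is a weighted average of $1/a$ and $1/b_t$, an increase in $b_t$ raises $\gamma_t$, which is the channel through which group B's risk aversion moves the representative agent's.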
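$$P_t = F_t - D_t + p_0 + p_1 \gamma_t + p_2 X_t, \tag{1.28}$$
where $F_t = \frac{R \bar{D}}{r} + \frac{R}{R - \alpha_D} \hat{D}_t + \frac{1}{R - \alpha_D} S_t$. Therefore, the excess return of the risky asset is,
$$\mathrm{Var}_t[Q_{t+1}] = \underbrace{\left(\frac{R}{R - \alpha_D}\right)^2 \sigma_\varepsilon^2 + \left(\frac{1}{R - \alpha_D}\right)^2 \sigma_S^2}_{\sigma_F^2} + \underbrace{p_1^2 \sigma_\gamma^2 + p_2^2 \sigma_X^2 + 2 p_1 p_2 \sigma_{X,\gamma}}_{\sigma_Q^2}, \tag{1.31}$$
where $\sigma_{X,\gamma} = \rho_{\gamma,X} \sigma_X \sigma_\gamma$.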
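To solve for the equilibrium price function, we impose the market clearing condition, so that $H^a_t + H^b_t = 1 + X_t$,
$$\frac{E_t[Q_{t+1}]}{\gamma_t \mathrm{Var}_t[Q_{t+1}]} = 1 + X_t,$$
$$\begin{aligned}
E_t[Q_{t+1}] &= (\gamma_t \mathrm{Var}_t[Q_{t+1}])(1 + X_t) \\
&= \gamma_t (\sigma_F^2 + \sigma_Q^2)(1 + X_t) \\
&= (\sigma_F^2 + \sigma_Q^2)(\bar{\gamma} + \bar{\gamma} \hat{X}_t + \hat{\gamma}_t + \hat{\gamma}_t \hat{X}_t),
\end{aligned} \tag{1.32}$$
Substituting equation (1.30) into equation (1.32), we have
$$\begin{aligned}
(\sigma_F^2 + \sigma_Q^2) \bar{\gamma} &= -r (p_0 + p_1 \bar{\gamma}) \\
(\sigma_F^2 + \sigma_Q^2) \hat{\gamma}_t &= p_1 (\alpha_\gamma - R) \hat{\gamma}_t \\
(\sigma_F^2 + \sigma_Q^2) \bar{\gamma} \hat{X}_t &= p_2 (\alpha_X - R) \hat{X}_t
\end{aligned} \tag{1.34}$$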
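where $\sigma_P^2 = \sigma_\gamma^2 + \left(\frac{\alpha_\gamma - R}{\alpha_X - R}\right)^2 \bar{\gamma}^2 \sigma_X^2 + 2 \left(\frac{\alpha_\gamma - R}{\alpha_X - R}\right) \bar{\gamma} \sigma_{X,\gamma} > 0$. Both $p_0$ and $p_2$ are linear functions of $p_1$. Both roots of the quadratic equation above are negative, and we choose the one that equals 0 when $\sigma_F^2$ is 0. Therefore,
$$\begin{aligned}
p_0 &= -\frac{(\sigma_F^2 + \sigma_Q^2) \bar{\gamma}}{r} - p_1 \bar{\gamma}, \\
p_1 &= \frac{\alpha_\gamma - R + \sqrt{(\alpha_\gamma - R)^2 - 4 \sigma_P^2 \sigma_F^2}}{2 \sigma_P^2}, \\
p_2 &= p_1 \bar{\gamma} \, \frac{\alpha_\gamma - R}{\alpha_X - R}
\end{aligned} \tag{1.36}$$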
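The quadratic equation in $p_1$ that the solution refers to is not shown in this text. As a sketch, it can be recovered from the $\hat{\gamma}_t$ condition in (1.34) together with the definitions of $\sigma_Q^2$ in (1.31) and of $\sigma_P^2$; the intermediate steps below are a reconstruction under those definitions rather than a quotation.

```latex
\begin{align*}
\sigma_Q^2 &= p_1^2 \sigma_\gamma^2 + p_2^2 \sigma_X^2 + 2 p_1 p_2 \sigma_{X,\gamma}
            = p_1^2 \sigma_P^2
            \quad \text{using } p_2 = p_1 \bar{\gamma} \, \frac{\alpha_\gamma - R}{\alpha_X - R}, \\
\sigma_F^2 + p_1^2 \sigma_P^2 &= p_1 (\alpha_\gamma - R)
            \;\Longrightarrow\;
            \sigma_P^2 \, p_1^2 - (\alpha_\gamma - R) \, p_1 + \sigma_F^2 = 0, \\
p_1 &= \frac{(\alpha_\gamma - R) \pm \sqrt{(\alpha_\gamma - R)^2 - 4 \sigma_P^2 \sigma_F^2}}{2 \sigma_P^2}.
\end{align*}
```

When the roots are real, their product $\sigma_F^2 / \sigma_P^2$ is positive and their sum $(\alpha_\gamma - R) / \sigma_P^2$ is negative given $\alpha_\gamma - R < 0$, so both are negative. The root with the plus sign tends to 0 as $\sigma_F^2$ goes to 0, which matches the selected solution.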
1.7.2 Proof of Proposition 2
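$$\begin{aligned}
\Delta H^a_t &= \frac{w}{a} \left[\gamma_t (1 + X_t) - \gamma_{t-1} (1 + X_{t-1})\right] \\
&= \frac{w}{a} \left[(\alpha_\gamma - 1) \hat{\gamma}_{t-1} + u_{\gamma,t} + \bar{\gamma} (\hat{X}_t - \hat{X}_{t-1})\right] \\
\Delta H^b_t &= \frac{a - w \gamma_t}{a} (1 + X_t) - \frac{a - w \gamma_{t-1}}{a} (1 + X_{t-1}) \\
&= -\frac{w}{a} (\gamma_t - \gamma_{t-1}) + X_t - X_{t-1} - \frac{w}{a} (\gamma_t X_t - \gamma_{t-1} X_{t-1}) \\
&= -\frac{w}{a} \left[(\alpha_\gamma - 1) \hat{\gamma}_{t-1} + u_{\gamma,t}\right] + \left(1 - \frac{w}{a} \bar{\gamma}\right) (\hat{X}_t - \hat{X}_{t-1})
\end{aligned} \tag{1.37}$$
Therefore,
$$\begin{aligned}
\mathrm{Cov}(\Delta H^a_t, \Delta X_t) &= \frac{w}{a} \mathrm{Cov}(\hat{\gamma}_t - \hat{\gamma}_{t-1} + \bar{\gamma} (\hat{X}_t - \hat{X}_{t-1}), \hat{X}_t - \hat{X}_{t-1}) \\
&= \frac{w}{a} \left[\mathrm{Cov}(\hat{X}_t, \hat{\gamma}_t) - \mathrm{Cov}(\hat{X}_t, \hat{\gamma}_{t-1}) - \mathrm{Cov}(\hat{X}_{t-1}, \hat{\gamma}_t) + \mathrm{Cov}(\hat{X}_{t-1}, \hat{\gamma}_{t-1}) + \bar{\gamma} \mathrm{Var}(\hat{X}_t - \hat{X}_{t-1})\right] \\
&= \frac{w}{a} \left[\frac{1}{1 - \alpha_X \alpha_\gamma} \sigma_{X,\gamma} - \frac{\alpha_X}{1 - \alpha_X \alpha_\gamma} \sigma_{X,\gamma} - \frac{\alpha_\gamma}{1 - \alpha_X \alpha_\gamma} \sigma_{X,\gamma} + \frac{1}{1 - \alpha_X \alpha_\gamma} \sigma_{X,\gamma} + \bar{\gamma} (2 - \alpha_X) \sigma_X^2\right] \\
&= \frac{w}{a} \left[\frac{2 - \alpha_X - \alpha_\gamma}{1 - \alpha_X \alpha_\gamma} \sigma_{X,\gamma} + \bar{\gamma} (2 - \alpha_X) \sigma_X^2\right],
\end{aligned} \tag{1.38}$$
and
$$\begin{aligned}
\mathrm{Cov}(\Delta H^b_t, \Delta X_t) &= -\frac{w}{a} \mathrm{Cov}(\hat{\gamma}_t - \hat{\gamma}_{t-1}, \hat{X}_t - \hat{X}_{t-1}) + \left(1 - \frac{w}{a} \bar{\gamma}\right) \mathrm{Var}(\hat{X}_t - \hat{X}_{t-1}) \\
&= -\frac{w}{a} \cdot \frac{2 - \alpha_X - \alpha_\gamma}{1 - \alpha_X \alpha_\gamma} \sigma_{X,\gamma} + \left(1 - \frac{w}{a} \bar{\gamma}\right) (2 - \alpha_X) \sigma_X^2
\end{aligned} \tag{1.39}$$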
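We can denote $\rho^* = -\frac{\bar{\gamma} (2 - \alpha_X)(1 - \alpha_X \alpha_\gamma)}{2 - \alpha_X - \alpha_\gamma} \cdot \frac{\sigma_X}{\sigma_\gamma}$. It is clear from equation (1.38) that if $\rho_{X,\gamma} < \rho^*$, then $\mathrm{Cov}(\Delta H^a_t, \Delta X_t) < 0$.
From equation (1.40), if $\rho_{X,\gamma} < \rho^{**}$, then $\mathrm{Cov}(\Delta X_t, E_t[Q_{t+1}]) < 0$, where $\rho^{**} = -\frac{\bar{\gamma} (1 - \alpha_X \alpha_\gamma)}{1 - \alpha_X} \cdot \frac{\sigma_X}{\sigma_\gamma}$. We have $\rho^{**} < \rho^*$ as long as $2 - \alpha_X > \frac{\alpha_\gamma}{\alpha_X}$. That is satisfied if $\alpha_X$ is not too small compared with $\alpha_\gamma$. The two autocorrelation coefficients, $\alpha_X$ and $\alpha_\gamma$, are approximately the same in the data.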
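The ordering of the two thresholds follows from a few lines of algebra. The sketch below restates the two thresholds and assumes $\bar{\gamma} > 0$, $\sigma_X, \sigma_\gamma > 0$, $0 < \alpha_X < 1$, $1 - \alpha_X \alpha_\gamma > 0$ and $2 - \alpha_X - \alpha_\gamma > 0$; these sign conditions are assumptions made here for the derivation.

```latex
\begin{align*}
\rho^{*} &= -\frac{\bar{\gamma}(2 - \alpha_X)(1 - \alpha_X \alpha_\gamma)}{2 - \alpha_X - \alpha_\gamma}
            \cdot \frac{\sigma_X}{\sigma_\gamma},
\qquad
\rho^{**} = -\frac{\bar{\gamma}(1 - \alpha_X \alpha_\gamma)}{1 - \alpha_X}
            \cdot \frac{\sigma_X}{\sigma_\gamma}, \\
\rho^{**} < \rho^{*}
&\iff \frac{1}{1 - \alpha_X} > \frac{2 - \alpha_X}{2 - \alpha_X - \alpha_\gamma}
  && \text{(divide by } -\bar{\gamma}(1 - \alpha_X \alpha_\gamma)\sigma_X / \sigma_\gamma < 0\text{)} \\
&\iff 2 - \alpha_X - \alpha_\gamma > (2 - \alpha_X)(1 - \alpha_X) = 2 - 3\alpha_X + \alpha_X^2 \\
&\iff \alpha_X (2 - \alpha_X) > \alpha_\gamma \\
&\iff 2 - \alpha_X > \frac{\alpha_\gamma}{\alpha_X}.
\end{align*}
```

In particular, when $\alpha_X = \alpha_\gamma$ the condition reduces to $\alpha_X < 1$, which is consistent with the remark that the two autocorrelation coefficients are approximately equal in the data.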
1.7.3 Proof of Proposition 3
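The covariance of expected returns and group A's flows can be simplified using the fact that $p_2 = p_1 \bar{\gamma} \, \frac{\alpha_\gamma - R}{\alpha_X - R}$,
$$\begin{aligned}
\mathrm{Cov}(\Delta H^a_t, E_t[Q_{t+1}]) &= \frac{w}{a} \mathrm{Cov}(\Delta \hat{\gamma}_t + \bar{\gamma} \Delta \hat{X}_t, \; p_1 (\alpha_\gamma - R) \hat{\gamma}_t + p_2 (\alpha_X - R) \hat{X}_t) \\
&= \frac{w}{a} \left[p_1 (\alpha_\gamma - R) \mathrm{Cov}(\Delta \hat{\gamma}_t, \hat{\gamma}_t) + p_1 (\alpha_\gamma - R) \bar{\gamma} \mathrm{Cov}(\Delta \hat{X}_t, \hat{\gamma}_t) \right. \\
&\qquad \left. + \, p_2 (\alpha_X - R) \mathrm{Cov}(\Delta \hat{\gamma}_t, \hat{X}_t) + p_2 (\alpha_X - R) \bar{\gamma} \mathrm{Cov}(\Delta \hat{X}_t, \hat{X}_t)\right] \\
&= \frac{w}{a} \left[p_1 (\alpha_\gamma - R) \sigma_\gamma^2 + p_1 (\alpha_\gamma - R) \bar{\gamma} \frac{1 - \alpha_\gamma}{1 - \alpha_\gamma \alpha_X} \sigma_{X,\gamma} \right. \\
&\qquad \left. + \, p_2 (\alpha_X - R) \frac{1 - \alpha_X}{1 - \alpha_\gamma \alpha_X} \sigma_{X,\gamma} + p_2 (\alpha_X - R) \bar{\gamma} \sigma_X^2\right] \\
&= \frac{w}{a} \left[p_1 (\alpha_\gamma - R) \sigma_\gamma^2 + p_1 (\alpha_\gamma - R) \bar{\gamma} \frac{1 - \alpha_\gamma}{1 - \alpha_\gamma \alpha_X} \sigma_{X,\gamma} \right. \\
&\qquad \left. + \, p_1 (\alpha_\gamma - R) \bar{\gamma} \frac{1 - \alpha_X}{1 - \alpha_\gamma \alpha_X} \sigma_{X,\gamma} + p_1 (\alpha_\gamma - R) \bar{\gamma}^2 \sigma_X^2\right] \\
&= \frac{w}{a} p_1 (\alpha_\gamma - R) \left(\sigma_\gamma^2 + \bar{\gamma} \cdot \frac{2 - \alpha_\gamma - \alpha_X}{1 - \alpha_\gamma \alpha_X} \cdot \sigma_{X,\gamma} + \bar{\gamma}^2 \sigma_X^2\right)
\end{aligned} \tag{1.41}$$
$\sigma_\gamma^2 + \bar{\gamma} \cdot \frac{2 - \alpha_\gamma - \alpha_X}{1 - \alpha_\gamma \alpha_X} \cdot \sigma_{X,\gamma} + \bar{\gamma}^2 \sigma_X^2 > 0$, because the discriminant of the quadratic form is less than 0 as $\frac{2 - \alpha_\gamma - \alpha_X}{1 - \alpha_\gamma \alpha_X} < 2$. We also have $p_1 < 0$ and $\alpha_\gamma - R < 0$. Therefore,
$$\mathrm{Cov}(\Delta H^a_t, E_t[Q_{t+1}]) > 0 \tag{1.42}$$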
1.7.4 Proof of Proposition 4
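$$\begin{aligned}
\mathrm{Cov}(\Delta H^a_t, Q_t - E_{t-1}[Q_t]) &= \frac{w}{a} \mathrm{Cov}(\Delta \hat{\gamma}_t + \bar{\gamma} \Delta \hat{X}_t, \; p_1 u_{\gamma,t} + p_2 u_{X,t}) \\
&= \frac{w}{a} \mathrm{Cov}(u_{\gamma,t} + \bar{\gamma} u_{X,t}, \; p_1 u_{\gamma,t} + p_2 u_{X,t}) \\
&= \frac{w}{a} \left[p_1 \sigma_\gamma^2 + (p_1 \bar{\gamma} + p_2) \sigma_{X,\gamma} + p_2 \bar{\gamma} \sigma_X^2\right] \\
&= \frac{w}{a} p_1 \left[\sigma_\gamma^2 + \left(1 + \frac{\alpha_\gamma - R}{\alpha_X - R}\right) \bar{\gamma} \sigma_{X,\gamma} + \bar{\gamma}^2 \sigma_X^2\right],
\end{aligned} \tag{1.43}$$
where $p_1 < 0$. The quadratic form $\sigma_\gamma^2 + \left(1 + \frac{\alpha_\gamma - R}{\alpha_X - R}\right) \bar{\gamma} \sigma_{X,\gamma} + \bar{\gamma}^2 \sigma_X^2$ is positive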
Table 1.1: Summary Statistics
Panel A reports the summary statistics for all the variables. CTF represents the flows from the
contrarian investors defined in equation (1.17). R e is the quarterly excess return of the CRSP
value weighted index. NTIS is the twelve-month moving sums of net issues by NYSE stocks
divided by the year end total market capitalization. Tbl is the three-month Treasury Bill rate.
DP is the log dividend price ratio of the CRSP index. TMS is the term spread. DEF is the
default spread. Panel B reports the correlation coefficients of the CTF with other variables
and the p-values are displayed in the parentheses.
Panel B: Correlations
Table 1.2: Predictive Regressions of Index Excess Returns
This table reports predictive regressions of CRSP value weighted index returns by the CTF
variable and other control variables at different horizons. The dependent variables are the
excess returns of the CRSP value weighted index. CTFt represents the flows from the contrarian
investors defined in equation (1.17). Rte is the CRSP index excess return (lagged). DPt is the
log dividend price ratio of the CRSP index. Tblt is the three-month Treasury Bill rate. NTISt
is the twelve-month moving sums of net issues by NYSE stocks divided by the year end total
market capitalization. TMSt is the term spread. DEFt is the default spread. The t -statistics
in parentheses are corrected for heteroscedasticity and autocorrelation. Panel A compares the
predictive regression of the next quarter’s index excess returns under different specifications.
Panel B examines the return predictability at longer horizons of up to twelve quarters using all
the predictors (most exhaustive specification from Panel A). The numbers indicate the number
of quarters by which the returns lead all the explanatory variables. For example, the column labeled
4 shows the regression of $R^e_{t,t+4}$ on all the explanatory variables at time t.
Adj R 2 0.067 0.073 0.092 0.109 0.112 0.131 0.122 0.134 0.190
Table 1.3: Predictive Regressions of HML
This table reports predictive regressions of HML by the CTF variable and other control
variables at different horizons. The dependent variable is the return of a portfolio that is long
value stocks and short growth stocks. CTFt represents the flows from the contrarian
investors defined in equation (1.17). HMLt is the return of the zero investment portfolio of
long value stocks and short growth stocks (lagged). DPt is the log dividend price ratio of the
CRSP index. Tblt is the three-month Treasury Bill rate. NTISt is the twelve-month moving
sums of net issues by NYSE stocks divided by the year end total market capitalization. TMSt
is the term spread. DEFt is the default spread. The t -statistics in parentheses are corrected
for heteroscedasticity and autocorrelation. Panel A compares the predictive regression of the
next quarter’s HML under different specifications. Panel B examines the return predictability
at longer horizons of up to eight quarters using the optimal specification (most exhaustive
specification from Panel A). The numbers in the square brackets indicate the number of quarters
the returns leading all the explanatory variables. The dependent variables are overlapping. For
example, the column labeled 4 shows the regression of HMLt,t+4 on all the explanatory variables
at time t .
            1         2         3         4         5         6         7         8
CTFt     -0.050    -0.057    -0.085    -0.180    -0.174    -0.169    -0.188    -0.215
        (-2.081)  (-1.689)  (-2.003)  (-3.450)  (-2.769)  (-2.021)  (-2.205)  (-2.193)
Tblt      0.150     0.171     0.129     0.128     0.072     0.065     0.000     0.058
         (1.184)   (0.645)   (0.319)   (0.260)   (0.126)   (0.102)   (0.000)   (0.078)
VSt      -0.067    -0.132    -0.151    -0.161    -0.173    -0.226    -0.264    -0.274
        (-2.144)  (-2.215)  (-1.848)  (-1.635)  (-1.679)  (-2.066)  (-2.228)  (-2.210)
Const    -0.097    -0.184    -0.194    -0.198    -0.196    -0.261    -0.301    -0.305
        (-2.019)  (-1.994)  (-1.552)  (-1.308)  (-1.240)  (-1.548)  (-1.652)  (-1.623)
Table 1.4: Predictive Regressions of the Fama-French Portfolios
This table reports the coefficients from predictive regressions of the excess returns of the 25
Fama-French portfolios by the CTF variable. The dependent variables are the 25 Fama-
French benchmark portfolios. CTF represents the flows from the contrarian investors defined in
equation (1.17). This table compares the univariate predictive regressions of the next quarter’s
25 Fama-French portfolios. The t -statistics in parentheses are corrected for heteroscedasticity
and autocorrelation.
          Small       2         3         4        Big
Low       0.169     0.171     0.145     0.133     0.125
         (2.272)   (2.585)   (2.396)   (2.510)   (3.476)
2         0.151     0.130     0.109     0.091     0.085
         (2.534)   (2.678)   (2.451)   (2.170)   (2.628)
3         0.153     0.103     0.084     0.074     0.068
         (2.581)   (2.151)   (2.188)   (1.742)   (2.374)
4         0.109     0.090     0.068     0.066     0.059
         (1.952)   (2.071)   (1.440)   (1.801)   (2.004)
High      0.112     0.102     0.089     0.088     0.063
         (1.740)   (1.668)   (2.037)   (2.428)   (2.067)
Table 1.5: Contemporaneous Returns
This table reports the contemporaneous regressions of CRSP value weighted index returns on
the CTF variable under control variables. The dependent variables are the excess returns
of the CRSP value weighted index. CTFt represents the flows from the contrarian investors
defined in equation (1.17). DPt is the log dividend price ratio of the CRSP index. $R^e_{t-1}$ is
the lagged CRSP index excess return. Tblt is the three-month Treasury Bill rate. NTISt is
the twelve-month moving sums of net issues by NYSE stocks divided by the year end total
market capitalization. TMSt is the term spread. DEFt is the default spread. The t -statistics
in parentheses are corrected for heteroscedasticity and autocorrelation.
Table 1.6: Contrarian Flows and Analyst Dispersions
This table compares the predictive regressions of CRSP value weighted index returns by the
CTF variable and the DISP under control variables at different horizons. The dependent
variables are the excess returns of the CRSP value weighted index. CTFt represents the flows
from the contrarian investors defined in equation (1.17). DISPt is the analyst dispersion of
the S&P 500 earnings forecasts. DPt is the log dividend price ratio of the CRSP index. Tblt is
the three-month Treasury Bill rate. NTISt is the twelve-month moving sums of net issues by
NYSE stocks divided by the year end total market capitalization. TMSt is the term spread.
DEFt is the default spread. The t -statistics in parentheses are corrected for heteroscedasticity
and autocorrelation. Panel A compares the predictive regression of the next quarter’s index
excess returns. Panel B examines the return predictability at longer horizons of up to twelve
quarters using all the predictors (most exhaustive specification from Panel A). The numbers
indicate the number of quarters the returns leading all the explanatory variables. The sample
period is from 1984 to 2005.
Adj R 2 0.054 0.071 0.112 0.162 0.176 0.191 0.227 0.221 0.277
Table 1.7: Contrarian Likelihood of All Investors
This table reports the likelihood of being a contrarian for all types of investors through three
measures. Column Correlation is the linear correlation of each type of investor's flows with the
contrarian flows, with p-values in parentheses. Column Frequency is the frequency of being a
contrarian for each type of investor. Column Adj. Freq. is the frequency of being a contrarian
for each type of investor adjusted for the time trend in each type of investor's flows, with
p-values in parentheses.
Table 1.8: Out of Sample Tests
This table presents the out-of-sample tests of the CTF variable in a simple regression. OOS
$R^2$ is the out-of-sample adjusted $R^2$ for the CTF variable. ∆RMSE is the improvement in
the root mean square error of the conditional model over the unconditional mean. MSE-F is
the McCracken (2004) F-statistic testing the null hypothesis that there is no improvement
in the MSE. Horizon is in quarters. The p-values in parentheses and the critical values
(not reported here) are obtained by bootstrap using 10,000 simulated time series. The initial
estimation period is from 1960 to 1982 and the evaluation period is from 1983 to 2005.
Figure 1.1: Coefficients for 25 Fama-French Portfolios
This graph shows the regression coefficients of the 25 Fama-French portfolios on the
lagged CTF variable.
[Figure: three-dimensional plot of the coefficient (Coef, vertical axis, ticks from 0.04 to 0.18) by book-to-market group (BM: Low, 2, 3, 4, High) and size group (Size: Small, 2, 3, 4, Big).]
CHAPTER 2
We investigate the information content of the call (put) Early Exercise Premium,
or EEP , defined as the normalized difference in prices between otherwise
comparable American and European call (put) options. The call EEP specifically
captures investors’ expectations about future lump sum dividend payments as
well as other state variables such as conditional volatility and interest rates.
From that perspective, the EEP should also be related to future returns of
the underlying security. Little is known about the EEP , largely because it is
usually unobservable for most underlying securities. The FTSE 100 index is an
exception in that regard, because it has both American and European options
contracts that are traded in large volumes. We use data of the FTSE 100 index,
and its American and European options contracts, from which we compute a
time series of the EEP . Interestingly, we find that the EEP is a good forecaster
of returns at daily horizons. This forecastability is not due to time-variation in
market risk premia or liquidity. Importantly, we find that the predictability stems
primarily from the ability of the EEP to forecast innovations in dividend growth,
rather than other components of unexpected returns. Overall, we use several
empirical and simulation methods to establish predictability of the underlying
with an options market variable, link this predictability to information about
cash flow fundamentals, and thereby provide clear support for Black’s (1975)
conjecture that informed investors prefer to trade on their superior information
about fundamentals in the options market relative to the underlying.
2.1 Introduction
Black (1975) was one of the first to suggest that informed investors prefer to
trade on their superior information about fundamentals in the options market
rather than in the underlying asset market because they can easily take on more
leveraged positions. An important implication of this argument is that, over
short horizons, option prices will reflect news about fundamentals that is yet to
be incorporated into prices of the underlying security. Since the transmission
of information across markets, and particularly between the options and the
underlying market, is of central importance in finance, it is not surprising that
this conjecture has generated a lot of theoretical and empirical interest. On the
theoretical side, Biais and Hillion (1994), Easley, O’Hara, and Srinivas (1998)
and others elaborate and formalize the theoretical conditions under which Black’s
(1975) conjecture will hold.
In this paper, we take an altogether new look at the connection between
prices of options and the underlying security, and its link with information about
fundamentals. More specifically, we focus on the difference in prices between
otherwise comparable European and American call options, which is known as the
call early exercise premium, or call EEP . Merton (1973) was the first to show that
the call EEP must be zero if the underlying asset pays no dividends.1 Roll (1977),
Geske (1979), and Whaley (1981) prove that in the presence of a known lump
sum dividend, prices of European and American calls are not necessarily equal,
because American option holders might want to exercise the option right before
the ex-dividend date. In the more realistic case of multiple dividends that are
not known with certainty, the EEP will depend on both the expected magnitude
and the lumpiness of these dividends.2 Conditional on dividends being non-zero,
the EEP will also depend on other factors that affect option prices (volatility,
interest rates, etc.). Therefore, when dividends are non-zero and lumpy, changes
in expectations about future cash flows and discount rates of the underlying
asset ought to be reflected in a non-zero mean and variations of the early exercise
premium.
1
The intuition is that, in the absence of dividends, the intrinsic value obtained from
exercising an American call option is always less than the value of the option. An investor
would therefore rather sell the option in the open market than exercise it early.
2
When dividends are lumpy and non-uniform, the probability of early exercise is higher, and
the magnitude of the early exercise premium is dependent not just on the last dividend before
option maturity, but also on the other (lumpy) dividends over the life of the option.
We focus on the call EEP rather than on other option market predictors,
because of its close and unambiguous connection with lump sum dividends. The
arguments in Roll (1977), Geske (1979), and Whaley (1981) suggest that the call
EEP is very sensitive to changes in expectations about future lumpy dividends.
Furthermore, based on an empirical exercise of S&P 100 index options, Whaley
(1982) and Harvey and Whaley (1992a, 1992b) conclude that the magnitude
and timing of dividends is critically important in determining the early exercise
premium. More specifically, Whaley (1982) remarks that the “magnitude of the
early exercise premium is importantly influenced by the amount of the dividend
payment.” Harvey and Whaley (1992a) provide additional evidence and make
this point even more forcefully by concluding that: “From a practical standpoint
of pricing (or trading) S&P 100 index options, knowing the amount and timing of
S&P 100 index cash dividends appears to be critical.” In the context of the FTSE
100 index options, the dependence of the EEP on dividends should be even more
important, since FTSE 100 index dividends have been clustered once every two
weeks and have hence been highly lumpy and non-uniform. In contrast, predictors
such as the change in prices of American options, volume, open interest, and
even the put EEP depend not only on the lumpy dividends but on all the other
factors that affect option prices. For simplicity, by “EEP” we refer to the call
EEP unless otherwise specified.
Clearly, analytic arguments and extant empirical findings suggest that the
EEP is very sensitive to fluctuations about future dividend payments.3 We
conjecture that, to the extent that dividend expectations influence returns and
to the extent that Black’s (1975) argument holds, the EEP should also be a
particularly good predictor of underlying returns at short horizons of a few days–
horizons that lie within the period that it takes for information to be impounded
into underlying prices. In the context of Black (1975), if informed investors
trade primarily in the options market, their information will be incorporated
first in the markets for European and American options rather than in the market
3
Even though ex-post dividends tend to be highly persistent, there is a great degree of ex-
ante uncertainty about their innovations, and the importance of these is recognized both by
academics and practitioners.
for the underlying. Since dividend information specifically has a different effect
on American relative to European options, the EEP should capture dividend
information faster than the underlying.4
In this paper, we investigate the empirical relation between the early exercise
premium of call options and the returns of the FTSE 100 index using daily data
from June 1992 to January 1996. First, we describe the statistical properties of
the early exercise premium. We document that, for calls and puts, the average
EEP is non-zero and its magnitude is economically and statistically significant.
The time series of the EEP also exhibits significant serial correlation at horizons
up to one week. This finding suggests that the call EEP might be related to
lumpy dividend payments of the underlying index. We use the Longstaff and
Schwartz (2001) simulation approach to show that, indeed, a simple model which
4
The private information that can be differentially incorporated into prices is not necessarily
market-wide, but also firm-specific since only a handful of stocks go ex-dividend on any
particular ex-dividend date and the dividend expectations that influence the early exercise
premium relate to these firm-specific dividends.
5
The S&P 500 index had both American and European contracts from April 2, 1986 through
June 20, 1986.
calibrates the lump-sum dividends, the conditional volatility, and the interest
rate processes has little trouble replicating the magnitudes of the average EEP
and its serial correlation that are found in the data.
The ability of the EEP to forecast returns is markedly different from that
of the dividend yield, the volatility, and interest rates in two important respects.
From a statistical perspective, the EEP contains information about dividend
yields, volatility, and interest rates but does not suffer from the well-known
statistical problems (such as extreme persistence and low volatility) which have
rendered forecasting with these predictors quite problematic (Stambaugh (1999),
Ferson, Sarkissian, and Simin (2003), Torous, Valkanov, and Yan (2005)). The
autocorrelation of the EEP, while significantly different from zero, is not near
6
We will explain the institutional requirements in the data section.
the boundary of non-stationarity and will not significantly bias the estimates in
forecasting regressions. Also, the volatility of the EEP is actually larger than that
of the FTSE 100 returns. This is in contrast to the volatilities of other predictors,
which are at least an order of magnitude lower than that of the returns. These
appealing statistical properties of the EEP make it a suitable forecaster of returns,
especially at short horizons.
Third, we analyze the source of this forecasting relation using two alternative
approaches. First, we use Campbell’s (1991) VAR framework and decompose
realized returns into expected returns and shocks to dividend growth, excess
returns, and interest rates. We find that the EEP predicts mainly the dividend
shock component of the underlying index return. This result supports the
conjecture that the call EEP captures changes in expectations about future cash
flows of the underlying index. In a second approach, rather than relying on a
return decomposition, we test directly whether the EEP captures fluctuations in
future dividend growth. The results not only corroborate the VAR findings that
the EEP forecasts fluctuations in future dividend growth but they also suggest
that the predictive signal in the EEP is concentrated right before dividends are
announced. In sum, both the VAR and the direct regression approaches link the
predictability of the underlying returns by an option market variable to cash flow
news. To our knowledge, this link has not been previously established.
The last two results lead us to the conclusion that the call EEP is positively
related with subsequent underlying index returns at daily frequency mainly
because it contains information about future cash flows. These findings support
Black’s (1975) conjecture that informed investors prefer to trade in the options
market. The fact that we do not observe predictability beyond two-day horizons
implies that the options and the underlying markets are reasonably well
integrated. The last two findings are also consistent with the first empirical and
simulation results, which suggested that the EEP is sensitive to changes in cash
flows. Our results support the claims of Kothari and Shanken (1992) and Torous,
Valkanov, and Yan (2005) who argue that the commonly used proxies for expected
future dividends may contain measurement error and are too smooth to forecast
future returns at short horizons. The EEP responds rapidly to changes in cash
flows and is thus more suitable to detect short horizon predictability. In sum, the
novel contributions of this paper are to propose the call EEP as a short horizon
predictor of the underlying return, to argue for the economic reasons behind the
predictability, and to provide supporting empirical evidence. More broadly, we
establish predictability of the underlying returns with an options market variable,
link this predictability to information about cash flow fundamentals, and
thereby provide clear support for the Black (1975) conjecture.
The paper is structured as follows. We describe the dataset in Section 2.
In Section 3, we present summary statistics of the EEP and use simulations to
show that important statistical properties of the EEP can be replicated when
dividends, volatility, and interest rates are calibrated to the data. In Section 4,
we present the predictability results and link them to the ability of the EEP to
forecast changes in future dividend growth. We conduct a series of robustness
checks in Section 5 and conclude in Section 6 with some final remarks.
2.2 Data
We have three types of time series: values of the FTSE 100 index, prices of
European and American calls and puts on the FTSE 100 index, and other
variables, such as the short-term interest rate, the lumpy dividend stream of the FTSE
100 index, the volume of its shares traded, and the implied volatility of the
index. We describe each time series separately for clarity.
FTSE 100 Index and Index Futures: We compute daily log returns of the
FTSE 100 cash index over a four-year period from June 1992 to January 1996.
All the stocks are traded on the London Stock Exchange (LSE). The log returns
exclude dividend payments. The index return in excess of the one-month (risk-free)
UK interest rate is denoted by $R_t$. We also compute the daily log return of the
FTSE 100 index futures,7 which are traded on the London International Financial
Futures Exchange (LIFFE). The index futures return is denoted by $R_t^{fut}$.
FTSE 100 Index Options: We have all the bid-ask quotes recorded for
all European and American FTSE 100 index options traded on the London
International Financial Futures Exchange from June 1992 to January 1996.
7
The returns are computed from the “on-the-run” contracts. We roll over to the next on-
the-run contract one month before the current on-the-run contract expires.
Since these contracts were heavily traded, there were no designated market-
makers obliged to stand ready to buy and sell. Liquidity in these markets was
generated in a CBOT-style auction hand-signal pit-trading environment with
voluntary dealers and direct interaction of buyers’ and sellers’ agents. Nothing
in this process should generate any microstructure-related systematic differences
between European and American prices. To minimize possible data errors and
to make all contracts comparable, we apply several filters. For instance, we use
only synchronous quotes of the European and American option, or quotes that are
posted within 60 seconds of each other. This results in a sample of 47960 matched
quotes for call options and 41270 matched quotes for put options. We also exclude
prices lower than the intrinsic values. Furthermore, if we define moneyness as the
spot price divided by the strike price, S/K, we use only options whose moneyness lies
between 0.9 and 1.1. After these filters, our data include 41891 matched pairs of calls and
35961 matched pairs of puts.
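The filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the dataset: the `MatchedQuote` record, its field names, and `passes_filters` are hypothetical, and the intrinsic value is assumed to be max(S − K, 0) for calls and max(K − S, 0) for puts, applied to both the European and the American quote.

```python
from dataclasses import dataclass


@dataclass
class MatchedQuote:
    """One European/American quote pair (all field names are hypothetical)."""
    is_call: bool
    spot: float            # index level S at quote time
    strike: float          # strike price K
    european_price: float
    american_price: float
    european_time: float   # quote timestamps, in seconds
    american_time: float


def intrinsic_value(is_call: bool, spot: float, strike: float) -> float:
    # Assumed definition: payoff from immediate exercise.
    return max(spot - strike, 0.0) if is_call else max(strike - spot, 0.0)


def passes_filters(q: MatchedQuote, max_gap_seconds: float = 60.0,
                   min_moneyness: float = 0.9, max_moneyness: float = 1.1) -> bool:
    # Filter 1: the two quotes are posted within 60 seconds of each other.
    synchronous = abs(q.european_time - q.american_time) <= max_gap_seconds
    # Filter 2: neither price is below the intrinsic value.
    floor = intrinsic_value(q.is_call, q.spot, q.strike)
    above_intrinsic = q.european_price >= floor and q.american_price >= floor
    # Filter 3: moneyness S/K lies between 0.9 and 1.1.
    moneyness = q.spot / q.strike
    in_range = min_moneyness <= moneyness <= max_moneyness
    return synchronous and above_intrinsic and in_range
```

Applying `passes_filters` to every matched pair and keeping those that return `True` mirrors the sequence of filters in the text; the order of the checks does not affect which pairs survive.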
The European and American index option contracts have identical exercise
dates. At any time there are five different maturities of both types of options, one
month, two months, three months, four months and a long-dated one. For a given
maturity, American option exercise prices are multiples of 50 while European
option exercise prices are multiples of 25 but not 50. In order to directly compare
the prices of the American and the European option, we linearly interpolate
the prices volatilities of the two adjacent American options whose strike prices
straddle the strike price of the European option that we are trying to estimate.
With this method, we obtain the synchronous prices of American and European
contracts with the same strike price and maturity.
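As a rough sketch of the interpolation step, the helper below locates the two American strikes (multiples of 50) that straddle a European strike and interpolates linearly between the values quoted at those strikes. The function names are hypothetical, and the helper is agnostic about which quantity is interpolated (a price or an implied volatility).

```python
from typing import Tuple


def adjacent_american_strikes(european_strike: float, grid: float = 50.0) -> Tuple[float, float]:
    """Return the American strikes just below and just above the European strike."""
    lower = (european_strike // grid) * grid
    return lower, lower + grid


def interpolate_at_strike(european_strike: float,
                          lower_strike: float, lower_value: float,
                          upper_strike: float, upper_value: float) -> float:
    """Linear interpolation of the American value at the European strike."""
    weight = (european_strike - lower_strike) / (upper_strike - lower_strike)
    return (1.0 - weight) * lower_value + weight * upper_value


# Example: a European strike of 2975 sits between American strikes 2950 and 3000.
lower, upper = adjacent_american_strikes(2975.0)
value = interpolate_at_strike(2975.0, lower, 80.0, upper, 50.0)
```

Because European strikes are multiples of 25 but not 50, the European strike always falls midway between the two American strikes, so the interpolated value is the simple average of the two adjacent American values.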
of the European contract. We use the normalization mainly because the non-
normalized difference is affected by the index level, the volatility, and other
variables that enter the option pricing formula. Through the normalization, we
control for the level of these variables and measure the premium of the American
relative to the European contract. We are careful not to normalize by the level
of the index itself, because doing so would induce an automatic correlation with
next-period index returns.
8
While this interpolation introduces a bias in the EEP measure (because the EEP is not
exactly linear in the moneyness space), the bias is not large in economic or statistical terms.
We also provide a further robustness check for this interpolation method.
similar (and sometimes more significant) results.
The American index option holders also have a wildcard option. During
the sample period, the London options market closed at 4:10pm and the London stock
market closed at 4:30pm. American index option holders had the right to
exercise (but not trade) the option up to 4:31pm at a settlement price based on
the index level either at 4:10pm or later.
9
Easley, O’Hara, and Srinivas (1998) provide a theoretical model and convincing evidence
that options volume is related to future underlying returns. Unfortunately, we do not have
historical FTSE 100 option volume data.
this mandated synchronization, the dividend price ratio is distributed about once
every two weeks rather than uniformly. Second, U.K. companies typically pay
out semi-annually (rather than quarterly, as in the U.S.). This creates additional
lumpiness as well as uncertainty in dividends. Finally, while the dividend series
are ex-post persistent, there is significant ex-ante uncertainty about their actual
realizations, which is a source of concern for many financial analysts who follow
LSE-listed firms. In that respect, the U.K. and the U.S. stock markets are quite
similar.
In this section, we describe the statistical properties of the early exercise premium.
Table 2.1 presents the summary statistics of the EEP for calls and puts. To
facilitate comparison, all numbers are expressed in annualized percent except
the EEPs and the trading volume.10 We see that the average EEP of calls and
puts is very different from zero. For calls, the EEP is 3.5 percent with a standard
deviation of 1.8 percent. The EEP of puts is even higher at 7.6 percent with a
standard deviation of 4.3 percent. In comparison, the average annualized return
of the FTSE 100 index is 9.6 percent with a standard deviation of 12.5 percent
and the average annualized dividend yield is 4 percent with a standard deviation
of 2.2 percent. The EEPs are more volatile, very skewed and exhibit significant
kurtosis compared to the returns of the underlying asset. This is due to the
convex payoff of options and to their natural leverage.
10
We multiply the daily returns by the number of trading days, 252.
the EEPs, the index return and the index futures return. Column 1 shows the
autocorrelations of the call EEP and column 2 shows those of the put EEP. For
calls and puts, the EEPs are positively serially correlated and the correlations
are significant at up to 5 daily lags. In contrast, the FTSE 100 returns are
uncorrelated at all but the one-day lag, as can be seen from their autocorrelations,
which are displayed in column 3. Finally, the last column shows that the
partial autocorrelations of the index futures returns are close to zero at all lags.
The observed serial correlation in the EEPs implies that the difference in
European and American prices is not white noise. Since the EEP will not be
zero when the dividends are non-zero or are paid in lump sums, we conjecture
that the persistence might be due to dividend shocks. Whether the empirically
observed serial correlations can be generated by dividends is an issue that we
tackle next.
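The serial correlation diagnostics discussed above rest on sample autocorrelations at daily lags. A minimal sketch, assuming the EEP is stored as a plain list of daily observations (the function name is hypothetical):

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum((series[t] - mean) * (series[t - lag] - mean)
                    for t in range(lag, n))
    return numerator / denominator


# Toy example: a series that flips sign every day is strongly negatively
# correlated at lag 1 and strongly positively correlated at lag 2.
toy = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]
lag1 = autocorrelation(toy, 1)
lag2 = autocorrelation(toy, 2)
```

Evaluating the function at lags 1 through 5 gives the kind of profile reported for the EEPs; as a rule of thumb, values outside roughly ±2/√n are distinguishable from white noise in a sample of n observations.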
11
These papers attack the problem by solving a partial differential equation with a moving
boundary which is a problem that generally does not have a closed-form solution.
So does allowing for realistic fluctuations in the conditional variance and risk-free
interest rate.
We conduct two simulations. First, we examine how the dividend yield and
volatility affect the EEP in a constant dividend yield Black-Scholes model. The
goal of this exercise is to see whether realistic magnitudes of the lumpy dividends
and volatility can generate the empirically observed magnitudes of the average
EEP. In this simulation, we assume that the stock price follows a geometric
Brownian motion and that the interest rate, the dividend yield and the volatility are
all constant. The underlying stock pays a lumpy dividend in two weeks. Under
this setting, we calculate the price of a one-month at-the-money American option
with the Longstaff-Schwartz simulation. We then compare it with the price of
a European option with the same contract details and obtain the EEP.
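To make the first simulation concrete, the following is a minimal Longstaff-Schwartz sketch for a one-month at-the-money call on an asset that pays a single cash dividend halfway through the option's life. It is an illustration under assumptions, not the calibration used here: all parameter values, the quadratic regression basis, the 20-step grid, and the function names are hypothetical, and the dividend is modeled as a fixed cash drop in the price at the ex-dividend step. The EEP is computed as the American minus the European price, scaled by the European price.

```python
import numpy as np


def simulate_paths(s0, r, sigma, maturity, dividend, ex_step, n_steps, n_paths, seed):
    """Risk-neutral GBM paths; the price drops by `dividend` at step `ex_step`."""
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    shocks = rng.standard_normal((n_paths, n_steps))
    growth = np.exp((r - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * shocks)
    paths = np.empty((n_paths, n_steps + 1))
    paths[:, 0] = s0
    for i in range(1, n_steps + 1):
        paths[:, i] = paths[:, i - 1] * growth[:, i - 1]
        if i == ex_step:
            paths[:, i] = np.maximum(paths[:, i] - dividend, 1e-8)
    return paths


def call_eep_lsm(s0=100.0, strike=100.0, r=0.06, sigma=0.12, maturity=1.0 / 12.0,
                 dividend=2.0, ex_step=10, n_steps=20, n_paths=20000, seed=0):
    """Return (american, european, eep) for a call, using the same simulated paths."""
    paths = simulate_paths(s0, r, sigma, maturity, dividend, ex_step,
                           n_steps, n_paths, seed)
    discount = np.exp(-r * maturity / n_steps)
    cash = np.maximum(paths[:, -1] - strike, 0.0)      # payoff at maturity
    european = np.exp(-r * maturity) * cash.mean()
    for i in range(n_steps - 1, 0, -1):
        cash = cash * discount                          # value as of step i
        payoff = np.maximum(paths[:, i] - strike, 0.0)
        itm = payoff > 0.0
        if itm.sum() < 10:
            continue
        # Regress realized continuation values on a quadratic in S/K,
        # using in-the-money paths only.
        x = paths[itm, i] / strike
        fitted = np.polyval(np.polyfit(x, cash[itm], 2), x)
        exercise_now = np.flatnonzero(itm)[payoff[itm] > fitted]
        cash[exercise_now] = payoff[exercise_now]
    american = discount * cash.mean()
    return american, european, (american - european) / european


american, european, eep = call_eep_lsm()
```

With a dividend this large relative to the option's time value, most of the premium comes from exercising on the last cum-dividend date. Repeating the calculation over a grid of dividend and volatility values would trace out a surface of the kind plotted in Figure 2.1.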
Figure 2.1 plots the magnitude of the EEP for different magnitudes of the
dividend yield and volatility. The top panel shows that the levels of the simulated
call EEP are close to the average EEP that we observe in the data. For instance,
when we set the dividend yield and the volatility to their average values from
the data (4% and 12%, respectively), our simulation generates an EEP of 0.028.
Recall that, in Table 2.1, the mean of the call EEP is 0.035. Similar results
obtain for the put EEP , shown in the bottom panel of the figure.
Figure 2.1 also illustrates the sensitivity of the EEP to changes in the dividend
yield and volatility. In the top panel, the call EEP surface is monotonic in both
directions. The call EEP increases with the dividend yield. This is intuitive
because, as the dividend yield increases, American option holders have a stronger
incentive to exercise before the ex-dividend date in order to capture the high
dividends, which raises the premium. Importantly, when volatility is in the 0.10
to 0.15 range, the call EEP is more convex in the level of the dividend yield.
Hence, in normal volatility regimes, the EEP is very sensitive to changes in the
dividend yield.
The call EEP is quite flat in the volatility space when the dividend yield is low
and decreases with volatility when the dividend yield is high. This latter effect
is due to two factors. First, volatility boosts the option value when American
option holders face the exercise decision and reduces the chance of exercising
early. Second, our EEP measure scales the absolute premium by the price of the
European contract which increases with volatility. For put options, the EEP is
decreasing in the dividend yield as higher dividends will reduce the incentives for
early exercise. The put EEP is also decreasing in volatility for the same reasons
as calls.
In a second simulation, we examine the dynamics of the EEP under the risk-
neutral measure when the interest rate and dividend yield follow an AR(1) process
and the volatility follows a GARCH(1,1) process. More specifically, we investigate
whether we can reproduce the serial correlation in the EEP observed in the data.
To do so, we generate the risk-free rate, dividend yield, and conditional volatility
from
where $R^f_{t+1}$ is the risk-free rate, $DY_{t+1}$ is the dividend yield, and $\sigma_t^2$ is the
variance of excess returns. α and β are the GARCH(1,1) coefficients for $\sigma_t^2$,
and φ and ρ are the AR(1) coefficients for the risk-free rate and the dividend
yield, respectively. There are no risk premia for the state variables. Under this
dynamic setting, we simulate the underlying asset for 1000 steps, re-price the
same American option as above, and calculate the EEP . From the simulations
we obtain a time series of the EEP and calculate the AR(1) coefficients of the
call and the put EEP , respectively.
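The state-variable dynamics in the second simulation can be sketched as follows. This is an assumption-laden illustration: the exact functional forms are taken to be standard mean-reverting AR(1) processes for the risk-free rate and the dividend yield and a standard GARCH(1,1) recursion for the variance. Only ρ = 0.906 and β = 0.890 are values mentioned in the text; every other number (φ, α, the long-run means, the shock sizes) is hypothetical, as are the function names.

```python
import random


def simulate_state_variables(n_steps=1000, phi=0.95, rho=0.906, alpha=0.05, beta=0.890,
                             rf_mean=0.06, dy_mean=0.04, var_mean=0.12 ** 2,
                             rf_shock_sd=0.001, dy_shock_sd=0.002, seed=0):
    """Simulate the risk-free rate, dividend yield, and conditional variance."""
    rng = random.Random(seed)
    omega = var_mean * (1.0 - alpha - beta)   # keeps the long-run variance at var_mean
    rf, dy, var = [rf_mean], [dy_mean], [var_mean]
    for _ in range(n_steps - 1):
        return_shock = (var[-1] ** 0.5) * rng.gauss(0.0, 1.0)
        rf.append(rf_mean + phi * (rf[-1] - rf_mean) + rng.gauss(0.0, rf_shock_sd))
        dy.append(dy_mean + rho * (dy[-1] - dy_mean) + rng.gauss(0.0, dy_shock_sd))
        var.append(omega + alpha * return_shock ** 2 + beta * var[-1])
    return rf, dy, var


def ar1_coefficient(series):
    """OLS slope from regressing x_t on x_{t-1} (with a constant)."""
    x, y = series[:-1], series[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance = sum((a - mean_x) ** 2 for a in x)
    return covariance / variance


rf_path, dy_path, var_path = simulate_state_variables()
dy_persistence = ar1_coefficient(dy_path)
```

In the full exercise, the option would be re-priced at each simulated state to produce an EEP series, and `ar1_coefficient` would then be applied to that series rather than to the state variables themselves.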
Table 2.2 shows the simulation results of the AR(1) coefficients of the call and
put EEP for a set of different parameters of the data generating processes. The
first row uses the parameters that are estimated from our data. The call EEP
has an AR(1) coefficient of 0.361 which is remarkably similar to the one from the
data (0.377). In the second set of rows, we vary the persistence of the volatility.
The third and fourth set of rows show similar results for various persistence levels
of the risk-free rate and the dividend yield, respectively. These simulations show
that the more persistent the volatility, the dividend yield and the interest
rate are, the higher is the AR(1) coefficient of the call EEP.
In particular, the serial correlation of the call EEP is very sensitive to the
persistence of the dividend yield process. A change in ρ from 0.906 to 0.800
results in a drastic reduction of its AR(1) coefficient from 0.361 to 0.084, which
represents a decrease of 76.7 percent. The persistence of the risk-free rate and of
the volatility has a more significant impact on the observed serial correlation of
the put EEP than on that of the call EEP. For instance, a decrease in the GARCH
parameter β from 0.890 to 0.800 results in a reduction of the AR(1) coefficient of the
put EEP of 32.1 percent ((.171 − .252)/.252) and of the AR(1) coefficient of the call EEP of
22.1 percent ((.281 − .361)/.361). Since the φ and ρ parameters are likely to
be downward biased (Andrews (1993)), using higher values of these parameters
results in even higher serial correlation of both the call and the put EEPs.
These simulations suggest that the call EEP is particularly sensitive to the
level and the serial correlation of the lump sum dividends. In particular, the
ability to simulate an EEP process that, in the presence of lumpy dividends, is
very similar to the data leads us to conjecture that unexpected fluctuations in
the dividend yield process might be captured by the EEP . This is a hypothesis
that we investigate in the next section.
2.4.1 Predictive Regressions
To investigate whether the EEP contains information about future stock returns,
we run the daily predictive regression
where $R_{t+1}$ is the excess of the FTSE 100 index return over the
one-month UK Treasury rate from t to t + 1, $EEP_t$ is the daily call EEP and $X_t$
is a vector of additional predictive variables observable at t, such as the dividend
yield ($DY_t$), the risk-free rate ($R^f_t$), the implied variance of the at-the-money
option ($Var_t$), and the change in FTSE 100 share volume ($\Delta Vlm_t$). The dividend yield and
the other variables might not be perfect predictors, especially at short horizons,
but they help us understand whether the EEP contains additional information
and provide a yardstick against which to measure the information content of the EEP.
The results from different specifications of the regressions are shown in Table
2.3 Panel A. In column 1, we display the benchmark case of the dividend
yield as the only predictor of returns, because it has been used as a return
predictor in numerous studies (e.g., Campbell and Shiller (1988b)) with U.S.
data. The coefficient on the dividend yield is positive but not significant and
the variable explains little variation in daily excess returns because the dividend
yield is very smooth as discussed in Valkanov (2003). The t-statistics reported
in parentheses below the estimates are computed using Newey and West (1987)
heteroskedasticity and autocorrelation robust standard errors.
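A predictive regression of this kind, with Newey and West (1987) t-statistics, can be sketched as follows. The sketch assumes a linear specification in which next-day returns are regressed on a constant and day-t predictors; the function names, the choice of five lags, and the synthetic data are hypothetical and serve only to show the alignment of dates and the form of the robust covariance estimator.

```python
import numpy as np


def ols_newey_west(y, X, lags=5):
    """OLS with a constant; returns (coefficients, Newey-West t-statistics)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    residuals = y - X @ beta
    xu = X * residuals[:, None]                 # moment contributions x_t * u_t
    long_run = xu.T @ xu                        # lag-0 term
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)       # Bartlett kernel weight
        gamma = xu[lag:].T @ xu[:-lag]
        long_run += weight * (gamma + gamma.T)
    covariance = xtx_inv @ long_run @ xtx_inv
    return beta, beta / np.sqrt(np.diag(covariance))


def predictive_regression(returns, predictors, lags=5):
    """Regress returns at t+1 on predictors observed at t."""
    returns = np.asarray(returns, dtype=float)
    predictors = np.asarray(predictors, dtype=float)
    return ols_newey_west(returns[1:], predictors[:-1], lags=lags)


# Synthetic illustration only: returns load on the lagged predictor with slope 0.04.
rng = np.random.default_rng(1)
n = 2000
eep = rng.normal(0.035, 0.018, n)
noise = rng.normal(0.0, 0.0005, n)
returns = np.zeros(n)
returns[1:] = 0.04 * eep[:-1] + noise[1:]
beta, tstats = predictive_regression(returns, eep)
```

Additional controls can be passed as extra columns of `predictors`; the first element of `beta` is the intercept and the remaining elements follow the column order.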
In column 2 of Table 2.3 Panel A, we add the EEP, which measures the
premium of American relative to European call options. Its coefficient is positive
and statistically significant. This is one of the main results of the paper. Adding
the EEP also appears to improve the model fit, as the $R^2$ increases to 0.7 percent,
a modest level for a daily predictive regression. The sign is in line with what
we expect from economic intuition and from the simulations displayed in Figure
2.1 and Table 2.2. A higher and persistent EEP implies that investors expect
higher lump sum dividend payments and are ready to pay a higher premium for
American relative to European options.
horizons.
Finally, in column 5 we control for lagged changes in FTSE 100 share volume
over the previous day as a proxy for liquidity in the FTSE 100 market at day
t. Amihud and Mendelson (1986) and Pastor and Stambaugh (2003) show that
liquidity has a large impact on future returns. In Table 2.3, high liquidity precedes
lower future returns, which is consistent with previous findings. The change in
volume is the only significant variable, in addition to the EEP . However, it
must be noted that the inclusion of the change in volume does not alter the
point estimate or the significance of the EEP . Hence, the EEP ought to capture
information about future returns that is orthogonal to that in the other predictors.
To summarize the findings in Table 2.3, all variables enter with the
economically expected sign in predicting the FTSE 100 index return and replicate
studies for the US stock market. However, the only significant predictors at the
daily frequency are the EEP and the changes in trading volume. Whether the
12
The standard deviations of the variables are in Table 2.1.
forecasting ability of the EEP is spurious or the result of various microstructure
issues is something we investigate extensively below.
We have shown that the EEP predicts the market excess returns at a daily
horizon. However, there are two potential issues with the FTSE 100 stock index
results. First, they may be due to non-synchronous trading. Indeed, some stocks
in the index may not trade in the closing hour of the market and therefore our
results from the cash index may be due to stale quotes. Second, in our sample
period, the stock market trading ceases at 4:30pm and the options market closes
at 4:10pm. Although the stock market closes later than the derivative markets
and we are not using any future information when conducting our prediction
study, it is interesting to investigate whether the return predictability we found
is caused by the movement of the stock market between 4:10pm and 4:30pm.
Since this 20-minute window is also the period when the wildcard option can
be exercised, investigating the exact timing gives us one more way to control
for the effects of the wildcard option.
We address both of these issues by using the returns of the FTSE 100
index futures. The index futures contract is not subject to the non-synchronous trading
problem. Table 2.1 Panel B indicates that the index futures returns exhibit
little autocorrelation even at the one-day lag. This implies that the index
futures returns serve our goal of mitigating the stale price problem well. This is not
surprising because the futures contracts are actively traded with high liquidity.
Moreover, the futures market closes at 4:10pm, as does the options market.
Table 2.4 Panel A shows the same predictive regressions as in Table 2.3,
but the forecasted variable is the FTSE 100 futures return instead of the spot return.
In all specifications, our main predicting variable, the EEP , is economically and
statistically significant. The coefficients of the EEP are all equal to 0.043 across
all specifications and they are about 15% higher than those in the market excess
return prediction. Economically, the EEP is more important in predicting the
returns of the index futures and the statistical significance is comparable with
the previous case. All other predicting variables remain insignificant except the
changes in trading volume. One thing worth noting is that the point estimate
of the lagged return is negative and very close to zero. In other words, using
the index futures returns does help us eliminate the non-synchronous trading
problem. More importantly, we clearly see that our predictability is not due to
the 20-minute return before the stock market close. Therefore, it precludes the
possibility that the documented predictability is due to the wildcard option.
We have shown that the EEP predicts the market excess returns and the
index futures returns of the following day. Here we investigate longer horizon
predictability for two reasons. First, it is interesting to see how rapidly
information diffuses from the options market to the underlying asset. Second,
it is possible that microstructure-related dynamics could potentially generate
spurious predictability. In this section, we address these issues by examining
whether the EEP predicts the excess market and the index futures returns at
horizons of two days, three days, and up to two weeks.
In Panel B of Tables 2.3 and 2.4 we display the results for excess market
returns and index futures returns. Since the two sets of results are virtually
the same, we will focus on the case of excess market returns in Panel B of
Table 2.3. Column 1 in that panel contains the results from a forecasting
regression of two period returns, $R_{t+1,t+2}$. The point estimate of the EEP
is 0.035, slightly lower than the estimate of 0.036 obtained in the one period
regressions. The t-statistics and the $R^2$ are also lower. In columns 2 to 4,
we present the results where the forecasted variable is $R_{t+2,t+3}$, $R_{t+3,t+4}$, and
$R_{t+4,t+5}$, respectively. The coefficients of the EEP in these regressions are even lower
and become statistically insignificant. The explanatory power also declines. The
forecastability completely disappears at horizons longer than three days. Finally,
in column 5, in the forecast of $R_{t+6,t+10}$, the coefficient of the EEP goes down
further and the EEP does not carry any predictive power for the next week’s
weekly return. The high $R^2$ is evidently due to the overlapping of the weekly
returns.
The longer horizon results suggest that the forecastability of the EEP is
mostly observable at horizons of one and two days. At longer horizons, the
magnitude of the EEP coefficients decreases gradually and becomes insignificant
after two days or so. This pattern suggests that it takes about one to two days
for the stock market to digest the information in the EEP . The fact that our
findings are robust even when we include the lagged returns makes it unlikely
that the result is due to market microstructure effects. Given the results in this
section, from now on we concentrate on one-day returns, Rt+1 .
future lump sum dividend payments. We test this conjecture below using two
alternative approaches. The approaches differ with respect to the identification
and frequency of the dividend growth shocks. However, the empirical results are
remarkably similar which indicates that our findings are a robust feature of the
data.
The VAR results are shown in Table 2.5 where p = 1 and the order of the VAR
was chosen with sequential pre-testing. The first column is similar to Table 2.3
with the exception that the EEP is omitted from the system. We do not include
the EEP in the VAR because our goal is to understand whether it forecasts
$E_t(R_{t+1})$ or some components of $w_{t+1}$. Including the EEP in the VAR would
imply that, by construction, it would be uncorrelated with the residuals $w_{t+1}$
and it would be correlated with $E_t(R_{t+1})$. There are also economic reasons for
not including the EEP in the set of conditioning information. First, the EEP is
not directly observable for most assets. Unlike the dividend yield, the short rate
or the conditional variance, it is virtually inaccessible to investors. Second, as
argued above, the EEP is unlikely to be a good proxy for expected returns as are
the other variables in the VAR.
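\[
\begin{aligned}
w^{(R)}_{t+1} &= R_{t+1} - E_t(R_{t+1}) \\
&= (E_{t+1} - E_t)\sum_{j=0}^{\infty} \rho^j \Delta d_{t+1+j} - (E_{t+1} - E_t)\sum_{j=0}^{\infty} \rho^j rf_{t+1+j} - (E_{t+1} - E_t)\sum_{j=0}^{\infty} \rho^j R_{t+1+j} \\
&= \eta_{d,t+1} - \eta_{Rf,t+1} - \eta_{R,t+1},
\end{aligned}
\tag{2.2}
\]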
After estimating the VAR, the ex-post return is $R_{t+1} = \hat{E}_t(R_{t+1}) + \hat{\eta}_{d,t+1} -
\hat{\eta}_{Rf,t+1} - \hat{\eta}_{R,t+1}$, where a hat denotes estimated values. Since $R_{t+1}$ is forecastable
by the EEP, it is interesting to investigate where the forecastability is coming
from. We note that while $\hat{E}_t(R_{t+1})$ is uncorrelated by construction with $\hat{\eta}_{d,t+1}$,
$\hat{\eta}_{Rf,t+1}$, and $\hat{\eta}_{R,t+1}$, the latter three shocks to returns are correlated.
To disentangle the source of the predictability, we regress the four components
of realized return, $\hat{E}_t(R_{t+1})$, $\hat{\eta}_{d,t+1}$, $\hat{\eta}_{Rf,t+1}$, and $\hat{\eta}_{R,t+1}$, on the previous day EEP.
The results from these regressions are reported in Table 2.6. The EEP forecasts
changes in the dividend growth process. The coefficient for $\hat{\eta}_{d,t+1}$ is
positive and significant. These results are in agreement with economic intuition
and the simulation results in section 3. Higher unexpected lump sum dividends
lead to a larger American option premium and larger EEP , as the American
contract is more likely to be exercised prior to the ex-dividend date in order to
take advantage of the larger dividend payout. As we will see below, this result
is a robust feature of the data. This finding is also consistent with the findings
in Amin and Lee (1997), who document that option traders initiate a greater
proportion of long (short) positions a few days before good (bad) earnings news.
The EEP does not forecast changes in expected excess returns. The coefficient
for $\hat{\eta}_{R,t+1}$ in Table 2.6 is positive but insignificant. The
coefficient for $\hat{\eta}_{Rf,t+1}$ is negative but also insignificant, which implies
that an increase in the EEP leads to an (insignificant) increase in future returns
through an unexpected lowering of the interest rates.13 Finally, the EEP is
negatively correlated with forecasted next day returns, $\hat{E}_t(R_{t+1})$. The coefficient
is negative and is only significant at the ten percent level. The negative estimate
is probably due to the fact that the forecasted daily returns are a noisy proxy of
expected returns, which are better estimated at longer horizons. The sub-sample
findings presented in Table 2.6 will be discussed in the robustness section.
If we take the results from Table 2.6 at face value, the forecasting ability of the
EEP is due to its significant correlation with future changes in dividend growth.
To understand the economic significance of this correlation, we compute the effect
13
Lower unexpected interest rates lead to higher returns. Note that $\hat{\eta}_{Rf,t+1}$ enters with a
negative sign in equation (2.2).
of a one standard deviation shock to the EEP on subsequent returns. For the dividend
growth, this is 7.2 basis points (0.040 × 0.018). The standard deviation of $\hat{\eta}_{d,t+1}$
in the VAR is 97 basis points. In other words, a one standard deviation shock to the
EEP moves $\hat{\eta}_{d,t+1}$ by roughly 7 percent (7.2/97) of its standard deviation.
We have thus far showed that the EEP forecasts future changes in the dividend
growth process, where the innovations in dividend growth were obtained using
the Campbell and Shiller (1988a) VAR decomposition. It is reasonable to ask
whether this decomposition accurately identifies the dividend growth innovations
in returns. To answer this question, we take a more direct empirical approach
at isolating dividend growth shocks. Using the FTSE 100 dividends, Dt , we
construct a dividend growth rate series, DGt = log(Dt ) − log(Dt−1 ). The
dividends are available on the ex-dividend date for the FTSE 100 index. If the
hypothesis that the EEP contains information about future dividend growth rates
is correct, then the daily EEP series must forecast fluctuations in the DGt series.
The advantage of this approach is that the DGt series is directly observable and
does not have to be identified from the returns series.
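As a minimal illustration of how the DGt series is built from observed dividends, the sketch below computes log differences of a dividend series. The function name and the sample dividend values are hypothetical and are not taken from the FTSE 100 data.

```python
import numpy as np

def dividend_growth(dividends):
    """Return DG_t = log(D_t) - log(D_{t-1}) for a series of positive dividends."""
    d = np.asarray(dividends, dtype=float)
    if np.any(d <= 0):
        raise ValueError("dividends must be strictly positive to take logs")
    return np.diff(np.log(d))

# Hypothetical index dividends on four consecutive ex-dividend dates.
D = [2.0, 2.2, 1.8, 1.8]
DG = dividend_growth(D)
print(np.round(DG, 4))
```

The resulting series has one fewer observation than the dividend series, and an unchanged dividend gives a growth rate of exactly zero.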
Two obstacles stand in our way of investigating more directly whether the
EEP forecasts fluctuations in dividend growth rates. First, as mentioned above,
the lumpy DGt series is available at bi-weekly frequency, whereas the EEP series
is daily. Aggregating the EEP to a bi-weekly horizon is not suitable in this case,
because as shown in Table 2.3, the forecasting relation occurs at frequencies of no
more than a couple of days. In other words, running the forecasting regressions
at bi-weekly frequency would obfuscate the daily lead-lag effect. Second, there
are seasonalities in the dividends and dividend growth processes, which might
produce spurious correlation in a forecasting relation.
To address both of these concerns, we use the following mixed data sampling
(MIDAS) regressions (Ghysels, Santa-Clara, and Valkanov (2005b)).
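\[
DG_{Ht} = \alpha + \phi(L)\,DG_{H(t-1)} + \gamma \sum_{k=1}^{K} \beta(k, \theta)\,EEP_{t-k} + e_t \tag{2.3}
\]
where $\phi(L)$ is a lag polynomial of order $p$ and $\gamma \sum_{k=1}^{K} \beta(k, \theta)\,EEP_{t-k}$
is the MIDAS term. In that expression, we use lagged daily EEP s to forecast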
the bi-weekly dividend growth rates. In other words, the dividend growth rate
of, say, July 1st, 1995 will be regressed on p own lags as well as on lagged daily
EEP rates starting June 30th and going back K days.
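The alignment of daily EEP lags with bi-weekly dividend growth observations can be sketched as follows. This is an illustration under simplifying assumptions, not the estimation code behind Table 2.7: the lag weights are taken as given (hypothetical equal weights), the data are simulated, one own lag is used (p = 1), and all names are ours. With the weights fixed, the remaining coefficients can be recovered by ordinary least squares.

```python
import numpy as np

def midas_regressor(eep_daily, date_index, weights):
    """For each low-frequency date t, return sum_k weights[k-1] * EEP_{t-k}, k = 1..K.

    eep_daily  : 1-D array of daily EEP observations
    date_index : positions in eep_daily of the bi-weekly (ex-dividend) dates
    weights    : length-K array of lag weights, taken as given here
    """
    eep = np.asarray(eep_daily, dtype=float)
    w = np.asarray(weights, dtype=float)
    K = len(w)
    x = np.empty(len(date_index))
    for i, t in enumerate(date_index):
        if t < K:
            raise ValueError("not enough daily history before the first date")
        lags = eep[t - K:t][::-1]   # EEP_{t-1}, EEP_{t-2}, ..., EEP_{t-K}
        x[i] = w @ lags
    return x

rng = np.random.default_rng(0)
K, step, n = 45, 10, 116                     # 45 daily lags, one date every 10 trading days
eep = rng.normal(size=K + step * n)          # simulated daily EEP series
dates = np.arange(K, K + step * n, step)
w = np.full(K, 1.0 / K)                      # hypothetical equal weights summing to one
x = midas_regressor(eep, dates, w)

# Simulate DG with one own lag and a known gamma, then recover the coefficients by OLS.
alpha, phi, gamma = 0.0, 0.3, 3.0
dg = np.zeros(n)
for i in range(1, n):
    dg[i] = alpha + phi * dg[i - 1] + gamma * x[i] + 0.01 * rng.normal()
X = np.column_stack([np.ones(n - 1), dg[:-1], x[1:]])
coef, *_ = np.linalg.lstsq(X, dg[1:], rcond=None)
print(coef)   # estimates of alpha, phi, gamma in that order
```

When the weights themselves depend on unknown parameters, the regression is no longer linear in all coefficients and would have to be estimated by nonlinear least squares instead.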
Since the number of lagged daily EEPs needed to capture the dynamics of the
dividend growth rate might be large, the unrestricted specification of the weights
results in a lot of parameters to estimate. The cost of parameter proliferation is
that the coefficients will be estimated imprecisely and the regression will produce
poor out-of-sample forecasts. To reduce the number of coefficients to estimate,
we follow the MIDAS regression approach and parameterize the weights on
the EEPt−k using a function β(k , θ). The lag function is parsimoniously
parameterized and its parameters are collected in a vector θ. Ghysels, Santa-
Clara, and Valkanov (2005a) show that a suitable parameterization β(k , θ)
circumvents the problem of parameter proliferation and of choosing the truncation
point K . We also normalize the weights β(k , θ) to add up to one, which allows us
to estimate a scale parameter γ. The normalization is useful because γ captures
the overall predictive power of lagged EEPs, while the dynamics of the EEPs are
captured by the weights.
In general, there are many ways of parameterizing β(k , θ). We focus on the
Beta function specification (also used by Ghysels, Santa-Clara, and Valkanov
(2005a)), which has only two parameters, or θ = [θ1 ; θ2 ]:
\[
\beta(k, \theta) = \frac{f\left(\frac{k}{K}, \theta_1; \theta_2\right)}{\sum_{j=1}^{K} f\left(\frac{j}{K}, \theta_1; \theta_2\right)} \tag{2.4}
\]
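The normalized Beta weights in equation (2.4) can be computed as in the sketch below. It assumes that f is proportional to the Beta density kernel, z^(θ1−1) (1−z)^(θ2−1); that functional form is our assumption and should be checked against the definition of f used in the estimation. The function name is ours, and the computation works in logs because the parameter estimates reported in Table 2.7 are large.

```python
import numpy as np

def beta_weights(K, theta1, theta2):
    """Normalized Beta lag weights beta(k, theta), k = 1..K, as in equation (2.4).

    Assumes f(z, theta1, theta2) is proportional to z**(theta1 - 1) * (1 - z)**(theta2 - 1);
    any normalizing constant of the density cancels in the ratio.
    """
    z = np.arange(1, K + 1) / K
    with np.errstate(divide="ignore"):   # log(0) at z = 1 gives -inf, i.e. a zero weight
        log_f = (theta1 - 1.0) * np.log(z) + (theta2 - 1.0) * np.log1p(-z)
    log_f -= log_f.max()                 # rescale in logs to avoid numerical underflow
    f = np.exp(log_f)
    return f / f.sum()

# Point estimates of theta1 and theta2 as reported in Table 2.7, with K = 45 daily lags.
w = beta_weights(K=45, theta1=289.014, theta2=500.0)
peak_lag = int(np.argmax(w)) + 1         # lag (in days) carrying the largest weight
mass_15_to_18 = w[14:18].sum()           # total weight on lags 15 through 18
print(peak_lag, round(float(w.sum()), 6), round(float(mass_15_to_18), 3))
```

Under this assumed form of f, almost all of the weight falls on lags of roughly 15 to 18 days.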
In the second column of Table 2.7, we add the lagged daily EEP s, where K
is set to 45 days, or two months’ worth of daily returns.14 If the conjecture that
the early exercise premium contains information about future dividend growth is
correct, then we expect the γ coefficient to be positive and statistically different
from zero. Consistent with this conjecture, we obtain a γ estimate of 3.093. This
estimate is statistically significant at the 1 percent level. A joint F-test also
rejects, at the 1 percent level, the null hypothesis that all the MIDAS parameters
(γ, θ1 , and θ2 ) are equal to zero. Since the β(k , θ) function is
normalized to sum to one, we can interpret the coefficient estimates of γ as the
total impact of the lagged EEPs on future dividend growth.
First, most of the mass is concentrated on only four to five daily EEP s, which
suggests that the predictability is at short horizons. Otherwise, we would expect
to see more weight on a larger fraction of lagged EEP s. Second, the location
of the mass is on EEP s between 15 and 18 days before the ex-dividend date.
Dividend payments for the FTSE 100 index stocks are announced between 10
and 15 days before the ex-dividend date. That period is shaded
in the figure. The shape of the estimated weights clearly shows that
most of the predictability occurs right before the announcement period, which
14
We experimented with K as large as 130 days (about 6 months) and the results were almost
identical.
suggests that our MIDAS procedure accurately captures the timing of when
information is incorporated into prices. To summarize the findings in the figure,
the concentration of the mass and the location of the weights corroborate the
evidence from the previous section that the predictability is at short horizons
and it is due to news about dividend growth rates.
The EEP forecasts the underlying returns at short horizons. Moreover, its
predictive ability is related mostly to innovations to the dividend growth
component of returns rather than discount rates or expected returns. Both
of these findings are consistent with the view that information about future
cash flows is first revealed in option prices rather than in the price of the
underlying security. This result is consistent with Black’s (1975) view that
informed investors prefer to trade in the options market. Moreover, the very
short horizon nature of the predictability indicates that while information does
not flow instantaneously between the options and the underlying markets, it is
incorporated quite efficiently.
The daily lag in the information flow is consistent with Sims (2001) and Shiller
(2000) who explore the implications of limited information-processing capacity
for asset prices. These authors argue that investors, rather than possessing
unlimited-processing capacity, are better characterized as being only boundedly
rational. The inability of investors to immediately incorporate all relevant
information into prices gives rise to short horizon predictability across markets.
Hong, Torous, and Valkanov (2004) make a similar point by linking the slow
diffusion of cash flow information across industries to short horizon cross asset
return predictability. We are the first to document a similar phenomenon in
the options market, by linking the underlying return predictability to the EEP ’s
ability to forecast mainly innovations to dividend growth.
Our paper is related to several others that use option market information to
forecast underlying returns. Manaster and Rendleman (1982) show that if we take
the volatility as given and back out the implied stock price from option prices, this
implied stock price predicts the stock return one day ahead. Anthony (1988)
shows that shocks to option trading volume lead shocks to stock trading volume
by one day. However, Stephan and Whaley (1990), Chan, Chung, and Johnson
(1993) and others find no evidence that price changes in option markets lead
price changes in the underlying. Easley, O’Hara, and Srinivas (1998) find that
option market volume predicts underlying returns, which is consistent with the view
that informed investors trade in the options market. Pan and Poteshman (2004)
also find that option trading volume contains information about future stock
price movements and argue that the source of the predictability is non-public
information possessed by option traders. In relation to previous work, the novel
contributions of this paper are: (i) the introduction of the call EEP as a short
horizon predictor of the underlying return; (ii) an argument for the economic reasons
behind the predictability; and (iii) supporting empirical evidence that
links the predictability to news about future cash flows.
An alternative explanation for our findings might be that the early exercise
premium is purely driven by irrational financial market behavior which also has
an impact on underlying returns. While there is some evidence that individual
customers engage in irrational exercising of options, Poteshman and Serbin (2003)
show that larger traders exhibit no irrational exercise behavior. Hence, this is
not a compelling explanation for the FTSE 100 index options which are widely
held and traded by both individuals and institutional investors.
2.5 Robustness
In this section, we provide several robustness checks of the main results in Tables
2.3 and 2.6. Some of these tests are motivated by economic theory, while others
address statistical concerns.
Table 2.8 presents the predictive regression results in the two sub-sample
periods. The entire sample results (from Table 2.3) are also displayed in the
first column for convenience. We observe that the EEP predicts future FTSE
100 returns in both sub-samples even after controlling for all other commonly
used predictors. Interestingly, the point estimates of the EEP coefficient in the
sub-samples, 0.038 and 0.048, are very similar to that of the entire sample, 0.036.
Importantly, the estimates remain statistically significant despite the short sub-
15
Prior to July 18, 1994, the LSE followed a fixed date (rather than fixed period) settlement
regulation, in which all transactions within a two or three week “account settlement period”
were settled on the second Monday of the following account settlement period, making ex-
dividend dates two or three weeks apart. After July 18, 1994, even though the settlement
system changed to settle 5 trading days after a transaction, ex-dividend dates have largely
continued the historical practice of being only on the first day of the week, and typically every
two weeks.
samples. The slight reduction in the t-statistics is undoubtedly due to the fact
that we have fewer observations in the two sub-periods, which decreases the power
of our tests.
The sub-sample results in Tables 2.6 and 2.8 lead us to conclude that the
predictive ability of the EEP is a robust feature of the data.
Thus far, our focus has mainly been on the call EEP , even though the dividend
yield also has an impact on the EEP of put options. The concentration on the
call EEP was guided by two main reasons. First, the EEP of put options is
positive even when the underlying security does not pay dividends or when the
dividend stream is continuous. In contrast, a positive call EEP can arise only
when dividends are paid in lump sums. To put it differently, American put
options can be optimally exercised earlier than maturity for reasons other than
lumpy dividends. Therefore, the put EEP is not as sensitive and unambiguous
an indicator of expected future dividends as is the call EEP . The second reason
for not including put EEP in our main analysis is that higher dividend yields
increase the cost of exercising put options early, all else equal. Indeed, we have
seen in Figure 2.1 that the dividend yield impacts the EEP of call and put options
in opposite directions.
With these arguments in mind, we expect that the predictive ability of the
put EEP will be lower than that of the call EEP and the sign on the predictor
will be reversed. In Table 2.9, we use the put EEP to run the same predictive
regression as we did with the call EEP (Table 2.3). As expected, the coefficient
on the put EEP is negative, because higher put EEP indicates lower expected
future dividends, everything else equal. Also expected is the fact that the put
EEP coefficient is not statistically significant. While the point estimates are
stable in the sample and across sub-samples, the t-statistics are never above one.
As anticipated, the put EEP is a much noisier predictor of the underlying stock’s
returns, because it is a function of many other variables in addition to dividends.
The put EEP results serve as an additional robustness check that our findings
are not spurious. Indeed, it may be argued that the predictability is due
to market micro-structure differences between the options and the underlying
market. The fact that we do not observe the predictability with put EEP is a
clear demonstration that our results are not due to such automatic correlations
and indirectly supports our main premise.
2.5.3 Alternative EEP Aggregation Methods
In the construction of our EEP variable, we do not control for the time-to-
maturity of each contract. While the EEP certainly depends on the time span of
the options, this dependence is complicated by the timing and lumpiness of the
dividend payments and is therefore highly non-linear.
Panels A and B in Table 2.10 contain the results from the predictive
regressions with these two new EEP forecasters. The predictive regressions also
include the other forecasting variables. We provide the results for the entire
sample as well as for the two sub-samples. The results in Table 2.10 are very
similar to those of the previous tables in terms of magnitudes of the estimates as
well as statistical significance. These additional robustness checks are reassuring
that our results are not driven by the particular construction of the EEP .
2.6 Conclusion
In this paper, we use the call EEP to examine the information flow between
the stock market and the derivative market. We first show that the empirically
observed level and serial correlation of the EEP can be reproduced when the
dividend yield process is lumpy and persistent. Therefore, it is reasonable to
suspect that the observed EEP reflects the market participants’ expectation
about future dividends. We explore the information content of the EEP by
asking whether the EEP forecasts the FTSE 100 index return. Based on the
estimation results of time-series regression models, we further identify the source
of this predictability.
Our results show that in a time-series regression the EEP predicts the
underlying FTSE 100 index return at a daily horizon. Economically, this forecasting
relationship is about 50% stronger than that of the widely used benchmark, the dividend
yield. The economic and statistical significance is robust to the addition of other
control variables. We conjecture that the EEP predicts the underlying asset’s
return because it is a forward-looking variable that contains the information
about expected future dividends. We verify this hypothesis by decomposing
the realized returns into expected returns and three different components of
unexpected returns and find that the EEP indeed predicts the dividend shock
component of the index return. This result confirms our hypothesis that the EEP
reflects the option market’s expectation about future fundamentals.
Traditional literature has studied the information flow between the options
market and underlying asset market through option prices, volume and signed
volume. Our study introduces the EEP as a short-horizon predictor of underlying
returns, and links this forecastability to the fundamentals of the underlying
market. This link provides clear support for the Black (1975) conjecture that
informed investors prefer to trade on their information about fundamentals in
the options market rather than the underlying market.
Table 2.1: Summary Statistics
Panel A reports the summary statistics for all the variables. EEP Call and EEP Put represent
the EEP of call and put options. Rt is the FTSE 100 index return. Rtfut is the FTSE 100
index futures return. DYt is the one-month moving average of the dividend yield. Rft is the
one-month stochastically detrended risk free rate. Vart is the implied variance of the closest to the
money European call option. ∆Vlmt is the change in share volume in millions. All variables
except ∆Vlm are annualized. Panel B shows the partial autocorrelations of the call EEP , the
put EEP , the index return, and the index futures return. The t -statistics are in the parentheses.
Table 2.2: Simulating the Dynamics of the EEP
This table reports the dynamics of the EEP for calls and puts when the options are priced using
numerical valuation. Its purpose is to illustrate the serial correlation in the EEP for various
persistence levels of the underlying dividend yield, interest rate, and volatility processes. The
parameters α and β are the GARCH(1,1) coefficients for the underlying asset’s volatility. φ
and ρ are the AR(1) coefficients for the risk-free rate and the dividend yield, respectively. Each
sample simulates 1000 steps of the underlying asset. The EEP s for calls and puts are calculated
and the AR(1) coefficients are obtained from the calculated results.
Table 2.3: Predictive Regressions of Index Excess Returns
This table reports predictive regressions of excess returns by the early exercise premium and
other forecasters for different horizons. The dependent variable is the excess return of the
FTSE 100 index. DYt is the one-month moving average of the dividend yield. EEPtCall is the
early exercise premium of call options. Rt is the FTSE 100 index excess return (lagged). Vart
is the implied variance of the closest to the money European call option. Rft is the one-month
stochastically detrended risk free rate. ∆Vlm is the change in share volume (in million shares).
The t -statistics in parentheses are corrected for heteroskedasticity and autocorrelation. Panel
A compares the predictive regression of the next day’s index excess returns under different
specifications. Panel B examines the return predictability at longer horizons of up to two weeks
using all the predictors (most exhaustive specification from Panel A). The numbers in the
square brackets indicate the number of days by which the returns lead the explanatory variables.
For example, the column labeled [2, 3] shows the regression of Rt+2,t+3 on all the explanatory
variables at time t .
Table 2.4: Predictive Regressions of Index Futures Returns
This table reports predictive regressions of excess futures returns by the early exercise premium
and other forecasters for different horizons. This table is similar to Table 2.3 above with the
exception that the dependent variable is the return of FTSE 100 futures contracts rather than
the index itself. DYt is the one-month moving average of the dividend yield. EEPtCall is the
early exercise premium of call options. Rtfut is the FTSE 100 return of the futures contract
(lagged). Vart is the implied variance of the closest to the money European call option. Rft
is the one-month stochastically detrended risk free rate. ∆Vlm is the change in share volume
(in million shares). The t -statistics in parentheses are corrected for heteroskedasticity and
autocorrelation. Panel A compares the predictive regression of the next day’s index excess
returns under different specifications. Panel B examines the return predictability at longer
horizons of up to two weeks using all the predictors (most exhaustive specification from Panel
A). The numbers in the square brackets indicate the number of days by which the returns lead the
explanatory variables. For example, the column labeled [2, 3] shows the regression of Rt+2,t+3
on all the explanatory variables at time t .
Table 2.5: VAR Results
The table contains the vector autoregression (VAR) estimates of A in zt+1 = Azt + wt+1 where
the vector zt is demeaned and is defined as zt = [Rt , DYt , Vart , Rft , ∆Vlmt ]′ . Rt is the FTSE
100 index excess return. DYt is the one-month moving average of the dividend yield. Vart is
the implied variance of the closest to the money European call option. Rft is the one-month
stochastically detrended risk free rate. ∆Vlmt is the change in share volume in million. The
order of the VAR was chosen with sequential pretesting. The t -statistics in parentheses are
corrected for heteroskedasticity and autocorrelation.
Table 2.6: Identifying the Source of the Forecastability
This table reports the regressions of different components of returns on the EEP . Êt (Rt+1 ) is
the fitted value from the VAR regression. η̂R,t+1 represents news about future excess returns.
η̂d,t+1 represents news about cash flow. η̂Rf ,t+1 represents news about risk-free rate. All these
four components are regressed on the call option EEP at time t . The t -statistics in parentheses
are corrected for heteroskedasticity and serial correlation.
R
Sample Period ŵt+1 Êt (Rt+1 )
η̂d,t+1 η̂R,t+1 η̂Rf ,t+1
6/1/92 to 1/12/96 0.040 -0.000 0.001 -0.003
(2.642) (-0.394) (0.460) (-1.714)
R2 0.007 0.000 0.000 0.005
Table 2.7: Dividend Growth Forecast: MIDAS Regression
This table shows results for the following mixed-data sampling (MIDAS) regression of the bi-weekly
dividend growth rate (DGt ) on its own lags and lags of daily call EEPs:
$DG_t = \alpha + \phi(L)\,DG_{t-1} + \gamma \sum_{k=1}^{K} \beta(k, \theta)\,EEP_{t-k/14} + e_t$.
Details about the MIDAS regression are in the text. The t -statistics are in parentheses.
The F -statistic tests the MIDAS model against the benchmark
model in column 1 under the null hypothesis that lagged call EEP s do not forecast the dividend
growth rate. The F -statistic is shown and its p-value is in parentheses.
γ 3.093
(21.525)
θ1 289.014
(1.385)
θ2 500.000
(1.406)
R2 0.297 0.326
F 4.766
(0.004)
Sample Size 116 116
Table 2.8: Predictive Regression in Subsamples
This table reports predictive regressions of excess futures returns by the early exercise premium
and other forecasters for different sub-sample periods. In Panel A, the forecasted variable is
the index excess return, whereas in Panel B, it is the index futures contract return. DYt is
the one-month moving average of the dividend yield. EEPtCall is the early exercise premium
of call options. Vart is the implied variance of the closest to the money European call option.
Rft is the one-month stochastically detrended risk free rate. ∆Vlm is the change in share volume
in millions. The first column shows the whole sample period and the next two columns display
the results for the two sub-samples. The choice of the sub-samples is explained in the text. The
t -statistics in parentheses are corrected for heteroskedasticity and autocorrelation.
Panel A: Index Excess Return
Table 2.9: Robustness Check: Put Results
This table reports the predictive regression of return using the put EEP (instead of call EEP ).
The dependent variable is the excess return of the FTSE 100 index. DYt is the one-month
moving average of the dividend yield. EEPtPut is the early exercise premium of put options.
Rt is the (lagged) FTSE 100 index excess return. Vart is the implied variance of the closest to
the money European call option. Rft is the one-month stochastically detrended risk free rate.
∆Vlm is the change in share volume in million. The first column shows the whole sample period
and the next two columns show the result for two sub-samples. The t -statistics in parentheses
are corrected for heteroskedasticity and autocorrelation.
Table 2.10: Predictive Regression under Alternative EEP Aggregation
This table presents the predictive regression of the future return under alternative EEP
aggregation methods. The dependent variable is the excess return of the FTSE 100 index.
DYt is the one-month moving average of dividend yield. EEPtMat is the early exercise premium
of call options aggregated daily by interpolating both moneyness and time to maturity. EEPtAvg
is the early exercise premium of call options aggregated daily by averaging the EEP of all
contracts. Rt is the FTSE 100 index excess return. Vart is the implied variance of the closest
to the money European call option. Rft is the one-month stochastically detrended risk free rate.
∆Vlm is the change in share volume in millions. The first column shows the whole sample period
and the next two columns show the results for the two halves of the sample. The t -statistics in
parentheses are corrected for heteroskedasticity and autocorrelation.
Panel A: EEP Controlled for Maturity
Sample period    6/1/92 to 1/12/96    6/1/92 to 7/18/94    7/19/94 to 1/12/96
DYt              0.877                -0.945               1.761
                 (0.291)              (-0.204)             (0.410)
EEPtMat          0.026                0.029                0.036
                 (2.529)              (2.300)              (2.008)
The two graphs display the magnitudes of the call and put EEP s computed using numerical
valuations of at-the-money, one-month American option contracts. The risk free rate is 8%.
The top (bottom) graph displays the call (put) EEP for various levels of the dividend yield
and volatility.
[Figure: two surface plots, titled "Call EEP" (top) and "Put EEP" (bottom). Vertical axis: EEP (0 to 0.1 for calls, 0 to 0.12 for puts); horizontal axes: Dividend Yield (0 to 0.08) and Volatility (0.1 to 0.35).]
Figure 2.2: MIDAS Weights
This graph pictures the shape of the β coefficients against the lagged days in the following mixed-
data sampling (MIDAS) regression: $DG_t = \alpha + \phi(L)\,DG_{t-1} + \gamma \sum_{k=1}^{K} \beta(k, \theta)\,EEP_{t-k/14} + e_t$.
Dividend payments for the FTSE 100 index stocks are announced between 10 and 15 days before
the ex-dividend date. The shaded pattern denotes that period. The daily EEPs contain
information about future dividends 2 to 3 weeks before the ex-dividend date, right before
the announcement of the dividend payments.
[Figure: "MIDAS Weights". Vertical axis: Weight (0 to 0.5); horizontal axis: Daily Lags (0 to 45).]
CHAPTER 3
This paper examines a volatility estimation bias that may be commonly exhibited
by all option pricing models. Black and Scholes (1972) were the first to illustrate
the bias by showing that their model underpriced options on relatively low
variance stocks and overpriced options on relatively high variance stocks. The
bias is always observed in cross section among individual stocks. Thus, we
think this bias might have nothing to do with Black-Scholes or any option
pricing model but instead might be attributable to sampling error. If it is,
the bias should be observed with any option pricing model on any underlying,
not just equity, but also fixed income securities, mortgages, foreign exchange,
and commodities. To test this idea, we collect 100 months of call and put
equity option prices spanning 8 1/3 years from January, 1996 through April,
2004. The bias is indeed present and very significant in this sample. Alternative
variance estimators that use “shrinkage” techniques might be able to eliminate
the bias. We use the James-Stein shrinkage estimator detailed in Efron and Morris
(1976) and the shrinkage estimator of Ledoit and Wolf (2004b). While both shrinkage estimators utilize the
covariance matrix, Ledoit-Wolf (or LW hereafter) is unique because it does not
require matrix inversion. Thus the number of stocks can exceed the number
of observations, which is extremely advantageous for large portfolios. We show
that the variance bias can be eliminated using these improved estimators. Using
Theil’s decomposition, we study whether the prediction error is increased by the
“corrected” volatility shrinkage estimates while the volatility bias is eliminated.
We find that Stein estimators do increase the prediction error but the LW
estimator does not. We also find that the LW optimum shrinkage dominates
randomly chosen shrinkage factors. Finally we find there is differential optimal
shrinkage for stocks with different proportions of systematic and idiosyncratic
risks.
3.1 Introduction
The Black and Scholes (1973) option pricing model exhibits systematic mis-
pricing of options on individual stocks and options on indexes of stocks.
This mis-pricing has been related to moneyness (S /K ), time to expiration,
and volatility. The mis-pricing has also been related to the Black-Scholes
distributional assumption, to their assumption of no dividend payouts, and to
the model’s European rather than American nature.1
1
Black and Scholes (1972), Black (1975), Macbeth and Merville (1979), Rubinstein (1985),
Whaley (1982), Sterk (1982), Geske and Roll (1984b).
2
See Geske, Roll, and Shastri (1983).
these papers would potentially alter the prices of options on all individual stocks
without a particular focus on the observed cross-sectional mis-pricing of options
on low and high variance stocks. Thus, in this paper we examine
whether this variance bias observed in individual option cross-sectional prices
can be attributed to estimation error in the sample variance.
There is some a priori reason to suspect estimation error in the sample variance
rather than the model as the source of this particular mis-pricing. The reason is
that this variance related mis-pricing always arises in the context of an inter-stock
comparison. This is in contrast to other biases (moneyness, time to expiration),
which can be detected in an inter-option comparison. Unlike the strike price and
time until expiration parameters, the true variance is identical for all identical
expiration options on the same stock on a given date. Thus, investigation of the
variance related mis-pricing cannot rely on either the implied variance or other
more sophisticated option pricing models, but must instead be based on historical
estimates of actual stock return volatility.
There are many techniques to improve the accuracy of the volatility estimate
for individual stocks (cf. Boyle and Ananthanarayanan (1977), Parkinson
(1980), Garman and Klass (1980), Butler and Schachter (1986), and ARCH and
GARCH models). However, the essence of the present problem is that a number of
variances are estimated simultaneously, one for each stock, and then option mis-
pricing is related cross-sectionally to these multitudinous estimates.
versa for relatively smaller estimates. Thus, in a cross-sectional comparison of
option mis-pricing, estimation error alone will cause the model to over-price,
relative to the market, options on stocks with larger estimated variances and to
under-price options on stocks with smaller estimated variances. Because the
Black-Scholes model price is an increasing function of the sample variance,
mis-pricing should be positively related to estimated variance in the cross-section. This
is exactly the observed mis-pricing phenomenon.
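The mechanism can be illustrated with a small simulation. The sketch below is not based on the option data used in this paper: the cross-section of true variances, the number of stocks, and the number of observations are hypothetical, and returns are assumed to be normally distributed so that the sample variance has a scaled chi-square distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stocks, n_obs = 2000, 60

# Hypothetical cross-section of true return variances.
true_var = rng.uniform(0.02, 0.10, size=n_stocks)
# Under normal returns, (n_obs - 1) * s^2 / sigma^2 is chi-square with n_obs - 1 d.o.f.
sample_var = true_var * rng.chisquare(n_obs - 1, size=n_stocks) / (n_obs - 1)

order = np.argsort(sample_var)              # sort stocks by their estimated variance
decile = n_stocks // 10
err = sample_var - true_var                 # estimation error for each stock
low_bias = err[order[:decile]].mean()       # stocks with the smallest estimates
high_bias = err[order[-decile:]].mean()     # stocks with the largest estimates
print(f"lowest decile: {low_bias:+.4f}, highest decile: {high_bias:+.4f}")
```

Even though each sample variance is unbiased for its own stock, sorting on the estimates selects stocks whose estimation errors are on average negative in the lowest group and positive in the highest group.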
When many variances are being estimated, one for each stock, a James and
Stein (1961) estimator is unambiguously superior to the standard univariate
estimator. The James-Stein estimator reduces estimation risk on average over
all stocks. Such an estimator “shrinks” each individual variance estimate toward
a target such as the grand mean of all estimates. Since the variance bias is
characterized by over-pricing options on high volatility stocks and under-pricing
options on low volatility stocks, adjusting each estimated volatility toward the
average volatility for all stocks obviously has the potential to reduce the observed
variance bias. In the multiple variance estimation setting, the superior James-
Stein estimation technique has the potential to eliminate this problem.
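A simple version of shrinking toward the grand mean can be sketched as follows. The shrinkage intensity used here is an illustrative method-of-moments choice, not the Efron and Morris (1976) formula: it estimates the share of the cross-sectional dispersion of the variance estimates that is due to sampling noise, using Var(s^2) = 2σ^4/(n − 1) for normally distributed returns. The function name and the simulated data are hypothetical.

```python
import numpy as np

def shrink_to_grand_mean(sample_var, n_obs):
    """Shrink each sample variance toward the cross-sectional mean of all estimates.

    Illustrative intensity only: average sampling variance of the estimates divided by
    their cross-sectional dispersion, clipped to the interval [0, 1].
    """
    s2 = np.asarray(sample_var, dtype=float)
    grand_mean = s2.mean()
    noise = np.mean(2.0 * s2**2 / (n_obs - 1))      # average sampling variance of s^2
    total = s2.var(ddof=1)                          # cross-sectional dispersion of estimates
    lam = float(np.clip(noise / total, 0.0, 1.0))   # shrinkage intensity
    return lam * grand_mean + (1.0 - lam) * s2, lam

rng = np.random.default_rng(1)
n_stocks, n_obs = 2000, 60
true_var = rng.uniform(0.02, 0.10, size=n_stocks)   # hypothetical true variances
sample_var = true_var * rng.chisquare(n_obs - 1, size=n_stocks) / (n_obs - 1)

shrunk, lam = shrink_to_grand_mean(sample_var, n_obs)
mse_raw = np.mean((sample_var - true_var) ** 2)
mse_shrunk = np.mean((shrunk - true_var) ** 2)
print(f"intensity: {lam:.3f}, MSE raw: {mse_raw:.2e}, MSE shrunk: {mse_shrunk:.2e}")
```

In this simulated setting the shrunk estimates have a lower average squared error than the raw sample variances, because the extreme estimates are pulled back toward the center of the cross-section.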
Geske and Roll (1984a) observed the variance bias and were the first to
attempt a correction based on a version of Stein’s technique described in Efron
and Morris (1976).3 However, this particular “shrinkage” technique involves two
difficult questions. First, how much historical data should be used to estimate
individual stock variances? Second, toward what target should individual stock
3
Subsequent to Geske and Roll (1984a), several other papers confront the same volatility
problem. Karolyi (1993) uses a Bayesian approach. He describes the difference (p. 583) as
follows: “What distinguishes the Bayesian estimator of volatility from the “shrinkage” estimator
is in the adjustment process.” Karolyi considers only call options and he reports that the
Bayesian approach eliminates the volatility bias for high volatility stocks but there remains a
statistically significant but small bias for the low volatility stocks. Karolyi also reports that
the Bayesian estimator creates an under pricing bias in all the call options. Geske and Torous
(1990, 1991) use robust techniques to treat outliers when estimating volatility, and they also
examine the effects of a non-normal skewness and kurtosis on option prices.
variance estimates be shrunk? Until recently, the first question was usually
resolved by constraints on matrix inversion. The sample covariance matrix is
non-singular only when the time series sample size, N, exceeds the number of
stocks, k.4 Because of this requirement, smaller groups of stocks are often formed
to estimate parameters, and then results from the smaller groups are combined
and analyzed.
The second question of shrinkage target is more complex. The target should
have minimal free parameters (a lot of structure), should have less estimation
error, and should somewhat reflect the characteristics of the quantity to be
estimated. In three recent papers, Ledoit and Wolf (2003, 2004a, 2004b) have
introduced techniques that provide solutions to these requirements.
Ledoit and Wolf start with the sample covariance matrix because it is unbiased
and easy to calculate. They recognize that it is subject to estimation error,
especially when there are fewer time series observations than individual stocks,
which is often the case in financial applications. They also recognize that
an estimator with more structure would have less estimation error, but would
likely be mis-specified and biased. Thus, they find a compromise by computing
an optimal linear convex combination of the sample covariance matrix and a
structured target. They provide results for three targets, the Sharpe single index
model, the identity matrix, and a constant correlation model. Herein we compare
a version of the James-Stein estimator to the Ledoit-Wolf technique. For Ledoit-
Wolf we shrink toward the simplest target, the identity matrix, which is well
conditioned, structured, and parsimonious.
alternative variance estimators. Section 4 reports the results and shows that
the shrinkage techniques of Stein and Ledoit-Wolf both eliminate the variance
bias, but that the Ledoit-Wolf technique is superior with respect to prediction
error. Section 5 concludes.
The data come from CRSP for daily stock returns and from Option Metrics (OM)
for call and put option prices, dividend distributions, and implied volatilities. The
OM data span the 100 months from January, 1996 through April, 2004 inclusive.
Stocks are screened one way and options are screened five ways. To assure
that stocks are actively traded, we use only the 500 largest stocks by market
capitalization on the last trading day of the previous year. Stocks are limited
to common shares with share codes 10 or 11. For options, the first screen limits
observations to the first trading day of each calendar month. This potentially
provides 100 monthly observations of options on 500 individual stocks and allows
estimators of volatility to be computed with return observations through the end
of each preceding month. The second screen limits options to being near-the-
money, which we define as 0.95 < K/S < 1.05 (with K the strike price and
S the stock price on the first day of the month). Near-the-money options are
the most actively traded of all options with different times to expiration, and
since these are options on large companies they usually trade many times every
day. Also, near-the-money options should exhibit less moneyness bias. The third
option screen restricts the sample to options expiring on the third Friday of the
next month. Thus, all options have the same short time to expiration, which
should control somewhat for any time bias. Short-maturity options are also the
most actively traded of all options with different strike prices. Thus, near-the-
money, short-maturity options on large stocks should trade many times every
day. The fourth screen restricts options to those that actually did trade on each
day. The fifth option screen eliminates any detectable arbitrage violations (e.g.,
prices that violate the lower bounds $C > S - Ke^{-rT}$ or $P > K - S$). After these screens, the sample has on average
about 300 call options and 200 put options per month.
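As an illustration only, option screens two through five can be sketched in Python as follows. This is a minimal sketch, not the code used in this study: the record layout `OptionQuote`, its field names, the helper `third_friday`, and the risk-free rate input are all hypothetical, and the arbitrage check assumes the standard lower bounds for calls and American puts.

```python
import math
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class OptionQuote:
    # Hypothetical record layout; field names are illustrative only.
    is_call: bool
    price: float      # option price
    strike: float     # K
    spot: float       # S on the first trading day of the month
    volume: int       # contracts traded on that day
    quote_date: date  # first trading day of the month
    expiry: date
    rate: float       # annualized risk-free rate (assumed input)


def third_friday(year, month):
    """Return the date of the third Friday of the given month."""
    first = date(year, month, 1)
    offset = (4 - first.weekday()) % 7  # weekday(): Monday=0, Friday=4
    return first + timedelta(days=offset + 14)


def passes_screens(q):
    """Apply screens two through five to a single quote."""
    # Screen 2: near-the-money, 0.95 < K/S < 1.05.
    near_money = 0.95 < q.strike / q.spot < 1.05
    # Screen 3: expires on the third Friday of the next month.
    next_year = q.quote_date.year + (1 if q.quote_date.month == 12 else 0)
    next_month = q.quote_date.month % 12 + 1
    next_month_expiry = q.expiry == third_friday(next_year, next_month)
    # Screen 4: the option actually traded on that day.
    traded = q.volume > 0
    # Screen 5: price respects an assumed standard lower bound.
    t = (q.expiry - q.quote_date).days / 365.0
    if q.is_call:
        bound = q.spot - q.strike * math.exp(-q.rate * t)
    else:
        bound = q.strike - q.spot
    no_arbitrage = q.price >= max(bound, 0.0)
    return near_money and next_month_expiry and traded and no_arbitrage
```

The first screen (first trading day of each calendar month) is taken as given through `quote_date`, since it depends on an exchange calendar that is not modeled here.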
Historical volatilities are computed for each individual stock using 126 days
(approximately 6 months) of previous CRSP daily data preceding each of the 100
first day of month observations for the stock price and option prices. Stock βs are
calculated using 504 days (approximately 2 years) of daily data preceding each of
the 100 first day of month observations, using the CRSP value weighted return as
the market index. (βs are inputs for the particular Stein estimator that assumes
a one-factor structure for the covariance matrix.) We also compute the sample
covariance matrix for all stocks in the sample using the preceding 6 months of
CRSP daily data; this is an input for the Ledoit-Wolf estimator.
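The two rolling-window inputs described above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the annualization factor of 252 trading days is an assumption, since the text states only that volatilities are annualized.

```python
import numpy as np

TRADING_DAYS_PER_YEAR = 252  # assumption: annualization convention


def historical_volatility(daily_returns, window=126):
    """Annualized sample standard deviation of the last `window` daily returns."""
    r = np.asarray(daily_returns, dtype=float)[-window:]
    return np.std(r, ddof=1) * np.sqrt(TRADING_DAYS_PER_YEAR)


def market_beta(stock_returns, market_returns, window=504):
    """OLS slope of stock returns on market returns over the last `window` days."""
    y = np.asarray(stock_returns, dtype=float)[-window:]
    x = np.asarray(market_returns, dtype=float)[-window:]
    cov = np.cov(x, y, ddof=1)  # 2 x 2 sample covariance of (market, stock)
    return cov[0, 1] / cov[0, 0]
```

Both functions take the most recent observations of the series they are given, so the caller is responsible for truncating each series at the end of the preceding month.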
where $\hat{\sigma}_{sj}^2$ is the Stein estimator for stock $j$, $\hat{\sigma}_{Hj}^2$ is an historical estimate for the
same stock, $\bar{\sigma}^2$ is the grand cross-sectional average of all the historical estimates,
and $\gamma_j$ is a shrinking intensity factor bounded between zero and one.
$$\hat{S}_s = \left[ \frac{N-k-2}{N-1}\,\hat{S}_H^{-1} + \frac{k+1-\frac{2}{k}}{N-1}\,(\bar{\sigma}^2 I)^{-1} \right]^{-1} \qquad (3.2)$$
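A minimal numerical sketch of equation (3.2) follows. It assumes that the weights are $(N-k-2)/(N-1)$ on the inverse sample covariance and $(k+1-2/k)/(N-1)$ on the inverse target, and that $\bar{\sigma}^2$ is the average of the diagonal of $\hat{S}_H$; the function name is hypothetical and this is not the code used in this study.

```python
import numpy as np


def efron_morris_shrink(S_H, N):
    """Shrink a k x k sample covariance S_H toward sigma_bar^2 * I in precision space.

    N is the number of time series observations used to estimate S_H.
    """
    S_H = np.asarray(S_H, dtype=float)
    k = S_H.shape[0]
    if N - k - 2 <= 0:
        raise ValueError("the formula requires N > k + 2")
    sigma_bar2 = np.trace(S_H) / k            # grand mean of the variances
    w_sample = (N - k - 2) / (N - 1)          # assumed weight on inv(S_H)
    w_target = (k + 1 - 2.0 / k) / (N - 1)    # assumed weight on inv(target)
    precision = w_sample * np.linalg.inv(S_H) + w_target * np.eye(k) / sigma_bar2
    return np.linalg.inv(precision)
```

Because the combination is formed on inverses, each eigenvalue of the result is close to a weighted harmonic mean of the sample eigenvalue and $\bar{\sigma}^2$, which compresses the spread of the eigenvalues. The explicit inversion is also why the group size $k$ must stay well below $N$.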
5
See Efron and Morris (1975). They discuss Stein’s rule as an empirical Bayes rule,
and present applications such as predicting baseball batting averages, estimating toxoplasmosis
prevalence rates, and estimating the exact size of Pearson’s chi-square test.
of the same order as the total size of the data available. When k is larger than
N , the sample covariance matrix is always singular, even if the true covariance
matrix is known to be non-singular. Muirhead (1987) reviews the literature
on shrinkage estimators of the covariance matrix and shows that they all suffer
from two major limitations: (i) they break down when k > N and the matrix
cannot be inverted; (ii) they do not utilize a priori knowledge about correlations
between stock returns. We can circumvent the second limitation by assuming
that asset returns follow a factor model, say the single-factor market model akin
to the CAPM. Therefore the off-diagonal entry $(i, j)$ of $\hat{S}_s$ is simply $\hat{\beta}_i \hat{\beta}_j \hat{\sigma}_m^2$. By
imposing more structure in this fashion, one can make the sample covariance
matrix behave. Ledoit-Wolf techniques circumvent both of these problems.
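$$m \equiv \langle S, I \rangle \xrightarrow{qm} \mu \qquad (3.3)$$
$$d^2 \equiv \|S - mI\|^2, \qquad d \xrightarrow{qm} \xi \qquad (3.4)$$
$$b^2 = \min(\bar{b}^2, d^2), \quad \text{where } \bar{b}^2 = \frac{1}{N^2} \sum_{i=1}^{N} \|X_{\cdot i} X_{\cdot i}^T - S\|^2, \qquad b \xrightarrow{qm} \beta \qquad (3.5)$$
$$a^2 \equiv d^2 - b^2; \qquad a \xrightarrow{qm} \alpha \qquad (3.6)$$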
6
See Ledoit and Wolf (2004b). The squared Frobenius norm $\|\cdot\|^2$ is a quadratic form whose
inner product is $\langle XX' \rangle = \mathrm{tr}(XX')/N$, and the four unobservable scalars are $\mu = \langle \Sigma, I \rangle$,
$\alpha^2 = \|\Sigma - \mu I\|^2$, $\beta^2 = E[\|S - \Sigma\|^2]$, and $\xi^2 = E[\|S - \mu I\|^2]$, where $\Sigma$ is the true covariance
matrix and convergence is in quadratic mean, qm.
their linear combination of S and I that minimizes the expected quadratic loss
is:
$$S^* = \frac{b^2}{d^2}\, mI + \frac{a^2}{d^2}\, S \qquad (3.7)$$
Now if $\gamma$ is defined as $\gamma \equiv b^2/d^2$, then $S^* = \gamma mI + (1 - \gamma)S$.
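The quantities in equations (3.3) through (3.7) can be computed directly from a matrix of returns. The sketch below is illustrative only. It assumes that returns are already demeaned, that $S$ is the sample covariance with divisor $N$, and that the norm is normalized by the matrix dimension $k$ as in Ledoit and Wolf (2004b); the normalization cancels in the ratio $b^2/d^2$. The function name is hypothetical.

```python
import numpy as np


def ledoit_wolf_identity(X):
    """Shrink the sample covariance toward m * I.

    X is an (N observations) x (k stocks) array of demeaned returns.
    Returns the shrunk matrix S_star and the intensity gamma = b^2 / d^2.
    """
    X = np.asarray(X, dtype=float)
    N, k = X.shape
    S = X.T @ X / N                      # sample covariance, k x k
    I = np.eye(k)

    def sqnorm(A):
        # Squared Frobenius norm, normalized by dimension (assumption).
        return np.trace(A @ A.T) / k

    m = np.trace(S) / k                  # <S, I>: average sample variance
    d2 = sqnorm(S - m * I)               # dispersion of S around m * I
    b_bar2 = sum(sqnorm(np.outer(x, x) - S) for x in X) / N**2
    b2 = min(b_bar2, d2)                 # estimation error, capped at d^2
    gamma = b2 / d2 if d2 > 0 else 0.0   # if d2 == 0, S already equals m * I
    S_star = gamma * m * I + (1.0 - gamma) * S
    return S_star, gamma
```

No matrix inversion appears anywhere in this calculation, so it runs unchanged when $k$ exceeds $N$. The shrunk volatility of each stock is the square root of the corresponding diagonal entry of `S_star`.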
We shrink the standard historical volatility estimates in three ways (two Stein
and one Ledoit-Wolf) and then compare the four estimators (including the
historical). We form groups of 50 stocks each for Stein because it requires
the cross-section of individual stocks, k , to be smaller than the time series of
observations, N (herein N = 126). The two Stein estimators differ because
the first groups stocks randomly while the second estimator groups stocks to
maximize the volatility dispersion within each group. To achieve the volatility
dispersion, we first sort all the 500 stocks by their historical volatilities and
allocate the stocks ranked 1, 11 . . . 481, 491 to the first group, 2, 12, . . . to the
second group, and similarly for all 10 groups.
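The two grouping schemes can be sketched as follows. This is an illustrative sketch with hypothetical function names; groups are labeled 0 through 9 rather than 1 through 10, and the random seed is arbitrary.

```python
import numpy as np


def dispersion_groups(volatilities, n_groups=10):
    """Assign stocks ranked 1, 11, 21, ... by volatility to group 0,
    stocks ranked 2, 12, 22, ... to group 1, and so on."""
    vol = np.asarray(volatilities, dtype=float)
    order = np.argsort(vol)                    # indices from lowest to highest
    groups = np.empty(len(vol), dtype=int)
    groups[order] = np.arange(len(vol)) % n_groups
    return groups


def random_groups(n_stocks, n_groups=10, seed=0):
    """Assign stocks to equally sized groups at random."""
    rng = np.random.default_rng(seed)
    return rng.permutation(n_stocks) % n_groups
```

With 500 stocks and 10 groups, each group holds 50 stocks, and every dispersion group spans the full range of volatility ranks.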
Table 3.1 provides summary statistics for all the volatility estimators.
The statistics presented are time-series means of the cross-sectional summary
statistics. On average, all Stein-type estimators have a lower mean than both the
original historical estimates and the Ledoit-Wolf estimator. The reason is that
all the Stein-type estimators involve matrix inversions, which decrease the average
by Jensen’s inequality. Moreover, all shrinkage estimators exhibit lower
cross-sectional dispersion, as expected from the shrinkage process. The
LW estimator still preserves more cross-sectional variation than
the Stein estimators, which suggests that the LW shrinkage intensity is
effectively smaller.
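$$\mathrm{Error}_{\mathrm{estimator}} = \ln\left(\frac{\sigma_{\mathrm{imp}}}{\sigma_{\mathrm{estimator}}}\right) \qquad (3.8)$$
Then the following cross-sectional regressions are computed for each month:
$$\log\left(\frac{\sigma_{\mathrm{imp},i,t}}{\hat{\sigma}_{i,t}}\right) = \alpha + \beta \hat{\sigma}_{i,t} + \gamma X_{i,t} + \epsilon_{i,t} \qquad (3.9)$$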
Following Fama and MacBeth (1973), time series means of the cross-sectional
coefficients are compared against time series standard errors computed using a
Newey-West autocorrelation correction with 8 lags.
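The testing procedure can be sketched as follows: a cross-sectional OLS regression for each month, followed by a t-statistic on the time series mean of each coefficient. This is a minimal sketch with hypothetical function names; it assumes a Bartlett kernel for the Newey-West long-run variance and is not the code used in this study.

```python
import numpy as np


def newey_west_se_of_mean(x, lags=8):
    """Newey-West standard error of the sample mean of a time series."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    e = x - x.mean()
    lrv = e @ e / T                              # lag-0 autocovariance
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)        # Bartlett kernel weight
        lrv += 2.0 * weight * (e[lag:] @ e[:-lag]) / T
    return np.sqrt(lrv / T)


def fama_macbeth(monthly_samples, lags=8):
    """monthly_samples is a list of (y, X) pairs, one per month.

    y holds log(sigma_imp / sigma_hat) for each option; X holds the
    regressors (sigma_hat and any controls) without a constant.
    Returns the mean coefficients and their t-statistics, with the
    intercept first.
    """
    coefs = []
    for y, X in monthly_samples:
        y = np.asarray(y, dtype=float)
        design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        coefs.append(beta)
    coefs = np.array(coefs)                      # months x coefficients
    means = coefs.mean(axis=0)
    ses = np.array([newey_west_se_of_mean(c, lags) for c in coefs.T])
    return means, means / ses
```

A significantly negative mean coefficient on $\hat{\sigma}_{i,t}$ indicates the volatility bias: the estimator is too high relative to implied volatility for high volatility stocks and too low for low volatility stocks.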
Table 3.2 presents the main results from these regressions. The historical
volatility column reports the coefficients and test statistics for equation (3.9)
when volatility is computed with the standard historical method; it shows clearly
the extent of the previously observed volatility bias. The coefficient is large and
negative, −0.390, and very significant (t = −8.686). This is consistent with the
finding in Black and Scholes (1972) that, in the cross-section, options on low (high)
volatility stocks are underpriced (overpriced) by their model.
In the next three columns of the panel using calls, the coefficients
(−0.057, 0.119, −0.029) for the three shrinkage estimators and their test statistics
(−0.044, 0.883, −0.869) for James-Stein Random, James-Stein High Dispersion,
and Ledoit-Wolf show that the volatility bias has been eliminated. The control
for moneyness reveals that the moneyness bias is significant and is independent
of the volatility bias. Puts display the same pattern throughout. Thus,
we conclude that both Stein and Ledoit-Wolf shrinkage techniques are able to
eliminate this volatility bias of underpricing options on low volatility stocks and
overpricing options on high volatility stocks.7
7
In unreported results, we also examine other variants of the James-Stein estimators
with differing assumptions about the covariance matrix target. One target assumes that all
covariances are the same and equal to the average sample covariance. The other target assumes
that all covariances are zero. These calculations were again carried out with randomly sorted
groups and with groups organized to maximize intra-group volatility dispersion. In all cases,
the results are essentially the same as those reported in all tables (3.1 through 3.6 inclusive), for
the James-Stein estimators. The authors will be happy to provide detailed results to interested
readers.
Stein 2 (disperse) are larger (0.058 and 0.054) and very significantly different from
the historical estimator (t-stats of 3.843 and 3.460). However, the prediction error
for the Ledoit-Wolf estimator, 0.040, is almost the same size as the uncorrected
historical prediction error, 0.040, and the two are not significantly different. For
put options the results are very similar, with the only difference being that the
Ledoit-Wolf estimator now has the lowest prediction error, 0.035, but it is not
statistically different from the uncorrected historical estimator prediction error,
0.037. Thus, we see that while the Stein shrinking does eliminate the volatility
bias, it also increases the prediction error, and this increased error is statistically
significant. The Ledoit-Wolf estimator does not have this problem.
we define the following relative indicator of systematic risk for each stock i at the
beginning of month t, based on approximately two prior years of daily returns:
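$$\mathrm{Sys}_{i,t} = 1 - \left(\frac{\sigma_{\mathrm{idiosyncratic},i,t}}{\sigma_{\mathrm{historical},i,t}}\right)^2 \qquad (3.10)$$
$$\mathrm{Shrinkage}_{i,t} = \frac{\sigma_{\mathrm{shrunk},i,t}^2 - \sigma_{\mathrm{historical},i,t}^2}{\sigma_{\mathrm{historical},i,t}^2} \qquad (3.11)$$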
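Equations (3.10) and (3.11) can be illustrated with the short sketch below. It assumes that idiosyncratic volatility is the residual standard deviation from a single-factor market model regression and that both volatilities in (3.10) are measured over the same window, in which case the indicator equals the regression R-squared. The function names are hypothetical.

```python
import numpy as np


def systematic_share(stock_returns, market_returns):
    """Sys = 1 - (sigma_idiosyncratic / sigma_historical)^2, equation (3.10)."""
    y = np.asarray(stock_returns, dtype=float)
    x = np.asarray(market_returns, dtype=float)
    beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    alpha = y.mean() - beta * x.mean()
    resid = y - alpha - beta * x               # market model residuals (assumption)
    return 1.0 - np.var(resid, ddof=1) / np.var(y, ddof=1)


def shrinkage_proportion(sigma_shrunk, sigma_historical):
    """Relative change in variance caused by shrinking, equation (3.11)."""
    return (sigma_shrunk**2 - sigma_historical**2) / sigma_historical**2
```

For example, shrinking a historical volatility of 0.40 to 0.30 gives a shrinkage proportion of (0.09 − 0.16)/0.16, a negative value because the variance estimate was reduced.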
Table 3.4 presents the results. For the Stein estimators, as the systematic
portion of the risk increases, the negative coefficient indicates that the difference
between the corrected and uncorrected estimators decreases. Thus, it appears
that Stein’s shrinkage percentage decreases (increases) as the portion of
systematic risk increases (decreases), or alternatively, Stein shrinks more if the
proportion of idiosyncratic risk is larger. The Ledoit-Wolf estimator appears to behave
in the opposite way: as the proportion of systematic risk increases,
the Ledoit-Wolf shrinkage percentage increases.
The higher prediction errors of the Stein-type estimators might arise because
the Efron-Morris method assumes normality while stock return distributions
are leptokurtic. In order to investigate whether the leptokurtosis increases the
prediction errors, we first insert each stock’s 6-month kurtosis into equation 12.
The regression results are displayed in Table 3.5. Higher kurtosis induces an
upward bias in all the volatility estimates, as evidenced by the significant negative
coefficients for all estimators. The same results hold in both call and put options.
However, the coefficients for this kurtosis variable are virtually the same across
all the estimators, implying that the impact of kurtosis is about the same across
all estimators and not likely to be the reason for the higher prediction errors of
the Stein-type estimators.
Table 3.6 examines the prediction errors in low and high kurtosis stocks. Each
month, we sort all the stocks into two halves by their previous 6-month kurtosis
and look at the prediction errors of all the estimators in each half. T-statistics
for the difference between shrinkage estimators and the historical estimator are
computed from the 100-month time series of each monthly difference in errors
with a Newey-West correction for autocorrelation using eight lags.
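The monthly sort can be sketched as follows. This is an illustration with hypothetical function names; it assumes the simple moment-based kurtosis (not excess kurtosis) and a split at the cross-sectional median, and it measures the prediction error as a root mean square of log ratios.

```python
import numpy as np


def sample_kurtosis(returns):
    """Fourth central moment divided by the squared second central moment."""
    r = np.asarray(returns, dtype=float)
    d = r - r.mean()
    return np.mean(d**4) / np.mean(d**2) ** 2


def split_by_kurtosis(kurtosis_by_stock):
    """Boolean mask that is True for stocks in the high kurtosis half."""
    k = np.asarray(kurtosis_by_stock, dtype=float)
    return k > np.median(k)


def rms_log_error(sigma_implied, sigma_estimate):
    """Root mean square of log(sigma_imp / sigma_hat) within a group."""
    ratio = np.asarray(sigma_implied, dtype=float) / np.asarray(sigma_estimate, dtype=float)
    return np.sqrt(np.mean(np.log(ratio) ** 2))
```

With an odd number of stocks or ties at the median, the two halves differ slightly in size; how such cases are handled is not specified in the text.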
For call options, the prediction errors using the historical and Ledoit-Wolf
estimators are significantly lower in the low kurtosis group. For Stein-type
estimators, the gaps between low and high kurtosis groups are much smaller.
Therefore, among low kurtosis stocks, the prediction errors for the Stein-type
estimators remain significantly higher than for the historical and Ledoit-Wolf
estimators. In the high kurtosis group, all estimators have similar prediction
errors and the differences are not significant statistically. Hence, the generally
higher prediction errors exhibited by the Stein estimators can be attributed
mostly to low kurtosis stocks.
The results for put options are quite similar except that the Stein-type
estimators produce errors that are also significantly greater than the historical
estimator even for high kurtosis stocks, though the significance level is higher for
low kurtosis stocks as it is for calls.
We also examine whether an optimal shrinkage estimator that minimizes the
sum of squared errors is important for the particular application of volatility
estimation. To do this, we compare LW’s optimal shrinkage to a random average
of the historical product moment sample matrix and the identity
matrix. In a similar comparison for estimation of the covariance matrix, both
Jagannathan and Ma (2003) and Disatnik and Benninga (2007) report that
optimal shrinkage is no better than randomly choosing between the sample matrix
and the identity matrix, and thus optimality is not worth the effort.8 We find
that the LW optimal shrinkage estimator is much better than the random average
of the sample matrix and the identity target.
3.5 Conclusion
A volatility bias in option prices was first uncovered by Black and Scholes
(1972). They demonstrated that their model over-priced options on relatively
high volatility stocks and under-priced options on relatively low volatility stocks. We
thought that this bias might have nothing to do with the Black-Scholes model
but instead could be attributable to sampling error, because it is always observed
in cross section with inter-stock differences. If this is true, this bias would be
observed with any option pricing model on any underlying, not just equity, but
also fixed income securities, mortgages, foreign exchange, and commodities.
8
Jagannathan and Ma (2003), p. 1667, and Disatnik and Benninga (2007), p. 60, report that a
random average does as well as optimal shrinkage. Disatnik and Benninga state, “Theoretically,
the shrinkage estimator should perform better than any other weighted average of the two
estimators, as the proportions in the weighted average of the shrinkage estimator are obtained
from minimizing the quadratic risk (of error) function of the combined estimator. Yet it seems
that, in practice, estimating these specific proportions gives rise to a new type of error, and
overall the shrinkage estimator does not perform better than the random average.”
of James-Stein and Ledoit-Wolf, which correct historical volatility estimates by
shrinking them toward a central value, thereby reducing their cross-sectional
dispersion. While both shrinkage estimators utilize the covariance matrix, Ledoit-
Wolf is unique because it does not require matrix inversion and thus the number
of stocks can exceed the number of observations.
First, we verify that the same bias Black and Scholes originally observed was
present and very significant in both put and call option prices for the 100 months
during the period January, 1996 through April, 2004. Second, we find that
shrinkage variance estimators can eliminate this volatility bias, independent of
the presence of the moneyness bias. Third, we uncover a difference between the
Ledoit-Wolf and Stein estimators; the former does not increase the prediction
error, but the latter significantly increases prediction error for stocks with low
kurtosis. Fourth, we demonstrate that the Stein estimator’s shrinkage percentage
is greater the lower (higher) the proportion of a stock’s systematic (idiosyncratic)
risk. The Ledoit-Wolf estimator behaves oppositely, and has a greater (lower)
shrinkage percentage the higher (lower) the portion of systematic (idiosyncratic)
risk. Finally, we show that the optimal shrinkage estimator of Ledoit-Wolf is
superior to a random combination of the sample matrix and the target for this
volatility estimation problem.
Table 3.1: Summary Statistics of Annualized Volatility Estimates
This table shows time series averages of cross-sectional summary statistics of the historical
volatility estimates computed from 6 months of daily returns and of the corresponding shrinkage estimators.
Stein Random is the volatility shrunk by the Efron-Morris formula in random groups. Stein
High Dispersion is the volatility shrunk by Efron-Morris formula in groups formed to have larger
volatility dispersion. Ledoit-Wolf is the volatility shrunk by the Ledoit-Wolf method. The mean
is the average across all time series and cross-sections. The std is the time series average of
the cross-sectional standard deviation for each sample month. Minimum and maximum are the
time series averages of, respectively, the cross-sectional minimum and maximum in each sample
month.
Table 3.2: Volatility Biases
Fama-MacBeth type tests were conducted for the following cross-sectional specification:
$\log(\sigma_{\mathrm{imp},i,t} / \hat{\sigma}_{i,t}) = \alpha + \beta \hat{\sigma}_{i,t} + \gamma X_{i,t} + \epsilon_{i,t}$. The upper panel is for call options, the lower panel
for put options. All t-statistics are computed from the 100-month time series of cross-sectional
coefficients with a Newey/West correction for autocorrelation using eight lags and are reported
below the corresponding coefficient means. The four columns correspond to different volatility
estimators. Historical is the standard estimator. James-Stein Random is an estimator shrunk
by the Efron-Morris formula in random groups. James-Stein High Dispersion is an estimator
shrunk by the Efron-Morris formula in groups with large volatility dispersion. Ledoit-Wolf is
an estimator shrunk by the Ledoit-Wolf method.
Table 3.3: Prediction Errors of Volatility Estimators
Average prediction errors are computed for the historical volatility estimator and all three
shrinkage estimators, measured by the root mean square of $\log(\sigma_{\mathrm{imp},i,t} / \hat{\sigma}_{i,t})$. T-statistics for the
difference between shrinkage estimators and the historical estimator are computed from the
100-month time series of each monthly difference in errors with a Newey-West correction for
autocorrelation using eight lags, and are reported below the corresponding coefficient means.
The four columns correspond to different volatility estimators. Historical is the standard
estimator. James-Stein Random is an estimator shrunk by the Efron-Morris formula in random
groups. James-Stein High Dispersion is an estimator shrunk by the Efron-Morris formula in
groups with large volatility dispersion. Ledoit-Wolf is an estimator shrunk by the Ledoit-Wolf
method. In Theil’s decomposition, UM is the proportion due to bias in the forecasts. UR is
the error due to a low correlation between the actual and the forecast. UD is the remaining
part. T-statistics in parentheses are computed using Newey-West with 8 lags.
Table 3.4: Shrinkage and Systematic Risk
For each of the 100 months, cross-sectional regressions were computed to explain the shrinkage
proportion as a function of the systematic risk estimated over approximately the previous
two years. T-statistics, in parentheses, are computed from the time series of
cross-sectional coefficients using a Newey-West correction for autocorrelation with 8 lags. The
columns correspond to three alternative shrinkage estimators. James-Stein Random is an
estimator shrunk by the Efron-Morris formula in random groups. James-Stein High Dispersion
is an estimator shrunk by the Efron-Morris formula in groups with large volatility dispersion.
Ledoit-Wolf is an estimator shrunk by the Ledoit-Wolf method.
Table 3.5: Control for Kurtosis
Fama-MacBeth type tests were conducted for the following cross-sectional specification:
$\log(\sigma_{\mathrm{imp},i,t} / \hat{\sigma}_{i,t}) = \alpha + \beta \hat{\sigma}_{i,t} + \gamma X_{i,t} + \epsilon_{i,t}$. The upper panel is for call options, the lower panel
for put options. All t-statistics are computed from the 100-month time series of cross-sectional
coefficients with a Newey/West correction for autocorrelation using eight lags and are reported
below the corresponding coefficient means. The four columns correspond to different volatility
estimators. Historical is the standard estimator. James-Stein Random is an estimator shrunk
by the Efron-Morris formula in random groups. James-Stein High Dispersion is an estimator
shrunk by the Efron-Morris formula in groups with large volatility dispersion. Ledoit-Wolf is
an estimator shrunk by the Ledoit-Wolf method.
Table 3.6: Prediction Errors for Low and High Kurtosis Stocks
Average prediction errors are computed for the historical volatility estimator and all three
shrinkage estimators, measured by the root mean square of $\log(\sigma_{\mathrm{imp},i,t} / \hat{\sigma}_{i,t})$ for stocks grouped
by kurtosis over the previous six months. T-statistics for the difference between shrinkage
estimators and the historical estimator are computed from the 100-month time series of each
monthly difference in errors with a Newey-West correction for autocorrelation using eight lags.
These t -statistics are given in parentheses below each mean prediction error. The four columns
correspond to different volatility estimators. Historical is the standard estimator. James-Stein
Random is an estimator shrunk by the Efron-Morris formula in random groups. James-Stein
High Dispersion is an estimator shrunk by the Efron-Morris formula in groups with large
volatility dispersion. Ledoit-Wolf is an estimator shrunk by the Ledoit-Wolf method. In Theil’s
decomposition, UM is the proportion due to bias in the forecasts. UR is the error due to a low
correlation between the actual and the forecast. UD is the remaining part. T-statistics in
parentheses are computed using Newey-West with 8 lags.
References
Amihud, Yakov, and Haim Mendelson, 1986, Asset pricing and the bid-ask
spread, Journal of Financial Economics 17, 223–249.
Amin, Kaushik I., and Charles M. Lee, 1997, Option trading, price discovery and
earning news dissemination, Contemporary Accounting Research 14, 153–192.
Ang, Andrew, and Geert Bekaert, 2007, Stock return predictability: is it there?,
Review of Financial Studies forthcoming.
Anthony, Joseph H., 1988, The interrelation of stock and options market trading-
volume data, Journal of Finance 43, 949–964.
Baker, Malcolm, and Jeffrey Wurgler, 2000, The equity share in new issues and
aggregate stock returns, Journal of Finance 55, 2219–2257.
Bakshi, Gurdip, Charles Cao, and Zhiwu Chen, 1997, Empirical performance of
alternative option pricing models, Journal of Finance 52, 2003–2049.
Biais, Bruno, and Pierre Hillion, 1994, Insider and liquidity trading in stock and
options markets, Review of Financial Studies 7, 743–780.
Black, Fischer, 1975, Fact and fantasy in use of options, Financial Analyst
Journal 31, 36–41,61–72.
Black, Fischer, and Myron Scholes, 1972, The valuation of option contracts and
a test of market efficiency, Journal of Finance 27, 399–417.
Black, Fischer, and Myron Scholes, 1973, The pricing of options and corporate
liabilities, Journal of Political Economy 81, 637–659.
Boudoukh, Jacob, Matthew Richardson, and Robert F. Whitelaw, 2005, The
myth of long-horizon predictability, Working paper.
Boyle, Phelim P., and A. L. Ananthanarayanan, 1977, The impact of variance
estimation in option valuation models, Journal of Financial Economics 5, 375–
387.
Brennan, Michael J., Ashley W. Wang, and Yihong Xia, 2004, Estimation and
test of a simple model of intertemporal capital asset pricing, Journal of Finance
59, 1743–1776.
Butler, J. S., and Barry Schachter, 1986, Unbiased estimation of the Black-Scholes
formula, Journal of Financial Economics 15, 341–357.
Campbell, John Y., 1987, Stock returns and the term structure, Journal of
Financial Economics 18, 373–399.
Campbell, John Y., 1991, A variance decomposition for stock returns, Economic
Journal 101, 157–179.
Campbell, John Y., and John H. Cochrane, 1999, By force of habit: a
consumption-based explanation of aggregate stock market behavior, Journal
of Political Economy 107, 205–251.
Campbell, John. Y., Sanford J. Grossman, and Jiang Wang, 1993, Trading volume
and serial correlation in stock returns, Quarterly Journal of Economics 108,
905–939.
Campbell, John Y., and Robert J. Shiller, 1988a, Stock prices, earnings, and
expected dividends, Journal of Finance 43, 661–676.
Campbell, John Y., and Robert J. Shiller, 1988b, The dividend-price ratio and
expectations of future dividends and discount factors, Review of Financial
Studies 1, 195–228.
Campbell, John. Y., and Tuomo Vuolteenaho, 2004, Bad beta, good beta,
American Economic Review 94, 1249–1275.
Carr, Peter, Robert Jarrow, and Ravi Myneni, 1992, Alternative
characterizations of American put options, Mathematical Finance 2, 87–106.
Chakravarty, Sugato, Huseyin Gulen, and Stewart Mayhew, 2004, Informed
trading in stock and option markets, Journal of Finance 59, 1235–1257.
Chan, Kalok, Y. Peter Chung, and Herb Johnson, 1993, Why option prices lag
stock prices: A trading based explanation, Journal of Finance 48, 1957–1967.
Chan, Yeung L., and Leonid Kogan, 2002, Catching up with the Joneses:
heterogeneous preferences and the dynamics of asset prices, Journal of Political
Economy 110, 1255–1285.
Cox, John C., and Stephen A. Ross, 1976, The valuation of options for alternative
stochastic processes, Journal of Financial Economics 3, 145–166.
Dichev, Ilia, 2007, What are stock investors’ actual historical returns? evidence
from dollar-weighted returns, American Economic Review 97, 386–401.
Disatnik, David J., and Simon Benninga, 2007, Shrinking the covariance matrix,
Journal of Portfolio Management pp. 55–63.
Easley, David, Maureen O’Hara, and P. S. Srinivas, 1998, Option volume and
stock prices: Evidence on where informed traders trade, Journal of Finance
53, 431–465.
Efron, Bradley, and Carl Morris, 1975, Data analysis using Stein’s estimator and
its generalizations, Journal of the American Statistical Association 70, 311–319.
Efron, Bradley, and Carl Morris, 1976, Multivariate empirical Bayes and
estimation of covariance matrices, Annals of Statistics 4, 22–32.
Fama, Eugene, and J. D. MacBeth, 1973, Risk, return, and equilibrium: empirical
tests, Journal of Political Economy 81, 607–636.
Fama, Eugene F., and Kenneth R. French, 1988, Dividend yields and expected
stock returns, Journal of Financial Economics 22, 3–25.
Fama, Eugene F., and Kenneth R. French, 1989, Business conditions and expected
returns on stocks and bonds, Journal of Financial Economics 25, 23–49.
Fama, Eugene F., and Kenneth R. French, 1993, Common risk factors in the
returns on stocks and bonds, Journal of Financial Economics 33, 3–56.
Fama, Eugene F., and William G. Schwert, 1977, Asset returns and inflation,
Journal of Financial Economics 5, 115–146.
Ferson, Wayne, Sergei Sarkissian, and Timothy Simin, 2003, Spurious regression in
financial economics?, Journal of Finance 58, 1393–1414.
Fleming, Jeff, and Robert E. Whaley, 1994, The value of wildcard options,
Journal of Finance 49, 215–236.
Garman, Mark B., and Michael J. Klass, 1980, On the estimation of security
price volatilities from historical data, Journal of Business 53, 67–79.
Geske, Robert, and H. E. Johnson, 1984, The American put option valued
analytically, Journal of Finance 39, 1511–1524.
Geske, Robert, and Richard Roll, 1984a, Isolating the observed biases in
American call option pricing: an alternative variance estimator, Anderson
UCLA Working Paper.
Geske, Robert, and Richard Roll, 1984b, On valuing American call options with
the Black-Scholes European formula, Journal of Finance 39, 443–455.
Geske, Robert, and Walter Torous, 1990, Black-Scholes option pricing and robust
variance estimation, in Options: Recent Advances in Theory and Practice, ed.
by Stuart Hodges. Manchester University Press.
Geske, Robert, and Walter Torous, 1991, Skewness, kurtosis, and Black-Scholes
option mis-pricing, Statistical Papers 32, 299–309.
Ghysels, Eric, Pedro Santa-Clara, and Rossen Valkanov, 2004, The MIDAS
touch: Mixed data sampling regression models, UCLA Working paper.
Ghysels, Eric, Pedro Santa-Clara, and Rossen Valkanov, 2005b, There is a risk-
return tradeoff after all, Journal of Financial Economics 76, 509–548.
Gompers, Paul A., and Andrew Metrick, 2002, Institutional investors and equity
prices, Quarterly Journal of Economics 116, 229–259.
Goyal, Amit, and Ivo Welch, 2006, A comprehensive look at the empirical
performance of equity premium prediction, Review of Financial Studies
forthcoming.
Gukhal, Chandrasekhar Reddy, 2001, Analytical valuation of American options
on jump-diffusion processes, Mathematical Finance 11, 97–115.
Harvey, Campbell, and Robert E Whaley, 1992a, Dividends and S&P Index
Option Valuation, Journal of Futures Markets 12, 123–137.
Harvey, Campbell, and Robert E Whaley, 1992b, Market volatility prediction
and the efficiency of the S&P 100 index option market, Journal of Financial
Economics 30, 33–73.
Heston, Steven L., 1993, A closed form solution for options with stochastic
volatility with applications to bond and currency options, Review of Financial
Studies 6, 327–343.
Heston, Steven L., and Saikat Nandi, 2000, A closed-form GARCH option
valuation model, Review of Financial Studies 13, 585–625.
Hodrick, Robert J., 1992, Dividend yields and expected stock returns: Alternative
procedures for inference and measurement, Review of Financial Studies 5, 257–
286.
Hong, Harrison, Walter Torous, and Rossen Valkanov, 2004, Do industries lead
stock markets, UCLA Working paper.
Hull, John, and Alan White, 1987, A closed form solution for options with
stochastic volatility, Journal of Finance 42, 281–300.
Jagannathan, Ravi, and Tongshu Ma, 2003, Risk reduction in large portfolios:
Why imposing the wrong constraints helps, Journal of Finance 58, 1651–1683.
James, W., and C. Stein, 1961, Estimation with Quadratic Loss, in Proceedings of
the Fourth Berkeley Symposium on Mathematical Statistics and Probability vol. 1
pp. 361–379 Berkeley. University of California Press.
Karolyi, G. Andrew, 1993, A Bayesian approach to modeling stock return
volatility for option valuation, Journal of Financial and Quantitative Analysis
28, 579–594.
Keim, Donald B., and Robert F. Stambaugh, 1986, Predicting returns in the
stock and bond markets, Journal of Financial Economics 17, 357–390.
Kim, In Joon, 1990, The analytical valuation of American options, Review of
Financial Studies 3, 547–572.
Kothari, S. P., and Jay Shanken, 1992, Stock return variation and expected
dividends, Journal of Financial Economics 31, 177–210.
Kyle, Albert S., 1984, Continuous auctions and insider trading, Econometrica 53,
1315–1336.
Ledoit, Olivier, and Michael Wolf, 2003, Improved estimation of the covariance
matrix of stock returns with an application to portfolio selection, Journal of
Empirical Finance 10, 603–621.
Ledoit, Olivier, and Michael Wolf, 2004a, Honey, I Shrunk the Covariance Matrix,
Journal of Portfolio Management Summer, 110–119.
Ledoit, Olivier, and Michael Wolf, 2004b, A Well-Conditioned Estimator for
Large Dimensional Covariance Matrices, Journal of Multivariate Analysis 88,
365–411.
Longstaff, Francis A., and Eduardo S. Schwartz, 2001, Valuing American options
by simulation: a simple least-squares approach, Review of Financial Studies
14, 113–147.
Loughran, Tim, and Jay R. Ritter, 1995, The new issues puzzle, Journal of
Finance 50, 23–51.
MacBeth, James D., and Larry J. Merville, 1979, An empirical examination of
the Black-Scholes call option pricing model, Journal of Finance 34, 1173–1186.
Manaster, Stephen, and Richard J. Rendleman, 1982, Option prices as predictors
of equilibrium stock prices, Journal of Finance 37, 1043–1057.
McCracken, Michael W., 2004, Asymptotics for out-of-sample tests of Granger
causality, Working paper.
Merton, Robert C., 1973, Theory of rational option pricing, Bell Journal of
Economics and Management Science 4, 141–183.
Merton, Robert C., 1976, Option pricing when underlying stock returns are
discontinuous, Journal of Financial Economics 3, 125–144.
Miller, Edward M., 1977, Risk, uncertainty, and divergence of opinion, Journal
of Finance 32, 1151–1168.
Muirhead, R. J., 1987, Developments in Eigenvalue Estimation, in Advances in
Multivariate Statistical Analysis pp. 277–288 Dordrecht. D. Reidel Publishing
Company.
Newey, Whitney K., and Kenneth D. West, 1987, A simple, positive semi-
definite, heteroskedasticity and autocorrelation consistent covariance matrix,
Econometrica 55, 703–708.
Pan, Jun, and Allen Poteshman, 2004, The information in option volume for
future stock prices, Working paper.
Park, Cheolbeom, 2005, Stock return predictability and the dispersion in earnings
forecasts, Journal of Business 78, 2351–2375.
Parkinson, Michael, 1980, The extreme value method for estimating the variance
of the rate of return, Journal of Business 53, 61–77.
Pastor, Lubos, and Robert F. Stambaugh, 2003, Liquidity risk and expected stock
returns, Journal of Political Economy 111, 642–685.
Poteshman, Allen M., and Vitaly Serbin, 2003, Clearly irrational financial market
behavior: Evidence from the early exercise of exchange traded stock options,
Journal of Finance 58, 37–70.
Ritter, Jay R., 1991, The long-run performance of initial public offerings, Journal
of Finance 46, 3–27.
Roll, Richard, 1977, An analytic valuation formula for unprotected American call
options on stocks with known dividends, Journal of Financial Economics 5,
251–258.
Shiller, Robert J., 2000, Irrational Exuberance. (Broadway Books New York).
Stein, Charles, 1955, Inadmissibility of the Usual Estimator for the Mean of
a Multivariate Normal Distribution, in Proceedings of the Third Berkeley
Symposium on Mathematical Statistics and Probability vol. 1 pp. 197–206
Berkeley. University of California Press.
Stephan, Jens A., and Robert E. Whaley, 1990, Intraday price change and trading
volume relations in the stock and stock option markets, Journal of Finance 45,
191–220.
Sterk, William, 1982, Tests of two models for valuing call options on stocks with
dividends, Journal of Finance 37, 1229–1237.
Torous, Walter, Rossen Valkanov, and Shu Yan, 2005, On predicting stock returns
with nearly integrated explanatory variables, Journal of Business 77, 937–
966.
Wang, Ashley W., 2003, Institutional equity flows, liquidity risk and asset pricing,
Working paper.