This paper investigates the extent to which several firm-specific and macroeconomic factors can
explain the downgrade from investment to speculative grade, as well as their subsequent predictive
accuracy.
More specifically, a logistic model containing five variables selected on a stepwise basis is tested on
three samples of non-financial companies spanning 37 countries and 7 industries, with quarterly data
ranging from 1986 to 2011.
Consistent with previous literature, after controlling for industry,
profitability, leverage, operating efficiency and the yield spread have a significant impact on fallen
angels, while the logistic model has an overall out-of-sample predictive accuracy between 70 and
85%.
The results are robust across different samples and model specifications.
1. Introduction
Ratings provided by Credit Rating Agencies (CRAs) have gained tremendous popularity over time and are
currently used as a benchmark for measuring the creditworthiness of different obligors. In spite of being
defined as "specialized opinions" by the CRAs, ratings are currently hardwired into a vast
number of financial regulations, instruments and transactions (Gonzalez et al., 2004). Their importance
was further enhanced by the Basel II rules, which established bank capital adequacy
requirements according to the credit ratings of the assets in a bank's portfolio.
Among all rating transitions, there is one of particular importance due to its threshold effects: the
downgrade from investment to speculative grade, which results in a "fallen angel", as Moody's
calls the companies that go through such a process. Even if the investment-speculative cutoff was
arbitrarily established by CRAs (Harold, 1938), speculative-rated bonds are much more prone to
default than investment-grade ones according to Moody's (appendix 1). Hence, it is hardly surprising that
there are regulations which heavily restrict the ability of different investor groups to hold speculative-grade
securities1, which further leads to decreased access to debt for the downgraded entities, with all
its subsequent consequences.
Rating research has a long history, following promptly the first public issuance of ratings. Yet the
number of studies focusing on ratings is considerably lower than the number of papers on
bankruptcy prediction,
which might seem surprising given the fact that downgrades can be
commercial banks were prohibited from holding junk bonds since 1936, while the Financial Institutions Reform, Recovery
and Enforcement Act of 1989 extended the ban to other speculative-rated credit instruments as well (West, 1973); in the US,
state insurance regulations follow the guidelines established by the National Association of Insurance Commissioners,
which requires higher risk premia on holdings of speculative-grade bonds.
Lastly, the most prolific literature strand, to which this paper is also related, refers to the use of
publicly available information for explaining and predicting ratings. The most common rating
prediction models rely on various statistical techniques, similar to the ones used for default
prediction. Given the ordinal nature of ratings, ordered probit is a natural methodological choice, which
is discussed in detail in Kaplan et al. (1979) and Ederington (1985), who also demonstrate its empirical
superiority over discriminant analysis and least squares. In spite of the development of more
complex techniques, ordered probit has also been used in more recent studies, such as Hwang (2008)
and Hwang (2010), which use a semiparametric specification for the probit function.
With regard to fallen angels, even if most of the previously mentioned studies control for speculative-rated companies, there has been only one study focusing specifically on fallen angels:
Chernenko & Sunderam (2011), which investigates market segmentation at the investment-to-speculative boundary and therefore has a different focus than this paper.
On the other hand, given that the accuracy of rating prediction models is inferior to that of
bankruptcy prediction models (many of which achieved accuracies of over 85%3), and their volatility
is rather high, it is worth investigating whether these differences are methodological in nature or due
to the informational advantage of CRAs.
Firstly, using one single model for all rating transitions leads to spurious results, since the impact of a
downgrade and the factors taken into account in the rating methodology vary throughout the rating
scale. Moody's states clearly that fallen angels have their own specific characteristics, and that different
ratios weigh more heavily when assessing such companies (Moody's Rating Methodology, 2000).
Secondly, the meaning of a downgrade is overlooked in most, if not all, studies. More specifically,
researchers focus on creating a model which differentiates between rating groups, without capturing
the downgrade process per se4, which presumably leads to a model with higher explanatory power
but slower in predicting downgrades. In other words, there is an "aging effect", which is
neglected in these studies.
Thirdly, as in the default prediction literature, most studies focused on rating prediction
neglect the sensitivity of classification errors to the cutoff score, as first mentioned in Ohlson (1980).
While the real cost of a type I error cannot be estimated, a more accurate out-of-sample cutoff score
could be estimated according to a representative sample distribution.
Further, I argue that using issuer-based as opposed to issue-based ratings leads to more accurate results, since
several instrument-specific variables such as maturity or seniority would have to be taken into account
for instrument-specific ratings.
Lastly, taking into account the controversy surrounding the timeliness of ratings, it would be more
informative to use quarterly data, as opposed to the annual data employed throughout the existing
literature.
While some of these issues are partially addressed in a few studies, the research design and data
structure used in multiple rating class analysis make it virtually impossible to rule out simultaneously
the aforementioned pitfalls.
For this reason, but also because of the importance of the investment-to-speculative grade cutoff, this
paper attempts to develop a model focusing specifically on fallen angels, taking into account the
financials from the quarter prior to the actual downgrade. I find that several firm-specific factors
commonly used in default prediction such as size, profitability, leverage and operating efficiency, as well
as industry and macroeconomic factors, can explain as much as 35% 5 of the downgrades, being
significant and robust across different specifications. Moreover, the model has an average out-of-sample prediction accuracy ranging from 70 to 85%. While the results should be interpreted with
caution due to the limited sample size, the consistent significance and relatively high prediction
accuracy, especially given the sample heterogeneity, considerably strengthen this paper's findings.
The remainder of the paper is organized as follows. The next section discusses the methodology and
model, after which the findings are presented and compared with previous literature. Lastly, several
suggestions for improvements and further research are mentioned.
This number represents McFadden R-squared, which is specific to logistic regressions, and is therefore not
directly comparable to the OLS R-squared.
6
For a more detailed discussion on ML estimation and logit, see McFadden (1973).
In short, logit is a classification model, which uses a logistic function to convert a binary outcome into
probabilities, given a specific set of predictors.
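Let $X_i$ be a vector of predictors for the i-th observation, and $\beta$ a vector of unknown
parameters. Then $P(X_i, \beta)$ denotes the probability of downgrade for any given $X_i$ and $\beta$, with 0 < P <
1. The logarithm of the likelihood of any specific outcome, given the binary alternative (downgrade vs
no downgrade), is then given by:
\[
\ell(\beta) = \sum_{i \in S_1} \log P(X_i, \beta) + \sum_{i \in S_2} \log\bigl(1 - P(X_i, \beta)\bigr),
\qquad P(X_i, \beta) = \frac{1}{1 + e^{-y_i}}, \quad y_i = \beta' X_i \tag{2}
\]
where $S_1$ is the set of downgrade observations and $S_2$ the set of non-downgrade observations.
One important implication of (2) is that $y = \log[P/(1-P)]$, which means that the interpretation is rather
straightforward, and it has more informational content as compared, for example, with MDA.7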
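As a minimal sketch of the logit specification described above (not the estimation code used in the paper), the following Python snippet evaluates the logistic function and the binary log-likelihood on toy data, and checks that the log of the odds P/(1-P) returns the linear index y. The function names, the toy predictors and the parameter values are assumptions made purely for illustration.

```python
import math

def logistic(y):
    """Map a linear index y = beta'x to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-y))

def log_likelihood(beta, X, d):
    """Sum log P over downgrades (d = 1) and log(1 - P) over non-downgrades (d = 0)."""
    total = 0.0
    for x, di in zip(X, d):
        p = logistic(sum(b * v for b, v in zip(beta, x)))
        total += math.log(p) if di == 1 else math.log(1.0 - p)
    return total

# Hypothetical toy data: a constant plus one predictor; values are illustrative only.
X = [(1.0, 0.2), (1.0, 1.5), (1.0, -0.7)]
d = [0, 1, 0]
beta = (-0.5, 1.0)

ll = log_likelihood(beta, X, d)

# The log of the odds P / (1 - P) recovers the linear index y.
p = logistic(0.8)
log_odds = math.log(p / (1.0 - p))
```

In practice the parameters would be chosen to maximize `log_likelihood`; the sketch only evaluates it at a fixed, arbitrary `beta`.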
Moreover, one major benefit of logit over other statistical models is its unbiasedness when used with
choice-based sampling. As detailed in Cram et al. (2009), the fact that the control and treatment
groups are known and chosen a priori represents a potential threat to external validity, which means
that the results cannot be generalized. Nevertheless, according to the authors, logit is the sole
exception in this case, generating unbiased coefficients even with choice-based samples, except for the
intercept. 8 Furthermore, as Ohlson (1980) argues, logit performs well when the predictors' normality
assumption is violated.
On the other hand, in the context of going-concern literature, one critical aspect of validating statistical
techniques such as logit is their predictive accuracy, and in particular the occurrence of type I error
(i.e. misclassifying a bankrupt company as a survivor or, respectively, a fallen angel as an investment-grade
company).
See Palepu (1986) for a more detailed discussion on how the intercept can be changed, and Maddala (1986) for a textbook
discussion of the logit exception to the choice-based sample bias; Zmijewski (1984) explains that other statistical models
can only be used with WESML (weighted exogenous sample maximum likelihood) estimation, but this estimation requires information about
the population/sample ratio, which is usually unknown in default/credit risk studies.
Following Palepu (1986) and Cram et al. (2009), this paper tackles one largely ignored, albeit important,
methodological concern related to the choice of cutoff scores, which determines the trade-off between
type I and type II errors.
As shown in Palepu (1986), the standard cutoff score of 0,5, which is used in the overwhelming
majority of prediction studies, is intended for a balanced sample, where both types of errors are
equally important. Yet rare-event studies imply a non-balanced sample (unless matching is done),
and type I error is more costly; therefore "..the use of arbitrary cut-off probabilities in prediction tests
makes the computed error rates difficult to interpret." (Palepu, 1986).
The current paper is the first to address this methodological concern in the context of predicting rating
changes, by proposing a cutoff rate equal to the sample distribution. By comparing the error rates
against a range of cutoff scores (as first done in Ohlson, 1980), for 3 different samples, I show that the
proposed cutoff rate minimizes the total error rate.
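The comparison of error rates across cutoff scores can be sketched as follows. This is an illustrative Python example, not the paper's procedure: the fitted probabilities and labels are made up, and the "sample distribution" cutoff is assumed here to mean the share of downgrade observations in the sample. Type I error follows the definition given earlier (a fallen angel classified as investment grade).

```python
def error_rates(probs, labels, cutoff):
    """Return (type_I, type_II) error rates at a given cutoff.

    Type I: a downgrade (label 1) classified as a non-downgrade.
    Type II: a non-downgrade (label 0) classified as a downgrade.
    """
    downgraded = [p for p, y in zip(probs, labels) if y == 1]
    others = [p for p, y in zip(probs, labels) if y == 0]
    type_i = sum(p < cutoff for p in downgraded) / len(downgraded)
    type_ii = sum(p >= cutoff for p in others) / len(others)
    return type_i, type_ii

# Hypothetical fitted probabilities and outcomes (illustrative only).
probs = [0.05, 0.10, 0.20, 0.32, 0.35, 0.60, 0.15, 0.45, 0.70, 0.25]
labels = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

# Assumed reading of the proposed cutoff: the share of downgrades in the sample.
sample_cutoff = sum(labels) / len(labels)

# Error rates over a range of cutoffs, including the sample-based one.
trade_off = {c: error_rates(probs, labels, c) for c in (0.1, sample_cutoff, 0.5, 0.7)}
```

Raising the cutoff lowers the type II rate at the expense of the type I rate, which is the trade-off the cutoff choice has to resolve.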
3. Data
The sample of companies was collected from Datastream and S&P public database on the basis of
three criteria, in the following order:
1. having an S&P long-term issuer rating, with at least 1 year (4 consecutive financial quarters) within
the BBB rating group9
2. not being a financial services company
3. having data for the Total Assets item between 1986 and 2011
The first criterion was the most important, firstly because it provides the raw data for the dependent
variable, and secondly because it led to a drastic sample reduction 10 , following which the other
selection criteria, as well as the research method as a whole, had to be adjusted.
Further, companies from both databases were merged, which resulted in a further reduction due to a
considerable number of duplicates. Since fallen angels were of interest, all the companies which
had ratings outside the BBB-BB group, as well as the ones that were rated BB but had never been
investment grade, were removed. The notches outside BBB-BB were excluded in order to rule out
large differences in creditworthiness (appendix 2), while BB-only companies were excluded as they might have
biased the results upwards 11 . Finally, the 1-year BBB criterion was intended to confer a certain
stability within this rating group, especially since CRAs claim to conduct a more thorough review on a
yearly basis.
The second criterion is common throughout the literature, as financial companies have very different
accounting and are subject to different regulations.
One natural criterion for data collection is the availability of all required data for a pre-determined
period. Furthermore, a typical methodology involves the initial collection of a large number of
explanatory variables, which are then filtered with various statistical methods (e.g. factor analysis or
stepwise regression) in order to achieve a more parsimonious model which best fits the dataset.
However, due to the already limited amount of data, the only variable on which we imposed
availability was total assets, since this item is used as a denominator for most accounting ratios.
Consequently, the choice of certain variables results in different sample sizes, with a significant
decrease for most of the quarterly data. As regards the 1986-2011 timeline, this is the resulting
timeframe of the companies with rating data. As will be discussed in a later section, the remaining
variables were collected for the time they were available, and the specification of the final
model(s) is consequently dependent on data availability. After applying these criteria, the
final sample consisted of 450 companies, spanning different industries and countries, with quarterly
data ranging from 1986 to 2011 (see Appendix 3 for general company characteristics).
11
i.e. the speculative-only companies presumably have poorer credit quality as compared to the ones just downgraded, but
the same effect is not relevant for the investment ones, firstly because of the downgrade vs upgrade asymmetry
mentioned in the literature review section, and secondly because we are solely concerned with the downgrade effect
12
These companies were never downgraded to speculative grade
The key assumption is that there are significant differences between a company which was rated BB
for one year, as opposed to one which was just downgraded to BB. This assumption relies on the
empirical evidence concerning an important announcement effect, especially in the case of fallen
angels. 13 This prompt market reaction will result in a domino effect, which should trigger an
accelerated deterioration in that company's financials. In a sense, this effect underlies the so-called
"aging effect", documented in many studies, such as Altman & Kao (1992).
The outcome of such a coding is: 164 values of 1 and 12417 values of 0, out of an initial number of
457 × 100 observations 14
As formerly argued, such a data structure best fits our research question; however, given the fact that
the downgrade observations represent only 0,0132 of the total, the choice of the non-downgrade
observations becomes an important methodological concern.
The default prediction literature, which I follow with regard to method, has so far failed to reach a
consensus regarding the most appropriate sampling/matching techniques; therefore, different sample
specifications are compared and tested.
Since bankruptcies are rare events, taking the entire population of companies would translate into
increased data collection costs but decreased informational content (since extra observations would
mostly represent companies which survived), and thus potentially weaker results. As a solution to
this issue, but also in order to reduce the omitted variable bias, the first default prediction studies used
matched samples for survivors (Altman, 1968; Beaver, 1966).
Later on, this method was sharply criticized for inducing further biases, while not being representative
of the entire population (and therefore lacking validity). Zmijewski (1984) was the first to address
matching issues, while Palepu (1986) provides a thorough discussion of the biases introduced by non-random sampling, referring this time to the M&A literature:
"First, the use of non-random samples in the model estimation stage, without appropriate
modifications to the estimators, leads to inconsistent and biased estimates.. This results in overstating
the model's ability to predict. Second, the use of nonrandom samples in prediction tests leads to error
rate estimates that fail to represent the model's performance in the population."
13
This was already mentioned in the introduction; typically in this category of studies, there is a variable capturing the
age of a company in a certain rating group
14
457 companies across 100 financial quarters
As stated in Part 2, logistic regression should alleviate the choice-based sampling biases. However,
according to Cram et al. (2009), other types of bias arise whenever a representative
sample cannot be used.
Since neither using the entire population of rated companies nor random sampling is possible in this
paper's research context15, the final regression will be tested on samples based on different choices of
the control group16, in order to address and compare the potential biases.
The main specification, which will be referred to as sample (1), includes one "0" observation for
each control company with available data, following the same time distribution as the downgrades (see
fig.1C). Therefore, this model includes distinct companies, with time-matched observations17.
While this sampling choice is random to some extent and arguably representative of the fallen-angels-to-investment-only population 18 , the fact that exact year matching is not possible, as well
as the limited number of observations, might affect the outcome.
For this reason, I choose two other samples to test the robustness of the outcome.
The second model (2) uses a balanced sample, obtained by matching on time, country sovereign
rating and industry19. Hence, the number of observations is even lower and, even in this case, exact
matching is not possible; thus the coefficients might be biased as well.
The final specification (3) takes into account the full sample, including same-company observations.
Consistent with Ohlson (1980), logistic regression should lead to unbiased estimates for this
specification. However, due to its construction, our sample is not comparable with Ohlson's.
Firstly, there is a large number of same-company observations, which means that the time variation of
accounting variables would not be very high within the same company, while the treated group
should have much less influence on the outcome 20 . Secondly, since the ratio of fallen angels to
investment companies is 1:3, this sample is not representative of the entire population. Nevertheless,
this specification is merely used as a robustness check, providing a strong argument for the variable
choice if the results are significant in this context as well.
The summary statistics for the variables used in the final logistic model, as well as the number of
observations with available data, are shown in Table 2.
15
Treated and control groups are selected based on the outcome; therefore there is at least a choice-based bias.
16
Investment-grade companies which were never downgraded to speculative grade.
17
e.g. in 2003 there are 20 downgrade observations, which means that 20 non-downgrade observations from 2003 will be
selected from the company-year observations; however, due to data scarcity, exact matching is not possible.
18
According to Moody's Investors Service (2013), the average ratio is approx. 1:3 over a 30-year period
19
The non-downgrade observations are all different from model (1)
20
However, if there is a very strong difference between downgraded and non-downgraded companies, the results should
be robust in a full-sample specification as well, which we therefore use only as a robustness check.
5. Explanatory variables
Previous literature on credit rating prediction shows that the models and factors used in default
prediction perform very well for credit ratings as well, which is not surprising. Therefore, this study
will follow the same pattern, taking also into account more recent studies focusing on credit ratings21.
We consider three types of factors: firm-specific (accounting), industry-specific and country/macroeconomic.
The financial variables were collected from Datastream, and the macroeconomic
factors from the OECD, both on a quarterly and an annual basis, resulting in 39 variables for each frequency.
There are several reasons why we consider quarterly data as more accurate for the purpose of our
investigation - however, for robustness purposes, we also collect annual data.
One first argument relates to the timeliness of the rating process or, put differently, to the extent to
which ratings are "point-in-time" or "through-the-cycle" (Allen & Saunders, 2001). Although CRAs
claim to update their ratings quarterly or upon the receipt of new information
(Ederington, 1985), there has been some evidence that these updates are less thorough (Ederington &
Yawitz, 1998). Further, Blume et al. (1998) find that the CRAs have recently become stricter in the overall rating
process, while Johnson (2003) suggests that S&P 23 ratings are more timely for investment-to-speculative
downgrades. On the other hand, as mentioned in the introduction, the rating
announcement literature seems to support immediate effects on bond returns (with several exceptions),
which means that ratings have an informational content which is not available to investors, and which
is transmitted in a timely manner. Lastly, since the empirical evidence is rather mixed, and timeliness
seems to be the result of a trade-off between stability and accuracy, finding the correct lag in the
prediction model becomes extremely difficult. However, since most studies employ annual financial
data, the practical approach of using the financial information from the previous financial year means that
the considered lag is roughly one year (Hwang et al., 2010).
However, since most interim reports are unaudited, the questionable reliability of quarterly data has led to
little empirical evidence: Baldwin & Glezen (1992) are among the few who use an extensive sample with
such data, and find support for its superiority over annual data for default prediction.
Moreover, in the context of credit ratings, a higher observation frequency is
important, given that CRAs sometimes make drastic rating changes throughout a year24.
21
Dimitras et al. (1996) and Belovari et al. (2001) provide extensive reviews of the factors used in bankruptcy prediction,
while Cram et al. (2009), Huang et al. (2010) and Figlewski et al. (2012) refer to factors used for ratings.
22
For newly issued bonds the rating is simultaneous; however, this situation is ruled out through the coding of our dummy
variable
23
This study refers in particular to S&P data
24
In our sample, 10% of the fallen angels crossed the investment-speculative boundary more than once in a year's
time
Therefore, quarterly financial data was preferred and consequently collected from Datastream. Since
the quarterly interim data was indeed scarcer than the annual data, the corresponding
annual values were also collected, but the query type remained quarterly, as did the coding of the
dependent variable. Non-calendar financial quarters were matched with the rating dummy and the
macroeconomic variables by converting them to calendar order (as shown in appendix 4).
Following both the rating changes and the default prediction literature, the 31 firm-specific variables
are divided into 6 groups: size, profitability, leverage, cash flow adequacy, operating efficiency, and
liquidity. Each group contains several ratios which are correlated, since they have been used as proxies
for the aforementioned measures.
Most variables are normalized by total assets, while the size variables are normalized by taking the
logarithm. One particular ratio is problematic according to the previous literature: interest coverage.
Indeed, the variable is not normally distributed, having quite a few extreme values. To address this
issue, I test the squared value of the ratio, as well as its conversion to a piecewise linear function
following Blume et al. (1998); however, the statistics and output for the converted ratio are not
reported, since they lack significance in all specifications.
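A piecewise linear conversion of this kind can be sketched as below. The idea is to split the ratio into per-interval increments, so that a regression can assign a separate slope to each interval while extreme values are capped. The breakpoints, the cap and the function name are illustrative assumptions, not values taken from this paper.

```python
def piecewise_segments(x, knots=(5.0, 10.0, 20.0), cap=100.0):
    """Split a ratio into per-interval increments (one regressor per interval).

    Values are floored at 0 and capped at `cap` to limit the influence of
    extreme observations; the knots and the cap are illustrative assumptions.
    """
    x = min(max(x, 0.0), cap)
    edges = (0.0,) + tuple(knots) + (cap,)
    return [min(max(x - lo, 0.0), hi - lo) for lo, hi in zip(edges, edges[1:])]

# An interest coverage of 12 fills the first two intervals and part of the third.
segments = piecewise_segments(12.0)
```

The segments always sum to the floored and capped ratio, so no information inside the admissible range is lost by the conversion.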
Size is considered to be the single most significant factor in the context of credit ratings (Blume et al., 1998),
and is always included in both rating change and default prediction models. Bigger companies are
considered less risky, with more stable cash flows; therefore the probability of downgrade should be
negatively related to size.
High profitability, cash flow adequacy, operating efficiency and liquidity are all indicators of a
healthy, revenue-generating company; thus these measures should also be negatively correlated with
the probability of downgrade.
Leverage and interest coverage ratios are indicators of financial distress; thus they should be
positively related to the downgrade variable.
Since the sample spans 39 countries and 25 years, it is important to include macroeconomic variables.
Empirical evidence shows that both defaults and credit ratings are procyclical (Allen & Saunders,
2003); however, since CRAs claim to rate "through the cycle", while there is an obvious cyclicality
in firms' financials, the extent to which ratings are, or should be, dependent on the firm's external
environment is still a debated topic (Nickell et al., 2000).
Three of the country-specific variables relate to economic growth. While measures of economic
growth are commonly used in related literature, the impact on rating downgrades is not very
straightforward: firstly because of the arguments formerly mentioned, and secondly because, usually, the
time when the economy grows fastest is also the time when there is a lot of slack, typically right
after a crisis period. Therefore, even if GDP growth is considered as a candidate for the final model
consistently with previous literature, we argue that leading indicators would be more appropriate.
Furthermore, using such indicators makes more sense when developing a model for prediction
purposes.
Thus, apart from GDP, two other variables are considered: the output gap, i.e. the
difference between potential and realized GDP, and the consumer leading indicator, which is an
aggregated measure of consumer sentiment calculated by the OECD 25 . Consistent with the literature
supporting the procyclicality of credit ratings, we would expect these measures to be negatively
correlated with the probability of downgrade.
Other factors which could impact ratings relate to the financial markets. Here, I again choose two
leading indicators: aggregated returns for all stocks traded in the national stock market, and the
yield spread (which represents the difference between the long- and short-term interest rates). In a
rational market, we would expect that a market with volatile stocks would also have more volatile
bond returns, which means lower ratings. Ceteris paribus, long-term instruments would be preferred
over short-term ones in a risky market.
Lastly, the sovereign credit rating is taken into account, since the rating of the domicile country of
a firm is an important threshold in the rating methodology; however, since we have observations only
for investment-grade sovereigns, it is very probable that this variable lacks any explanatory power.
Since most macroeconomic variables are considered leading indicators, there is no need to lag the
data. Furthermore, it is very difficult to decide on a specific lag. However, we test a weighted-average
lagged sum following Figlewski et al. (2011) for GDP, output gap and yield spread, but the results
are not significant.
Furthermore, since there is strong empirical evidence for industry effects (see Chava & Jarrow, 2001,
for a more detailed discussion), an industry dummy based on the GICS sector classification 26 is used in
the baseline model. Alternatively, a ranking based on each industry's sensitivity to external shocks27 is
tested as well.
The variable description, as well as references to studies using them, can be found in table 1.
25
http://www.oecd.org/std/leading-indicators/41629509.pdf
26
http://www.msci.com/resources/factsheets/MSCI_Global_Industry_Classification_Standard.pdf; the industry group
distribution is shown in appendix 3B; the standard SIC classification used in related literature would generate more groups
with fewer observations, which is why the GICS general group classification is used instead
27
Variable constructed using Gaguin (2000)'s classification, pg.25
show significant
McFadden R-squared: similarly to the R-squared in OLS, it measures the fit of the model;
however, it is not directly comparable with its OLS counterpart;
Akaike information criterion: provides a measure of the relative quality of the model, dealing with
the trade-off between goodness of fit and model complexity; the lower the AIC, the better the
model;
LR statistic: the equivalent of the regression F-stat for OLS, representing the overall
significance of the model; thus, a higher LR value is desirable (which is equivalent to lower p
values)
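The three statistics listed above can all be computed from the maximized log-likelihoods of the fitted model and of an intercept-only model. The sketch below uses the textbook definitions; the function names and the log-likelihood values are hypothetical, and some statistical packages report the Akaike criterion divided by the number of observations, so reported values may be on a different scale.

```python
def mcfadden_r2(ll_model, ll_null):
    """McFadden R-squared: 1 minus the ratio of the two log-likelihoods."""
    return 1.0 - ll_model / ll_null

def lr_statistic(ll_model, ll_null):
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (ll_model - ll_null)

def aic(ll_model, n_params):
    """Akaike information criterion (unscaled): lower is better."""
    return 2.0 * n_params - 2.0 * ll_model

# Hypothetical log-likelihoods (illustrative only).
ll_null, ll_model, n_params = -200.0, -130.0, 6

r2 = mcfadden_r2(ll_model, ll_null)
lr = lr_statistic(ll_model, ll_null)
criterion = aic(ll_model, n_params)
```

Under the null that all slope coefficients are zero, the LR statistic is compared against a chi-squared distribution with as many degrees of freedom as there are slope coefficients.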
After all these filtering steps, the final model includes 6 factors, and the final regression is thus:
Downgrade_dummy = c + industry dummy + log(mv) + ebita/ta + td/ta + sales/(ta-ca) + spread
The variables we chose as proxies for liquidity and cash flow adequacy did not improve any of the
aforementioned test statistics, and were therefore excluded from the model.
28
Unless otherwise specified, the sample used is (1), i.e. where the non-downgraded group is matched solely by time.
29
The baseline model for our stepwise procedure is a logistic regression including only the intercept and industry effects;
to the baseline model we add each of the significant variables from a subgroup, and we keep the one that performs best
according to 5 different statistics.
Sample                        (1)           (2)           Full sample (3)
Non-downgrade observations    224           112           6332
Downgrade observations        106           106           106

Variables
LOG(MV)                       -0,801        -0,749        -0,824
                              (0,114)***    (0,151)***    (0,098)***
EBITA/TA                      -10,114       -9,598        -9,664
                              (4,350)**     (3,258)***    (1,331)***
TD/TA                         3,062         2,921         3,824
                              (1,060)**     (1,319)**     (1,223)**
SALES/(TA-CA)                 -0,612        -0,663        -0,680
                              (0,144)***    (0,128)***    (0,137)***
SPREAD                        0,311         0,231         0,231
                              (0,110)**     (0,131)*      (0,076)**
Industry effects              Yes***        Yes***        Yes***

Regression tests
McFadden R2                   0,357         0,321         0,262
Akaike                        0,893         1,069         0,128
LR statistic                  146,753***    96,906***     283,716***
S.E. of regression            0,361         0,408         0,118
The variables are selected according to the stepwise procedure described in section .. *** denotes 1%
significance level, ** 5% and * 10%. The standard errors are heteroskedasticity-adjusted using the White
method.
The coefficients are all significant and robust across different specifications. From a statistical
perspective, the full sample model seems to be superior, which is not surprising given the much larger
number of observations. Additionally, the fact that all variables in (3) are significant given the highly
unbalanced nature of the sample is evidence of their strong explanatory power30. However, as discussed
in section 4, from an economic standpoint, using sample (3) for estimating the model has serious
drawbacks. Consistent with Zmijewski (1984) and Palepu (1986), sample (2) is weaker in comparison
to (1), confirming our hypothesis that the time-matched sample should lead to the best outcome.
More important than the statistical quality of the model is, of course, its economic significance. First,
the coefficient signs are as expected, confirming our original hypotheses. One further aspect is the
magnitude of the coefficients, which is not as straightforward to interpret as in OLS models. Therefore, in order
to compare the magnitude of different factors, we compute the marginal effects for model (1) below.
Table 4. Economic magnitude.
Variables             Marginal effects
LOG(MV)               -0,18905
EBITA/TA              -2,38719
TD/TA                 0,722621
SALES/(TA-CA)         -0,14437
SPREAD                0,073403
Basic Materials       0,976603
Consumer Goods        1,089732
Consumer Services     1,228993
Health Care           0,757763
Industrials           0,927857
[...]                 1,052082
Technology            1,311143
Telecommunications    1,32375
The marginal effects are calculated [...]
[30] The outcome is also consistent with Ohlson (1980), who argues that logistic regression should perform
well even for highly unbalanced samples.
Consistent with the previous literature on both rating changes and default prediction, profitability
has the highest impact relative to the other firm-level factors. More specifically, according to our
model, an increase of one percentage point (0,01) in EBITA/TA lowers the probability of a downgrade to
speculative grade by about 2,39 percentage points, other things equal.
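As a rough consistency check, the reported marginal effects can be compared with the logit coefficients of model (1). In a logit model the marginal effect of a variable equals its coefficient times a common scale factor p(1-p), so the ratio of marginal effect to coefficient should be nearly identical across variables. The sketch below only verifies that ratio and converts the EBITA/TA effect to a one-percentage-point change; the point at which the paper evaluates the effects is not stated in the surviving text, and the helper name `implied_scale` is illustrative.

```python
# Logit coefficients for model (1) and the reported marginal effects,
# copied from the tables in the text (decimal commas written as points).
coefficients = {
    "LOG(MV)": -0.801,
    "EBITA/TA": -10.114,
    "TD/TA": 3.062,
    "SALES/(TA-CA)": -0.612,
    "SPREAD": 0.311,
}
marginal_effects = {
    "LOG(MV)": -0.18905,
    "EBITA/TA": -2.38719,
    "TD/TA": 0.722621,
    "SALES/(TA-CA)": -0.14437,
    "SPREAD": 0.073403,
}


def implied_scale(coefs, effects):
    """Return the ratio marginal effect / coefficient for every variable."""
    return {name: effects[name] / coefs[name] for name in coefs}


scales = implied_scale(coefficients, marginal_effects)
for name, scale in scales.items():
    print(f"{name:15s} implied scale factor = {scale:.4f}")

# Change in the downgrade probability (in probability units) for a
# 0.01 increase in EBITA/TA, using the reported marginal effect.
delta_p = marginal_effects["EBITA/TA"] * 0.01
print(f"Effect of +0.01 in EBITA/TA: {delta_p:.4f}")
```

All five ratios come out close to 0.236, which suggests that the marginal effects were obtained by scaling every coefficient with the same factor.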
Interestingly, the magnitude of all industry coefficients is much higher than that of the spread,
suggesting that the industry matters more than business cycle conditions. However, given that some
industries are procyclical as well, drawing a firm conclusion in this context can be difficult.
Lastly, since the related literature employs ordered probit to analyze the entire rating scale (while
taking into account the full period spent in a certain rating category, as formerly discussed), a direct
comparison of the models is not possible.
8.Predictive power
In contrast with other rating prediction models, this paper proposes a cutoff score based on the sample
distribution, following the arguments discussed in section (2).
By comparing the sample-specific cutoff scores with the range of cutoffs for the 3 samples, we find
evidence that choosing such a cutoff indeed minimizes the total error rate.
Below, the trade-off between the 2 errors is shown over the cutoff range for model (1):
[Figure: Error trade-off. Horizontal axis: Type I error (0 to 0,8); vertical axis: Type II error (0 to 0,6).]
However, as type I errors are more costly for market participants [31], the optimal cutoff score in this
case is not the one minimizing the total error, but the one minimizing the type I error alone [32].
Determining the optimal cutoff score is practically impossible, because the costs of the 2 errors would
have to be known (the method is presented in Hsieh, 1993). We acknowledge this issue as a limitation, but
focus on discussing the type I error rates. In any case, the cutoff scores we use also correspond to
type I error values which are close to the minimal ones.
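The cutoff rule and the two error rates can be illustrated with a minimal sketch. It assumes, following footnote 31, that a type I error is an actual downgrade classified as a non-downgrade and a type II error is the reverse; the labels and fitted probabilities are made-up toy values, not data from the paper, and the function names are hypothetical.

```python
def sample_cutoff(y_true):
    """Cutoff equal to the share of downgrade (1) observations in the sample."""
    return sum(y_true) / len(y_true)


def classification_errors(y_true, p_hat, cutoff):
    """Return (type I, type II) error rates for a given probability cutoff.

    Type I: an actual downgrade (1) classified as a non-downgrade.
    Type II: a non-downgraded firm (0) classified as a downgrade.
    """
    pairs = list(zip(y_true, p_hat))
    missed = sum(1 for y, p in pairs if y == 1 and p < cutoff)
    false_alarms = sum(1 for y, p in pairs if y == 0 and p >= cutoff)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return missed / n_pos, false_alarms / n_neg


# Toy example (illustrative values only): 3 downgrades, 7 non-downgrades.
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
p = [0.80, 0.45, 0.20, 0.60, 0.35, 0.25, 0.10, 0.10, 0.05, 0.05]

cutoff = sample_cutoff(y)
type1, type2 = classification_errors(y, p, cutoff)
print(f"cutoff={cutoff:.2f} type I={type1:.2f} type II={type2:.2f}")

# Lowering the cutoff reduces type I errors at the expense of type II errors.
low_type1, low_type2 = classification_errors(y, p, 0.15)
print(f"cutoff=0.15 type I={low_type1:.2f} type II={low_type2:.2f}")
```

In this toy sample the lower cutoff removes the missed downgrade but raises the false-alarm rate, which is the trade-off discussed in the text.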
As shown in table 6, the holdout sample classification tables reveal that model (1) performs best
compared to the other specifications, with a type I error of 22% and an average error rate of 17%.
[31] The cost of a type I error can be considered as, for example, the opportunity cost of holding a long
position in a security which was downgraded to speculative grade.
[32] Minimizing one of the errors is done at the expense of increasing the other.
                     Sample (1)     Sample (2)     Full sample (3)
Obs(0)               229            114            7269
Obs(1)               122            122            1022
LOG(MV)              -0,779         -0,746         -0,875
                     (0,108)***     (0,152)***     (0,095)***
EBITA/TA             -11,115        -10,734        -9,496
                     (4,517)**      (3,579)***     (1,159)***
TD/TA                2,488          2,799          3,442
                     (1,063)**      (1,278)**      (1,159)**
SALES/(TA-CA)        -0,171         -0,188         -0,164
                     (0,110)***     (0,034)***     (0,016)***
SPREAD               0,266          0,174          0,143
                     (0,110)**      (-0,125)       (0,076)**
INDUSTRY EFFECT      Yes***         Yes***         Yes***
McFadden R2          0,347          0,314          0,254
Akaike               0,922          1,068          0,127
LR statistic         157,55***      102,744***     316,533***
S.E.of regression    0,369          0,407          0,119
The variables represent annual values. Industry effects coefficients are not reported for brevity. However, they
are similar to the ones in the quarterly regression. *** denotes 1% significance level, ** 5% and * 10%. The
standard errors are heteroskedasticity-robust, computed using White's method. Overall, the coefficients and their
significance are similar to the quarterly regressions, with slightly lower standard errors. The regression
statistics show a slightly poorer fit and slightly worse information criteria compared to the quarterly observations.
            quarterly observations          annual observations
            Smpl(1)   Smpl(2)   Smpl(3)     Smpl(1)   Smpl(2)   Smpl(3)
cutoff      32%       47%       0,17%       34%       49%       0,16%
type I      22%       25%       25%         21%       27%       27%
type II     11%       20%       18%         15%       20%       22%
average     17%       23%       22%         18%       24%       25%
Smpl (1), (2), (3) represent the 3 different specifications based on different choices of treatment groups; we
repeat the same sampling technique for annual observations as well. The cutoff score is obtained by dividing
the number of downgrade (1) observations by the total sample size.
Given the nature of the dataset and the small sample size (which does not allow a large number of
variables), omitted variable bias is an important concern.
Since we cannot possibly account for all the potential omitted variables, we consider using fixed
effects as a reasonable alternative. However, given the unbalanced nature of our dataset and the nonlinear
estimation model, including fixed effects in the logit model would generate an estimation error.
Therefore, we estimate an OLS model using the same variables and adding year fixed effects. For
brevity, we report only the results for the model (1) regression with quarterly data.
Lastly, a small sample is particularly sensitive to outliers. We rerun all regressions after excluding
several observations which we label as outliers according to the interquartile range rule [33], and we do
not find significant differences.
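The interquartile range rule can be sketched as follows. This is a minimal illustration, not the screening code used for the paper: the quartile convention (the default of Python's `statistics.quantiles`) and the sample values are assumptions, and the paper does not state which variables were screened.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) bounds Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the observations lying outside the IQR bounds."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


# Illustrative values only: one observation is far from the rest.
ratios = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
outliers = find_outliers(ratios)
print(outliers)
```

Observations flagged this way would be dropped before re-estimating the model, and the two sets of estimates compared.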
Although OLS and logit coefficients cannot be compared directly, the results are comparable in terms
of signs and relative magnitude. Therefore, in spite of potential omitted variables, the results are
consistent.
By focusing only on fallen angels and the quarter immediately following their downgrade, this paper
shows that a parsimonious logistic model performs well both in terms of explanatory and predictive
power. All our variables are significant both statistically and economically, and robust across
different specifications, while the holdout prediction accuracy for fallen angels (one minus the type I
error rate) ranges between 75% and 78% for the quarterly samples.
This paper can be considered a prerequisite for further research on issuer rating prediction in
general, and on fallen angels in particular. Given the impact of rating changes, especially around the
aforementioned boundary, all stakeholders involved in the credit market should benefit from such
research.
A follow-up study could analyze the presumed asymmetry between downgrades and upgrades around the
investment-to-speculative grade boundary at the variable level (i.e. using factor analysis to investigate
the extent to which the two rating changes are influenced by different factors). The same
rationale could be extended further to analyzing the entire rating spectrum (e.g. showing that different
variables are more relevant for certain rating groups only).
[33] Rule of thumb which treats as outliers the observations outside the interval
(Q1 - 1,5*IQR; Q3 + 1,5*IQR), where IQR = Q3 - Q1, and Q1 and Q3 are the first and third quartiles.
                     Model(1)       Without outliers   OLS, year fixed effects
Obs(0)               224            223                224
Obs(1)               106            105                106
Variables
LOG(MV)              -0,801         -0,789             -0,107
                     (0,114)***     (0,115)***         (0,016)***
EBITA/TA             -10,114        -10                -0,903
                     (4,350)**      (4,395)**          (0,197)***
TD/TA                3,062          3,044              0,394
                     (1,060)**      (1,082)**          (0,166)**
SALES/(TA-CA)        -0,612         -0,6               -0,078
                     (0,144)***     (0,145)***         (0,020)***
SPREAD               0,311          0,365              0,009
                     (0,110)**      (0,162)**          -0,025
Industry effect      Yes***         Yes***             Yes***
Time effect          No             No                 Yes***
Regression tests
McFadden R2          0,357          0,358              0,496
Akaike               0,893          0,889              1,017
LR statistic         146,753***     147,439***
S.E.of regression    0,361          0,359              0,369
The first column represents the main regression output, the same as in the first column of table .. The 2nd
column shows the regression output without 2 observations which were considered outliers according to the
interquartile range rule of thumb. The last column is an OLS estimation including year fixed effects; thus, the
coefficients and the test statistics from the 3rd column are not directly comparable with the other estimations.
Reference list:
Allen, L., Saunders, A., 2003. A survey of cyclical effects in credit risk measurement model. BIS
Working Paper,126
Altman, E. I. ,1968, Financial Ratios, Discriminant Analysis and the Prediction of Corporate
Bankruptcy, Journal of Finance, 23, 589-609
Altman, E.I., Avery, R.B., Eisenbeis, R.A., Sinkey, J.F., 1981. Application of Classification
Techniques in Business, Banking and Finance. JAI Press, Connecticut.
Altman, E. and Duen Li Kao ,Rating Drift in High-Yield Bonds ,The Journal of Fixed Income March
1992, Vol. 1, No. 4: pp. 15-20 ,DOI: 10.3905/jfi.1992.408035
Bellovary, G., Jodi L., Giacomino, Don E. and Akers, Michael D.,2006, A Review of Bankruptcy
Prediction Studies: 1930 to Present. Available at SSRN: http://ssrn.com/abstract=892160
Blume, M.E., Lim, F., MacKinlay, A.C., 1998. The declining credit quality of U.S. corporate debt:
Myth or reality? Journal of Finance, 53, 1389-1414.
Chava, Sudheer and Jarrow, Robert A.,2004. Bankruptcy Prediction With Industry Effects . Available
at SSRN: http://ssrn.com/abstract=287474
Chernenko, Sergey, and Adi Sunderam. The real consequences of market segmentation. Review of
Financial Studies 25.7 (2012): 2041-2069.
Cram, D. P., Karan, V. and Stuart, I. (2009), Three Threats to Validity of Choice-based and Matched-Sample
Studies in Accounting Research. Contemporary Accounting Research, 26: 477-516.
doi: 10.1506/car.26.2.7
Dimitras, A.I., Zanakis, S.H. and Zopounidis, C. (1996), A survey of business failure with an emphasis
on prediction methods and industrial applications, European Journal of Operational Research, Vol.
90, pp. 487-513.
Ederington, L. H., 1985. Classification models and bond ratings, The Financial Review 20, 237-261.
Ederington, Louis H., and Jess Yawitz, 1998, The bond rating process, Handbook of Financial
Markets and Institutions, 6th ed. ,John Wiley and Sons, New York.
Figlewski, Stephen, Halina Frydman, Weijian Liang, Modeling the effect of macroeconomic factors on
corporate default and credit rating transitions, International Review of Economics and Finance,
Volume 21, Issue 1, January 2012, Pages 87-105, ISSN 1059-0560, 10.1016
Moody's Investor Service, 2013, Rating Transition Risk for Investment-Grade Issuers
Moody's Investor Services, 2010, Corporate Comment.
Moody's Investor Services, 2002, Rating Methodology
Nickell, P., W. Perraudin and S. Varotto (2000), Stability of Rating Transitions, Journal of Banking
and Finance, Vol. 24, pp. 205-229.
Norden, L. and M. Weber, 2004. Informational efficiency of credit default swap and stock
markets:The impact of credit rating announcements, Journal of Banking and Finance 28, 2813-2843.
Ohlson, J. A. 1980. Financial ratios and the probabilistic prediction of bankruptcy. Journal of
Accounting Research 18 (1): 109-131
Palepu, K. G. 1986. Predicting takeover targets: A methodological and empirical analysis. Journal of
Accounting & Economics 8 (March): 3-35.
West, R.R., 1970. An alternative approach to predicting corporate bond ratings. Journal of
Accounting Research, 8, 118-125.
Zmijewski , Mark E. Methodological Issues Related to the Estimation of Financial Distress Prediction
Models,Journal of Accounting Research , Vol. 22, Studies on Current Econometric Issues in
Accounting Research (1984), pp. 59-82
Web Resources:
http://www.bis.org/statistics/index.htm
http://www.msci.com/resources/factsheets/MSCI_Global_Industry_Classification_Standard.pdf
http://www.oecd.org/std/leading-indicators/41629509.pdf
http://www.sec.gov/spotlight/xbrl/nrsro-implementation-guide.shtml
http://www.sec.gov/spotlight/dodd-frank/creditratingagencies.shtml
http://www.moodys.com/ratings-process/Ratings-Definitions/002002
Variable groups and references:
Size: Altman(1968), Shumway(2001), Blume(1998)
Profitability: Altman&Kao(1977), Ohlson(1981), Hwang et al(2010)
Leverage: Beaver(1966), Platt&Platt(1990), Amato&Furfine(2004)
CF adequacy: Beaver(1966), Altman&Katz(1974), Hwang et al(2010)
Liquidity: Altman(1977), Shumway(2000), Blume(1998)
Operating efficiency: Altman(1968), Laitinen(1992), Hwang et al(2010)
Macro variables: Chava&Jarrow(2001), Allen&Saunders(2000), Amato&Furfine(2004)
Industry effects
Variables (description in parentheses where given):
log(ta) (log(total assets)), log(sales) (log(sales)), log(mv) (log(market value)),
ni/ta, oi/ta, re/ta, ebit/ta, ebita/ta,
td/ta, tl/ta, nd/ta, ni/(ta-tl), oi/(ta-tl), td/mv, tl/mv, nd/mv,
ic_a (interest coverage), ncf/ta, (ncf+capex)/ta, ncf/td, ncf/tl, ncf/Nd, ncf/sales,
ca/cl (current ratio), (ca-cl)/ta, (cash+ar)/cl, ca/ta, cl/ta,
sales/(ta-ca) (sales/fixed assets), sales/ta (sales/total assets), ca/sales (current assets/sales),
spread, gdp, og (output gap), si, cli, sovereign, industry_risk*
expected
sign
+
+
+
+
+
+
+
+
+
+
Std,
Dev,(1)
Std,
Dev,(0)
p(mean)
Rsq
Akaike
LR
SE
p(coeff)
15,962
1,207
1,291
0,003
0,07
1,20
34,42
0,45
0,00
14,295
1,323
1,412
0,003
0,09
1,19
43,83
0,44
0,00
7,557
8,648
1,652
1,492
0,000
0,17
1,13
88,18
0,43
0,00
-0,030
0,046
0,147
0,068
0,000
0,24
1,01
101,39
0,39
0,04
0,005
0,022
0,029
0,054
0,000
0,20
1,05
85,66
0,40
0,08
224
0,171
0,202
0,254
0,225
0,300
117
262
-0,011
0,021
0,073
0,055
0,000
0,21
1,04
88,11
0,40
0,00
117
254
0,050
0,135
0,149
0,093
0,000
0,26
0,99
111,70
0,38
0,00
122
272
0,359
0,304
0,160
0,133
0,001
0,29
0,96
121,82
0,38
0,00
122
272
0,654
0,616
0,161
0,143
0,027
0,27
0,98
114,16
0,38
0,03
121
271
0,275
0,239
0,188
0,166
0,069
116
258
-0,253
0,175
1,373
0,819
0,002
0,26
0,99
111,95
0,39
0,07
120
267
0,017
0,080
0,147
0,329
0,011
0,26
1,00
110,52
0,38
0,09
111
250
3244
3902
16634
35509
0,811
111
250
5461
9326
26720
90284
0,537
110
249
2753
3645
14753
34735
0,733
135
255
-9,030
7,210
144,944
10,104
0,196
117
257
0,073
0,104
0,053
0,070
0,000
0,29
0,96
125,10
0,37
0,32
115
252
0,121
0,169
0,080
0,097
0,000
0,30
0,95
124,07
0,37
0,33
117
257
1,107
1,733
8,407
13,597
0,587
117
257
0,124
0,184
0,107
0,142
0,000
0,29
0,97
122,3
0,38
0,72
116
256
0,274
1,747
1,672
13,526
0,089
119
255
0,463
0,663
0,463
0,528
0,000
0,30
0,96
125,20
0,38
0,45
120
272
1,564
1,364
0,962
0,714
0,044
0,28
0,97
119,44
0,38
0,10
120
272
0,099
0,071
0,171
0,129
0,113
0,33
0,92
140,14
0,37
0,00
Obs(1)
obs(0)
Mean(1)
Mean(0)
122
272
15,550
124
270
13,853
141
260
117
258
121
267
103
75
195
0,872
0,851
0,624
0,653
0,806
122
272
0,337
0,309
0,191
0,195
0,188
120
272
0,239
0,238
0,119
0,135
0,914
122
270
0,452
0,551
0,485
1,306
0,000
122
270
0,237
0,246
0,176
0,201
0,649
122
270
1,763
1,487
1,284
0,821
0,031
0,30
0,96
125,55
0,38
0,15
157
274
-1,132
-1,676
1,362
1,506
0,000
0,36
0,89
147,75
0,36
0,00
157
273
0,285
0,484
1,163
0,856
0,062
157
276
-1,027
-1,643
2,488
2,837
0,020
0,34
0,92
138,63
0,37
0,32
156
275
2,931
-1,124
8,517
9,539
0,000
0,35
0,91
141,41
0,37
0,01
156
274
99,454
100,15
1,450
1,396
0,000
0,34
0,91
139,92
0,37
0,01
149
275
9,779
9,687
0,743
0,738
0,228
161
284
1,739
1,919
0,905
0,961
0,079
The first part of the table shows descriptive statistics for the downgrade (group 1, treated) versus non-downgrade (group 0, control) groups, for the time-matched sample of model (1). All the variables except the industry dummy are included. P(mean) is the p-value of the t-test for equal means between the treated and control groups. The last 5 columns show various tests and statistics obtained when adding each of the variables to a model including the intercept and the industry dummy: McFadden R-squared, Akaike information criterion, likelihood ratio (LR) statistic, standard error of regression and coefficient p-values. Below the variable groups there are references to the models which use any of the group variables; the last reference for each group refers to a rating change model, while the first ones refer to default prediction. The references for macro variables also cover industry effects.
Variables (repeated for each of the three samples): DOWNGRADE DUMMY, log(MV), EBITA/TA, TD/TA, SALES/(TA-CA), SPREAD
Mean: 0,321 8,341 0,107 0,319 0,547 -1,455 0,486 8,040 0,090 0,327 8,573
Median: 0,000 8,452 0,111 0,311 0,309 -1,510 0,000 8,236 0,106 0,325 8,719
Maximum: 1,000 11,929 0,435 0,844 17,666 0,414 0,844 19,291
Minimum: 0,000 0,223 -0,769 0,001 0,041 -3,820 0,000 -0,083 -0,769 0,001
Q-baseline: obs.nr: 330 218 6438
Mean: 0,323 8,333 0,098 0,320 0,600 -1,427 0,470 8,019 0,080 0,329 8,580
Median: 0,000 8,444 0,108 0,311 0,322 -1,480 0,000 8,292 0,106 0,319 8,719
Maximum: 1,000 11,391 0,415 0,844 17,666 0,414 0,844 19,291
Minimum: 0,000 0,223 -0,769 0,001 0,041 -3,820 0,000 -0,083 -0,769 0,001
Std. Dev.: 0,469 1,580 0,136 0,150 1,426 1,577 0,501 1,659 0,156 0,151 1,642 1,455 0,127
Q-reduced: obs.nr: 217 149
The first table contains the summary statistics for the final regression variables, using quarterly data. Q-baseline: obs.nr represents the total number of observations for the sample used for the main regression specification. Q-reduced represents a random 70% of Q-baseline, being the subsample used for estimating the model for prediction purposes (the predictive accuracy is tested on the remaining 30% of the sample). The second table is analogous to the first one, except for the fact that annual data is used.
Annual
Variables (repeated for each of the three samples): DOWNGRADE DUMMY, log(MV), EBITA/TA, TD/TA, SALES/(TA-CA), SPREAD
Mean: 0,348 8,289 0,099 0,320 2,036 -1,451 0,517 7,982 0,082 0,325 8,559
Median: 0,000 8,403 0,107 0,312 1,181 -1,510 1,000 8,168 0,103 0,333 8,697
Maximum: 1,000 11,929 0,391 0,884 73,227 0,348 0,884 71,638
Minimum: 0,000 0,223 -0,769 0,001 0,129 -3,820 0,000 -0,083 -0,769 0,001
Std. Dev.: 0,477 1,602 0,122 0,141 4,583 1,511 0,501 1,624 0,133 0,146 4,962 1,383 0,127
A-baseline: obs.nr: 351 236
Mean: 0,345 8,289 0,091 0,321 2,227 -1,404 0,494 7,969 0,074 0,329 8,566
Median: 0,000 8,411 0,106 0,304 1,228 -1,450 0,000 8,164 0,105 0,331 8,702
Maximum: 1,000 11,391 0,355 0,884 73,227 0,348 0,884 71,638
Minimum: 0,000 0,223 -0,769 0,001 0,129 -3,820 0,000 -0,083 -0,769 0,001
Std. Dev.: 0,476 1,603 0,134 0,145 5,441 1,549 0,502 1,678 0,151 0,146 5,926 1,429 0,126
A-reduced: obs.nr: 229 160
[Figure: Number of defaults from 1920 to 2010 for investment and speculative graded companies (Moody's
Investor Service, 2010). Vertical axis from 0 to 150; legend entry: speculative.]
From \ To   Aaa      Aa       A        Baa      Ba       B        Caa-C    Default  WR
Aaa         87.65%   8.48%    0.61%    0.01%    0.03%    0.00%    0.00%    0.00%    3.22%
Aa          1.01%    86.26%   7.82%    0.34%    0.05%    0.02%    0.01%    0.02%    4.47%
A           0.06%    2.78%    87.05%   5.21%    0.48%    0.09%    0.03%    0.05%    4.24%
Baa         0.04%    0.19%    4.65%    84.41%   4.20%    0.79%    0.20%    0.17%    5.35%
Ba          0.01%    0.06%    0.38%    5.66%    75.74%   7.25%    0.61%    1.13%    9.16%
B           0.01%    0.04%    0.13%    0.35%    4.81%    73.52%   6.35%    4.37%    10.42%
Caa-C       0.00%    0.02%    0.02%    0.14%    0.42%    7.52%    62.40%   16.68%   12.80%
Figure 2. Average One-Year Broad Rating Migration Rates, 1970-2009 (Moody's investor report, Special
Comment); the highlighted percentage represents the "fallen angels" from 1970 to 2009.
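To make the fallen-angel reading of the migration rates concrete, the sketch below stores each origin rating's one-year rates and sums the probability of moving from an investment-grade rating to a speculative-grade rating (Ba, B or Caa-C). Which cells are highlighted in the original figure is not recoverable here, so treating all three speculative destinations as fallen angels is an assumption, as are the names used in the code.

```python
RATINGS = ["Aaa", "Aa", "A", "Baa", "Ba", "B", "Caa-C"]
COLUMNS = RATINGS + ["Default", "WR"]
SPECULATIVE = {"Ba", "B", "Caa-C"}

# One-year migration rates in percent, one row per origin rating,
# in the order of COLUMNS (values taken from Figure 2).
MIGRATION = {
    "Aaa":   [87.65, 8.48, 0.61, 0.01, 0.03, 0.00, 0.00, 0.00, 3.22],
    "Aa":    [1.01, 86.26, 7.82, 0.34, 0.05, 0.02, 0.01, 0.02, 4.47],
    "A":     [0.06, 2.78, 87.05, 5.21, 0.48, 0.09, 0.03, 0.05, 4.24],
    "Baa":   [0.04, 0.19, 4.65, 84.41, 4.20, 0.79, 0.20, 0.17, 5.35],
    "Ba":    [0.01, 0.06, 0.38, 5.66, 75.74, 7.25, 0.61, 1.13, 9.16],
    "B":     [0.01, 0.04, 0.13, 0.35, 4.81, 73.52, 6.35, 4.37, 10.42],
    "Caa-C": [0.00, 0.02, 0.02, 0.14, 0.42, 7.52, 62.40, 16.68, 12.80],
}


def fallen_angel_rate(origin):
    """Percent of issuers moving from `origin` to a speculative rating in one year."""
    row = MIGRATION[origin]
    return sum(rate for col, rate in zip(COLUMNS, row) if col in SPECULATIVE)


for origin in ["Aaa", "Aa", "A", "Baa"]:
    print(f"{origin:4s} -> speculative grade: {fallen_angel_rate(origin):.2f}%")
```

Under this reading, almost all fallen angels originate from the Baa category, which borders the speculative grades.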
Figure A represents the country distribution for the initial sample of 450 companies. The total
number of countries is 37; however, the figure shows only the countries with more than 5 companies in
the sample, and the rest of the countries are aggregated as "other". Figure B shows the Global Industry
Classification Standard (GICS) industry distribution across the initial sample. Figure C represents the
time distribution of the number of downgrades to speculative grade during the analyzed period for
the sample companies.
A.Country distribution [figure; categories: Sweden, Australia, Finland, Germany, Norway, Mexico, France, UK, Japan, Canada, USA, Other]
B.Industry distribution [figure; categories: Basic Materials, Consumer Goods, Consumer Services, Health Care, Industrials, Technology, Telecommunications, Utilities]
[Figure C; time axis from 1985 to 2015]
The "date" and the "rating" columns represent raw data for the rating changes. The dates were converted to financial quarters, adjusting for non-calendar quarters and matching with sovereigns' financial year ends. For simplification, in this example the same dates and calendar financial year ends are assumed for the 2 companies. The fallen angels receive "1" only for the quarter following the downgrade, as opposed to
other studies, which take the entire BB period into account. The investment-grade companies are the ones which were never downgraded to
speculative grade during the period of this study. They get a "0" during the entire period of BBB rating, except for the first BBB quarter. The one-quarter
delay is consistent with the fact that it is unrealistic to assume that companies are downgraded exactly when their financials start to deteriorate.