Documentos de Académico
Documentos de Profesional
Documentos de Cultura
www.elsevier.com/locate/jce
1. Introduction
A resurgence of interest in income distribution in developed, developing and transition
economies is found in the literature. Atkinson (2001) notes the doubling back of inequality
after an inverted-U pattern in developed countries. Cornia and Popov (2001) investigate
rapidly rising inequality in transitional economies. Datt and Ravallion (1992) and Dollar
and Kraay (2002) consider the recent controversy over the effects of growth versus
redistribution on poverty reduction in the developing world. Traditional approaches
E-mail address: wan@wider.unu.edu.
0147-5967/$ see front matter 2004 Association for Comparative Economic Studies. Published by Elsevier
Inc. All rights reserved.
doi:10.1016/j.jce.2004.02.005
349
to income distribution are mostly descriptive rather than prescriptive. They involve
measuring the extent of inequality and speculating on its determinants. Following the
work of Shorrocks (1980, 1982), inequality decomposition by income sources requires
an identity to express income as the sum of several components. Conversely, inequality
decomposition by population subgroups provides rather limited information on the
fundamental determinants of inequality, e.g., differences in human and physical capital,
dependency ratios, globalization, and technical change.
Since the early 1970s, economists have used the regression-based approach to inequality
decomposition. Unlike its traditional counterparts, this approach allows the contributions
of the regression variables to total inequality to be quantified. Although the early work is
limited regarding the number and type of variables that can be considered, recent advances
have relaxed this restriction. In theory, regression-based inequality decomposition permits
the inclusion of any number or type of variables or even proxies, including social,
economic, demographic and policy factors. The flexibility of this approach, particularly its
ability to accommodate endogeneity of income determination and random errors, makes it
attractive to economists, and policy-makers. Oaxaca (1973) and Blinder (1973) are the
pioneers of this approach; they focus on the difference in mean income between two
groups. Other moments of the income-generating process are not considered. Juhn et
al. (1993) extend this approach so that the decomposition depends on the difference in
the entire income distribution between two groups rather than on the difference in mean
income only. Bourguignon et al. (2001) relax the requirement of a linear income-generating
function imposed by Juhn et al. (1993). All these authors focus on explaining differences
in income distribution between distinct groups of income recipients; they do not quantify
the contributions of specific factors to total inequality. Hence, only a limited number of
inequality-related impacts can be identified, although these impacts could be functions of
more fundamental determinants. For example, the technique proposed by Bourguignon et
al. (2001) can be used to decompose differences in income distribution into only three
broad components, namely, price effects, participation effects and population effects.
In a different strand of literature using semiparametric and nonparametric techniques,
DiNardo et al. (1996) and Deaton (1997) describe and compare the entire distribution of
the target variable in terms of the density function, rather than attempting to decompose
a summary measure of inequality. Although they impose few structural assumptions, the
findings are less conclusive than economists and policy makers would prefer, as Morduch
and Sicular (2002) argue. Fields and Yoo (2000) and Morduch and Sicular (2002) employ
conventional techniques to specify and estimate parametric income-generating functions
and derive inequality decompositions based on the estimated regression equations. Their
conceptual framework allows for any number of fundamental income determinants, but
suffers from a number of restrictions. Our paper builds on the work of Fields and Yoo
(2000) and Morduch and Sicular (2002); we present a critical evaluation of these papers in
Section 2.
Our paper is motivated by three issues. First, any regression-based inequality decomposition inevitably involves a constant term and a residual term. These terms give rise to
specific problems, which are neglected or not properly addressed in Morduch and Sicular (2002) and Fields and Yoo (2000). Second, the current state of art in regression-based
inequality decomposition imposes severe limitations in terms of the functional forms and
350
inequality measures used. Finally, the determinants of regional income inequality in rural
China are studied descriptively in the literature, with the exception of Ravallion and Chen
(1999).1 Our paper quantifies the contributions of various determinants to total inequality
in rural China and provides a prescriptive analysis. Section 3 proposes a general regressionbased framework for inequality decomposition in which the Gini coefficient is used as an
example measure of inequality. The proposed method can be applied to any inequality
measure and it imposes few restrictions on the underlying regression model. In Section 4,
the root sources of income inequality in rural China are investigated. Section 5 concludes
with a summary and policy recommendations.
351
effect on inequality in FY.3 Podder and Chatterjee (2002) show that a positive (negative)
constant will lower (raise) inequality if relative inequality measures such as CV 2 or the
Gini index are used.4 All the measures used by MS and FY are relative inequality measures.
To avoid these problems and restrictions, we develop a general regression-based
procedure for decomposing any inequality measure, including the Gini coefficient and
Theils measure. Our approach is consistent with the natural rule of decomposition. It is
also flexible enough to be applied to any type of income transformation, e.g., the logarithm
and BoxCox transformations. Most importantly, our approach can be used with any
parametric specification of the regression model.
(1)
= + Y
is the deterministic part and Y
represents income flows from
From Eq. (1), Y
= i Xi = Yi where Yi = i Xi is the
various determinants. If F (X, U ) is linear, Y
income flow from the ith factor. The model is more general than those of Blinder (1973),
Oaxaca (1973), Juhn et al. (1993), FY, and MS. In fact, the regression model need not be
additive as in Eq. (1). The additive form is assumed to facilitate the exposition but any
linear or nonlinear form could be used.
Total inequality is given by I (OI), where I denotes an inequality measure and OI
corresponds to the original income variable so that OI = Y if no transformation is
applied to original income. The objective of regression-based decomposition is to attribute
I (OI) to the various components of the X vector, the constant and the residual U .
FY apply the variance operator to both sides of the regression and divide both sides by the
variance of the dependent variable. This approach was used by Wan (1988) to decompose
output variability and by Zhang and Fan (2000) to decompose output inequality. In this
method, the contribution of the residual term is (1 R 2 ) and the other terms contribute
Cov(Yi , Ln(OI))/ Var(Ln(OI)). Moreover, the constant term contributes nothing to the
inequality of Ln(OI). When the Gini coefficient is applied to Eq. (1) as in MS, both the
residual term and the constant term contribute nothing to overall inequality. In other words,
the deterministic part of the model explains the entire income inequality no matter how
poorly fitted the regression model is.
3 Under the semilog income-generating function of Fields and Yoo (2000), the constant term is not a scalar
unless inequality is measured using original income rather than the log of income. Hence, if correctly considered,
the constant term should have an effect on total inequality.
4 Fields (2001) and Kolm (1999) discuss the merits of relative versus absolute measures of inequality. Most
inequality studies use relative rather than absolute measures and the preferred absolute measures include the Kolm
index and income range.
352
= + i Yi .
where Yi = i Xi and Y
Following MS and applying the Gini operator G to both sides of Eq. (2), we have
G(Y ) = 0 +
E(Yi )/E(Y ) C(Yi ) + 0,
(3)
where C denotes the concentration index. To allocate total inequality to individual sources,
MS use Eq. (3) so that each source contributes (E(Yi )/E(Y ))C(Yi ) to G(Y ). The
disturbance term plays no role in determining income inequality. Although by definition
U is white noise, so that it does not alter the shape of the estimated Lorenze curve,
U does affect the underlying income density function and thus inequality. The residual
term represents unobservable factors or determinants of inequality that are not included
in the regression model. The percentage of inequality that is not explained by the model
is useful information. Equation (3) also implies that the constant term has no effect on
income inequality. However, as a relative inequality measure, the Gini index declines
(increases) when a positive (negative) constant is added to every income recipient. Thus,
it is inappropriate to use Eq. (3) for inequality decomposition.
To account for the contribution of the non-included factors or the residual term, we
follow the procedure in Shorrocks (1999) and eliminate U from Eq. (2). Hence, we have
Y (U = 0) = Y
). The contribution of U to G(Y ) is defined as
and obtain G(Y | U = 0) = G(Y
.
COu = G(Y ) G Y
(4)
=
G Y
E(Yi )/E(Y ) C(Yi )|
i
.
rank by Y
(5 )
353
The only difference between (5) and (5 ) is the ranking of Yi in calculating C(Yi ).
Having identified the contribution of the residual to total inequality, we turn to the
identification of the contribution of the constant term. Applying the Gini operator to Eq. (1),
we obtain
G(Y ) = /E(Y ) C() +
E(Yi )/E(Y ) C(Yi )|rank by Y + 0
=0+
E(Yi )/E(Y ) C(Yi )|rank by Y + 0.
(6)
MS use (6) to decompose total inequality. Hence, the constant term makes no contribution.
However, E(Yi )/E(Y ) changes for different values of . In particular, the addition of a
positive (negative) constant leads to a decrease (increase) in E(Yi )/E(Y ) because E(Y )
becomes larger (smaller) but E(Yi ) is not affected. Therefore, the impact of the constant
is distributed over, or absorbed by, other terms in (6). Our goal is to extract this impact
from these terms.
) + COu so that we focus on G(Y
) and note that Y
= Y
+ .
From (4), G(Y ) = G(Y
Following Shorrocks (1999), we have
| = 0 = G(Y
) =
C(Yi )|rank by Y or Y.
G Y
(7)
E(Yi )/E Y
i
(8)
Clearly, if > 0, CO < 0, and vice versa. Without , the individual concentration indices
i = E(Yi )/E(Y
) in (6). When is nonzero, the weights become
C(Yi ) are weighted by W
+). Intuitively, the weights can be adjusted by the difference (Wi W
i )
Wi = E(Yi )/E(Y
and the resultant difference in the underlying inequality is attributed to . Following this
reasoning, we have
i C(Yi ) = G Y
G Y
,
CO =
Wi W
i
354
355
sample consists of 30 regions. Population data are taken from China Agricultural Yearbook
and China Rural Yearbook (NBS, various years). Rural population, instead of agricultural
population, is used because the rural survey covers non-urban households. Our analytical
results should be robust to the choice of population data given the high correlation between
these two series.
In China, more than 50 percent of rural income is from farming activities and nonfarm income come mainly from family businesses. Under these circumstances, inputs other
than labor are needed to generate income. Thus, the following variables are considered for
inclusion in the underlying income-generating function: the per capita disposable income
(OI), the family or household size (HH), the dependency ratio (DEP), the per capita capital
input (K), the average level of education of household members (EDU), the per capita
possession of cultivable land (A), and the proportion of labor force employed in rural
industrial enterprises (TVE). Family size is incorporated to take account of the income
earned from sideline production, which often involves non-laborers. Once household size
is included, either labor or the dependency ratio must be excluded. Given the divergence
in dependency ratios across regions and its converging trend, we choose to include the
dependency ratio. In any case, labor is unlikely to be a significant variable because of its
surplus nature in rural China as Wan and Cheng (2001) demonstrate. The inclusion of TVE
can be justified because of its important impact on regional inequality as Wan (1997, 2001)
and Rozelle (1994) show. All variables in value terms are deflated by rural CPIs.
Given the large number of cross-sectional units relative to the time span, only year
dummies are considered. These dummies are intended to capture changes in income
attributable to factors such as reform impacts and technical change. However, the effects of
dummy variables on income inequality cannot be identified independently in some cases.
To minimize misspecification errors, we adopt the combined BoxCox and BoxTidwell
model:
()
OI () = + 1 X1() + 2 X2() + + K XK
+ dummy variable terms + U.
(9)
In this specification, OI () = (OI 1)/ and Xk() = (Xk 1)/ . As approaches 0, the
limit of (OI 1)/ is Ln OI from LHopitals rule. Hence, OI () = Ln OI when = 0
()
(Judge et al., 1988). The same arguments apply to and Xk . Equation (9) encompasses
many functional forms, including the semilog income-generating function of FY if = 0
and = 1, and the standard linear function of MS if = = 1. If = = 0, a doublelog equation is obtained. If or equals 1, the relevant variable becomes its reciprocal.
By allowing and to take values of 0, 1, and 1, and to be unrestricted, we obtain 16
different functional forms. In addition, the extended BoxCox model is estimated with the
restriction that = . Hence, a total of 17 models are estimated; see Table 1 and related
discussions.
In the first round of estimations, all variables are included with year dummies. However,
the dummy variables for 1992 and 1994 are not significant in most equations. Therefore, we
remove them and re-estimate the models. As shown in Table 1, the removal did not change
the estimated models in any significant way, except the constant term. Therefore, we use the
specification with only a dummy for 1993. For the purpose of model selection, we consult
356
Table 1
Log-likelihood values of alternative models
Model
Ln
Lin
Inv
Ln
LnLn
LnLin
LnInv
With all
659.507 666.026 690.006 661.748 703.929 659.668 693.774 662.048 707.331
dummies
Degrees of
108
109
109
109
109
109
110
110
110
freedom
0.786
0.808
0.622
0.775
0.522
0.791
0.631
0.783
0.538
R2
0.82
0.83
0.604
0.81
0.53
0.808
0.66
0.819
0.5786
r2
1993 dummy 659.904 666.326 690.291 662.591 706.534 660.017 694.163 662.774 709.611
only
Degrees of
110
111
111
111
111
111
112
112
112
freedom
0.788
0.810
0.626
0.778
0.511
0.794
0.636
0.784
0.529
R2
0.819
0.829
0.598
0.810
0.534
0.807
0.654
0.817
0.569
r2
Lin
LinLn
LinLin
LinInv
Inv
InvLn
InvLin
InvInv
With all
672.05 718.007 673.445 728.493 665.019 691.306 666.031 704.985
dummies
Degrees of
109
110
110
110
109
110
110
110
freedom
0.820
0.612
0.816
0.538
0.748
0.609
0.743
0.51
R2
0.839
0.642
0.830
0.573
0.767
0.513
0.716
0.464
r2
1993 dummy 672.399 718.642 673.795 730.18 665.876 691.563 667.435 707.768
only
Degrees of
111
112
112
112
111
112
112
112
freedom
0.822
0.615
0.818
0.534
0.749
0.614
0.742
0.495
R2
0.838
0.638
0.829
0.561
0.763
0.512
0.691
0.485
r2
the log-likelihood values and the squared correlation coefficient between predicted and
actual OI, denoted by r 2 , for all the 17 models in Table 1.5
The column headings in Table 1 consist of two parts; the first part indicates the
transformation of the dependent variable and the second part indicates the transformation
of the independent variables. The symbols and mean that the transformations are
determined by the data. For example, Ln, Lin and Inv mean that or equals 0, 1 and
1, respectively. The most flexible model, i.e. ( ), produces the largest log-likelihood
value. However, the (Ln ) model has similar values for the log-likelihood and r 2 , as do
the (Lin) and the (LnLin) models. Interestingly, the extended BoxCox model, which
imposes the same transformation on both dependent and independent variables so that
= , yields the second highest r 2 although its log-likelihood value is relatively small.
However, model evaluation and selection will be based on log-likelihood values not r 2 ,
which is reported for the purpose of indicating goodness of fit.
5 Table 1 also reports the R 2 statistics produced by Shazam. Since the properties of R 2 in the combined
BoxCox and BoxTidwell model are unknown, we do not use these statistics for model selection.
357
Since all of the other 16 models are nested with the most flexible model, a standard
2 test will be used for model selection. At the 1 percent level of significance, three
models, namely (Ln ), (Ln Lin) and ( Lin), are accepted. Among these models, the
(Ln ) and the (Ln Lin) models are nested and the 2 test rejects the (Ln Lin) model
at the 5 percent significance level. In addition, the (Ln Lin) and the ( Lin) models are
nested and found to be equivalent by the 2 test. Having rejected the (Ln Lin) model, the
( Lin) model is rejected by association. In any case, the ( Lin) model is rejected at the
5 percent level of significance when tested against the most flexible model. Consequently,
the (Ln ) income-generating function is chosen for inequality decomposition. Since
both the semilog specification of FY and the linear form of MS are rejected, decomposition
results using these models will suffer from misspecification errors in addition to the errors
generated by the deficiencies in their decomposition frameworks.
Table 2 presents the estimation results for the (Ln ) model. The t-statistics in Table 2
and the r 2 in Table 1 indicate that the model fits the data well. No t-statistic is reported
for the coefficient because its estimate is obtained by a grid search. However, the
hypothesis that = 0 can be tested using the 2 ratio, which is derived from the loglikelihood values of the (Ln ) and the (Ln Ln) models. This test indicates that is
significantly different from 0. In Table 2, estimation results for the linear and semilog
income generating-functions are also reported. Although most estimates have the same
signs and similar t-statistics, the large and negative constant estimate in the linear model
indicates its inappropriateness. The coefficient estimates for the semilog specification are
smaller than their counterparts in the (Ln ) equation, because the independent variables
in the latter are transformed by a power function.
The signs of the coefficient estimates for the (Ln ) model are as expected, including
the negative sign for the land variable. Regions with larger land are usually more backward
and more heavily involved in farming compared to land-scarce but more affluent regions,
e.g., Pearl Delta and Yangtze Delta. Farming has been a loss-making business in China
since the late 1980s or early 1990s. Well-known slogans assert that the more one cultivates
the greater the loss. Farming loss is attributable to depressed prices for food products, rising
prices of inputs, and the imposition of countless fees and levies by various government
departments at the local levels. Although these fees and levies may not be linked to the
Table 2
Estimated income generating functions
Variable
Dependency
Capital
Education
Family size
Land
TVE
1993 dummy
Constant
Ln model
Semilog model
Linear model
Estimate
t-ratio
Estimate
t-ratio
Estimate
t-ratio
0.0305
0.0073
0.3309
0.4696
0.0559
0.0519
0.0807
3.794
0.720
3.864
6.416
8.840
4.494
3.639
11.460
2.483
13.590
N.A.
0.0109
0.0016
0.2354
0.3069
0.0437
0.0236
0.077
3.530
4.411
6.218
9.596
4.556
4.096
10.840
2.318
10.800
4.4182
0.5642
88.8210
127.3600
16.5260
13.5650
33.766
554.710
4.099
5.174
8.287
4.327
3.549
14.290
2.326
3.884
358
cultivated land area, they can be enforced only on farming families under the administrative
control of local governments. Families who move away or who engage in non-farming
activities are less subject to the administrative abuse of local governments. As a production
factor, land contributes to gross income but not to net income. Therefore, the coefficient for
the land variable is expected to be negative when net or disposable income is used as the
dependent variable although it may well be positive in a production function or in a gross
income equation. In addition, heterogeneity in land quality and crop composition across
regions may contribute to the negative sign on the land coefficient.6
The positive coefficient on the family size variable is consistent with expectations
because of economies of scale in income generation. In addition, it may indicate the
importance of the contributions of school children and the elderly to family businesses,
sideline production and farming. Given that the dependency ratio is included in this
model, the impact of family size may also capture effects of an omitted labor input
variable. Except for a different intercept for 1993, the income-generating process remains
unchanged during the period. Hence, the estimated function can be used to compute
inequality decompositions for each year from 1992 to 1995. However, the final model
selected is nonlinear in terms of original income, although it is linear in the log of income.
To compute inequality decompositions for income rather than the log of income, we solve
the estimated model for original income. As a consequence, the constant term becomes a
scalar so that it does not contribute to inequality. Hence, both the constant and the dummy
variable terms can be removed without affecting the decomposition results.
Because the model is nonlinear, the Shapley value approach developed by Shorrocks
(1999) is used to identify contributions by individual variables. The contribution of the
residual term is obtained according to the procedure outlined in the previous section.
Regarding inequality measures, different indices are associated with different social
welfare functions that assume different aversions to inequality. In addition, they put
different weights on different segments of the underlying Lorenz curve. Therefore,
different indices of inequality often produce different measurement results, which may
carry over to the inequality decomposition. Hence, decomposition results are likely to
depend on the indicator of inequality used. For generality, we consider most of the
commonly used measures of inequality in our decompositions. These include the Gini
coefficient, the squared coefficient of variation (CV 2 ), the Atkinson index, Theil-T and
Theil-L.7
The decomposition is implemented using a web-based program developed by the World
Institute for Development Economics Research of the United Nations University (UNUWIDER). From the rows of Table 3, denoted total, an increasing trend in inequality
6 Relative to the western regions, the land-scarce eastern regions have high-quality land and produce more
value-added crops; hence, they are less likely to suffer losses. We thank one of the referees for making this point.
7 Let Z denote the target variable, , the mean of Z, j , index observations (j = 1, 2, . . . , N ); the following
formulae are used:
Zj 1/N
Zj
1
1 Zj
Ln
.
Atkinson = 1
,
Theil-L =
Ln
, and Theil-T =
N
Zj
N
359
Table 3
Decomposition results
Gini
Atkinson %
Theil-L
Theil-T
CV 2
1992
Dependency
0.0246 15.96
0.0061
16.60 0.0063 16.61 0.0067 16.82 0.0153 17.53
Capital
0.0163 10.56
0.0029
7.77 0.0029
7.76 0.0032
8.12 0.0072
8.26
Education
0.0294 19.07
0.0067
18.12 0.0068 18.10 0.0067 16.69 0.0138 15.81
Family size 0.0041 2.68 0.0066 17.74 0.0068 18.10 0.0075 18.82 0.0187 21.43
Land
0.0061 3.96
0.0012
3.25 0.0012
3.24 0.0014
3.43 0.0033
3.74
TVE
0.0457 29.71
0.0130
35.10 0.0132 35.03 0.0148 37.17 0.0353 40.33
Residual
0.0360 23.42
0.0136
36.92 0.0141 37.35 0.0146 36.59 0.0313 35.73
Total
0.1539 100
0.0369 100
0.0376 100
0.0399 100
0.0875 100
1993
Dependency
0.0237 14.79
0.0059
14.61 0.0060 14.62 0.0064 14.46 0.0143 14.78
Capital
0.0239 14.88
0.0049
12.21 0.0050 12.17 0.0052 11.89 0.0109 11.25
Education
0.0293 18.27
0.0070
17.41 0.0072 17.36 0.0070 15.92 0.0144 14.84
Family size 0.0013 0.78 0.0059 14.51 0.0061 14.84 0.0066 14.98 0.0161 16.61
Land
0.0069 4.27
0.0014
3.44 0.0014
3.42 0.0016
3.64 0.0038
3.94
TVE
0.0471 29.32
0.0134
33.28 0.0137 33.18 0.0152 34.49 0.0353 36.44
Residual
0.0309 19.25
0.0136
33.62 0.0141 34.08 0.0152 34.61 0.0343 35.37
Total
0.1605 100
0.0404 100
0.0412 100
0.0439 100
0.0968 100
1994
Dependency
0.0250 14.92
0.0073
17.14 0.0075 17.17 0.0081 18.04 0.0189 19.56
Capital
0.0234 13.96
0.0056
13.19 0.0057 13.16 0.0062 13.71 0.0139 14.36
Education
0.0342 20.42
0.0087
20.37 0.0088 20.32 0.0086 19.04 0.0173 17.97
Family size 0.0015 0.91 0.0064 15.12 0.0067 15.48 0.0075 16.55 0.0184 19.06
Land
0.0058 3.44
0.0013
2.94 0.0013
2.95 0.0014
3.11 0.0033
3.45
TVE
0.0433 25.86
0.0132
31.06 0.0135 31.01 0.0152 33.67 0.0359 37.25
Residual
0.0373 22.31
0.0129
30.42 0.0134 30.90 0.0131 28.99 0.0255 26.47
Total
0.1674 100
0.0425 100
0.0434 100
0.0450 100
0.0964 100
1995
Dependency
0.0231 12.82
0.0063
12.77 0.0064 12.76 0.0070 13.59 0.0161 14.97
Capital
0.0316 17.55
0.0075
15.22 0.0076 15.13 0.0081 15.90 0.0179 16.56
Education
0.0288 16.00
0.0069
14.00 0.0070 13.93 0.0069 13.49 0.0143 13.27
Family size 0.0030 1.64 0.0064 13.10 0.0067 13.34 0.0073 14.35 0.0180 16.70
Land
0.0053 2.94
0.0009
1.88 0.0009
1.85 0.0011
2.15 0.0028
2.62
TVE
0.0457 25.38
0.0135
27.45 0.0137 27.27 0.0153 29.94 0.0361 33.50
Residual
0.0485 26.96
0.0205
41.78 0.0213 42.38 0.0201 39.28 0.0386 35.79
Total
0.1800 100
0.0490 100
0.0502 100
0.0511 100
0.1078 100
360
quantify their contributions to total inequality. However, in rural China welfare payments
are virtually non-existent and poverty alleviation receipts are minimal at the household
level. Moreover, private transfers depend on the earning capacity of family members living
and working away from home. Thus, affluent regions may attract more private transfers so
that these may not exert any significant equalizing impact.
To explain the equalizing impact of family size, we point out that poorer regions have
larger family sizes so that the associated income flows more to the poor than to the
rich. Moreover, the concentration index of family size is expected to be negative. When
combined with a positive partial coefficient, this variable makes negative contribution to
income inequality so that family size is inequality-reducing. However, this equalizing effect
will disappear over time because family size is converging in rural China. Consistent with
the conventional decomposition results, we find township and village enterprises (TVEs)
to be the most significant identifiable contributor to regional inequality. TVEs contribute
significantly to economic growth and are more developed in richer areas. Hence, regional
inequality would be much lower if the income flow from TVEs were removed or equalized.
However, conventional decompositions attribute over 50 percent of total inequality to
TVEs in contrast to between 25 and 35 percent as reported in Table 3. This difference
is attributable to the crudeness of conventional decomposition techniques.8
By 1995, capital ranks as the second largest contributor to regional inequalities, overtaking education. Given that rich regions can afford more capital input, this disequalizing
effect is expected. Moreover, the absence of formal capital markets in the less developed
areas of rural China is detrimental to capital formation in poor regions. Hence, the contribution of the capital input to regional inequality is likely to continue to increase unless
governments establish rural credit markets to assist poor regions and poor farmers to obtain
capital. Education had been ranked as the second largest contributor to regional inequality
until 1995. Due to its positive marginal effect and a positive concentration index because
better educated people are richer, education is expected to be inequality-increasing. However, investment in education may reduce inequality in the long run, especially in rural
China where many families rely on educating their children to escape poverty. Therefore,
the government should increase its efforts to provide education to the poor regions.
Land is more equally distributed among households than other factors within individual locations. However, significant disparities exist between regions in terms of land endowment as poor regions have more land. If the marginal impact of land on income is
positive, this variable would help to reduce regional inequality. Unfortunately, due to rural
levies, taxes and fees, the marginal impact of land on income is negative so that land is
an inequality-increasing factor. In the long run, land could become an equalizing factor
if government taxation and levies were reduced. Overall, our model explains between 73
and 81 percent of the total inequality when the Gini coefficient is used over the four years.
Considering the other measures over all years, our model explains at least 57 percent of regional inequality in rural China.9 In terms of the ranking of contributions, all the indicators
8 Conventional decompositions can be viewed as a special case of a regression-based approach with a linear
income-generating function; the product of a variable and its coefficient is equal to the income source.
9 We do not compare our results with those of Ravallion and Chen (1999) because the latter are obtained using
the flawed framework of FY. In addition, we use different data in terms of variables and level of aggregation,
361
present a similar picture so that the choice on the inequality indicator is not crucial from a
policy making perspective.
5. Conclusion
which renders such a comparison difficult. Nonetheless, more than 76 percent of total inequality is not explained
by Ravallion and Chen (1999).
362
Acknowledgments
I am grateful to Professor Tony Shorrocks for numerous discussions during the
preparation of this paper, and to anonymous referees and the Editor for useful comments.
All remaining errors are mine.
References
Atkinson, Anthony B., 2001. A critique of the transatlantic consensus on rising income inequality. World
Economy 24, 433552.
Blinder, Alan S., 1973. Wage discrimination: reduced form and structural estimates. Journal of Human
Resources 8, 436455.
Bourguignon, Francois, Fournier, Martin, Gurgand, Mark, 2001. Fast development with a stable income
distribution: Taiwan, 197994. Review of Income and Wealth 47, 139163.
Cancian, Maria, Reed, Deborah, 1998. Assessing the effects of wives earning on family income inequality.
Review of Economics and Statistics 80, 7379.
Cornia, Giovanni A., Popov, Vladimir, 2001. Transition and Institutions: The Experience of Gradual and Late
Reformers. Oxford Univ. Press, Oxford.
Datt, Gaurav, Ravallion, Martin, 1992. Growth and redistribution components of changes in poverty measures:
A decomposition with application to Brazil and India in the 1980s. Journal of Development Economics 38,
275295.
Deaton, Angus, 1997. The Analysis of Household Surveys. Johns Hopkins Press, Baltimore.
DiNardo, John, Fortin, Nicole M., Lemieux, Thomas, 1996. Labour market institutions and the distribution of
wages, 19731992: a semiparametric approach. Econometrica 64, 10011044.
Dollar, David, Kraay, Aart, 2002. Growth is good for the poor. Journal of Economic Growth 7, 195225.
Fields, Gary S., 2001. Distribution and Development: A New Look at the Developing World. MIT Press,
Cambridge, MA.
Fields, Gary S., Yoo, Gyeongjoon, 2000. Falling labour income inequality in Koreas economic growth: patterns
and underlying causes. Review of Income and Wealth 46, 139159.
Judge, George G., Hill, Carter R., Griffiths, William E., Lutkepohl, Helmut, Lee, Tsoung-Chao, 1988.
Introduction to the Theory and Practice of Econometrics. Wiley, New York.
Juhn, Chinhui, Murphy, Kevin M., Pierce, Brooks, 1993. Wage inequality and the rise in returns to skill. Journal
of Political Economy 101, 410442.
Kolm, Serge C., 1999. The Rational foundations of income inequality measurement. In: Silber, Jacques (Ed.),
Handbook of Income Inequality Measurement. Kluwer Academic, Dordrecht, pp. 1994.
Morduch, Jonathan, Sicular, Terry, 2002. Rethinking inequality decomposition, with evidence from rural China.
Economic Journal 112, 93106.
NBS (National Bureau of Statistics), various years. In: China Statistical Yearbook. China Statistical Publishing
House, Beijing.
Oaxaca, Ronald L., 1973. Malefemale wage differences in urban labour markets. International Economic
Review 14, 693709.
Podder, Nripesh, Chatterjee, Srikanta, 2002. Sharing the national cake in post reform New Zealand: Income
inequality trends in terms of income sources. Journal of Public Economics 86, 127.
Ravallion, Martin, Chen, Shaohua, 1999. When economic reform is faster than statistical reform: Measuring and
explaining income inequality in rural China. Oxford Bulletin of Economics and Statistics 61, 3356.
Rozelle, Scott, 1994. Rural industrialization and increasing inequality: Emerging patterns in Chinas reforming
economy. Journal of Comparative Economics 19, 362391.
Shorrocks, Anthony F., 1980. The class of additively decomposable inequality measures. Econometrica 48, 613
625.
Shorrocks, Anthony F., 1982. Inequality decomposition by factor components. Econometrica 50, 193211.
Shorrocks, Anthony F., 1999. Decomposition procedures for distributional analysis: a unified framework based
on the Shaply value. Unpublished manuscript. Department of Economics, University of Essex.
363
Wan, Guanghua, 1988. Factors affecting the variability of foodgrain production in China. Paper presented at the
Annual Conference of Australian Agricultural Economics Society. Melbourne.
Wan, Guanghua, 1997. Decomposing changes in the Gini index by factor components. Unpublished manuscript.
Centre for China Economic Research, Peking University, Beijing.
Wan, Guanghua, 2001. Changes in regional inequality in rural China: Decomposing the Gini index by income
sources. Australian Journal of Agricultural and Resource Economics 43, 361381.
Wan, Guanghua, Cheng, Enjiang, 2001. Effects of land fragmentation and returns to scale in the Chinese farming
sector. Applied Economics 33, 183194.
Zhang, Xiaobo, Fan, Shenggen, 2000. Public investment and regional inequality in rural China. Discussion paper
No. 71. Environment Production and Technology Division, IFPRI, Washington.