IRB 2003
Edited by
Colin de la Higuera, Pieter Adriaans, Menno van Zaanen, Jose Oncina
Publisher:
ISBN: 953-6690-39-X
Zagreb 2003
Contents

Preface
Introduction
Inference of a Subclass of Context-Free Grammars Using Positive Samples
A Hybrid Language Model Based on Stochastic Context-Free Grammars
Incremental Learning of Context-Free Grammars by Extended Inductive CYK Algorithm
Leveraging Lexical Semantics to Infer Context-Free Grammars
Learning Context-Free Grammars in the Limit Aided by the Sample Distribution
Left-Aligned Grammars: Identifying a Class of Context-Free Grammar in the Limit from Positive Data
… Annotated Probabilistic Context-Free Grammars
Unsupervised Grammar Induction in a Framework of Information Compression by Multiple Alignment, Unification and Search
Preface

This volume contains the contributions to the workshop. They address the inference of a subclass of context-free grammars using positive samples; learning context-free grammars in the limit aided by the sample distribution; the incremental learning of context-free grammars by an extended inductive CYK algorithm; left-aligned grammars, which identify a class of context-free grammars in the limit from positive data; leveraging lexical semantics to infer context-free grammars; unsupervised grammar induction in a framework of information compression by multiple alignment, unification and search; a hybrid language model based on stochastic context-free grammars; and annotated probabilistic context-free grammars.
Introduction

Colin de la Higuera

This introduction reviews work on learning context-free grammars, including neighbouring areas, for instance computational linguistics, that are not covered by the workshop papers.
Queries. The model of learning from queries was introduced by Angluin: membership and equivalence queries together form what is called a minimally adequate teacher (MAT). In the case of context-free grammars, the negative results of Angluin and Kharitonov relate learning from membership queries alone to cryptographic assumptions; on the other hand, if structural information is available, Sakakibara established the learnability of the class of context-free grammars in this model. The model has received considerable attention and there are many papers on learning with different sorts of queries; Sakakibara learns context-free grammars from queries.
Positive results have been obtained for restricted classes of linear grammars, for which a number of results are known; several authors have worked on these classes with different techniques. Even linear languages are generated by grammars where the rules are balanced: the right-hand sides are composed either of terminal symbols only, or are of the form uAv, where only A is a non-terminal and u and v have identical length.
Further results on this class include its extension to a hierarchy of linear languages (Takada), the case where only positive strings are available, in which only a subclass is identifiable (Koshiba, Mäkinen and Takada), and the case where the positive information is structural, i.e. where one knows where the brackets of the strings are (Sempere and García). A larger class, that of the deterministic linear grammars, is proved by de la Higuera and Oncina to be identifiable from polynomial time and data, and a general way of deciding whether this is the case for other classes of grammars is given by the same authors. It has also been proposed to view the problem as an exhaustive search in a lattice defined over grammars in Chomsky normal form. In a series of papers Sakakibara gives techniques to learn context-free grammars from structured data: positive structured data, unstructured data by genetic algorithms, and data containing some structure, again by genetic algorithms. There are also a number of very specific results of interest to specialists: another restricted class, that of the simple deterministic grammars, has been learned, and it should be noticed that these grammars are not linear. Pure context-free languages are built from grammars where the non-terminal symbols are also terminals, and their learnability has been studied; a parallel algorithm for learning context-free grammars has also been proposed. A special case should be made of Nevill-Manning and Witten's algorithm. Although it cannot be included in the class of grammar induction algorithms, as it has no generalization capabilities, it is an elegant way of deducing a context-free grammar from just one (usually very long) sentence. This grammar can then generate just one string, the original one. Running in linear time, it is more than just a compression technique, as it also explains the data it has to compress, exhibiting its structural hierarchical nature.
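The digram-substitution idea behind Nevill-Manning and Witten's algorithm can be sketched offline: repeatedly replace the most frequent adjacent pair with a fresh nonterminal. The real algorithm works online under digram-uniqueness and rule-utility constraints; this much-simplified offline variant (closer in spirit to Re-Pair) only illustrates how a grammar that generates exactly one string emerges.

```python
from collections import Counter

def induce_grammar(s):
    """Offline digram substitution: returns (start sequence, rules)."""
    seq = list(s)
    rules, next_id = {}, 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        pair, count = (max(pairs.items(), key=lambda kv: kv[1])
                       if pairs else (None, 0))
        if count < 2:
            return seq, rules
        nt = f"R{next_id}"            # fresh nonterminal name
        next_id += 1
        rules[nt] = list(pair)
        out, i = [], 0                # replace every occurrence of the pair
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out

def expand(sym, rules):
    """Rebuild the terminal string generated by a symbol."""
    if sym in rules:
        return "".join(expand(x, rules) for x in rules[sym])
    return sym

start, rules = induce_grammar("abababab")
assert "".join(expand(x, rules) for x in start) == "abababab"
```

The resulting grammar derives exactly the input string, and the rule hierarchy mirrors the repetitions in the data, which is the "explanatory" aspect mentioned above.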
Tree automata. Tree automata are the direct extension of DFA and NFA to trees instead of strings. They also provide a smooth link between automata and context-free grammars, and learning tree automata has already been dealt with in the literature. There are further strong links between learning context-free grammars from bracketed data (from the actual skeletons, or parse trees without inner labels) and learning regular tree grammars. For instance, Sakakibara uses algorithms that learn regular tree automata (these can very often be adapted from algorithms that learn automata from strings) and transforms them into algorithms that learn context-free grammars from bracketed data. The advantage is nevertheless that a determinism bias exists, allowing results from DFA learning to be reused in this setting. An extension to deal with trees is provided by García and Oncina, and a tree version of the best-known algorithm for stochastic deterministic automata has also been proposed.
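A bottom-up (frontier-to-root) tree automaton computes a state at each leaf and combines child states upward through a transition table. The Boolean-formula example below is an illustrative assumption; any ranked alphabet works the same way.

```python
def run(tree, leaf_delta, node_delta):
    """Evaluate a bottom-up deterministic tree automaton.

    tree is either a leaf symbol, or a tuple (symbol, child1, child2, ...).
    """
    if not isinstance(tree, tuple):
        return leaf_delta[tree]
    sym, *children = tree
    states = tuple(run(c, leaf_delta, node_delta) for c in children)
    return node_delta[(sym, states)]

# Hypothetical automaton evaluating and/or trees over constants 0 and 1.
leaf = {"0": "q0", "1": "q1"}
delta = {("and", (a, b)): ("q1" if a == b == "q1" else "q0")
         for a in ("q0", "q1") for b in ("q0", "q1")}
delta.update({("or", (a, b)): ("q1" if "q1" in (a, b) else "q0")
              for a in ("q0", "q1") for b in ("q0", "q1")})

t = ("or", ("and", "1", "0"), "1")
assert run(t, leaf, delta) == "q1"   # the tree reaches the accepting state
```

State-merging learners for tree automata generalize exactly this evaluation: they infer the transition table from a sample of trees, which is why DFA-learning results transfer to skeletons of context-free derivations.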
#&
'
(
*"!+#+%
.-
#+&
,
0/
.132"4%
65
87.9:"&
;$%
22
21
65
-
-
)
65
23
22
Artificial intelligence techniques, such as genetic algorithms, have also been used. Computational linguists have long been interested in working on grammatical models that would not fit into the Chomsky hierarchy. Furthermore, their objective is to find suitable models for syntax and semantics to be interlinked, and to provide a logic-based description language. The idea of relating such models with the question of language identification can be found in Kanazawa's book, and discussion relating this to the way children learn language can be found in papers by a variety of authors, for instance Tellier. The situation is still unclear, as positive results can only be obtained for special classes of grammars (see for instance Costa Florencio), whereas here again the corresponding combinatorial problems, for instance that of finding the smallest consistent grammar, appear to be intractable. Work relating grammatical inference to natural language also includes Adriaans' shallow grammars, using categorial grammars; this theoretical work is the backbone of a practical system. For the past years this has been so, and even if the most successful methods in the field are not necessarily those using grammars or automata, there are sufficient features in language theory for work to continue.
An overview of the situation in bioinformatics places special emphasis on pattern languages: determining common patterns in, for example, protein sequences allows one to build alignments, discriminate members of a family from non-members, and discover new members. In such sequence classification tasks, patterns are induced through a learning stage and then used to score in a classification stage. Sakakibara learns stochastic context-free grammars from RNA sequences; the fact that the induced grammar is context-free allows one to discover and model parts of the secondary structure. In the same line of research, secondary structure prediction has been addressed with context-free grammars by Abe and Mamitsuka. In experimental results, Salvador and Benedí show that a combination of context-free grammars and n-grams obtains good results on data that can present some very structured regions, which are isolated, and others that are not structured. A lot of more general work has concerned the study of hidden Markov models and their relationship with grammars: typical distances between the distributions they define have been studied, with intractability results, and the techniques have been improved to be able to also compare stochastic context-free grammars.
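Working with stochastic context-free grammars requires computing the probability a grammar assigns to a string, which the inside algorithm does by dynamic programming. The toy PCFG below (a probabilistic grammar for {aⁿbⁿ} with a geometric length distribution) is an illustrative assumption.

```python
from collections import defaultdict

def inside(word, unary, binary, start):
    """chart[i][j][A] = P(A =>* word[i:j]) for a PCFG in Chomsky normal form."""
    n = len(word)
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, ch in enumerate(word):
        for A, p in unary.get(ch, []):
            chart[i][i + 1][A] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    if B in chart[i][k] and C in chart[k][j]:
                        chart[i][j][A] += p * chart[i][k][B] * chart[k][j][C]
    return chart[0][n].get(start, 0.0)

# Hypothetical PCFG: S -> A T (0.5) | A B (0.5), T -> S B (1.0),
# with A -> a (1.0) and B -> b (1.0).
unary = {"a": [("A", 1.0)], "b": [("B", 1.0)]}
binary = {("S", "A", "T"): 0.5, ("S", "A", "B"): 0.5, ("T", "S", "B"): 1.0}

assert inside("ab", unary, binary, "S") == 0.5
assert inside("aabb", unary, binary, "S") == 0.25
assert inside("ba", unary, binary, "S") == 0.0
```

The same chart, combined with outside probabilities, underlies the inside-outside re-estimation used when learning rule probabilities from unannotated sequences.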
Structured documents provide another application: inferring the grammar generating the tags that have been used (Young-Lai and Tompa). The rise of XML has led to some new challenges for the field, and Fernau points out some of these; Chidlovskii obtains some preliminary results using context-free grammar learning techniques. For these reasons there has lately been increasing interest in tree automata and tree patterns. Grammar-based compression is a further application: one idea is to build a good grammatical model of the file to compress, so that the obtained grammar can regenerate the learned string; compression results with this method are comparable to the best compression methods, and moreover the structure of the text is extracted. Another idea is to learn an automaton, probabilize it, and use the obtained model to reach very good compression rates on structured data, XML files for instance.
.%
&
9:% 4
9
"!$#%
#&
*"!+#+%
)
"!$#%
#&
#+&
9
25
24
26
25
Inference of a Subclass of Context-Free Grammars Using Positive Samples

… and G. Nagaraja

Computer Science and Engineering Department,
Indian Institute of Technology, Bombay
Abstract. It is known that the family of context-free languages is not identifiable in the limit from positive data. However, it is known that some useful subclasses of context-free languages are identifiable in the limit. In this work we introduce a new subclass of context-free languages, the Terminal Distinguishable Context-Free Languages (TDCFL), through a grammatical characterization. The Greibach Normal Form representation is used to represent the rewriting rules of the underlying grammar of a TDCFL, and the terminal-distinguishability property is employed to merge contextual entities. A polynomial-time algorithm is proposed to identify a TDCFL, and it is shown that the algorithm correctly identifies any TDCFL in the limit if enough positive samples are given.
1 Introduction

Grammatical inference is a well-established research area in machine learning concerned with identifying a correct representation of a target language from a given set of samples. Since Gold's seminal work, which introduced the concept of identification in the limit, there has been a remarkable amount of work towards establishing a theory of grammatical inference, and several works give an excellent survey of and introduction to the subject. Learning the entire class of context-free languages using only positive samples is not possible, since it was established by the work of Angluin that even the class of regular languages is not identifiable from positive samples alone; it is therefore natural to look for subclasses that are identifiable in the limit from positive samples, and to extend such results towards the context-free languages. The class of even linear languages is a subclass of the context-free languages that possesses many properties of regular languages; Radhakrishnan and Nagaraja, Sempere and García, and others have worked on this class and proposed different inference methods. Lee provides a good survey of the literature on learning context-free languages.

A context-free grammar can be represented in a number of equivalent normal forms. In this work the standard Greibach Normal Form (GNF) is used to represent the rewriting rules of the grammar, and a subclass of context-free languages, the Terminal Distinguishable Context-Free Languages (TDCFL), is characterized through a terminal-distinguishability condition on grammars in this normal form.
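A grammar is in Greibach Normal Form when every production has the shape A → aB₁…Bₖ: one terminal followed by zero or more nonterminals. The checker and the example grammar (for {aⁿbⁿ}) below are illustrative assumptions.

```python
def is_gnf(productions, nonterminals):
    """Check that every production is of the form A -> a B1 ... Bk."""
    for lhs, rhs in productions:
        if lhs not in nonterminals:
            return False
        if not rhs or rhs[0] in nonterminals:       # must start with a terminal
            return False
        if any(sym not in nonterminals for sym in rhs[1:]):
            return False                            # tail must be nonterminals
    return True

N = {"S", "B"}
gnf = [("S", ["a", "B"]), ("S", ["a", "S", "B"]), ("B", ["b"])]
not_gnf = [("S", ["a", "b"])]                       # terminal 'b' in the tail

assert is_gnf(gnf, N)
assert not is_gnf(not_gnf, N)
```

Because every GNF step emits exactly one terminal, a derivation of a string of length n takes exactly n steps, which is what makes GNF derivation trees convenient skeletons for the construction that follows.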
2 Preliminaries

Let G = (N, Σ, P, S) be a context-free grammar, where N is the set of non-terminal symbols, Σ the set of terminal symbols, P the set of productions and S the start symbol. L(G) denotes the language generated by G, and wᴿ denotes the reversal of a string w. A grammar is in Greibach Normal Form (GNF) if every production is of the form A → aα with a ∈ Σ and α ∈ N*. For a given sample set S of positive strings a canonical context-free grammar in GNF is constructed, and for every sample a GNF-skeleton, its derivation tree with the internal labels removed, is built. A context-free language is terminal distinguishable (a TDCFL) when its grammar in this normal form satisfies the terminal-distinguishability condition used throughout this paper.
Example 1. Let the sample set be S = {ab, aabb, aaabbb}. The canonical context-free grammar in GNF for S is G = (N, {a, b}, P, S) with N = {S, A11, A12, A21, …, A31, …}, one non-terminal per skeleton node, and with P containing the productions that spell each sample out symbol by symbol, so that L(G) = S.
Fig. 1. GNF-skeletons constructed for the sample set S = {ab, aabb, aaabbb}; the nodes of the three skeletons are labelled N11, N12, N21, …, N24, N31, …, N36.
With every node N of a GNF-skeleton two strings are associated, a head string head(N) and a frontier string frontier(N), obtained by reading the appropriate parts of the yield of the tree from left to right. Let ND(S) denote the set of internal nodes found in the GNF-skeletons constructed for S. These nodes are divided into two disjoint groups based on the length of the associated strings: formally, ND(S) = ND₁ ∪ ND₂ with ND₁ ∩ ND₂ = ∅. The nodes in ND₁ represent contextual information, whereas ND₂ contains the nodes having single terminal symbols as their yield. Referring to Example 1, ND₂ = {A12, A22, …} and ND₁ = {A11, A21, A31}.
33
HBJaMOS4MOmS+LBHMl-O'7 H
FBN?^S+B(EB"HKJOM8BJLWN7BN
WNHN4JoP
S8S2O|@
pO 7 8!5A_tIB' GO'7 {:4lB G@S GO 7 >mtJ CHWNLMNMqJGSG
O 3 @S GO !
IUKU UQ A_tI' pO !$IIBJL6
S S8S2OM@ GO 8! >II:.jlI-:VUtn 6>
zo
C G T
P
x'HzoJEJ(OLBEzLF$LHKJWI4FIFBHzLmS;IYEOLBE44MOBHJ;BJLWNLBNWNmgBMO
J %A>L
EgmNmMmR4MOgLHvaM'MO}aNCHJLSGLHNMOmBNNJLHKJW$NJL
BNF$LK
gMaF B %A>o@
g?mMO
LF$LHKJW"KmFIFIBLNaMOVYMOHJW !(
0q
SW<'z4DG?^
YMOHKJWRzHKJ+LNMO
F 7 (
7 q9HGL4MO
HH'LOBMOWN4 %ABNJLS
=
N>%'H$WaWNmYO.LBIB$mBad$NJOY?dOMHJWa
:( BJL;
S YaS}
JNJGx4FIGdP$BJLS$mJLmMBNM V}aOYH4BNm DG}4MOHKFImJaHKq">
aJLYHSGmM
4BaY4MO
BJL
S BNM;JNJmFIGdPN> Ez4a4MO+Ma?RY^SFILoS4BNJ}
DomJLSGmSO$NmM4BaY^BN'z4>
q"WNB$SGmHKMOmS(OBNMWa T$GgVJ4^S(O"HSGmJaOHvPFIaMaJaODo?B}4JRHKHm'HJ
2
O*) pS[< >@aJaODo?Bo4JRHKdPH B
JaJGx4FIGdP.L%- O3);/ NJLS"HJ$BNJRP
TU CxNmK4aJ>}@VVB(LM8ddO4}z"4BNJaJLYMOLVI4BNJNJH4BN T!HKJTU Ba
DGBHJmSHKJ'MO4oHNLY^OHKaJ>Qx%HMON}amSOVL'B
MOJHJWFIoS4RNG8BHJ
BY4(NGOHKFB4NJR4DoLBN4JRHKHmm>'LMJLHKJWFIGSGm
aJLdOM?OBNJHKJHKHB
MLJ.Y42
O*)/GOHKJ?SGHmBEOGmMOMmJL
2! N%NFI.aJRDoOLBmJaOHvOHK^4>?4MON
BMLJ;H.SG4LJmSBa.BL.N 0 > 9zP HKJLSLHJWLF$LHKJWMa?mMYdP
HKJRO'HKJHKHB &MLJY4 B
LJLBNGKHY%N}4NJR4DROLBGmJRHKHm%H%NG8BHJmSLYHJWH8
B"FIGSGHKLmS+JGSGHdH'WNmJ4M8BEOmS>
"
For every pair of nodes (Nᵢ, Nⱼ) ∈ ND₂ × ND₂ a measure over their head and frontier strings is computed, and when the pair passes the test a cluster of nodes is formed. The clusters so formed share some contextual information; each such cluster is added to a set called the prune set A. The process is shown as procedure InitialPruneSetConstr.
Procedure InitialPruneSetConstr
For every pair of nodes (Nᵢ, Nⱼ) ∈ ND₂ × ND₂:
1. Compute the length of the smallest common suffix of head(Nᵢ) and head(Nⱼ) and store it.
2. Compute the length of the smallest common prefix of frontier(Nᵢ) and frontier(Nⱼ) and store it.
3. If the stored lengths witness the equivalence of the nodes, form a cluster of nodes such as {Nᵢ, Nⱼ, …}.
4. Add the newly formed clusters of nodes to the prune set A.
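The two quantities computed in InitialPruneSetConstr are shared affixes of the head and frontier strings of a node pair. The source calls them "smallest common"; the helpers below compute the maximal shared prefix and suffix lengths, which is the standard reading, stated here as an assumption.

```python
def common_prefix_len(x, y):
    """Length of the longest common prefix of strings x and y."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n

def common_suffix_len(x, y):
    """Length of the longest common suffix, via reversal."""
    return common_prefix_len(x[::-1], y[::-1])

assert common_prefix_len("aab", "aaa") == 2
assert common_suffix_len("abb", "bb") == 2
assert common_suffix_len("ab", "cd") == 0
```

A pair of nodes whose frontier strings share a prefix and whose head strings share a suffix is exactly the kind of repeated context the pumping argument of the previous section looks for.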
The pumping property is then induced on the clusters of A by the procedure InducePumping, which examines the head and frontier strings of each cluster, forms the new clusters of nodes implied by pumping, and adds them to A, yielding a modified prune set.
A heuristic measure is used to decide whether a cluster is to be retained in A. Whenever clusters match in their derived yields, the cluster to be retained is selected according to the number of appearances of its nodes. If a cluster has no intersection with the existing clusters it is added; when two clusters partially overlap each other, this is called a conflict and only one of them is kept. The procedure SelectClusters succinctly implements this heuristic and ensures that only consistent clusters are retained in A.
For each cluster present in A a GNF-skeleton can be constructed; a unique label is given to its root, and each such node is added to the node set ND₁. Consider two nodes Nᵢ, Nⱼ ∈ ND₁: they are equivalent iff head(Nᵢ) = head(Nⱼ) and frontier(Nᵢ) = frontier(Nⱼ). Again, consider two nodes Nᵢ, Nⱼ ∈ ND₂: they are equivalent iff frontier(Nᵢ) = frontier(Nⱼ). By using these equivalence relations on the nodes in ND₁ and ND₂ we obtain a partition of the node set ND(S), from which the non-terminals of the inferred grammar are read off.
Algorithm InferGrammar
Input: a non-empty set of positive samples S.
Output: an inferred TDCFL grammar.
1. Construct the GNF-skeletons for the samples in S and the node set ND(S).
2. Using the pruning model, obtain a modified prune set A.
3. Construct a GNF-skeleton for each cluster in A, add a root label to it, and replace each occurrence of a cluster found in the skeletons of S by the corresponding new node.
4. Compute the head and frontier strings of the nodes of the modified skeletons.
5. Compute the equivalence classes of the nodes.
6. Enumerate every equivalence class X₁ … Xₙ; the equivalence class which contains the skeleton roots is labelled with S, and the productions are read off the skeletons.
The following example describes the steps in generating a context-free grammar from a given sample set using the procedure InferGrammar.

Example 2. Let the sample set be S = {ab, aabb, aaabbb}. The GNF-skeletons for S are constructed as shown in Figure 1. Using the GNF-skeletons, the node set ND(S) is constructed, and the head and frontier strings for each node in ND(S) are computed as shown in Table 1. The initial prune set is formed using the procedure InitialPruneSetConstr. It can be seen that the nodes N12 and N22 are equivalent: frontier(N12) = {a} and frontier(N22) = {a, aa} intersect, and similarly head(N12) = {b} and head(N22) = {b, bb} intersect, so we can conclude that N12 ≡ N22. In a similar way we can show that N12 ≡ N32 and N22 ≡ N32. The equivalences established between pairs of nodes lead to the initial prune set A. The next step is to induce the pumping property on the set A: considering a cluster C1 of A and observing the common affixes of its head and frontier strings, a new cluster C2 is formed and added to A. By using the procedure InducePumping a modified prune set A is generated.
38
37
Fig.: Computation of head and frontier strings.
41
Abstract. This paper explores the use of initial Stochastic Context-Free Grammars (SCFG) obtained from a treebank corpus for the learning of SCFG by means
of estimation algorithms. A hybrid language model is defined as a combination of
a word-based n-gram, which is used to capture the local relations between words,
and a category-based SCFG with a word distribution into categories, which is defined to represent the long-term relations between these categories. Experiments
on the UPenn Treebank corpus are reported. These experiments have been carried out in terms of the test set perplexity and the word error rate in a speech
recognition experiment.
1 Introduction
Language modeling is an important aspect to consider in the development of speech
and text recognition systems. The n-gram models are the most extensively used for a
wide range of domains [1]. The n-grams are simple and robust models and adequately
capture the local restrictions between words. Also, the estimation of the parameters and
the integration of the model in recognition systems are well-known. However, the n-gram models cannot characterize the long-term constraints of the sentences of the tasks.
Stochastic Context-Free Grammars (SCFGs) efficiently model long-term relations and
have been successfully used on limited-domain tasks of low perplexity. Nevertheless,
general-purpose SCFGs work poorly on large vocabulary tasks. The main obstacles to
using these models in complex real tasks are the difficulties of learning and integrating
SCFGs.
With regard to the learning of SCFGs, two aspects must be considered: first, the
learning of the structural component, that is, the rules of the grammar, and second, the
estimation of the stochastic component, that is, the probabilities of the rules. Although
interesting Grammatical Inference techniques have been proposed elsewhere for learning the rules of the grammar, computational restrictions limit their use in complex real
This work has been partially supported by the Spanish CICYT under contract
(TIC2002/04103-C03-03)
tasks. Taking into account the existence of robust techniques for the automatic estimation of the probabilities of the SCFGs from samples [2, 4], other possible approaches
for the learning of SCFGs by means of a probabilistic estimation process have been
explored [3, 5].
All of these estimation algorithms are based on gradient descent techniques and
it is well-known that their behavior depends on the appropriate choice of the initial
grammar. When the SCFG is in Chomsky Normal Form (CNF), the usual method for
obtaining this initial grammar is a heuristic initialization based on an exhaustive ergodic
model [2, 5]. This solution is easy to implement but it does not use any structural information of the sample. When a treebank corpus is available, it is possible to directly
obtain an initial SCFG in General Format (GF) from the syntactic structures which are
present in the treebank corpus [6].
We conjecture that the structural component of SCFGs is very important for the
learning of stochastic models. So, in this work, we explore this possibility working
with SCFGs in GF by using the Earley algorithm [7] and comparing with SCFGs in
CNF.
With regard to the problem of SCFG integration in a recognition system, several
proposals have attempted to solve these problems by combining a word n-gram model
and a structural model in order to consider the syntactic structure of the language [8, 9].
In the same way, we previously proposed a general hybrid language model [10]. This
is defined as a linear combination of a word n-gram model, which is used to capture
the local relation between words, and a stochastic grammatical model, which is used to
represent the global relation between syntactic structures. In order to capture the long-term relations between syntactic structures and to solve the main problems derived from
large-vocabulary complex tasks, we proposed a stochastic grammatical model defined
by a category-based SCFG together with a probabilistic model of word distribution into
the categories.
In the second section of this paper, we present the hybrid language model based on
SCFGs, together with the results of its evaluation process.
This evaluation process has been done using the UPenn Treebank corpus. Firstly, we
compare the SCFGs obtained by initialization processes based on ergodic and treebank
initial grammars. Then, we also compare the final hybrid language model with other
stochastic language models in terms of the test set perplexity and the word error rate.
"! $ #&%
' ('
(1)
where $\alpha$ is a weight factor which depends on the task. Similar proposals have
been presented by other authors [8, 9] along the same line.
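The linear interpolation of expression (1) is straightforward to compute once the two component probabilities are available; a minimal sketch (function and argument names are ours):

```python
def hybrid_prob(p_ngram, p_scfg, alpha):
    """Linear interpolation of a word n-gram probability and a
    category-based grammatical model probability, as in expression (1)."""
    assert 0.0 <= alpha <= 1.0
    return alpha * p_ngram + (1.0 - alpha) * p_scfg

# alpha = 1 recovers the pure n-gram model, alpha = 0 the pure grammatical model
p = hybrid_prob(0.2, 0.6, 0.5)
```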
The first term of expression (1) is the word probability of $w_k$ given by the word n-gram model. The parameters of this model can be easily estimated, and the expression
can be efficiently computed.
In order to capture long-term relations between syntactic structures and to solve the
main problems derived from large vocabulary complex tasks, we proposed a stochastic
grammatical model defined as a combination of two different stochastic models:
a category-based SCFG and a stochastic model of word distribution into categories. Thus, the second term of expression (1) is the word probability given by this combined grammatical model.
There are two important questions to consider: the definition and learning of the two
component models, and the computation of the word probability that they assign.
Estimation Framework. In order to estimate the probabilities of a SCFG, it is necessary to define both a framework to carry out the optimization process and an objective
function to be optimized. In this work, we have considered the framework of Growth
Transformations [12] in order to optimize the objective function.
In reference to the function to be optimized, we will consider the likelihood of a
sample, which is defined as $\Pr(\Omega \mid G) = \prod_{x \in \Omega} \Pr(x \mid G)$, where $\Omega$ is a multiset of
strings.
Given an initial SCFG $G$ and a finite training sample $\Omega$, the iterative application
of the following function can be used in order to modify the probabilities $p(A \to \lambda)$:

$$p'(A \to \lambda) = \frac{\sum_{x \in \Omega} \frac{1}{\Pr(x \mid G)} \, c_x(A \to \lambda)}{\sum_{x \in \Omega} \frac{1}{\Pr(x \mid G)} \, c_x(A)} \qquad (2)$$

where $c_x(A \to \lambda)$ is the expected number of times that the rule $A \to \lambda$ is used in the derivations of $x$, and $c_x(A)$ is the expected number of times that the non-terminal $A$ is expanded.
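One iteration of a re-estimation of this kind can be sketched by normalizing expected rule counts per left-hand side; the toy counts below are stand-ins for the quantities that the estimation algorithms actually compute from the sample:

```python
from collections import defaultdict

def reestimate(rule_counts):
    """One re-estimation step: divide the expected count of each rule
    by the total expected count of its left-hand non-terminal.
    rule_counts maps (lhs, rhs) -> expected count summed over the sample."""
    lhs_totals = defaultdict(float)
    for (lhs, _), c in rule_counts.items():
        lhs_totals[lhs] += c
    return {rule: c / lhs_totals[rule[0]] for rule, c in rule_counts.items()}

new_p = reestimate({("S", ("a", "S")): 3.0, ("S", ("a",)): 1.0})
# S -> a S gets probability 0.75 and S -> a gets 0.25
```

Note that after each step the probabilities of the rules sharing a left-hand side sum to one, so the result remains a proper SCFG.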
Estimation of SCFG in CNF. When the grammar is in CNF, transformation (2) can
be adequately formulated and it becomes the well-known Inside-Outside algorithm [2].
If a bracketed corpus is available, this algorithm can be adequately modified in order to
take advantage of this information [3]. The initial grammar for this estimation algorithm
is typically constructed in a heuristic fashion from a set of terminals and a set of nonterminals. The most common way is to construct a model with the maximum number of
rules which can be formed with a given number of non-terminals and a given number
of terminals [2]. Then, initial probabilities which are randomly generated are attached
to the rules. This heuristic has been successfully used for real tasks [5]. However, the
number of non-terminals is a critical point which leaves room for improvements [13].
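The exhaustive ergodic initialization described above can be sketched as follows (a hypothetical helper, not the authors' code): every CNF rule over the given non-terminal and terminal sets is created, with random probabilities normalized per left-hand side:

```python
import random

def ergodic_cnf(nonterminals, terminals, seed=0):
    """Build the exhaustive CNF grammar: every A -> B C and A -> a,
    with random probabilities normalized per left-hand side."""
    rng = random.Random(seed)
    grammar = {}
    for a in nonterminals:
        rhss = [(b, c) for b in nonterminals for c in nonterminals]
        rhss += [(t,) for t in terminals]
        weights = [rng.random() for _ in rhss]
        total = sum(weights)
        grammar[a] = {rhs: w / total for rhs, w in zip(rhss, weights)}
    return grammar

g = ergodic_cnf(["S", "A"], ["a", "b"])
# each non-terminal gets |N|^2 + |T| = 6 rules whose probabilities sum to 1
```

The size of this grammar grows with the cube of the number of non-terminals, which illustrates why that number is a critical choice.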
Estimation of SCFG in GF. When the grammar is in GF, transformation (2) can be
adequately computed by using an Earley-based algorithm [4]. This algorithm is based
on the definition of the inner probability and outer probability. First, we describe the
computation of these values and then we describe the SCFG estimation based on the
Earley algorithm.
The Earley algorithm constructs a set of lists $L_0, L_1, \ldots, L_n$, where $L_i$ keeps track of
all possible derivations that are consistent with the input string up to position $i$. An
item is an element of a list and has the form $[k: A \to \lambda \cdot \mu]$, where $k$ is the position at
which the rule $A \to \lambda\mu$ was introduced and the dot marks how much of its right-hand
side has already been matched.
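A minimal representation of such dotted items, following the notation above (the encoding is our own sketch, not the paper's implementation):

```python
from typing import NamedTuple, Tuple

class Item(NamedTuple):
    """Earley item [start: lhs -> before . after], kept in list L_i."""
    start: int                 # position where the rule was introduced
    lhs: str
    before: Tuple[str, ...]    # symbols left of the dot (already matched)
    after: Tuple[str, ...]     # symbols right of the dot (still to match)

def scan(item, symbol):
    """Move the dot over `symbol` if it is the next expected symbol."""
    if item.after and item.after[0] == symbol:
        return Item(item.start, item.lhs,
                    item.before + (symbol,), item.after[1:])
    return None

it = Item(0, "S", (), ("a", "S"))
it2 = scan(it, "a")   # dot advances: [0: S -> a . S]
```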
The inner probability is denoted as $\gamma_i([k: A \to \lambda \cdot \mu])$. This value represents the sum
of probabilities of all partial derivations that begin with the item $[k: A \to \cdot \lambda\mu]$ and end
with the item $[i: A \to \lambda \cdot \mu]$, generating the substring $w_{k+1} \cdots w_i$. For each item, the inner
probability can be calculated with the following recursive definition:

$$\gamma_i([i: A \to \cdot \lambda]) = p(A \to \lambda)$$
$$\gamma_i([k: A \to \lambda a \cdot \mu]) = \gamma_{i-1}([k: A \to \lambda \cdot a\mu]), \quad \text{if } a = w_i$$
$$\gamma_i([k: A \to \lambda B \cdot \mu]) = \sum_{j, C, \nu} \gamma_j([k: A \to \lambda \cdot B\mu]) \, R_U(B \Rightarrow C) \, \gamma_i([j: C \to \nu \cdot])$$

In this expression, $R_U(B \Rightarrow C)$ is computed from the probabilistic unit production relation, whose defining series converges when the grammar is consistent [14].
The computation starts from the dummy item $[0: S' \to \cdot S]$, where $S' \to S$ is a dummy rule which is not
in the grammar. Its inner probability is always one and it is used for initialization. The time
complexity of computing the inner probability is $O(n^3)$, and its spatial complexity
is $O(n^2)$.
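The probabilistic unit-production relation mentioned above is the closure of the single-step unit-rule probabilities, i.e. the matrix series $I + P_U + P_U^2 + \cdots$. A small sketch that approximates this closure by truncating the series (the matrix layout and names are ours):

```python
def unit_closure(p_unit, n, iterations=60):
    """Approximate the probabilistic unit-production closure
    R_U = I + P_U + P_U^2 + ..., which converges for consistent grammars.
    p_unit is an n x n list of lists of unit-rule probabilities."""
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    power = [row[:] for row in identity]
    for _ in range(iterations):
        # power becomes P_U^k; add it into the running sum
        power = [[sum(power[i][k] * p_unit[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
        result = [[result[i][j] + power[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# A -> B with probability 0.5 is the only unit rule.
R_U = unit_closure([[0.0, 0.5], [0.0, 0.0]], n=2)
# R_U[0][1] is 0.5 and the diagonal entries stay 1, since P_U^2 = 0 here
```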
The outer probability is denoted as $\beta_i([k: A \to \lambda \cdot \mu])$. This value represents the sum
of probabilities of all partial derivations that begin with the item $[0: S' \to \cdot S]$, generate
the prefix $w_1 \cdots w_k$, pass through the item $[k: A \to \cdot \lambda\mu]$, for some derivation, generate the suffix
$w_{i+1} \cdots w_n$, and end in the final item $[n: S' \to S \cdot]$. The outer probability is the complement
of the inner probability and, therefore, the choice of the rule $A \to \lambda\mu$ is not part of the
outer probability.
For each item, the outer probability can be calculated with the following recursive
definition:

$$\beta_n([0: S' \to S \cdot]) = 1$$
$$\beta_i([k: A \to \lambda \cdot a\mu]) = \beta_{i+1}([k: A \to \lambda a \cdot \mu]), \quad \text{if } a = w_{i+1}$$
$$\beta_i([j: C \to \nu \cdot]) = \sum_{k, A} \beta_i([k: A \to \lambda B \cdot \mu]) \, R_U(B \Rightarrow C) \, \gamma_j([k: A \to \lambda \cdot B\mu])$$
$$\beta_j([k: A \to \lambda \cdot B\mu]) = \sum_{i, C, \nu} \beta_i([k: A \to \lambda B \cdot \mu]) \, R_U(B \Rightarrow C) \, \gamma_i([j: C \to \nu \cdot])$$
In order to rewrite (2) in terms of the inner and outer probabilities, we need to note
that these definitions associate several items to a rule. Here, we considered only the
items with the dot at the beginning of the right side, $[j: A \to \cdot \lambda]$, to represent the rule
$A \to \lambda$.
Let $A \to \lambda$ be a rule, and consider the derivations that use this rule; we assume that
the Earley algorithm selects this rule at position $j$ to extend derivations of the input
string. The algorithm inserts the item $[j: A \to \cdot \lambda]$ in the list $L_j$. The probability of the
derivations that select the rule at this position, written using the inner and outer
probabilities, becomes $\beta_j([j: A \to \cdot \lambda]) \, \gamma_j([j: A \to \cdot \lambda])$.
This expression adds up the probabilities of all derivations that have selected the
rule at the position $j$. If we sum over all positions, we can rewrite the numerator
of (2). And if we sum over all positions and over all rules with the same left non-terminal,
we can rewrite its denominator. This way, expression (2) can be written as:
$$p'(A \to \lambda) = \frac{\sum_{x \in \Omega} \frac{1}{\Pr(x \mid G)} \sum_{j} \beta_j([j: A \to \cdot \lambda]) \, \gamma_j([j: A \to \cdot \lambda])}{\sum_{x \in \Omega} \frac{1}{\Pr(x \mid G)} \sum_{j} \sum_{\mu: A \to \mu} \beta_j([j: A \to \cdot \mu]) \, \gamma_j([j: A \to \cdot \mu])} \qquad (3)$$
When a bracketed corpus is available, the inner and outer probabilities can be adequately modified by using a similar function to the one described in [3]. This function
filters those derivations (or partial derivations) whose parsing are not compatible with
the bracketing defined in the sample, and takes into account only those partial parses
which are compatible with the bracketing defined in the strings.
These modifications can affect the time complexity of the algorithm per iteration,
which in the worst case is $O(n^3)$, but in a fully bracketed sample the time complexity becomes linear.
In this algorithm, the initial grammar can be obtained from a treebank corpus. Each
sentence in the corpus is explored and each syntactic structure is considered as a grammar rule. In addition, the frequency of appearance is adequately maintained and, at
the end, these values are conveniently normalized. These initial grammars have been
successfully used for real tasks [6], since they allow us to parse real test sets.
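The treebank-grammar construction just described can be sketched as follows; the `(label, children)` tuple encoding of parse trees is our assumption, not the corpus format:

```python
from collections import defaultdict

def treebank_grammar(trees):
    """Read one rule per internal node of each tree and normalize the
    rule counts per left-hand side, yielding an initial SCFG."""
    counts = defaultdict(float)
    def visit(node):
        label, children = node
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        counts[(label, rhs)] += 1
        for c in children:
            if not isinstance(c, str):
                visit(c)
    for t in trees:
        visit(t)
    totals = defaultdict(float)
    for (lhs, _), c in counts.items():
        totals[lhs] += c
    return {rule: c / totals[rule[0]] for rule, c in counts.items()}

g = treebank_grammar([("S", [("NP", ["n"]), ("VP", ["v"])]),
                      ("S", [("VP", ["v"])])])
# S -> NP VP and S -> VP each occur once, so each gets probability 0.5
```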
$$\Pr_s(w_k \mid h_k) = \frac{\Pr(w_1 \cdots w_{k-1} w_k \ldots)}{\Pr(w_1 \cdots w_{k-1} \ldots)} \qquad (4)$$

where $\Pr(w_1 \cdots w_k \ldots)$ denotes the probability that the grammatical model generates the initial substring (prefix) $w_1 \cdots w_k$.
Probability of Generating an Initial Substring with a SCFG in GF. The expression (4) can be calculated by means of a simple adaptation of the forward algorithm [4]:
The adaptation maintains a forward probability $\alpha_i$ for each item. The scanning and completion steps are the same as those of the inner probability, while the prediction step uses the probabilistic left-corner relation:

$$\alpha_i([i: B \to \cdot \nu]) = \sum_{k, A, \lambda, C, \mu} \alpha_i([k: A \to \lambda \cdot C\mu]) \, R_L(C \Rightarrow_L B) \, p(B \to \nu)$$

In this expression, $R_L$ is computed from the probabilistic left-corner relation, whose defining series converges when the grammar is consistent [14].
In this way, the probability of generating the initial substring $w_1 \cdots w_i$ is obtained by summing the forward probabilities of the scanned items in the list $L_i$.
Release 2 of this data set can be obtained from the Linguistic Data Consortium with Catalogue
number LDC94T4B (http://www.ldc.upenn.edu/ldc/noframe.html)
that we considered were the following: all words that had the POStag CD (cardinal
number [15]) were replaced by a special symbol which did not appear in the initial
vocabulary; all capital letters were uncapitalized; the vocabulary was composed of the
10,000 most frequent words that appear in the training.
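These preprocessing steps can be sketched as follows (the CD tag follows the Penn Treebank convention; the placeholder symbols and function name are ours):

```python
from collections import Counter

def preprocess(tagged_sents, vocab_size=10000, num_token="<num>"):
    """Lowercase every word, replace CD-tagged words by a special symbol,
    and keep only the `vocab_size` most frequent words."""
    sents = [[num_token if tag == "CD" else word.lower()
              for word, tag in sent] for sent in tagged_sents]
    freq = Counter(w for sent in sents for w in sent)
    vocab = {w for w, _ in freq.most_common(vocab_size)}
    return [[w if w in vocab else "<unk>" for w in sent]
            for sent in sents], vocab

sents, vocab = preprocess([[("The", "DT"), ("3", "CD"), ("men", "NNS")]])
# -> [["the", "<num>", "men"]]
```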
Baseline Model. We now describe the estimation of a 3-gram model to be used as both
a baseline model and as a part of the hybrid language model. The parameters of a 3-gram
model were estimated using the software tool described in [16]. Different smoothing
techniques were tested, but we chose the one which provided a test set perplexity which
was similar to the one presented in [8, 9]. Linear discounting was used as smoothing
technique with the default parameters. The out-of-vocabulary words were used in the
computation of the perplexity, and back-off from context cues was excluded. The tuning
set perplexity with this model was 160.26 and the test set perplexity was 167.30.
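Test-set perplexity, the figure reported throughout these experiments, is the inverse geometric mean of the word probabilities assigned by the model; a minimal sketch:

```python
import math

def perplexity(word_probs):
    """PP = exp(-(1/N) * sum(log p)) over the N evaluated word probabilities."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)

pp = perplexity([0.25, 0.25, 0.25, 0.25])
# a uniform model over 4 outcomes has perplexity 4
```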
Hybrid Language Model. First, the category-based SCFG of the hybrid model
was obtained. We first describe the estimation of a SCFG in GF, and then the
estimation of a SCFG in CNF.
Given that the UPenn Treebank corpus was used, a treebank grammar was obtained
from the syntactic information. The corpus was adequately filtered in order to use only
the POStags and the syntactic tags defined in [15]. Probabilities were attached to the
rules according to the frequency of each one in the training corpus. Then, this initial
grammar was estimated using the bracketed version of the Earley (bE) algorithm. The
perplexity of the POStag part of the tuning set was computed and the results can be seen
in Table 1.
The parameters of an initial SCFG in CNF were estimated using the bracketed version of the IO (bIO) algorithm. The initial grammar had the maximum number of rules
which can be created with 35 non-terminals and 45 terminal symbols, and the results can
be seen in Table 1.
Table 1. Tuning set perplexity of the POStag part with the bracketed version of the Earley algorithm (bE), and the bracketed version of the Inside-Outside algorithm (bIO).
Algorithm   Tuning set perplexity
bE          10.53
bIO         10.24
We can observe that slightly better results were obtained with the SCFG estimated
with the bIO algorithm. This may be due to the fact that the treebank grammar did not generalize enough and
there was not enough room for improvements [17].
based on the classification of unknown words into categories in the tuning set. A small
probability was assigned if no unseen event was associated to the category. The percentage of unknown words in the training set was 4.47% distributed in 31 categories,
and the percentage of unknown words in the tuning set was 5.53% distributed in 23
categories.
Finally, once the parameters of the hybrid language model were estimated, we applied expression (1). In order to compute expression (4) we used:
the modified version of the LRI algorithm [10] with the SCFG in CNF which was
estimated as we previously described;
the modified version of the forward algorithm previously described, with the SCFG
in GF which was estimated as we previously described.
The tuning set was used in order to determine the best value of the weight factor for the hybrid model,
and the results can be seen in Figure 1.
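Selecting the weight on the tuning set can be sketched as a simple grid search over candidate values; the probability pairs below are toy stand-ins for the two component models:

```python
import math

def best_weight(tuning_probs, weights):
    """Pick the interpolation weight minimizing tuning-set perplexity.
    tuning_probs: list of (p_ngram, p_scfg) pairs, one per word."""
    def pp(alpha):
        logs = [math.log(alpha * pn + (1 - alpha) * ps)
                for pn, ps in tuning_probs]
        return math.exp(-sum(logs) / len(logs))
    return min(weights, key=pp)

alpha = best_weight([(0.1, 0.4), (0.2, 0.3)],
                    [i / 10 for i in range(10)])   # candidates 0.0 .. 0.9
```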
Fig. 1. Tuning set perplexity depending on the weight factor for the hybrid language model with the estimated
treebank grammar (HLM-bE) and with the estimated SCFG in CNF (HLM-bIO).
The results obtained with the treebank grammar slightly improved the results obtained
by a grammar heuristically initialized with no structural information. It can be observed
that a good structural initial model plays an important role in the estimation process.
The minimum tuning set perplexity with the estimated treebank grammar was #
which means a #
% of improvement over the 3-gram. Another important aspect to
note is that the weight of the grammatical part was approximately %, which is less
than the values obtained by other authors. However it is important to note that the weight
of the grammatical part obtained by the SCFG in CNF was slightly better (35%).
Table 2 shows the test set perplexity obtained for the hybrid language model and
the results obtained by other authors who define left-to-right hybrid language models
of the same nature [8, 9]. It should be noted that the differences in the perplexity of the
trigram model were due mainly to the different smoothing techniques. It can also be
Table 2. Test set perplexity with a 3-gram model and with the hybrid language model and percentage of improvement for different proposals. The first row (CJ00) corresponds to the model
proposed by Chelba and Jelinek in [8], the second row (R01) corresponds to the model proposed
by Roark in [9], the third row (HLM-bIO) corresponds to the model proposed by [13], and the
fourth row (HLM-bE) corresponds to our proposed hybrid language model with the estimated
treebank grammar.
Model      Perplexity         Weight   % improv.
           Trig.    Interp.
CJ00       167.14   148.90    0.4      10.9
R01        167.02   137.26    0.4      17.8
HLM-bIO    167.30   142.29    0.65     15.0
HLM-bE     167.30   140.31    0.67     15.1
observed that the results obtained by our models are very good, even better if you consider that both the models and their learning methods are simple and well-consolidated.
The weight of the structural model of our proposal was less than in the other models. This
may be due to the fact that our SCFGs were not structurally as rich as the models
proposed by other authors.
3.2 Word Error Rate Results
Now, we describe very preliminary speech recognition experiments which were carried
out to evaluate the hybrid language model. Given that our hybrid language model is not
integrated in a speech recognition system, we reproduced the experiments described
in [8, 9, 13] in order to compare our results with those reported in those works.
The experiment consisted of rescoring a list of best hypotheses provided by the
speech recognizer described in [8]. A better language model was expected to improve
the results provided by a less powerful language model. In order to avoid the influence
of the language model of the speech recognizer it is important to use a large value of ;
however, for these experiments, this value was lower.
The experiments were carried out with the DARPA 93 HUB1 test setup. This test
consists of 213 utterances read from the Wall Street Journal with a total of 3,446 words.
The corpus comes with a baseline trigram model using a 20,000-word open vocabulary
and is trained on approximately 40 million words.
The 50 best hypotheses from each lattice were computed using Ciprian Chelba's
A* decoder, along with the acoustic and trigram scores. Unfortunately, in many cases,
50 distinct string hypotheses were not provided by the decoder [9]. An average of 22.9
hypotheses per utterance were rescored.
The hybrid language model was used in order to compute the probability of each
word in the list of hypotheses. The probability obtained with the hybrid language model
was combined with the acoustic score and the results can be seen in Table 3 together
with the results obtained for different language models.
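The rescoring step can be sketched as follows: each hypothesis receives its acoustic score plus a weighted language-model score, and the highest-scoring word string is kept (score scales and names are ours, not the experimental setup's):

```python
def rescore(hypotheses, lm_weight):
    """Pick the hypothesis maximizing acoustic + lm_weight * LM log score.
    hypotheses: list of (words, acoustic_logscore, lm_logscore) triples."""
    return max(hypotheses,
               key=lambda h: h[1] + lm_weight * h[2])[0]

best = rescore([(["the", "cat"], -10.0, -2.0),
                (["the", "cap"], -9.0, -6.0)], lm_weight=1.0)
# -10 + -2 = -12 beats -9 + -6 = -15, so "the cat" wins
```

A larger `lm_weight` lets the language model override the acoustic evidence, which is why the weight column matters in Table 3.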
It can be observed that our language model with the estimated treebank grammar
slightly improved the results obtained by the baseline model, in accordance with the
Table 3. Word error rate (WER) results for several models, with different training and vocabulary
sizes and the best language model weight. The first row (LT) corresponds to the lattice trigram
provided with the HUB1 test, the second row (CJ00) corresponds to the model proposed by [8],
the third row (R01) corresponds to the model proposed by [9], the fourth row (BT) corresponds
to the baseline trigram, the fifth row (No LM) corresponds to the results without the language
model, the sixth row (HLM-bIO) corresponds to the model proposed by [13], and the seventh
row (HLM-bE) corresponds to our language model with the estimated treebank grammar.
Model     Training Size   Voc. Size   LM Weight   WER
LT        40M             20K         16          13.7
CJ00      20M             20K         16          13.0
R01       1M              10K         15          15.1
BT        1M              10K         5           16.6
No LM     -               -           0           16.8
HLM-bIO   1M              10K         6           16.0
HLM-bE    1M              10K         4           16.2
results obtained by other authors. However it is important to note that the estimated
SCFG in CNF obtained better results, even if we take into account that the test set
perplexity was slightly worse with this model.
An important aspect to be noted is that although the improvement in perplexity is
important (the same order of magnitude of other authors [9]), this improvement is not
reflected in this error rate experiment. This may be due to the fact that our model is not
structurally rich enough, and that more training samples should be used.
4 Conclusions
We have described two combinations of a hybrid language model, one with a SCFG in GF and
the other with a SCFG in CNF, and the results of their evaluation have also been provided.
The test set perplexity results were as good as the ones obtained by other authors, even
better if you consider that the models are very simple and their learning methods are
well-known.
The word error rate results were slightly worse than the ones obtained by other
authors. However, we remark that these results tended to improve without including
any additional linguistic information.
For future work, we propose extending the experimentation by increasing the size
of the training corpus in order to avoid overlearning in accordance with the work of
other authors.
5 Acknowledgements
The authors would like to thank Brian Roark for providing them with the n-best lists
for the HUB1 test set.
References
1. L.R. Bahl, F. Jelinek, and R.L. Mercer. A maximum likelihood approach to continuous speech recognition. IEEE Trans. Pattern Analysis and Machine Intelligence, PAMI-5(2):179–190, 1983.
2. K. Lari and S.J. Young. The estimation of stochastic context-free grammars using the inside-outside algorithm. Computer, Speech and Language, 4:35–56, 1990.
3. F. Pereira and Y. Schabes. Inside-outside reestimation from partially bracketed corpora. In Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, pages 128–135. University of Delaware, 1992.
4. A. Stolcke. An efficient probabilistic context-free parsing algorithm that computes prefix probabilities. Computational Linguistics, 21(2):165–200, 1995.
5. J.A. Sánchez and J.M. Benedí. Learning of stochastic context-free grammars by means of estimation algorithms. In Proc. EUROSPEECH'99, volume 4, pages 1799–1802, Budapest, Hungary, 1999.
6. E. Charniak. Tree-bank grammars. Technical report, Department of Computer Science, Brown University, Providence, Rhode Island, January 1996.
7. J. Earley. An efficient context-free parsing algorithm. Communications of the ACM, 13(2):94–102, 1970.
8. C. Chelba and F. Jelinek. Structured language modeling. Computer Speech and Language, 14:283–332, 2000.
9. B. Roark. Probabilistic top-down parsing and language modeling. Computational Linguistics, 27(2):249–276, 2001.
10. J.M. Benedí and J.A. Sánchez. Combination of n-grams and stochastic context-free grammars for language modeling. In Proceedings of COLING, pages 55–61, Saarbrücken, Germany, 2000. International Committee on Computational Linguistics.
11. F. Jelinek. Statistical Methods for Speech Recognition. MIT Press, 1998.
12. L.E. Baum. An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes. Inequalities, 3:1–8, 1972.
13. J. García, J.A. Sánchez, and J.M. Benedí. Performance and improvements of a language model based on stochastic context-free grammars. In F.J. Perales, A.J.C. Campilho, N. Pérez, and A. Sanfeliu, editors, Iberian Conference on Pattern Recognition and Image Analysis, Lecture Notes in Computer Science, 2652, pages 271–278. Springer, June 2003.
14. F. Jelinek and J.D. Lafferty. Computation of the probability of initial substring generation by stochastic context-free grammars. Computational Linguistics, 17(3):315–323, 1991.
15. M.P. Marcus, B. Santorini, and M.A. Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2):313–330, 1993.
16. R. Rosenfeld. The CMU statistical language modeling toolkit and its use in the 1994 ARPA CSR evaluation. In ARPA Spoken Language Technology Workshop, Austin, Texas, USA, 1995.
17. D. Linares, J.M. Benedí, and J.A. Sánchez. Learning of stochastic context-free grammars by means of estimation algorithms and initial treebank grammars. In F.J. Perales, A.J.C. Campilho, N. Pérez, and A. Sanfeliu, editors, Iberian Conference on Pattern Recognition and Image Analysis, Lecture Notes in Computer Science, 2652, pages 403–410. Springer, June 2003.
Abstract. This paper describes recent improvements in Synapse system [5, 6] for inductive inference of context free grammars from sample
strings. For effective inference of grammars, Synapse employs incremental
learning based on the rule generation mechanism called inductive CYK
algorithm, which generates the minimum production rules required for
parsing positive samples. In the improved version, the form of production
rules is extended to include not only A → BC but also A → βγ, called extended Chomsky normal form, where each of β and γ is either a terminal or a
nonterminal symbol. By this extension and other improvements, Synapse
can synthesize both ambiguous grammars and unambiguous grammars
with less computation time compared to the previous system.
Keywords: incremental learning, inductive CYK algorithm, context
free language, unambiguous grammar, iterative deepening
Introduction
is derived. An important feature of this method is that the system only generates the rules that are applied in the bottom-up parsing by the CYK algorithm.
Synapse uses incremental learning for two purposes. First, for learning a grammar from its sample strings, the positive samples are given to the rule generation by the inductive CYK algorithm in order, until all the positive samples are derived from the set of rules. This method differs from the usual incremental learning, as in [7], in that the system checks all the negative samples each time it finds a set of rules for a positive sample. This process continues until the system finds a set of rules from which all the positive samples, but none of the negative samples, are derived. The second use of incremental learning is to synthesize a grammar by adding rules to the rules of previously learned grammars of similar languages.
The system inputs a set of samples and an initial set of rules, and searches for a set of rules from which all the positive samples, but no negative samples, are derived. To obtain the minimum sets of rules, the system employs iterative deepening: the sets of rules are searched for within a limited number of rules, and when the search fails, the system iterates the search with larger limits. Iterative deepening is widely used for finding optimum solutions by search, for example in inductive logic programming.
The approaches to grammatical induction, as well as to machine learning of rule sets in general, are classified into two categories, covering-based and generalization-based approaches, which we might call top-down and bottom-up approaches, respectively. The term covering derives from the fact that the learning system gradually covers all the positive samples, starting with an empty set of rules [2]. It iteratively synthesizes rules until the rule set derives all the positive sample strings but no negative sample strings. The covering approach is employed by Sakakibara et al. [9, 10] as well as by Synapse. Sakakibara's method is based on both a genetic algorithm and the CYK algorithm. Their use of the CYK algorithm differs from our approach in that possible tables of symbol sets are generated and tested in every generation of the GA process.
Many other grammatical inference systems for CFGs employ the generalization-based approach, which is based on generating rules by analyzing the samples and on generalization and abstraction of the generated rules. The GRIDS system by Langley and Stromsten [4] uses two operators, to create new nonterminal symbols and to merge rules, for inferring grammars from natural language sentences. The objective of the Emile system [11] is to generate rules by analyzing a large volume of natural language texts. Other differences between these systems and Synapse are that natural languages are the target languages and that these systems are not intended to synthesize simple grammars for general CFGs.
rules (or simply rules) are of the form A → u with A ∈ N, u ∈ (N ∪ T)+. Throughout this paper, we represent nonterminal symbols by uppercase characters and terminal symbols by lowercase characters. For any strings v, w ∈ (N ∪ T)+, we write v ⇒_G w if w can be derived from v by replacing a nonterminal symbol A with the right-hand side of a rule for A. We consider grammars whose rules have the forms

A → a and A → βγ (A ∈ N, a ∈ T, β, γ ∈ N ∪ T),
called the extended Chomsky normal form (extended CNF). Note that CNF is a special case of the extended CNF. An important feature of the extended CNF is that the CYK algorithm can be extended to deal with rules of this form, as shown later. Another feature is that grammars of this form can be simpler than those in Chomsky normal form. This feature is significant for learning rules, since the computation time depends heavily on the number of rules.
Our previous works [5, 6] are concerned with learning grammars of the form A → βγ, called the revised CNF. With this form, it is generally possible to simplify not only the sets of rules but also the grammatical inference, by omitting rules of the form A → a, when the number of terminal symbols is not large. Although no one-letter string can be derived from a revised CNF grammar, the languages defined by such grammars include all the context free languages of strings of length two or more.
The following coding technique is a method of making up for the fact that this normal form does not have rules of the form A → a:
- Doubling the symbols in the strings, so that a string a1 a2 ... an is represented by a1 a1 a2 a2 ... an an.
- Using rules of the form A → aa instead of A → a.
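The doubling step of this coding is straightforward; here is a minimal sketch (the helper name is ours, not part of Synapse):

```python
def double_symbols(w):
    # a1 a2 ... an  ->  a1 a1 a2 a2 ... an an
    return ''.join(ch + ch for ch in w)

print(double_symbols('ab'))  # aabb
```

With every sample encoded this way, a one-symbol string a becomes aa, so the rule A → aa of the revised CNF takes over the role of A → a.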
Using Synapse
We have two versions of Synapse: one is written in C++ and the other in Prolog. The time data in this section are based on the faster C++ version running on an AMD Athlon processor with a 1 GHz clock. As an introduction to Synapse, we outline below how to use it.
1. Input a file of positive and negative samples. For efficient synthesis, the
samples are ordered by their length.
2. Input a file of initial rules (optional). The initial rules are either those of another grammar or simple supplementary rules.
Footnote 1: In the left derivation, rules are always applied to the leftmost nonterminal symbol.
3. Select the form of the production rules to be generated from the following, where each of β and γ is either a terminal or a nonterminal symbol. Since Synapse currently does not generate rules of the form A → a in the case of extended CNF, these rules need to be given as initial rules.
4. Choose whether Synapse synthesizes an ambiguous grammar or an unambiguous grammar.
5. Set the following optional parameters. The functions of these parameters are
explained later.
The maximum limit of the number of rules (Kmax ). The default value
is infinity. This value determines the range of search by the iterative
deepening.
The limit Rmax of the number of rules to be generated for one positive
sample. The default value is equal to Kmax .
The limit of the number of nonterminal symbols. This parameter is usually not important, since generation of a new nonterminal symbol is
restricted by the form of generated rules (see Section 4).
6. Start synthesis. Every time the system finds a new grammar, it outputs the
rule set and pauses. When the user responds by clicking a redo button, the
system restarts and tries to find the next solution.
Example 1: The set of strings containing the same number of a's and b's. The following are the samples for the language {w ∈ {a, b}+ | #a(w) = #b(w), |w| ≥ 2}, where #x(w) is the number of x's in w.
Positive: ab, ba, baab, aabb, abba, bbaa, aaabbb, abaabb, aabbab, baabab.
Negative: aa, bb, aaa, aba, baa, bba, aaaa, aaba, baaa, babb, bbba, bbbb.
In this case, no initial rule is necessary. We select the revised CNF and ambiguous grammars. Synapse outputs the following grammar with seven rules as the first solution in 0.43 seconds, after generating 2359 rules in the search:

S → ab | ba | bC | Cb | SS,  C → aS | Sa
Note that the sample set is not minimal: the same grammar can be obtained from only the first seven positive samples above and the first four negative samples. In most cases, the resulting grammars and the computation time do not depend significantly on the number of samples, especially the number of positive samples, provided that the input contains a sufficient set of samples.
If we choose an unambiguous grammar for this language, Synapse shows that there is no unambiguous grammar having fewer than 14 rules. In this case, the search takes several hours and needs approximately 100 samples.
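The grammar of Example 1 can be checked against the samples with a CYK-style bottom-up membership test for revised CNF, in which every right-hand side is a pair of terminal or nonterminal symbols. The sketch below covers the membership test only, not Synapse's rule generation, and the function and variable names are ours:

```python
def derives(rules, w, start='S'):
    # cover[(i, j)]: symbols (terminals or nonterminals) deriving w[i:j]
    n = len(w)
    cover = {(i, i + 1): {w[i]} for i in range(n)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            syms = set()
            for k in range(i + 1, j):
                for lhs, (beta, gamma) in rules:
                    if beta in cover[(i, k)] and gamma in cover[(k, j)]:
                        syms.add(lhs)
            cover[(i, j)] = syms
    return start in cover.get((0, n), set())

# The seven rules S -> ab|ba|bC|Cb|SS, C -> aS|Sa found by Synapse:
rules = [('S', ('a', 'b')), ('S', ('b', 'a')), ('S', ('b', 'C')),
         ('S', ('C', 'b')), ('S', ('S', 'S')),
         ('C', ('a', 'S')), ('C', ('S', 'a'))]
print(all(derives(rules, w) for w in ['ab', 'ba', 'baab', 'aabb']))  # True
print(any(derives(rules, w) for w in ['aa', 'bb', 'aba', 'baa']))    # False
```

Because terminals appear directly on right-hand sides, the base case of the table simply lets each input character cover itself, which is the extension of CYK mentioned above for rules whose symbols may be terminal or nonterminal.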
Example 2: The set of strings containing more b's than a's. For the language {w ∈ {a, b}+ | #a(w) < #b(w), |w| ≥ 2}, Synapse synthesizes the following nine rules in 31 seconds, after generating 2.4×10^5 rules:

S → bb | SC | Sb | bC | Cb,  C → ab | ba | Sa | aS
Synapse can also incrementally learn a grammar of this language based on the grammar of the similar language in Example 1. By giving that grammar as the initial set of rules, after replacing every occurrence of the starting symbol S by S1, Synapse generated the following additional rules in 0.4 seconds, after generating 2549 rules:

S → Sb | SS1 | bb | bS1 | S1b.
Note that the total number of rules, 7 + 5 = 12, is greater than the 9 rules of the grammar above, but the total computation time is much less than the time for the directly obtained grammar.
This section describes the improved version of the system. The main differences from the previous version [6] are the form of the production rules and the introduction of the parameter Rmax.
4.1
Figure 2 shows the extended inductive CYK algorithm. For inputs of a string w and a set P0 of rules, the procedure generates a set P1 of rules such that w is derived from P0 ∪ P1.
Input: an ordered set SP of positive sample strings; an ordered set SN of negative sample strings; an initial set P0 of rules; and the limit Kmax of the number of rules.
Output: a set P of rules such that all the strings in SP are derived from P but no string in SN is derived from P.
Procedure:
Step 1: Initialize the variables P ← P0 (the set of rules),
  N ← {S} ∪ {the set of nonterminal symbols in P0}, and
  K ← |P0| (the limit of the number of rules).
Step 2: For each w ∈ SP, iterate the following operations.
  1. Find a set of rules by calling the inductive CYK algorithm with the inputs w, P, N and K. The results are returned to the global variables P and N.
  2. For each v ∈ SN, test whether v is derived from P by the CYK algorithm. If there is a string v derived from P, then backtrack to the previous choice point.
  If no set of rules is obtained, then
  1. If K ≥ Kmax, terminate (no set of rules is found within the limit).
  2. Otherwise, add 1 to K and restart Step 2.
Step 3: Output the result P. For finding multiple solutions, backtrack to the previous choice point. Otherwise, terminate.

Fig. 1. Top-level procedure of Synapse
4.3 Heuristics
Limitation on the form of generated rules. When the process of the inductive CYK algorithm generates a rule A → βγ with a new symbol A on the left-hand side, this rule is not effective until a rule containing A on the right-hand side is also generated. Therefore, after generating the rule A → βγ, we can restrict the rule generation so that it does not terminate until a rule of the form either B → Aδ or B → δA is generated. By adding this heuristic, the synthesis speed increased by a factor of two to twenty.
Intelligent backtracking. Consider the case where, by adding two or more newly generated rules, a positive sample is derived from the set of rules, but some negative sample is also derived from this set of rules. In this case, instead of a rule R among the generated rules, another rule may be generated in the redoing process, even though the rule R is not used in the derivation of the negative sample. To avoid this inefficiency, in backtracking the system tests, for each of the rules, whether the negative sample is derived from the set of rules without this rule, and regenerates a rule only if the test succeeds. By adding this heuristic, the synthesis speed increased severalfold.
Hash memory for avoiding repeated search. The search tree generally has two or more equivalent nodes that correspond to the same rule sets. To avoid repeating equivalent searches, Synapse (C++ version) has a hash memory for checking whether each rule set has already been processed, each time the system generates the set. This mechanism reduces the number of rule sets generated in the search, at the cost required for storing and checking the hash memory. Some experiments showed that the same rule sets appear frequently in the search, although the reduction in the computation time is generally not large (5–35%) because of the overhead of accessing the hash memory.
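A minimal sketch of such a visited-set check (names are ours): a rule set is canonicalized to an order-independent hashable key before the lookup, so equivalent search nodes that list the same rules in different orders map to the same entry.

```python
visited = set()

def seen_before(rule_set):
    # Canonical, order-independent key for a collection of (lhs, rhs) rules.
    key = frozenset(rule_set)
    if key in visited:
        return True
    visited.add(key)
    return False

r1 = [('S', ('a', 'b')), ('C', ('a', 'S'))]
r2 = [('C', ('a', 'S')), ('S', ('a', 'b'))]  # same rules, other order
print(seen_before(r1), seen_before(r2))       # False True
```

In a real search the memory cost grows with the number of distinct rule sets visited, which is the trade-off against search-tree reduction mentioned above.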
Performance Results
Table 1 shows the grammars synthesized by Synapse from their sample strings alone, the computation time, and the number GR of generated rules, which represents the size of the search tree. These grammars are solutions to exercise problems
Table 1. Grammars synthesized by Synapse (computation time in seconds).

(a) balanced parentheses
    A: S → () | C) | SS,  C → (S
       R = 4, Time = 0.07, GR = 254
    U: S → () | C),  C → S( | (S | DS,  D → S(
       R = 6, Time = 0.77, GR = 3219

(b) regular expressions
    A: S → a | b | S* | SS | G) | HS,  G → (S,  H → S+
       R = 6, Time = 1.9, GR = 1.0×10^4

(c) w = wR, w ∈ {a, b}{a, b}+
    S → aa | bb | bD | Ca,  C → aa | ab | aS,  D → ab | bb | Sb
       R = 10;  A: Time = 1.9, GR = 1.9×10^4;  U: Time = 2.2, GR = 2.2×10^4

(d) #a(w) = #b(w)
    A: S → ab | ba | bC | Cb | SS,  C → aS | Sa
       R = 7, Time = 0.43, GR = 2359

(e) 2#a(w) = #b(w)
    A: S → bC | Cb | SS,  C → ab | ba | bD | Db,  D → aS | Sa
       R = 9, Time = 22, GR = 1.8×10^5

(f) the complement of {ww | w ∈ {a, b}+}
    A: S → CD | DC,  C → a | FE,  D → b | GE,  E → a | b,  F → EC,  G → ED
       R = 8, Time = 3337, GR = 1.4×10^7

A: ambiguous grammar.  U: unambiguous grammar.
R: number of rules in the grammar.  GR: number of generated rules.
wR: the reversal of w.  #x(w): the number of x's in w.
in the textbook by Hopcroft & Ullman [3], where the language (c) is the set of palindromes, (d) is the set of strings containing the same number of a's and b's, (e) is the set of strings containing twice as many b's as a's, and (f) is the set of strings not of the form ww. Each of the grammars is the first result among the multiple results.
All the grammars except (f) are found with Rmax = 2, where the parameter Rmax is the limit of the number of rules generated for one positive sample. The grammar (f) cannot be obtained by setting Rmax = 2, but is obtained with Rmax = 3.
For the grammars (b) and (f), we used the coding technique to synthesize the grammars that derive one-symbol strings. The grammar (f) is synthesized by giving Synapse the rules C → aa, D → bb, E → aa | bb as an initial rule set, and setting the form of the rules to Chomsky normal form.
5.2
At present, Synapse has not directly synthesized, in a reasonable amount of time, a grammar for the language {a^i b^j c^k | i = j or j = k; i, j, k ≥ 1}, which is also among the problems of Hopcroft & Ullman [3]. Synapse found a grammar for this language by giving the grammars of its subsets L1 = {a^i b^i c^k | i, k ≥ 1} and L2 = {a^i b^k c^k | i, k ≥ 1} as the initial set of rules. Each of the following grammars of L1 and L2, respectively, is obtained in less than 0.1 seconds (GR = 599).

L1: S1 → Dc | S1c,  D → ab | Eb,  E → aD.
L2: S2 → aF | aS2,  F → bc | Gc,  G → bF.
Given these two grammars as an initial rule set, Synapse (Prolog version) synthesized the remaining four rules

S → aF | aS2 | Dc | S1c

of the revised CNF in less than 820 seconds (GR = 5287). By selecting the extended CNF as the rule form, Synapse found the two rules S → S1 | S2 in 120 seconds (GR = 2365) instead of the four rules.
The system also synthesized a grammar in revised CNF for this language L1 ∪ L2 from the grammar of its subset L1 as the initial rule set. Synapse (C++ version) found the remaining seven rules

S → Dc | S1c | aG,  G → bc | aG | bH,  H → Gc

in 240 seconds (GR = 3.8×10^5).
5.3 Computation Time
The computation time depends on the number GR of generated rules, the limit Rmax, the number of samples, and the number of nonterminal symbols. The number GR generally grows exponentially with the size of the synthesized rule sets. The time for synthesizing a grammar with Rmax = 2 is three to four times less than that with Rmax = 3, and approximately six or more times less than that without the restriction of this parameter, when the size of the rule set is larger than eight. If the longer positive samples are given before the shorter ones, the parameter Rmax needs to have a value higher than 3, and the computation time increases considerably.
Figure 3 represents the relation between the number of rules in the generated grammar and the number of all the generated rules for the grammars in Table 1, together with grammar p of the language {a^m b^n | 1 ≤ m ≤ n}, grammar q of the language L1 above, and grammar r of the parenthesis language with elements such as (), ()(), (()), (())().
(At present, the inductive CYK algorithm for the extended CNF is implemented only in the Prolog version of Synapse.)

Conclusion
Fig. 3. The number of generated rules versus the number of rules in the generated
grammars.
- Applying the induction methods to synthesizing other classes of grammars, such as graph grammars, and extending Synapse to learn grammars for natural languages.
- Developing a method of representing semantics in sample strings. This is necessary, for example, for synthesizing an unambiguous CFG of arithmetic expressions. The partially structured examples in [10] might be used for this purpose.
Acknowledgements
The author would like to thank Satoshi Matsumoto and Satoru Yoshioka for their help in writing and testing the Synapse system. This work is partially supported by the Research Institute for Technology of Tokyo Denki University.
References
1. D. Angluin and M. Kharitonov, When Won't Membership Queries Help?, Jour. Computers and System Sciences 50 (1995) 336–355.
2. I. Bratko, Refining complete hypotheses in ILP, Proc. 9th International Workshop on Inductive Logic Programming (ILP 99), LNAI 1634, Springer-Verlag, pp. 44–55, 1999.
3. J. E. Hopcroft and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation (Addison-Wesley, 1979).
4. P. Langley and S. Stromsten, Learning Context-Free Grammars with a Simplicity Bias, Machine Learning: ECML 2000, LNAI 1810 (Springer-Verlag, 2000) 220–228.
5. K. Nakamura and Y. Ishiwata, Synthesizing context free grammars from sample strings based on inductive CYK algorithm, Fifth International Colloquium on Grammatical Inference (ICGI 2000), LNAI 1891 (Springer-Verlag, 2000) 186–195.
6. K. Nakamura and S. Matsumoto, Incremental Learning of Context Free Grammars, 6th International Colloquium on Grammatical Inference (ICGI 2002), LNAI 2484 (Springer-Verlag, 2002) 174–184.
7. R. Parekh and V. Honavar, An incremental interactive algorithm for regular grammar inference, Third International Colloquium on Grammatical Inference (ICGI-96), LNCS 1147 (Springer-Verlag, 1996) 222–237.
8. L. Pitt and M. Warmuth, The Minimum Consistent DFA Problem Cannot be Approximated within any Polynomial, Jour. of ACM 40 (1993) 95–142.
9. Y. Sakakibara and M. Kondo, GA-based learning of context-free grammars using tabular representations, Proc. 16th International Conference on Machine Learning (Morgan Kaufmann, 1999) 354–360.
10. Y. Sakakibara and H. Muramatsu, Learning of context-free grammars from partially structured examples, Fifth International Colloquium on Grammatical Inference (ICGI 2000), LNAI 1891 (Springer-Verlag, 2000) 229–240.
11. M. Vervoort, Emile 4.4.6 User Guide (Universiteit van Amsterdam, 2002).
Introduction
The problem of learning context free grammars (in the limit) from structural
examples has attracted much attention ([6],[7],[2],[4]). A structural example for
a sentence consists of the parse tree structure of that sentence without the nonterminal labels. The set of parse trees of a context free language, also known as a tree language, can be characterized by a tree automaton ([6],[7]). This allows
methods for learning regular languages to be extended to the learning of context
free languages from structural examples (see [6],[7],[2]).
When Gold [3] first presented his paradigm of learning in the limit, he also
showed that within this paradigm the class of context free grammars is not
learnable from positive examples. While the use of structural examples does
seem to make the learning problem easier, it is still not enough to get around
Gold's theorem and (just as with the class of regular languages) the class of
context free languages is not learnable from structural examples. For some, this
is sufficient reason to reject Gold's learning paradigm altogether. Others, who
still want to apply this paradigm to the learning of context free grammars, may
resort to two methods. The first is to supply the learner with some additional
information, the other, to restrict the class of languages which has to be learned
to a subclass of the context free languages.
Sakakibara presented solutions of both types. In [6], the learner is allowed
to use structural equivalence and membership queries, thus gaining additional
information about the language, while in [7] only positive examples are used
but the algorithm is restricted to the class of reversible context free grammars.
The two approaches can also be combined, of course. Fernau [2] extended [7]
to learning the f-distinguishable tree languages. Every choice of a distinguishing function f results in a learnable class. For any language, when the learning
algorithm is equipped with an appropriate distinguishing function, it is guaranteed to learn the language correctly from positive examples. Clearly, by Gold's
theorem, to guarantee a correct choice of the distinguishing function for every
language, the learner must make use of some information beyond the positive
examples for the language.
The algorithm we present in this paper also learns context free grammars
from structural examples. The learning algorithm, which is very simple and
natural, uses a finite set of structures, the context set. Similarly to Fernau's
distinguishing functions, every context set makes a subclass of the context free
languages learnable and together these subclasses cover the whole class of context
free grammars. Another property shared with Fernau's algorithm is that even
when a language is outside the class of languages learnable by the algorithm (for
a given context set), the algorithm converges to a language containing (overgeneralizing) the original language. The resulting language does not depend on
the order in which the language is presented to the learner (compare this with the
approximation property in [1]). Beyond these similarities, the learning algorithm
itself is of a completely different nature, as the ability to distinguish trees, which
is given as an oracle (the distinguishing function) to Fernau's algorithm, does
not exist for our algorithm (until the learning process actually converges). The
workings of our algorithm can in some ways be compared with the family of
tail algorithms presented in [5], even though these algorithms are not learning
algorithms in the sense of Gold (as convergence in the limit is not guaranteed).
In addition to presenting the basic algorithm (section 2) and proving its convergence (section 3) we present (in section 5) a method for choosing the context
set to be used, based on the sample being presented to the learner. It relies on
the fact that even when the learning process is confined to receiving positive examples, the learner has access not only to the positive examples themselves, but
also to the order and frequencies in which they appear. Whether this contains
any information about the language being learned depends on the way in which
the examples are generated. Usually, when learning within Gold's paradigm of
identification in the limit, the only assumption one makes about the process
generating the examples is that every sentence will eventually be generated by
it. It has often been observed that this is a very strong requirement (even in
Gold's original paper some alternatives are examined).
The method presented here constructs the context set out of the most frequent structures in (some initial segment of) the language sample. Such frequent
structures are clearly evident in natural languages, and it has already been observed by linguists [8] that these structures may play an important role in language acquisition by children. Whether this procedure produces a good context
set for the language to be learned depends on the settings in which we wish to
apply the algorithm and is usually an empirical question. There are, however,
several properties of the learning algorithm which make this method a good
candidate. We discuss this in section 5.
A basic property shown in sections 4 and 5 is that there is a natural process
which leads to learnable languages and if a language becomes unlearnable (for
whatever reason) there is a way of returning to a learnable language (even if a
somewhat different one). Beginning with any language, repeated learning (that
is, every learner learning the language which the previous learner converged to)
is bound to arrive (after a finite number of steps) at a language learnable by
all subsequent learners. The class of these limit languages is then a very natural
learnable class.
The learning algorithm receives trees as input and produces a set of rules (the
grammar) as output. It learns, therefore, from structural examples. The algorithm creates grammar rules by looking at subtrees of the sample trees and the
context trees in which they appear. We begin by defining all these structures.
A tree T is either a single terminal σ ∈ Σ or a tuple ⟨T1, . . . , Tl⟩ (1 ≤ l) where T1, . . . , Tl are trees. We write T(Σ) for the set of all trees over the terminal set Σ. A context is a tree in which a single leaf ∗ can be substituted by a tree to create a tree. We assume that ∗ ∉ Σ (∗ indicates the leaf in the context where a tree can be substituted). A context c is either the symbol ∗ or c = ⟨T1, . . . , Ti−1, c′, Ti+1, . . . , Tl⟩ (1 ≤ l, 1 ≤ i ≤ l) where c′ is a context and T1, . . . , Ti−1, Ti+1, . . . , Tl ∈ T(Σ). We write CT(Σ) for the set of all contexts over the terminal set Σ.
Given a context c and a tree T, we write c(T) for the tree created by substituting T for the (unique) leaf labeled ∗ in c. Given a tree T and a node η in this tree, we can always write T = c(S), where S is the subtree rooted at the node η (when η is the root of the tree T, c = ∗). A tree S is a subtree of a tree T if there exists a context c such that T = c(S). We define S(T) = {S ∈ T(Σ) : ∃c ∈ CT(Σ) s.t. T = c(S)}, the set of subtrees of T. Given a set of trees X (e.g. a language L), the set of subtrees of X is S(X) = ∪_{T ∈ X} S(T).
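Under a simple tuple encoding of these definitions (a tree is a terminal string or a tuple of trees, and '*' marks the substitution leaf of a context), c(T) and S(T) can be sketched as follows; the function names are ours, not the paper's:

```python
STAR = '*'

def contains_star(c):
    # True if the (sub)context c contains the substitution leaf '*'
    if c == STAR:
        return True
    return isinstance(c, tuple) and any(contains_star(x) for x in c)

def subst(c, t):
    # c(T): replace the unique '*' leaf of context c by the tree t
    if c == STAR:
        return t
    return tuple(subst(x, t) if contains_star(x) else x for x in c)

def subtrees(t):
    # S(T): every subtree of t, including t itself and its leaves
    out = {t}
    if isinstance(t, tuple):
        for x in t:
            out |= subtrees(x)
    return out

print(subst(('a', STAR), ('b', 'c')))            # ('a', ('b', 'c'))
print(('b', 'c') in subtrees(('a', ('b', 'c')))) # True
```

Tuples are hashable, so subtree sets can be built directly with Python sets, which matches the set-based definitions above.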
Central to the operation of the algorithm is a finite context set C ⊆ CT(Σ). This context set C is allowed to change a finite number of times as the language is being presented to the learner, so we have a finite sequence of context sets C1, . . . , Cn. However, in what follows we will assume that the last context set, Cn,
is used throughout the presentation of the language sample. This is a legitimate
assumption, since, in the worst case, each time the context set changes, the
algorithm can rerun all previous learning steps using the updated context set.
For the algorithm presented here it will become clear that much less is needed.
The way in which the context set sequence C1 , . . . , Cn is determined will be
discussed later, in section 5. Therefore, from now on we will assume a finite
context set C which is fixed at the beginning of the learning process. It is also required that ∗ ∈ C.
To construct grammar rules, the algorithm uses the partially ordered set O = (2^C ∪ Σ, ≤) with the subset ordering on the elements of 2^C (the elements of Σ are neither comparable with elements of 2^C nor with each other). Every grammar rule is then of the form (y | x1, . . . , xl)_d where y ∈ 2^C, x1, . . . , xl ∈ O, 1 ≤ l and 0 ≤ d. The natural number d is the level of the rule.
Let R be a set of such rules. We define the function ctx_R(T), which maps a tree T into an element of O based on the rules in the rule set R. If T = σ ∈ Σ then ctx_R(T) = σ. Otherwise T = ⟨T1, . . . , Tl⟩ and ctx_R(T) = ∪{y : ∃(y | x1, . . . , xl)_k ∈ R s.t. k ≤ depth(T), xi ≤ ctx_R(Ti), 1 ≤ i ≤ l}, where depth(T) is defined recursively as depth(T) = 1 + max_{i=1,...,l} depth(Ti) and depth(σ) = 0 (for a terminal σ). The language L(R) generated by a rule set R is defined to be L(R) = {T ∈ T(Σ) : ∗ ∈ ctx_R(T)}.
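Continuing the tuple encoding of trees, the recursive computation of ctx_R and the membership test for L(R) can be sketched like this. Rules are represented as (y, (x1, ..., xl), level) triples, the ordering ≤ is taken as subset inclusion on context sets and equality on terminals, and all names are ours:

```python
STAR = '*'

def depth(t):
    # depth of a tree: 0 for a terminal, 1 + max child depth otherwise
    if isinstance(t, tuple):
        return 1 + max(depth(x) for x in t)
    return 0

def leq(x, v):
    # the ordering on O = 2^C ∪ Σ: subset order on sets, equality on terminals
    if isinstance(x, frozenset) and isinstance(v, frozenset):
        return x <= v
    return x == v

def ctx(rules, t):
    # ctx_R(T): a terminal maps to itself; an inner node collects the y of
    # every rule whose level and argument constraints are satisfied
    if not isinstance(t, tuple):
        return t
    kids = [ctx(rules, x) for x in t]
    y = set()
    for y_r, xs, level in rules:
        if level <= depth(t) and len(xs) == len(kids) and \
                all(leq(x, v) for x, v in zip(xs, kids)):
            y |= y_r
    return frozenset(y)

def in_language(rules, t):
    # L(R) = { T : '*' in ctx_R(T) }
    c = ctx(rules, t)
    return isinstance(c, frozenset) and STAR in c

# One level-1 rule accepting the two-leaf tree <a, b>:
rules = [(frozenset({STAR}), ('a', 'b'), 1)]
print(in_language(rules, ('a', 'b')), in_language(rules, ('b', 'a')))
```

The level check `level <= depth(t)` mirrors the condition k ≤ depth(T) in the definition, which is what makes the level-by-level convergence argument below possible.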
The algorithm maintains a set of pre-rules, P. A pre-rule is of the form (y | T1, . . . , Tl) where y ∈ 2^C and T1, . . . , Tl are trees. A projection π_R from pre-rules to rules, based on the rule set R, is defined. If the pre-rule P is (y | T1, . . . , Tl) then π_R(P) = (y | ctx_R(T1), . . . , ctx_R(Tl))_{depth(⟨T1,...,Tl⟩)}. The number depth(⟨T1, . . . , Tl⟩) is the depth of the pre-rule P, which is equal to the level of the rule π_R(P).
The algorithm is initialized with an empty set of pre-rules P and may change this pre-rule set at every step. For every pre-rule set P, the rule set hypothesized by the algorithm is the unique rule set R such that R = π_R(P).
2. Otherwise, if there already exists a pre-rule (y | T1, . . . , Tl) in P, then this pre-rule is replaced by the pre-rule (y ∪ (C ∩ {c}) | T1, . . . , Tl). Otherwise, the pre-rule (C ∩ {c} | T1, . . . , Tl) is added (as is) to P.
3. After the pre-rule set is updated, the corresponding rule set is calculated
and all non-minimal rules (and corresponding pre-rules) are removed.
Given any finite context set C and any context free language L, the algorithm will be shown to converge to a language L_C, which is independent of the specific language sample which is presented to the algorithm, and such that L ⊆ L_C.
Not only the language but also the grammar (rule set) hypothesized by the
algorithm will be shown to converge. However, the exact grammar which the
algorithm converges to may depend on the order in which the examples are
presented to it.
From now on we assume that the context set C has been fixed. We begin by constructing a rule set R^L which will be shown to generate the language L_C to which the algorithm converges. The algorithm need not necessarily converge to the same rule set, but will be shown to converge to an equivalent one. We define the set P^L = {({c ∈ C : c(T) ∈ L} | T1, . . . , Tl) : T = ⟨T1, . . . , Tl⟩ ∈ S(L)} of pre-rules and then define R̃^L to be the unique rule set such that R̃^L = π_{R̃^L}(P^L). Finally, we define R^L to be the set of minimal elements in R̃^L. Because the language L is generated by a finite context free grammar, there is a bound on the degree of the nodes of the trees in L. From this, together with the fact that C is finite, it follows that the rule set R^L is finite.
Lemma 1. For any finite context set C and context free language L, L ⊆ L(R^L).
Proof. It follows from the definition of the rule ordering that R^L and R̃^L generate the same language, L(R̃^L) = L(R^L). For every tree T ∈ L, ∗ ∈ ctx_{R̃^L}(T) and then, by definition, T ∈ L(R̃^L) = L(R^L). □
We write ctx^L for the function ctx_{R^L}. For every rule R = (y | x1, . . . , xl)_d ∈ R^L and for every c ∈ y, we define the representative class CL(R, c) = {S ∈ L : S = c(⟨T1, . . . , Tl⟩), ctx^L(Ti) = xi, 1 ≤ i ≤ l, depth(⟨T1, . . . , Tl⟩) = d}. There are finitely many such classes. We call a sample S = {S_i}^∞_{i=1} class fat if it contains infinitely many trees from CL(R, c) for every such class. To fulfill this condition, trees from each class may be repeated.
The only condition we impose on the language sample is that it be class
fat. Since the algorithm can simply remember all examples presented to it and
repeat the calculations on them as needed, the class fat sample requirement can
be replaced, at the expense of increased memory and time resources, by the
requirement that at least one tree from each class appear in the sample. This
last condition is even weaker than the standard assumption that all sentences
in the language appear in the sample. We prefer to assume the repetition of the
necessary examples because the repetition seems reasonable in many settings
and reduces the resources required by the algorithm.
We will usually assume some fixed language sample S = {S_i}^∞_{i=1}. Relative to this sample, we wish to examine the pre-rule and rule sets generated by the
to this sample, we wish to examine the pre-rule and rule sets generated by the
algorithm at each step. We therefore write P n and Rn for the pre-rule set and
the corresponding rule set as generated by the algorithm at the end of the nth
step (one algorithm step is applied for every non-leaf node of a sample tree and
so several steps may be needed for the processing of a single sample tree).
The rule set will be shown to converge level by level, beginning with the lowest level rules. We therefore write R^n_k for the rules of level at most k in R^n, and similarly R^L_k for the rules of level at most k in R^L. We also write ctx^L_k = ctx_{R^L_k}, ctx^n = ctx_{R^n} and ctx^n_k = ctx_{R^n_k}. The first lemma shows convergence of the rule set for any given level.
Lemma 2. For any context free language L, any finite context set C, any class
fat sample S of L and for any 0 ≤ k, there is a number N_k such that for every
N_k ≤ n, R^n_k = R^{N_k}_k and, for every tree T, ctx^n_k(T) = ctx^L_k(T).
Proof. By induction on k. For k = 0, the claim is immediate from the definitions.
We now assume the claim for k and prove it for k + 1. Let N_k < n and let
R = (y|x1, . . . , xl)_d ∈ R^n_{k+1}. There exists some S = ⟨S1, . . . , Sl⟩ ∈ S(L) of depth
d ≤ k + 1 such that xi = ctx^n(Si) (1 ≤ i ≤ l) and y ⊆ {c ∈ C : c(S) ∈ L}. Since
depth(S) ≤ k + 1 and by the induction hypothesis, xi = ctx^n(Si) = ctx^n_k(Si) =
ctx^L_k(Si) = ctx^L(Si) (1 ≤ i ≤ l). Therefore, ({c ∈ C : c(S) ∈ L}|x1, . . . , xl)_d ∈ R^L.
Since there is a finite number of trees of depth at most k + 1, there is an N
such that for every N ≤ n and every tree T of depth at most k + 1, ctx^n_{k+1}(T) =
ctx^L_{k+1}(T). For a step N < n, if the algorithm tries to add a pre-rule (y|S1, . . . , Sl)
for S = ⟨S1, . . . , Sl⟩ of depth at most k + 1 then, since {c ∈ C : c(S) ∈ L} ⊆
ctx^L(S) = ctx^L_{k+1}(S), y ⊆ ctx^{n−1}_{k+1}(S) = ctx^L_{k+1}(S). Since also |y| ≤ 1, there must
be a rule R ∈ R^{n−1}_{k+1} such that R ≥^{n−1} ((y|S1, . . . , Sl)) and the pre-rule is not
added. It follows that R^n_{k+1} = R^N_{k+1} for any N ≤ n. Since for every tree T there
is an N(T) such that, for N(T) ≤ n, ctx^n_{k+1}(T) = ctx^L_{k+1}(T), and since the rules
of level at most k + 1 do not change after step N, it follows that N(T) ≤ N for
every tree T. Therefore, we can take N_{k+1} = N. □
Based on the convergence of levels we can now prove the convergence of the
whole rule set.
Theorem 1. For any context free language L, any finite context set C and any
class fat sample S of L, the algorithm converges to a rule set R such that
L ⊆ L(R^L) = L(R).
Proof. That L ⊆ L(R^L) is a restatement of Lemma 1. We therefore show that
L(R^L) = L(R). Let k be the level of the highest level rule in R^L. Using Lemma
2 we take N = N_k. Let N ≤ n and assume that at step n + 1 the algorithm
attempts to add the pre-rule P = (y|S1, . . . , Sl). Writing S = ⟨S1, . . . , Sl⟩ we
have, by definition, that y ⊆ {c ∈ C : c(S) ∈ L} and (as in the proof of
Lemma 1) y ⊆ ctx^L(S). By the choice of k and Lemma 2, y ⊆ ctx^L(S) =
ctx^L_k(S) = ctx^n_k(S) ⊆ ctx^n(S). Therefore, since |y| is either 0 or 1, there is a
rule R = (z|x1, . . . , xl)_d ∈ R^n such that y ⊆ z, xi ⊆ ctx^n(Si) (1 ≤ i ≤ l)
and d ≤ depth(S). In other words, R ≥^n (P) and the algorithm does not add
the pre-rule. Therefore, the pre-rule set does not change. This shows that the
rule set converges at step N_k. To complete the proof, it remains to prove that
for any tree T, ctx^{N_k}(T) = ctx^L(T). This follows immediately from Lemma 2,
because there are finitely many rules in R^{N_k}, and applying Lemma 2 for the level
of the highest level rule in R^{N_k} gives the required equality (notice that R^{N_k} may
contain rules of level higher than k). □
Our next step is to prove that for any context free language L there is a
context set C such that L_C = L. We will actually show that for every finite set
of languages we can find a context set which makes them all learnable. From
the proof it will immediately be clear that for every language L there are many
different context sets which guarantee convergence to L. As a result, an algorithm
for finding the context set C enjoys a considerable amount of flexibility.
Theorem 2. For every finite set ℒ of context free languages, there is a finite
set of contexts C^ℒ such that for any finite set of contexts C with C^ℒ ⊆ C and any L ∈ ℒ,
L_C = L.
Proof. It is enough to prove the theorem for |ℒ| = 1, since then we can take
C^ℒ = ⋃_{L∈ℒ} C^{{L}}. Fix a context free language L. For every tree T, we define the
set C(T) = {c ∈ CT(Σ) : c(T) ∈ L}. Since the language L is generated by a
finite number of rules, the sub-trees S(L) of L can be assigned only finitely many
different types by the context free grammar which generates L. Since two trees
of the same type appear in the same contexts in the language L, there are only
finitely many different sets C(T). Let C1, . . . , Ch be these different sets. The set
C^{{L}} is constructed by taking, for every i ≠ j, a context c ∈ Ci \ Cj (if such a
context exists). We also add λ to C^{{L}} (if it is not already there). From now on
we fix some set C as the context set for the algorithm, such that C^{{L}} ⊆ C.
We will show that for every tree T of depth at least 1, ctx^L(T) = C ∩ C(T).
This proves the theorem because trees of depth zero cannot be in any language
and, for trees of depth at least 1, T ∈ L_C ⟺ T ∈ L(R^L) ⟺ λ ∈
ctx^L(T) ⟺ λ ∈ C(T) ⟺ T ∈ L, and therefore L_C = L.
If T = ⟨T1, . . . , Tl⟩ ∈ S(L) then, since {c ∈ C : c(T) ∈ L} = C ∩ C(T),
there is a rule (C ∩ C(T)|ctx^L(T1), . . . , ctx^L(Tl))_{depth(T)} ∈ R^L and it follows
that C ∩ C(T) ⊆ ctx^L(T). If T ∉ S(L) then C ∩ C(T) ⊆ ctx^L(T) trivially, since
C(T) = ∅.
It remains to show that ctx^L(T) ⊆ C ∩ C(T). Since, by definition, ctx^L(T) ⊆
C, we show (by induction on the depth of T) that ctx^L(T) ⊆ C(T). The claim
is immediate for a tree of depth 1, since in this case T = ⟨σ1, . . . , σl⟩ where
σ1, . . . , σl ∈ Σ, and there can be at most one rule in R^L which matches this tree,
namely, (C ∩ C(T)|σ1, . . . , σl)_1. We now assume the claim for trees of depth k
and prove it for trees of depth k + 1.
Let T = ⟨T1, . . . , Tl⟩ be a tree of depth k + 1 and let c ∈ ctx^L(T). There
must be a rule (y|x1, . . . , xl)_d ∈ R^L such that c ∈ y, xi ⊆ ctx^L(Ti) (1 ≤ i ≤ l)
and d ≤ k + 1. Therefore, there is a tree S = ⟨S1, . . . , Sl⟩ of depth at most
k + 1 such that ctx^L(Si) = xi for i = 1, . . . , l and y = C ∩ C(S). For any
1 ≤ i ≤ l, if depth(Si) = 0 then xi = Si = σi and therefore also ctx^L(Ti) =
σi, which implies that Ti = Si. In the same way, if depth(Ti) = 0 then Ti =
Si. If 0 < depth(Si) (and therefore also 0 < depth(Ti)) it follows, using the
induction hypothesis, that C ∩ C(Si) = ctx^L(Si) ⊆ ctx^L(Ti) = C ∩ C(Ti). By
the construction of C, this means that C(Si) ⊆ C(Ti) or, in other words, that
Ti can appear in L in any context in which Si appears. This is, of course, true
also for Si with depth(Si) = 0, since then Si = Ti. Since c(⟨S1, . . . , Sl⟩) ∈ L, this
implies that c(⟨T1, S2, . . . , Sl⟩) ∈ L and, repeating this argument, we conclude that
c(T) = c(⟨T1, . . . , Tl⟩) ∈ L. Therefore, c ∈ C(T). □
The grammars produced by the algorithm are not context free grammars.
However, Theorem 2 shows that for any context free grammar, the algorithm
can generate a grammar which is strongly equivalent to it (that is, generates the
same trees). It is also not too difficult to check that for any rule set R, L(R) is
a context free language. We omit the proof of this here.
language. In this section we show that repeated learning (that is, each learner
learning the language to which the previous learner converged) is bound (after a
finite number of steps) to arrive at a language learnable by subsequent learners.
It is not even necessary for every learner in the sequence to use the same context
set C and the context set can be some random function of the language being
learned (as will be discussed in detail later on).
To prove this, we need some notation for the cycles of learning. First, we
assume that we have a sequence of finite context sets C = {Cn}_{n=0}^∞, the nth
generation learner using the context set C_{n−1}. We then define S^∞(L) =
⋃_{0≤n} S(L^(n)), that is, the set of all subtrees in all generations. The
language L^∞ is defined to be the language generated by the rule set R^∞(L),
and we write ctx^∞(T) for ctx^{R^∞(L)}(T).
Theorem 3. For any context free language L and any context set sequence C
such that ⋃C is finite, there exists an N such that for every N ≤ n, L^∞ = L^(n).
Proof. Let T ∈ L^(n). Then, by Lemma 3, λ ∈ ctx^(n−1)(T) ⊆ C^(n)(T) ∩ C_{n−1} ⊆
C^∞(T). Therefore λ ∈ C^∞(T). It is easy to check by induction that C^∞(T) ⊆
ctx^∞(T) and therefore T ∈ L^∞. This shows that L^(n) ⊆ L^∞ for all n.
To complete the proof, we show that there exists an N such that for every
N < n, L^∞ ⊆ L^(n). Since {L^(n)}_{0≤n} is an increasing sequence of languages,
{C^(n)(T)}_{0≤n} is an increasing sequence of sets. From this, together with the
assumption that ⋃C is finite, it follows that for every tree T with 1 ≤ depth(T)
there is a number N(T) such that for every N(T) ≤ n, C^(n)(T) = C^∞(T). Since
for every d the number of trees of depth no greater than d is finite, there is a
number N(d) such that for every tree T ∈ S^∞(L) with depth(T) ≤ d and for
every N(d) ≤ n, T ∈ S(L^(n)) and, if 1 ≤ depth(T), then also C^(n)(T) = C^∞(T).
We prove the claim by taking N = N(k) + 1, where k is the maximal level of any
rule in R^∞(L).
Fix some N ≤ n. We prove the claim by constructing a rule set R such
that L^∞ ⊆ L(R) ⊆ L^(n). Let φ(x) be defined by φ(σ) = σ for σ ∈ Σ and
φ(y) = y ∩ C_{n−1} for y ⊆ CT(Σ). For a rule R = (y|x1, . . . , xl)_d we can then
define φ(R) = (φ(y)|φ(x1), . . . , φ(xl))_d. The rule set R is then defined by R =
{φ(R) : R ∈ R^∞(L)}.
Let R ∈ R; then there is a rule R′ ∈ R^∞(L) such that R = φ(R′). By the
choice of k, R′ = (C^∞(S)|C^∞(S1), . . . , C^∞(Sl)) for some S = ⟨S1, . . . , Sl⟩ ∈
S^∞(L) of depth at most k. By the choice of N, S ∈ S(L^(n−1)) and {c ∈ C_{n−1} :
c(S) ∈ L^(n−1)} = φ(C^∞(S)). By Lemma 3 together with the choice of N,
ctx^(n−1)(Si) = φ(C^∞(Si)) (1 ≤ i ≤ l). Therefore, R ∈ R^{L^(n−1)}. This shows
that L(R) ⊆ L(R^{L^(n−1)}) = L^(n).
We now show that for every tree T, φ(ctx^∞(T)) ⊆ ctx^R(T). This proves that
L^∞ ⊆ L(R), because T ∈ L^∞ implies λ ∈ ctx^∞(T), hence λ = φ(λ) ∈
φ(ctx^∞(T)) ⊆ ctx^R(T) and therefore T ∈ L(R). We prove the claim by induction
on the depth of T. For T = σ ∈ Σ (depth 0) this is immediate from the definitions.
We now assume the claim for d and let T = ⟨T1, . . . , Tl⟩ be a tree of depth d + 1.
Let c ∈ φ(ctx^∞(T)). There must be a rule R = (y|x1, . . . , xl)_j ∈ R^∞(L)
(j ≤ d + 1) such that c ∈ y and xi ⊆ ctx^∞(Ti) (1 ≤ i ≤ l). Since φ(R) ∈ R
and, by the induction hypothesis, φ(xi) ⊆ φ(ctx^∞(Ti)) ⊆ ctx^R(Ti) (1 ≤ i ≤ l),
it follows that the rule φ(R) is used in calculating ctx^R(T) and therefore
c ∈ φ(y) ⊆ ctx^R(T). □
We wish to apply this theorem in the following setting. For every context
free language L there is a random variable X(L) which selects a context set for
L (that is, X(L) has values in 2^{CT(Σ)}). We assume that there exists a finite context
set C_max such that for every L, P(X(L) ⊆ C_max) = 1. In this case, beginning
with any context free language L over Σ, we can generate the random sequence
using any Cn with N ≤ n. Every such Cn was generated by the random variable
X(L^(n)) = X(L^∞). Moreover, since there are infinitely many such n, it follows
that, with probability 1, for any C such that P(X(L^∞) = C) > 0, C = Cn for
some N ≤ n. Therefore, with probability 1, L^∞ is learnable with any context
set to which X(L^∞) gives a non-zero probability. This means that if the learner
generates the context set used by the algorithm based on the random variable
X, the language sequence eventually converges to a language which is learnable
by this learner with probability 1. In the next section we will see an example of
such a learning strategy.
One natural way for the learner to choose the context set to be used with the
learning algorithm is to take those contexts which appear most frequently in
the language sample. Since for every language there are many different context
sets which make it learnable, the algorithm is not too sensitive to moderate
(unavoidable) changes in the frequencies and to the exact definition of most
frequent structures.
One reason for choosing the most frequent structures is that the more frequent the contexts in the context set are, the sooner the algorithm will see all the
examples it needs to see in order to converge (since the interesting examples are
those in which contexts from the chosen context set appear). We can imagine that
some learners may prefer quick convergence even at the expense of inaccuracy of
learning (over-generalization). Another reason for taking the most frequent contexts is that it is reasonable to assume that in most settings, the most frequent
contexts depend not only on the language structure (grammar) but also on language usage. The influence of language usage on the chosen context set is then
described by a random variable X(L). This random variable depends, indeed, on
the language being used. However, we assume that even if the language changes
(over the generations) some common properties of language usage remain fixed.
For example, since the size of a context is proportional to the number of words in
the sentence in which it appears and since this directly influences the semantics
of the sentence, it seems reasonable to assume that there is a bound on the size
of the most frequent contexts, regardless of the specific grammar being used. In
this way we satisfy the condition P (X(L) Cmax ) = 1 of the previous section
for all languages L. As we saw before, this condition guarantees that, with probability 1, the language sequence will eventually converge to a language which is
learnable with probability 1.
It is possible to come up with different variants of the most-frequent-contexts approach outlined here; the discussion above would apply to any of them.
Here we present just one very simple method, based on the frequencies
of contexts in a predetermined initial segment of the sample. The algorithm selects the k most frequent contexts in the first n examples or, alternatively,
takes all those contexts which appeared at least k times in the first n examples.
This method is certain to construct a finite context set and to stabilize to a
constant set after a finite number of steps.
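As a concrete sketch of this selection step (a toy illustration; the tree representation and the `extract_contexts` argument are our own assumptions, not notation from the paper), the two variants can be written as:

```python
from collections import Counter

def select_context_set(examples, extract_contexts, n, k, mode="top"):
    """Build a finite context set from the first n sample trees.

    extract_contexts(tree) is assumed to return the (hashable)
    contexts occurring in one sample tree.
    """
    counts = Counter()
    for tree in examples[:n]:
        counts.update(extract_contexts(tree))
    if mode == "top":
        # the k most frequent contexts in the first n examples
        return {c for c, _ in counts.most_common(k)}
    # alternatively, all contexts seen at least k times
    return {c for c, freq in counts.items() if freq >= k}
```

Because only a finite prefix of the sample is inspected, the chosen set is finite and never changes after the first n examples, which is all the convergence results above require.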
Conclusion
The algorithm presented in this paper is extremely simple and natural. It essentially amounts to reading off rules from examples and discarding rules when
they become redundant (non-minimal). This simplicity allows us to distinguish
those factors essential to the correct functioning of the algorithm from those
details which may be more freely modified. What is essential to the working of
the algorithm is the use of a finite set of structures (the context set), the use of a
grammar based on a partial order and the use of rules already inferred by the algorithm to infer additional rules (through the ctx function). Since inferred rules
may initially be incorrect, the rules are layered to ensure correct convergence.
Other details of the algorithm can be easily modified without changing the basic
results. For example, the use of full contexts can be replaced by sub-structures of
those contexts, sub-structures of the parse trees (equipped with an appropriate
ordering) or even by information which is external to the sentence structure itself (such as semantic information). All one has to ensure is that a large enough
variety of structures is available for the algorithm to be able to distinguish trees
of different types (as in Theorem 2). Different choices of such structures can
be made, without any significant effect on the analysis given here. In addition,
different methods for determining the finite structure set (the context set given
as a parameter) can be devised. While all these different variants have the same
basic convergence properties, they may differ greatly in their rate of convergence, extent of over-generalization, computational complexity and applicability
to actual learning situations.
1 Introduction
Context-free grammars may be considered the customary way of representing syntactic structure in natural language sentences. In many natural-language processing applications, obtaining the correct syntactic structure for
a sentence is an important intermediate step before assigning an interpretation
to it. But ambiguous parses are very common in real natural-language sentences
(e.g., those longer than 15 words); the fact that many ambiguous parse examples in books sound a bit awkward is due to the fact that they involve short sentences [1, 2, p. 411]. Choosing the correct parse for a given sentence is a crucial
task if one wants to interpret the meaning of the sentence, due to the principle of
compositionality [3, p. 358], which states, informally, that the interpretation of a
sentence is obtained by composing the meanings of its constituents according to
the groupings defined by the parse tree¹.
A set of rather radical hypotheses as to how humans select the best parse
tree [6] proposes that a great deal of syntactic disambiguation may actually occur without the use of any semantic information, that is, just by selecting a
preferred parse tree. It may be argued that the preference of one parse tree with
respect to another is largely due to the relative frequencies with which those
The authors wish to thank the Generalitat Valenciana project CTIDIB/2002/173 and
the Spanish CICyT project TIC2000-1599 for supporting this work.
¹ The principle of compositionality in natural language is akin to the concept of syntax-directed translation used in compiler construction [4, p. 25] and to the principles which
inspire the syntactic transfer architecture used in machine translation systems [5].
choices have led to a successful interpretation. This sets the ground for a family of techniques which use a probabilistic scoring of parses to select the correct parse
in each case.
Probabilistic scorings depend on parameters which are usually estimated
from data, that is, from parsed text corpora such as the Penn Treebank [7]. The
most straightforward approach is that of treebank grammars [8]. Treebank grammars (which will be explained in detail below) are probabilistic context-free
grammars in which the probability that a particular nonterminal is expanded
according to a given rule is estimated as the relative frequency of that expansion, obtained by simply counting the number of times it appears in a manually-parsed
corpus. This is the simplest probabilistic scoring scheme, and it is not without problems; we will show how a set of approximate models, which we will
call offspring-annotated models, in which expansion probabilities depend
on the future expansions of children, may be seen as a generalization of the
classic n-gram models to the case of trees, and include treebank grammars as
a special case; other models, such as Johnson's [9] parent-annotated models (or,
more generally, ancestry-annotated models), IBM history-based grammars
[2, p. 423], [10], or Bod's and Scha's Data-Oriented Parsing [11] offer an alternate approach in which the probability of expansion of a given nonterminal is
made dependent on the previous expansions. An interesting property of many
of these models is that, even though they may be seen as context-dependent,
they may still be easily rewritten as context-free models in terms of specialized
versions of the original nonterminals.
This paper is organized as follows: section 2 proposes our generalization
of the classic n-gram models to the case of trees, which is shown to be equivalent to having a specialized context-free grammar. A simplification of this model,
called the child-annotated model, is also presented in that section. Experiments using this model for structural disambiguation are presented
in section 4 and compared with other family-annotated models, and, finally,
conclusions are given in section 5.
2 The model
Let Ω be a treebank, that is, a sample of parse trees. For all k > 0 and for all
trees t = σ(t1, . . . , tm), we define the k-root of t as the tree

  r_k(σ(t1, . . . , tm)) = σ if k = 1, and σ(r_{k−1}(t1), . . . , r_{k−1}(tm)) if k > 1.   (1)

The sets f_k(t) of k-forks and s_k(t) of k-subtrees are defined for all t as follows:
f_k(t) collects the k-roots of those subtrees of t whose depth is at least k − 1 (2),
and s_k(t) collects those subtrees of t whose depth is smaller than k − 1 (3),
where depth(t) denotes the depth of the tree t (the depth of a single-node
tree is zero).
We define the treebank probabilistic k-testable grammar through:
- Σ, the set of labels in Ω;
- S, the start symbol;
- a set P of production rules (usually written as X → α, where X is a nonterminal and α a string over the grammar symbols), with an emission probability p(X → α) associated with each rule.
The set P is built as follows:
- for every k-fork found in Ω, a rule is added to P which expands the (k − 1)-root of the fork into its children; its probability is estimated as the number of times the fork appears in Ω divided by the number of times its (k − 1)-root appears (equations (4) and (5));
- for every k-subtree found in Ω, a terminal rule is added to P with its relative-frequency probability (equation (6)).
These probabilities satisfy the normalization constraint (7) and the consistency constraint. PCFGs estimated from treebanks using the relative frequency estimator always satisfy those constraints [12], [13].
Note that in this kind of model, the expansion probability for a given node
is computed as a function of the subtree of depth k − 2 that the node generates,
i.e., every non-terminal symbol stores a subtree of depth k − 2. In the particular
case k = 2, only the label of the node is taken into account (this is analogous to
the standard bigram model for strings) and the model coincides with the simple
rule-counting approach used in treebank grammars.
However, in the case k = 3 we get a child-annotated model; that is, nonterminal symbols
are defined by the node label together with the labels of its children.
As an illustration, consider a very simple sample containing only the tree in
figure 1. If we choose k = 2, then the rules read off the tree are S → NP VP,
NP → NP PP, NP → N, VP → VP PP, VP → V NP and PP → P NP, each with its
relative-frequency probability. However, for k = 3 we obtain a child-annotated
grammar in which every nonterminal is specialized with the sequence of its
children's labels, so that, for instance, an NP node expanded as NP PP and an
NP node expanded as N give rise to two different annotated nonterminals.
For comparison, if one uses a parent-annotated version of the grammar (following Johnson [9]), each nonterminal carries its parent's label as a superindex, which yields rules such as S → NP^S VP^S and NP^S → NP^NP PP^NP.
3 Smoothing
As will be seen in the experiments section, although the child-annotated models
yield good performance (in terms of both parsing accuracy and speed), their rules are
very specific and, therefore, some events (subtrees, in our case) in the test set are not
present in the training data, yielding zero probabilities. Due to data sparseness,
this happens often in practice. However, this is not the case for the unannotated
model, k = 2, which has total coverage but worse performance. This justifies
the need for smoothing methods.
In the following, three smoothing techniques are described. Two of them
are well known: linear interpolation and tree-level back-off. In addition, we
introduce a new smoothing technique: rule-level back-off.
3.1 Linear interpolation
Smoothing through linear interpolation [14] is performed by computing the probability of events as a weighted average of the probabilities given by different
models. For instance, the smoothed probability of a k = 3 model can be computed as a weighted average of the probability given by the model itself and
that given by the k = 2 model; that is, the probability of a tree t in the interpolated model is

  p(t) = λ p3(t) + (1 − λ) p2(t),   (8)

where p3 and p2 are the probabilities of the tree in, respectively, the k = 3 model
and the k = 2 model. In our experiments, the mixing parameter λ
will be chosen to minimize the perplexity of the sample.
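A minimal sketch of this interpolation (illustrative only; the grid search for λ is our own simplification, not the paper's estimation procedure):

```python
import math

def interp_prob(p3, p2, lam):
    # Linear interpolation of the k = 3 and k = 2 model probabilities.
    return lam * p3 + (1.0 - lam) * p2

def perplexity(pairs, lam):
    # pairs: one (p3, p2) probability pair per sample tree.
    log_sum = sum(math.log(interp_prob(p3, p2, lam)) for p3, p2 in pairs)
    return math.exp(-log_sum / len(pairs))

def best_lambda(pairs, steps=100):
    # Simple grid search for the mixing parameter that minimizes
    # the perplexity of the sample.
    grid = [i / steps for i in range(steps + 1)]
    feasible = [l for l in grid
                if all(interp_prob(p3, p2, l) > 0 for p3, p2 in pairs)]
    return min(feasible, key=lambda l: perplexity(pairs, l))
```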
In our experiments, we will assume that a value of λ may be found such that no sentence in the test set having a parse tree with nonzero k = 3 probability has another parse with zero k = 3 probability that wins under the interpolated model (equations (9) and (10)). This leads to the following efficient parsing strategy: k = 2 (unannotated, slow) parsing is not launched if the k = 3 (annotated, fast) parser returns a tree, because the k = 3 tree will win out over all k = 2 trees; therefore, for parsing purposes, the actual value of λ is irrelevant.
In the rule-level back-off model, the grammar combines:
1. k = 3 rules with modified probability,
2. back-off rules that allow switching to the lower model, and
3. modified k = 2 rules to switch back to the higher model.
The k = 2 rules are added as unary rules: if the rule is X → α, then a rule X′ → α is added with a probability that incorporates the back-off weight (equations (13) and (14)); whether the corresponding annotated nonterminal exists in the grammar must be checked at parse time.
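The back-off idea itself can be summarized in a few lines (a sketch under our own naming; `beta` plays the role of the fixed back-off parameter):

```python
def backoff_prob(p3, p2, beta):
    # Use the child-annotated (k = 3) probability when that model covers
    # the tree; otherwise back off to the unannotated (k = 2) model,
    # weighted by the back-off parameter beta.
    return p3 if p3 > 0.0 else beta * p2
```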
4 Experiments
4.1 General conditions
We have performed a series of experiments to assess the structural disambiguation performance of offspring-annotated models as compared to standard treebank grammars, that is, to compare their relative ability to select the best
parse tree. To better put these comparisons in context, we have also evaluated
Johnson's [9] parent-annotation scheme. To build training corpora and test sets
of parse trees, we have used English parse trees from the Penn Treebank, release
3, with small, basically structure-preserving modifications:
- insertion of a root node (ROOT) in all sentences (as in Charniak [8]) to encompass the sentence and final periods, etc.;
- removal of nonsyntactic annotations (prefixes and suffixes) from constituent labels (for instance, NP-SBJ is reduced to NP);
- removal of empty constituents; and
- collapse of single-child nodes with the parent node when they have the same label.
109
108
In all experiments, the training corpus consisted of all of the trees (41,532) in
sections 02 to 22 of the Wall Street Journal portion of the Penn Treebank, modified
as above. This gives a total of more than 600,000 subtrees. The test set
contained all sentences in section 23 having less than 40 words.
All grammar models were written as standard context-free grammars, and
Chappelier and Rajman's [15] probabilistic extended Cocke-Younger-Kasami
parsing algorithm (which constructs a table containing generalized items like
those in Earley's [16] algorithm) was used to obtain the most likely parse for
each sentence in the test set; this parse was compared to the corresponding gold-standard tree using the customary PARSEVAL evaluation metric [1, 2, p. 432] after deannotating the most likely tree delivered by the
parser. PARSEVAL gives partial credit to incorrect parses by establishing three
measures:
- labeled precision (LP) is the fraction of correctly-labeled nonterminal bracketings (constituents) in the most likely parse which match the gold-standard parse,
- labeled recall (LR) is the fraction of bracketings in the gold-standard parse which are found in the most likely parse with the same label, and
- crossing brackets (CB) is the fraction of constituents in one parse that cross constituent boundaries in the other parse.
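As an illustration of the first two measures (a simplified sketch: constituents are taken as (label, start, end) triples and duplicates are ignored, whereas PARSEVAL proper counts multiset matches):

```python
def labeled_precision_recall(candidate, gold):
    """Compare two sets of labeled constituents (label, start, end)."""
    cand, ref = set(candidate), set(gold)
    matched = cand & ref
    lp = len(matched) / len(cand)   # labeled precision
    lr = len(matched) / len(ref)    # labeled recall
    return lp, lr
```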
The crossing brackets measure does not take constituent labels into account and
will not be shown here. Some authors (see, e.g., [17]) have questioned partial-credit evaluation metrics such as the PARSEVAL measures; in particular, if one
wants to use a probability model to perform structural disambiguation before
assigning some kind of interpretation to the parsed sentence, it may well be
argued that the exact match between the gold-standard tree and the most likely
tree is the only relevant measure. It is, however, well known that
the Penn Treebank, even in its release 3, still suffers from problems. One of the
problems worth mentioning (discussed in detail by Krotov et al. [18]) is the
presence of far too many partially bracketed constructs according to rules like
NP → NN NN CC NN NN NNS, which lead to very flat trees, when one can,
in the same treebank, find rules such as NP → NN NN, NP → NN NN NNS
and NP → NP CC NP, which would lead to more structured parses such as
(NP (NP NN NN) CC (NP NN NN NNS))
Some of these flat parses may indeed be too flat to be useful for semantic purposes and have little linguistic plausibility; therefore, if one gets a more refined
parse, one may or may not consider it the one leading to the correct interpretation,
but it surely contains more information than the flat, unstructured one.
For this reason, we have chosen to give, in addition to the exact-match figure, the percentage of trees having 100% recall, because these are the trees in
which the most likely parse is either exactly the gold-standard parse or a refinement thereof in the sense of the previous example.
4.2 Structural disambiguation results
Here is a list of the models which were evaluated:
- A standard treebank grammar, with no annotation of node labels (NO, k = 2), with probabilities for 15,140 rules.
- A child-annotated grammar (CHILD, k = 3), with probabilities for 92,830 rules.
- A parent-annotated grammar (PARENT), with probabilities for 23,020 rules.
- A both parent- and child-annotated grammar (BOTH), with probabilities for 112,610 rules.
The average time to parse a sentence shows that child annotation leads to
parsers that are much faster. This comes as no surprise because the number
of possible parse trees considered is drastically reduced; this is, however,
not the case with parent-annotated models.
It may be worth mentioning that parse trees produced by child-annotated models tend to be more structured and refined than parent-annotated and unannotated parses which tend to use rules that lead to flat trees in the sense mentioned
above. Favoring structured parses may be convenient in some applications (especially where meaning has to be compositionally computed) but may be less
convenient than flat parses when the structure obtained is incorrect.
On the other hand, child-annotated models, CHILD and BOTH, were unable
to deliver a parse tree for all sentences in the test set (CHILD parses 94.6% of the
sentences and BOTH, 79.6%). To be able to parse all sentences, the smoothing
techniques described in section 3 were applied.
The following smoothed models were evaluated:
- A linearly interpolated model, M1, as described in subsection 3.1, with the value of λ selected to minimize the perplexity.
- A tree-level back-off model, M2, as described in subsection 3.2.
- A rule-level back-off model, M3, as described in subsection 3.3. This model has 92,830 k = 3 rules, 15,140 k = 2 rules and 10,250 back-off rules. A fixed parameter (0.005) was selected to maximize labelled recall and precision.
Model  LP     LR     Exact  Parsed
M1     80.2%  78.6%  17.4%  100%
M2     78.9%  74.2%  17.1%  100%
M3     82.4%  81.3%  17.5%  100%
5 Conclusion
We have introduced a new probabilistic context-free grammar model, the offspring-annotated PCFG, in which the grammar variables are specialized by annotating
them with the subtree they generate up to a certain level. In particular, we have
studied child-annotated models (one level) and have compared their parsing
performance to that of unannotated PCFGs and of parent-annotated PCFGs [9].
Offspring-annotated models may be seen as a special case of a very general
probabilistic state-based model, which in turn is based on probabilistic bottom-up tree automata. The experiments show that:
- The parsing performance of parent-annotated and the proposed child-annotated PCFGs is similar.
- Parsers using child-annotated grammars are, however, much faster, because the number of possible parse trees considered is drastically reduced; this is not the case with parent-annotated models.
- Child-annotated grammars have a larger number of parameters than parent-annotated PCFGs, which may make it difficult to estimate them accurately from currently available treebanks.
- Child-annotated models tend to give very structured and refined parses instead of flat parses, a tendency not so strong for parent-annotated grammars.
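The one-level child-annotation scheme discussed above can be sketched as follows (a hypothetical illustration; the tree encoding and the `^` separator are our own choices, not the paper's notation):

```python
def child_annotate(tree):
    """One-level child annotation of a parse tree.

    A tree is either a string (terminal) or a pair (label, children).
    Each internal node label is specialized with the labels of its
    children, so an NP expanding to DT NN becomes "NP^DT,NN".
    """
    if isinstance(tree, str):                       # terminal: unchanged
        return tree
    label, children = tree
    if all(isinstance(c, str) for c in children):   # preterminal: unchanged
        return (label, children)
    child_labels = [c if isinstance(c, str) else c[0] for c in children]
    new_label = label + "^" + ",".join(child_labels)
    return (new_label, [child_annotate(c) for c in children])

t = ("S", [("NP", ["she"]), ("VP", ["runs"])])
annotated = child_annotate(t)
# → ("S^NP,VP", [("NP", ["she"]), ("VP", ["runs"])])
```

Because every distinct child sequence creates a new specialized variable, this transformation directly produces the larger parameter space noted above.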
As future work, we plan to study the use of statistical confidence criteria, as used in grammatical inference algorithms [19], to eliminate unnecessary annotations by merging states, therefore reducing the number of parameters to be estimated. Indeed, offspring-annotation schemes (for a given value of the annotation level) may be useful as starting points for those state-merging mechanisms, which so far have always been initialized with the complete set of different subtrees found in the treebank (ranging in the hundreds of thousands).
References
1. Ezra Black, Steven Abney, Dan Flickinger, Claudia Gdaniec, Ralph Grishman,
Philip Harrison, Donald Hindle, Robert Ingria, Frederick Jelinek, Judith Klavans,
Mark Liberman, Mitch Marcus, Salim Roukos, Beatrice Santorini, and Tomek Strzalkowski. A procedure for quantitatively comparing the syntactic coverage of English grammars. In Proc. Speech and Natural Language Workshop 1991, pages 306–311, San Mateo, CA, 1991. Morgan Kaufmann.
2. Christopher D. Manning and Hinrich Schütze. Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA, 1999.
6. L. Frazier and K. Rayner. Making and correcting errors during sentence comprehension: Eye movements in the analysis of structurally ambiguous sentences. Cognitive
Psychology, 14:178–210, 1982.
7. Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. Building a
large annotated corpus of English: the Penn Treebank. Computational Linguistics,
19:313–330, 1993.
8. Eugene Charniak. Treebank grammars. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 1031–1036. AAAI Press/MIT Press, 1996.
9. Mark Johnson. PCFG models of linguistic tree representations. Computational Linguistics, 24(4):613–632, 1998.
10. Ezra Black, Frederick Jelinek, John D. Lafferty, David M. Magerman, Robert L. Mercer, and Salim Roukos. Towards history-based grammars: Using richer models for
probabilistic parsing. In Proceedings of the DARPA Speech and Natural Language Workshop, pages 31–37, 1992.
11. R. Bod and R. Scha. Data-oriented language processing: An overview. Technical
Report LP-96-13, Department of Computational Linguistics, University of Amsterdam, The Netherlands, 1996.
12. J.A. Sánchez and J.M. Benedí. Consistency of stochastic context-free grammars from probabilistic estimation based on growth transformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(9):1052–1055, 1997.
13. Zhiyi Chi and Stuart Geman. Estimation of probabilistic context-free grammars.
Computational Linguistics, 24(2):299–305, 1998.
14. L. R. Bahl, P. F. Brown, P. V. de Souza, and R. L. Mercer. A tree-based statistical
language model for natural language speech recognition. In A. Waibel and K.-F.
Lee, editors, Readings in Speech Recognition, pages 507–514. Morgan Kaufmann, San Mateo, CA, 1990.
15. J.-C. Chappelier and M. Rajman. A generalized CYK algorithm for parsing stochastic CFG. In Actes de TAPD'98, pages 133–137, 1998.
16. J. Earley. An efficient context-free parsing algorithm. Communications of the ACM,
13(2):94–102, 1970.
17. John Carroll, Ted Briscoe, and Antonio Sanfilippo. Parser evaluation: A survey and
a new proposal. In Proceedings of the International Conference on Language Resources and Evaluation, pages 447–454, Granada, Spain, 1998.
18. Alexander Krotov, Robert Gaizauskas, Mark Hepple, and Yorick Wilks. Compacting
the Penn Treebank grammar. In Proceedings of COLING/ACL'98, pages 699–703, 1998.
19. Rafael C. Carrasco, Jose Oncina, and Jorge Calera-Rubio. Stochastic inference of
regular tree languages. Machine Learning, 44(1/2):185–197, 2001.