PRINCIPLES OF ARTIFICIAL NEURAL NETWORKS
3rd Edition
Daniel Graupe
University of Illinois, Chicago, USA
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.
ISBN 978-981-4522-73-1
Printed in Singapore
Acknowledgments
Preface to the Third Edition
The Third Edition differs from the Second Edition in several important respects. I added four new detailed case studies describing a variety of applications (new Sections 6.D, 7.C, 9.C and 9.D), together with their respective source codes. This brings the total number of applications to 19 and gives the reader first-hand access to a wide range of different concrete APPLICATIONS of Neural Networks, ranging from medicine to constellation detection, thus establishing the main claim of Neural Networks, namely the claim of their GENERALITY.
The new case studies include an application to non-linear prediction problems (Case Study 9.C), a field where artificial neural networks play, and will continue to play, a major role. This case study also compares the performance of two different neural networks in terms of accuracy and computation time for the specific problem of that case study. Also, two new Sections (9.6 and 9.8) were added to Chapter 9.
The text organization is also modified. The chapter on the Large Memory Storage and Retrieval Neural Network (LAMSTAR) has moved from Chapter 13 of the Second Edition to become Chapter 9 in the present Edition. Consequently, the old Chapters 9 to 12 are now Chapters 10 to 13. This allows teaching and self-study to follow the main Artificial Neural Networks (ANN) in a more logical order in terms of basic principles and generality. We consider that in short courses, Chapters 1 to 9 will thus form the core of a course on ANN.
It is hoped that this text and this enhanced Edition can serve to show, and to persuade, scientists, engineers and program developers in areas ranging from medicine to finance and beyond, of the value and the power of ANN in problems that are ill-defined, highly non-linear, stochastic and of time-varying dynamics, and which often appear to be beyond solution.
Additional corrections and minor modifications are also included, as are other updates based on recent developments, including those relating to the author's research.
Daniel Graupe
Chicago, IL
March 2013
Preface to the Second Edition
The Second Edition contains certain changes and additions to the First Edition. Apart from corrections of typos and insertion of minor additional details that I considered helpful to the reader, I decided to interchange the order of Chapters 4 and 5 and to rewrite Chapter 13 so as to make it easier to apply the LAMSTAR neural network to practical applications. I also moved Case Study 6.D to become Case Study 4.A, since it is essentially a Perceptron solution.
I consider the Case Studies important to a reader who wishes to see a concrete application of the neural networks considered in the text, including a complete source code for that particular application, with explanations on organizing that application. Therefore, I replaced some of the older Case Studies with new, more detailed ones using current coding languages (MATLAB, Java, C++). To allow better comparison between the various neural network architectures regarding performance, robustness and programming effort, all chapters dealing with major networks have a Case Study solving the same problem, namely character recognition. Consequently, Case Studies 5.A (previously 4.A, since the order of these chapters is interchanged), 6.A (previously 6.C), 7.A and 8.A have all been replaced with new and more detailed Case Studies, all on character recognition in a 6 × 6 grid. Case Studies on the same problem have been added to Chapters 9, 12 and 13 as Case Studies 9.A, 12.A and 13.A (the old Case Studies 9.A and 13.A are now 9.B and 13.B). Also, Case Study 7.B, on applying the Hopfield Network to the well-known Traveling Salesman Problem (TSP), was added to Chapter 7. Other Case Studies remain as in the First Edition.
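To give a flavor of the shared case-study task mentioned above, the following is a minimal sketch, not taken from the book's own code and written in Python rather than the MATLAB, Java or C++ used in the text, of a single perceptron trained to tell apart two hand-made characters on a 6 × 6 binary grid. The patterns, function names and parameters are purely illustrative.

```python
# Illustrative sketch: one perceptron classifying two 6x6 binary characters.

def to_vector(grid):
    """Flatten a 6x6 grid of 0/1 pixels into a 36-element input vector."""
    return [pixel for row in grid for pixel in row]

# Hypothetical training characters: an "X" and an "O".
X_CHAR = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1],
]
O_CHAR = [
    [0, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 1, 1, 1, 1, 0],
]

def train_perceptron(samples, epochs=20, lr=0.1):
    """Classic perceptron learning rule: w <- w + lr * (target - output) * x."""
    weights = [0.0] * 36
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
            error = target - output
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def classify(weights, bias, x):
    """Hard-threshold output of the trained perceptron (1 = "X", 0 = "O")."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

samples = [(to_vector(X_CHAR), 1), (to_vector(O_CHAR), 0)]
w, b = train_perceptron(samples)
print(classify(w, b, to_vector(X_CHAR)))  # prints 1
print(classify(w, b, to_vector(O_CHAR)))  # prints 0
```

The book's case studies are of course far more detailed; this only shows how small the building blocks of such a classifier can be.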
I hope that these updates will add to the reader's ability to better understand what Neural Networks can do, how they are applied, and what the differences are between the different major architectures. I feel that this, together with the case studies, their source codes and the respective code-design details, will help to fill a gap in the literature available to a graduate student, or to an advanced undergraduate Senior, who is interested in studying artificial neural networks or in applying them.
Above all, the text should enable the reader to grasp the very broad range of problems to which neural networks are applicable, especially those that defy analysis and/or are very complex, such as in medicine or finance. It (and its Case Studies) should also help the reader to understand that this is both doable and rather easily programmable and executable.
Daniel Graupe
Chicago, IL
September 2006
Preface to the First Edition
This book evolved from the lecture notes of a first-year graduate course entitled "Neural Networks" which I taught at the Department of Electrical Engineering and Computer Science of the University of Illinois at Chicago over the years 1990–1996. Whereas that course was a first-year graduate course, several senior-year undergraduate students from different engineering departments attended it with little difficulty. It was mainly for historical and scheduling reasons that the course was a graduate course, since no such course existed in our program of studies, nor in the senior-year undergraduate curricula of most U.S. universities. I therefore consider this book, which closely follows these lecture notes, to be suitable for such undergraduate students. Furthermore, it should be accessible to students at that level from essentially every science and engineering university department. Its prerequisites are mathematical fundamentals in terms of some linear algebra and calculus, and computational programming skills (not limited to a particular programming language) that all such students possess.
Indeed, I strongly believe that Neural Networks are a field of both intellectual interest and practical value to all such students and young professionals. Artificial neural networks not only provide an understanding of an important computational architecture and methodology, but they also provide an understanding (very simplified, of course) of the mechanism of the biological neural network.
Neural networks were until recently considered a toy by many computer engineers and business executives. This was probably somewhat justified in the past, since neural nets could at best be applied to small memories that were analyzable just as successfully by other computational tools. I believe (and I have tried in the later chapters below to give some demonstration in support of this belief) that neural networks are indeed a valid, and presently the only efficient, tool to deal with very large memories.
The beauty of such nets is that they can allow, and in the near future will allow, for instance, a computer user to overcome slight errors in representation or in programming (missing a trivial but essential command such as a period or any other symbol or character) and yet have the computer execute the command. This will obviously require a neural network buffer between the keyboard and the main programs. It should allow browsing through the Internet with both fun and efficiency. Advances in VLSI realizations of neural networks should allow, in the coming years, many concrete applications in control, communications and medical devices, including in artificial limbs and organs and in neural prostheses, such as neuromuscular stimulation aids in certain paralysis situations.
For me as a teacher, it was remarkable to see how students with no background in signal processing or pattern recognition could easily, a few weeks (10–15 hours) into the course, solve speech recognition, character identification and parameter estimation problems such as those in the case studies included in the text. Such computational capabilities make it clear to me that the merit of the neural network tool is huge. In any other class, students might need to spend many more hours performing such tasks, and would spend far more computing time. Note that my students used only PCs for these tasks (for simulating all the networks concerned). Since the building blocks of neural nets are so simple, this becomes possible. And this simplicity is the main feature of neural networks: a housefly does not, to the best of my knowledge, use advanced calculus to recognize a pattern (food, danger), nor does its CNS computer work at picosecond cycle times. Research into neural networks tries, therefore, to find out why this is so. This leads, and has led, to neural network theory and development, and is the guiding light to be followed in this exciting field.
Daniel Graupe
Chicago, IL
January 1997
Contents
Acknowledgments
Preface to the Third Edition
Preface to the Second Edition
Preface to the First Edition

Chapter 6. Back Propagation

Chapter 7.
7.1. Introduction
7.2. Binary Hopfield Networks
7.3. Setting of Weights in Hopfield Nets – Bidirectional Associative Memory (BAM) Principle
7.4. Walsh Functions
7.5. Network Stability
7.6. Summary of the Procedure for Implementing the Hopfield Network
7.7. Continuous Hopfield Models
7.8. The Continuous Energy (Lyapunov) Function
7.A. Hopfield Network Case Study: Character Recognition
7.B. Hopfield Network Case Study: Traveling Salesman Problem
7.C. Cell Shape Detection Using Neural Networks

Chapter 9.
9.1. Motivation
9.2. Basic Principles of the LAMSTAR Neural Network

Chapter 10.
10.1. Motivation
10.2. The ART Network Structure
10.3. Setting-Up of the ART Network
10.4. Network Operation
10.5. Properties of ART
10.6. Discussion and General Comments on ART-I and ART-II
10.A. ART-I Network Case Study: Character Recognition
10.B. ART-I Case Study: Speech Recognition

Chapter 12.
12.1. Fundamental Philosophy
12.2. Annealing Methods
12.3. Simulated Annealing by Boltzmann Training of Weights
12.4. Stochastic Determination of Magnitude of Weight Change
12.5. Temperature-Equivalent Setting
12.6. Cauchy Training of Neural Network
12.A. Statistical Training Case Study: A Stochastic Hopfield Network for Character Recognition

Problems
References
Author Index
Subject Index