
MENE 5223 Artificial Intelligence

Ms. Izadora Mustaffa


FKEKK, UTeM

What Are Artificial Neural Networks
(ANNs)?

Inspired by how the brain processes information

Composed of a large set of highly interconnected
processing elements (neurons) working in parallel to
solve specific problems

ANNs, like humans, learn by example

An ANN is configured for a specific task, such as pattern
recognition or data classification, through a learning process

7/11/2014
Regions        Structure                                       Gross Function
1. Forebrain   Neocortex / Cerebral cortex (two hemispheres)
                 Occipital                                     Vision
                 Temporal                                      Hearing, speech
                 Parietal                                      Vision
                 Frontal                                       Cognition
               Amygdala                                        Emotion
               Hippocampus                                     Emotion
               Basal ganglia                                   Motor
               Septum                                          Emotion
2. Midbrain    Thalamus                                        I/O to forebrain
               Hypothalamus                                    Internal regulation
3. Hindbrain   Pons                                            Sleep, waking attention
               Brainstem                                       Sleep, waking attention
               Medulla                                         Sleep, waking attention
               Cerebellum                                      Motor

Reunification of neurobiological modelling and artificial intelligence

Machines performing cognitive functions are now built
according to HOW the brain performs cognitive functions
(simulated brain regions)

This development is known as connectionism or,
more famously, as Artificial Neural Networks (ANN)
Definition of Neural Networks
(given by the Defense Advanced Research Projects Agency (DARPA), 1988)

A system composed of many simple processing elements operating in
parallel whose function is determined by network structure,
connection strengths and the processing performed at the computing
elements or nodes [1].

NN architectures are inspired by the architecture of biological nervous
systems, which use many simple processing elements operating in
parallel to obtain high computation rates.

[1] Nodes are the functional units in an NN, also referred to as units, cells or populations.
A node is a population of neurons that is unified and has a functional sense, e.g. seeing light or feeling hunger.
Historical Overview
McCulloch and Pitts (1943): first neural network model; linear
threshold law

Hebb (1949): proposed a mechanism for learning; learning
law

Rosenblatt (1958): Perceptron network and the associated
learning rule

Widrow & Hoff (1960): a new learning algorithm for linear
neural networks (ADALINE)

Minsky and Papert (1969): widely influential book about the
limitations of single-layer perceptrons, which brought most
research on NNs to a halt.
Anderson, Kohonen (1972): Use of ANNs as associative
memory

Grossberg (1980): Adaptive Resonance Theory (ART)

Hopfield (1982): Hopfield Network

Kohonen (1982): Self-organizing maps (SOM)

Rumelhart and McClelland (1986): Backpropagation
algorithm for training multilayer feed-forward networks
...


Neuron Model
[Figure: single-input neuron, drawn in full and in abbreviated notation]
scalar input, p
scalar weight, w
bias or offset, b
summer output or net input, n
transfer function or activation function, f
scalar neuron output, a
Relation to the biological neuron:
weight, w, corresponds to the strength of a synapse
summation and transfer function correspond to the cell body
neuron output, a, is the signal on the axon

The neuron output is calculated as $a = f(wp + b)$, e.g.:
w = 3, p = 2 and b = -1.5, then

$a = f(3(2) - 1.5) = f(4.5)$

The actual output depends on the chosen transfer function
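
As a minimal sketch of this calculation, the Python snippet below evaluates a = f(wp + b) for w = 3, p = 2 and b = -1.5 under three common transfer functions. The function names (`hardlim`, `purelin`, `logsig`) and the choice of these three functions are assumptions made for illustration; they are not taken from the slide's table.

```python
import math


def hardlim(n):
    """Hard limit: outputs 1 when the net input is non-negative, else 0."""
    return 1.0 if n >= 0 else 0.0


def purelin(n):
    """Linear: passes the net input through unchanged."""
    return n


def logsig(n):
    """Log-sigmoid: squashes the net input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def neuron(p, w, b, f):
    """Single-input neuron: net input n = w*p + b, output a = f(n)."""
    n = w * p + b
    return f(n)


w, p, b = 3.0, 2.0, -1.5
for f in (hardlim, purelin, logsig):
    print(f.__name__, neuron(p, w, b, f))
```

With these assumed functions, the same net input of 4.5 gives 1 for the hard limit, 4.5 for the linear function and roughly 0.989 for the log-sigmoid, which illustrates why the output depends on the transfer function.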

Individual inputs $p_1, p_2, \dots, p_R$ are weighted by the
corresponding elements $w_{1,1}, w_{1,2}, \dots, w_{1,R}$ of the
weight matrix, $W$

n is the sum of the weighted inputs and b:
$n = w_{1,1} p_1 + w_{1,2} p_2 + \dots + w_{1,R} p_R + b$

In matrix form:
$n = W p + b$
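
The weighted sum can be sketched in a few lines of Python. The helper name `net_input` and the example numbers are hypothetical; the function computes the net input of a single neuron with R inputs, i.e. one row of W.

```python
def net_input(weights, inputs, bias):
    """Net input n = w_1*p_1 + w_2*p_2 + ... + w_R*p_R + b for one neuron."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * p for w, p in zip(weights, inputs)) + bias


# Hypothetical example with R = 3 inputs.
weights = [1.0, 2.0, -1.0]
inputs = [2.0, 0.5, 3.0]
bias = 0.5
print(net_input(weights, inputs, bias))  # 2.0 + 1.0 - 3.0 + 0.5 = 0.5
```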









Table of Transfer Functions
Classification of ANNs
ANNs can be classified in a number of ways. Here we will
look at two classifications, according to their:

A) Network architecture

i. Feed forward networks
ii. Recurrent networks
iii. Competitive networks

B) Mode of learning

i. Supervised learning
ii. Unsupervised learning
iii. Reinforcement learning
Feed forward networks
[Figure: input units, hidden units, output unit]
Recurrent Networks
[Figure: network with feedback connections]
Supervised Learning

Supervised training or learning requires training data that
contains both the network inputs and the associated
outputs.

This means that to use this type of training one must have a
set of training inputs for which the outputs are known.
Once this type of network performs satisfactorily on the
training examples, it can be used with inputs for which the
correct outputs are not known.


Unsupervised learning

Unsupervised learning techniques use only input data
and attempt through self organisation to divide the
examples presented to the network inputs up into
categories or groups with similar characteristics.

Unsupervised learning can act as a type of discovery
process identifying significant features in the input
patterns presented to it.

Reinforcement learning

Similar to supervised learning.
Instead of being provided with the correct output for each
network input, the algorithm is given a grade, which is a
measure of the network's performance over some
sequence of inputs.
Less common than supervised learning.

Principles of ANN
Any major cognitive process or ANN to be analyzed should be
organized into subprocesses and subnetworks

The following principles of ANN are used in analyzing a cognitive process into
subprocesses:

1) ASSOCIATIVE LEARNING
2) COMPETITION
3) OPPONENT PROCESSING

Two examples of analyzing a cognitive process into subprocesses:

1) CATEGORIZATION

2) ATTENTIONAL MODULATION OF CONDITIONING

Categorization
For example, matching a handwritten character to a known letter of the
alphabet.

The process: two types of units need to be included in the network

Feature nodes: respond to the presence or absence of
writing at particular locations
Category nodes: respond to the patterns of feature-node
activation representing particular letters

The connections between these two types of units should be allowed to
change over time as a result of repeated activation of the connections
(principle of associative learning)

If the connections were hard-wired, characters outside the Roman alphabet,
such as Russian and Chinese characters, could not be identified
Categorization (cont.)
Another principle that can be applied in this example is the
competition principle, for example deciding whether a sloppily
handwritten E is an E or an F

The process:
the same two types of units are used in the network

the feature nodes are activated to varying degrees by
the incoming input

via internode connections, the category nodes are
then activated

if both nodes E and F are activated, the system
chooses the one with the greater activation.
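
The competition step can be illustrated with a small winner-take-all sketch. The four feature nodes, the connection strengths and the example activations below are all hypothetical values chosen for illustration; only the idea (feature nodes activate category nodes, and the most active category node wins) comes from the text above.

```python
# Hypothetical feature nodes: how strongly each stroke is present (0.0 to 1.0).
FEATURES = ["vertical stroke", "top bar", "middle bar", "bottom bar"]

# Hypothetical connection strengths from the feature nodes to each category node.
WEIGHTS = {
    "E": [0.75, 0.75, 0.75, 1.5],   # E is supported by a bottom bar
    "F": [1.0, 1.0, 1.0, -1.0],     # F is contradicted by a bottom bar
}


def category_activations(features):
    """Activation of each category node as a weighted sum of feature activations."""
    return {
        letter: sum(w * x for w, x in zip(weights, features))
        for letter, weights in WEIGHTS.items()
    }


def classify(features):
    """Competition: the category node with the greatest activation wins."""
    activations = category_activations(features)
    return max(activations, key=activations.get)


sloppy_e = [1.0, 0.9, 0.8, 0.4]      # faint bottom bar
print(category_activations(sloppy_e))
print(classify(sloppy_e))
```

Both category nodes are activated by the sloppy character, but under these assumed weights the E node is the more active one and is selected.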


Attentional Modulation of Conditioning

Suppose that a neutral stimulus, such as the sound of a bell, is associated with
food, as with Pavlov's dog.

How does an animal learn to pay more attention to
the bell than to other neutral stimuli in its
environment?

The ANN principle of competition: different nodes represent different sensory stimuli,
and the bell node should win the competition.

WHY?

The principle of associative learning: there is a prior association between bell and food.
Repeated pairing of the stimuli strengthens the pathways.
Opponent Processing Principle
Discovered by Grossberg & co-workers.

Neural architectures are organized into pairs of pathways of
opposite significance;
light & dark
reward & punishment

It works such that a sudden decrease in activity in one
pathway activates the opposing pathway

All these principles are suggested by a heterogeneous
database that is partly physiological and partly psychological.
The concept of representation is to model a small part of a
cognitive process at a time

A complete model of how particular sensory events are
transformed into particular movement sequences depends on the
concatenation of several networks:

One network to transform raw (visual/auditory) data
Another to associate the data with co-occurring (visual/auditory) data
Another network to classify these

Subnetworks or larger networks built on ANN principles can be reused
in other models, and connections can be added or removed.
Hence, these can be considered the ANN modeler's toolkit

NN theory addresses complex problems; no single model has
cracked the problem of categorization, attention or memory.

McCulloch-Pitts Network (1943)

Based on a network of simple units, referred to as linear
threshold units

Meant only to capture the essential workings of the
biological neuron.

Also known as the all-or-none or 0-or-1 network

Boolean operators such as AND and OR can be represented by
these units

A neuron can be embedded into a network to fire selectively
in response to any given spatiotemporal array of firing of
other neurons in the network
McCulloch-Pitts Network Rules
Rules governing the excitatory and inhibitory pathways of the M-P Network

1. All computations are carried out in discrete time intervals

2. Linear threshold law:
Each neuron fires whenever a given threshold number of
excitatory pathways and no inhibitory pathways impinging on
it are active from the previous time period (Rosenblatt 1962)

3. If a neuron receives a single inhibitory signal from an active neuron, it does
not fire

4. The connections do not change as a function of experience. Thus the
network deals with performance but not learning.
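
A single M-P unit following these rules can be sketched as below. The function and variable names are hypothetical; the sketch shows how the threshold and the absolute-inhibition rule are enough to represent Boolean operators such as AND and OR.

```python
def mp_neuron(excitatory, inhibitory, threshold):
    """McCulloch-Pitts unit.

    excitatory, inhibitory: lists of 0/1 activities from the previous time step.
    Fires (returns 1) only if no inhibitory input is active and the number of
    active excitatory inputs reaches the threshold.
    """
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def AND(x1, x2):
    return mp_neuron([x1, x2], [], threshold=2)


def OR(x1, x2):
    return mp_neuron([x1, x2], [], threshold=1)


def AND_NOT(x1, x2):
    """x1 AND NOT x2: x2 arrives on an inhibitory pathway."""
    return mp_neuron([x1], [x2], threshold=1)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2), AND_NOT(x1, x2))
```

Because the weights and thresholds are fixed, the unit computes but does not learn, in line with rule 4.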




Example 1

A cold object held to the skin and then quickly removed causes a
sensation of heat

In this example, each cell has a threshold of two; hence it fires
when it receives two excitatory (+) and NO inhibitory (-) signals from
cells active at the previous time step



[Figure: Example 1 at T0, T1, T2 and T3. Network of Cell 1, Cell 2, Cell A, Cell B, Cell 3 and Cell 4, with the heat and cold receptors as inputs and Heat and Cold as outputs; excitatory (+) and inhibitory (-) connections; threshold number: 2]
What happens when the cold object is held continuously?

Let's look at the next example.
Example 2
[Figure: Example 2 at T1, T2 and T3. The same network with the cold object held continuously]
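
Both examples can be replayed with a short simulation. The wiring used here is an assumption: it follows a common textbook form of this heat/cold example (Cell 1 = heat receptor, Cell 2 = cold receptor, Cell 3 = heat sensation, Cell 4 = cold sensation) and may differ in detail from the figure. Every cell has threshold two and obeys the M-P rules.

```python
def fires(excitatory, inhibitory, threshold=2):
    """M-P rule: fire only if no inhibitory input is active and the number
    of active excitatory pathways reaches the threshold."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def simulate(cold_input, heat_input=None):
    """Return the (heat, cold) sensations at each time step.

    Assumed (hypothetical) wiring:
      Cell 2 -> A (two pathways)
      A -> B (two pathways), with Cell 2 inhibiting B
      Cell 1 -> Cell 3 and B -> Cell 3 (two pathways each)
      A -> Cell 4 and Cell 2 -> Cell 4 (one pathway each)
    """
    steps = len(cold_input)
    heat_input = heat_input or [0] * steps
    c1 = c2 = a = b = 0          # activities at the previous time step
    sensations = []
    for t in range(steps):
        new_a = fires([c2, c2], [])
        new_b = fires([a, a], [c2])
        heat = fires([c1, c1, b, b], [])   # Cell 3
        cold = fires([a, c2], [])          # Cell 4
        a, b = new_a, new_b
        c1, c2 = heat_input[t], cold_input[t]
        sensations.append((heat, cold))
    return sensations


print(simulate([1, 0, 0, 0]))   # cold object applied briefly (Example 1)
print(simulate([1, 1, 1, 1]))   # cold object held continuously (Example 2)
```

Under these assumptions the brief cold stimulus produces a heat sensation at T3, while the sustained stimulus produces a cold sensation from T2 onwards and never activates the heat cell.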
Modern connectionist networks contain three types of units (nodes), which are given different names in
different texts but refer to the same units:

                                  Input units       Output units     Hidden units
McCulloch-Pitts network           Cells 1 & 2       Cells 3 & 4      Cells A & B
Rosenblatt's perceptron (1962)    Sensory           Response         Associative
Units trilogy                     Sensory neurons   Motor neurons    Interneurons (other neurons)
These networks model psychological effects such as:

Relief: a motor act becoming rewarding when it turns off
an unpleasant stimulus

Frustration: the withholding of an expected reward being
unpleasant

Partial reinforcement acquisition effect: the reward value of
food being enhanced if the food is unexpected

McCulloch and Pitts networks also confront the
question of how memory is stored

The mechanism for such memory storage is a
reverberatory circuit

This concept remains central to the understanding of
memory

Example A
An M-P network models a neuron that fires when an input (light) is given three
times in a row

[Figure: Network A (neurons 1, A, B and 2) and Network B (neurons 1 and 2), with excitatory (+) connections]

Network A: each neuron has threshold three
Network B: neuron 2 (threshold two) is made to fire if the light has
been on at any time in the past. The mechanism is a reverberatory circuit.

This network lacks precise timing
but is still useful in exhibiting the pathways.
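
The reverberatory circuit of Network B can be sketched as follows. The wiring is an assumption, since the figure's exact connections are not given in the text: neuron 1 (light) excites neuron 2, neuron 2 excites itself, and each connection is taken to carry two excitatory pathways so that it can reach the threshold of two on its own.

```python
def ever_on(light, threshold=2):
    """Activity of neuron 2 over time for a 0/1 light sequence.

    Assumed wiring (hypothetical): neuron 1 -> neuron 2 and a self-loop
    neuron 2 -> neuron 2, each counted as two excitatory pathways.
    """
    previous_light = 0
    neuron2 = 0
    history = []
    for current_light in light:
        # Excitation arriving from activities at the previous time step.
        excitation = 2 * previous_light + 2 * neuron2
        neuron2 = 1 if excitation >= threshold else 0
        history.append(neuron2)
        previous_light = current_light
    return history


print(ever_on([0, 1, 0, 0, 0]))
```

Once the light has been on, neuron 2 keeps re-exciting itself through the feedback loop, so the circuit stores the fact that the light occurred without recording when.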


Psychologists were considering mechanistic frameworks
for studying and learning.
How do we study and learn?

which then led to

Can Short-Term Memory (STM) be
distinguished from Long-Term Memory (LTM)?
Classical conditioning
Ivan Petrovich Pavlov (1849-1936)
studied the mechanisms underlying the digestive
system in mammals
awarded the Nobel Prize in Physiology or Medicine in 1904
noticed that dogs were salivating without the presence of
food
Hull (1943)
Using the example of Pavlov's dog

The experiment is about bell-food association

When the bell stops, the dog starts to concentrate on
other things because the conscious memory is gone

But the memory still remains, because the next time the
bell rings the dog salivates quickly

From the Pavlov's dog experiment, Hull distinguished two sets
of traces:

Stimulus traces
Refer to the amounts of activity of particular nodes
or functional units in an ANN
Subject to rapid decay

Associative strengths (habit strengths)
Refer to connections between nodes
Persist over a longer period of time
Connection strengths change with experience
and correspond to variables related to
the synapse or junction between neurons
Hebb (1949)
The memory mechanism of reverberatory feedback loops, suggested by
McCulloch and Pitts (1943), could be useful for STM

For LTM, this mechanism would be too unstable, since LTM
depends on structural change

But, when repeated, this mechanism may provide the structural
change for LTM
"When an axon of cell A is near enough to excite cell B and repeatedly or
persistently takes part in firing it, some growth processes or metabolic
changes take place in one or both cells such that A's efficiency as one of
the cells firing B, is increased".

If one cell repeatedly fires another, the knobs of the synapse
between the cells could grow so as to increase the area of
contact.

[Figure: synapse before repeated firing and after repeated firing]
Hebbian Learning Rule
In other words, in a neural net, the connections between
neurons get larger weights if they are repeatedly used during
training.

Neurons that fire together grow together
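
A minimal sketch of a Hebbian weight update is given below. The update form (weight change proportional to the product of the two activities) is the usual textbook statement of the rule and is assumed here; the function name and the learning-rate value are hypothetical.

```python
def hebbian_update(weight, pre, post, learning_rate=0.5):
    """Increase the weight when the sending (pre) and receiving (post)
    neurons are active together; leave it unchanged otherwise."""
    return weight + learning_rate * pre * post


weight = 0.0
for _ in range(5):                 # five joint firings of the two neurons
    weight = hebbian_update(weight, pre=1, post=1)
print(weight)                      # 2.5

print(hebbian_update(weight, pre=1, post=0))   # no joint firing: stays 2.5
```

Note that this plain form only ever increases the weight; practical variants usually add decay or normalization to keep weights bounded.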

Rosenblatt's Perceptron (1962)
Another influential ANN model, which
anticipated many themes of adaptive methods, such as the
multilayer perceptrons of Rumelhart & McClelland (1986).

Main function: to make and learn choices between different
patterns of sensory stimuli

Perceptrons = networks of sensory (S), associative (A) and
response (R) units, with various structures of active
connections between units.
Connection Structure Types
[Figure: four S-A-R connection diagrams, one per structure type]

Three-layer series-coupled: one-way connections from S to A to R
Multilayer series-coupled: connections from S to one level of A, to another level of A, to R
Cross-coupled: similar to the first structure, with the addition of cross links between A units
Back-coupled: similar to series-coupled, with the addition of feedback links from R to A units
Experiments with perceptrons
One of Rosenblatt's major experiments:

S-units are arranged in a rectangular grid. Connections from S- to A-units
are random, whereas all A-units connect to the single R-unit.

The perceptron (series-coupled) was taught to discriminate vertical from
horizontal bars. If ALL possible vertical and horizontal bars are presented
to the elementary series-coupled perceptron, positively reinforced for
vertical bars and negatively for horizontal bars, the network gives the
desired response.

If only SOME of the bars are reinforced, the series-coupled perceptron is
unable to generalize its behaviour to other bars that have not been
reinforced.

Experiments with perceptrons (cont.)
[Figure: schematic of a simplified form of one of Rosenblatt's experiments]
S consists of a 20x20 grid. Each A-unit receives 5 excitatory and 1 inhibitory
inputs from random S-units.
Experiments with perceptrons (cont. 2)
This is a weakness of series-coupled perceptrons: the inability
to separate out parts (features) of a complex pattern. Perceptrons need
an excessively large number of nodes to
perform categorization. These systems also rely on a
reinforcement signal external to the perceptron

Generalization can be improved by adding connections to
the perceptron, interposing extra layers of associative units,
or cross-coupling existing associative units.

Added connections remove the perceptron's dependence on
external reinforcement

Perceptrons
First neural network with the ability to learn
Made up of only input neurons and output neurons
Input neurons typically have two states: ON and OFF
Output neurons use a simple threshold activation function
In basic form, can only solve linear problems
Limited applications
How Do Perceptrons Learn?
Uses supervised training
If the output is not correct, the weights are adjusted
according to the formula:
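
A common textbook form of the perceptron learning rule, assumed here rather than taken from the slide, is

$w_{new} = w_{old} + \alpha \, (desired - output) \, input$

where $\alpha$ is the learning rate. The sketch below applies this rule to learn the Boolean AND function. Treating the bias as an extra weight whose input is always 1, the default learning rate of 1.0 and all function names are assumptions made for illustration.

```python
def predict(weights, bias, inputs):
    """Threshold activation: output 1 if the weighted sum is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, learning_rate=1.0, epochs=20):
    """Supervised training: adjust the weights only when the output is wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, desired in samples:
            error = desired - predict(weights, bias, inputs)
            # w_new = w_old + alpha * (desired - output) * input
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error   # bias input is fixed at 1
    return weights, bias


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
print([predict(weights, bias, x) for x, _ in AND_SAMPLES])   # [0, 0, 0, 1]
```

AND is linearly separable, so the rule settles on weights that classify all four patterns correctly; a problem such as XOR would not converge with a single-layer perceptron.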
Network architecture
Feed forward network
60 inputs (one for each frequency bin)
6 hidden
2 outputs (0-1 for Elephant, 1-0 for Grey mouse)

Presenting the data
[Figure: Elephant and Grey mouse input patterns]

Presenting the data (untrained network)
Elephant:   outputs 0.43, 0.26
Grey mouse: outputs 0.73, 0.55

Calculate error
Elephant:   |0.43 - 0| = 0.43, |0.26 - 1| = 0.74
Grey mouse: |0.73 - 1| = 0.27, |0.55 - 0| = 0.55

Backprop error and adjust weights
Dora: |0.43 - 0| = 0.43, |0.26 - 1| = 0.74, summed error 1.17
Yam:  |0.73 - 1| = 0.27, |0.55 - 0| = 0.55, summed error 0.82

Presenting the data (trained network)
Elephant:   outputs 0.01, 0.99
Grey mouse: outputs 0.99, 0.01
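
The error figures in this example can be checked with a few lines of Python. The sketch assumes that each error is the absolute difference between an output and its target and that the per-pattern figure is the sum over the two outputs; the function name is hypothetical.

```python
def summed_abs_error(outputs, targets):
    """Sum of absolute differences between network outputs and targets."""
    return sum(abs(o - t) for o, t in zip(outputs, targets))


# Targets: 0-1 for Elephant, 1-0 for Grey mouse.
untrained = {
    "Elephant": ([0.43, 0.26], [0, 1]),
    "Grey mouse": ([0.73, 0.55], [1, 0]),
}
trained = {
    "Elephant": ([0.01, 0.99], [0, 1]),
    "Grey mouse": ([0.99, 0.01], [1, 0]),
}

for label, patterns in (("untrained", untrained), ("trained", trained)):
    for name, (outputs, targets) in patterns.items():
        print(label, name, round(summed_abs_error(outputs, targets), 2))
```

The untrained network gives summed errors of 1.17 and 0.82, and after training both drop to 0.02.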
Multilayer Feedforward Networks

Most common neural network
An extension of the perceptron
Multiple layers
The addition of one or more hidden layers in
between the input and output layers
Activation function is not simply a threshold
Usually a sigmoid function
A general function approximator
Not limited to linear problems
Information flows in one direction
The outputs of one layer act as inputs to the next layer
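
A forward pass through such a network can be sketched as follows, using the 60-6-2 layer sizes from the example architecture. The random weight initialisation, the zero biases, the random input vector and all function names are assumptions; the network is untrained, so its outputs carry no meaning beyond showing the flow of information.

```python
import math
import random


def sigmoid(x):
    """Sigmoid activation, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_inputs, n_units, rng):
    """Random small weights (one row per unit) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_units)]
    biases = [0.0] * n_units
    return weights, biases


def layer_forward(weights, biases, inputs):
    """Each unit applies the sigmoid to its weighted sum plus bias."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(layers, inputs):
    """The outputs of one layer act as the inputs to the next layer."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(weights, biases, activations)
    return activations


rng = random.Random(0)
layers = [init_layer(60, 6, rng), init_layer(6, 2, rng)]   # 60-6-2 network
x = [rng.random() for _ in range(60)]                      # one input pattern
y = forward(layers, x)
print(y)
```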
Grossberg (1976)
Adaptive Resonance Theory (ART)

An unsupervised ANN in the sense that it establishes
clusters without external interference

Designed to overcome the stability-plasticity dilemma :
how to achieve stability without rigidity and plasticity
without chaos?


Biological Realism
How closely should we mimic the structure of biological
systems?

In knowledge processing, emotions
are helpful in making decisions.
(Damasio, 1994)

Do we want to emulate human emotional conflict in
intelligent machines?

Many irrational common decision-making tendencies may be a
by-product of a system designed for effective
(if not optimal) real-time processing of a complex informational environment
(Grossberg & Gutowski, 1987)

Biological Realism (cont.)
Thus, proceed with designing at the degree of
biological realism that the problem to be
solved requires.

And, to appreciate the capabilities of the
human brain, it is important to
understand the neuroscience and the
experimental psychology.
ANN Applications
ANNs are best at identifying patterns or trends in data, so
they are well suited for prediction or forecasting needs

Specific paradigms:

- recognition of speakers in communications
- diagnosis of hepatitis
- recovery of telecommunications from faulty
software
- interpretation of multimeaning Chinese words
- undersea mine detection
- texture analysis
- three-dimensional object recognition
- hand-written word recognition
- facial recognition
And so that concludes the Introduction and
Historical Outline of Artificial Neural
Networks.

Thank you!
Useful Links
Introduction to Neural and Cognitive Modeling, Daniel S. Levine,
2nd Edition, 2000, ISBN: 0-80582005-1
http://cialab.ee.washington.edu/index_files/Page598.html
http://www.tek271.com/articles/neuralNet/IntoToNeuralNets.html
http://www.comp.nus.edu.sg/~pris/ArtificialNeuralNetworks/
http://ece.colorado.edu/~ecen4831/demuth.html