http://www.ijsat.com
Salvador Ayala-Raggi
Facultad de Ciencias de la Electrónica
Benemérita Universidad Autónoma de Puebla
Puebla, Mexico
Josefina Castañeda
Facultad de Ciencias de la Electrónica
Benemérita Universidad Autónoma de Puebla
Puebla, Mexico
Francisco Portillo
Facultad de Ciencias de la Electrónica
Benemérita Universidad Autónoma de Puebla
Puebla, Mexico
Gerardo Mino-Aguilar
Facultad de Ciencias de la Electrónica
Benemérita Universidad Autónoma de Puebla
Puebla, Mexico
gmino44@ieee.org
Rodrigo Maya
Facultad de Ciencias de la Electrónica
Benemérita Universidad Autónoma de Puebla
Puebla, Mexico
Ricardo Álvarez
Facultad de Ciencias de la Electrónica
Benemérita Universidad Autónoma de Puebla
Puebla, Mexico
Tecilli Tapia
Facultad de Ciencias de la Electrónica
Benemérita Universidad Autónoma de Puebla
Puebla, Mexico
I. INTRODUCTION
II. METHOD
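[Figure: block diagram of the recognition system. Block labels: signal preprocessing, feature extraction, identification, pattern association, recognition results, training, knowledge base (library). The diagram distinguishes an on-line process and an off-line process.]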
[Figure: diagram with blocks labeled pre-processing and feature extraction (average magnitude, spectrogram, linear prediction coefficients).]
III. SIGNAL PRE-PROCESSING
The pre-processing stage adapts the speech signals into a
representation that lets the subsequent stages perform more
efficiently. It consists of a pre-emphasis filter, a voice
activity detection stage, a short-time analysis of the voice
signal (applied only when the features being extracted
require it) and, finally, a normalization and temporal
alignment process (see Figure 4).
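As an illustration of two of these stages, the following is a minimal sketch of a pre-emphasis filter followed by normalization and temporal alignment. The paper does not give the filter coefficient or the alignment procedure, so the value `alpha=0.97`, the peak normalization, the linear-interpolation resampling and all function names below are assumptions made for the example, not the authors' implementation.

```python
import numpy as np

def pre_emphasis(x, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a commonly used value, assumed here; the paper
    does not state which coefficient was used.
    """
    x = np.asarray(x, dtype=float)
    if x.size == 0:
        return x
    # The first sample has no predecessor, so it is passed through.
    return np.concatenate(([x[0]], x[1:] - alpha * x[:-1]))

def normalize_peak(x):
    """Scale the signal so that its largest absolute value is 1 (assumed normalization)."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x)) if x.size else 0.0
    return x / peak if peak > 0 else x

def align_length(x, n_out):
    """Resample x to n_out samples by linear interpolation.

    This is one simple way to give every utterance the same length;
    it is an assumption, not the alignment method of the paper.
    """
    x = np.asarray(x, dtype=float)
    src = np.linspace(0.0, 1.0, num=x.size)
    dst = np.linspace(0.0, 1.0, num=n_out)
    return np.interp(dst, src, x)
```

Giving every utterance the same number of samples matters for the later stages, because an Eigenfaces-style projection requires all feature vectors to have identical dimensions.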
[Figure: diagram with blocks labeled speech signal, image generation, dimensionality reduction (Eigenfaces), and evaluation (classification).]
Figure 4. Voice signals before (top) and after (bottom) the detection of
voiced segments. The first 20 ms of the signal are used to estimate the
background noise level, and a threshold of 15% of the signal's range
above that noise level is used to discriminate voiced segments.
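The thresholding rule described in the caption can be sketched as follows. The caption gives only the 20 ms noise window and the 15% threshold; how the noise level and the signal range are measured is not stated, so this sketch assumes both are taken on the absolute amplitude (noise level as the largest absolute sample in the first 20 ms, range as the peak minus that level). The function names are hypothetical.

```python
import numpy as np

def detect_voiced(x, fs, noise_ms=20.0, rel_threshold=0.15):
    """Return a boolean mask marking samples above the voicing threshold.

    x: 1-D signal, fs: sampling rate in Hz.
    Assumption: levels are measured on the absolute amplitude.
    """
    env = np.abs(np.asarray(x, dtype=float))
    # Number of samples in the initial noise-estimation window.
    n_noise = max(1, int(round(fs * noise_ms / 1000.0)))
    noise_level = env[:n_noise].max()
    # Range of the signal above the background noise.
    signal_range = env.max() - noise_level
    threshold = noise_level + rel_threshold * signal_range
    return env > threshold

def trim_to_voiced(x, fs, noise_ms=20.0, rel_threshold=0.15):
    """Keep the span from the first to the last sample above the threshold."""
    x = np.asarray(x, dtype=float)
    idx = np.flatnonzero(detect_voiced(x, fs, noise_ms, rel_threshold))
    if idx.size == 0:
        return x[:0]  # nothing exceeded the threshold
    return x[idx[0]:idx[-1] + 1]
```

Trimming to the first and last sample above the threshold, rather than dropping every sub-threshold sample, keeps zero crossings and brief pauses inside the word intact.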
IV. FEATURE EXTRACTION
V. DIMENSIONALITY REDUCTION
Figure 12. Results using the average magnitude as feature vectors, for a
vocabulary of seven words and fifty test samples.
Figure 11. Knowledge library containing the weights from each of the
training examples.
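To make the idea of a knowledge library of weights concrete, the following is a minimal sketch of an Eigenfaces-style projection. It assumes each training example has already been turned into a fixed-size feature image and flattened into a row vector. The use of an SVD to obtain the basis, the nearest-neighbor lookup and all function names are assumptions made for illustration; the paper's own formulation and classifier are not recoverable from the text.

```python
import numpy as np

def fit_eigenspace(train, n_components):
    """Build a low-dimensional basis and the library of training weights.

    train: array of shape (n_examples, n_features), one flattened
    feature image per row.
    """
    train = np.asarray(train, dtype=float)
    mean = train.mean(axis=0)
    centered = train - mean
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    # One weight vector per training example: the knowledge library.
    weights = centered @ basis.T
    return mean, basis, weights

def project(sample, mean, basis):
    """Weights of a new sample in the reduced space."""
    return (np.asarray(sample, dtype=float) - mean) @ basis.T

def nearest_example(sample, mean, basis, weights):
    """Index of the training example closest in weight space.

    Nearest-neighbor matching is an assumed classifier, chosen
    because it is the usual companion of Eigenfaces.
    """
    w = project(sample, mean, basis)
    return int(np.argmin(np.linalg.norm(weights - w, axis=1)))
```

In use, a label array aligned with the rows of `train` would map the index returned by `nearest_example` to a word in the vocabulary.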
VI. AUTOMATIC RECOGNITION
Figure 13. Results using the spectrograms as feature vectors, for a
vocabulary of seven words and fifty test samples.
Figure 14. Results using the linear prediction coefficients as feature
vectors, for a vocabulary of seven words and fifty test samples.
Figure 15. Results using the short-time analysis of the fundamental signal
as feature vectors, for a vocabulary of seven words and fifty test
samples.
VIII. CONCLUSIONS
The main contribution of this work is an original method
for automatic speech recognition that exploits the
hypothesis that the voice possesses an inherent
low-dimensional structure, drawing upon facial recognition
paradigms. The following is a summary of the advantages of
the proposed technique: