The types and amounts of mRNA produced by a cell determine how it
responds to changing needs.
[Figure: a cell moves from Condition A to Condition B through natural causes, induced (chemical or physical) causes, or other causes. The mRNAs, proteins, organelles and other components differ between the two conditions.]
[Figure: follow-up work on a gene of interest: characterize the gene, express the protein, run overexpression or gene knockout studies in vitro or in vivo, study the pathway and its other members, and deduce function.]
Gene Expression Analysis – The concepts
The transcriptome of a cell is the set of genes expressed at any given time (the
profile of mRNAs).
The classical approach involves studying one gene, or a few genes, at a time.
This information is biased because it relies on pre-defined knowledge.
• Microarrays
• Serial Analysis of Gene Expression (SAGE)
• Massively Parallel Signature Sequencing (MPSS)
Microarrays – the methodology for quantitative information
on expressed genes
The absolute levels and types of unique mRNA species in a cell
vary in response to cell-signaling cues or perturbations in the
extracellular environment.
DNA microarray technology – How do we profile the transcriptome?
With the aid of a computer, the amount of mRNA bound to the spots on the
microarray is precisely measured, generating a profile of gene expression in
the cell.
The Basics – Nucleic Acid Hybridization
Image Analysis
Normalization
t-Test, ANOVA
Principal Component Analysis
Clustering
Ontologies, Pathway Analysis, Function Prediction and
other knowledge-based analyses
Image Analysis
Analysis of the scanned array image seeks to extract an intensity for each spot
(feature) on the array.
• Gridding identifies each spot by placing defined grids so that every spot falls within a grid
• Segmentation analyzes the spot intensities with various algorithms to
identify the best shape of the spot and which pixels belong to it
• Intensity extraction measures the mean or median intensity of all pixels
within the spot, and likewise for the background
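The intensity extraction step can be sketched in a few lines. This is a minimal illustration, not the method of any particular image analysis package: the function name `extract_intensity`, the toy 4x4 grid cell and the 0/1 spot mask (assumed to come from segmentation) are all hypothetical, and subtracting the background from the spot value is an assumed, commonly used correction.

```python
from statistics import mean, median

def extract_intensity(image, spot_mask, use_median=True):
    """Summarize spot and background pixels of one grid cell.

    image and spot_mask are equally sized 2D lists; a mask value of 1
    marks a pixel that segmentation assigned to the spot.
    """
    summarize = median if use_median else mean
    spot_pixels, background_pixels = [], []
    for pixel_row, mask_row in zip(image, spot_mask):
        for value, in_spot in zip(pixel_row, mask_row):
            (spot_pixels if in_spot else background_pixels).append(value)
    spot = summarize(spot_pixels)
    background = summarize(background_pixels)
    # Assumption: background-corrected signal = spot - background
    return spot, background, spot - background

# Hypothetical grid cell: a bright 2x2 spot on a dim background
image = [
    [10, 12, 11, 10],
    [11, 200, 210, 12],
    [10, 190, 220, 11],
    [12, 10, 11, 10],
]
spot_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
spot, background, corrected = extract_intensity(image, spot_mask)
print(spot, background, corrected)
```

The median is less sensitive than the mean to a few saturated or dusty pixels, which is one reason both options appear on the slide.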
Fold change (sample/control) < 1: downregulated
Fold change (sample/control) > 1: upregulated
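The up/down rule can be written as a small helper. This is a sketch under the assumption that the ratio is sample intensity divided by control intensity; the function names are hypothetical. The log2 form is shown because it makes up- and downregulation symmetric around zero.

```python
import math

def fold_change(sample, control):
    """Ratio of sample to control intensity (assumed definition)."""
    return sample / control

def log2_ratio(sample, control):
    """log2 of the fold change: +1 is a doubling, -1 is a halving."""
    return math.log2(fold_change(sample, control))

def classify(ratio):
    """Apply the slide's rule: >1 upregulated, <1 downregulated."""
    if ratio > 1:
        return "upregulated"
    if ratio < 1:
        return "downregulated"
    return "unchanged"

print(classify(fold_change(200, 100)), log2_ratio(200, 100))
print(classify(fold_change(50, 100)), log2_ratio(50, 100))
```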
Significance
A fold difference in expression may simply reflect experimental error. Control
and sample are therefore replicated, and the data are tested statistically to
see whether the differences are actually significant.
Tests used: t-test or ANOVA
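As an illustration of testing replicates, the sketch below computes a two-sample t statistic for one gene. It uses Welch's unequal-variance form, which is an assumption (the slides do not say which t-test variant is meant), and the replicate values are made up. Turning the statistic into a p-value needs the t distribution, which the Python standard library does not provide, so that step is left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(control, sample):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(control), len(sample)
    v1, v2 = variance(control), variance(sample)  # sample variances
    se2 = v1 / n1 + v2 / n2                       # squared standard error
    t = (mean(sample) - mean(control)) / sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical normalized (log-scale) expression of one gene, 3 replicates each
control = [10.1, 9.8, 10.3]
sample = [12.0, 12.4, 11.9]
t, df = welch_t(control, sample)
print(f"t = {t:.2f}, df = {df:.2f}")
```

A large |t| relative to the replicate noise suggests the fold difference is unlikely to be experimental error; with thousands of genes tested at once, a multiple-testing correction is normally applied as well.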
Sample data – Normalized Expression values of genes
Differences in gene expression patterns when hESCs are treated with
BMP4, a growth factor
The analyses to be performed on microarray data depend on
the needs of the user: what problem is he or she addressing?
[Figure: analysis workflow]
• Preprocessing: log transformation, mean centering, median centering, normalizations
• Dimension reduction
• Analysis: selection of similarity measure, selection of clustering rules
• Confirmation: knowledge based (e.g. pathways), experiments
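The preprocessing steps named in the workflow (log transformation, mean or median centering) can be sketched as follows. The function names are hypothetical, and centering each gene (row) rather than each experiment (column) is an assumption. The two example rows reuse the values listed for genes A and C in the expression table elsewhere in these slides.

```python
import math
from statistics import mean, median

def log2_transform(matrix):
    """Replace every expression value by its base-2 logarithm."""
    return [[math.log2(value) for value in row] for row in matrix]

def center_rows(matrix, center=median):
    """Subtract each row's mean or median so every gene is centered on 0."""
    centered = []
    for row in matrix:
        offset = center(row)
        centered.append([value - offset for value in row])
    return centered

raw = [
    [20, 10, 40],    # gene A
    [56, 28, 112],   # gene C
]
logged = log2_transform(raw)
centered = center_rows(logged, center=mean)
for row in centered:
    print([round(value, 3) for value in row])
```

After log transformation and centering, the two genes have the same profile even though their raw magnitudes differ, which is what makes them comparable in later clustering.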
Dimension reduction
Microarray data is multidimensional:
each gene is compared across multiple libraries/experiments,
and each library/experiment is compared across multiple genes.
The data is essentially a matrix with "m" columns and "n" rows, both of which can be
enormous. Any mathematical operation on such huge matrices is computationally
intensive and can challenge even supercomputers.
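To make the idea of dimension reduction concrete, here is a minimal sketch of extracting the first principal component of a small data matrix. Power iteration on the covariance matrix is used only because it fits in a few lines of plain Python; it is a swapped-in technique for illustration, not what analysis packages do internally (they typically use SVD routines). The function name and the toy data are hypothetical.

```python
from math import sqrt

def first_principal_component(rows, iterations=200):
    """rows: observations x variables. Returns (unit direction, scores)."""
    n, m = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in rows]
    # m x m sample covariance matrix of the centered data
    cov = [[sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1)
            for b in range(m)] for a in range(m)]
    v = [1.0] * m  # starting guess; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # One score per observation replaces its m original values
    scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]
    return v, scores

# Toy data: four observations of two perfectly related variables
data = [[1, 2], [2, 4], [3, 6], [4, 8]]
direction, scores = first_principal_component(data)
print([round(c, 4) for c in direction])
print([round(s, 4) for s in scores])
```

Here two columns collapse into a single score per observation without losing any variation; on real expression matrices the first few components usually capture most, not all, of it.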
http://genome.tugraz.at
Microarray libraries can be compared one to one to identify genes that are
overexpressed (upregulated) in one library versus the other.
Genes upregulated in a library can be selected from interactive scatter plots
or using 3D scatter plots.
[Figure: scatter plots with axes Ovary – Normal Epithelium, Ovary – Tumor and Ovary – Tumor3]
Gene   Expression values across three experiments
A      20    10    40
B      23    24    25
C      56    28    112
D      2     1     4
[Figure: line graph of the expression values of Gene A to Gene E; y-axis from 20 to 180]
Genes A, C and D have similar profiles, and this similarity can be defined mathematically
as the "correlation coefficient".
The correlation between any two genes can be used to define the distance between
them. The Pearson correlation coefficient ranges from -1 to 1, and a higher value means
higher similarity; a correlation-based distance such as 1 - r is correspondingly smaller
for more similar genes.
Common distance measures used in gene expression analyses:
standard correlation, Pearson correlation, Manhattan distance, cosine distance, Poisson
distance
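The claim about genes A, C and D can be checked directly on the four rows of the table. The sketch below implements Pearson correlation and Manhattan distance from their textbook definitions; the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def manhattan(x, y):
    """Sum of absolute differences between two profiles."""
    return sum(abs(a - b) for a, b in zip(x, y))

genes = {
    "A": [20, 10, 40],
    "B": [23, 24, 25],
    "C": [56, 28, 112],
    "D": [2, 1, 4],
}
for other in ("B", "C", "D"):
    r = pearson(genes["A"], genes[other])
    d = manhattan(genes["A"], genes[other])
    print(f"A vs {other}: r = {r:.3f}, Manhattan = {d}")
```

A, C and D correlate almost perfectly because each is a scaled copy of the others, while their Manhattan distances are large. Correlation compares the shape of a profile and Manhattan distance compares its magnitude, so the choice of measure changes which genes end up grouped together.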
Distance measures are used to find the similarity between genes and between
experiments.
Genes close to one another are grouped together and displayed as cluster
diagrams, which can be colored blocks, line graphs or scatter plots.
Hierarchical clustering analysis (HCA) produces trees of relationships; in our case, gene trees and sample trees.
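How a gene tree is built can be sketched with a small agglomerative procedure: start with every gene in its own cluster and repeatedly merge the two closest clusters. The sketch assumes a distance of 1 minus the Pearson correlation and average linkage; both are illustrative choices, and the function names are hypothetical. The four profiles are the rows of the expression table in these slides.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

def hierarchical_clustering(profiles):
    """Return the merge history as (cluster_1, cluster_2, distance) tuples."""
    def linkage(pair):
        # Average linkage: mean pairwise distance (1 - r) between members
        c1, c2 = pair
        total = sum(1 - pearson(profiles[a], profiles[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=linkage)
        merges.append((c1, c2, linkage((c1, c2))))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

genes = {
    "A": [20, 10, 40],
    "B": [23, 24, 25],
    "C": [56, 28, 112],
    "D": [2, 1, 4],
}
for c1, c2, distance in hierarchical_clustering(genes):
    print(c1, "+", c2, f"at distance {distance:.3f}")
```

The merge history is the tree: genes A, C and D join at a distance of about zero, and gene B joins last at a larger distance. Running the same procedure on the transposed matrix would give a sample tree.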
A Self-Organizing Map (SOM) is a neural-network array whose cells (or nodes)
become specifically tuned to recognize signal patterns or classes of patterns in an
orderly fashion.
The learning process is competitive and unsupervised, and the locations of the
responses in the array tend to become ordered during learning, thus
producing meaningful groupings of data based on their similarities.
The data format for feeding into the software is similar to that for HCA, but the user
needs to specify the number of desired groupings, the number of iterative analyses to be
performed on each gene before the result is output, and the neighborhood
radius.
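The three user-supplied settings map onto a very small training loop. The sketch below is a one-dimensional SOM written for illustration only: the function names, the linear decay of learning rate and radius, the initialization from random data points and the toy profiles are all assumptions, and it does not reproduce the behavior of any particular software package.

```python
import random

def best_node(nodes, profile):
    """Index of the node whose weights are closest to the profile."""
    return min(range(len(nodes)),
               key=lambda i: sum((w - v) ** 2
                                 for w, v in zip(nodes[i], profile)))

def train_som(data, n_nodes, iterations, radius, learning_rate=0.5, seed=0):
    """n_nodes = desired groupings; radius = neighborhood radius."""
    rng = random.Random(seed)
    nodes = [list(rng.choice(data)) for _ in range(n_nodes)]
    for step in range(iterations):
        remaining = 1 - step / iterations       # shrinks from 1 toward 0
        rate = learning_rate * remaining
        reach = radius * remaining
        profile = rng.choice(data)
        winner = best_node(nodes, profile)      # competitive step
        for i, node in enumerate(nodes):
            if abs(i - winner) <= reach:        # winner and its neighbors
                for d, value in enumerate(profile):
                    node[d] += rate * (value - node[d])
    return nodes

# Hypothetical centered profiles: two rising genes, two falling genes
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [3.0, 2.0, 1.0],
    [3.1, 2.1, 0.9],
]
nodes = train_som(data, n_nodes=2, iterations=200, radius=1)
labels = [best_node(nodes, profile) for profile in data]
print(labels)
```

Each gene is finally assigned to its closest node, so the number of nodes fixes the number of groupings in advance, unlike HCA where the tree can be cut at any level afterwards.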
SOM analysis of Ovarian Cancer data
Genes in the SOM clusters specific for the particular case of study can be
retrieved and studied further experimentally
DEMONSTRATION OF ANALYSIS
GENESIS SOFTWARE
[Figure: Comparison of the expression of genes belonging to select categories. Bar chart of % Expression (0 to 18) against Functional Categories for Differentiated and Undifferentiated cells. Categories: Apoptosis, Cell Cycle, Cell Cycle Checkpoint, Cell proliferation, Differentiation, DNA modification, DNA repair, DNA replication, ECM, FGF pathway, Growth Factors/Ligand, Kinases, Receptors, Signal Transduction, Transcription Factors/Co-factors, Wnt pathway, TGF beta signaling]
Demonstration of EASE software
Gene Expression Data Can Be catalogued-
We now have more information from fewer experiments; the only limitations are
in the analysis of the data and its interpretation.