of Microarray Data
Matt Ritchie
mritchie@wehi.edu.au
WEHI Bioinformatics
[Figure: excerpt of a Spot output file (.spot) with columns indexs, grid.r, grid.c, spot.r, spot.c, area, Gmean, Gmedian, GIQR, Rmean, Rmedian, RIQR, bgGmean, bgGmed, bgGSD, bgRmean]
Outline
An example analysis
Plotting capabilities
Options for normalization
Selecting differentially expressed genes
Lunching!
The R environment
Free, open-source implementation of the S language (GNU S), closely related to S-PLUS
A language and environment for statistical computing and graphics
Operates on Linux, Windows and MacOS
Add-in libraries available for specialised statistical routines
Bioconductor bundle
Statistics for Microarray Analysis (SMA) for cDNA microarrays
Installing R
http://cran.r-project.org/
R Binaries link
Downloading Libraries
http://www.bioconductor.org
Released Packages link
Downloading Libraries
http://cran.r-project.org/
Package sources link to SMA
marrayRaw (slot, class)
maRf        matrix
maRb        matrix
maLayout    marrayLayout
maGf        matrix
maGnames    marrayInfo
maNotes     character
maGb        matrix
maW         matrix
maTargets   marrayInfo
An Example
4 microarrays (human)
AML1
1
3
1723.spot
1737.spot
1738.spot
1739.spot
Array Layout: 20 x 20 x 4 x 12
Gene names (19200 spots on each array)
8kHela.gal
GFP
Normalization (intensity-based)
Rank genes
Output results to file
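The layout arithmetic on this slide can be checked in a few lines of base R. This is a minimal sketch rather than anything from the marray packages: the variable names are made up for illustration, and it assumes that the 4 x 12 factors are the grid (print-tip) dimensions and that 20 x 20 is the number of spots within each grid, which the slide does not state explicitly.

```r
# Hypothetical names; assumes 4 x 12 grids, each holding 20 x 20 spots.
spot_rows <- 20
spot_cols <- 20
grid_a    <- 4
grid_b    <- 12

spots_per_grid  <- spot_rows * spot_cols
n_grids         <- grid_a * grid_b
spots_per_array <- spots_per_grid * n_grids

# Print-tip (grid) index for every spot, assuming spots are listed grid by grid
print_tip <- rep(seq_len(n_grids), each = spots_per_grid)

spots_per_array   # 19200, matching the number of gene names per array
```

Under these assumptions each of the 48 print-tip groups holds 400 spots, which is the grouping a print-tip boxplot or a print-tip normalization would work on.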
Getting Started
Load libraries: marrayInput, marrayPlots, marrayNorm and sma
library(sma)
help.start()
Read in Data
widget.marrayRaw()
indexs  grid.r  grid.c  spot.r  spot.c  area  Gmean     Gmedian  GIQR      Rmean     Rmedian  RIQR      bgGmean   bgGmed  bgGSD     bgRmean
0       1       1       1       1       121   782.6529  786      0.2713    460.595   441      0.507772  626.9174  604     0.223515  366.1927
1       1       1       1       2       112   795.1161  787      0.2531    480.4107  430      0.69478   661.4129  599     0.260361  405.6269
2       1       1       1       3       115   640.2261  612      0.228259  415.7043  383      0.637267  594.4429  582     0.197048  354.1096
3       1       1       1       4       144   636.4306  606      0.219474  387.3472  370      0.615804  586.6806  591     0.211949  358.0463
4       1       1       1       5       136   683.8015  665      0.278104  447.875   421      0.590317  607.2785  595     0.185998  366.8059
5       1       1       1       6       122   704.6721  686      0.252427  437.0902  394      0.574541  583.7286  579     0.225869  366.0238
6       1       1       1       7       128   641.3516  617      0.235919  374.4453  344      0.712276  593.4861  570     0.208001  367.9306
7       1       1       1       8       131   669.0992  675      0.228259  390.2748  352      0.65764   575.6381  569     0.204065  351.1619
8       1       1       1       9       101   570.2475  582      0.204065  355.4455  315      0.852071  581.6649  573     0.21606   341.2147
9       1       1       1       10      148   596.3851  579      0.221592  370.6149  357      0.667829  577.9004  571     0.19484   355.7835
10      1       1       1       11      159   581.1635  578      0.199796  360.3962  329      0.576348  568.0617  567     0.232283  339.0044
11      1       1       1       12      140   574.3071  565      0.167933  369.6929  340      0.565665  566.9535  562     0.175134  353.5767
12      1       1       1       13      150   559.56    553      0.18629   327.4133  306      0.568426  562.5439  550     0.200304  354.1974
13      1       1       1       14      135   576.6963  577      0.174049  363.963   352      0.664224  577.2553  577     0.185848  335.1957
14      1       1       1       15      101   615.6931  603      0.202645  365.703   352      0.581229  576.665   572     0.197331  330.7488
15      1       1       1       16      138   596.9565  592      0.21408   356.5725  341      0.636239  570.4036  557     0.186544  336.5067
16      1       1       1       17      128   1076.281  1073     0.310304  646.0469  611      0.608136  621.8974  558     0.225912  361.0154
17      1       1       1       18      127   1059.654  1043     0.238826  651.7402  625      0.590226  617       556     0.224742  342.3889
18      1       1       1       19      134   1284.164  1335     0.19153   774.4776  753      0.472759  605.0754  564     0.226288  351.4824
19      1       1       1       20      121   1238.289  1255     0.220095  787.5041  774      0.479653  632.79    565     0.207882  369.41
20      1       1       2       1       128   879.0469  888      0.340957  485.3828  455      0.637088  639.602   603     0.225747  403.9005
Slots of aml.raw
maRf        19200x4
maRb        19200x4
maLayout    20x20x4x12
maGf        19200x4
maGnames    marrayInfo
maNotes     character
maGb        19200x4
maW         0x0
maTargets   marrayInfo
Spot Intensity
maImage(aml.raw[,1], x="maA", col=heat.colors(20),
        main="1723: A")
maBoxplot Log-ratios
maBoxplot(aml.raw[,1], x="maPrintTip", y="maM",
          main="1723: Non-normalized M by print-tip")
maBoxplot Intensity
maBoxplot(aml.raw[,1], x="maPrintTip", y="maA",
          main="1723: A by print-tip")
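The M and A values in these plots are, by the usual convention, the log-ratio and the average log-intensity of the two channels. The base R sketch below assumes background-subtracted intensities and the standard definitions M = log2(R/G) and A = (log2 R + log2 G)/2. The function name `ma_values` and the input numbers are illustrative only; they are not taken from the marray packages or from the AML data.

```r
# Illustrative helper (not an marray function): M and A from raw intensities.
ma_values <- function(Rf, Rb, Gf, Gb) {
  R  <- Rf - Rb                     # background-corrected red
  G  <- Gf - Gb                     # background-corrected green
  ok <- R > 0 & G > 0               # logs are undefined otherwise
  M  <- rep(NA_real_, length(R))
  A  <- rep(NA_real_, length(R))
  M[ok] <- log2(R[ok] / G[ok])
  A[ok] <- (log2(R[ok]) + log2(G[ok])) / 2
  data.frame(M = M, A = A)
}

# Made-up foreground/background values for four spots
ma <- ma_values(Rf = c(900, 450, 1200, 300), Rb = c(100, 50, 200, 350),
                Gf = c(500, 850, 1200, 400), Gb = c(100, 50, 200, 100))
ma
```

The fourth spot has a red background above its foreground, so its M and A are left as NA rather than taking the log of a negative number. How such spots are treated in practice is a choice this sketch does not settle.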
Normalization
help(maNorm)
Normalization
aml.norm.n <- maNorm(aml.raw, norm="none")
aml.norm.l <- maNorm(aml.raw, norm="loess")
aml.norm.p <- maNorm(aml.raw, norm="printTipLoess")
Slots of aml.norm.n
maA          19200x4
maM          19200x4
maLayout     20x20x4x12
maMloc       0x0
maGnames     marrayInfo
maNotes      character
maMscale     0x0
maW          0x0
maTargets    marrayInfo
maNormCall   call
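To make the idea behind intensity-based normalization concrete, here is a minimal base R sketch on simulated data. It is not the maNorm implementation: the helper name `loess_norm`, the simulated trend, and the loess settings (span, degree, robust family) are all illustrative assumptions. The global version subtracts one fitted M-versus-A curve; the print-tip version fits a separate curve within each print-tip group.

```r
set.seed(1)
n   <- 2000
tip <- rep(1:4, each = n / 4)                 # hypothetical print-tip groups
A   <- runif(n, 6, 14)                        # simulated average log-intensity
M   <- 0.5 - 0.1 * (A - 10)^2 / 4 + 0.2 * tip / 4 + rnorm(n, sd = 0.3)

# Global loess: subtract the fitted intensity-dependent trend from M
loess_norm <- function(M, A, span = 0.4) {
  fit <- loess(M ~ A, span = span, degree = 1, family = "symmetric")
  M - fitted(fit)
}

M_global <- loess_norm(M, A)

# Print-tip loess: fit and subtract a separate curve within each group
M_tip <- numeric(n)
for (g in unique(tip)) {
  idx <- tip == g
  M_tip[idx] <- loess_norm(M[idx], A[idx])
}

# Medians of M before and after normalization
round(c(raw = median(M), global = median(M_global), tip = median(M_tip)), 3)
```

After either normalization the log-ratios should be centred near zero across the intensity range; the print-tip version additionally removes the simulated offset between groups.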
Normalization
maPlot(aml.norm.n[,1], z=NULL, pch=".", main="1723 Non-normalized MA Plot")
maPlot(aml.norm.p[,1], z="maPrintTip", pch=".",
       main="1723 Print-tip normalized MA Plot")
maPlots
maPlot(aml.norm.n[,3], z="maPrintTip", pch=".",
       main="1738 Non-normalized MA Plot")
maPlot(aml.norm.p[,3], z="maPrintTip", pch=".",
       main="1738 Print-tip normalized MA Plot")
help(stat.bayesian)
Output Results
write.table(cbind(maGnames(aml.raw)@maLabels[index1][aml.bayesian$lods>0],
                  aml.bayesian$lods[aml.bayesian$lods>0],
                  aml.bayesian$Xprep$Mbar[aml.bayesian$lods>0]), "Results.txt",
            row.names=FALSE, col.names=c("Gene Names", "B", "Ave M"), sep="\t",
            quote=FALSE)
Gene Name                                                                                         B     Ave M
AA916325;aldo-keto reductase family 1, member C3 (3-alpha hydroxysteroid dehydrogenase, type II)  0.85  -0.63
AI949576;annexin A3                                                                               0.99  -0.59
AI927438;hemoglobin, beta                                                                         1.73  -1.39
AI969657;UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase, polypeptide 2                         0.07  -0.42
AI952285;EST, Highly similar to CA34_HUMAN COLLAGEN ALPHA 3(IV) CHAIN PRECURSOR [H.sapiens]       0.14  -0.61
AA598601;insulin-like growth factor binding protein 3                                             4.72  -0.84
N80129;metallothionein 1L                                                                         1.12   1.04
AA169469;pyruvate dehydrogenase kinase, isoenzyme 4                                               2.31  -0.72
AA676466;argininosuccinate synthetase                                                             0.14  -0.47
AA504348;ESTs, Highly similar to topoisomerase II alpha {C-terminal} [H.sapiens]                  0.81  -0.53
AA127096;enigma (LIM domain protein)                                                              3.35   0.83
AA446103;lectin, mannose-binding, 1                                                               3.07  -0.72
AI381323;creatine kinase, muscle                                                                  1.55  -0.24
AA430504;ubiquitin carrier protein E2-C                                                           0.73  -0.35
H70775;ESTs                                                                                       0.90  -0.40
AA126265;calnexin                                                                                 0.62  -0.66
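The select-and-export step can be tried without the array data. The sketch below is an assumption-laden stand-in, not the talk's code: it uses a plain data frame in place of the stat.bayesian() output, three rows copied from the results table, one made-up row with a negative B to show the filter working, and a temporary file instead of Results.txt.

```r
# Three rows from the results table plus one made-up row with B < 0
res <- data.frame(
  gene = c("AA598601;insulin-like growth factor binding protein 3",
           "AI927438;hemoglobin, beta",
           "N80129;metallothionein 1L",
           "hypothetical;not differentially expressed"),
  B    = c(4.72, 1.73, 1.12, -3.20),
  Mbar = c(-0.84, -1.39, 1.04, 0.05),
  stringsAsFactors = FALSE
)

keep <- res$B > 0                          # keep genes with log-odds above zero
top  <- res[keep, ]
top  <- top[order(top$B, decreasing = TRUE), ]   # rank by B, largest first

out_file <- tempfile(fileext = ".txt")     # temporary file for the sketch
write.table(top, out_file, sep = "\t", quote = FALSE, row.names = FALSE,
            col.names = c("Gene Name", "B", "Ave M"))

readLines(out_file)
```

Sorting by B before writing is an addition for readability; the code on the slide writes the selected genes in their original order.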
Normalization (intensity-based): maNorm()
Rank genes (using B statistic): stat.bayesian()
Output results to file: write.table()
Summary
R and the libraries used here are free!
Class structure for common array objects
Standard functions for plotting and normalization
Heading in a user-friendly direction
GUIs
functions well documented - help() command
Links
R - http://www.r-project.org/
Bioconductor
http://www.bioconductor.org/
SMA
http://www.stat.berkeley.edu/users/terry/zarray/Html/index.html
marray Tutorials
http://oz.berkeley.edu/~terry/zarray/Course/ (Labs 1 & 2)
WEHI Bioinformatics
http://www.wehi.edu.au/bioweb/index.html
References
Dudoit, S. and Yang, Y. H. (2002). Bioconductor R packages for exploratory analysis and normalization of cDNA microarray data. In Parmigiani, G., Garrett, E. S., Irizarry, R. A., and Zeger, S. L. (editors), The Analysis of Gene Expression Data: Methods and Software. Springer, New York. (To appear).
Acknowledgements
WEHI Genetics and Bioinformatics
Gordon Smyth
Natalie Thorne
Terry Speed
Asa Wirapati
Hamish Scott
Joelle Michaud
UC Berkeley
Jean Yang
Sandrine Dudoit
Ben Bolstad