Seq 1

# High-throughput sequencing metadata template (version 2.1).
# All fields in this template must be completed.

# Templates containing example data are found in the METADATA EXAMPLES spreadsheet tabs at the foot of this page.
# Field names (in blue on this page) should not be edited. Hover over cells containing field names to view field content guidelines.
# Human data. If there are patient privacy concerns regarding making data fully public through GEO, please submit to NCBI's dbGaP (http://www.ncbi.nlm.nih.gov/gap/) database. dbGaP has controlled access mechanisms and is an appropriate resource for hosting sensitive patient data.

SERIES
# This section describes the overall experiment.
title
summary
overall design
contributor
contributor
supplementary file
SRA_center_name_code [optional]

SAMPLES
# This section lists and describes each of the biological Samples under investigation, as well as any protocols that are specific to individual Samples.
# Additional "processed data file" or "raw file" columns may be included.
Sample name | title | source name | organism | characteristics: tag | characteristics: tag | characteristics: tag | molecule | description | processed data file | raw file
Sample 1
Sample 2
Sample 3

PROTOCOLS
# Any of the protocols below which are applicable to only a subset of Samples should be included as additional columns of the SAMPLES section instead.
growth protocol
treatment protocol
extract protocol
library construction protocol
library strategy

DATA PROCESSING PIPELINE
# Data processing steps include base-calling, alignment, filtering, peak-calling, generation of normalized abundance measurements etc…
# For each step provide a description, as well as software name, version, parameters, if applicable.
# Include additional steps, as necessary.
data processing step
genome build
processed data files format and content

# For each file listed in the "processed data file" columns of the SAMPLES section, provide additional information below.
PROCESSED DATA FILES
file name | file type | file checksum

# For each file listed in the "raw file" columns of the SAMPLES section, provide additional information below.
RAW FILES
file name | file type | file checksum | instrument model | read length | single or paired-end

# For paired-end experiments, list the 2 associated raw files, and provide average insert size and standard deviation, if known. For SOLiD experiments, list the 4 file names (include "file name 3" and "file name 4" columns).
PAIRED-END EXPERIMENTS
file name 1 | file name 2 | average insert size | standard deviation
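
The "file checksum" column can be filled in programmatically. The sketch below assumes the column expects an MD5 digest, which the 32-character hexadecimal example values suggest but the template text itself does not state. The function name `md5_checksum` and the chunk size are illustrative choices, not part of the template.

```python
import hashlib


def md5_checksum(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the file at `path`.

    Assumption: the template's "file checksum" field is an MD5 digest.
    The file is read in chunks so large sequencing files are not
    loaded into memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A call such as `md5_checksum("H3K4me2.peaks.wig")` (a hypothetical local path) returns a 32-character string that can be pasted into the PROCESSED DATA FILES or RAW FILES table.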

# Templates containing example data are found in the METADATA EXAMPLES spreadsheet tabs at the foot of this page.

SERIES
title Genome-wide maps of chromatin state in pluripotent and lineage-committed cells.
summary We report the application of single-molecule-based sequencing technology for high-throughput profiling of histone modifications in mammalian cells. By obtaining over four billion bases of sequence from chromatin immunoprecipitated DNA, we generated genome-wide chromatin-state maps of mouse embryonic stem cells, neural progenitor cells and embryonic fibroblasts. We find that lysine 4 and lysine 27 trimethylation effectively discriminates genes that are expressed, poised for expression, or stably repressed, and therefore reflect cell state and lineage potential. Lysine 36 trimethylation marks primary coding and non-coding transcripts, facilitating gene annotation. Trimethylation of lysine 9 and lysine 20 is detected at satellite, telomeric and active long-terminal repeats, and can spread into proximal unique sequences. Lysine 4 and lysine 9 trimethylation marks imprinting control regions. Finally, we show that chromatin state can be read in an allele-specific manner by using single nucleotide polymorphisms. This study provides a framework for the application of comprehensive chromatin profiling towards characterization of diverse mammalian cell populations.
overall design Examination of 2 different histone modifications in 2 cell types.
contributor John,B,Goode
contributor Bradley,Smith
supplementary file
SRA_center_name_code [optional]

SAMPLES
# This section lists and describes each of the biological Samples under investigation, as well as any protocols that are specific to individual Samples.
Sample name | title | source name | organism | characteristics: cell type | characteristics: passages | characteristics: strain | characteristics: ChIP antibody | molecule | description | processed data file | processed data file | raw file | raw file | raw file
Sample 1 | H3K4me2_ChIPSeq | Neural progenitor cells | Mus musculus | ES-derived neural progenitor cells | 15-18 | C57BL/6 | H3K4me2 (Millipore, 07-030, lot 12 | genomic DNA |  | H3K4me2.aligned.txt | H3K4me2.peaks.txt | 080716_BI-EAS46_0001_209DH_L1.fastq | 080716_BI-EAS46_0001_209DH_L2 | 080716_BI-EAS46_0001_209DH_L3.fastq
Sample 2 | H3K4me1_ChIPSeq |  |  |  | 15-18 | C57BL/6 | H3K4me1 (Millipore, 08-034, lot 11 | genomic DNA |  | H3K4me1.aligned.txt | H3K4me1.peaks.txt | 080716_BI-EAS46_0001_209DH_L4 | 080716_BI-EAS46_0001_209DH_L5 | 080716_BI-EAS46_0001_209DH_L6.fastq
Sample 3 | input DNA |  |  |  | 15-18 | C57BL/6 | none | genomic DNA |  | H3K4me2.b.aligned.txt | H3K4me2.b.peaks.txt | 080717_BI-EAS46_0001_20DH_L5. | 080717_BI-EAS46_0001_20DH_L6.fastq |

PROTOCOLS
# Any of the protocols below which are applicable to only a subset of Samples should be included as additional columns of the SAMPLES section instead.
growth protocol ES cell–derived NS cells were routinely generated by re-plating day 7 adherent neural differentiation cultures (typically 2–3 × 10^6 cells into a T75 flask) on uncoated plastic in NS-A medium (Euroclone, Milan, Italy) supplemented with modified N2 and 10 ng/ml of both EGF and FGF-2 (NS expansion medium).
treatment protocol
extract protocol Lysates were clarified from sonicated nuclei and histone-DNA complexes were isolated with antibody.
library construction protocol Libraries were prepared according to Illumina's instructions accompanying the DNA Sample Kit (Part# 0801-0303). Briefly, DNA was end-repaired using a combination of T4 DNA polymerase, E. coli DNA Pol I large fragment (Klenow polymerase) and T4 polynucleotide kinase. The blunt, phosphorylated ends were treated with Klenow fragment (3′ to 5′ exo minus) and dATP to yield a protruding 3′ 'A' base for ligation of Illumina's adapters, which have a single 'T' base overhang at the 3′ end. After adapter ligation, DNA was PCR amplified with Illumina primers for 15 cycles and library fragments of ~250 bp (insert plus adaptor and PCR primer sequences) were band isolated from an agarose gel. The purified DNA was captured on an Illumina flow cell for cluster generation. Libraries were sequenced on the Genome Analyzer following the manufacturer's protocols.
library strategy ChIP-Seq

DATA PROCESSING PIPELINE
# Data processing steps include base-calling, alignment, filtering, peak-calling, generation of normalized abundance measurements etc…
# For each step provide a description, as well as software name, version, parameters, if applicable.
data processing step Basecalls performed using CASAVA version 1.4
data processing step ChIP-seq reads were aligned to the mm9 genome assembly using EasyAlign version 3.2 with the following configurations…
data processing step Data were filtered using the following specifications…
data processing step Peaks were called using PeaksFind version 2.2 with the following setting: ChIP threshold (0.2), Enrichment Fold (2.5), Rescue Fold (3).
genome build mm9
processed data files format and content wig files were generated using …; Scores represent …

# For each file listed in the "processed data file" columns of the SAMPLES section, provide additional information below.
PROCESSED DATA FILES
file name | file type | file checksum
H3K4me2.peaks.wig | wig | 95cf1d1fa509d871b2ef0bb9fd734c3d
H3K4me1.peaks.wig | wig | 8ec6ee3cce10b970e5bfea4e35cdb231
H3K4me2.b.peaks.wig | wig | f8fcd650914ff1a733956d6d06e8b543

# For each file listed in the "raw file" columns of the SAMPLES section, provide additional information below.
RAW FILES
file name | file type | file checksum | instrument model | read length | single or paired-end
080716_BI-EAS46_0001_209DH_L1.fastq | fastq | 6cc6ee3cce10b970e5bfea4e35cdb | Illumina Genome Analyzer | 36 | single
 |  | 88ceb0e0d056dda9208a03acf9073 | Illumina Genome Analyzer | 36 | single
 |  | f2786fedc5106789a2af4014a0e74f | Illumina Genome Analyzer | 36 | single
 |  | d8fcd650914ff1a733956d6d06e8b0 | Illumina Genome Analyzer | 36 | single
 |  | 03839cca2e797b28b9f9371f7b9ca | Illumina Genome Analyzer | 36 | single
 |  | 604fbb658413c559511eb6ad2bb14 | Illumina Genome Analyzer | 36 | single
 |  | 57cf1d1fa509d871b2ef0bb9fd734c | Illumina Genome Analyzer IIx | 42 | single
 |  | e5718e1a97690d410464f24f37aae | Illumina Genome Analyzer IIx | 42 | single

# For paired-end experiments, list the 2 associated raw files, and provide average insert size and standard deviation, if known. For SOLiD experiments, list the 4 file names (include "file name 3" and "file name 4" columns).
PAIRED-END EXPERIMENTS
file name 1 | file name 2 | average insert size | standard deviation
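
The template asks that every file listed in a "raw file" or "processed data file" column of SAMPLES also be described in the RAW FILES or PROCESSED DATA FILES table. A minimal sketch of that cross-check follows. The function name `missing_file_entries` and the plain-list inputs are hypothetical; how the spreadsheet is read into those lists is left open.

```python
def missing_file_entries(sample_files, described_files):
    """Return file names used in SAMPLES that have no row in the file table.

    `sample_files`: names collected from the "raw file" (or
    "processed data file") columns of SAMPLES; empty cells are skipped.
    `described_files`: names in the "file name" column of the matching table.
    Names are returned once each, in first-seen order.
    """
    described = set(described_files)
    missing = []
    for name in sample_files:
        if name and name not in described and name not in missing:
            missing.append(name)
    return missing


# Illustration using two raw file names from the ChIP-Seq example.
sample_raw = [
    "080716_BI-EAS46_0001_209DH_L1.fastq",
    "080716_BI-EAS46_0001_209DH_L3.fastq",
]
raw_table = ["080716_BI-EAS46_0001_209DH_L1.fastq"]
print(missing_file_entries(sample_raw, raw_table))
# prints ['080716_BI-EAS46_0001_209DH_L3.fastq']
```

An empty result means every file referenced by a Sample has a descriptive row.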
# Templates containing example data are found in the METADATA EXAMPLES spreadsheet tabs at the foot of this page.

SERIES
title Next Generation Sequencing Facilitates Quantitative Analysis of Wild Type and Nrl-/- Retinal Transcriptomes
summary Purpose: Next-generation sequencing (NGS) has revolutionized systems-based analysis of cellular pathways. The goals of this study are to compare NGS-derived retinal transcriptome profiling (RNA-seq) to microarray and quantitative reverse transcription polymerase chain reaction (qRT–PCR) methods and to evaluate protocols for optimal high-throughput data analysis.
summary Methods: Retinal mRNA profiles of 21-day-old wild-type (WT) and neural retina leucine zipper knockout (Nrl−/−) mice were generated by deep sequencing, in triplicate, using Illumina GAIIx. The sequence reads that passed quality filters were analyzed at the transcript isoform level with two methods: Burrows–Wheeler Aligner (BWA) followed by ANOVA (ANOVA) and TopHat followed by Cufflinks. qRT–PCR validation was performed using TaqMan and SYBR Green assays.
summary Results: Using an optimized data analysis workflow, we mapped about 30 million sequence reads per sample to the mouse genome (build mm9) and identified 16,014 transcripts in the retinas of WT and Nrl−/− mice with BWA workflow and 34,115 transcripts with TopHat workflow. RNA-seq data confirmed stable expression of 25 known housekeeping genes, and 12 of these were validated with qRT–PCR. RNA-seq data had a linear relationship with qRT–PCR for more than four orders of magnitude and a goodness of fit (R²) of 0.8798. Approximately 10% of the transcripts showed differential expression between the WT and Nrl−/− retina, with a fold change ≥1.5 and p value <0.05. Altered expression of 25 genes was confirmed with qRT–PCR, demonstrating the high degree of sensitivity of the RNA-seq method. Hierarchical clustering of differentially expressed genes uncovered several as yet uncharacterized genes that may contribute to retinal function. Data analysis with BWA and TopHat workflows revealed a significant overlap yet provided complementary insights in transcriptome profiling.
summary Conclusions: Our study represents the first detailed analysis of retinal transcriptomes, with biologic replicates, generated by RNA-seq technology. The optimized data analysis workflows reported here should provide a framework for comparative investigations of expression profiles. Our results show that NGS offers a comprehensive and more accurate quantitative and qualitative evaluation of mRNA content within a cell or tissue. We conclude that RNA-seq based transcriptome characterization would expedite genetic network analyses and permit the dissection of complex biologic functions.
overall design Retinal mRNA profiles of 21-day old wild type (WT) and Nrl-/- mice were generated by deep sequencing, in triplicate, using Illumina GAIIx.
contributor Rebecca,A,Smith
contributor David,Doe
supplementary file
SRA_center_name_code

SAMPLES
# This section lists and describes each of the biological Samples under investigation, as well as any protocols that are specific to individual Samples.
Sample name | title | source name | organism | characteristics: strain | characteristics: tissue | characteristics: age | characteristics: genotype | molecule | description | processed data file | raw file | raw file | raw file | raw file | raw file
Sample 1 | WT rep1 | Retina | Mus musculus | C57BL/6 | retina | post natal day 21 | wild type | total RNA |  | WT1.txt | Run123abc.csfasta | Run123abc.qual | 2011_01_gfh_qseq.txt | DS18389-7_1.fastq | DS18389-7_2.fastq
Sample 2 | WT rep2 |  |  |  | retina | post natal day 21 | wild type | total RNA |  | WT2.txt | run454.seq | run454.qual | 2011_05_rst_qseq.tar |  |
Sample 3 | Nrl-KO rep1 |  |  |  | retina | post natal day 21 | Nrl-/- | total RNA |  | mutant1.txt | GAXHYMS02.sff |  |  |  |
Sample 4 | Nrl-KO rep2 |  |  |  | retina | post natal day 21 | Nrl-/- | total RNA |  | mutant2.txt | 080717_BI-EAS46_1.fastq | 080717_BI-EAS46_2.fastq |  |  |

PROTOCOLS
# Any of the protocols below which are applicable to only a subset of Samples should be included as additional columns of the SAMPLES section instead.
growth protocol
treatment protocol
extract protocol Retinas were removed, flash frozen on dry ice, and RNA was harvested using Trizol reagent. Illumina TruSeq RNA Sample Prep Kit (Cat#FC-122-1001) was used with 1 ug of total RNA for the construction of sequencing libraries.
library construction protocol RNA libraries were prepared for sequencing using standard Illumina protocols
library strategy RNA-Seq

DATA PROCESSING PIPELINE
# Data processing steps include base-calling, alignment, filtering, peak-calling, generation of normalized abundance measurements etc…
# For each step provide a description, as well as software name, version, parameters, if applicable.
data processing step Illumina Casava1.7 software used for basecalling.
data processing step Sequenced reads were trimmed for adaptor sequence, and masked for low-complexity or low-quality sequence, then mapped to mm8 whole genome using bowtie v0.12.2 with parameters -q -p 4 -e 100 -y -a -m 10 --best --strata
data processing step Reads Per Kilobase of exon per Megabase of library size (RPKM) were calculated using a protocol from Chepelev et al., Nucleic Acids Research, 2009. In short, exons from all isoforms of a gene were merged to create one meta-transcript. The number of reads falling in the exons of this meta-transcript were counted and normalized by the size of the meta-transcript and by the size of the library.
genome build mm8
processed data files format and content tab-delimited text files include RPKM values for each Sample ...

# For each file listed in the "processed data file" columns of the SAMPLES section, provide additional information below.
PROCESSED DATA FILES
file name | file type | file checksum
WT.txt | abundance measurements | d8fcd650914ff1a733956d6d06e8b091
WT2.txt | abundance measurements | abcdef123456789abc123456789abc
mutant1.txt | abundance measurements | 95cf1d1fa509d871b2ef0bb9fd734c3d
mutant2.txt | abundance measurements | 0wd6ee3cce10b970e5bfea4e35cdb987

# For each file listed in the "raw file" columns of the SAMPLES section, provide additional information below.
RAW FILES
file name | file type | file checksum | instrument model | read length | single or paired-end
Run123abc.csfasta | solid_native_csfasta | 6cc6ee3cce10b970e5bfea4e35cdb | AB SOLiD System 3.0 | 50 | single
Run123abc_QV.qual | solid_native_qual | 88ceb0e0d056dda9208a03acf9073 | AB SOLiD System 3.0 | 50 | single
2011_01_gfh_qseq.txt | Illumina_native_qseq | 95cf1d1fa509d871b2ef0bb9fd734c | Illumina HiSeq 2000 | 72 | single
DS18389-7_1.fastq | fastq | 95cf1d1fa509d871b2ef0bb9fd734c | Illumina HiSeq 2000 | 50 | paired-end
DS18389-7_2.fastq | fastq | 0wd6ee3cce10b970e5bfea4e35cdb | Illumina HiSeq 2000 | 50 | paired-end
run454.seq | 454_native_seq | f2786fedc5106789a2af4014a0e74f | 454 GS FLX Titanium | 400 | single
run454.qual | 454_native_qual | d8fcd650914ff1a733956d6d06e8b0 | 454 GS FLX Titanium | 400 | single
2011_05_rst_qseq.tar | Illumina_native_qseq | 03839cca2e797b28b9f9371f7b9ca | Illumina Genome Analyzer II | 36 | single
GAXHYMS02.sff | sff | 604fbb658413c559511eb6ad2bb14 | 454 GS 20 | 36 | single
080717_BI-EAS46_1.fastq | fastq | 57cf1d1fa509d871b2ef0bb9fd734c | Illumina Genome Analyzer IIx | 42 | paired-end
080717_BI-EAS46_2.fastq | fastq | e5718e1a97690d410464f24f37aae | Illumina Genome Analyzer IIx | 42 | paired-end

# For paired-end experiments, list the 2 associated raw files, and provide average insert size and standard deviation, if known. For SOLiD experiments, list the 4 file names (include "file name 3" and "file name 4" columns).
PAIRED-END EXPERIMENTS
file name 1 | file name 2 | average insert size | standard deviation
DS18389-7_1.fastq | DS18389-7_2.fastq | 222 | 25
080717_BI-EAS46_1.fastq | 080717_BI-EAS46_2.fastq | 300 | 32
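
The RPKM step in this example merges the exons of all isoforms into one meta-transcript, counts the reads falling in it, and normalizes by meta-transcript size and library size. The sketch below illustrates that arithmetic under the conventional definition RPKM = reads × 10^9 / (length in bp × total mapped reads). The function names, the half-open exon coordinates, and the sample numbers are assumptions for illustration and are not taken from Chepelev et al. or from any pipeline named in the example.

```python
def merged_exon_length(exons):
    """Length in bp of the union of half-open [start, end) exon intervals.

    Overlapping or touching exons from different isoforms are merged,
    giving the size of the gene's meta-transcript.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(exons):
        if cur_end is None or start > cur_end:
            # Close the previous merged block before starting a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def rpkm(read_count, length_bp, library_size):
    """Reads per kilobase of meta-transcript per million reads in the library."""
    return read_count * 1e9 / (length_bp * library_size)


# Hypothetical gene: two overlapping exons and one separate exon.
length = merged_exon_length([(0, 100), (50, 200), (300, 400)])  # 300 bp
print(rpkm(150, length, 1_000_000))  # prints 500.0
```

Dividing by the merged length rather than the sum of isoform exon lengths avoids counting shared exon sequence twice.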
