DNA sequencing
ACGTGACTGAGGACCGTG
DNA sequencing refers to the methods and technologies used to determine the order of the nucleotide bases in a DNA molecule, namely adenine (A), guanine (G), cytosine (C) and thymine (T). The first DNA sequences were obtained in the early 1970s by academic researchers using laborious methods based on two-dimensional chromatography. Following the development of fluorescence-based sequencing methods with automated analysis, DNA sequencing has become easier and orders of magnitude faster.
DNA sequencing enables a thorough analysis of DNA because it provides the most basic information of all: the sequence of nucleotides. Knowledge of DNA sequences forms the basis of basic biological research and clinical genetic diagnosis. Numerous applied fields, such as biotechnology, forensic science and biological systematics, also depend heavily on the information generated through DNA sequencing.
DNA sequencing may be used to determine the sequence of individual genes, larger genetic regions (i.e. clusters of genes or operons), full chromosomes or entire genomes. Depending on the methods used, sequencing may provide the order of nucleotides in DNA or RNA isolated from cells of animals, plants, bacteria, archaea, or virtually any other source of genetic information. The resulting sequences may be used by researchers in molecular biology or genetics to further scientific progress or may be used by medical personnel to make treatment decisions or aid in genetic counseling.
History
RNA sequencing was one of the earliest forms of nucleotide sequencing. Its major landmark is the sequence of the first complete gene and the complete genome of bacteriophage MS2, identified and published by Walter Fiers in 1972 and 1976. Frederick Sanger published a rapid method for "DNA sequencing with chain-terminating inhibitors" in 1977. Walter Gilbert and Allan Maxam at Harvard also developed sequencing methods, including one for "DNA sequencing by chemical degradation". In 1973, Gilbert and Maxam reported the sequence of 24 base pairs using a method known as wandering-spot analysis. Advancements in sequencing were aided by the concurrent development of recombinant DNA technology, which allowed DNA samples to be isolated from sources other than viruses.
The first full DNA genome to be sequenced was that of bacteriophage φX174 in 1977. Leroy E. Hood and Smith announced the first semi-automated DNA sequencing machine in 1986. In 1995, Venter, Hamilton Smith, and colleagues published the first complete genome of a free-living organism, the bacterium Haemophilus influenzae. The circular chromosome contains 1,830,137 bases, and its publication in the journal Science marked the first published use of whole-genome shotgun sequencing, which eliminated the need for initial mapping efforts. Several new methods for DNA sequencing were developed in the mid to late 1990s. These techniques comprise the first of the "next-generation" sequencing methods. In 1996, Pål Nyrén and his student Mostafa Ronaghi published their method of pyrosequencing. A year later, Pascal Mayer and Laurent Farinelli describing DNA
Determination of nucleotide sequence
Two similar methods:
1. Maxam and Gilbert method
2. Sanger method
Both depend on the production of a mixture of oligonucleotides, labeled either radioactively or with fluorescein, that share one common end and differ in length by a single nucleotide at the other end.
This mixture of oligonucleotides is separated by high-resolution electrophoresis on polyacrylamide gels, and the positions of the bands are determined.
The four reaction mixtures are incubated with piperidine, which cleaves the sugar-phosphate backbone of DNA next to the residue that has been modified. To visualize the fragments, the gel is exposed to X-ray film for autoradiography, yielding a series of dark bands, each corresponding to a radiolabeled DNA fragment, from which the sequence may be inferred.
A methyl group is added to guanine, the modified base is removed from its sugar by heating, and the exposed sugar is removed from the backbone by heating in alkali. To cleave at both A and G, the procedure is identical except that a dilute acid is added after the methylation step. The reactions that cleave at C, or at C and T, use hydrazine to remove the bases and piperidine to cleave the backbone. The extent of the reaction can be carefully limited so that, on average, only one G is removed from each strand; each strand is therefore cleaved at only one of its guanine sites.
A radiolabeled strand to be sequenced and the fragments created from that strand by a single cleavage at the site of G are
Each original strand is broken into a labeled fragment and an unlabeled fragment. All the labeled fragments start at the 5′ end of the strand and terminate at the base that precedes the site of a G along the original strand. Only the labeled fragments will be recorded once all the fragments are separated on a gel and visualized by exposing the gel to an X-ray film to create an autoradiogram of the gel.
Fragments ending in A or G can be similarly identified. Note that the fragment cleaved at the first base will not show up on the gel, so the first base at the 5′ end of the original strand cannot be determined. The band corresponding to the shortest fragments is at the bottom of the autoradiogram. The 5′-to-3′ sequence of the original strand is read by noting the positions and lanes of the bands from the bottom to the top of the autoradiogram.
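The reading procedure described above can be sketched in Python. This is an idealized illustration, not a model of the real chemistry: it assumes clean cleavage in four lanes (G, A+G, C, C+T), a label at the 5′ end, and exactly one band per cleavage site. The function names `maxam_gilbert_lanes` and `read_lanes` are hypothetical.

```python
# Idealized Maxam-Gilbert sketch: lane names and function names are illustrative.
LANE_BASES = {"G": "G", "A+G": "AG", "C": "C", "C+T": "CT"}

def maxam_gilbert_lanes(strand):
    """Return, per lane, the lengths of the 5'-labeled fragments.

    Cleaving at 0-based position i leaves a labeled fragment of the i bases
    that precede it. Cleavage at the first base (i == 0) leaves nothing to
    see on the gel, so it is skipped.
    """
    return {
        lane: {i for i, base in enumerate(strand) if i > 0 and base in bases}
        for lane, bases in LANE_BASES.items()
    }

def read_lanes(lanes):
    """Read the 5'-to-3' sequence from the shortest band to the longest."""
    longest = max((max(s) for s in lanes.values() if s), default=0)
    calls = ["N"]  # the first base at the 5' end cannot be determined
    for length in range(1, longest + 1):
        if length in lanes["G"]:
            calls.append("G")      # band in both the G and A+G lanes
        elif length in lanes["A+G"]:
            calls.append("A")      # band in A+G only
        elif length in lanes["C"]:
            calls.append("C")      # band in both the C and C+T lanes
        elif length in lanes["C+T"]:
            calls.append("T")      # band in C+T only
        else:
            calls.append("N")      # no band at this length
    return "".join(calls)

lanes = maxam_gilbert_lanes("ACGTGACT")
print(read_lanes(lanes))  # NCGTGACT
```

The leading `N` reflects the point made above: the fragment cleaved at the first base does not appear on the gel.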
Frederick Sanger
Developed DNA sequencing by the chain termination method
Nobel Prize 1 (1958): complete amino acid sequence of insulin
Sanger Method
Generates the nested set of labeled fragments from a template strand by replicating the template strand to be sequenced and interrupting the replication at one of the four bases.
Four different replication reactions produce fragments that terminate in A, C, G, or T, respectively.
DNA synthesis uses deoxy- and dideoxynucleotides, which results in termination of synthesis at specific nucleotides.
Requires a primer, DNA polymerase, a template, a mixture of nucleotides, and a detection system.
Incorporation of a dideoxynucleotide into the growing strand terminates synthesis.
Synthesized strand sizes are determined for each dideoxynucleotide reaction by gel or capillary electrophoresis.
Dideoxynucleotide
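[Figure: dideoxynucleotide structure. The triphosphate (PPP) is attached through an oxygen to the 5′ CH2 of the sugar, which carries the base; the 3′ position has no hydroxyl group.]
No hydroxyl group at the 3′ end prevents strand extension.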
Dideoxy nucleotides
In the Sanger chain termination method, the nucleotide analog is called a dideoxynucleotide. Dideoxynucleotides are added in small proportion relative to the normal nucleotides. When the correct amount is added to the solution, chains will be terminated at each occurrence of the complementary nucleotide in the template, because DNA polymerase cannot add another base after the analog. For example, if the right amount of dideoxy A is added, chains will be terminated at each occurrence of T in the template. These strands are complementary to the template strand and terminate opposite the site of a T on the template strand. Determining the complete sequence requires a separate reaction for each of the four bases A, T, C, and G. Complementary strands terminating in A, G, C, or T are produced by including ddATP, ddGTP, ddCTP, or ddTTP, respectively, in the reaction mixture.
The primer is essential to initiate replication of the templates by DNA polymerase. The most convenient method for adding a known sequence to the 3′ end of the template strand is to clone the strand in the single-stranded cloning vector M13, so that a known M13 sequence will always flank the unknown DNA insert and can serve as the binding site for a standard primer. The M13 cloning protocol also automatically creates two types of clones, each containing a DNA insert whose sequence is complementary to that of the other. Thus, the two complementary strands may be sequenced and the two sequences cross-checked to ensure sequence accuracy.
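The cross-check between the two complementary strands can be illustrated with a short Python sketch. It assumes both reads cover exactly the same region and have equal length, which real reads rarely do; the names `reverse_complement` and `cross_check` are hypothetical.

```python
# Sketch of cross-checking two reads taken from complementary strands.
# Assumes equal-length reads over the same region (an idealization).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def cross_check(read_a, read_b):
    """Return the 0-based positions in read_a that disagree with read_b.

    read_b comes from the opposite strand, so it is reverse-complemented
    before the position-by-position comparison.
    """
    expected = reverse_complement(read_b)
    if len(expected) != len(read_a):
        raise ValueError("reads must cover the same region")
    return [i for i, (x, y) in enumerate(zip(read_a, expected)) if x != y]

print(cross_check("GGCATG", "CATGCC"))  # [] : the two reads agree
print(cross_check("GGCATG", "CATGCA"))  # [0] : a disagreement at position 0
```

A disagreement does not say which read is wrong; it only flags a position that needs to be sequenced again.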
Chain Termination
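[Figure: chain termination. A primer is annealed to the template 3′ CCGTAC 5′ and extended with dNTPs. Each of the four reactions also contains one ddNTP, at dN : ddN = 100 : 1.]
Reaction   Terminated products (5′ to 3′)
ddATP      GGCA
ddTTP      GGCAT
ddCTP      GGC
ddGTP      G, GG, GGCATG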
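The products for the CCGTAC template can be reproduced with a minimal Python sketch. It leaves out the primer and lists only the newly synthesized bases, and it assumes that every possible termination site is represented in the population of strands. The function name `termination_products` is hypothetical.

```python
# Sketch of the four Sanger reactions on a short template.
# The primer is omitted; only newly synthesized bases are shown.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def termination_products(template_3to5, dd_base):
    """List the strands that end in the given dideoxynucleotide.

    The template is written 3' to 5', so the new strand grows 5' to 3'
    as the template is read from left to right.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    return [
        new_strand[: i + 1]
        for i, base in enumerate(new_strand)
        if base == dd_base
    ]

for dd_base in "ATCG":
    print("dd" + dd_base + "TP", termination_products("CCGTAC", dd_base))
# ddATP ['GGCA']
# ddTTP ['GGCAT']
# ddCTP ['GGC']
# ddGTP ['G', 'GG', 'GGCATG']
```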
Sequence detection
To detect the products of the sequencing reaction:
Include labeled nucleotides
Formerly, radioactive labels were used
Now, fluorescent labels are used
Use a different fluorescent tag for each nucleotide
Can run all four bases in the same lane
Example of labeled fragments (* marks the label):
TAGCCACGTATCGAA*
TAGCCACGTATC*
TAGCCACG*
TAGCCACGT*
Sequence separation
Terminated chains need to be separated
Requires one-base-pair resolution
Must distinguish a chain of X base pairs from one of X+1 base pairs
[Figure: gel lanes labeled A, T, C, G]
Gel electrophoresis
Very thin gel
High voltage
Works with radioactive or fluorescent labels
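Once the fragments are separated with one-base resolution, reading the sequence amounts to sorting the bands by length and noting the label on each. The Python sketch below assumes each band is reported as a (length, terminal base) pair, which is a simplification of what a real detector produces. The function name `call_bases` is hypothetical.

```python
# Sketch of reading a sequence from separated, labeled fragments.
# Each band is a (fragment length, terminal base) pair; this format is assumed.
def call_bases(bands):
    """Read the bases from the shortest fragment to the longest.

    Lengths with no band are reported as 'N'.
    """
    by_length = dict(bands)
    if not by_length:
        return ""
    shortest, longest = min(by_length), max(by_length)
    return "".join(by_length.get(n, "N") for n in range(shortest, longest + 1))

# A complete ladder gives an unambiguous read.
ladder = [(3, "C"), (1, "G"), (2, "G"), (4, "A"), (5, "T"), (6, "G")]
print(call_bases(ladder))  # GGCATG

# Only four fragments (lengths 8, 9, 12 and 15) leave gaps in the read.
partial = [(8, "G"), (9, "T"), (12, "C"), (15, "A")]
print(call_bases(partial))  # GTNNCNNA
```

The second example shows why one-base resolution matters: every missing or unresolved band leaves an undetermined position.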
Shotgun Sequencing
Since only short stretches of DNA, several hundred to a thousand base pairs in length, can be obtained from a single sequencing gel, many short sequences must be generated separately and then combined to determine the sequence of a much longer DNA fragment. Various strategies have been developed to generate these short sequences from the larger fragment. The shotgun approach is the most widely used in the larger sequencing projects.
Copies of a long fragment to be sequenced are broken into much shorter fragments that overlap one another, and the short fragments are cloned. Those clones are then picked at random and sequenced. The sequence of the long fragment is determined by finding overlaps among the short sequences and assembling those sequences into the most likely order. Numerous computer algorithms have been developed to facilitate the assembly of long sequences.
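The idea of assembling by overlaps can be sketched with a simple greedy merge in Python. This is one illustrative approach, not the algorithm of any particular assembler: it assumes error-free reads from a single strand and no repeated regions, and the names `overlap`, `greedy_assemble` and the `min_len` parameter are hypothetical.

```python
# Greedy overlap assembly sketch. Assumes error-free, same-strand reads
# and no repeats; real assemblers handle far more than this.
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates
    # Drop reads that lie entirely inside another read.
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left: the remaining contigs are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

reads = ["AGGACCGTG", "ACGTGACTGA", "CTGAGGAC"]
print(greedy_assemble(reads))  # ['ACGTGACTGAGGACCGTG']
```

When the loop stops with more than one contig, the reads did not overlap enough to join them, which corresponds to a gap in the assembled sequence.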
Inevitably, gaps remain in the sequence of the long fragment, and they are filled by switching to a directed sequencing strategy. That is, the short clones are no longer sequenced at random, but rather, short sequences at the end of a continuous stretch of known sequence provide the information necessary to construct a probe to pick out a clone, or region of a clone, whose sequence will extend the known sequence. Most of the large sequencing projects to date have used a mixture of random and directed sequencing strategies to complete the sequence of long, contiguous stretches of DNA. The advantage of the random, or shotgun, strategy is that in the course of picking clones at random and sequencing them, any given region is usually sequenced many times, thereby reducing the errors in the final sequence.
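The error-reducing effect of sequencing a region many times can be illustrated with a majority vote over overlapping reads. The Python sketch below assumes the reads have already been placed at known start positions, which in practice is the job of the assembly step. The function name `consensus` and the example reads are hypothetical.

```python
# Sketch of a majority-vote consensus over reads with known start positions.
from collections import Counter

def consensus(placed_reads, length):
    """Return the consensus sequence and the read depth at each position.

    placed_reads is a list of (start, read) pairs; positions that no read
    covers are reported as 'N' with depth 0.
    """
    columns = [Counter() for _ in range(length)]
    for start, read in placed_reads:
        for offset, base in enumerate(read):
            columns[start + offset][base] += 1
    depth = [sum(col.values()) for col in columns]
    sequence = "".join(
        col.most_common(1)[0][0] if col else "N" for col in columns
    )
    return sequence, depth

# The second read carries one error (C instead of G at position 4).
placed = [(0, "ACGTGA"), (2, "GTCACT"), (3, "TGACTGA"), (0, "ACGTG")]
sequence, depth = consensus(placed, 10)
print(sequence)  # ACGTGACTGA
print(depth)     # [2, 2, 3, 4, 4, 3, 2, 2, 1, 1]
```

The erroneous base is outvoted because position 4 is covered four times; the last two positions are covered only once, so an error there would go undetected.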