
GENE PREDICTION TECHNIQUES

PRESENTED BY: AQSA SYED


PRESENTED TO: SIR RIZWAN
ROLL NO: 157104
DEPARTMENT: BIOINFORMATICS
SEMESTER: 5th
TALK SEQUENCE
1. What is a gene?

2. What is Gene finding?

3. Gene prediction methods.

4. Prokaryotic gene prediction.

5. Eukaryotic gene prediction.


WHAT IS A GENE?

• A gene is the basic unit of heredity and is located on a chromosome.

• Genes are made up of DNA and act as instructions for making molecules called proteins.

• The Human Genome Project estimated that humans have between 20,000 and
25,000 genes.

• Gregor Mendel first proposed the idea of the gene from his experiments on pea plants. He did not
use the term "gene"; he referred to "factors" in his first law.

• Danish botanist Wilhelm Johannsen coined the word "gene" ("gen" in Danish and
German) in 1909 to describe these fundamental physical and functional units of heredity.
GENE ANATOMY
WHAT IS GENE FINDING?

• Gene prediction, or gene finding, is the process of identifying the
regions of genomic DNA that encode genes.

• This includes protein-coding genes as well as RNA genes.

• It may also include the prediction of other functional elements, such as regulatory
regions.

• Gene finding is one of the first and most important steps in understanding
the genome of a species once it has been sequenced.
GENE PREDICTION METHODS

SEQUENCE SIMILARITY SEARCH

• A simple approach based on finding similarity between a genomic sequence and known
ESTs, proteins, or other genes.

• If a genomic region is similar to a known EST, DNA, or protein sequence, that
similarity is used to predict gene structure and function.

• Similarity searches use local or global alignment methods.

• BLAST (a local alignment tool) is used to detect similarity to known genes, ESTs, and proteins.
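
The idea of local alignment can be sketched with a small Smith-Waterman scoring function. This is only an illustration of the principle: BLAST itself uses a faster seed-and-extend heuristic rather than this full dynamic programming table, and the scoring values (match=2, mismatch=-1, gap=-2) and the function name are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

A high score between a genomic region and a known EST or gene would suggest that the region is worth examining as a candidate gene.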
AB INITIO GENE PREDICTION

• Genomic DNA is systematically searched for potential coding genes.

• It is based on two types of sequence information:

• Signal sensors: sequence motifs such as splice sites, branch points,
polypyrimidine tracts, start codons, and stop codons.

• Content sensors: statistical patterns, such as codon usage, that distinguish coding
from non-coding sequence and are used to detect exons.

• Algorithms based on these methods include dynamic programming, linear
discriminant analysis, hidden Markov models, and neural networks.
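
A content sensor can be sketched as a codon-usage score. The example below is a simplified assumption, not the method of any particular gene finder: it estimates codon frequencies from a set of known in-frame coding sequences and scores a window by its log-odds against a uniform background of 1/64 per codon. The function names and the pseudocount are hypothetical.

```python
import math
from collections import Counter


def codon_frequencies(coding_seqs, pseudocount=1.0):
    """Estimate codon frequencies from in-frame coding sequences (toy training)."""
    bases = "ACGT"
    # Start every codon at the pseudocount so unseen codons keep a non-zero frequency.
    counts = Counter({a + b + c: pseudocount for a in bases for b in bases for c in bases})
    for seq in coding_seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in counts:  # skip codons with ambiguous bases such as N
                counts[codon] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def coding_log_odds(window, freqs):
    """Sum log2(codon frequency / (1/64)) over the in-frame codons of a window."""
    window = window.upper()
    score = 0.0
    for i in range(0, len(window) - 2, 3):
        codon = window[i:i + 3]
        if codon in freqs:
            score += math.log2(freqs[codon] * 64)
    return score
```

A positive score means the window uses codons that are common in the training genes. Real gene finders use richer models, such as the hidden Markov models listed above.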
PROKARYOTIC GENE PREDICTION

• Prokaryotic gene prediction is easier than eukaryotic gene prediction.

• Prokaryotic DNA is much more gene dense.

• Prokaryotic genomes lack introns (non-coding regions), have very few repetitive sequences,
and more of them have been sequenced.

• The start codon is usually ATG.

• The simplest first approach to finding genes in prokaryotes is to look for open reading frames (ORFs).


PROKARYOTIC GENE STRUCTURE
ORF

• ORF stands for open reading frame.

• Scanning DNA for start and stop codons is the simplest gene-finding method.

• Each strand of DNA has three potential reading frames.

• DNA is translated in all six possible reading frames, three forward and
three reverse.

• Any region of DNA between a start codon and an in-frame stop codon could
potentially code for a polypeptide and is therefore an ORF.

• ORF Finder is a computational tool for finding coding regions.
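
The six-frame scan described above can be sketched as follows. This is a minimal illustration, not the algorithm of the ORF Finder tool: it assumes ATG is the only start codon, uses the standard stop codons TAA, TAG, and TGA, and the `min_len` threshold and function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"  # assumption: only ATG is treated as a start codon
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield (start, end) of ATG...stop ORFs in one frame; end is exclusive."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == START_CODON:
            start = i  # first start codon opens the ORF
        elif start is not None and codon in STOP_CODONS:
            yield start, i + 3  # in-frame stop codon closes it
            start = None


def find_orfs(seq, min_len=30):
    """Scan all six frames; return (strand, frame, start, end, orf_sequence).

    Coordinates on the '-' strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame):
                if end - start >= min_len:
                    results.append((strand, frame, start, end, s[start:end]))
    return results
```

For example, `find_orfs("CCATGAAATGA", min_len=9)` reports a single ORF, `ATGAAATGA`, in forward frame 2. In practice a larger `min_len` is used because short ORFs occur frequently by chance.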


ORF FINDER
GENE FINDING PROGRAMS IN PROKARYOTES
GeneMark.hmm:
• GeneMark is an ab initio gene prediction tool developed at the Georgia
Institute of Technology, Atlanta.

• The GeneMark.hmm algorithm was designed in 1998.

• It is used to find short genes and gene starts.


GENEMARK HOME PAGE
GLIMMER:
• Gene Locator and Interpolated Markov ModelER.

• Used to find genes in bacteria, archaea, and viruses.

• Finds 98-99% of all relatively long protein-coding genes.

RBSFINDER:
• Searches for ribosome binding sites to predict translation initiation
sites.
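
The idea of searching for a ribosome binding site can be sketched as a motif scan upstream of a candidate start codon. This is not the RBSfinder algorithm. It is a simplified assumption that counts matches to the Shine-Dalgarno consensus AGGAGG, and the window size of 20 nt and the function name are hypothetical.

```python
SD_CONSENSUS = "AGGAGG"  # assumption: Shine-Dalgarno consensus used as the motif


def best_rbs_match(seq, start_pos, window=20):
    """Return (position, matches) of the best consensus match upstream of start_pos.

    Returns None if the upstream region is shorter than the motif.
    """
    seq = seq.upper()
    lo = max(0, start_pos - window)
    best = None
    # Slide the motif across the upstream window, stopping before the start codon.
    for i in range(lo, start_pos - len(SD_CONSENSUS) + 1):
        site = seq[i:i + len(SD_CONSENSUS)]
        matches = sum(x == y for x, y in zip(site, SD_CONSENSUS))
        if best is None or matches > best[1]:
            best = (i, matches)
    return best
```

A candidate start codon with a strong upstream match is a more plausible translation initiation site than one without.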
GENE PREDICTION IN EUKARYOTES

• Eukaryotic gene prediction is more complex than prokaryotic gene prediction.

• Eukaryotic genomes are much larger and have low gene density.

• They are rich in repetitive sequences and transposable elements.

• Genes contain both exons and introns.


EUKARYOTIC GENE STRUCTURE
INTRON EXON SPLICE SITE

• The goal is to detect exons and to precisely locate the boundaries between each exon and
the adjacent introns.

• Four types of exons are recognised:

1. Initial exons, from the initiation codon to the first splice site.
2. Internal exons, from splice site to splice site.
3. Terminal exons, from splice site to stop codon.
4. Single exons, corresponding to uninterrupted, intronless genes, i.e.,
running from initiation codon to stop codon.
SENSITIVITY:
The proportion of ‘true’ splice sites that a programme detects.

SPECIFICITY:
The proportion of predicted sites that are correct.
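
These two measures can be computed from a set of true sites and a set of predicted sites, as in the sketch below. The function names and the representation of a site as a position are assumptions for illustration. Specificity as defined on this slide, the fraction of predictions that are correct, is what other fields call precision.

```python
def sensitivity(true_sites, predicted_sites):
    """Fraction of true sites that were predicted: TP / (TP + FN)."""
    true_set, pred_set = set(true_sites), set(predicted_sites)
    if not true_set:
        return 0.0
    return len(true_set & pred_set) / len(true_set)


def specificity(true_sites, predicted_sites):
    """Fraction of predicted sites that are correct: TP / (TP + FP)."""
    true_set, pred_set = set(true_sites), set(predicted_sites)
    if not pred_set:
        return 0.0
    return len(true_set & pred_set) / len(pred_set)
```

With four true sites and three predictions of which two are correct, sensitivity is 2/4 and specificity is 2/3.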
TRANSCRIPTION SIGNALS

• The principal transcription signals used for gene detection include:

• Initiator or CAP signal located at the transcription start point.

• The TATA box.

• Transcription factor binding site.

• Polyadenylation signal.
TRANSLATIONAL SIGNAL

• The Kozak signal, which surrounds the AUG initiation codon and lies mostly
immediately upstream of it.

• The termination codon(s) present in the terminal exon and absent from the
initial and internal exons.
GENE FINDING PROGRAMS IN EUKARYOTES
AB INITIO BASED GENE PREDICTION
1. GENEID:
An ab initio gene prediction tool used to predict genes in eukaryotes.

2. GRAIL:
A neural network-based algorithm used to predict splice junctions,
start and stop codons, poly-A sites, promoters, and CpG islands.

3. FGENESH:
A programme to predict multiple genes in genomic DNA.
AB INITIO BASED GENE PREDICTION...

4. GenScan:

A program that identifies complete gene structures in genomic DNA. It can
be used to predict the locations of genes and their exon-intron boundaries
in genomic sequences from a variety of organisms.
HOME PAGE
OUTPUT
HOMOLOGY BASED GENE PREDICTION PROGRAMMES

1. GenomeScan:
Combines GenScan prediction results with BLASTX similarity
searches.
2. TwinScan:
A gene finding server similar to GenomeScan.
CONSENSUS BASED GENE PREDICTION

1. GeneComber:
Combines the prediction results of HMMgene and GenScan.

2. DIGIT:

• The consensus algorithm has been implemented in a gene finder named DIGIT.

• DIGIT combines the results of FGENESH, GENSCAN, and HMMgene.
