
Bioinformatics Exercises

Prof. Maria Lucila Hernandez Macedo and Prof. Leandro E. C. Diniz

Exercise 01
Using the sequence below, analyze it and answer the questions:
   1 acatccgcgg caacgcctcc ttggtgtcgt ccgcttccaa taacccagct tgcgtcctgc
  61 acacttgtgg cttccgtgca cacattaaca actcatggtt ctagctccca gtcgccaagc
 121 gttgccaagg cgttgagaga tcatctggga agtcttttac ccagaattgc tttgattcag
 181 gccagctggt ttttcctgcg gtgattcgga aattcgcgaa ttcctctggt cctcatccag
 241 gtgcgcggga agcaggtgcc caggagagag gggataatga agattccatg ctgatgatcc
 301 caaagattga acctgcagac caagcgcaaa gtagaaactg aaagtacact gctggcggat
 361 cctacggaag ttatggaaaa ggcaaagcgc agagccacgc cgtagtgtgt gccgcccccc
 421 ttgggatgga tgaaactgca gtcgcggcgt gggtaagagg aaccagctgc agagatcacc
 481 ctgcccaaca cagactcggc aactccgcgg aagaccaggg tcctgggagt gactatgggc
 541 ggtgagagct tgctcctgct ccagttgcgg tcatcatgac tacgcccgcc tcccgcagac
 601 catgttccat gtttctttta ggtatatctt tggacttcct cccctgatcc ttgttctgtt
 661 gccagtagca tcatctgatt gtgatattga aggtaaagat ggcaaacaat atgagagtgt
 721 tctaatggtc agcatcgatc aattattgga cagcatgaaa gaaattggta gcaattgcct
 781 gaataatgaa tttaactttt ttaaaagaca tatctgtgat gctaataagg aaggtatgtt
 841 tttattccgt gctgctcgca agttgaggca atttcttaaa atgaatagca ctggtgattt
 901 tgatctccac ttattaaaag tttcagaagg cacaacaata ctgttgaact gcactggcca
 961 ggttaaagga agaaaaccag ctgccctggg tgaagcccaa ccaacaaaga gtttggaaga
1021 aaataaatct ttaaaggaac agaaaaaact gaatgacttg tgtttcctaa agagactatt
1081 acaagagata aaaacttgtt ggaataaaat tttgatgggc actaaagaac actgaaaaat
1141 atggagtggc aatatagaaa cacgaacttt agctgcatcc tccaagaatc tatctgctta
1201 tgcagttttt cagagtggaa tgcttcctag aagttactga atgcaccatg gtcaaaacgg
1261 attagggcat ttgagaaatg catattgtat tactagaaga tgaatacaaa caatggaaac
1321 tgaatgctcc agtcaacaaa ctatttctta tatatgtgaa catttatcaa tcagtataat
1381 tctgtactga tttttgtaag acaatccatg taaggtatca gttgcaataa tacttctcaa
1441 acctgtttaa atatttcaag acattaaatc tatgaagtat ataatggttt caaagattca
1501 aaattgacat tgctttactg tcaaaataat tttatggctc actatgaatc tattatactg
1561 tattaagagt gaaaattgtc ttcttctgtg ctggagatgt tttagagtta acaatgatat
1621 atggataatg ccggtgagaa taagagagtc ataaacctta agtaagcaac agcataacaa
1681 ggtccaagat acctaaaaga gatttcaaga gatttaatta atcatgaatg tgtaacacag
1741 tgccttcaat aaatggtata gcaaatgttt tgacatgaaa aaaggacaat ttcaaaaaaa
1801 taaaataaaa taaaaataaa ttcacctagt ctaaggatgc taaaccttag tactgagtta
1861 cattgtcatt tatatagatt ataacttgtc taaataagtt tgcaatttgg gagatatatt
1921 tttaagataa taatatatgt ttacctttta attaatgaaa tatctgtatt taattttgac
1981 actatatctg tatataaaat attttcatac agcattacaa attgcttact ttggaataca
2041 tttctccttt gataaaataa atgagctatg tattaacaaa aaaaaaaaaa aaaaaaaaaa
2101 aaaaaaaaaa aaaaaa

What to do?
You have been given the sequence, in the 5' to 3' direction, of an unknown cDNA
(a copy of a messenger RNA). The questions are:
1. Which gene does the sequence correspond to?
2. From which species?
3. What are the 12 nucleotides at the start of translation?
4. What amino acid sequence does the RNA encode? (Present it as in the
example below.)

Tips:
1. Run a BLAST search with the sequence at: http://www.ncbi.nlm.nih.gov/BLAST/

Go to: Nucleotide-nucleotide BLAST (blastn)

Paste the sequence into the top box.

Press the on-screen button.

On the next page, press the button and wait (do not press it several times,
as that only makes the wait longer)...

Just below, you will find the DNA sequence most similar to the one you
searched with. You will see its name, its species, and a similarity value
(E value) that is inversely related to how similar the sequences are
(identical sequences give an E value of ZERO).

2. To obtain the amino acid sequence, go to:

http://us.expasy.org/tools/dna.html

The program will analyze all possible ORFs (Open Reading Frames).

Explanation: the ribosome reads the RNA three letters at a time, and reading
can start at the first, second, or third letter of the sequence. Depending on
the position at which it starts, the resulting amino acid sequence will be
different. Pay attention to the Met (short for methionine) and Stop (short for
translation termination codon) markers. The protein sequence (if the sequence
you are analyzing is the complete cDNA, end to end) always begins with a
methionine and ends with a Stop. The longer the stretch of amino acids from a
Met to a Stop, the more likely it is to be the right ORF (the polypeptide, or
protein, encoded by the sequence in question).
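
The reading-frame idea described above can be sketched in a few lines of Python. This is only an illustration, not what the ExPASy tool does internally: it assumes the standard genetic code, looks at the three forward frames only, and the function names `translate` and `longest_orf` are made up for this example.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame=0):
    """Translate seq starting at offset frame (0, 1 or 2); '*' marks a Stop."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Codons with ambiguous letters are shown as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_orf(seq):
    """Return (frame, peptide) for the longest Met-to-Stop stretch found."""
    best = (0, "")
    for frame in range(3):
        chunks = translate(seq, frame).split("*")
        # The last chunk is not followed by a Stop, so it is skipped.
        for chunk in chunks[:-1]:
            start = chunk.find("M")
            if start != -1 and len(chunk) - start > len(best[1]):
                best = (frame, chunk[start:])
    return best


print(longest_orf("CCATGGCTTAAGG"))  # (2, 'MA')
```

The frame returned is a 0-based offset, so frame 2 means translation starts at the third letter. With the exercise sequence, the first 12 nucleotides of the reported ORF answer question 3.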

Another note: in some cases it is not known whether the cDNA sequence is the
+ strand or the - strand, so the program reports translations in both
directions of the given sequence.
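
To make the "both directions" point concrete, the sketch below builds the six reading frames (three offsets on each strand). It is a minimal illustration assuming plain A/C/G/T input; `reverse_complement` and `six_frames` are hypothetical helper names, not part of any tool mentioned here.

```python
# Map each base to its complement, keeping upper or lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq):
    """List (strand, offset, subsequence) for the six reading frames."""
    minus = reverse_complement(seq)
    return [
        (strand, offset, strand_seq[offset:])
        for strand, strand_seq in (("+", seq), ("-", minus))
        for offset in range(3)
    ]


for strand, offset, sub in six_frames("ATGCAA"):
    print(strand, offset, sub)
```

Each of the six subsequences can then be translated codon by codon and scanned for Met and Stop.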

You can also see which ORF is the right one at the site below (the ORF is the
green box; click it to see the sequence):
http://www.ncbi.nlm.nih.gov/gorf/gorf.html

There is a web page with a very interesting exercise in which you carry out
transcription and then translation. Give it a try!
http://learn.genetics.utah.edu/content/begin/dna/transcribe/

Exercise 2
After sequencing DNA from a sample of a microorganism isolated from soil, the
sequence below was obtained. Identify the organism to which this sequence
belongs. If the sequence is coding, what type of protein would it be related
to?
ATGGCCATACTTTGCAGTACTGCATTGGCTCTGGGCGCATGCGGAAGTATGGGGAAAGCGGGCGGCCCGACAAT
GCGTTGTCTATAGAACAAACAACGCAGATAGACAGCGCAGACGGGATTGATGCCTCCAAACTGCTCTTTTCCTC
TTCCCAAGCCGTTGTTATCGCCGGTGATTCTGTGGGGCAGAAGTGGGAGGGCGCGAAAGCAGCGGTGAAGCGG
GGCGCGCCGCTGCTGGTGCGCACTGCCGATAACGCGTCGGCCATTGATTCGGAGATAAAGCGCCTCGGGGCTCA
AGACGTTATTAAGATTGACGAGCCTCAGGCCCCGGACCCGGAAATTTCCGAGGCAACATTGCCGATAAAATCTC
GCAGCTCACGCCCGAATCCCCGCTTTTTAACGGCGGCGCGTCCATCCTGGTCTCCGGGCACACCACGGCCGCTG
ATGTAGCCACCGCACGCGCGTCGGGGGCCAATGTGGAGTACCTGTCTTCGGGCGATGCGCGTGAAAGCTCTGCG
CTATCCGCTGATCCCGACGCTCATGTGGTTGCCCTGGGTCCAAGTTTTGCCAACAAAGAACGCTTTAATCGCCAG
GTAGAGATGATTAGCCATGGTGAGGTCCCCGGTGGCGGGCATCTCATTTTCCCCTCGCATCGCGTGGTAGCTCTC
TACGGTCATCCTTCCGGCGGGGCGCTGGGAGTGCTTGGCGAGCAACCTGCTGAGGAAGCCGTAAACAGGGTGA
ATGATTTAGTGGGTAAGTATCAGGCCATTGCACCGGAAGAGAGCATGATCCCCGCCTTTGAGATCATTGCTACC
GTGGCGAGCTCGTCAGCAGGGCCGGATGGCAATTATTCCAATGAGGGGAACGTTGATGAGCTGCGCCCGTGGGT
TGAAGCGATTGGTGATGCTGGAGGCATAGCGATTCTTGATCTACAACCTGGCAGCGCAAGCTTCCTTGAACAGG
CACAACAATTTGAGGAATTGCTGAAACTACCGCACGTCGGACTGGCGATAGATCCCGAGTGGCGGCTTAAGCCG
GGGGAGAAACCCATGGAGAGGGTCGGCAGTGTTGGGGCGGGGGAAGTGAACCAGACTGCTGCGTGGCTGCGGG
ACCTGGTAAAAGATAACGAGCTCCCGCAGAAAGTCTTTGTTGTGCACCAATTTCAGCATCAGATGGTGCAGAAC
AGGGAAACCTTGGACACCACGGCACCGGAACTTTCGTGGGTTCTTCACGCAGATGGCCACGGAACCGCGGGCG
ATAAGTTTGCCACGTGGGATATGGTGCGGAAGAATCTGCAGCCCGAGTTCTACCTTGCGTGGAAGAACTTTATC
GATGAGGATCAGCCGATGTTCACCCCCGAGCAGACGTTTAAGATCGAGCCTCGGCCTTGGTTTGTGTCCTATCA
ATAA

Exercise 3:
Perform an in silico digestion of the fragment amplified by primers Eub-8f and
1492r from the genome of the organism identified in the exercise above, using
the restriction enzymes HaeIII, HhaI, MseI, MspI and RsaI.
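
As a rough sketch of what an in silico digestion does, the Python below cuts a linear fragment at every recognition site of one enzyme and returns the fragment lengths. The recognition sites and cut offsets are the standard ones for these five enzymes; the function name `digest` is illustrative, and the amplicon itself still has to be retrieved separately, since it is not given in this handout.

```python
import re

# Recognition site and cut offset within the site (top strand, 5' to 3').
ENZYMES = {
    "HaeIII": ("GGCC", 2),  # GG^CC
    "HhaI": ("GCGC", 3),    # GCG^C
    "MseI": ("TTAA", 1),    # T^TAA
    "MspI": ("CCGG", 1),    # C^CGG
    "RsaI": ("GTAC", 2),    # GT^AC
}


def digest(seq, enzyme):
    """Return fragment lengths after cutting a linear sequence with enzyme."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    # The lookahead lets overlapping sites be found as well.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


print(digest("AAGGCCTTTTGGCCA", "HaeIII"))  # [4, 8, 3]
```

Running `digest` once per enzyme on the amplified fragment gives the fragment sizes expected for each digestion.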

Exercise 4:
Which organisms do the sequences below correspond to?
If they corresponded to a gene, what would be the function and structure of
the protein encoded by each sequence?
Sequence 1
CGAAAAATAAGCCATAGTCGGCACCATAAGCATAACCTAGCTCTGCGATTATCTCTAACATAATTAACTT
AAGCAGCCGTATTTATAAAGAAATTTCCAAAATAAAGCGAATATTCTAGAATCCCAAAACAAACTGGTTG
TTGCGGTAGGTCATTTGTTTGGCAGAAAGAAAACTCGAGAAATTTCTCTGGCCGTTATTCTCTATTCGTT
TTGTGACTCTCCCTCTTTGTACTATTGCTCTCTCACTCTGTCACACAGTAAACGGCGCACTGTTCTCGTT
GCTTCGAGAGAGCGCGCCTCGAATGTTCGCGAAAAGAGCGCCGGAGTATAAATAGAGGAGCTTCGTCGAC
GGAGAGTCAATTCTATTCAAACAAGCAAAGTGAACACATCGCTAAGCGAAAGCTAAGCAAACAAACAAGC
GCAGCTGAACAAGCTAAACAATCTGCAATAAAGTGCAAGTTAAAGTGAATCAATTAAAAGTAACCAACAA
CCAAGTAATTAAACTAAAAACTGCAACTACTGAAATCAACCAAGAAGTAATTATTGAAGACAAGAAGAGA
ACTCTGAATACTTTCAACAAGTCGTTACCGAGGAAGAAGAACTCACACACAATGCCTGCTATTGGAATCG
ATCTGGGCACCACCTACTCCTGCGTGGGTGTCTACCAACATGGCAAGGTGGAGATTATCGCCAACGACCA
GGGCAACCGCACCACGCCGTCCTACGTGGCTTTCACAGATTCGGAACGCCTCATCGGCGATCCGGCTAAG
AACCAGGTGGCCATGAACCCCAGAAACACAGTGTTTGACGCCAAGCGACTGATCGGCCGAAAATACGACG
ACCCCAAGATCGCAGAGGACATGAAGCACTGGCCTTTCAAGGTTGTAAGCGACGGCGGAAAGCCCAAGAT
CGGGGTGGAGTATAAGGGTGAGTCCAAGAGATTTGCCCCCGAGGAGATCAGCTCGATGGTACTGACCAAG

ATGAAGGAGACGGCGGAGGCATATCTGGGCGAGAGCATCACAGACGCAGTCATCACAGTTCCAGCCTACT
TCAACGACTCCCAGCGCCAGGCTACCAAAGACGCCGGTCACATCGCCGGCCTGAATGTGCTCCGCATCAT
CAATGAGCCCACGGCGGCAGCACTGGCCTACGGACTGGACAAGAACCTCAAGGGTGAGCGCAATGTGCTT
ATCTTCGACTTGGGCGGCGGCACCTTCGATGTCTCCATCCTGACCATCGACGAGGGATCACTGTTCGAGG
TGCGCTCCACCGCCGGAGACACACACTTGGGCGGCGAGGACTTTGACAACCGGCTAGTCACTCATCTGGC
GGACGAGTTCAAGCGCAAGTACAAGAAGGATCTGCGCTCCAACCCTCGCGCCCTACGACGCCTCAGAACA
GCAGCTGAACGGGCCAAGCGCACACTCTCCTCCAGCACGGAGGCCACCATCGAGATTGACGCACTGTTTG
AGGGCCAAGACTTCTACACCAAAGTGAGCCGCGCCAGGTTTGAGGAGCTGTGCGCGGACCTCTTCCGCAA
CACCCTGCAGCCTGTGGAGAAGGCCCTCAACGATGCCAAGATGGATAAGGGTCAGATCCACGACATCGTG
CTCGTCGGCGGATCCACTCGCATTCCCAAGGTGCAAAGTCTGCTGCAGGACTTCTTCCACGGCAAGAACC
TCAACCTATCCATCAACCCAGACGAGGCAGTTGCATACGGAGCTGCTGTGCAGGCCGCTATCCTCAGCGG
AGACCAGAGCGGCAAGATCCAGGACGTGCTGCTGGTGGACGTGGCCCCACTTTCATTGGGAATTGAGACC
GCTGGAGGTGTAATGACCAAGCTGATCGAGCGCAACTGCCGCATTCCGTGCAAGCAGACTAAGACGTTCT
CCACATACGCGGACAACCAGCCCGGAGTCTCCATTCAGGTGTATGAGGGCGAACGTGCGATGACGAAGGA
CAACAATGCATTGGGCACCTTCGATCTGTCCGGCATTCCACCTGCACCAAGGGGTGTGCCCCAGATAGAA
GTTACCTTCGACTTGGACGCCAATGGAATCCTGAACGTCAGCGCCAAGGAGATGAGCACGGGCAAGGCCA
AGAACATCACGATCAAGAACGACAAGGGACGGCTCTCGCAGGCCGAGATTGATCGCATGGTGAACGAGGC
TGAAAAGTACGCCGACGAGGACGAGAAGCATCGCCAGCGAATAACCTCTAGAAATGCCCTGGAGAGCTAC
GTCTTCAATGTGAAGCAGGCCGTGGAACAGGCACCTGCTGGCAAATTGGACGAGGCTGACAAGAACTCCG
TCTTGGACAAGTGCAACGACACTATCCGGTGGCTGGACAGCAACACCACTGCCGAGAAGGAGGAGTTCGA
CCACAAGCTGGAGGAGCTCACCCGCCACTGCTCCCCCATCATGACCAAGATGCATCAGCAGGGTGCGGGA
GCTGGAGCTGGTGGTCCGGGAGCAAACTGCGGCCAGCAGGCGGGAGGATTTGGAGGCTACTCTGGACCCA
CGGTCGAGGAGGTCGACTAAGGCCAAAGAGTCTAATTTTTGTTCATCAATGGGTTATAACATATGGGTTA
TATTATAAGTTTGTTTTAAGTTTTTGAGACTGATAAGAATGTTTCGATCGAATATTCCATAGAACAACAA
TAGTATTACCTAATTACCAAGTCTTAATTTAGCAAAAATGTTATTGCTTATAGAAAAAATAAATTATTTA
TTTGAAATTTAAAGTCAACTTGTCATTTAATGTTTTGTAGACTTTTGAAAGTCTTACGATACAATTAGTA
TCTAATATACATGGGTTCATTCTACATTCTATATTAGTGATGATTTCTTTAGCTAGTAATACATTTTAAT
TATATTCGGCTTTGATGATTTTCTGATTTTTTCCGAACGGATTTTCGTAGACCCTTTCGATCTCATAATG
GCTCATTTTATTGCGATGGACGGTCAGGAGAGCTCCACTTTTGAATTTCTGTTCGCAGACACCGCATTTG
TAGCACATAGCCGGGACATCCGGTTTGGGGAGATTTTCCAGTCTCTGTTGCAATTGGTTTTCGGGAATGC
GTTGCAG

Sequence 2
GGTTCCAATCCTGCCTCTGCCACTTCTCAGTTGTATGCCCCAACCCAACCTGTCTGGCTCTGTCCTCCTT
AACAGAAGGACGGCCCTGGCCACGGGCCACAGCCAGCAACGCTTAAGCACCAGGGCCGGCGAGTGCCCTG
CCGTGGCACGGCTCCAGCGTCGCGCTCTCGAATTCATTTGCTTTCCTTAACGAGAGAAGGTTCCAGATGA
GGGCTGAACCCTCTTCGCCCCGCCCACGGCCCCTGAACGCTGGGGGAGGAGTGCATGGGGAGGGGCGGCC
CTCAAACGGGTCATTGCCATTAATAGAGACCTCAAACACCGCCTGCTAAAAATACCCGACTGGAGGAGCA
TAAAAGCGCAGCCGAGCCCAGCGCCCCGCACTTTTCTGAGCAGACGTCCAGAGCAGAGTCAGCCAGCATG
ACCGAGCGCCGCGTCCCCTTCTCGCTCCTGCGGGGCCCCAGCTGGGACCCCTTCCGCGACTGGTACCCGC
ATAGCCGCCTCTTCGACCAGGCCTTCGGGCTGCCCCGGCTGCCGGAGGAGTGGTCGCAGTGGTTAGGCGG
CAGCAGCTGGCCAGGCTACGTGCGCCCCCTGCCCCCCGCCGCCATCGAGAGCCCCGCAGTGGCCGCGCCC
GCCTACAGCCGCGCGCTCAGCCGGCAACTCAGCAGCGGGGTCTCGGAGATCCGGCACACTGCGGACCGCT
GGCGCGTGTCCCTGGATGTCAACCACTTCGCCCCGGACGAGCTGACGGTCAAGACCAAGGATGGCGTGGT
GGAGATCACCGGTGAGCCCCCCTGCTCCTGCAGGGGAGAGGAGGAGGCTAGCAGGGCGGGCAGGGCCGGG
GGCGTGCGGTTGAAACGGGGGTCCCGGGGGCCTGGGGAGTTAAACGTTGGCCCAGCACCGGGAAAAACAG
GACTCCTGATTCCCTTGCTCAGGAATTGGGAGTGCGGGTCGCTTCTAAGGGCGCTTTCTGCTCTGTAATC
CCAGCGCTTTGGGAGGCCGAGACGGGAGGATCGCTTGAGGCCAGGAGTTCAAGACTAGCCTGGGCAACAT
AGCGAGACGCGCCCCCCCGCCCCGACCCCGCGCCATTACAAAAAAAAAGCAAACAAAAATTTTTTTAAAG
ATCATCGATGAAGAGAGAAAATGCGCTTTTCTACAGAGTCCCCTTCCCACCCACAGCCCCATCCCCAGAT
AAGCGGGGAGTTCCCTGGCGCGGTGCCAGTTTCTAGCCGCTGAGTGGGCGTGTGCGCGGCTCCAAGTGCG
CCTGCGTACTGCTCACTCCCCAGCTCCGCGCCCTGCTCCGTTCCTCCCAAAACTCTGAATCGAAGAACTT
TCCGGAAGTTTCTGAGAGCCCAGACCGGCGGGCACGCCCCCATCCCCAACCCCCTCTGTTAATCCCTACC
AGCCTGCAGTCCTGGCTGCTTCCAAGCAGGAGGTGGGGCCTCTGGCCTAGCGGGGCCGAAAGGCAGTCCC
CTCCCCCGCAGTCTGATTTCCCTCTTCCCCCCAAAGGCAAGCACGAGGAGCGGCAGGACGAGCATGGCTA
CATCTCCCGGTGCTTCACGCGGAAATACACGTGAGTCCTGGCGCCAGGTCGGGGTGGGTGGGTGGCGTGG
GGGTGGGGTCAGGGAAGAGGGCACAGGGACCCACCCGGTGTGTAATGTAACGCTTGCCTTTCCTCTCTGC
ACGTCCAGGCTGCCCCCCGGTGTGGACCCCACCCAAGTTTCCTCCTCCCTGTCCCCTGAGGGCACACTGA

CCGTGGAGGCCCCCATGCCCAAGCTAGCCACGCAGTCCAACGAGATCACCATCCCAGTCACCTTCGAGTC
GCGGGCCCAGCTTGGGGGCCCAGAAGCTGCAAAATCCGATGAGACTGCCGCCAAGTAAAGCCTTAGCCCG
GATGCCCACCCCTGCTGCCGCCACTGGCTGTGCCTCCCCCGCCACCTGTGTGTTCTTTTGATACATTTAT
CTTCTGTTTTTCTCAAATAAAGTTCAAAGCAACCACCTGTCACTGGCCCAGGCCCTGGTGTTTGTGGAAG
GAAGCCTCAGGCACCTGCCATTTGCTGGCTTTCAGGAGTCATCTTTGCTCAGGCCCGTGCTGGGCCATGT
GGGTACACTGGTGTAGGTTGCTGGACACAGGCTGACTCACATCCATAAAGACAGAGGTCTTAGGGCCGGG
CGCAGTGGCTCATACCTACAATCCCAGCACTTTGGGGGGTTGAAGCAGGAGGAGTGCTTGAAGCCAAGAG
TTCTAGACCAGCCTGGACAACA

Sequence 3
AAACTTTCTGCGTCCGCCATCCTGTAGGAAGGATTTGTACACTTTAAACTCCCTCCCTGGTCTGAGTCCC
ACACTCTCACCACCCAGCACCTTCAGGAGCTGACCCTTAACAGCTTCACCCACAGGGACCCCGAAGTTGC
GTCGCCTCCGCAACAGTGTCAATAGCAGCACCAGCACTTCCCCACACCCTCCCCCTCAGGAATCCGTACT
CTCTAGCGAACCCCAGAAACCTCTGGAGAGTTCTGGACAAGGGCGGAACCCACAACTCCGATTACTCAAG
GGAGGCGGGGAAGCTCCACCAGACGCGAAACTGCTGGAAGATTCCTGGCCCCAAGGCCTCCTCCGGCTCG
CTGATTGGCCCAGCGGAGAGTGGGCGGGGCCGGTGAAGACTCCTTAAAGGCGCAGGGCGGCGAGCAGGGC
ACCAGACGCTGACAGCTACTCAGAATCAAATCTGGTTCCATCCAGAGACAAGCGAAGACAAGAGAAGCAG
AGCGAGCGGCGCGTTCCCGATCCTCGGCCAGGACCAGCCTTCCCCAGAGCATCCACGCCGCGGAGCGCAA
CCTTCCCAGGAGCATCCCTGCCGCGGAGCGCAACTTTCCCCGGAGCATCCACGCCGCGGAGCGCAGCCTT
CCAGAAGCAGAGCGCGGCGCCATGGCCAAGAACACGGCGATCGGCATCGACCTGGGCACCACCTACTCGT
GCGTGGGCGTGTTCCAGCACGGCAAGGTGGAGATCATCGCCAACGACCAGGGCAACCGCACGACCCCCAG
CTACGTGGCCTTCACCGACACCGAGCGCCTCATCGGGGACGCCGCCAAGAACCAGGTGGCGCTGAACCCG
CAGAACACCGTGTTCGACGCGAAGCGGCTGATCGGCCGCAAGTTCGGCGATGCGGTGGTGCAGTCCGACA
TGAAGCACTGGCCCTTCCAGGTGGTGAACGACGGCGACAAGCCCAAGGTGCAGGTGAACTACAAGGGCGA
GAGCCGGTCGTTCTTCCCGGAGGAGATCTCGTCCATGGTGCTGACGAAGATGAAGGAGATCGCTGAGGCG
TACCTGGGCCACCCGGTGACCAACGCGGTGATCACGGTGCCCGCCTACTTCAACGACTCTCAGCGGCAGG
CCACCAAGGACGCGGGCGTGATCGCCGGTCTAAACGTGCTGCGGATCATCAACGAGCCCACGGCGGCCGC
CATCGCCTACGGGCTGGACCGGACCGGCAAGGGCGAGCGCAACGTGCTCATCTTCGACCTGGGGGGCGGC
ACGTTCGACGTGTCCATCCTGACGATCGACGACGGCATCTTCGAGGTGAAGGCCACGGCGGGCGACACGC
ACCTGGGAGGGGAGGACTTCGACAACCGGCTGGTGAGCCACTTCGTGGAGGAGTTCAAGAGGAAGCACAA
GAAGGACATCAGCCAGAACAAGCGCGCGGTGCGGCGGCTGCGCACGGCGTGTGAGAGGGCCAAGAGGACG
CTGTCGTCCAGCACCCAGGCCAGCCTGGAGATCGACTCTCTGTTCGAGGGCATCGACTTCTACACATCCA
TCACGCGGGCGCGGTTCGAAGAGCTGTGCTCGGACCTGTTCCGCGGCACGCTGGAGCCCGTGGAGAAGGC
CCTGCGCGACGCCAAGATGGACAAGGCGCAGATCCACGACCTGGTGCTGGTGGGCGGCTCGACGCGCATC
CCCAAGGTGCAGAAGCTGCTGCAGGACTTCTTCAACGGGCGCGACCTGAACAAGAGCATCAACCCGGACG
AGGCGGTGGCCTACGGGGCGGCGGTGCAGGCGGCCATCCTGATGGGGGACAAGTCGGAGAACGTGCAGGA
CCTGCTGCTGCTGGACGTGGCGCCGCTGTCGCTGGGCCTGGAGACTGCGGGCGGCGTGATGACGGCGCTC
ATCAAGCGCAACTCCACCATCCCCACCAAGCAGACGCAGACCTTCACCACCTACTCGGACAACCAGCCCG
GGGTGCTGATCCAGGTGTACGAGGGCGAGAGGGCCATGACGCGCGACAACAACCTGCTGGGGCGCTTCGA
GCTGAGCGGCATCCCGCCGGCGCCCAGGGGCGTGCCGCAGATCGAGGTGACCTTCGACATCGACGCCAAC
GGCATCCTGAACGTCACGGCCACCGACAAGAGCACCGGCAAGGCCAACAAGATCACCATCACCAACGACA
AGGGCCGCCTGAGCAAGGAGGAGATCGAGCGCATGGTGCAGGAGGCCGAGCGCTACAAGGCCGAGGACGA
GGTGCAGCGCGACAGGGTGGCCGCCAAGAACGCGCTCGAGTCCTATGCCTTCAACATGAAGAGCGCCGTG
GAGGACGAGGGTCTCAAGGGCAAGCTCAGCGAGGCTGACAAGAAGAAGGTGCTGGACAAGTGCCAGGAGG
TCATCTCCTGGCTGGACTCCAACACGCTGGCCGACAAGGAGGAGTTCGTGCACAAGCGGGAGGAGCTGGA
GCGGGTGTGCAGCCCCATCATCAGTGGGCTGTACCAGGGTGCGGGTGCTCCTGGGGCTGGGGGCTTCGGG
GCCCAGGCGCCGCCGAAAGGAGCCTCTGGCTCAGGACCCACCATCGAGGAGGTGGATTAGAGGCCTCTGC
TGGCTCTCCCGGTGTGGTCTAGAAAACAGACTCTTTGCACTTGATAGCTGCTTGGGCACCGATTACTGTC
AAGGTTATTTAAAGTCTTCTTCATGGTTCAGTTTAAAGTTACAGTCTTTCTTAAGGTAATTGCGTTGACT
GTTAAATTTTGTATGCATATATATATATATATATATATATATATATATATATTCAAATATATTCAAAGTA
ATGTTGGGAGCAGCACTGTGCACTGTACCAGGGGATTATGTTTTATAGCTAATGATGTGTAAAGTCTAAA
GATTTTTTTGTAATTTTTATATCAGTGTTCCAGTAGCCTGGGAAGACATATAGTCTAGCTGCCCAGTTCC
CTGGAGATGGTCATCTCTAAGACAAAGTGTCTTAAACAAACGTCTTGGCACTGTGTACTACATAACTTTA
CTCTTTTGTACTTAAAACTTTATCTGCTTGTCCATGTTAAGGTTTTGTGGTATAACCAGTATGTTCTTTG
CATTTAATCTAAGTAGGTTAAAGATGGTGTATCCTTCCTGCATACATGTCTACACTGCCACCCTGTGTAC
ATTTTTTTCTTTGCATCACTACAAACTAATGAAAAAAACTTTTATGACTTAAATATTCAAAATAAAAGGT
TACAAGTATATTTTGTCTGTTTGTATGTTGGAAGGGCTAATGGATTCTGGGCTTCTGTGGATTTCTTAAG
TTTTTTTTAAGATTTATTATTATATGTGAACACATTGTAGCTATCTTCAGACACACCAGAAAAGGGCATC
AGATCTCATTACAGATGGGTGTGAGCCACCATGTGGTTCCTGGGATTTGAACTCAGGACCTTCGGAAGAG

TAGTCAGTGCTCTTAACTGCTGAGCTGTCTCTCCAGCCCCCGGATTTCTTAGTTTTGTGATAACTGGAAA
AGGGATTTTTTTTTGGTGGATTTTCAGTGCAGTTATGCAGGGAGTACAGGTATTTACTTTGAGGGTCGGG
CTCATTCATGGGAAAAAGTAGGGTGGTGCTGTTGTTTGGGGTCAGTGGAAGAGGGACTCAAGGGGCTATG
AGAGCTCAGCTC

Exercise 5: alignment exercises

Using the sequences below, answer the questions:
>seq 1
1 tgtgttcact agcaacctca aacagacacc atggtgcacc tgactcctga ggagaagtct
61 gccgttactg ccctgtgggg caaggtgaac gtggatgaag ttggtggtga ggccctgggc
121 aggctgctgg tggtctaccc ttggacccag aggttccttg agtcctttgg ggatctgtcc
181 actcctgatg ctgttatggg caaccctaag gtgaaggctc atggcaagaa agtgctcggt
241 gcctttagtg atggcctggc tcacctggac aacctcaagg gcacctttgc cacactgagt
301 gagctgcact gtgacaagct gcacgtggat cctgagaact tcaggctcct gggcaacgtg
361 ctggtctgtg tgctggccca tcactttggc aaagaattca ccccaccagt gcaggctgcc
421 tatcagaaag tggtggctgg tgtggctaat gccctggccc acaagtatca ctaagctcgc
481 tttcttgctg tccaatttct attaaaggtt cctttgttcc ctaagtccaa ctactaaact
541 gggggatatt atgaagggcc tt
>seq 2
1 aacgtggatg aagttggtgg tgaggccctg ggcaggctgc tggtggtcta cccttggacc
61 cagaggttct ttgagtcctt tggggatctg tccactcctg atgctgttat gggcaaccct
121 aaggtgaagg ctcatggcaa gaaagtgctc ggtgccttta gtgatggcct ggctcacctg
181 gacaacctca agggcacctt tgccacactg agtgagctgc actgtgacaa gctgcacgtg
241 gatcctgaga acttcaggct cctgggcaac gtgctggtct gtgtgctgga ccatcacttt
301 ggcaaagaat tcaccccacc agtgcaggct gcctatcaga aagtggtggc tggtgtggct
361 aatgccctgg cccacaagta tcactaagct cgctttcttg ctgtccaatt tctattaaag
421 gttcctttgt tccctaagtc caactactaa actgggggat attatgaagg gccttgagca
481 tctggatt

1. Align the sequences above.
2. What type of alignment did you perform?
3. At which positions do the two sequences differ?
4. Obtain the protein sequence of the genes.
5. Are there differences in the protein-coding region?
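
For readers who want to see what a pairwise alignment program does with two sequences like these, here is a minimal Smith-Waterman (local alignment) sketch in Python. It is not the method the exercise prescribes: the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions, and `smith_waterman` and `differences` are names invented for this example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, start_a, start_b, aligned_a, aligned_b), 0-based starts."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, i, j, "".join(reversed(top)), "".join(reversed(bottom))


def differences(start_a, start_b, top, bottom):
    """List (pos_a, pos_b, base_a, base_b), 1-based, for mismatches and gaps."""
    pos_a, pos_b, found = start_a, start_b, []
    for x, y in zip(top, bottom):
        pos_a += x != "-"
        pos_b += y != "-"
        if x != y:
            found.append((pos_a, pos_b, x, y))
    return found


score, sa, sb, top, bottom = smith_waterman("AAACGTACGTTT", "GGCGTTCGAA")
print(score, top, bottom, differences(sa, sb, top, bottom))
# 9 CGTACG CGTTCG [(7, 6, 'A', 'T')]
```

The score matrix takes memory proportional to the product of the two lengths, which is fine for sequences of a few hundred bases such as seq 1 and seq 2.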

Exercise 6:
Using the sequence below:
ctactggtacttcgatctctggggccgtggcaccctggtcactgtctcctcagagtcttctctgtccaggcacc
1. Identify the gene and the organism it belongs to.
2. Obtain the complete sequence of the gene from NCBI.
3. Align the two sequences using the CLUSTALW program and check whether the
alignment contains mismatches and gaps.
4. Is there any change in the protein-coding region?

Exercise 7:
Define the following items and give examples of their application:
a) Sequence comparison;
b) Multiple alignment;
c) Primer;
d) Boolean search;
e) Metabolic pathway;
f) Patent;
g) Domain.

Exercise 8:
Carry out the following steps and record (print screen) the results of each
step marked with *:
a) Choose a gene encoding 100 or more amino acids (aa) and include its
sequence in FASTA format in your answer;
b) Run BlastP (NCBI BLAST) and record: the search configuration screen; the
list of results (after the graphic); and the second and third alignments;
c) Briefly interpret the result (ignoring the first alignment), confirming or
not the identity of the protein and justifying your answer;
d) Run a multiple alignment with seven (7) closely related proteins from
different organisms, and record the alignment result. Comment on the result;
e) Design a primer pair to amplify a region of 190 to 238 bp in the gene for
this protein, and record the tool configuration and the information on the
chosen primers;
f) Check whether there are patents related to this protein, and describe:
g) Details of the search performed;
h) Number of patents found;
i) Briefly comment on one of these patents.
j) Identify the EC number of this protein;
k) Classify the protein according to the Gene Ontology ontologies.
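
For item e), a quick sanity check on a candidate primer pair can be scripted. The sketch below is an assumption-laden illustration, not a replacement for a primer design tool: it uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C) as a rough melting temperature estimate, requires exact primer matches, and the names `wallace_tm` and `amplicon_length` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough melting temperature in degrees C (Wallace rule, short oligos)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def amplicon_length(template, forward, reverse):
    """Return the product size in bp, or None if a primer site is not found."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    site = template.find(reverse_complement(reverse), start + len(forward))
    if site == -1:
        return None
    return site + len(reverse) - start


template = "AAAAATGCGT" + "C" * 20 + "GGATCATTTT"
size = amplicon_length(template, "ATGCGT", "TGATCC")
print(size, wallace_tm("ATGCGT"))  # 32 18
print(size is not None and 190 <= size <= 238)  # False: outside the target range
```

The final line mirrors the 190 to 238 bp requirement of the exercise; a real primer pair would also be checked for similar melting temperatures and for self-complementarity.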

Exercise 9:
The sequence below is given in two versions: the first unedited (containing
introns and exons) and the second edited (spliced). Using one of these
sequences, indicate:
a) Where the gene begins and ends.
b) After the identification made in item a, also indicate the position of the
complete gene in the attached 79,629 bp BAC clone (e.g. +50 bp to +350 bp, or
-1800 bp to -1350 bp).
c) The orientation of the gene: + strand or - strand?
d) Mark the introns and exons in the complete gene.
e) Using the BAC clone and the gene location found in item b, identify and
select the 2 kb region upstream of the gene. What is the function of this 2 kb
upstream region?
f) From the selected upstream region, assemble the complete sequence spanning
the 2 kb region and the gene. Assemble it with both the complete gene and the
edited gene.
g) Identify whether there are promoter regions in each of the sequences
(gene + upstream). If there are, indicate their positions in the sequence (use
colors).
h) Given the two gene sequences, if we had to design 1 primer pair for use in
real-time PCR, what would be the best strategy for designing these primers?
Justify your answer. And how can you tell whether the cDNA used in the
real-time PCR is contaminated with gDNA?
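
Item e) comes down to slicing the BAC sequence once the gene coordinates and strand are known. The sketch below shows one way to do it, under stated assumptions: coordinates are 0-based and half-open on the + strand of the BAC, and `upstream_region` is a name invented for this example. The BAC clone itself is an attachment and is not included here.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(bac, gene_start, gene_end, strand, size=2000):
    """Return up to size bases upstream of the gene, read 5' to 3'.

    gene_start and gene_end are 0-based, half-open coordinates on the
    + strand of the BAC (an assumption of this sketch).
    """
    if strand == "+":
        return bac[max(0, gene_start - size):gene_start]
    if strand == "-":
        # For a - strand gene, "upstream" lies to the right on the + strand.
        return reverse_complement(bac[gene_end:gene_end + size])
    raise ValueError("strand must be '+' or '-'")


toy_bac = "AAAACCCCGGGGTTTTACGT"
print(upstream_region(toy_bac, 8, 12, "+", size=4))  # CCCC
print(upstream_region(toy_bac, 8, 12, "-", size=4))  # AAAA
```

Concatenating the returned region with the gene sequence (complete or edited) gives the assembled sequences asked for in item f).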
> Complete gene
CGTGGCTGATGTTGCCAGTTGAAATCGGTAAAGCAACTGCATCGCTGTATCCTGACGAGAATAATGATAC
ATCTCAAAATAATGTTATAAGTGCATTAGCCCCTCCGAAGGATGTGGATGATCTGCGGCTGATATCTGGA
TATGGAAATGTCAATATATTCACGTATAGTGAATTGAGAGCTGCTACCAAGAATTTCCGGCCAGATCAGG
TTCTTGGAGAGGGTGGCTTTGGGGTTGTATATAAAGGTGTTATTGATGAGAGTGTCAGGCCAGGTTCTGA
AACCATCCAAGTTGCTGTGAAAGAGCTAAAGTCAGATGGCTTGCAGGGAGACAAAGAGTGGCTGGTAATT
TCTCCCTTGTACTTTTTCAATATTTCTAAGCCAAATTTGTGATATTTTTTCTATCAGCATGAAATTTTCT
TCACTGCACCATGTATGAGTTGGCAATTGTAGTGAAAAACAAGTTGTCTGAAAAATGTTGTGGAGATTAC
ACCTAAGGTCTTATCTATTTCATCCTCTTGTATGTAGTTGAGTTAATGCTGTAAACACCCTCAACTTATA
TAAAATGAAGTCACCTACACAACAAACGGTTGCATTCTGCAGATCATGATTATGTTATTTAATGGCAATA
TCACTGATTGTATAAGTTGGAAACACTTTTATTGTTCTTTGGGAAATATATATTACATATTTGCTCTTCT
TTGACTTTAGGCTTAAAGGGATGTTCATCCCTAAAAGCAGATATCTAAAGATGTATGGGAATATGAAATT
TCTTCAAGTAACCACGGTATCTTCCAAGATGTAATTTATGTTACCTTATTTAGATATCTTATGGAAATCC
AGAGAGGCATTACTGTCAAGAGAAATCGTTGTCGGTCCCCCTTTGGCTCCATCTCAACTTAAAATTCAGG
AGTACATGACATGTGTAGTAAACTAATACCAAGTTTTGTCTTGGTCAGCTCCTCTATTATAAGATATATC
CATATATCAGTTTATTTGGAAGTCCAAGATGCTTAGTAACCTGTGACAAAACCCTATAATAGAAATTTGG
ACAGCCATATGATTACCAAAGATGGATTCCTTTTCCAATACCTCTGTGATCCATCTAGAATACTAATTAA
ATGAGTTACTTTTTGGGTTTTTCAGGCAGAAGTAAACTATCTGGGGCAACTTAGTCATCCCAATCTTGTC
AAACTTATTGGTTACTGCTGTGAAGGTGATCACAGGCTGCTAGTTTATGAGTATATGGCTTCTGGCAGCC
TTGACAAGCACCTATTCCGACGTAAGTACTTCTCTGACACGAATATCATTACAATACTTAACTAGTTATT
GTATAAAAATATTAATATCTGAGTTTGTGGAATTTAGTAAATGGTCATTTGTGGCCATCTATCCATTATA
ACTTCATTTTTCGTTTTTGACAACAGATAAATAATTTTTTAATCTACTATGAAGAAGGAAGACTTTATTA
CTAGAAACTAGATATTGCTGGGATATATATACCTTTTGTATATATGCATGGATTTGTTTATTATGCCTTT
GTATAACAAGAGCAATTAATTGTAACTACTAACTATCTTCATCATATCATTCCTTCCTGCCACTGGGCTC
TGTTAAGAGCAATGTTAGTAGTTTATAGTTGTTTATGGACTCTTTCCCCTGGAACCACTAATCTGGAATA
ACCTACTTATACATCTAGATATCTTTATGTATGAAAGAATTATGAAAAAGATCTCATCTATCCTCAAGTT
TTTTCATATATTCAAATAAATTTTATCTTATATTTTTTAATAATAACATAAATCTATCTTGTTATCCTCA
TATGTGGCTAACAATTAATAATAATTAATCCTTTATCCTTCCTAGTTACTAAGTGACAAGTAAACCGGTG
GGTAAGTAGTAAGCTAGGGGCAAAATTAGTAAGTATTGGGAATTGTACATGATGAGTTGTTGCCTTTTGC
TTATGAGCACAATTGCCCCTATCTAATTTCCCATATAATAAACATAAGTTTCTAAAATATACAATGATTA
AGTATAAGACAATACACTAAAAACAACTTTGCAACTTGGTAACTTTCTATATTCATTGTGTGTGTCTGAA
ATATTTCTACCTTCTTGACAGGGGTTTGTCTTACAATGCCATGGTCTACTCGAATGAAAATTGCCCTTGG
TGCTGCAAAAGGACTAGCCTTTCTTCATGCAGCTGAAAGATCAATCATCTATCGTGACTTCAAGACATCA
AATATCTTACTGGATGAAGTTTGTATCTCTTCCTGCACCATTGGACTGGATATTTAAATTCCCTAGAACT
ATCTCCATTATCATTATCATATTTAGCAAATACAACTCGGTTACAGATTATTAAGTCCCTATATATATTT
GTAGTCCATAAATTTTATGGCCATAACATTGGCAAAATTAGATAAATTTGTTATATGGTTAAGAGCACAA
TTACTTACATAAAAAAGATTTCATTCAGTGTGAACTCATCAATTTCTAAATAATGAGTCTGTTCCATTAA
AAAAAAATGCATACTTATTATTTGAAAAAGAAAATTGCAAATGTCCAGTATGTGAGCAACAAAAGTGGTT
ACTGAATCAATGAAAACAAGTAACTAAGGAACTCCATCGTATAATAATATTAAGGATACCCTTTTGAAGC
ATGCCCATACTGTGAAAGGTCATTTATATGTTTTTCATAACCTGAAAATATAGAAATCATGAAAACATAG
TTGTTTCTCATCAATTGTCTAATTCTTGGATTCACTTTGCAGGATTACAATGCAAAGCTCTCAGACTTTG
GCCTTGCAAAAGAGGGGCCTACAGGTGACCAAACTCACGTTTCCACTCGGGTCGTTGGTACATATGGATA
TGCAGCTCCCGAGTATATAATGACTGGTGAGCTTCTCAAACCACAATCCCAACTTTATGCAAAAGGGAGT
GCAGATAATTAAGCGCTACTAGCCACTTTCTATCGAGTGTCTCAAATGTGTAGCGATTACATGTCGTTTT
GTATATTTCTCTAGATGCCTTGCAGTAGGTGACTCTTTCTGGTTCTCTATTCCTTTTCTCTAAAGAAAAC
CCATGTTTCTGAGCAACTTTACCTCTATTTTAGGGCGTATAGCTTAAATTAATAACTTCTTTCATTTATT
CCAGGCCATTTAACTGCAAGGAGTGATGTTTACGGATTTGGAGTTGTATTGCTGGAGATGCTTTTAGGGA
GAAGGGCAATGGACAAGAGCAGGCCCAGCAGACACCAGAACCTCGTTGAGTGGGCTCGACCACTCCTGAT
CAATGGTCGGAAGTTGCTAAAGATCTTGGATCCAAGAATGGAAGGGCAATATTCTAACAGAGTTGCAACA
GATGTGGCTAGTTTAGCATATCGATGCCTGAGCCAGAACCCGAAAGGGAGGCCAACAATGAACCAAGTAG

TCGAGTCGCTTGAGAGCCTTCAAGACCTGCCTGAGAACTGGGAAGGCATCCTGTTTCAGAGCAGTGAAGC
TGCTGTGACTCTCTATGAGGCTCCAAAAGAGATTGCGAGTGACCATTTAGAAAAGAACTCCAGCGAGAAT
GGAGAGAATGGATCAAATGTGCATGCCAAGGGAAGAAAGAAGCTTGGAAATGGCAGAAGCAACAGCGAGC
CACCGCCGGTGGAGTTCAGTCAGTACAGTCCTTCACCTGAGTCAGAGAGACATGAGCCAAGTAGAAGATC
AATCGATCATGACAGAATTCCAAGGCCACCTGCCTATTGACGTGGCTG

> Edited gene


CGTGGCTGATGTTGCCAGTTGAAATCGGTAAAGCAACTGCATCGCTGTATCCTGACGAGAATAATGATAC
ATCTCAAAATAATGTTATAAGTGCATTAGCCCCTCCGAAGGATGTGGATGATCTGCGGCTGATATCTGGA
TATGGAAATGTCAATATATTCACGTATAGTGAATTGAGAGCTGCTACCAAGAATTTCCGGCCAGATCAGG
TTCTTGGAGAGGGTGGCTTTGGGGTTGTATATAAAGGTGTTATTGATGAGAGTGTCAGGCCAGGTTCTGA
AACCATCCAAGTTGCTGTGAAAGAGCTAAAGTCAGATGGCTTGCAGGGAGACAAAGAGTGGCTGGCAGAA
GTAAACTATCTGGGGCAACTTAGTCATCCCAATCTTGTCAAACTTATTGGTTACTGCTGTGAAGGTGATC
ACAGGCTGCTAGTTTATGAGTATATGGCTTCTGGCAGCCTTGACAAGCACCTATTCCGACGGGTTTGTCT
TACAATGCCATGGTCTACTCGAATGAAAATTGCCCTTGGTGCTGCAAAAGGACTAGCCTTTCTTCATGCA
GCTGAAAGATCAATCATCTATCGTGACTTCAAGACATCAAATATCTTACTGGATGAAGATTACAATGCAA
AGCTCTCAGACTTTGGCCTTGCAAAAGAGGGGCCTACAGGTGACCAAACTCACGTTTCCACTCGGGTCGT
TGGTACATATGGATATGCAGCTCCCGAGTATATAATGACTGGCCATTTAACTGCAAGGAGTGATGTTTAC
GGATTTGGAGTTGTATTGCTGGAGATGCTTTTAGGGAGAAGGGCAATGGACAAGAGCAGGCCCAGCAGAC
ACCAGAACCTCGTTGAGTGGGCTCGACCACTCCTGATCAATGGTCGGAAGTTGCTAAAGATCTTGGATCC
AAGAATGGAAGGGCAATATTCTAACAGAGTTGCAACAGATGTGGCTAGTTTAGCATATCGATGCCTGAGC
CAGAACCCGAAAGGGAGGCCAACAATGAACCAAGTAGTCGAGTCGCTTGAGAGCCTTCAAGACCTGCCTG
AGAACTGGGAAGGCATCCTGTTTCAGAGCAGTGAAGCTGCTGTGACTCTCTATGAGGCTCCAAAAGAGAT
TGCGAGTGACCATTTAGAAAAGAACTCCAGCGAGAATGGAGAGAATGGATCAAATGTGCATGCCAAGGGA
AGAAAGAAGCTTGGAAATGGCAGAAGCAACAGCGAGCCACCGCCGGTGGAGTTCAGTCAGTACAGTCCTT
CACCTGAGTCAGAGAGACATGAGCCAAGTAGAAGATCAATCGATCATGACAGAATTCCAAGGCCACCTGC
CTATTGACGTGGCTG
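
One way to approach item d) is to map the edited sequence back onto the complete gene: stretches that match are exons, and whatever lies between them is intron. The Python below is a greedy sketch under simplifying assumptions (the two sequences are identical within exons, and each exon is at least as long as the seed); `map_exons` and the `seed` parameter are invented for this example.

```python
def map_exons(genomic, spliced, seed=12):
    """Return exon blocks as (start, end), 0-based half-open, on genomic."""
    genomic, spliced = genomic.upper(), spliced.upper()
    exons = []
    g = s = 0
    while s < len(spliced):
        # Locate the next exon by searching for the next seed bases.
        g = genomic.find(spliced[s:s + seed], g)
        if g == -1:
            raise ValueError(f"no match for spliced position {s}")
        start = g
        # Extend the match base by base until the sequences diverge.
        while s < len(spliced) and g < len(genomic) and genomic[g] == spliced[s]:
            g += 1
            s += 1
        exons.append((start, g))
    return exons


exon1, intron, exon2 = "ATGGCCATTG", "GTAAGTTTTTCAG", "CCGATTACAT"
blocks = map_exons(exon1 + intron + exon2, exon1 + exon2, seed=6)
print(blocks)  # [(0, 10), (23, 33)]
introns = [(a_end, b_start) for (_, a_end), (b_start, _) in zip(blocks, blocks[1:])]
print(introns)  # [(10, 23)]
```

Because the extension is greedy, a boundary can land a base or two away from the true splice site whenever the first intron base happens to equal the first base of the next exon, so it is worth checking each predicted intron against the usual GT...AG pattern.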
