Bioinfo 2

www.geocities.com/chinna_chetan05/forfriends.html
BIOINFORMATICS
ABSTRACT:
The emerging field of bioinformatics uses information storage, data analysis and large-scale cluster
computing to deal with the mass of information on decoded genes. To process data on this scale,
advanced computational methods such as neural networks and artificial intelligence are used to
assist in storing and analyzing the data generated from DNA sequencing.
This paper starts with the sub-disciplines of bioinformatics and the interaction among scientists on
the web. Various tools are used to analyse DNA and protein sequences, and the information is placed
in centralized databases so that scientists throughout the world can access it easily. The subsequent
sections explain how to store this information and how to analyze DNA on a computer.
Databases are maintained to store the information produced by projects that aim to determine
entire DNA sequences. The first step is getting the DNA sequence right, for example for crops and
livestock. Once the DNA sequence has been determined, the next step is to find out what these
genes code for, which is what scientists are looking for. The step after that is to compare DNA
sequences to determine how closely different species are related on an evolutionary scale.
Finally, the last sections of this paper describe how different projects are used for analyzing and
comparing DNA sequences and for writing new code to represent DNA, and they note the languages
in which these projects are implemented.
Email: chinna_chetan05@yahoo.com
www.geocities.com/chinna_chetan05/forfriends.html
Introduction:
New discoveries are being made in the field of genomics, an area of study which looks at the
DNA sequence of an organism in order to determine which genes code for beneficial traits and
which genes are involved in inherited diseases. With an increasing amount of information generated
in this area of study, scientists need a way of storing and analyzing that information. Computers can
really help in this process. As a result, a new research area that combines the study of biotechnology
and the use of computers is emerging. This field is referred to as bioinformatics and involves the use
of Internet tools, artificial intelligence and other advanced computational methods to assist in storing
and analyzing data generated from DNA sequencing.
Key usage areas
Bioinformatics is the field of science in which biology, computer science, and IT combine to form a
single discipline. The goal is to discover new biological insights and create a worldwide perspective
from which unifying principles in biology can be found. Bioinformatics has three important
sub-disciplines:
• The development of new algorithms and statistics to assess relationships among members of large
data sets;
• The analysis and interpretation of various types of data including nucleotide and amino acid
sequences, protein domains and protein structures; and
• The development and implementation of tools that enable efficient access and management of
different types of information.
Interaction among scientists on the web

Scientists around the world need to share information on genome research. The Internet
links hundreds of thousands of individual networks all over the world. E-mail
is a powerful Internet tool, alongside newsgroups, online chatting and the
World Wide Web (WWW). An increasing amount of bioinformatics information
is being published on the Internet. Many of the tools used to analyse DNA
and protein sequences are also hosted on the Internet for easy access by
scientists. Computer technology and genome research have both grown rapidly
over the past decade. It is expected that information technology will continue
to provide rapid advances that make genome research more efficient, leading
to better methods of diagnosing diseases, identifying beneficial traits and
providing cures for crop, animal and human diseases.
How to store information

DNA is a molecule made from sugar, phosphate and four bases: Guanine (G), Cytosine (C),
Adenine (A) and Thymine (T). The various combinations of these bases make up the DNA in
plants, animals, bacteria, yeast and fungi. Using these bases as building blocks, we can arrange
them in many different ways, for example AAGCT, CCAGT, TACGGT and so on. Because a
sequence can be of any length, the number of possible combinations is unlimited. A database
system is a computerized record-keeping system. As the amount of information increases, new
database systems continue to be developed. Scientists are currently trying to determine the
entire DNA sequence of various living beings. Perhaps one of the best-known projects is the
“HUMAN GENOME PROJECT”. Computers greatly assist in managing and storing all the
information this project has unearthed.
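As an illustration of storing sequences in a database, the following is a minimal sketch in Python using the standard sqlite3 module. The table layout and the function names (create_store, add_sequence, get_sequence, count_possible) are assumptions made for this example and are not taken from any genome project. The last function shows why the number of combinations grows so quickly: there are 4 to the power n possible sequences of length n.

```python
import sqlite3

# The four DNA bases named in the text.
VALID_BASES = set("ACGT")


def create_store(path=":memory:"):
    """Open a database (in memory by default) with one hypothetical table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        " name TEXT PRIMARY KEY,"
        " organism TEXT NOT NULL,"
        " bases TEXT NOT NULL)"
    )
    return conn


def add_sequence(conn, name, organism, bases):
    """Store one sequence after checking that it contains only A, C, G, T."""
    bases = bases.upper()
    if not bases or not set(bases) <= VALID_BASES:
        raise ValueError(f"not a DNA sequence: {bases!r}")
    conn.execute(
        "INSERT INTO sequences VALUES (?, ?, ?)", (name, organism, bases)
    )
    conn.commit()


def get_sequence(conn, name):
    """Return the stored bases for a name, or None if it is not present."""
    row = conn.execute(
        "SELECT bases FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


def count_possible(length):
    """Number of distinct sequences of a given length over four bases."""
    return 4 ** length
```

For example, after add_sequence(conn, "ex1", "example", "AAGCT"), calling get_sequence(conn, "ex1") returns the stored string. A real sequence database would also need indexing and annotation fields, which this sketch leaves out.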
Analysis of DNA on a computer:

The first step in a genomics project is getting the DNA sequence right for the crops and
livestock under study. Results generated from DNA sequencing can identify genes, regulatory
sequences and other functional elements. Genes are the units of DNA that code for particular
traits (characters). Once the DNA sequence has been determined, the next step is to find out
what these genes code for. This area of genomics is referred to as ‘functional genomics’ as it
involves determining the function of a gene. To determine the function, scientists study gene
expression, which determines when, where and how much of a protein is produced. Comparing
one DNA sequence to the sequences from closely related organisms assists these processes.
Computers can help compare DNA sequences and look for homologies, or related strands of
DNA. One can also compare DNA sequences to determine how closely two different species are
related on an evolutionary scale.
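The comparison of two DNA sequences can be sketched with two simple measures. The Python example below is only illustrative: percent identity assumes the two sequences are already aligned and of equal length, and edit distance (the Levenshtein distance) is used here as a rough stand-in for evolutionary distance. Production tools use more elaborate scoring schemes, and the function names are assumptions made for this example.

```python
def percent_identity(a, b):
    """Percentage of positions with the same base in two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def edit_distance(a, b):
    """Minimum number of substitutions, insertions and deletions
    needed to turn sequence a into sequence b (Levenshtein distance)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(
                min(
                    prev[j] + 1,             # delete x from a
                    cur[j - 1] + 1,          # insert y into a
                    prev[j - 1] + (x != y),  # substitute x with y (free if equal)
                )
            )
        prev = cur
    return prev[-1]
```

A smaller edit distance between the same gene in two species suggests a closer relationship, under the simplifying assumption that each change counts equally.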
World Wide Services Of Bioinformatics:
Once the hardware becomes available, scientists will also need the right software to help them
with their research. For a small database, much can be done on a desktop system. But as the
database grows bigger, more computing power is required, and cluster-type computers can be
more effective on larger databases. Also, the data is not stored in one place, and researchers
worldwide will want to access up-to-date data. Hence, grid-style computing is considered more
suitable. Solutions that are available include Bioperl for solving high-level tasks, EMBOSS for
compiled programs and Cray BioLab for high performance computing.
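The reason clusters help on larger databases is that a search can be split into independent pieces. The Python sketch below illustrates the idea under stated assumptions: a thread pool from the standard library stands in for the nodes of a cluster, the database is a plain list of (name, bases) pairs, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Divide the records into at most n_chunks roughly equal parts."""
    n_chunks = max(1, min(n_chunks, len(records) or 1))
    return [records[i::n_chunks] for i in range(n_chunks)]


def search_chunk(chunk, pattern):
    """Work done by one node: names of sequences containing the pattern."""
    return [name for name, bases in chunk if pattern in bases]


def distributed_search(records, pattern, n_workers=4):
    """Search every chunk independently, then merge the partial results."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(
            pool.map(search_chunk, chunks, [pattern] * len(chunks))
        )
    return sorted(name for hits in partial for name in hits)
```

Because each chunk is searched without reference to the others, the same structure carries over to separate machines, where the merge step collects the results over the network.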
The Bioperl project:

The Bioperl project is a coordinated effort to collect computational methods regularly used in
bioinformatics and life science research into a set of standard CPAN-style, well documented and
freely available Perl modules. Perl provides unparalleled support for many tasks common in
bioinformatics and life science research, and yet there are no standard Perl modules for biology. This
is what Bioperl aims to achieve. It hopes to convince people to release more of their own code
into the public domain for others to see, learn from, use and improve. Its site says, “Some
people are very embarrassed about showing their code to others for fear of being told that
there is a better way of doing it. It is fine to have kludges and make hacks, at least if they
work, and bioperl has a number of wince-making or obtuse bits of code. Eventually,
someone takes them out and fixes them. Nearly all of us have made some real big mistakes in
our time, so we’ve all been there. Peer/community review is a great way to
find bugs, optimize techniques and learn new things”. Because Perl is useful for many different
problems, it does not make sense for individuals and institutions to constantly reinvent the
wheel. This is especially critical for researchers and labs that lack the resources to hire
dedicated teams of software engineers. Bioperl modules and source code will always be
freely available under the same terms as Perl. The Bioperl project is open to all and
invites suggestions and participation from the larger bioinformatics community.
Figure: Scientists performing tests and analyzing data with the help of advanced computer technology.
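The kind of routine that shared modules spare each lab from rewriting can be illustrated with two small functions. The sketch below is written in Python rather than Perl and does not reproduce Bioperl's interface. It reads sequences in the widely used FASTA text format and computes a reverse complement, and the function names are assumptions made for this example.

```python
def parse_fasta(text):
    """Read FASTA-formatted text into a dict mapping name -> sequence.

    A line starting with '>' begins a record; its first word is the name.
    The following lines, up to the next '>', hold the sequence.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


# A pairs with T and C pairs with G on the opposite strand.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(bases):
    """Sequence of the opposite DNA strand, read in its own direction."""
    return bases.upper().translate(_COMPLEMENT)[::-1]
```

Once such functions are written, tested and published, every other group can reuse them instead of writing slightly different versions of its own.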
BioWAP Service:
BioWAP is a bioinformatics service for portable devices (such as mobile phones) with
WAP (Wireless Application Protocol) compliance. It provides access to all the major
bioinformatics databases and analysis programs. BioWAP facilitates searching for information
in all the major nucleotide and protein sequence databases, as well as studying structural
information and mutation data related to immunodeficiencies. With BioWAP it is possible
to search for general properties of sequences, user-defined patterns and restriction enzyme
recognition sites. The system has been implemented with WML (Wireless Markup Language), while
the server programs are written in Perl. BioWAP can be accessed with any WAP device. The service
is freely accessible through a WAP gateway with no service charge. Instructions on setting up your
WAP device can be found at: http://bioinf.uta.fi/biowap/gateway.html.
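The searches mentioned above, for general sequence properties and restriction enzyme recognition sites, can be sketched in a few lines of Python. This is not BioWAP's code. The three recognition sequences are the well-known sites of EcoRI, BamHI and HindIII, while the function names and the choice of GC content as the example property are assumptions made for this illustration.

```python
# Recognition sequences of three commonly used restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(bases, site):
    """Return the 0-based start positions of every occurrence of a site,
    including overlapping occurrences."""
    bases = bases.upper()
    site = site.upper()
    positions = []
    start = bases.find(site)
    while start != -1:
        positions.append(start)
        start = bases.find(site, start + 1)
    return positions


def gc_content(bases):
    """Fraction of bases that are G or C, a simple general property."""
    bases = bases.upper()
    if not bases:
        return 0.0
    return (bases.count("G") + bases.count("C")) / len(bases)
```

For a user-defined pattern, the same find_sites function can be called with any string of bases in place of an entry from RECOGNITION_SITES.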
EMBOSS:
EMBOSS stands for 'The European Molecular Biology Open Software Suite'. It is a new, free,
open source software analysis package specially developed for the needs of the molecular biology
user community (e.g., EMBnet). The software automatically copes with data in a variety of
formats and even allows transparent retrieval of sequence data from the Web. Also,
as extensive libraries are provided with the package, it is a platform that allows other scientists
to develop and release software in the true open source spirit. EMBOSS also integrates a
range of currently available packages and tools for sequence analysis into a seamless whole. It
has broken the historical trend towards commercial software packages.
The EMBOSS suite:

• Provides a comprehensive set of sequence analysis programs (approximately 100).
• Provides a set of core software libraries (AJAX and NUCLEUS).
• Integrates other publicly available packages.
• Encourages the use of EMBOSS in sequence analysis training.
• Encourages developers elsewhere to use the EMBOSS libraries.
• Supports all common Unix platforms including Linux, Digital Unix, Irix, Tru64 Unix and
Solaris.
Within EMBOSS there are around 100 programs (applications). Some of the areas covered are:
• Sequence alignment.
• Rapid database searching with sequence patterns.
• Protein motif identification, including domain analysis.
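Database searching with a sequence pattern and protein motif identification can be illustrated with a regular expression. The Python sketch below is not an EMBOSS program. It searches a small in-memory collection of protein sequences for the N-glycosylation site motif, written N-{P}-[ST]-{P} in PROSITE notation: asparagine, any residue except proline, serine or threonine, then any residue except proline. The function names and the dictionary used as a database are assumptions made for this example.

```python
import re

# N-{P}-[ST]-{P} as a regular expression. The lookahead (?=...) lets
# overlapping occurrences be reported as separate matches.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(protein, pattern=N_GLYCOSYLATION):
    """Return (0-based position, matched residues) for every occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein.upper())]


def search_database(database, pattern=N_GLYCOSYLATION):
    """Scan a dict of name -> protein sequence and keep only the hits."""
    hits = {}
    for name, protein in database.items():
        found = find_motif(protein, pattern)
        if found:
            hits[name] = found
    return hits
```

Other motifs can be searched the same way by passing a different compiled pattern, as long as it captures the match in group 1 inside a lookahead.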
BIOSCI:
BIOSCI is a set of electronic communication forums (the bionet USENET newsgroups and parallel
e-mail lists) used by biological scientists worldwide. No fees are charged for the service.
The BIOSCI site is hosted at the UK Medical Research Council's Human Genome
Mapping Project Resource Centre. BIOSCI promotes communication between professionals in
the biological sciences. There are other forums on Usenet, such as sci.bio.misc, which answer
questions from laypersons on bioinformatics.
Conclusion:
1. According to CII, the global bioinformatics industry had an estimated turnover of $2 billion in
the year 2000 and is expected to reach $60 billion by the year 2005.
2. If the bioinformatics industry and government work together, it is possible to
achieve a five per cent global market share by 2005, i.e., a $3 billion opportunity for India.
3. IT spending in bio-sciences in India will cross $138 million by 2005, mainly in the areas
of system clusters, storage, application software and services.
4. Biotech and pharmaceutical companies need huge software support. Software expertise is
required to write algorithms, develop software for existing algorithms, manage databases and
support the final process of drug discovery.
5. Major opportunity areas for IT companies in India are: refining the content and utility of
databases; developing better tools for data generation, capture and annotation;
developing and improving tools and databases for detailed functional studies; developing and
improving tools for representing and analysing sequence similarity and variation; and creating
mechanisms to support effective approaches for producing robust software that can be widely
shared.
References:
1. Students of BIOTECHNOLOGY.
2. IT (INFORMATION TECHNOLOGY) Magazine of June 2003.
