The role of genetics in medicine is growing. An increasing number of diseases are being linked
to genetic factors. However, genetic analysis is still an arcane and difficult process that requires
a great deal of specialized knowledge, which keeps this increasingly important diagnostic tool out
of the reach of many.
A single genome contains vast amounts of data. Modern genetic analysis techniques read only a
few hundred base pairs at a time (the human genome consists of around 3 billion base pairs).
These base pairs are read by comparing them to existing databases of genes, but a lack of sharing
between organizations means that the data available to any one individual or organization is
limited.
Our aim in this project is to create a universal database for genes, integrated with existing health
records. The enormous amounts of data we could harness with such a database could be used to
train artificial intelligence algorithms, using existing machine learning techniques, that could
probe into how our genes affect us in ways that we as yet do not know, as well as providing an
invaluable and cost-effective diagnostic tool for doctors everywhere.
Big data and artificial intelligence technologies have been having a moment recently. It seems
like every day something new emerges at the forefront of these fields. The massive cache of data
gathered by technology companies and the Internet at large seems to be bearing fruit. When
brainstorming ideas for this competition, one thing we told ourselves was to imagine that the
entirety of that cache of information was at our disposal. What problems could we solve with
a literal planet-full of data?
The Universal Diagnostic System (UDS) encapsulates the central promise of big data technologies.
It is a system that uses vast amounts of data to find novel solutions to existing problems and,
indeed, to find solutions to problems that we may not be aware of yet. It can be summarized by a
simple question: if you had access to the genes and medical histories of an entire city, an entire
country, or even the entire world, what could you do with them? It's an interesting question and
one that we set out to answer.

The dream of cracking the genetic code is an old one. The first genome was sequenced in 1977,
but people have dreamed of fully understanding the secrets of the DNA molecule for far longer
than that. While our modern understanding of how our genes work is sophisticated beyond the
wildest dreams of the pioneers that brought us here, our understanding remains incomplete. With
UDS we aim to advance our understanding of the genetic code, and by extension, life itself.
UDS is the result of applying the big data approach to the problem of genetic analysis, which is a
growing area of modern medicine. In recent years, an increasing number of links between diseases
and genes have been discovered. However, getting a diagnosis that takes genetic factors into
account is expensive, time-consuming, and far beyond the reach of most individuals.
Modern medicine has been, and continues to be, a boon to the entire world, yet one of the most
promising paths for improvement has been barred to all but the wealthy.
With UDS we aim not only to improve on modern genetic analysis techniques with big data but
also to democratize access to quality genetic analysis so that anyone may benefit from it.

Big data in medicine

The main objective of technology in the field of medicine is to radically change the
understanding, treatment and diagnosis of diseases. With the increasing number of data-driven
approaches in all fields, the healthcare sector has also embraced them to improve the quality and
standards of diagnosis. Using big data, one can build tailor-made health profiles and predictive
models. An individual's data won't be treated in isolation; rather, it will be compared and
analysed alongside thousands of other data sets. The data can come from Electronic Health Records
(EHRs), gene data, fitness devices or even smartwatches.
Decoding DNA
DNA sequencing has been evolving since it was first done in the 1970s. The Human Genome
Project, which was completed in 2003, paved the way for affordable and accurate genome
sequencing. In the coming years, DNA sequencing is expected to become much quicker and cheaper.
We designed UDS with this in mind: by the time this project is implemented, the cost of DNA
sequencing should be significantly lower.
Currently, Second Generation Sequencing (SGS) is in use. SGS involves establishing the
order of nucleotides in a particular strand of DNA, through which the genetic code can be
analysed. Comparing the normal and disease-causing versions of a gene allows changes
in a particular nucleotide order to be determined. These changes are responsible for causing
the disease.
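The comparison of a normal and a disease-causing version of a gene can be sketched in a few lines. This is only a minimal illustration, assuming the two sequences are already aligned and of equal length; the function name `find_substitutions` and the short example strings are hypothetical and not part of UDS.

```python
def find_substitutions(reference: str, sample: str) -> list:
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Assumes both sequences are already aligned and
    of equal length; insertions and deletions are out of scope here.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, alt_base)
        for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Short strings chosen for illustration only (hypothetical data).
normal_version = "ATGGTGCACCTGACTCCTGAGGAG"
variant_version = "ATGGTGCACCTGACTCCTGTGGAG"
print(find_substitutions(normal_version, variant_version))  # [(20, 'A', 'T')]
```

A real pipeline would first have to align reads against a reference and handle insertions and deletions, which this sketch deliberately leaves out.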
DNA sequencing involves a modified nucleotide known as a ddNTP, which is similar to a regular
nucleotide but stops DNA replication. When a ddNTP is added to a growing chain of DNA, the chain
stops growing because the ddNTP prevents further nucleotides from being attached. This results in
numerous DNA strands of different lengths. These partial copies are then loaded into a machine
with a laser that reads the fluorescent tag on each ddNTP, which reveals the order of nucleotides.
This process is known as chain-termination sequencing, and the DNA is typically copied many times
by PCR amplification beforehand. It is time-consuming because the strands of DNA that are read
can only be arranged in the correct order by trying various combinations.
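The ddNTP procedure described above can be illustrated with a toy simulation. It is a simplified sketch, not a model of real chemistry: strand direction is ignored, a copy that reaches the end of the template is treated like any other fragment, and the names `terminated_fragments`, `read_sequence`, and `stop_chance` are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminated_fragments(template, copies, stop_chance, rng):
    """Copy `template` many times; each added base may be a chain-stopping ddNTP."""
    fragments = []
    for _ in range(copies):
        strand = ""
        for base in template:
            strand += COMPLEMENT[base]
            if rng.random() < stop_chance:
                break  # a ddNTP was incorporated, so this copy stops growing
        fragments.append(strand)
    return fragments


def read_sequence(fragments):
    """Order fragments by length and read the final (tagged) base of each length."""
    last_base_by_length = {len(fragment): fragment[-1] for fragment in fragments}
    return "".join(last_base_by_length[length] for length in sorted(last_base_by_length))


template = "TACGGATTCAGC"  # made-up template strand
fragments = terminated_fragments(template, copies=2000, stop_chance=0.2, rng=random.Random(0))
print(read_sequence(fragments))
```

With enough copies, every fragment length occurs at least once, so reading the last base of each length in order recovers the complement of the template.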
Considering the drawbacks of SGS, scientists are working on how to improve on it and give
mankind Third Generation Sequencing, which won’t need the repeated PCR amplification that
SGS does; this would result in a quicker and cheaper process.

Electronic Health Records (EHR)

An Electronic Health Record is a digital version of a patient’s medical history, diagnoses, lab
tests and prescriptions. The benefits of EHRs are well known: they provide accurate information
about patients, they enable quick access to records in case of emergencies, and they facilitate
coordinated and efficient treatment. A key aspect of EHRs is that they are available only to
authorized personnel. Access to an EHR is restricted to the patient, the patient’s doctor, lab
technicians and the EHR provider.
Despite the multitude of benefits associated with EHRs, they still haven't been
adopted in many countries, and where they are available there are a large number of competing
EHR services. Because of this, it is often hard to transfer data across institutional
and national borders. A clear need for a universal standard for electronic health records has
arisen.

Our Inspiration

Modern medicine is one of the wonders of the world. Data is integral to the success of modern
medicine, from simple acts like doctors gathering patients' medical histories to large population
medical research studies. Having large amounts of high-quality data is an essential part of
progress in medicine, and access to such data can save lives. In recent years, Alphabet's Verily
Life Sciences LLC and IBM's Watson division, among others, have made much progress in applying
big data technologies to medicine. It was these attempts that inspired us to create our own
proposal.

How it works
The basic premise of UDS is to create a database that contains the genetic and medical data of as
many people as possible. Why genetic data? The genes we carry determine a large portion of our
biological lives, and a growing number of diseases and disorders are being linked to genetic
factors. Moreover, most commonly available medical data does not include genetics, despite it
being an increasingly important factor in medical diagnoses.
Once we have the data, we can use the existing knowledge to train machine-learning algorithms
to "read" the genetic data and to predict how it relates to medical conditions experienced by a
person. We expect that with a large enough data set the system could be just as good as, if not
better than, trained experts at reading genetic code. We can then set the algorithms to search
for new correlations between genes and medical conditions and symptoms. The cumulative effect of
this will be a system that can read a genetic sequence and return a diagnosis based on analysis
of millions of other examples, and it can do this far faster than any human. The system
might look something like this:
The actual system would comprise natural language processors to read and standardize
symptoms and medical histories, as well as a specialized system to link genetic sequences to
symptoms and conditions.
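The search for correlations between genes and medical conditions mentioned above can be hinted at with a very small example. This sketch is not a machine-learning model; it only compares how often a condition appears among carriers and non-carriers of a variant. The record layout, the labels `V1`, `V2`, `condition_a`, and the function name `variant_condition_rates` are all assumptions made for illustration.

```python
from itertools import product


def variant_condition_rates(records):
    """Map each (variant, condition) pair to (rate among carriers, rate among non-carriers)."""
    variants = set().union(*(r["variants"] for r in records))
    conditions = set().union(*(r["conditions"] for r in records))
    rates = {}
    for variant, condition in product(sorted(variants), sorted(conditions)):
        carriers = [r for r in records if variant in r["variants"]]
        others = [r for r in records if variant not in r["variants"]]
        carrier_rate = sum(condition in r["conditions"] for r in carriers) / len(carriers)
        # If everyone carries the variant there is no comparison group.
        other_rate = (
            sum(condition in r["conditions"] for r in others) / len(others) if others else 0.0
        )
        rates[(variant, condition)] = (carrier_rate, other_rate)
    return rates


# Hypothetical, anonymized toy records.
records = [
    {"variants": {"V1"}, "conditions": {"condition_a"}},
    {"variants": {"V1", "V2"}, "conditions": {"condition_a"}},
    {"variants": {"V2"}, "conditions": set()},
    {"variants": set(), "conditions": set()},
]
for pair, (carrier_rate, other_rate) in variant_condition_rates(records).items():
    print(pair, carrier_rate, other_rate)
```

A large gap between the two rates only flags a pair as worth investigating. Any real system would need proper statistical testing and far more data before treating such a gap as meaningful.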
UDS will not contain any identity data and will be fully encrypted. However, every person whose
data is entered will be issued a Unique Identification Number (UIN) that can be used as a
sort of key to their own genetic and medical data. With the UIN, a person can access their own
data and control who else has access to it, including their doctor and any researchers
who might want to use it. This will allow UDS to act as a universal system for health records
and to provide data for academic research. If a UIN is lost, it can be recovered by using genetic
data.
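One way to picture the UIN acting as a key is the toy in-memory store below. It is a sketch under stated assumptions, not the UDS design: encryption, authentication, and UIN recovery are omitted, possession of the UIN is simply treated as proof of ownership, and the names `RecordStore`, `enroll`, `grant`, `revoke`, and `read` are hypothetical.

```python
import secrets
from typing import Optional


class RecordStore:
    """Toy store keyed by UIN; it holds no names or other identity data."""

    def __init__(self):
        self._records = {}  # UIN -> genetic and medical data
        self._grants = {}   # UIN -> set of party IDs the owner has authorized

    def enroll(self, data: dict) -> str:
        uin = secrets.token_hex(16)  # random, so the UIN reveals nothing about the person
        self._records[uin] = data
        self._grants[uin] = set()
        return uin

    def grant(self, uin: str, party: str) -> None:
        self._grants[uin].add(party)

    def revoke(self, uin: str, party: str) -> None:
        self._grants[uin].discard(party)

    def read(self, uin: str, party: Optional[str] = None) -> dict:
        """The owner reads with the UIN alone; any named party needs an explicit grant."""
        if party is not None and party not in self._grants[uin]:
            raise PermissionError(f"{party} has no access to this record")
        return self._records[uin]


store = RecordStore()
uin = store.enroll({"variants": ["V1"], "history": ["condition_a"]})
store.grant(uin, "doctor_1")
print(store.read(uin, "doctor_1"))
```

Keeping the grants next to each record means the owner can withdraw a doctor's or researcher's access at any time with `revoke`.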
Collecting the requisite data for creating UDS will, of course, be difficult, not to mention the
enormous costs involved in extracting genetic sequences from tissue, indexing the data, and
developing the programs to process all of it. Many people may initially be reluctant
to offer up all their medical data. One possible avenue for overcoming this obstacle would be
to obtain this information from deceased individuals such as organ donors. Governmental
support would be invaluable in making this possible, though it is not strictly necessary.

Expected Impact
Medicine is a boon to every living individual on the planet. UDS helps advance medicine
meaningfully and as such affects everyone; we hope its impact on us all will be a positive one.
As a Diagnostic Tool
We expect UDS to have an enormous impact as a diagnostic tool. With access to a patient's
genes and medical history, it should be able to provide diagnoses with a high level of accuracy,
which should help curb the estimated hundreds of thousands of deaths due to misdiagnosis every
year.

As a Standardized Format for EHRs

Arguably the most important impact of UDS will be as a standard for medical records. Existing
medical records are highly fragmented and sharing them between clinics and hospitals is grossly
expensive and inefficient. UDS would streamline that entire process as anyone can share their
medical records with the doctor of their choice through the use of their UIN.

Future Applications
One thing we hope to do with UDS is to make the data available to researchers and academics,
who can hopefully use it to provide us with even more medical breakthroughs that propel
us forward as a species.
Another possible future application of UDS would be providing free diagnoses to patients who
may be unable to pay doctor's fees. A person could access UDS online, input their symptoms and
get a possible diagnosis with a risk rating, without having to pay for a consultation.
UDS is a diagnostic tool that pushes forward and increases access to an emerging form of
medicine.

It is a novel way to analyse genetic data in the context of a patient’s current symptoms and
medical history.
It creates a new standard for EHRs, one that provides significantly more value to the consumer
than existing standards.
Most importantly, it has the potential to provide millions with access to some facets of modern
medicine that have previously been unavailable.
Finally, anyone, no matter who they may be, can benefit from the Universal Diagnostic System.
Watson white paper