

• John Vanhemert - jlv@iastate.edu
  - John is developing new tools for PLEXdb and, as such, works closely with the PLEXdb database. His difficulty
    understanding the existing database structure, and his recognition of its many flaws, led him to propose a
    redesign of the database. John was our primary point of contact, providing us with initial requirements and
    continuous feedback.

• Sudhansu Dash - sdash@iastate.edu
  - Sudhansu is a curator for PLEXdb. He is the expert on the data and how users access it. He helped clarify
    which data was important and how it was linked together.

• Ethalinda Cannon - ekcannon@iastate.edu
  - Ethy was one of the original creators of PLEXdb. Although she is no longer on the PLEXdb project, she
    graciously met with us and explained some of the considerations that led to the original design. She was
    very helpful in explaining how some of the original tables were meant to join together.

• Julie Dickerson - julied@iastate.edu
  - Julie is a PI on the PLEXdb project. Julie gave the go-ahead to start our pilot project. She expressed
    approval of our ER design considerations.

 

• The clients' initial motivation in soliciting our group to work on their project included:
  - Recognition of existing problems, although the extent of the problems had not been assessed.
  - The need to store new types of information in PLEXdb, which required updates to the schema.
  - Without documentation, knowledge of the database had been lost as its designers moved on. If the
    database were allowed to grow without a clear understanding of the tables, the project risked
    introducing problems later on.
  - The clients wanted to start fresh with a clearly documented and properly designed schema.

 


     

        


        
Jesse Walsh


• MIAME
  - Minimum Information About a Microarray Experiment
  - Does not specify a particular format or terminology
• PLEXdb claims to be MIAME compliant
  - Our design should be MIAME compliant as well
  - Unfortunately, we learned about MIAME late in the design process
  - We could achieve MIAME compliance with small tweaks
MIAME elements

• The raw data for each hybridisation (e.g., CEL or GPR files)
• The final processed (normalised) data for the set of hybridisations in the experiment
  (study) (e.g., the gene expression data matrix used to draw the conclusions from the
  study)
• The essential sample annotation, including experimental factors and their values (e.g.,
  compound and dose in a dose response experiment)
• The experimental design, including sample data relationships (e.g., which raw data
  file relates to which sample, which hybridisations are technical replicates, which are
  biological replicates)
• Sufficient annotation of the array (e.g., gene identifiers, genomic coordinates, probe
  oligonucleotide sequences or reference commercial array catalog number)
• The essential laboratory and data processing protocols (e.g., what normalisation
  method has been used to obtain the final processed data)
• Biological data can be complex
• Procedures used and data collected can vary widely
  - This requires a flexible schema
  
• Time
  - 10 hrs
  - 20 hrs
• Temperature
  - 30 F
  - 50 F
• Stress
  - Control
  - Salinity
  - Drought
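
One way to get this kind of flexibility is to store factors and their levels as rows instead of columns, so that a new factor never needs a schema change. The sketch below uses Python's standard-library sqlite3 module. The table and column names (factor, factor_level, sample_level) and the helper functions are hypothetical, chosen for illustration, and are not taken from the PLEXdb schema.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE factor (
    factor_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE factor_level (
    level_id  INTEGER PRIMARY KEY,
    factor_id INTEGER NOT NULL REFERENCES factor(factor_id),
    value     TEXT NOT NULL
);
CREATE TABLE sample_level (
    sample_id INTEGER NOT NULL,
    level_id  INTEGER NOT NULL REFERENCES factor_level(level_id),
    PRIMARY KEY (sample_id, level_id)
);
""")

# The factors and levels from the example above, stored as data.
factors = {
    "Time": ["10 hrs", "20 hrs"],
    "Temperature": ["30 F", "50 F"],
    "Stress": ["Control", "Salinity", "Drought"],
}
for name, values in factors.items():
    factor_id = conn.execute(
        "INSERT INTO factor (name) VALUES (?)", (name,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO factor_level (factor_id, value) VALUES (?, ?)",
        [(factor_id, value) for value in values],
    )


def assign(sample_id, factor_name, value):
    """Attach one factor level to a sample."""
    (level_id,) = conn.execute(
        """SELECT l.level_id
           FROM factor_level l JOIN factor f ON f.factor_id = l.factor_id
           WHERE f.name = ? AND l.value = ?""",
        (factor_name, value),
    ).fetchone()
    conn.execute("INSERT INTO sample_level VALUES (?, ?)", (sample_id, level_id))


def describe(sample_id):
    """Return the sample's factor levels as a {factor: value} dict."""
    rows = conn.execute(
        """SELECT f.name, l.value
           FROM sample_level s
           JOIN factor_level l ON l.level_id = s.level_id
           JOIN factor f ON f.factor_id = l.factor_id
           WHERE s.sample_id = ?""",
        (sample_id,),
    ).fetchall()
    return dict(rows)


# Hypothetical sample 1: 10 hrs, 30 F, drought stress.
assign(1, "Time", "10 hrs")
assign(1, "Temperature", "30 F")
assign(1, "Stress", "Drought")
sample_1 = describe(1)
```

The trade-off in this layout is that values are stored as text and type checking moves into the application, which may or may not suit a given dataset.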
  
  
• Microarrays measure gene expression
• The smallest unit measured is the probe
• Probes are grouped and summarized into probe sets
• Roughly, probe set = gene
• A microarray experiment is called a hybridization
  
  
  
Arun Chander


Stephen Mueller

• Access to VM is slow
• Inconsistencies
• File names
• Users that don't exist

• ER diagram and schema complete

• Updating the entire database will take place over time
• Views keep the website working

• Continuous learning
• Continuous requirements gathering
• Complex data
• Data inconsistencies

• Getting the data we needed
• Sometimes didn't know who to ask
• Virtual machine
• Installing software
• Accessing for data migration
Brian Nordland

• Previously the organism was stored with the experiment sample
• No sense of hierarchy
• http://cs461-1.cs.iastate.edu/
• Hierarchy adds future ability for more meaningful info

• Uses a nested set model for hierarchies
• Makes selecting a portion of the tree easy
• SELECT * FROM tree WHERE lft BETWEEN 2 AND 11

• Nested set model makes retrieval easy
• Changes are more complicated: "re-indexing" is required
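
The two sides of the nested set model can be sketched in a few lines of Python with the standard-library sqlite3 module. Each node stores a left (lft) and right (rgt) bound, and a subtree is every row whose lft falls between its root's bounds. The tree table and its lft column come from the query above; the rgt column, the organism names, the bound values, and the helper functions are illustrative assumptions rather than the project's actual data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tree (name TEXT PRIMARY KEY, lft INTEGER NOT NULL, rgt INTEGER NOT NULL)"
)
# Illustrative hierarchy: the node with bounds (2, 11) matches the query above.
conn.executemany(
    "INSERT INTO tree VALUES (?, ?, ?)",
    [
        ("Plants", 1, 14),
        ("Monocots", 2, 11),
        ("Barley", 3, 4),
        ("Maize", 5, 6),
        ("Rice", 7, 8),
        ("Wheat", 9, 10),
        ("Dicots", 12, 13),
    ],
)


def subtree(name):
    """Retrieval: one range query returns a node and all its descendants."""
    lft, rgt = conn.execute(
        "SELECT lft, rgt FROM tree WHERE name = ?", (name,)
    ).fetchone()
    rows = conn.execute(
        "SELECT name FROM tree WHERE lft BETWEEN ? AND ? ORDER BY lft", (lft, rgt)
    )
    return [row[0] for row in rows]


def add_child(parent, child):
    """Insertion: every bound at or after the parent's right edge shifts by 2."""
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM tree WHERE name = ?", (parent,)
    ).fetchone()
    with conn:  # commit all three statements together, or roll back on error
        conn.execute("UPDATE tree SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,))
        conn.execute("UPDATE tree SET lft = lft + 2 WHERE lft > ?", (parent_rgt,))
        conn.execute(
            "INSERT INTO tree VALUES (?, ?, ?)", (child, parent_rgt, parent_rgt + 1)
        )


before = subtree("Monocots")
add_child("Monocots", "Sorghum")
after = subtree("Monocots")
dicots_bounds = conn.execute(
    "SELECT lft, rgt FROM tree WHERE name = 'Dicots'"
).fetchone()
```

Adding a single leaf under Monocots rewrites the bounds of Monocots, Dicots, and the root, which is the "re-indexing" cost noted above. Reads stay a single range query no matter how deep the tree is.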
• Organism editor
  - Ability to move portions of the tree
  - Login ability for the editor / integration with PLEXdb
• Make PLEXdb use our data
  - Two-phase process: first create views
  - Then change the PLEXdb code to use the data directly
• Implement data partitioning
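
The first phase of the two-phase process above can be sketched as a view that presents the redesigned tables in the shape the existing code expects. The example uses Python's standard-library sqlite3 module. Every table, view, and column name here (organism, experiment_v2, experiment) and the sample row are hypothetical, since the actual PLEXdb table names are not given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Redesigned tables (hypothetical names): the organism is its own entity.
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE experiment_v2 (
    experiment_id INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    organism_id   INTEGER NOT NULL REFERENCES organism(organism_id)
);

-- Phase 1: a view with the legacy table's name and columns, so existing
-- queries keep working while the data lives in the new tables.
CREATE VIEW experiment AS
SELECT e.experiment_id AS id,
       e.title         AS title,
       o.name          AS organism
FROM experiment_v2 e
JOIN organism o ON o.organism_id = e.organism_id;

INSERT INTO organism VALUES (1, 'Barley');
INSERT INTO experiment_v2 VALUES (10, 'Drought stress time course', 1);
""")

# A legacy-style query, unchanged, now answered through the view.
legacy_rows = conn.execute("SELECT id, title, organism FROM experiment").fetchall()

# Phase 2: updated code queries the new tables directly.
direct_rows = conn.execute(
    """SELECT e.experiment_id, e.title, o.name
       FROM experiment_v2 e JOIN organism o ON o.organism_id = e.organism_id"""
).fetchall()
```

Once all code paths use the phase 2 form, the view can be dropped without touching the data.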
• Every member was involved in each aspect of the project, but each member also
  focused their efforts on coordinating certain tasks

• Project Manager: Jesse Walsh
  - Responsible for understanding biology concepts
  - Focused on ER design
• Web Developer: Brian Nordland
  - Focused on the organism editor
• Java Developer: Stephen Mueller
  - Focused on data migration
• DBA: Arun Chander
  - Focused on creation of tables