
An exemplar for data integration in the biomedical domain driven by the ISA framework
Shannan Ho Sui
AMIA Summits on Translational Bioinformatics
March 19, 2013
http://stemcellcommons.org

This is a story about collaboration...

ISA

Disparate Stem Cell Resources


Inconsistent data formats, experimental
descriptions and results

The Stem Cell Commons



A shared data and analytical resource
Bioinformatics support for research at the HSCI
A community
Support/consults
Data repository
Analysis system

user community

Susanna-Assunta Sansone isacommons.org

General-purpose, configurable format, designed to support the use of several standards, checklists and terminologies, and conversions to (a growing number of) other metadata formats used by public repositories, e.g. MAGE-Tab, Pride-xml, SRA-xml and SOFT
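As a rough illustration of what a tab-delimited, ISA-Tab-style sample table might look like when read programmatically, here is a minimal Python sketch. The table contents, the column choices and the helper names `read_study_table` and `characteristics` are assumptions made for this example; they are not taken from the Stem Cell Commons or from the ISA tools.

```python
import csv
import io

# Hypothetical ISA-Tab-style study table (tab-delimited).
# All values below are illustrative only, not real Stem Cell Commons data.
STUDY_TABLE = (
    "Source Name\tCharacteristics[organism]\tCharacteristics[cell type]\tSample Name\n"
    "donor1\tHomo sapiens\thematopoietic stem cell\tsample1\n"
    "donor2\tMus musculus\tembryonic stem cell\tsample2\n"
)


def read_study_table(text):
    """Parse a tab-delimited table into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def characteristics(row):
    """Collect the 'Characteristics[...]' columns of one row as {name: value}."""
    prefix, suffix = "Characteristics[", "]"
    return {
        key[len(prefix):-len(suffix)]: value
        for key, value in row.items()
        if key.startswith(prefix) and key.endswith(suffix)
    }


rows = read_study_table(STUDY_TABLE)
for row in rows:
    print(row["Sample Name"], characteristics(row))
```

Real ISA-Tab files can repeat column headers, which `csv.DictReader` would collapse into a single key, so a production parser would need to handle columns by position rather than by name.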

Rationale for developing ISA


Capture all salient features of the experimental workflow
Make annotation explicit and discoverable
Support data provenance tracking
Use community standards

Manual merging process (curator)
53 studies, 1098 assays
87 studies, 1179 assays
148 studies, 2356 assays

Conversion driven by ISA-Tab
53 studies, 1098 assays
87 studies, 1179 assays
148 studies, 2356 assays

Data uploads and annotation

Current Data Statistics

Filtering data using metadata as search facets
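The idea of using metadata as search facets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names and the functions `facet_counts` and `filter_by_facets` are hypothetical and do not describe how the Stem Cell Commons repository is actually implemented.

```python
from collections import Counter

# Hypothetical experiment records; field names and values are illustrative only.
EXPERIMENTS = [
    {"id": "exp1", "organism": "Homo sapiens", "technology": "microarray"},
    {"id": "exp2", "organism": "Mus musculus", "technology": "RNA-Seq"},
    {"id": "exp3", "organism": "Homo sapiens", "technology": "RNA-Seq"},
]


def facet_counts(records, field):
    """Count how many records carry each value of one metadata field."""
    return Counter(r[field] for r in records if field in r)


def filter_by_facets(records, selected):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(field) == value for field, value in selected.items())
    ]


# Narrow the list to one organism, then show what remains per technology.
hits = filter_by_facets(EXPERIMENTS, {"organism": "Homo sapiens"})
counts = facet_counts(hits, "technology")
print([r["id"] for r in hits])
print(dict(counts))
```

Recomputing the counts on the filtered list is what lets a facet panel show how many experiments remain under each value after every selection.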

Experiment description

Experimental protocols and data downloads

ISA-Tab metadata downloads and export

Linking data to the Galaxy workflow engine

Refinery:
An analysis and visualization framework
In development

Viewing and selecting samples in list view

Viewing and selecting samples in matrix view

Initiating workflows

Monitoring progress

Integration with the IGV genome browser

Challenges

Changing research culture(s) to recognize the value of data sharing
Manually curating the data for consistency and completeness
Managing large volumes of data
Standardizing workflows
Ensuring interoperability when integrating multiple systems and tools
Technical complexity of the software development effort

Refinery

Peter Park

Nils Gehlenborg

Richard Park

Psalm Haseley

Ilya Sytchev

Shannan Ho Sui

ISA Commons
Oxford e-Research Centre
A growing community that uses the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of datasets.

Susanna Sansone

Eamonn Maguire

Philippe Rocca-Serra

WikiPathways

Meet the Team


Center for Stem Cell Bioinformatics Collaborators

Winston Hide
Program Leader

Sudeshna Das
Repository

Shannan Ho Sui
Analytics

Oliver Hofmann
Core services

Nils Gehlenborg Richard Park Psalm Haseley Peter Park

John Hutchinson
HSCI Analyst

Emily Merrill
Bioinformatics Analyst

Ilya Sytchev
Bioinformatics Developer

Stéphane Corlosquet
Bioinformatics Engineer

Eamonn Maguire, Philippe Rocca-Serra, Susanna Sansone
