Data provenance and metadata exchange

Background

Wouldn't it be nice if everyone used XNAT for their brain imaging research? That would make data sharing much more straightforward! Sadly, they don't. Instead, a number of databases exist with very different data models and different ways to access and query their resources. There is hope, though: many of these systems collect much of the same information and simply use different terms for it.

It would be really great if there were a common data exchange layer, so that the tools we create could talk to multiple popular databases (e.g., XNAT, LORIS, NIMS, HID, ABA) and query for some subset of common data elements.
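To make the idea concrete, here is a minimal sketch of such an exchange layer: each database gets a small term map that renames its local fields to shared element names, and a query then runs against the shared names only. The field names, the two example sources, and the helpers `to_common` and `query_common` are all hypothetical; the real schemas will differ.

```python
# Hypothetical per-database field names mapped to common element names.
# These are illustrative only and do not reflect the real schemas.
TERM_MAPS = {
    "xnat": {"subject_label": "subject_id", "age": "age", "gender": "sex"},
    "loris": {"CandID": "subject_id", "Age": "age", "Sex": "sex"},
}


def to_common(source, record):
    """Rename a record's database-specific keys to common element names.

    Keys with no entry in the term map are dropped.
    """
    mapping = TERM_MAPS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def query_common(records_by_source, **criteria):
    """Return records from every source that match the common-element criteria."""
    hits = []
    for source, records in records_by_source.items():
        for record in records:
            common = to_common(source, record)
            if all(common.get(key) == value for key, value in criteria.items()):
                hits.append({"source": source, **common})
    return hits
```

A tool written against `query_common` would only ever see `subject_id`, `age`, and `sex`, no matter which database a record came from.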

This project will explore how to map the current XCEDE, XNAT, and other brain imaging schemas onto the PROV data model (PROV-DM) being developed by the W3C, and how to represent their metadata in it. The focus here is on the metadata needed to exchange provenance.
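As a rough illustration of what this could look like, the sketch below describes a single scan using the three core PROV-DM concepts (entity, activity, agent) and two of its relations (wasGeneratedBy, wasAssociatedWith). The layout loosely follows PROV-JSON. The `ex:` prefix, the type names, and the function `scan_to_prov` are assumptions made for this example and are not part of any agreed mapping.

```python
import json


def scan_to_prov(scan_id, session_id, scanner_id):
    """Describe one scan as PROV entity/activity/agent records.

    The scan is the entity, the imaging session is the activity that
    generated it, and the scanner is the agent associated with the session.
    """
    entity = f"ex:scan-{scan_id}"
    activity = f"ex:session-{session_id}"
    agent = f"ex:scanner-{scanner_id}"
    return {
        # Hypothetical namespace, used only for this example.
        "prefix": {"ex": "http://example.org/"},
        "entity": {entity: {"prov:type": "ex:ImagingScan"}},
        "activity": {activity: {"prov:type": "ex:ImagingSession"}},
        "agent": {agent: {"prov:type": "ex:Scanner"}},
        "wasGeneratedBy": {
            "_:g1": {"prov:entity": entity, "prov:activity": activity}
        },
        "wasAssociatedWith": {
            "_:a1": {"prov:activity": activity, "prov:agent": agent}
        },
    }


if __name__ == "__main__":
    print(json.dumps(scan_to_prov("001", "MR1", "trio"), indent=2))
```

Each source schema would need its own decision about which of its records count as entities, activities, or agents; that mapping is the actual work of the project.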

The scope of the project is limited to public, open access data sets (OASIS, fcon_1000, etc.) to avoid authorization/authentication issues.

This project may hook in with other projects, including REDCap Integration and Common Data Elements.

When/where

Overview of PROV and discuss provenance ~10am after initial talks
Conference call w/INCF Friday 11am CDT in room 214
Hack/plan for future development

Who

Nolan (lead)
Christian (lead)
Kevin Archie
Tom Gee
Chad Cumba
Add your name

Approach

Map the core XNAT schema into the XCEDE data model, which is based on W3C PROV
Develop a module to hijack the XNAT REST API to serve up data in the common data model
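
The second step could start as a thin translation function sitting between the XNAT REST response and the client. The sketch below works on a canned JSON listing rather than a live server. The response shape (a `ResultSet` holding a list of `Result` rows with `URI`, `label`, and `project` fields), the base URL, and the `ex:` attribute are assumptions for illustration and should be checked against a real XNAT instance.

```python
import json

# Canned listing standing in for a live REST call. The shape is an
# assumption and should be verified against an actual XNAT server.
SAMPLE_RESPONSE = json.loads("""
{
  "ResultSet": {
    "Result": [
      {"ID": "S001", "label": "sub-01", "project": "DEMO", "URI": "/data/subjects/S001"},
      {"ID": "S002", "label": "sub-02", "project": "DEMO", "URI": "/data/subjects/S002"}
    ]
  }
}
""")


def result_rows(payload):
    """Pull the list of rows out of the listing, or an empty list if absent."""
    return payload.get("ResultSet", {}).get("Result", [])


def listing_to_entities(payload, base_url):
    """Re-express each listed resource as a PROV-style entity keyed by its URL."""
    entities = {}
    for row in result_rows(payload):
        entity_id = base_url.rstrip("/") + row["URI"]
        entities[entity_id] = {
            "prov:label": row.get("label"),
            "ex:project": row.get("project"),  # hypothetical attribute name
        }
    return {"entity": entities}


if __name__ == "__main__":
    # Hypothetical server address, for illustration only.
    doc = listing_to_entities(SAMPLE_RESPONSE, "https://xnat.example.org/")
    print(json.dumps(doc, indent=2))
```

Keeping the translation separate from the HTTP call means the same function can be tested offline and reused whether the module ends up inside XNAT or as a proxy in front of it.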

Resources

W3C PROV Primer - http://www.w3.org/TR/prov-primer/
A more gentle intro to PROV - https://www.ibm.com/developerworks/mydeveloperworks/blogs/nlp/?lang=en
Link to current spec - https://docs.google.com/document/d/1Bic42-jURqPUzvTCB_YBL5xcxO5EgEcEiYi560W0dR4/edit