XNAT for Multi-site Studies: Iowa PREDICT-HD Project

This case study should be invaluable to anyone interested in setting up an XNAT installation to manage and coordinate a large multi-site clinical research project. It was prepared after speaking with Mark Scully from the University of Iowa.

Introduction To Using XNAT for Multi-Site Studies

Imaging research and analysis is increasingly dependent on acquiring data from large numbers of subjects, which in turn means searching across wide geographical areas to find enough subjects that meet your study's criteria. One way to manage this is to collaborate with a number of research institutions, each of which recruits and images subjects locally.

While this is economically more feasible (and friendlier to your subjects), it introduces a host of new challenges for study coordination: disparate scanning technologies and devices; non-uniform processes for image acquisition and data handling; and the challenge of aggregating all this data into a centralized system, then managing access to it across a large number of collaborators from outside your institution.

XNAT has evolved to solve this problem.

Because XNAT is a web-based application, it has the built-in capability to be accessed from anywhere in the world. Necessarily, security and fine-grained access controls are built into XNAT at the root level. XNAT administration has been built to support the complexities of multi-center research projects, including:

Highly configurable DICOM data importing, to unify data from multiple scan sources
Fully audited security
Siloed data access for each institution, with the capability of sharing data across all institutions
Customization of data queries that fit data into your study protocols
Protection against inadvertent inclusion of PHI in data gathered from multiple sources
Reporting tools for study coordination

Currently, XNAT is supporting a number of high-profile multi-site research studies, including the Human Connectome Project, the DIAN study of inherited Alzheimer's Disease, the INTRUST study of post-traumatic stress disorders, and the PREDICT-HD study of Huntington's Disease.

Why did you install XNAT?

The PREDICT-HD project to research Huntington's Disease has been ongoing for 9-10 years now, pulling data from 30+ sites. A few participating sites are in charge of different aspects of the process; the ICTS at the University of Iowa coordinates the imaging study.

Prior to installing XNAT, researchers had already gathered a huge set of scan data, but had no formal system for collecting or organizing that data. Image files would be pulled off the scanner into a PACS and written to a directory somewhere, but conflicting ideas about organization among uploaders led to lots of extra copies of files. Imaging data that was collected at other sites might be uploaded to a "temporary" FTP site, or might arrive in the mail on DVDs.

XNAT was installed to become the canonical data store, a centralized repository that allowed for the creation and enforcement of data management policies. For example: any image associated with the PREDICT-HD project should exist in XNAT; "if it isn't in XNAT, we don't have it." XNAT's uploading tools also replace all methods of submitting data into the project, creating a unified workflow that makes XNAT "the center of everything." A new scan comes in, goes through the PREDICT-HD processing pipeline, and is entered into the database.

However, project-wide enforcement of these policies is moving slowly. Only one remote site is currently uploading everything via XNAT. Training has been a face-to-face process, sending people from Iowa to new sites to demonstrate the process in person. Resistance to learning a new working process or a new technology is common. In this case, many would-be users assume that the upload process will be too slow with that volume of data, or too unpredictable vs. mailing a DVD.

Who are your primary users?

Hans Johnson at ICTS has been spearheading the drive for a single source of canonical data, responding to the needs of the PREDICT-HD project's administrative staff. Reporting and project management have been improved as a result. For example, project coordinators can generate a report of scans that came in during the last week, and use this report to vet reports from external sites. Since sites are compensated by the grant for their imaging sessions, ensuring that data from those sessions are properly received is key to getting return on that investment.

Research-minded users include grad students, research assistants, and those few investigators who like to get hands-on with the data. PIs are being trained on how to search and find by criteria, but this is a slow learning curve for those who have shied away from past online systems.

External researchers and data-sharing collaborators make up another group of users.

Finally, there is a trio of XNAT administrators who manage the installation and turn the project administration's data policy requirements into application functionality. (Shuhua does most of the infrastructure admin, altering the code of XNAT and extending the schema. Mark is a data coordination admin. Adam manages the interface between XNAT and supporting hardware.)

What inherent features of XNAT do your users find most valuable?

Searching and reporting are the most highly prized features. One example is the ability to search over MR sessions ("give me the list of all sessions that contain diffusion data at 3T, or at this site..."). Another is knowing exactly how many subjects and sessions the project has, which is useful for reporting, sound bites, etc. The team could do that before with scripts that combed through whatever pile of data was at hand, but it was a much more difficult process.
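To make the kind of query quoted above concrete, here is a minimal Python sketch that filters session records by scan type, field strength, and site, and counts subjects and sessions. It runs over an in-memory list rather than against XNAT itself; the record layout, the field names, and the "DTI" scan-type label are assumptions chosen for illustration.

```python
def find_sessions(sessions, scan_type=None, field_strength=None, site=None):
    """Return sessions matching every criterion that is not None."""
    matches = []
    for s in sessions:
        if scan_type is not None and scan_type not in s["scan_types"]:
            continue
        if field_strength is not None and s["field_strength"] != field_strength:
            continue
        if site is not None and s["site"] != site:
            continue
        matches.append(s)
    return matches


def count_subjects_and_sessions(sessions):
    """Return (number of distinct subjects, number of sessions)."""
    return len({s["subject"] for s in sessions}), len(sessions)


# Hypothetical session listing (not real study data).
mr_sessions = [
    {"label": "SUBJ001_MR1", "subject": "SUBJ001", "site": "SITE01",
     "field_strength": 3.0, "scan_types": {"T1", "DTI"}},
    {"label": "SUBJ001_MR2", "subject": "SUBJ001", "site": "SITE01",
     "field_strength": 3.0, "scan_types": {"T1"}},
    {"label": "SUBJ002_MR1", "subject": "SUBJ002", "site": "SITE02",
     "field_strength": 1.5, "scan_types": {"T1", "DTI"}},
]

diffusion_3t = find_sessions(mr_sessions, scan_type="DTI", field_strength=3.0)
totals = count_subjects_and_sessions(mr_sessions)

print([s["label"] for s in diffusion_3t])
print(totals)
```

In practice the list would come from an XNAT search or listing export; the filtering logic is the same either way.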

The data upload tools are valuable and will become more frequently used. Intelligent permissions that prevent non-admins from screwing things up are a huge improvement over the old file system.

From a data control perspective, using XNAT as a single canonical source enables the team to set policies, such as enforcing a single naming convention.

More recently, the team has begun using XNAT to re-anonymize data so it can be shared externally. This new functionality has led to 8 new users joining in the last two weeks. (#1 request from new users: how can I download all the data WITHOUT using the interface?)
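One way to approach the "download without the interface" request is through XNAT's REST interface. The sketch below only builds the download URL for a session's scans as a zip archive. The resource path follows the pattern used in XNAT's REST documentation, but it is an assumption here and should be checked against the installed XNAT version; the server address, project, subject, and session names are hypothetical, and `scan_zip_url` is an illustrative helper, not part of XNAT.

```python
from urllib.parse import quote


def scan_zip_url(base_url, project, subject, session, scans="ALL"):
    """Build a REST URL requesting a session's scan files as a zip archive.

    Assumption: the server exposes the archive under
    /data/archive/projects/.../subjects/.../experiments/.../scans/.../files
    and accepts format=zip. Verify this against your XNAT version.
    """
    p, su, se, sc = (quote(part, safe="") for part in (project, subject, session, scans))
    return (
        f"{base_url.rstrip('/')}/data/archive/projects/{p}/subjects/{su}"
        f"/experiments/{se}/scans/{sc}/files?format=zip"
    )


# Hypothetical server and identifiers.
url = scan_zip_url("https://xnat.example.org/", "PROJ", "SUBJ001", "SUBJ001_MR1")
print(url)
```

The resulting URL can then be fetched with any authenticated HTTP client or a command-line tool, and looping over a session listing gives a scripted bulk download.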

Also, the team is working on a desktop application for quality control. (The workflow: check for sessions that have not been QC'd, download the first image from each series, show it to the user, and upload the assessment.)
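The quality-control loop outlined above might be organized along these lines. This is a sketch of the control flow only, not the team's application: the three callables (fetching an image, asking the reviewer, uploading the assessment) are hypothetical hooks, stubbed out here so the example runs on its own, and the "qc_status" and "series" fields are assumed names.

```python
def run_qc_pass(sessions, fetch_first_image, ask_reviewer, upload_assessment):
    """Review every series of each session that has no QC status yet."""
    reviewed = []
    for session in sessions:
        if session.get("qc_status") is not None:
            continue  # already QC'd, nothing to do
        for series in session["series"]:
            image = fetch_first_image(session["label"], series)
            verdict = ask_reviewer(session["label"], series, image)
            upload_assessment(session["label"], series, verdict)
            reviewed.append((session["label"], series, verdict))
    return reviewed


# Hypothetical sessions: MR1 already has a QC status, MR2 does not.
qc_sessions = [
    {"label": "MR1", "qc_status": "pass", "series": ["T1"]},
    {"label": "MR2", "qc_status": None, "series": ["T1", "DTI"]},
]

uploaded = []  # stands in for assessments stored back on the server

results = run_qc_pass(
    qc_sessions,
    # Stub: a real client would download the first image of the series.
    fetch_first_image=lambda label, series: f"{label}/{series}/0001.dcm",
    # Stub: a real client would display the image and wait for the reviewer.
    ask_reviewer=lambda label, series, image: "usable" if series == "T1" else "unusable",
    upload_assessment=lambda label, series, verdict: uploaded.append((label, series, verdict)),
)

print(results)
```

Passing the three steps in as functions keeps the loop independent of how images are fetched or displayed, so the same logic could sit behind a desktop interface or a script.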

How have you customized XNAT to meet your needs?

The XNAT interface and schema have been enhanced to support a more robust QA process. A custom variable for "has been QA'd" was added, with YES/NO/IN PROCESS available as options in the interface, and the results of manual QA are set to override other user input on the "usable" field.

(In addition, there has been some schema customization, some DB customization, and some auto-naming rules, but Shuhua would need to be asked for more details.) The upload applet has also been modified. The team may extend the schema to support accounting data ("has the site been reimbursed for the image scan?").
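The override rule for the "usable" field can be illustrated with a small Python function. The exact rule used in the Iowa installation is not described in detail, so this sketch assumes that a manual QA result takes precedence only once the QA state is YES; the function name and argument names are hypothetical.

```python
QA_STATES = ("YES", "NO", "IN PROCESS")


def resolve_usable(user_usable, qa_state, manual_qa_usable=None):
    """Return the effective value of the "usable" field.

    Assumption: manual QA overrides other user input only when QA is
    complete (state "YES") and a manual result has been recorded.
    """
    if qa_state not in QA_STATES:
        raise ValueError(f"unknown QA state: {qa_state!r}")
    if qa_state == "YES" and manual_qa_usable is not None:
        return manual_qa_usable
    return user_usable


print(resolve_usable(True, "YES", manual_qa_usable=False))
print(resolve_usable(True, "IN PROCESS", manual_qa_usable=False))
```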

How could XNAT improve for the future?

One common request is the ability to associate metadata with individual scans, rather than at the DICOM session level. This would enable searching across scan attributes rather than session attributes. (This can be simulated with PyXNAT.)
The team would love to have video tutorials on uploading data, geared toward users (MRI techs) who know nothing about XNAT. (Their training has been deep but very narrow, with little experience using web apps.)

Quick Facts:

XNAT installed since 2008, activated January 2011.

Dozens of active users across 30+ sites

1 Active Project (very large project) split up into multiple sub-projects (1 per site, 1 per subset of data such as imaging).

~4,000 Stored Imaging Sessions

5.3 TB (uncompressed) of Data

Hardware Setup: University of Iowa Enterprise Storage setup for XNAT