As described in Getting Started / Configuration, the -d data flag is required. It points to a newline-separated text file listing the projects to import. For convenience, a file called allData.txt is included, which pulls all previously prepared sample data.

For each project, the script expects two things: a YAML configuration file that defines the structure of the project, and a folder containing the project's data. The YAML file must be named "$projectId_metadata.yaml". The data folder should contain individually zipped sessions at its top level.

When the script starts, it reads the text file and begins checking for these files. If the ./data folder does not already contain the YAML file, the script searches for it on the NRG FTP site. Likewise, if the ./data folder does not contain the project's data folder (or its zip), the script attempts to locate a zip named after the project ID on the NRG FTP site.

Below is an example project YAML file which contains most of the currently supported operations (note that the file would necessarily be called study_001_metadata.yaml):
Tips on the YAML above:
The anonymization section uses exactly the same syntax as in Site Configuration, except that here it is applied to the project.
sessions versus complexSessions
The sessions element lets you define an array of sessions that will all be uploaded to the XNAT server as is. This is very easy to specify, but you lose the ability to fine-tune individual sessions. If you need that control, specify the sessions under complexSessions instead. These sessions are also uploaded through the session importer, but additional modifications are then applied. So, the study_001 folder for this project should contain the following files:
Note that two sessions are left out of the above list: "empty_session" and "other_session". If a complexSession has the src property set to empty, the session is created as an empty experiment directly through the experiment API. This is the path "empty_session" will take. Scans can then be manually created and modified under the session. For any scan, such as scan 1 above, you can set seriesDescription, type, and note in addition to the required xsiType. If a complexSession has the src property set to anything else, that value is interpreted as an absolute path, and the script looks there for the session zip to upload. So, "other_session" would have its session zip pulled from /data/xnat/home/other/session.zip.
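The three cases for a complexSession can be summarized in a short Python sketch. This is only an illustration of the decision described above, not the script's actual code: the function name `plan_session_upload`, the returned dictionary keys, and the assumption that a session without src is stored as `<session name>.zip` in the project folder are all hypothetical.

```python
from typing import Optional


def plan_session_upload(session_name: str, src: Optional[str], project_folder: str) -> dict:
    """Decide how a complexSession would be handled, based on its src value.

    Hypothetical helper: the names and return shape are assumptions made
    for illustration only.
    """
    if src is None:
        # No src given: assume the zip sits at the top level of the project
        # data folder, named after the session (the file name is an assumption).
        return {"method": "session_importer", "zip": f"{project_folder}/{session_name}.zip"}
    if src == "empty":
        # src set to empty: create an empty experiment through the experiment API.
        return {"method": "experiment_api", "zip": None}
    # Any other value is interpreted as an absolute path to the session zip.
    return {"method": "session_importer", "zip": src}
```

With the example sessions, `plan_session_upload("empty_session", "empty", "data/study_001")` selects the experiment API, while `plan_session_upload("other_session", "/data/xnat/home/other/session.zip", "data/study_001")` points the importer at the absolute path.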
If the users already exist, and the XNAT user running the script has access to the site user list, the users will be added to the project in their corresponding roles. If the users do not already exist, the script creates insecure accounts whose password is simply the username. Creating users requires admin access, and these accounts should not be used in production.
For any project, subject, experiment, or scan in the YAML, you can add a resources block. For project resources, the block goes at the root level; for subject resources, it is a property of the subject; and likewise for complexSessions and scans.
This block should be formatted as:
The above would create two resource folders: one called "resource_folder_1" containing the files my_file.txt and my_paper.pdf, and one called "resource_folder_2" containing the extracted contents of my_zip.zip. Note that the resources are expected to be in the project's data folder to be picked up for upload. The complexFiles option is similar to the 'complexSessions' option from the experiments section. Files under complexFiles may also specify a 'content' value, which is added to the REST call that POSTs them.
By default, XNAT Populate looks for a project's data in the following order:
- First, it checks in a folder under the "data" folder with a name matching the project ID.
- If that doesn't work, it checks for a zip file under the "data" folder named $project_id.zip.
- If that doesn't work, it checks on the NRG FTP site for the data.
However, there is another option: if you specify a 'src' value at the top level of the project YAML, the script attempts to load the project's data zip from that file path. This check happens before the FTP site check, so the script will not fail as long as it finds your data via 'src'. This is useful if the data zip is large and mounted in a reliable spot (it will still be extracted into "data" to post the session zips).
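The lookup order can be sketched in Python as follows. This is a minimal illustration rather than the script's implementation: the function name `locate_project_data` and its return values are hypothetical, and placing the 'src' check after the two local checks is an assumption (the text above only states that it comes before the FTP check).

```python
from pathlib import Path
from typing import Optional, Tuple


def locate_project_data(
    project_id: str, data_dir: Path, src: Optional[str] = None
) -> Tuple[str, Optional[Path]]:
    """Return where the project's data would be found, and by which rule.

    Hypothetical helper for illustration; the rule labels are made up.
    """
    # 1. A folder under "data" whose name matches the project ID.
    folder = data_dir / project_id
    if folder.is_dir():
        return ("folder", folder)

    # 2. A zip under "data" named after the project ID.
    local_zip = data_dir / f"{project_id}.zip"
    if local_zip.is_file():
        return ("zip", local_zip)

    # 3. The top-level 'src' path from the project YAML, if one was given.
    #    (Assumed to run after the local checks and before the FTP check.)
    if src is not None and Path(src).is_file():
        return ("src", Path(src))

    # 4. Nothing local was found: the data would have to come from the FTP site.
    return ("ftp", None)
```

For example, `locate_project_data("study_001", Path("data"), src="/mnt/archive/study_001.zip")` would only fall back to the FTP site if none of the three local locations exists; the mount path here is a made-up example.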
What is available already?
See this cheat sheet: xnat_populate_cheat_sheet.md.