XNAT provides an enterprise suite of applications that can greatly improve the management of your data. To protect data integrity and site performance, there are several maintenance tasks you should implement.
The developers of XNAT have taken great care to treat your data with all due respect. However, we back up our data, and you should too.
Backing up the Database
The ideal tool for handling backups of your database is the 'pg_dump' utility, which comes with PostgreSQL. We recommend setting up a cron job to run pg_dump against your database every night. Compared to the size of your imaging data, the SQL dump is fairly small (particularly if you zip the dump). Keeping nightly snapshots will allow you to roll back your database to its state on any given night. This is particularly helpful if you need to re-instantiate an older copy of the database to see how data changed.
In addition to nightly snapshots, you should consider having a warm standby of your PostgreSQL server. Related: PostgreSQL Docs: Warm Standby
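As a rough illustration of the nightly dump described above, here is a minimal Python sketch. The database name and user ("xnat"), the backup directory, and the function names are all assumptions made for the example, not part of XNAT. It also assumes pg_dump is on the PATH and can authenticate without a prompt (for instance through a .pgpass file).

```python
import datetime
import gzip
import shutil
import subprocess
from pathlib import Path


def dump_path(backup_dir, db_name, when):
    """Return a dated file name such as xnat-2024-01-31.sql.gz."""
    return Path(backup_dir) / f"{db_name}-{when:%Y-%m-%d}.sql.gz"


def pg_dump_command(db_name, db_user):
    """Build the pg_dump command line; plain SQL is written to stdout."""
    return ["pg_dump", "--username", db_user, db_name]


def run_backup(backup_dir, db_name="xnat", db_user="xnat"):
    """Dump the database and gzip the output into a dated file.

    The default names are hypothetical; use your own database and user.
    """
    target = dump_path(backup_dir, db_name, datetime.date.today())
    target.parent.mkdir(parents=True, exist_ok=True)
    proc = subprocess.Popen(
        pg_dump_command(db_name, db_user), stdout=subprocess.PIPE
    )
    # Stream the dump through gzip so the uncompressed SQL never hits disk
    with gzip.open(target, "wb") as out:
        shutil.copyfileobj(proc.stdout, out)
    if proc.wait() != 0:
        target.unlink()  # do not keep a partial dump
        raise RuntimeError("pg_dump failed; no backup written")
    return target
```

A small wrapper script that calls run_backup with your backup directory could then be scheduled from cron to run once a night. Because each file name carries its date, older snapshots stay available side by side.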
Backing up your Archive
If there is anything you want to protect in your installation, it is probably your source documents (the image data). Backing up your archive can present some issues (due to the size of the data) but it is still highly encouraged. There are a few techniques which you can use to assist you in the backup of your data.
We encourage you to back up everything. However, if the size of the backup is a concern, you could consider backing up only your RAW data: most of the processed data could be regenerated, whereas the RAW data could not.
"RSYNC is an open source utility that provides fast incremental file transfer. RSYNC is freely available under the GNU General Public License..." If you haven't used RSYNC in the past, you should consider it now. Related: Samba docs: RSYNC
"RSNAPSHOT is a filesystem utility for making backups of local and remote systems." RSNAPSHOT builds on RSYNC to create navigable, timestamped snapshots of your archive space. Related: http://rsnapshot.org/
Where's the maid?
We have grand visions for how XNAT could help you clean up the mess that can result from an active XNAT installation. However, as of today, those are still just visions. There are several areas where you (the site administrator) should consider some cron-based clean-up.
Cleaning up Logs
The XNAT web application generates logs to track events in your installation. These are generally far more thorough than you will need. However, in the rare situation that you need them (usually for debugging a problem), they are very useful. For the most part, these logs are only relevant for a few days. Certainly the logs in webapps/PROJECT/logs could be deleted after a month (except the access.log; see NOTE below). You should consider setting up a cron job to delete log files that are older than a month. Similarly, the logs in TOMCAT_HOME/logs could be deleted after a month (particularly catalina.out, which captures the standard output of your server).
NOTE: The 'access.log' is an audit trail of the users of your site. We highly encourage preserving historical copies of this. To guarantee this, we redirect the access.log output to a file outside of the webapp (this prevents it from being lost if the webapp is re-installed) and copy the logs into our archive space (which is backed up like the image data).
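The log clean-up described above could be sketched in Python as follows. This is only an illustration: the function names are made up, file age is judged by modification time, and the rule for protecting the audit trail (skip any file whose name starts with "access.log") is an assumption about how your access logs are named.

```python
import time
from pathlib import Path

# Audit trail: never delete. The name prefix is an assumption; adjust it
# to match how your access logs are actually named.
KEEP = ("access.log",)


def old_logs(log_dir, max_age_days=30, now=None):
    """Return log files in log_dir not modified within max_age_days."""
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    return sorted(
        p for p in Path(log_dir).iterdir()
        if p.is_file()
        and not p.name.startswith(KEEP)
        and p.stat().st_mtime < cutoff
    )


def delete_old_logs(log_dir, max_age_days=30, dry_run=True):
    """Delete old logs; with dry_run=True only report what would go."""
    victims = old_logs(log_dir, max_age_days)
    for path in victims:
        if not dry_run:
            path.unlink()
    return victims
```

Running with dry_run=True first lets you review the list before anything is removed. Note that a file that is still being written to, such as catalina.out, keeps a recent modification time, so an age-based sweep like this one will not touch it.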
Cleaning up Cache
The cache directory is a landing place for a lot of content which could be deleted after a period of time. The cache is also used to house files which are of temporary use in the server (uploaded zips, etc).
XNAT doesn't actually delete any files. When you tell XNAT to delete imaging data, it moves the data to CACHE_DIRECTORY/DELETED. This is similar to a Recycle Bin and should be cleaned up accordingly. You may consider deleting files from the DELETED folder which are older than a specific amount of time (~90 days).
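One possible way to empty the DELETED folder on a schedule is sketched below in Python. The function name is hypothetical, and the sketch assumes that a file's modification time is a fair measure of its age. Moving a file can preserve an older timestamp, so check how this behaves on your filesystem before relying on it.

```python
import os
import time
from pathlib import Path


def purge_deleted(deleted_dir, max_age_days=90, now=None):
    """Remove files older than max_age_days, then prune empty folders."""
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    removed = []
    # Walk bottom-up so child folders are emptied before their parents
    for folder, _subdirs, files in os.walk(deleted_dir, topdown=False):
        for name in files:
            path = Path(folder) / name
            # lstat: judge a symlink by its own age, not its target's
            if path.lstat().st_mtime < cutoff:
                path.unlink()
                removed.append(path)
        # Drop folders that ended up empty, but keep DELETED itself
        if Path(folder) != Path(deleted_dir) and not os.listdir(folder):
            os.rmdir(folder)
    return removed
```

Calling purge_deleted on your CACHE_DIRECTORY/DELETED path returns the list of files it removed, which is handy to write to a log of its own.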
Project Based cache
Each project in your XNAT server has its own space in the cache. This is where uploaded zips are temporarily stored and extracted. After a session is moved from the prearchive to the archive, the original data is copied to this project cache directory. This is intended to provide a temporary backup of your files. If something goes wrong with the archive process for your data, you may be able to retrieve the data from here, rather than having to re-upload. The transfer backups are stored in CACHE_DIRECTORY/PROJECT_ID/transfer_bk. These files will quickly double the disk space used by your XNAT installation and could be deleted after a specific amount of time (~30 days).
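Since every project has its own transfer_bk folder, a clean-up job needs to visit all of them. The Python sketch below does this with a wildcard in place of PROJECT_ID. The function names are made up for the example, and it assumes that each entry directly inside transfer_bk can be judged by its own modification time.

```python
import shutil
import time
from pathlib import Path


def stale_transfer_backups(cache_dir, max_age_days=30, now=None):
    """Yield entries in every project's transfer_bk older than the limit."""
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    # One level of wildcard stands in for PROJECT_ID
    for entry in sorted(Path(cache_dir).glob("*/transfer_bk/*")):
        if entry.stat().st_mtime < cutoff:
            yield entry


def remove_stale_transfer_backups(cache_dir, max_age_days=30):
    """Delete stale transfer backups and return what was removed."""
    removed = []
    for entry in list(stale_transfer_backups(cache_dir, max_age_days)):
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry)
    return removed
```

Iterating over stale_transfer_backups on its own gives you a preview of what would be deleted, without removing anything.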
Cleaning up the Pre-archive
Not all data sent to an XNAT server ends up in the archive. Sometimes redundant or erroneous data is left in the prearchive space. Over extended usage this data can add up, and you should consider removing it. We recommend moving all files older than 60 days to the CACHE space (where they can then be deleted after 90 days). Our new Prearchive UI makes this much easier to do than before.
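If you prefer to script the 60-day move instead of using the UI, it could look something like the Python sketch below. The function name and the holding folder are hypothetical, and the sketch simply treats each entry directly inside the directory you point it at as one unit, judged by its modification time. Try it on a copy of your prearchive before running it on a live server.

```python
import shutil
import time
from pathlib import Path


def expire_prearchive(prearchive_dir, holding_dir, max_age_days=60, now=None):
    """Move entries untouched for max_age_days into holding_dir."""
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    holding = Path(holding_dir)
    holding.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(Path(prearchive_dir).iterdir()):
        if entry.stat().st_mtime >= cutoff:
            continue  # still recent; leave it in the prearchive
        target = holding / entry.name
        if target.exists():
            continue  # never overwrite something already parked
        shutil.move(str(entry), str(target))
        moved.append(target)
    return moved
```

The holding folder would be a location of your choosing inside the CACHE space, so that your regular cache clean-up can remove the data later.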