We will detail our system architecture for the CNDA, Washington University's flagship installation of XNAT. The current architecture (as of July 2012) responsively handles traffic for more than 800 projects and more than 23,000 imaging sessions.
Current CNDA Architecture (July 2012)
The NRG vSphere cluster consists of 6 ESXi hosts, each with two hex-core Core i7 CPUs. The cluster contains a mix of 2.66 GHz and 3.46 GHz CPUs and has a total of 432 GB of RAM. Each host is dual-connected to a Cisco Nexus 5010 10GbE switch. The resources are shared among 150+ virtual machines, which include other production XNAT instances as well as development and infrastructure systems. Everything is load balanced and managed via vCenter Server.
Production virtual machines are configured for high availability. If the host a VM is running on fails, the VM is immediately restarted on another host.
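As a rough sanity check on these figures, the aggregate capacity can be worked out directly. This sketch uses only the numbers quoted above; the per-host and per-VM values are plain averages and do not describe how memory is actually distributed across hosts or allocated to VMs.

```python
# Figures taken from the cluster description above.
HOSTS = 6
CPUS_PER_HOST = 2
CORES_PER_CPU = 6        # "hex-core"
TOTAL_RAM_GB = 432
VM_COUNT = 150           # lower bound; the text says "150+"

total_cores = HOSTS * CPUS_PER_HOST * CORES_PER_CPU
avg_ram_per_host_gb = TOTAL_RAM_GB / HOSTS   # average only
avg_ram_per_vm_gb = TOTAL_RAM_GB / VM_COUNT  # upper bound on the average

print(f"physical cores:   {total_cores}")
print(f"avg RAM per host: {avg_ram_per_host_gb:.0f} GB")
print(f"avg RAM per VM:   {avg_ram_per_vm_gb:.2f} GB (at most)")
```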
Mallinckrodt and Human Connectome BlueArc NAS
The vSphere cluster's primary storage is on two HDS BlueArc NAS systems. The vmdk storage for virtual machines is hosted on a 15k SAS pool on the Human Connectome BlueArc, while production XNAT instance storage is on an NL-SAS pool on the Mallinckrodt BlueArc.
Each BlueArc has a DR BlueArc that is offsite and synced several times daily.
ZFS Backup/Development Server
The CNDA is backed up daily to a ZFS filesystem where a snapshot record is kept indefinitely.
NRG Lab & Compute Cluster
The NRG has a 10-workstation lab. Each workstation has a quad-core i7 CPU with 8 GB of RAM. Besides serving as a desktop workstation, each system participates in the Sun Grid Engine computing cluster.
CNDA Virtual Architecture
CNDA Production Virtual Machines
cnda, cnda-shadow, and cnda-fs01 all play crucial roles in running the production CNDA. The primary VM, cnda.wustl.edu, runs the Tomcat instance. Tomcat is connected to pgsql01 for the database. The archive, prearchive, cache, etc. are all NFS mounted from the BlueArc NAS.
cnda-shadow and cnda-fs01 are both used for batch processing and cron jobs. cnda-shadow runs a Tomcat instance identical to the one on cnda, but it is reserved for job processing. cnda-fs01 is connected only to the production file system and is used for cron-based reporting.
Sun Grid Engine
Pipeline processing takes place on the Sun Grid Engine. Jobs are managed by the sge-master and sge-shadow. Processing takes place on the six sge-exec VMs and the 10 lab workstations.
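To illustrate how a pipeline step might be handed to the grid, here is a minimal sketch that assembles a qsub command line. The job name, script name, argument, and queue name (all.q) are hypothetical placeholders, not the CNDA's actual pipeline configuration; only the standard qsub options -N (job name), -cwd (run in the current directory), and -q (target queue) are used. The command is built and printed, not executed.

```python
import shlex
from typing import List, Optional


def build_qsub_command(
    job_name: str,
    script: str,
    args: List[str],
    queue: Optional[str] = None,
) -> List[str]:
    """Assemble a qsub invocation as an argument list (nothing is executed)."""
    cmd = ["qsub", "-N", job_name, "-cwd"]
    if queue is not None:
        cmd += ["-q", queue]
    cmd.append(script)
    cmd.extend(args)
    return cmd


# Hypothetical job, script, and queue names, for illustration only.
command = build_qsub_command(
    "recon", "run_pipeline.sh", ["session 01"], queue="all.q"
)

# Shell-quoted form, suitable for logging.
print(" ".join(shlex.quote(part) for part in command))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later passed to subprocess.run.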
pgsql01 and pgsql02 run in a master-slave relationship. pgsql02 receives the WAL archive from pgsql01 and is always within 100 ms of the state of pgsql01. In the future we will add a pooling virtual machine that load balances all SELECT calls across all Postgres machines.
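The planned pooling layer can be pictured with a small routing sketch. This is an illustration of the idea only, not the pooling software itself: the QueryRouter class and its statement check are assumptions, and only the host names pgsql01 and pgsql02 come from the description above. SELECT statements rotate across all Postgres machines, and everything else goes to the master.

```python
from itertools import cycle
from typing import Iterator, List


class QueryRouter:
    """Hypothetical router: reads are spread out, writes go to the master."""

    def __init__(self, master: str, slaves: List[str]) -> None:
        self.master = master
        # SELECTs rotate across every Postgres machine, master included.
        self._readers: Iterator[str] = cycle([master, *slaves])

    def route(self, sql: str) -> str:
        words = sql.lstrip().split(None, 1)
        is_select = bool(words) and words[0].upper() == "SELECT"
        # Anything that is not clearly a SELECT is sent to the master.
        return next(self._readers) if is_select else self.master


router = QueryRouter("pgsql01", ["pgsql02"])
statements = ["SELECT 1", "select 2", "UPDATE t SET x = 1", "SELECT 3"]
targets = [router.route(sql) for sql in statements]
print(targets)
```

A keyword check like this is deliberately naive. Statements such as SELECT ... FOR UPDATE, or SELECTs that call functions with side effects, would need to be pinned to the master, and a slave that trails the master may briefly return slightly stale results.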
Development / Test Virtual Machines
Using cloned snapshots, each developer on the CNDA team has one or more development VMs, each connected to its own private clone of the CNDA file system and a corresponding database dump.
From within the development VM everything operates the same as on the CNDA. The file system and the database are identical to production as of the time of the snapshot and database dump. Everything is writable in this sandboxed environment. Each development or test VM contains PostgreSQL and the SGE environment. Special overlay directories are used in the SGE environment to capture submitted jobs and force them to run on the VM instead of on the production compute cluster.
Builds for potential updates to the production system are done in this environment. The Tomcat webapp is pushed to a Bitbucket repository when it is ready for testing/production deployment.
Once a new webapp is ready, it is deployed to a test VM. This test VM is first built as a clone of production and then updated using the same routine that will be used to update production. The developer then runs a series of tests to validate the build before it is released to the production system.
Active Directory Domain Controllers
All authentication across the entire network is handled through Active Directory via Kerberos and LDAP.
Help desk and bug tracking are managed via FogBugz.
Puppet Configuration Management
All Linux virtual machines are managed by Puppet. When a VM is deployed, it is cloned from a template VM and Puppet transforms it to the configuration defined in the manifests. There is zero manual configuration of any virtual machine.
Previous CNDA Architecture (May 2010)
(Original: current_cnda_architecture.svg )
Reverse Proxy / Load Balancer
The Kemp LoadMaster handles SSL communications with the client and sends unencrypted traffic to the Tomcat and DicomServer instances. The Kemp includes hardware accelerated SSL, which may be excessive in many cases (other reverse proxies may be appropriate, like Apache with mod_proxy, HAProxy, nginx, or Pound).
We currently run two Tomcat 5.5 web containers, each on a dual-core Xeon processor with about 4 GB of RAM. Because CNDA is running on XNAT 1.4, the app servers also run DicomServer. The Kemp is configured to route traffic to a single app server until that app server fails. A single app server is sufficient to handle our daily traffic.
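The routing policy described here (send everything to one app server and switch only when it fails) amounts to picking the first healthy server in a fixed priority order. The sketch below illustrates that policy only. It is not the Kemp LoadMaster's configuration or API, the host names are hypothetical, and the health state is simulated where a real balancer would probe each server.

```python
from typing import Callable, Optional, Sequence


def pick_app_server(
    servers: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy server in priority order, or None."""
    for server in servers:
        if is_healthy(server):
            return server
    return None


# Hypothetical host names, listed in priority order.
APP_SERVERS = ["app1.example.org", "app2.example.org"]

# Simulated health state; a real balancer would probe each server.
health = {"app1.example.org": True, "app2.example.org": True}

before_failure = pick_app_server(APP_SERVERS, lambda s: health[s])
health["app1.example.org"] = False  # simulate failure of the active server
after_failure = pick_app_server(APP_SERVERS, lambda s: health[s])

print(before_failure, "->", after_failure)
```

Unlike round-robin balancing, this keeps the second server idle as a standby, which matches the observation that a single app server is enough for daily traffic.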
A single instance of PostgreSQL 8.3 runs on a quad core Xeon processor with 16GB of RAM. We have tuned PostgreSQL to take advantage of available memory.
XNAT's archive is located on an NFS mount of a BlueArc NAS. The NAS transparently handles replication to a second instance running remotely. We currently have dedicated roughly 20TB of the BlueArc to the CNDA.
The CNDA uses a variety of machines for its pipeline processing via the Sun Grid Engine. These include Solaris and Linux machines, both 32-bit and 64-bit, which are used as required by the specific pipeline.