The text here is mostly taken from an email chain between Rick, Dan, and Tim.
Load balancing across multiple XNAT servers would be configured using the mod_proxy and mod_proxy_balancer modules of Apache HTTP Server 2.2. This provides the quickest implementation path for load balancing because it supports back-end stickiness, which obviates the requirement for back-end session pooling. Using weighted traffic counting as the balancing criterion allows both for heterogeneous hardware environments and for unbalanced client request patterns.
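A sticky, weighted setup along those lines might look like the following sketch. The host names, ports, route names, and load factors are placeholders, not actual XNAT deployment values; the route suffix on JSESSIONID assumes each Tomcat's jvmRoute is set to the matching route name.

```apache
# Sketch of a mod_proxy_balancer front end for two XNAT Tomcats.
<Proxy balancer://xnatcluster>
    # loadfactor weights traffic toward the more capable box
    BalancerMember http://tomcat1.example.org:8080 route=node1 loadfactor=2
    BalancerMember http://tomcat2.example.org:8080 route=node2 loadfactor=1
    # byrequests = weighted request counting; stickysession pins a
    # session to the back end named in its JSESSIONID route suffix
    ProxySet lbmethod=byrequests stickysession=JSESSIONID
</Proxy>

ProxyPass        /xnat balancer://xnatcluster/xnat
ProxyPassReverse /xnat balancer://xnatcluster/xnat
```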
The caching mechanisms within XNAT will be refactored to maintain consistent session state across multiple XNAT instances running on separate Tomcat servers. Existing caching structures will be moved to a common externalizable API, Ehcache.
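As a rough illustration of what moving to Ehcache implies on the configuration side, a minimal ehcache.xml could look like the fragment below. The cache name, sizes, and timeouts are made up for the sketch and are not XNAT's actual settings.

```xml
<!-- Illustrative ehcache.xml replacing a home-grown cache. -->
<ehcache>
  <defaultCache maxElementsInMemory="10000"
                eternal="false"
                timeToIdleSeconds="300"
                timeToLiveSeconds="600"
                overflowToDisk="false"/>
  <!-- Hypothetical cache for per-user session state -->
  <cache name="userSessionCache"
         maxElementsInMemory="5000"
         eternal="false"
         timeToIdleSeconds="1800"/>
</ehcache>
```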
Refactoring the caching mechanisms away from XNAT's existing home-grown solution will also let us use more advanced load balancing strategies. Back-end stickiness doesn't strictly require the caching changes: each user session is tied to a particular Tomcat instance, so session and state pooling aren't a concern. The problem is that if two users are assigned to the same Tomcat and one starts to increase the load on the server, the load balancer can't dynamically shift the second user to another Tomcat to balance that load; users can only be moved in subsequent sessions. So pooled session and state caching is better overall, because then you're doing real dynamic load balancing.
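The sticky-versus-pooled trade-off can be shown with a toy simulation (not XNAT code; the server names and per-session load numbers are invented for the example). Under stickiness a session is placed once and never moves, so two heavy users can end up stuck on the same Tomcat; with pooled state the balancer can always pick the least-loaded server.

```python
# Toy model: sticky assignment vs. dynamic (pooled-state) assignment.

def sticky_assign(sessions, servers):
    """Place each session round-robin at session start; it then stays
    on that server for its whole lifetime, regardless of load."""
    placement = {}
    for i, (session, _cost) in enumerate(sessions):
        placement[session] = servers[i % len(servers)]
    return placement

def dynamic_assign(sessions, servers):
    """With pooled session state, each session can be sent to the
    currently least-loaded server."""
    load = {s: 0 for s in servers}
    placement = {}
    for session, cost in sessions:
        target = min(load, key=load.get)
        placement[session] = target
        load[target] += cost
    return placement

def server_load(placement, sessions):
    """Total load each server ends up carrying."""
    load = {}
    for session, cost in sessions:
        server = placement[session]
        load[server] = load.get(server, 0) + cost
    return load

# Two heavy users (alice, carol) land on the same Tomcat under
# round-robin stickiness, and the balancer cannot move them.
sessions = [("alice", 10), ("bob", 1), ("carol", 10), ("dave", 1)]
servers = ["tomcat1", "tomcat2"]
```

Running `server_load` over both placements shows the sticky split at 20 vs 2 while the dynamic split is an even 11 vs 11, which is the imbalance the paragraph above describes.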
The complication raised by caching arises when servers run on physically separate infrastructure, e.g. servers behind a network firewall and on EC2, or at different institutions in the case of a multi-site study. Supporting this implies a number of advanced features as well:
- Distributed Caching Using Ehcache on Amazon EC2
- Creating Terracotta Server Arrays with EC2 CloudFormation for use by Ehcache
- Distributed cache pools that are accessible from multiple locations. There are significant security concerns here, especially around authentication and possible breach of PHI data.
- Geo-location or time-to-response selection of back-end servers.
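For the Terracotta-backed case, a clustered cache would be declared in ehcache.xml roughly as below. The Terracotta server address and cache name are placeholders for whatever the EC2 CloudFormation deployment actually provisions, and this fragment ignores the transport-security hardening that PHI data would demand.

```xml
<!-- Illustrative Ehcache + Terracotta clustering fragment for caches
     shared across physically separate servers. -->
<ehcache>
  <terracottaConfig url="terracotta1.example.org:9510"/>
  <cache name="sharedSessionCache"
         maxElementsInMemory="5000"
         eternal="false"
         timeToIdleSeconds="1800">
    <!-- Marks this cache as clustered via the Terracotta server array -->
    <terracotta/>
  </cache>
</ehcache>
```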