Adding Custom Image Import Processors

Processors are a feature we added to XNAT in 1.7.5 and refined further in 1.7.6. They power the relabeling that the DICOM Query-Retrieve (DQR) Plugin can perform, and they can be used for a variety of other things as well. For example, maybe data coming in to a certain SCP receiver should have additional anonymization or changes to its DICOM tags. Or maybe you want XNAT to reject data coming in to a certain SCP receiver if it's not coming from the scanner you expect to be sending to that receiver. Almost any processing you want can be achieved by creating a processor class, adding it to a plugin, and making a few REST calls to configure it. While flexible enough to accommodate a variety of use cases, processors were designed with DQR relabeling in mind, and relabeling is the first feature to take advantage of them.

This is a powerful feature that allows site administrators a great deal of flexibility, but it does have some limitations that are important to understand.

Limitations

There are three key points to understand about processors when figuring out whether they are appropriate for your use case.

They can only be used during import.
They can only be injected into certain places in the import process.
They will not work on all imports (primarily just data coming in to an SCP receiver).

Import only

XNAT processors are designed to take a DICOM file that was received by XNAT and do some operation based on it. This can involve making changes to the values in the DICOM, but it can also involve whatever you want to code up, such as sending an email to the admin when DICOM files that meet certain conditions are received.

For more general cases of having code you want to run in response to something happening in XNAT, XNAT's automation and event services are typically more appropriate. For example, if you want something to be performed when data is archived or later modified, processors would not be appropriate.

Finite injection points/ locations

There are three points in the import code where we think people are most likely to want to inject their own custom code. Each injection point is identified by a descriptive string that indicates when processors at that injection point will be run. Each processor instance you create has a location field that indicates the point at which its code should be run.

AfterDicomRead: The first injection point / location.

This is the location to add any processing that should be done as soon as XNAT receives new DICOM. At this point, XNAT has not yet even tried to figure out what project this data is supposed to be put into. If you wanted to notify the admin whenever, for example, data from an unknown scanner was received, this would be a good place to add code to send them an email. However, if you wanted the name of the project the DICOM was being imported into to be sent to the admin, then you would want to add your processor to the second location instead.

AfterProjectSet: The second injection point / location.

The only thing XNAT does between the first and second locations is set the project. So if you only want a processor to be run on data coming in to specific projects (which you can configure using the processor instance's projectIdsList field, discussed later), you should not set up your processor to run in the first location. However, if you want your processor to do something that might change what project the data will be mapped to in XNAT, you would want to run that processor in the first location. The processor that handles relabeling is configured by default to run at the second location, so that the project is already known (in case relabeling should only be done for certain projects) but XNAT has not yet assigned subject and session labels to the data (so any relabeling of the Patient's Name and Study ID DICOM fields will be reflected in the subject and session you see for the session in the prearchive).

AfterAddedToPrearchiveDatabase: The third injection point / location.

At this point the project, subject, and session have all already been assigned by XNAT and the session has been added to the prearchive database. This is fairly late in the import process and is useful when you want to be able to see what subject and session the data is going into. This is the point in XNAT at which the XNAT site anonymization script is run by default.

Gradual DICOM Importer

One last important limitation is that processors currently only work on data coming in through the GradualDicomImporter (which is used for all imports sent to an XNAT SCP receiver, as well as for a couple of other imports). While most of your data is probably coming in through GradualDicomImporter, there are ways (such as the compressed uploader) that data can get into XNAT via other importers.

Processors that come with XNAT

There are two processors that come with XNAT: the MizerArchiveProcessor, which performs site anonymization, and the StudyRemappingArchiveProcessor, which runs a study anonymization script to remap fields on incoming data if the studyInstanceUID of the incoming data matches the studyInstanceUID associated with the stored anonymization script. Currently, these associations between studyInstanceUID and anonymization script are only made when importing from PACS using DQR, so this processor will have no effect on data not imported using DQR or some other plugin written to take advantage of this feature.

For each DICOM SCP receiver in your XNAT, you can either have processors be ignored, so that data is imported as it always has been in XNAT, or have incoming data be processed with whatever processors you have configured. This is controlled by the custom processing setting on SCP receivers. By default, custom processing is disabled, but admins can edit SCP receivers to turn it on.

Processor Instances

As discussed above, XNAT comes with two processors: MizerArchiveProcessor and StudyRemappingArchiveProcessor. It also comes with two processor instances defined, one for each of these processors. A processor instance is how you tell XNAT that you want a certain processor to be run at a certain time in a certain way for certain data coming in to certain receivers or projects. There is a lot of ability to configure how you want a processor to be run and sometimes you want processing to be a little different for different data. Sometimes you're able to express the configuration you want with only a single processor instance for each type of processor, but sometimes you might need more than one.

For example, let's say that you have two different SCP receivers (let's call them A and B) where data from two different sources comes in. Let's say that data coming in to receiver A has generally already been anonymized before reaching XNAT, while data coming in to receiver B has not. In this case you could simply edit the default processor instance for MizerArchiveProcessor to add receiver A to the scpBlacklist so that site anonymization will not be run on data coming in to that receiver. However, maybe you realize that the owner of the project Nonconformist in XNAT likes importing their data using receiver A even though their data should be anonymized on import because it hasn't been anonymized yet. In this case you could create a new processor instance for MizerArchiveProcessor that has only receiver A in its scpWhitelist and the project Nonconformist in its projectIdsList. In that case, the existing processor instance that runs site anonymization on data coming in to receiver B will be unchanged, but there will also be a processor instance that runs site anonymization on data coming in to the Nonconformist project through receiver A. This will become clearer after we have discussed more details on processor instances, but the main point is that often you will be fine having only a single processor instance for each processor, but sometimes you will want multiple processor instances for some of your processors.

Defining a Processor Instance

Processors are configured by creating/modifying/deleting processor instance objects. Here are the two processor instances that XNAT creates for you by default:

Default Processor Instances in XNAT

[
  {
    "label": "Remapping",
    "scope": "site",
    "scpWhitelist": [],
    "scpBlacklist": [],
    "location": "AfterProjectSet",
    "priority": 10,
    "parameters": {},
    "processorClass": "org.nrg.xnat.processors.StudyRemappingArchiveProcessor",
    "projectIdsList": [],
    "id": 2,
    "enabled": true
  },
  {
    "label": "Site Anonymization",
    "scope": "site",
    "scpWhitelist": [],
    "scpBlacklist": [],
    "location": "AfterAddedToPrearchiveDatabase",
    "priority": 10,
    "parameters": {},
    "processorClass": "org.nrg.xnat.processors.MizerArchiveProcessor",
    "projectIdsList": [],
    "id": 1,
    "enabled": true
  }
]

Note that these are created by XNAT and exist whether you use the DQR plugin or not (though they won't currently do much without DQR). You can modify and delete these processor instances and they will stay modified/deleted. XNAT will only add them if there is no history of them ever having existed on your XNAT (such as when upgrading from an earlier version of XNAT).

Here is what the different processor instance fields mean:

Processor instance field	What it does
label	A String that identifies which processor instance this is. This will be used when processor instances are listed in the UI (in future versions of XNAT).
scope	A String representing the scope of the incoming data that should hit the processor code. Currently, the only supported scope is "site", which means that all data coming in will be checked by the code in this processor instance's processor class to see whether it should be processed (all data coming in on an SCP receiver that has custom processing turned on and that is configured to use this processor). In the future, additional scopes could be added, such as project scope, in which a project-scoped processor is only even considered if data is coming into its project (however, since you can already restrict whether you want a processor to be run on data coming in to a specific project using the projectIdsList field discussed below, it may not make sense to have a separate project scope).
scpWhitelist	This is a list of all the SCP receivers whose incoming data you may want to process with this instance's processor. If empty, this will not restrict which SCP receivers' data this processor can be run on. This should be formatted as a list of Strings in the format AE:PORT (e.g. ["XNAT:8104","XNAT2:8105"] ). This will not override any other restrictions (such as those defined by the projectIdsList), but merely indicates that this processor instance will not be used for SCP receivers that are not on this list.
scpBlacklist	This is a list of all the SCP receivers whose incoming data you never want to process with this instance's processor. If empty, this will not restrict which SCP receivers' data this processor can be run on. This should be formatted as a list of Strings in the format AE:PORT (e.g. ["XNAT:8104","XNAT2:8105"] ).
location	Location represents the stage in the process at which the processor instance should be executed. The allowable locations are "AfterDicomRead", "AfterProjectSet", and "AfterAddedToPrearchiveDatabase". The details of what these mean are discussed in the "Finite injection points/ locations" section.
priority	Priority indicates the order in which the processor instances in a given location should be executed. If there are 5 enabled processor instances in location "AfterProjectSet", the processor instance with the smallest number for priority will be executed first (so if the priorities were 1,2,3,4,5, they would be executed in that same order).
parameters	The parameters field is a map of String to String and can contain whatever information you want to pass in to the processor's accept and process methods in order to customize the behavior of that instance of the processor. For example, you could write a very general purpose processor to change the values in an incoming DICOM file and create processor instances of it whose parameter maps each contain a JSON string defining exactly what changes should be made to that data.
processorClass	The processor class should be a String containing the location of the processor class (the Java class's package, followed by a period, followed by the class name). This class must exist, either in core XNAT code, or in a plugin you installed.
projectIdsList	This is the list of projects (by ID) that should use this processor instance. If it is non-empty, only data coming in to projects in this list will get the processing defined in this processor instance; if it is empty, it does not restrict which projects' data can be processed. This relies on the project already having been set, so a non-empty list should never be used with location="AfterDicomRead".
id	This is an automatically assigned internal identifier.
enabled	This indicates whether the processor instance should be active. If disabled, it will have no effect until you enable it again.
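
To make the priority rule concrete, here is a minimal sketch of how the enabled processor instances at one location could be ordered. It is plain Java and does not use XNAT's own classes: InstanceSketch and PriorityOrderSketch are hypothetical names invented for this illustration, and the sketch says nothing about how XNAT treats two instances that share the same priority.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a processor instance; only the fields needed here.
final class InstanceSketch {
    final String label;
    final String location;
    final int priority;
    final boolean enabled;

    InstanceSketch(String label, String location, int priority, boolean enabled) {
        this.label = label;
        this.location = location;
        this.priority = priority;
        this.enabled = enabled;
    }
}

final class PriorityOrderSketch {
    // Labels of the enabled instances at the given location, smallest priority number first.
    static List<String> executionOrder(List<InstanceSketch> instances, String location) {
        return instances.stream()
                .filter(i -> i.enabled && i.location.equals(location))
                .sorted(Comparator.comparingInt((InstanceSketch i) -> i.priority))
                .map(i -> i.label)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<InstanceSketch> instances = Arrays.asList(
                new InstanceSketch("Fix tags", "AfterProjectSet", 20, true),
                new InstanceSketch("Remapping", "AfterProjectSet", 10, true),
                new InstanceSketch("Disabled one", "AfterProjectSet", 1, false),
                new InstanceSketch("Site Anonymization", "AfterAddedToPrearchiveDatabase", 10, true));
        // Disabled instances and instances at other locations are left out.
        System.out.println(executionOrder(instances, "AfterProjectSet"));
    }
}
```

In this sketch the "Remapping" instance runs before "Fix tags" because 10 is smaller than 20, and the disabled instance is skipped regardless of its priority.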

There is currently no UI for adding, viewing, or modifying processor instances, but there are REST calls. For more information on what processor instance REST calls are available, admins can go to their site's Swagger page (by going to /xapi/swagger-ui.html or following the link from Administer-> Site Administration-> Other-> Miscellaneous-> Development Utilities and opening up the archive-processor-instance-api section). You can also perform those REST calls using the UI on that page.

If for some reason you wanted to have the site anonymization script be run twice, instead of modifying the existing one, you could create a new processor instance whose processorClass is org.nrg.xnat.processors.MizerArchiveProcessor. If you only have the MizerArchiveProcessor and the StudyRemappingArchiveProcessor, it may not make sense to want to have multiple processor instances for one class. However, if you start writing your own processor classes, you may find the ability to have as many processor instances as you want for a given processor class very useful.

For example, let's say that my XNAT is receiving data from scanner A on SCP receiver XNAT:8104, data from scanner B on SCP receiver XNAT2:8105, and data from scanner C on SCP receiver XNAT3:8106. Sometimes I discover that values of some of the DICOM tags are being set incorrectly. I don't know every way that tags will be incorrect in the future, so I want to write a flexible processor class that will allow me to fix these DICOM tags without having to restart Tomcat every time I discover a new way that these tags are being set incorrectly. I could write a processor class that looks for three fields in the processor instance parameters map: tag, regex, newValue. It looks at the DICOM coming in to see what the value of the DICOM tag indicated by the 'tag' parameter is, checks whether it matches the regex in the 'regex' parameter, and if so, changes the value of that tag in the DICOM to the value of the 'newValue' parameter.
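
The core of such a processor could look roughly like the sketch below. This is only an illustration of the tag/regex/newValue idea under some assumptions: a plain Map of tag to value stands in for the incoming DICOM object, TagFixSketch and applyRule are hypothetical names, and "matches" is taken to mean that the regex must match the whole value. A real processor would extend AbstractArchiveProcessor and read these parameters from the processor instance instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class TagFixSketch {
    // Applies one tag/regex/newValue rule to the (stand-in) DICOM tags.
    // Returns true if the tag's value was replaced.
    static boolean applyRule(Map<String, String> dicomTags, Map<String, String> parameters) {
        String tag = parameters.get("tag");
        String regex = parameters.get("regex");
        String newValue = parameters.get("newValue");
        if (tag == null || regex == null || newValue == null) {
            return false; // incomplete rule: leave the data untouched
        }
        String current = dicomTags.get(tag);
        // Assumption: the regex has to match the entire current value.
        if (current != null && Pattern.matches(regex, current)) {
            dicomTags.put(tag, newValue);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> dicomTags = new HashMap<>();
        dicomTags.put("(0008,0080)", "Hosptial A"); // misspelled institution name

        Map<String, String> parameters = new HashMap<>();
        parameters.put("tag", "(0008,0080)");
        parameters.put("regex", "Hosp.*");
        parameters.put("newValue", "Hospital A");

        System.out.println(applyRule(dicomTags, parameters));
        System.out.println(dicomTags.get("(0008,0080)"));
    }
}
```

Because the rule lives entirely in the parameters map, fixing a newly discovered problem only requires a new processor instance with different parameters, not a code change.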

Once I have written this processor class, added it to a plugin, and added that plugin to my XNAT, I can create a new processor instance whenever I discover an issue with one of the tags coming in from one of the scanners, so that future similarly messed up data coming in to that SCP receiver will have its tag set to the correct value. I might want to log that this change was made, and if I thought that I would only want to log this some of the time, I could have another parameter on my processor instances indicating whether the change should be logged. So processors can range from being completely non-customizable (with processor instances of them only affecting things like what SCP receivers they should be used for and when in the import process they should be executed) to being completely customizable and written so that they could do anything you could envision ever wanting to do. While the latter would likely be a bad idea, you might find it useful to write your processor classes so that there are a couple of things that can be configured via parameters.

Creating Processor Classes

It's important to highlight again the difference between a processor and a processor instance. A processor instance indicates when a processor should be run, and with what parameters. You might want a processor to run in a certain way for one SCP receiver and a different way for a different SCP receiver. A processor is a Java class with the @Component annotation that extends AbstractArchiveProcessor and has two methods: accept and process. The accept method returns a boolean for whether that processing should be attempted on the incoming data. If it returns false, the import process continues without that processing taking place. If it returns true, the process method is called. If the process method returns false or an exception is thrown, the import process stops. The import process also stops if an exception is thrown in the accept method.
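
The way accept and process interact can be summarized in a small sketch. It is a simplified model rather than XNAT's actual import code: ProcessorSketch, ImportFlowSketch, and importContinues are hypothetical names, a String stands in for the incoming DICOM, and a stopped import is modelled as a false return value instead of a propagated exception.

```java
final class ImportFlowSketch {
    // Hypothetical, stripped-down view of a processor's two methods.
    interface ProcessorSketch {
        boolean accept(String dicom) throws Exception;
        boolean process(String dicom) throws Exception;
    }

    // Returns true if the import continues after this processor, false if it stops.
    static boolean importContinues(ProcessorSketch processor, String dicom) {
        try {
            if (!processor.accept(dicom)) {
                return true; // not accepted: processing is skipped and the import continues
            }
            return processor.process(dicom); // a false result stops the import
        } catch (Exception e) {
            return false; // an exception in accept or process stops the import
        }
    }

    // Builds a processor with fixed answers, for demonstration only.
    static ProcessorSketch fixed(final boolean acceptResult, final boolean processResult) {
        return new ProcessorSketch() {
            @Override public boolean accept(String dicom) { return acceptResult; }
            @Override public boolean process(String dicom) { return processResult; }
        };
    }

    public static void main(String[] args) {
        System.out.println(importContinues(fixed(false, false), "dicom")); // skipped, import continues
        System.out.println(importContinues(fixed(true, true), "dicom"));   // processed, import continues
        System.out.println(importContinues(fixed(true, false), "dicom"));  // process failed, import stops
    }
}
```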

There is a default version of the accept method in AbstractArchiveProcessor which you do not need to override in your processor class. If you use the version from AbstractArchiveProcessor, all it will do is call processorConfiguredForDataComingInToThisScpReceiverAndProject, which will return true if and only if the projectIdsList contains the data's project (or is empty) AND the SCP receiver the data came in on is not in the scpBlacklist AND that receiver is either in the scpWhitelist or the scpWhitelist is empty. If this is not sufficient and you want to check something else, you can add an accept method to your processor class and add whatever checks you want. For example, in the accept method of the MizerArchiveProcessor class (shown in the code block below), we return false if the prevent anonymization flag is set to true or if the sitewide anonymization script is not enabled.

If we don't return false for one of those reasons, we then execute the processorConfiguredForDataComingInToThisScpReceiverAndProject method and return the value of that.

While you almost certainly still want to run processorConfiguredForDataComingInToThisScpReceiverAndProject as part of your accept method, you do have the power to add additional checks and order them in whatever way you wish (for example if there's a check that is quick to run that will lead to rejecting most incoming data, you may want to run that check first so that more time consuming checks run more rarely).
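
The rule that the default accept method applies can be written out as a small sketch. This is a model of the rule as described above, not the actual XNAT implementation: ScpFilterSketch and configuredFor are hypothetical names, and receivers are represented as AE:PORT strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ScpFilterSketch {
    // True if and only if the project list allows the project, the receiver is not
    // blacklisted, and the whitelist is either empty or contains the receiver.
    static boolean configuredFor(String receiver, String project,
                                 List<String> scpWhitelist, List<String> scpBlacklist,
                                 List<String> projectIdsList) {
        boolean projectAllowed = projectIdsList.isEmpty() || projectIdsList.contains(project);
        boolean notBlacklisted = !scpBlacklist.contains(receiver);
        boolean whitelistAllows = scpWhitelist.isEmpty() || scpWhitelist.contains(receiver);
        return projectAllowed && notBlacklisted && whitelistAllows;
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        // No restrictions at all: every receiver and project qualifies.
        System.out.println(configuredFor("XNAT:8104", "Nonconformist", none, none, none));
        // Receiver is blacklisted for this instance.
        System.out.println(configuredFor("XNAT:8104", "Nonconformist", none, Arrays.asList("XNAT:8104"), none));
        // Whitelist and project list both match.
        System.out.println(configuredFor("XNAT:8104", "Nonconformist",
                Arrays.asList("XNAT:8104"), none, Arrays.asList("Nonconformist")));
    }
}
```

An accept method that adds its own checks would typically run the cheap ones first and only then fall through to a check like this one.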

Here's what the accept method looks like for MizerArchiveProcessor:

MizerArchiveProcessor.accept method

    @Override
    public boolean accept(final DicomObject dicomData, final SessionData sessionData, final MizerService mizer, ArchiveProcessorInstance instance, Map<String, Object> aeParameters) throws ServerException{
        try {
            // check to see if this session came in through an application that may have performed anonymization
            // prior to transfer, e.g. the XNAT Upload Assistant.
            if (sessionData.getPreventAnon()){
                log.debug("The session {} {} {} has already been anonymized by the uploader, proceeding without further anonymization.", sessionData.getProject(), sessionData.getSubject(), sessionData.getName());
                return false;
            }
            else if (DefaultAnonUtils.getService().isSiteWideScriptEnabled()){
                return processorConfiguredForDataComingInToThisScpReceiverAndProject(sessionData, instance, aeParameters);
            }
            else {
                return false;
            }
        } catch (Throwable e) {
            log.debug("Failed check of whether dicom anonymization could be performed: " + dicomData, e);
            //Throw exception so we don't just proceed with importing the data without anonymization.
            //I'm not certain whether this is what we want, but this is how it currently works and I don't want to mess anything up.
            throw new ServerException(Status.SERVER_ERROR_INTERNAL, e);
        }
    }

The process and accept methods both have five parameters. They take in a DicomObject with all the DICOM tags of the incoming data, a SessionData object with data such as the project and subject of the session and the directory on the filesystem where the session's files are located, a MizerService object you can use for performing anonymization, the ArchiveProcessorInstance object which you can invoke the getParameters() method on to get a map of the parameters that should be used in this execution of the processor, and a Map containing information about the SCP receiver the data came in on (which is used when checking against the SCP whitelist and blacklist).

Below is a code block showing what the process method of the MizerArchiveProcessor class looks like. It gets data about the session from the sessionData parameter, gets the sitewide anonymization script, and uses that script to anonymize the incoming DICOM data.

MizerArchiveProcessor.process method

@Override
public boolean process(final DicomObject dicomData, final SessionData sessionData, final MizerService mizer, ArchiveProcessorInstance instance, Map<String, Object> aeParameters) throws ServerException{
    try {
        Configuration c = DefaultAnonUtils.getCachedSitewideAnon();
        if (c != null && c.getStatus().equals(Configuration.ENABLED_STRING)) {
            //noinspection deprecation
            Long scriptId = c.getId();
            String proj = "";
            String subj = "";
            String folder = "";
            if(sessionData!=null){
                proj = sessionData.getProject();
                subj = sessionData.getSubject();
                folder = sessionData.getFolderName();
            }
            mizer.anonymize(dicomData, proj, subj, folder, c.getContents());
        } else {
            log.debug("Anonymization is not enabled, allowing session {} {} {} to proceed without anonymization.", sessionData.getProject(), sessionData.getSubject(), sessionData.getName());
        }
    } catch (Throwable e) {
        log.debug("Dicom anonymization failed: " + dicomData, e);
        throw new ServerException(Status.SERVER_ERROR_INTERNAL,e);
    }
    return true;
}

Adding the Processor Class to a Plugin

Once you have written a Java class that extends AbstractArchiveProcessor, is annotated with @Component, and overrides the AbstractArchiveProcessor process method (and possibly its accept method as well), you are ready to add it to a plugin. If you already have a plugin, you can add it to that plugin, or you can create a new plugin (see Developing XNAT Plugins for more information on plugin creation/development). You will need to make sure that you add a @ComponentScan annotation to your plugin class which includes the package within your plugin that you put your custom processor in. Once you have built your plugin jar, put it in the plugins directory, and restarted Tomcat, you should be able to create processor instances (by going into Swagger and doing a POST to xapi/processors/site/create) whose processor class equals the Java class's package, followed by a period, followed by the class name.

Things to Check if Your Processor is Not Processing

If you are trying to have a processor process incoming data, but it doesn't appear to be working, here are a few general things you can check:

Check whether there is a processor instance for the processor that is set to enabled.
Check whether the SCP receiver you're trying to receive the data on is enabled, has Custom Processing turned on, is not in the scpBlacklist for this processor instance, and is either in the scpWhitelist or the scpWhitelist is empty.
Either attach the debugger and step through the code in the processor class or add logging statements to make sure that the accept and process methods are being run on existing data and there are not exceptions causing processing to stop earlier than you'd like.

If you created your own processor class but it doesn't appear to be doing any processing, there are a few additional things you can check:

Check whether the plugin containing your processor is in the list of installed plugins (http://localhost:8081/app/template/Page.vm?view=admin#tab=plugins).
Check whether your processor extends AbstractArchiveProcessor, is annotated with @Component, and there is a @ComponentScan annotation on your plugin class which includes the package within your plugin that you put your custom processor in.
Perform a GET to xapi/processors/classes to make sure XNAT is able to access your processor class.

If the MizerArchiveProcessor that comes with XNAT doesn't seem to be working, there are a few additional things you can check:

If you go to Administer → Site Administration → Manage Data → Session Upload, Import & Anonymization → Anonymization Script (Site Wide), is the site-wide script enabled, and is it non-empty?
Try adding a new line to the anonymization script, sending data to XNAT, and then viewing the DICOM headers on the data once it gets in to XNAT to see if the change was made. For example, you could add a line like:
```
(0008,0080) := "This is a test."
```

If the StudyRemappingArchiveProcessor that comes with XNAT doesn't seem to be working, there are a few additional things you can check:

Does it still fail to work when sending data that you have not ever sent to your XNAT before? If you previously imported the session and it is sitting in the prearchive, XNAT may still think it needs to perform the remapping when it comes in again.
Is the location field in the processor instance set to "AfterProjectSet"? Otherwise the relabeling may not work right.