# DicomEdit 6.3 Language Reference

Author: Dave Maffitt

DicomEdit defines a small language for scripting modifications to DICOM metadata and pixel data. This document describes version 6.3 of the DicomEdit language and the extensions used in XNAT. This version is compatible with the previous 6.0, 6.1, and 6.2 versions.

This version applies to XNAT 1.8 and greater.

## New Features in DicomEdit 6.3

DicomEdit 6.3 is backwards compatible with 6.0, 6.1, and 6.2. All previous 6.x scripts should work unchanged.

New! Alter Pixels function: Blank PHI burned into pixel data.

New! Match and IsMatch functions aid in conditional execution or extracting substrings to build values.

New! Shift date-time by increment function:  Helps obscure dateTimes for privacy but retains the intervals between dates.

New! Set function:  Enables direct addressing of private tags in non-conforming DICOM.

New! Assign-if-exists operation '?=':  Assign value to an attribute only if it already exists in the DICOM. This complements the existing assign-always operation ':=' that will create the attribute if it does not exist.

## Syntax Basics

DicomEdit scripts are composed of a sequence of statements. Each statement must be on one line. The two-character sequence // indicates the start of a comment; everything on a line after // is ignored.

DicomEdit scripts for versions 6+ must be marked with a version. Versioning of scripts was introduced with XNAT 1.7.3. A significant break in syntax occurred then between the older DicomEdit 4.x syntax and the newer DicomEdit 6.x syntax. XNAT supports both versions. XNAT assumes that scripts without a version statement use DicomEdit version 4.x or earlier.

// The leading statement should be the version identifier
version "6.3"

## Language Elements

DicomEdit scripts are composed of four major elements:

String literals are delimited by quotes ("") and contain a concrete value. Examples: "Jones^Desmond", "1.3.12.2.1107.5.2.32.35177.1.1", "" (empty string).

Identifiers are names that can represent either user-defined variables or the names of functions. Identifiers consist of a sequence of letters, digits, and the underscore character _, except that no identifier may begin with a digit. Examples: format, uppercase, patient_name.

Operators are symbols or words that represent actions to be performed. Some symbolic operators are := (value assignment or variable initialization), ?= (value assignment only if the attribute exists), - (attribute deletion), and = (value comparison). Some word operators are echo (print a value to the console output) and describe (define variable characteristics).

Tagpaths represent a DICOM attribute or a set of attributes. Depending on context, a tagpath may represent a location or it may represent the value at that location. For example, on the left side of an assignment operator, a tagpath represents the location at which to do the assignment.  On the right side of the assignment operator, the tagpath represents the value at that location which is to be assigned.

• Single Tag: The simplest tagpath represents a single attribute in the same notation used in the DICOM standard: (ggg@,eeee), where ggg@ is the group and eeee is the element. Here g and e are any hex digit and @ is an even hex digit. All standard DICOM attributes have even group numbers. Example: (0010,0010) is the Patient Name attribute.
• Element Wildcards: Tagpaths may use element wildcards to match a range of attributes. The element wildcards are X (or x), #, and @, and any hex digit of a simple tag may be replaced with one. X or x matches any hex digit, # matches odd hex digits, and @ matches even hex digits. Example: (50x@,xxxx) matches all elements in all even groups 5000 through 50FE.
• Sequences: DICOM attributes can be sequence attributes. These are attributes that contain zero or more items, where each item is a list of zero or more attributes. A specific attribute within a sequence is addressed in DicomEdit by (gggg,eeee)[<item-number>]/(gggg,eeee). Example: (0008,0110)[0]/(0008,0115) matches the Coding Scheme Name, (0008,0115), in the first item, [0], of the Coding Scheme Identification Sequence (0008,0110).
• Item-number wildcard: %.  Item number can be replaced with '%' to match all item numbers.
• Sequence wildcards: '*', '+', and '.'  Sequences can be nested arbitrarily deep. A sequence element may be replaced by a sequence wildcard.  The asterisk matches zero or more levels. The dot '.' matches one level.  The plus '+' matches 1 or more levels. Example: * / (0010,0010) will match Patient Name anywhere it occurs.
• Private Tags: Private data elements do not have unique tags. Instead, they are mapped into one of a block of tags, and the block used can vary depending on the context in which the DICOM object was created and processed. DicomEdit uses a syntax that accounts for the variability of private tag addresses, so scripts can be written that apply across different environments. A particular private attribute can appear anywhere in the range (ggg#,XXee). In any particular DICOM object, the block address is determined by a value in the Private Creator Data Element range (ggg#,0010-00FF). The value of an attribute in this range is a string that names a block of reserved private tags. The last two hex digits of a Private Creator Data Element reserve the block of private data elements whose element numbers begin with those two hex digits. For example, Siemens defines the Diffusion b-value attribute at (0019,xx0C) as one of a set of private codes named "SIEMENS MR HEADER". The diffusion b-value attribute would be found at (0019,100C) if the Private Creator Data Element (0019,0010) contains the value "SIEMENS MR HEADER". If the value of (0019,0010) were not "SIEMENS MR HEADER", the value at (0019,100C) would not be the diffusion b-value but a different element defined by another creator. Thus, Siemens' Diffusion b-value can be specified uniquely with the notation (0019,{SIEMENS MR HEADER}0C). DicomEdit knows to find the corresponding Private Creator Data Element and map the tag to the correct location. The DICOM specification for private tags is found here: http://dicom.nema.org/medical/dicom/current/output/html/part05.html#sect_7.8.
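
The private-tag mapping described above can be modeled in a few lines of Python. This is only a sketch of the lookup rule, not DicomEdit's implementation. The function name `resolve_private_tag` and the dictionary used to stand in for a DICOM object are assumptions made for illustration.

```python
def resolve_private_tag(dataset, group, creator, element_low):
    """Return the concrete (group, element) for a creator-qualified private tag.

    dataset is a hypothetical stand-in for a DICOM object: a dict that maps
    (group, element) integer pairs to values. Returns None when the creator
    has not reserved a block in this dataset.
    """
    # Private Creator Data Elements live at (group,0010) through (group,00FF).
    for block in range(0x10, 0x100):
        if dataset.get((group, block)) == creator:
            # The block number becomes the first two hex digits of the element.
            return (group, (block << 8) | element_low)
    return None


# (0019,{SIEMENS MR HEADER}0C) in an object where (0019,0010) names the block
example = {(0x0019, 0x0010): "SIEMENS MR HEADER"}
resolved = resolve_private_tag(example, 0x0019, "SIEMENS MR HEADER", 0x0C)
print("(%04X,%04X)" % resolved)
```

With the creator string stored at (0019,0010), element 0C resolves to (0019,100C). If the same string were stored at (0019,0011) instead, the result would be (0019,110C), which is why a script that hard-codes (0019,100C) is less portable than one that names the creator.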

## Values

A value is a string produced by evaluating part of the script. String literals evaluate to themselves; tags evaluate to the string representation of the DICOM attribute (or null, if the attribute is not defined); user-defined variables (described in more detail below) evaluate to the value they have been assigned. Built-in functions also produce values, and are described below. Tagpaths can reference single attributes or a set of attributes.

| Value Type | Example | Description |
|---|---|---|
| String literal | (0010,0010) := "Doe^John" | Assign the literal value "Doe^John" to (0010,0010). |
| tagpath: simple | (0008,103E) := (0010,0010) | Assign the value of PatientName (0010,0010) to SeriesDescription (0008,103E). |
| tagpath: with element wildcard | - (0010,001X) | Delete the range of tags (0010,0010) through (0010,001F). |
| tagpath: sequence | (0008,1140)[0]/(0008,1150) := myVariable | Assign the value of the variable 'myVariable' to (0008,1150) in item 0 of the (0008,1140) sequence. |
| tagpath: sequence wildcard | - +/(0010,0010) | Delete PatientName (0010,0010) in any sequence level except the top level. |
| tagpath: private element | retainPrivateTags[ "(0029,{SIEMENS MEDCOM HEADER2}XX)"] | Remove all private tags except those in the SIEMENS MEDCOM HEADER2 block of group 0029. Note that the tag is in quotes because the function needs a tagpath and not the value at that tagpath. |

## Operations

An operation specifies a change to an attribute value. The operations are defined in the following table.

| Operation | Symbol | Since | Example | Meaning |
|---|---|---|---|---|
| Assign - always | := | 6.0 | (0008,0080) := "My Institution" | Assign the literal value "My Institution" to tag (0008,0080). Overwrite the existing value or create the tag if it does not exist. |
| Assign - if exists | ?= | 6.3 | (0008,0080) ?= "My Institution" | Assign the literal value "My Institution" to tag (0008,0080). Overwrite the existing value, or do nothing if (0008,0080) does not exist. |
| Deletion | - | 6.0 | - (0010,1030) | Remove Patient Weight, (0010,1030), from the DICOM. Note this is different from setting the value of the tag to the empty value. |

Examples:

(0008,0080) := "Washington University School of Medicine"

// you can NOT assign to a set of attributes
// assign "new value" to all pre-existing attributes in range (0010,0100-010F)
// Fails
(0010,010X) := "new value"

// you can NOT assign from multiple values.
// Fails
(0008,0080) := (0010,010X)

-(0010,1030)     // delete Patient Weight
// referencing a set of attributes makes more sense in the context of deletion.
-(50X@,XXXX)     // delete Curve Data
-*/(0010,0010)   // delete Patient Name no matter where it appears

// delete private tags
// delete all SIEMENS MEDCOM HEADER attributes, regardless if they are mapped to (0029,10XX), or (0029,11XX) , or ...
- (0029,{SIEMENS MEDCOM HEADER}XX)  
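
To make the element wildcards used in these deletions concrete, the following Python sketch translates a single-tag pattern into a regular expression. It models the matching rule only (X or x for any hex digit, # for odd, @ for even). The helper names `tag_pattern` and `tag_matches` are hypothetical, and sequence wildcards and private-creator syntax are deliberately left out.

```python
import re

# Each element wildcard corresponds to a class of hex digits.
WILDCARDS = {
    "X": "[0-9A-F]",    # any hex digit
    "#": "[13579BDF]",  # odd hex digits
    "@": "[02468ACE]",  # even hex digits
}


def tag_pattern(tagpath):
    """Compile a single-tag pattern such as '(50x@,xxxx)' into a regex."""
    parts = [WILDCARDS.get(ch, re.escape(ch)) for ch in tagpath.upper()]
    return re.compile("".join(parts))


def tag_matches(tagpath, tag):
    """Return True if the concrete tag, e.g. '(50FE,3000)', fits the pattern."""
    return tag_pattern(tagpath).fullmatch(tag.upper()) is not None


print(tag_matches("(50X@,XXXX)", "(50FE,3000)"))  # even group inside 5000-50FE
print(tag_matches("(50X@,XXXX)", "(5001,3000)"))  # odd group, so not matched
```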

Operations occur in the order they appear in the script.

(0010,0010) := "Patient Name 1"
(0010,0010) := "Patient Name 2"

This will result in (0010,0010) containing the value "Patient Name 2".
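
The difference between the three operations, and the effect of statement order, can be modeled on a plain dictionary. This Python sketch is an illustration of the semantics described in this section, not DicomEdit code; the function names and the use of tag strings as dictionary keys are assumptions.

```python
def assign_always(dataset, tag, value):
    """Model of ':=': overwrite the value, or create the attribute."""
    dataset[tag] = value


def assign_if_exists(dataset, tag, value):
    """Model of '?=': overwrite the value only if the attribute is present."""
    if tag in dataset:
        dataset[tag] = value


def delete(dataset, tag):
    """Model of '-': remove the attribute itself, not just its value."""
    dataset.pop(tag, None)


def run_example():
    dataset = {"(0010,0010)": "Doe^John", "(0010,1030)": "70"}
    assign_always(dataset, "(0010,0010)", "Patient Name 1")
    assign_always(dataset, "(0010,0010)", "Patient Name 2")     # later statement wins
    assign_if_exists(dataset, "(0008,0080)", "My Institution")  # absent, so no effect
    delete(dataset, "(0010,1030)")
    return dataset


print(run_example())
```

After the run, only (0010,0010) remains, holding "Patient Name 2". (0008,0080) was never created, and (0010,1030) is gone rather than empty.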

## Conditional Operations

Conditional statements mirror Java's ternary if-then-else operator.  Conditional statements have the form 'condition ? if-true-operation : if-false-operation', where ': if-false-operation' is optional.

| Condition Operator | Meaning |
|---|---|
| = | equal |
| != | not equal |
| ~ | matches |
| !~ | does not match |

For example, the following code sets the Series Description based on the Series Number:

// Set Series Description based on Series Number.
(0020,0011) = "1" ? (0008,103E) := "Series One"
(0020,0011) = "2" ? (0008,103E) := "Series Two"

In addition to exact value matches as above, constraints can use a tilde ~ to specify regular expressions (see the Java Pattern class) to which attribute values will be matched:

(0020,0010) ~ "\d" ? (0008,1030) := "One digit study"
(0020,0010) ~ "\d\d" ? (0008,1030) := "Two digit study"
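
The two regular-expression constraints above can be mimicked in Python to see which statement fires for a given Study ID. One assumption is made here and is not confirmed by this document: the whole value must match the pattern, which is what the "one digit" and "two digit" examples suggest. The function names are hypothetical.

```python
import re


def matches(value, regex):
    """Model of the '~' condition, assuming the whole value must match."""
    return value is not None and re.fullmatch(regex, value) is not None


def study_description(study_id):
    """Apply the two conditional assignments in script order."""
    description = None
    if matches(study_id, r"\d"):
        description = "One digit study"
    if matches(study_id, r"\d\d"):
        description = "Two digit study"
    return description


print(study_description("7"))
print(study_description("42"))
```

A missing attribute evaluates to null, modeled here as `None`, so neither condition holds and no assignment is made.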

Constraints can similarly be applied to deletion operations:

// delete the Series Description for series 1-5
(0020,0011) ~ "[1-5]" ? -(0008,103E)

Uses of the optional action include assigning default values:

default_series_description := "Some other series"
(0020,0011) = "1" ? (0008,103E) := "Series One" : (0008,103E) := default_series_description
(0020,0011) = "2" ? (0008,103E) := "Series Two" : (0008,103E) := default_series_description

## Single-word Operations

DicomEdit provides several single-word operations, either for convenience or to provide special functionality.

| Name | Arguments | Description |
|---|---|---|
| version | String | Provide the DicomEdit language version of this script. |
| removeAllPrivateTags | None | Just what it says. This is the easy way to do this common operation. |
| describe | <variable>, string label | Provide the user-defined variable, <variable>, with the external label, label. |

## Built-in Functions

The DicomEdit language includes several built-in functions to support complex value construction. A built-in function application has the form function[arg-1, arg-2, ...]. A function can return a value and/or it may have silent side effects. The functions included in the base DicomEdit language are described in the following table:

| Name | Arguments | Since | Description |
|---|---|---|---|
| alterPixels | shape, shape-params, fill-pattern, fill-pattern-params | 6.3 | Change pixel values in images with the specified shape and fill pattern. See the alterPixels notes below. |
| concatenate | value-1, value-2, ... | 6.0 | Returns a single value that is the concatenation of the arguments. |
| delete | string-literal tagpath | 6.1 | Delete the single tag addressed by the string. See the delete notes below. |
| format | format-string, value-1, value-2, ... | 6.0 | Formats the values according to the format string, using the same syntax as java.text.MessageFormat. For example, format[ "{1}—{0}", "foo", "bar"] returns "bar—foo". |
| getURL | URL | 6.0 | Retrieves the content of the resource at URL. If a username and password are included in the URL (as described in RFC 3986, section 3.2.1), HTTP Basic Authentication is used. |
| hashUID | UID | 6.0 | Creates a one-way hash UID by first creating a Version 5 UUID (SHA-1 hash) from the provided string, then converting that UUID to a UID. |
| ismatch | value, regex string | 6.3 | Returns true if value matches the regex. |
| lookup | key, value | 6.2 | Returns the value from a lookup table matching the specified key and value. See the lookup notes below. |
| lowercase | String | 6.0 | Converts all uppercase characters in the argument to lowercase. |
| mapReferencedUIDs | prefix, tagpath-1, tagpath-2, ... | 6.2 | Replace the value at all occurrences of the provided tagpaths with a new value composed of the prefix. |
| match | value, regex, group-index | 6.3 | Matches value against the regular expression regex and returns the content of the capturing group with the given index. This is a more powerful and flexible method for extracting substrings than the index-based substring function. |
| newUID | none | 6.0 | Generate a new UID, using the UID root for UUIDs (Universally Unique Identifiers) generated as per Rec. ITU-T X.667 \| ISO/IEC 9834-8. |
| replace | String, target, replacement | 6.0 | Replaces all occurrences of target in the given string with replacement. |
| retainPrivateTags | tagpath-1, tagpath-2, ... | 6.2 | Removes all private tags except those matching specified tagpaths. See the retainPrivateTags notes below. |
| set | string-literal tagpath, value | 6.1 | Set the attribute at the specified tagpath to the given value. See the set notes below. |
| shiftDateByIncrement | date-string, shift (in days) | 6.2 | Return a date string of the specified date shifted by the specified number of days. |
| shiftDateTimeByIncrement | dateTime-string, shift (in seconds) | 6.3 | Return a dateTime string of the specified dateTime shifted by the specified number of seconds. |
| shiftDateTimeSequenceByIncrement | shift (in seconds), DateTime sequence element tagpath as string | 6.3 | Shift all dateTime sequence elements by the specified number of seconds. |
| substring | String, start-index, end-index | 6.0 | Returns a substring beginning at zero-indexed character number start-index and extending to character number end-index - 1. |
| uppercase | String | 6.0 | Converts all lowercase characters in the argument to uppercase. |

### alterPixels

See the table "Supported AlterPixel Algorithms" for allowed parameter values.

alterPixels["rectangle", "l=100, t=100, r=200, b=200", "solid", "v=100"]

### delete

The preferred way to delete tags is with the delete-statement syntax, e.g.

// Preferred DicomEdit-6 delete syntax
- (0019,{SIEMENS}10)

The preferred syntax allows the use of the full tagpath syntax, complete with wildcards and creator-specific private tags. The 'delete' function is useful when the full tagpath syntax fails or is not wanted. This can occur, for example,

1. When the DICOM data is broken and the private-creator-id tag is missing for a block of private tags.
2. When migrating DicomEdit version 4 scripts to version 6. DE-4 allowed syntax like

// DicomEdit-4 syntax
- (0019,1010)

which would delete this tag regardless of the private tag's owner. The 'delete' function allows DicomEdit-6 to replicate this behavior using

// DicomEdit-6 syntax replicating DicomEdit-4 delete syntax
delete[ "(0019,1010)" ]

Function argument: string-literal tagpath

Note that the function's argument is a string literal and not a Tagpath (see Language Elements: Tagpaths). These strings directly address a single tag. The strings have the following syntax.

| String | Meaning |
|---|---|
| (0010,0010) | public tag |
| (0051,1000) | private tag |
| (0028,0010)[0]/(0020,0020) | tag in a sequence item |
| ( 0028 , 0010 ) [ 0 ] / ( 0020 , 0020 ) | insignificant whitespace is ignored |

Sequences can be to any depth. The slash character, '/', separating sequence levels is not optional.

### lookup

The method to specify the lookup table varies by the context in which DicomEdit is used.

For example:

(0010,0010) := lookup[ "pn", (0010,0010)]
(0010,0020) := lookup[ "pid", (0010,0020)]
// Example lookup table
pn/PatientName1 = MappedPatientName1
pid/PatientID1 = MappedPatientID1
pn/PatientName2 = MappedPatientName2
pid/PatientID2 = MappedPatientID2

This causes DicomEdit to take the 'key' ("pn") and the 'value' (the value of the tag (0010,0010)) and return the matching value from the lookup table, or null if no match is found. Thus, all DICOM files with a current PatientName (0010,0010) of "PatientName1" will be assigned the new value "MappedPatientName1".

This function is available to the DicomEdit command line tools, but as of XNAT 1.7.6 XNAT provides no way to specify the lookup table file.

### retainPrivateTags

For example:

retainPrivateTags[ "(0029,{SIEMENS MEDCOM HEADER2}XX)"]

This causes all private tags except those in the "SIEMENS MEDCOM HEADER2" block of group 0029 to be removed.

### set

Sometimes your DICOM just doesn't play by the rules, especially private tags. The correct way to address private tags is using the private creator ID. If the private creator ID is missing from the DICOM, or in some other DICOM emergency, bust this out.

set["(0009,1010)", "fubar"]

Note that the tagpath argument is a string literal. See the 'delete' function for the allowed syntax of these strings.

### shiftDateByIncrement

(0008,0020) := shiftDateByIncrement[ (0008,0020), "14"]

### shiftDateTimeByIncrement

// shift acquisition datetime by 14 days and 3 hours (1220400 seconds)
(0008,002A) := shiftDateTimeByIncrement[ (0008,002A), "1220400"]

Time shifts have a precision of seconds. Fractional-second digits are retained with the same precision as the original value.

### shiftDateTimeSequenceByIncrement

// shift per-frame-functional-groups-sequence/frame-content-sequence/frame-acquisition-datetime by 14 days and 3 hours (1220400 seconds)
shiftDateTimeSequenceByIncrement[ "1220400", "(5200,9230)/(0020,9111)/(0018,9151)"]
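
The arithmetic behind shiftDateByIncrement and shiftDateTimeByIncrement can be checked with a short Python sketch. It is a model under stated assumptions, not the DicomEdit implementation: the function names are hypothetical, dates are taken to be in the DICOM form YYYYMMDD, and dateTimes in the form YYYYMMDDHHMMSS with optional fractional seconds. Time zone offsets and partial-precision values are not handled.

```python
from datetime import datetime, timedelta


def shift_date(date_string, days):
    """Shift a YYYYMMDD date by a whole number of days."""
    shifted = datetime.strptime(date_string, "%Y%m%d") + timedelta(days=days)
    return shifted.strftime("%Y%m%d")


def shift_datetime(datetime_string, seconds):
    """Shift a YYYYMMDDHHMMSS[.FFFFFF] value by a whole number of seconds.

    The fractional-second digits are copied through unchanged, so the
    result keeps the precision of the input.
    """
    whole, dot, fraction = datetime_string.partition(".")
    shifted = datetime.strptime(whole, "%Y%m%d%H%M%S") + timedelta(seconds=seconds)
    return shifted.strftime("%Y%m%d%H%M%S") + dot + fraction


# 14 days and 3 hours = 14 * 86400 + 3 * 3600 = 1220400 seconds
print(shift_date("20200225", 14))
print(shift_datetime("20200101120000.25", 1220400))
```

Because every date in a study is moved by the same increment, the intervals between dates are preserved. That is the property that makes shifting useful for obscuring dates.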

## Custom Functions

Applications can define custom functions that extend the DicomEdit language and make sense only in the context of that application. XNAT provides these custom functions:

| Name | Arguments | Description |
|---|---|---|
| makeSessionLabel | customized format using ## and modalityLabel | Provides a unique session identifier. The '##' refers to the count of sessions (zero based) this subject has of the given modalityLabel + 1. Example: makeSessionLabel[format["{0}_v##_{1}", subject, lowercase[modalityLabel]]] returns a string of the form 'subject1_v09_mr' for an MR session for subject 'subject1' who has 9 preexisting MR sessions in XNAT. |

## Supported AlterPixel Algorithms

DICOM supports a very wide array of image types and encodings. This means that it is difficult to specify a single approach that can handle image modifications in a manner that is appropriate to every type of image data that can be encountered. It is quite likely that an algorithm that works with data from one source could easily render data from a second source unusable.

Note: It is essential that all types of data from all data sources processed by alterPixels be vetted for expected behavior.

The "alterPixels" function takes four string parameters: shape, shape-parameters, fill-type, and fill-parameters. The shape parameter specifies the shape of the image region to alter. The shape-parameters specify the detailed location of the shape, and their content depends on the specified shape. The fill-type specifies how the pixels will be altered. The fill-parameters provide details needed by the fill type and are specific to the fill type chosen.

| Algorithm | Since | Description |
|---|---|---|
| Basic | 6.3 | Images input with JPEG lossy encoding are written out uncompressed. The blanked regions are given a value of zero; the fill color is ignored. The current implementation is based on ImageEditUtilities.blackout() from the PixelMed toolkit. |

| Algorithm | Shape | Shape Parameters | Fill Type | Fill Parameters |
|---|---|---|---|---|
| Basic | rectangle | "l=n, t=n, r=n, b=n" where l, t, r, b stand for left, top, right, and bottom and n are integer indices of pixel coordinates. Pixel indices start at 0 in the top left corner and increase to the right and down. | solid | "v=c" where v stands for value and c is the integer value to be assigned to the region of the image. This value is currently ignored. |
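
The rectangle parameters and the zero fill of the Basic algorithm can be pictured with a toy Python model that operates on a list of pixel rows. This is an illustrative sketch only and has nothing to do with the PixelMed code path. The function names are hypothetical, and treating r and b as inclusive bounds is an assumption that the tables above do not settle.

```python
def parse_params(text):
    """Parse a parameter string such as 'l=100, t=100, r=200, b=200'."""
    pairs = (item.split("=") for item in text.split(","))
    return {key.strip(): int(value) for key, value in pairs}


def blank_rectangle(pixels, shape_params):
    """Set the pixels inside the rectangle to zero, in place.

    pixels is a list of rows, with (0, 0) at the top left. The fill value
    is always zero here, mirroring the note that "v=c" is currently ignored.
    Assumption: the r and b indices are inclusive.
    """
    p = parse_params(shape_params)
    for y in range(p["t"], min(p["b"], len(pixels) - 1) + 1):
        row = pixels[y]
        for x in range(p["l"], min(p["r"], len(row) - 1) + 1):
            row[x] = 0
    return pixels


image = [[9] * 4 for _ in range(4)]
for row in blank_rectangle(image, "l=1, t=1, r=2, b=2"):
    print(row)
```

Running a model like this against a small synthetic image is a cheap way to confirm that a rectangle covers the intended region before vetting the real function on real data.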

## User-Defined Variables

Scripts may contain variables, identifiers that represent a value. Variables can be defined and initialized to a particular value using the assignment operator :=.

// Define the variable 'patientID' and initialize it to the value in the tag.
patientID := (0010,0020)

## Externally-Modified Variables

DicomEdit provides methods for applications to create and set variables and inject them into scripts. This enables users to interactively modify variable values. Such applications use the variable label to identify the variable, and pre-populate the value field with the initialized value. The value is set in the application and then injected into the script. The variable label is the name of the variable, unless the script specifies otherwise using the describe operation, as shown below:

// Allow the user to set Patient ID
// Initial value is the pre-modification value
patientID := (0010,0020)
describe patientID "Subject ID"
(0010,0020) := patientID

The code snippet above creates a user-defined variable named patientID, and sets its initial value to the contents of DICOM attribute (0010,0020). Applications that allow users to modify the values interactively will provide a field where the variable value can be edited (next to the label Subject ID), and the content of the DICOM attribute (0010,0020) will be set to the resulting value of the variable patientID.

Variables can also be declared hidden, in which case applications are expected not to let the user edit them. Such variables are typically intermediate steps in a complex value construction, as illustrated here:

// Get visit ID from a web service using Patient ID and Study Date
describe visurl hidden
visurl := format["http://nrg111:3000/services/visitID?id={0}&date={1}", \
urlEncode[(0010,0020)], urlEncode[(0008,0020)]]
describe visit hidden
visit := getURL[visurl]

// Generate new Study Description based on Patient ID, Visit ID, modality
describe studyDesc "Study Description"
studyDesc := format["{0}_{1}_{2}", (0010,0020), visit, (0008,0060)]
(0008,1030) := studyDesc

## XNAT-Predefined Variables

XNAT defines and sets these variables to aid in translating between DICOM and XNAT metadata:

| Name | Value |
|---|---|
| project | The project's label |
| subject | The subject's label |
| session | The session's label |
| modalityLabel | This refers not to the DICOM tag (0008,0060) Modality, but to the ultimate modality the session will receive. For example, a dual-modality PET-CT session is stored as PET. |

XNAT users can use these variables in scripts, relying on XNAT to initialize their values. See How XNAT Scans DICOM to Map to Project/Subject/Session for the details.

## Other Language Features

The base DicomEdit language includes one additional statement type that is neither an operation nor a variable declaration. The echo statement prints a value to the console output:

echo format["Study Description '{0}' -> '{1}'", (0008,1030), studyDesc]

Handling of syntax errors in the DicomEdit interpreter is primitive: while invalid scripts generally produce error messages, the messages tend to be inscrutable (and some applications even hide the messages). We firmly recommend making no errors in your scripts.

## Reference

This document is partially derived from DicomBrowser: Software for Viewing and Modifying DICOM Metadata, Archie KA and Marcus DS, J. Digit. Imaging, 2012. The original publication is available at www.springerlink.com.