HL7 Escape Sequences

HL7 defines character sequences to represent ‘special’ characters not otherwise permitted in HL7 messages. These sequences begin and end with the message’s Escape character (usually ‘\’), and contain an identifying character, followed by 0 or more characters. The most common use of these escape sequences is to escape the HL7 defined delimiter characters. These delimiter or separator characters are defined in MSH-1 and MSH-2 of the HL7 message and typically have the following values:

| Field Delimiter
^ Component Delimiter
& Sub-Component Delimiter

The other 2 special case characters defined in MSH-2 are:

~ Repetition Separator
\ Escape Character
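
Because the separators are declared in the message itself, a parser can read them from the start of the MSH segment instead of hard-coding them. The Python sketch below assumes MSH-1 is the single character right after "MSH" and MSH-2 is the next four characters in the usual order (component, repetition, escape, sub-component). The names `Delimiters` and `read_delimiters` are hypothetical and used only for illustration.

```python
from typing import NamedTuple


class Delimiters(NamedTuple):
    """Separator characters declared by a message (hypothetical helper type)."""
    field: str
    component: str
    repetition: str
    escape: str
    subcomponent: str


def read_delimiters(msh_segment: str) -> Delimiters:
    """Read MSH-1 and MSH-2 from the start of an MSH segment.

    Assumption: MSH-2 holds at least four characters in the order
    component, repetition, escape, sub-component.
    """
    if not msh_segment.startswith("MSH") or len(msh_segment) < 8:
        raise ValueError("expected an MSH segment with MSH-1 and MSH-2")
    field = msh_segment[3]        # MSH-1: character right after "MSH"
    encoding = msh_segment[4:8]   # MSH-2: the next four characters
    component, repetition, escape, subcomponent = encoding
    return Delimiters(field, component, repetition, escape, subcomponent)
```

For a header that begins `MSH|^~\&|`, this returns `|` as the field separator and `\` as the escape character.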

When parsing HL7 messages, interface engines and HL7-enabled applications should be able to recognize these special escape sequences and properly convert them. Similarly, if producing HL7 messages that contain special data, the application should properly escape the data using the HL7 escape sequences.

A simple example of this can be seen in the following OBX segments:

OBX||10|||Current Medications
OBX||11|||DILANTIN & NORVASC

The OBX-5 field is defined by HL7 to contain data conforming to an ST datatype. This should be a single string value. What we see in our message is the value DILANTIN & NORVASC. The ‘&’ character is typically used as the sub-component separator in an HL7 message. By the rules of HL7 this message is not properly formatted. Since the ‘&’ character in this case is meant to be part of the resulting text, it needs to be escaped. The correct representation of this message should be as follows:

OBX||10|||Current Medications
OBX||11|||DILANTIN \T\ NORVASC
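
On the sending side, this escaping step can be sketched in Python as follows. The function name `escape_text` and its parameters are hypothetical, and the defaults assume the typical delimiter values listed earlier. It is a minimal illustration, not a complete HL7 encoder.

```python
def escape_text(text: str, field: str = "|", component: str = "^",
                repetition: str = "~", escape: str = "\\",
                subcomponent: str = "&") -> str:
    """Replace delimiter characters in plain text with HL7 escape sequences."""
    codes = {
        escape: "E",
        field: "F",
        repetition: "R",
        component: "S",
        subcomponent: "T",
    }
    # Each input character is examined exactly once, so the escape
    # characters that this function emits are never escaped again.
    return "".join(
        f"{escape}{codes[ch]}{escape}" if ch in codes else ch
        for ch in text
    )


print(escape_text("DILANTIN & NORVASC"))  # prints: DILANTIN \T\ NORVASC
```

Working character by character matters: escaping the escape character in a separate, later pass would corrupt the sequences that had already been written.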

When a receiving application reads this message, it should follow the rules of the HL7 standard and convert the \T\ to the ‘&’ character before importing the information into its system. In real-world HL7 integration projects, however, it is very common to interface with a system that claims to be HL7 compliant but does not support the HL7 escape sequences.

This means that you may receive HL7 messages with improperly formatted data. It also means that a non-compliant system may actually require you to send it HL7 messages in which the special characters are not properly escaped.

The table below shows the HL7 Escape sequences, and how they are converted:

Sequence   Description                                                                                     Conversion
\Cxxyy\    Single-byte character set escape sequence with two hexadecimal values                           not converted
\E\        Escape character                                                                                converted to escape character (e.g., ‘\’)
\F\        Field separator                                                                                 converted to field separator character (e.g., ‘|’)
\H\        Start highlighting                                                                              not converted
\Mxxyyzz\  Multi-byte character set escape sequence with two or three hexadecimal values (zz is optional)  not converted
\N\        Normal text (end highlighting)                                                                  not converted
\R\        Repetition separator                                                                            converted to repetition separator character (e.g., ‘~’)
\S\        Component separator                                                                             converted to component separator character (e.g., ‘^’)
\T\        Subcomponent separator                                                                          converted to subcomponent separator character (e.g., ‘&’)
\Xdd…\     Hexadecimal data (dd must be hexadecimal characters)                                            converted to the characters identified by each pair of digits
\Zdd…\     Locally defined escape sequence                                                                 not converted
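
The conversions in the table can be sketched in Python as a single left-to-right pass over the text. The function name `unescape_text` and its parameters are hypothetical. Mapping each \X..\ hex pair to one Latin-1 character is an assumption made for this sketch; the table says only that each pair of digits identifies a character. Sequences the table marks as "not converted" are passed through unchanged.

```python
import re


def unescape_text(text: str, field: str = "|", component: str = "^",
                  repetition: str = "~", escape: str = "\\",
                  subcomponent: str = "&") -> str:
    """Convert HL7 escape sequences in a field value back to characters."""
    simple = {"E": escape, "F": field, "R": repetition,
              "S": component, "T": subcomponent}
    esc = re.escape(escape)
    # One sequence: escape char, any run of non-escape chars, escape char.
    pattern = re.compile(f"{esc}([^{esc}]*){esc}")

    def convert(match: re.Match) -> str:
        body = match.group(1)
        if body in simple:
            return simple[body]
        if body.startswith("X") and len(body) > 1:
            try:
                # Assumption: each hex pair is one Latin-1 character.
                return bytes.fromhex(body[1:]).decode("latin-1")
            except ValueError:
                return match.group(0)  # malformed hex: leave untouched
        # \H\, \N\, \C..\, \M..\, \Z..\ and unknown sequences: not converted.
        return match.group(0)

    # re.sub scans the input once and never rescans its own output.
    return pattern.sub(convert, text)


print(unescape_text("DILANTIN \\T\\ NORVASC"))  # prints: DILANTIN & NORVASC
```

Because the converted output is never rescanned, an escaped Windows path such as C:\E\Cade2\E\file.txt comes back as C:\Cade2\file.txt, and the \Cade2\ in the result is not mistaken for a \Cxxyy\ sequence.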

  • cynthia peters

    This is great information… and you guys are in our backyard.
    Do you have any one-day training seminars on Saturdays
    for basic HL7? I need a refresher.

  • David Wong

    Suppose I have some unfortunate filename text that should be properly escaped such as “C:\Cade2\file.txt”, how should the string be escaped?
    If I insert, “C:\E\Cade2\E\file.txt” won’t the HL7 encoder get confused by the \Cxxyy\ substring?

    Another way this could much more easily happen is in inserting the data from a binary file into an ED field.

  • marchewek

    I understand that all sequences (e.g. \H\) look like this because \ is the escape character, don’t they?
    If the escape character was *, then these sequences would look like *H* – am I right?

  • Anonymous

    David, the parser should not get confused. When it parses and finds the ‘\’ character, it knows that it is a special character and that what follows is some sort of escape sequence. It’s really no different than escape sequences in other languages such as XML. If you want to send an ampersand ‘&’ in an XML document you must escape it as ‘&amp;’.

    You should not attempt to send raw binary data in an HL7 message; it should be encoded prior to sending. Typically, interfaces will Base64-encode binary data before sending it across an interface, just as e-mail typically uses MIME encoding to deliver binary data.

  • Mike Stockemer

    You are correct. The escape character is set in MSH-2, and if you set it to * then the escape sequences would use that character. Most applications use the suggested encoding characters. McKesson Star is one application that you typically run into that modifies MSH-1 and MSH-2. You know you are interfacing with Star when you see lots of : and ; characters in the HL7 messages.

  • marchewek

    It’s because unescaping escaped characters should occur only once – \Cade2\ won’t get interpreted. If there was a character you wanted to escape, e.g. ą – “ącade”, it should be escaped (the number I’m about to type now is random) as \E\\C1234\cade\E\ – notice the double backslash. Then unescaping would produce: \E\ – \, \C1234\ – ą, cade – cade, \E\ – \
