HL7 Separator Characters

In HL7 messaging, the separator characters are also known as the message delimiters or special encoding characters. The following are the HL7 recommended values:

<CR>     Segment terminator (carriage return)
|        Field separator, aka pipe
^        Component separator, aka hat
&        Sub-component separator
~        Field repeat separator
\        Escape character

The segment terminator is not negotiable: it is always a carriage return (ASCII 13, hex 0D). The others are suggested values only, though they are usually used as shown above. The HL7 standard lets you choose your own as long as you declare them in the MSH segment.
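Because the segment terminator is fixed, splitting a message into segments needs no knowledge of the other delimiters. A minimal sketch in Python, assuming the message has already been decoded to a string (the function name `split_segments` is made up for illustration):

```python
def split_segments(message: str) -> list[str]:
    """Split a raw HL7 message into segments on carriage return (ASCII 13)."""
    # The trailing terminator produces one empty string, which is dropped here.
    return [segment for segment in message.split("\r") if segment]


segments = split_segments("MSH|^~\\&|SENDER\rPID|1\r")
```

Files saved or edited on some systems may contain line feeds in place of carriage returns; this sketch does not try to repair that.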

The MSH is the first segment of all HL7 messages (except HL7 batch messages). The field separator is the fourth character of the message, and it also counts as the first field of the MSH segment (MSH-1). Because MSH-1 is typically just a pipe, '|', counting MSH fields becomes tricky. Field 2 of the MSH (MSH-2) contains the other separator characters in this order: component, field repeat, escape, and sub-component.

Thus, the following is an example of the beginning of an HL7 message:
MSH|^~\&|…
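
The positions described above can be read straight off the header. The following Python sketch assumes the message starts with "MSH" followed by the field separator and the four encoding characters; the names `Delimiters` and `read_delimiters` are hypothetical and not part of any HL7 library:

```python
from typing import NamedTuple


class Delimiters(NamedTuple):
    field: str
    component: str
    repetition: str
    escape: str
    subcomponent: str


def read_delimiters(message: str) -> Delimiters:
    """Read the separators declared in MSH-1 and MSH-2."""
    if not message.startswith("MSH") or len(message) < 8:
        raise ValueError("message does not start with a complete MSH header")
    field = message[3]  # MSH-1: the fourth character of the message
    # MSH-2: component, field repeat, escape, sub-component, in that order
    component, repetition, escape, subcomponent = message[4:8]
    return Delimiters(field, component, repetition, escape, subcomponent)


delims = read_delimiters("MSH|^~\\&|SENDER")
```

With the recommended values, `delims.field` is the pipe and `delims.subcomponent` is the ampersand; a message that declares other characters yields those instead.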

The delimiter values declared in the MSH segment are supposed to be the delimiter values used throughout the entire message. Encoding HL7 messages in this manner allows an application parser to simply read the special characters from the MSH and use them to parse the message. However, beware that many application parsers just use hard-coded values and ignore MSH-1 (Field Separator) and MSH-2 (Encoding Characters).
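
As an illustration of parsing with the declared field separator instead of a hard-coded pipe, here is a small Python sketch. It assumes a well-formed message that begins with MSH; `parse_segment` and `parse_message` are illustrative names only. It also re-inserts MSH-1 so that list indexes match HL7 field numbers:

```python
def parse_segment(segment: str, field_sep: str) -> list[str]:
    """Return a list where index n holds field n; index 0 is the segment ID."""
    fields = segment.split(field_sep)
    if fields[0] == "MSH":
        # MSH-1 is the separator itself, so put it back to keep the numbering aligned.
        fields.insert(1, field_sep)
    return fields


def parse_message(message: str) -> list[list[str]]:
    """Split a message into segments and fields using the separator declared in MSH-1."""
    field_sep = message[3]  # fourth character of the message
    return [parse_segment(s, field_sep) for s in message.split("\r") if s]


parsed = parse_message("MSH|^~\\&|SENDER|FAC\rPID|1||12345^^^MRN\r")
msh, pid = parsed
```

Here `msh[2]` holds the encoding characters (MSH-2) and `msh[3]` holds "SENDER" (MSH-3), while `pid[3]` holds PID-3. Splitting components, repetitions, and sub-components would follow the same pattern with the characters from MSH-2.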

Last 3 posts by Sonal Patel

4 Responses to “HL7 Separator Characters”
  1. N. Murali Mohan says:

    Could someone tell me if I can use the sub-component separator like FirstName&&&&^LastName&&&^MiddleName&&&&?
    The trend that I observed is that the &’s at the end of each sub-component are trimmed and the output is FirstName^LastName^MiddleName.

    If I have First&&Name, the &’s are ‘not’ truncated.

    Thanks in advance,
    Murali

  2. G Wang says:

    I am new to HL7. EDU-2 “academic degree” is IS with max len 10 and the column RP/# is empty in 15.4.3. However, I have seen messages like

    EDU|1|BA^BACHELOR OF ARTS^HL70360|…

    Is this legal? If not, what is the right way to encode the same information?

    Thanks.

  3. J Lloyd says:

    Please forgive me for not contributing, but I find no other way to post a query… I need to make certain of something seemingly obvious: How are the characters in “MSH”, which identifies the message header segment, themselves encoded? The HL7 standard states that the MSH-18 field “contains the character set for the entire message.” If “MSH” is part of the message, and conforms with MSH-18, then MSH-1 cannot always be located unambiguously, and so then neither can MSH-18. My hope is that “MSH” is always 7-bit ASCII, but is it?

  4. Jon Mertz says:

    To try to answer your questions, we need additional information, but we asked some of our HL7 experts to respond to two of your questions.

    1. How are the characters in “MSH”, which identifies the message header segment, themselves encoded?

    The encoding characters in MSH-1 and MSH-2 are used for parsing the rest of the message. They are encoded in the sense that they become special parsing characters that an application uses to determine what separates the data. Typically, MSH-1 is a |, which delimits fields. MSH-2 is ^~\&, which translates to ^ delimiting components, ~ delimiting repeating fields, \ as the escape character, and & delimiting sub-components (rare). These can be changed. HL7 specifies that the 5 characters after MSH, in that order, are what determine the separators. If you choose different ASCII characters, that is OK, but not standard.

    2. Is “MSH” always 7-bit ASCII?

    I have dealt with interfaces that are standard ASCII; that is what I have seen about 100% of the time. This, too, is a loose standard. In HL7 standard 2.3, Chapter 1, section 1.6, it says “All data is represented as displayable characters from a selected character set. The ASCII displayable character set (hexadecimal values between 20 and 7E, inclusive) is the default character set unless modified in the MSH header segment.” However, it also adds that this is a suggestion and not an unbreakable rule. I have not yet seen an application that needs to break this standard. So the answer is no: it’s not required to be standard ASCII, but I have never seen HL7 that is not standard ASCII.

    Let us know if this helps or if you have further questions.
