FIX Converter

Last updated on 13th Jun 2011.

This code is an attempt to create a reference implementation of the translation between tag=value and FIXML. It attempts to correctly convert valid messages, but it doesn't pay detailed attention to what is valid and what is not. It will spot and reject certain invalid messages, but not all (use the FIXML XSDs to confirm the validity of a message). It is not performance optimised, but does use appropriate data structures (such as hash maps) to avoid obvious problems.

FIX definition

The definition of the FIX protocol is held in the FIX Repository. Since 2009 there has been a new and easier to use version of this, known as the Unified FIX Repository, which is understood to be the strategic direction.

Vendor supplied tools can import, amend and export this format. They can also generate FIXML XSD schemas from it.

The Unified FIX Repository is therefore the ideal configuration from which to drive FIX tools, such as this FIX Converter (and indeed, an earlier attempt to write a FIX Converter driven off of the FIXML XSDs failed when it was realised that information relating to repeating groups and other areas is missing).

FIX representations

FIX has two representations: tag=value and FIXML.

tag=value

tag=value is a compact representation. A typical message might start with :-

8=FIX.4.4<SOH>9=176<SOH>35=D<SOH>49=AFUNDMGR<SOH>56=ABROKER<SOH>34=521...

where <SOH> is ASCII character 1.
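
As a rough illustration of the format (a minimal sketch, not the FIX Converter's own parser), the bytes can be split at each SOH and then at the first '=' of each field. The class and method names below are hypothetical. The sketch ignores data fields, whose values may themselves contain SOH, and it silently drops any trailing bytes that are not terminated by SOH.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the FIX Converter API.
final class TagValueSketch {
    /** Splits a tag=value message into {tag, value} string pairs, in message order. */
    static List<String[]> split(byte[] msg) {
        List<String[]> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < msg.length; i++) {
            if (msg[i] != 1) continue; // 1 is SOH, the field delimiter
            String field = new String(msg, start, i - start, StandardCharsets.ISO_8859_1);
            int eq = field.indexOf('=');
            if (eq <= 0) throw new IllegalArgumentException("malformed field: " + field);
            fields.add(new String[] { field.substring(0, eq), field.substring(eq + 1) });
            start = i + 1;
        }
        return fields;
    }

    public static void main(String[] args) {
        byte[] msg = "8=FIX.4.4\u00019=176\u000135=D\u0001".getBytes(StandardCharsets.ISO_8859_1);
        for (String[] f : split(msg)) {
            System.out.println(f[0] + " -> " + f[1]);
        }
    }
}
```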

FIX Engines typically speak tag=value over network links between each other.

The FAST wire protocol is a compressed version of the tag=value format. We could imagine other compressed variants of tag=value, perhaps using ASN.1, but note that some of the FIX HFT discussion recently has proposed a number of measures that result in a different protocol rather than a different representation of the current protocol.

FIX Engines typically present an API to the programs they are embedded into in which a message object is basically an ordered collection of tags with values.

FIXML

FIXML is an alternative XML based representation of FIX. A typical example might look like this :-

<FIXML xmlns="http://www.fixprotocol.org/FIXML-4-4" v="4.4">
    <Order ID="123456" Side="2" TxnTm="2001-09-11T09:30:47-05:00"
           Typ="2" Px="93.25" Acct="26522154">
        <Hdr Snt="2001-09-11T09:30:47-05:00" PosDup="N" PosRsnd="N"
             SeqNum="521" SID="AFUNDMGR" TID="ABROKER"/>
        <Instrmt Sym="IBM" ID="459200101" Src="1"/>
        <OrdQty Qty="1000"/>
    </Order>
</FIXML>

FIX in this representation can be easier to process, due to the ready availability of standard and open XML tools.

Many corporate messaging systems are XML based, in order to handle ISO 20022, FpML, XBRL and SWIFT (as MT/XML). Being able to represent FIX as FIXML can enable the processing of FIX by these messaging systems.

This further enables the use of the FIXML XSDs for message validation, the use of XSLT, Schematron or ISO Schematron for cross-field validation, and even the use of XQuery rules for cross-field validation. Routing and triggering based upon XPath expression is also enabled.
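
To give a flavour of XPath based routing, here is a minimal sketch using only the standard JAXP classes in the JDK. The class and method names are hypothetical, and it assumes a document holding a single Order, as in the example above. It uses local-name() so that it works whether or not the elements are in a namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the FIX Converter API.
final class XPathRoutingSketch {
    /** Returns the TID (target CompID) of a single-Order FIXML document, or "" if absent. */
    static String targetCompId(String fixml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document d = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(fixml)));
            // local-name() avoids having to register a namespace prefix
            String expr = "/*[local-name()='FIXML']/*[local-name()='Order']"
                        + "/*[local-name()='Hdr']/@TID";
            return XPathFactory.newInstance().newXPath().evaluate(expr, d);
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate XPath on FIXML", e);
        }
    }

    public static void main(String[] args) {
        String fixml = "<FIXML xmlns=\"http://www.fixprotocol.org/FIXML-4-4\" v=\"4.4\">"
                     + "<Order ID=\"123456\" Side=\"2\">"
                     + "<Hdr SeqNum=\"521\" SID=\"AFUNDMGR\" TID=\"ABROKER\"/>"
                     + "</Order></FIXML>";
        System.out.println("route to: " + targetCompId(fixml));
    }
}
```

A router could then pick a destination queue based on the returned value.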

The use of FIXML instead of tag=value is usually at a non-trivial performance cost, and may not be for everyone. Therefore FIXML may play less in the pre-trade space, and more in the post-trade space.

Prior to FIX.4.4, the FIXML representations are defined by DTD, not by XSD. The Unified FIX Repository annotates them with fixml="0" suggesting that schemas shouldn't be generated from it. These earlier forms of FIXML seem to use elements where the modern form uses attributes.

What variants of FIXML can the FIX Converter cope with?

This converter only understands valid FIXML as of version 4.4 onwards. The example shown previously is an example of this, and a good way to spot such messages is to note that the CompIDs are SID and TID attributes of the Hdr element. Unfortunately, the FIX Converter doesn't support the older 4.2 style of FIXML.

If you were to add metadata to the FIX.4.2 entry in the Unified FIX Repository, you could generate FIXML schemas in the new 4.4 style (although of course, these wouldn't match the official 4.2 DTD). FIX Converter would then be able to handle this alternative XML representation of FIX.4.2. It would also be necessary to ensure messages didn't directly include repeatingGroups, but instead used componentRefs to components containing repeatingGroups.

Unfortunately, it is common practice to send FIXML messages in which the elements are not in the appropriate FIX namespace (or any namespace for that matter). The next two examples break the rules in this way. FIX Converter can cope with this (subject to the versioning related constraints described below). It always generates FIXML with the right namespace.

http://fixwiki.fixprotocol.org/fixwiki/FPL:FIXML_Syntax has a couple of FIXML examples, reproduced here.

It shows an example of "FIXML 4.2 Version", which is somewhat verbose. A good way to spot such messages is that they have a FIXMLMessage element and the CompIDs are CompID elements within Sender and Target elements. The example looks like this :-

<FIXML>
    <FIXMLMessage>
        <Header>
            <PossDupFlag Value="N"/>
            <PossResend Value="N"/>
            <SendingTime>20020103-12  00  01</SendingTime>
            <Sender>
                <CompID>AFUNDMGR</CompID>
            </Sender>
            <Target>
                <CompID>ABROKER</CompID>
            </Target>
        </Header>
        <ApplicationMessage>
            <Order>
                ... stuff
            </Order>
        </ApplicationMessage>
    </FIXMLMessage>
</FIXML>

Note that in the example, the SendingTime should actually be 2002-01-03T12:00:01.

The wiki also shows something it calls "FIXML 4.4 Schema Version". However, in the FIXML 4.4 schema, the CompIDs are not elements called Sndr or Tgt, they are attributes called SID and TID of the Hdr element. The example therefore incorrectly looks like this :-

<FIXML>
    <Order ClOrdID="123456" Side="2"
            TransactTm="2001-09-11T09:30:47-05:00"
            OrdTyp="2" Px="93.25" Acct="26522154">
        <Hdr Snt="2001-09-11T09:30:47-05:00"
             PosDup="N" PosRsnd="N" SeqNum="521">
             <Sndr ID="AFUNDMGR"/>
             <Tgt ID="ABROKER"/>
        </Hdr>
        <Instrmt Sym="IBM" ID="459200101" IDSrc="1"/>
        <OrdQty Qty="1000"/>
    </Order>
</FIXML>

Note that even the fixml-schema-4-4-examples-20040109.zip file that can be downloaded from the FIX Protocol site includes a file, PositionReportExample1.xml, that has this incorrect Hdr structure.

I have also seen variants of the above where the sender CompID element was called Snd instead of Sndr.

In fixml-schema-4-4-examples-20040109.zip the included AllocationInstruction.xml file has no namespace, has bad dates, and has the wrong repeating group structure.

As incorrect messages like this don't agree with the FIX Repository, or schemas generated from it, the FIX Converter doesn't support them.

The need for conversion

Just as SWIFT MT (a compact text format) is transported over the SWIFT network between organisations and SWIFT MT/XML is sometimes transported between applications within an organisation, we can foresee FIX tag=value being spoken between organisations (such as brokers and fund managers) and FIXML being used within these organisations.

Usage

The FIX Converter defines two classes for describing a tag=value message. The FixTag class has an integer ID and a byte array as its value. Getters and setters are provided to access these. The FixTagMessage class is basically a Vector of FixTags.

FIXML documents are manipulated in org.w3c.dom.Document form.

Bytes to tag=value

Your FIX Engine may already be providing you with a tag=value representation of your message, in which case, use the methods on FixTag and FixTagMessage to make your FixTagMessage. If not, read on...

To use the FIX converter to parse an array of bytes into its representation of a FIX message in tag=value format, use :-

import org.fixprotocol.contrib.converter.*;

Document d = ...; // FixRepository.xml, parsed into an org.w3c.dom.Document
FixConv fc = new FixConv(d);
byte[] b = ...;   // the bytes of the FIX message
FixTagMessage ftm = fc.bytesToFixTagMessage(
  b,
  extMaj, // eg: "4", "5", or null
  extMin, // eg: "4", "0", or null
  extSP,  // eg: "SP1", "SP2", or null
  extCv   // eg: "jpmc-to-lse-custom-version", or null
  );

If the tags (BeginString (8), then possibly the ApplVerID (1128) and then possibly the CstmApplVerID (1129) tag) identify a version of FIX that is not present in the repository, then the FIX message may not be decoded properly.

If the message doesn't fully define its version, and non-null external values are passed in, then these are used.

tag=value to bytes

To convert a FixTagMessage back to bytes, you can either :-

byte[] b = fc.fixTagMessageToBytes(ftm);

or :-

byte[] b = ftm.toBytes();

The FixTagMessage also has a handy .toString() method which produces a printable string representation, which is intended primarily for debug purposes. In this printable form, the \xHH notation is used for unprintable ASCII characters, so the SOH characters for example appear as \x01.
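
The \xHH notation can be sketched as follows. This is an illustration only, not the code behind .toString(). The class name is hypothetical, and the choice of upper case hex digits and of which bytes count as printable (0x20 to 0x7E) are assumptions.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the FIX Converter API.
final class PrintableSketch {
    /** Renders bytes as text, showing anything outside printable ASCII as \xHH. */
    static String toPrintable(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int c = b & 0xFF; // treat the byte as unsigned
            if (c >= 0x20 && c < 0x7F) {
                sb.append((char) c);
            } else {
                sb.append(String.format("\\x%02X", c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] msg = "8=FIX.4.4\u000135=D\u0001".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(toPrintable(msg));
    }
}
```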

tag=value to FIXML

To convert a FixTagMessage to FIXML :-

Document d = fc.tagToXml(
  ftm,    // the FixTagMessage
  extMaj, // eg: "4", "5", or null
  extMin, // eg: "4", "0", or null
  extSP,  // eg: "SP1", "SP2", or null
  extCv   // eg: "jpmc-to-lse-custom-version", or null
  );

The FixTagMessage must obey a minimum of FIX rules, including :-

For FIX.5.0 onwards, if the ApplVerID tag is not found, the extMaj and extMin values are used (possibly with the extSP appended, if defined). If the CstmApplVerID tag is not found, the extCv is used if present. If these external values are needed, and are not supplied, then an exception is thrown.

The generated XML will have the elements in the correct namespace, eg: http://www.fixprotocol.org/FIXML-4-4 for a FIX.4.4 message.

If the input tag=value message has the ApplExtID (1156) tag then the resulting FIXML message has the xv attribute.

Certain tags in the FixTagMessage must be in the places described by the rules above. The converter will accept other tags in messages and components in orders different to that in the Unified FIX Repository. Rules relating to repeating group NoXxx tags, and to the first tag in each group, must still be adhered to (else the message can't be processed).

tag=value to pretty string

To convert a FixTagMessage to a pretty string :-

String s = fc.fixTagMessageToPretty(
  ftm,
  extMaj, // eg: "4", "5", or null
  extMin, // eg: "4", "0", or null
  extSP,  // eg: "SP1", "SP2", or null
  extCv   // eg: "jpmc-to-lse-custom-version", or null
  );

This produces a string with one line per field, with each tag annotated with its name. Sometimes it can be easier to read a FIX tag=value message this way.

This method has to operate on an instance of a FixConv as it needs access to the Unified FIX Repository that has been loaded.

If the message doesn't fully define its version, and non-null external values are passed in, then these are used.

FIXML to tag=value

To convert from FIXML to tag=value :-

List<FixTagMessage> ftms = fc.xmlToTag(
  d,      // the FIXML document
  extMaj, // eg: "4", "5", or null
  extMin, // eg: "4", "0", or null
  extSP,  // eg: "SP1", "SP2", or null
  extCv   // eg: "jpmc-to-lse-custom-version", or null
  );

The FIXML document is expected to have its elements in the correct namespace. The code still works if no namespace is employed, but you may have to help it determine the version (see the versioning algorithm below for details).

The conversion algorithm inspects the namespace to work out what it can about the major, minor and service pack number of the message. It then looks for the v attribute on the root FIXML element, and also the cv attribute for the custom version. If this is insufficient, it tries to complete the picture by using the external version information passed into the xmlToTag method.

The version algorithm is as follows :-

  if namespace missing or doesn't define major and minor
    if FIXML v attribute present
      use it
    else
      if extMaj and extMin supplied
        use them
      else
        error
    append extSP if supplied
  else if namespace defines major only // this is the new FIX.5.0SP3 onwards style
    if FIXML v attribute present
      use it
      if namespace disagrees with major number in FIXML v attribute
        error
    else
      if extMaj and extMin is supplied
        use them
      else
        error
    append extSP if supplied
  else namespace defines major and minor (and possibly service pack too)
    use them
    if FIXML v attribute present
      if namespace disagrees with info in FIXML v attribute
        error

Apologies for the above, but I didn't define the standard. This is an attempt to try to do the right thing by default.
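
The version algorithm above can be sketched in Java roughly as follows. This is not the converter's own code. The class name, the method signature and the form of the returned string (such as "5.0SP2") are hypothetical, the v attribute is assumed to look like "4.4", and "disagrees" is interpreted simply as v not starting with what the namespace defines.

```java
// Hypothetical sketch, not the FIX Converter's own code.
final class FixmlVersionSketch {
    /**
     * nsMaj, nsMin, nsSP: what the namespace defines (null where it defines nothing).
     * v: the FIXML v attribute, assumed to look like "4.4" (null if absent).
     * extMaj, extMin, extSP: the external values (null if not supplied).
     */
    static String resolve(String nsMaj, String nsMin, String nsSP, String v,
                          String extMaj, String extMin, String extSP) {
        if (nsMaj != null && nsMin != null) {
            // Namespace defines major and minor (and possibly service pack): use them
            String version = nsMaj + "." + nsMin;
            if (v != null && !v.startsWith(version)) {
                throw new IllegalArgumentException("namespace disagrees with v=" + v);
            }
            return version + (nsSP != null ? nsSP : "");
        }
        // Namespace is missing, or defines the major number only
        String version;
        if (v != null) {
            version = v;
            if (nsMaj != null && !v.startsWith(nsMaj + ".")) {
                throw new IllegalArgumentException("namespace disagrees with v=" + v);
            }
        } else if (extMaj != null && extMin != null) {
            version = extMaj + "." + extMin;
        } else {
            throw new IllegalArgumentException("cannot determine the FIX version");
        }
        return version + (extSP != null ? extSP : "");
    }

    public static void main(String[] args) {
        // No namespace and no v attribute, so the external values are used
        System.out.println(resolve(null, null, null, null, "5", "0", "SP2"));
    }
}
```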

The custom version algorithm is as follows :-

  if FIXML cv attribute present
    use it
  else if extCv supplied
    use it
  else
    normal, non-customised version

If the input FIXML message has the xv attribute, then the resulting tag=value message will have the ApplExtID (1156) tag.

You may have noticed the result is a list of tag=value messages. This is because FIXML documents may contain a Batch of messages, eg:

<FIXML xmlns="http://www.fixprotocol.org/FIXML-4-4" v="4.4">
    <Batch>
        <!-- This header applies to all the messages -->
        <Hdr Snt="2001-09-11T09:30:47-05:00" PosDup="N" PosRsnd="N"
            SeqNum="521" SID="AFUNDMGR" TID="ABROKER"/>
        <Order ID="123456" Side="2" TxnTm="2001-09-11T09:30:47-05:00"
               Typ="2" Px="93.25" Acct="26522154">
            <Instrmt Sym="IBM" ID="459200101" Src="1"/>
            <OrdQty Qty="1000"/>
        </Order>
        <!-- This second message is a copy of the first, with different ID -->
        <Order ID="123457" Side="2" TxnTm="2001-09-11T09:30:47-05:00"
               Typ="2" Px="93.25" Acct="26522154">
            <!-- I've also included a message specific header -->
            <Hdr PosDup="Y"/>
            <Instrmt Sym="IBM" ID="459200101" Src="1"/>
            <OrdQty Qty="1000"/>
        </Order>
    </Batch>
</FIXML>

The converter will ensure each output message has tags from the Batch Hdr, with any values added or overwritten in the individual message Hdrs.
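
The overlaying of a message Hdr on top of the Batch Hdr can be illustrated with plain DOM code. This is a minimal sketch of the idea, not the converter's implementation. The class and method names are hypothetical, and it works on Hdr attributes only, rather than producing tag=value output.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the FIX Converter's own code.
final class BatchHdrSketch {
    /** Returns the first direct child element with the given local name, or null. */
    static Element child(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && localName.equals(n.getLocalName())) return (Element) n;
        }
        return null;
    }

    /** Copies the attributes of hdr into target, overwriting any existing values. */
    static void overlay(Map<String, String> target, Element hdr) {
        if (hdr == null) return;
        NamedNodeMap attrs = hdr.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            target.put(attrs.item(i).getNodeName(), attrs.item(i).getNodeValue());
        }
    }

    /** Returns the effective Hdr attributes of each message in a FIXML Batch. */
    static List<Map<String, String>> effectiveHdrs(String fixml) {
        Element root;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for getLocalName()
            root = dbf.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(fixml)))
                      .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse FIXML", e);
        }
        Element batch = child(root, "Batch");
        if (batch == null) throw new IllegalArgumentException("no Batch element");
        List<Map<String, String>> result = new ArrayList<>();
        for (Node n = batch.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element) || "Hdr".equals(n.getLocalName())) continue;
            Map<String, String> hdr = new TreeMap<>();
            overlay(hdr, child(batch, "Hdr"));       // Batch level values first
            overlay(hdr, child((Element) n, "Hdr")); // then message level overrides
            result.add(hdr);
        }
        return result;
    }
}
```

Applied to the Batch example above, the first Order would keep PosDup="N" from the Batch Hdr, and the second would end up with PosDup="Y".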

The FIX converter cannot convert a list of tag=value messages into a single FIXML document.

The generated FixTagMessages have their tags in the same order as they are defined in the Unified FIX Repository.

The test program

Assuming run.sh contains (run.cmd is similar) :-

#!/bin/ksh
java -cp fixconv.jar org.fixprotocol.contrib.converter.FixConvTest "$@"

Usage :-

$ ./run.sh
usage: FixConvTest xmlToTag FixRepository.xml file.fixml [file.tag [extMaj [extMin [extSP [extCv]]]]]
   or: FixConvTest tagToXml FixRepository.xml file.tag [file.fixml [extMaj [extMin [extSP [extCv]]]]]
   or: FixConvTest repair FixRepository.xml file.tag [file2.tag [extMaj [extMin [extSP [extCv]]]]]
   or: FixConvTest pretty FixRepository.xml file.tag [extMaj [extMin [extSP [extCv]]]]
   or: FixConvTest validate file.fixml fixml-main-X-X.xsd

This can be used to convert either way, eg: to convert from FIXML to tag=value :-

$ ./run.sh xmlToTag FixRepository.xml file.fixml

To do the same, and keep the result in a file :-

$ ./run.sh xmlToTag FixRepository.xml file.fixml file.tag

If the input FIXML file has a Batch containing more than one message, then several output files are written, with filenames ending in -1, -2, etc.

To convert back :-

$ ./run.sh tagToXml FixRepository.xml file.tag

and keep the result (without overwriting the original) :-

$ ./run.sh tagToXml FixRepository.xml file.tag file.fixml2

The test program displays the message before translation, some debug tracing (which isn't normally shown when using the FixConv classes), and the output message(s). tag=value messages are shown in a debug printable form, ie: using the .toString() method described earlier, thus resulting in \xHH notation in the output.

The repair command reads a tag=value message and, if the BodyLength (9) field is missing, calculates and inserts it. If the CheckSum (10) field is missing, it calculates and appends it. This is handy because sometimes people write sample test messages using text editors and omit these fields (or get them wrong). Some text editors also append newline characters (\n or \r\n) to the end of the line - this test program silently strips these unwanted additions.
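
For reference, the standard FIX rules for these two fields are: BodyLength counts the bytes after the SOH that ends the BodyLength field, up to and including the SOH just before the CheckSum field. CheckSum is the sum of every byte before the CheckSum field, modulo 256, written as three digits. The following is a minimal sketch of those rules, not the repair command's code. The class and method names are hypothetical, and it builds a message from a body rather than repairing an existing one.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the FIX Converter's repair code.
final class FixFramingSketch {
    private static final byte SOH = 1;

    /** Sum of the bytes (as unsigned values), modulo 256. */
    static int checkSum(byte[] bytes) {
        int sum = 0;
        for (byte b : bytes) sum += b & 0xFF;
        return sum % 256;
    }

    /**
     * Wraps a body (every field after BodyLength, each terminated by SOH)
     * with BeginString (8), BodyLength (9) and CheckSum (10).
     */
    static byte[] frame(String beginString, byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, "8=" + beginString);
        out.write(SOH);
        write(out, "9=" + body.length);
        out.write(SOH);
        out.write(body, 0, body.length);
        // The checksum covers everything written so far
        write(out, String.format("10=%03d", checkSum(out.toByteArray())));
        out.write(SOH);
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.ISO_8859_1);
        out.write(b, 0, b.length);
    }

    public static void main(String[] args) {
        byte[] body = "35=0\u0001".getBytes(StandardCharsets.ISO_8859_1);
        byte[] msg = frame("FIX.4.4", body);
        // Show SOH as '|' to keep the output readable
        System.out.println(new String(msg, StandardCharsets.ISO_8859_1).replace('\u0001', '|'));
    }
}
```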

The pretty command dumps the fields in a tag=value message, one to a line, annotating each with the tag name.

If you have the schemas handy, the validate command can be used to check your FIXML file for validity. Validation errors are displayed, and silence is good. If a FIXML message is valid, this program should be able to convert it to tag=value. When tag=value messages are converted to FIXML, they should be valid.

Note that the FIX Converter can convert session tag=value messages to FIXML, but these will fail to validate against the FIXML XSDs. This is not because there is anything wrong with them structurally, it's just that the folks behind the Unified FIX Repository have decided that FIX session messages are not required in FIXML. Looking at the 2010 Edition of the repository, I can see a problem in the way the FIX.4.4 Logon message is defined (it has a NoMsgTypes repeatingGroup directly within the message, rather than nested within a component which is componentRef'd from the message). This prevents the correct generation of FIXML for that message.

The test data

The test pack can be run :-

$ ./test.sh

For each test, an input file is read, a log of the conversion is written, and one or more output files are written. Input tag=value files have a .tag extension and input FIXML files have a .fixml extension. The file names, and sometimes content within the files, give an indication of the expected result, eg: fix44-malformed-tag.tag.

Test messages have been difficult to source, and in many cases have been created by getting sample or real messages and cleaning them up so that they validate. Some came from the FPL site and some are real messages with names changed to protect the guilty. Some are designed to exercise specific functionality and some are designed to be defective in specific ways.

The test data covers :-

Note that with the encoded data samples, although the data is right in the input and output files, it won't look right on the standard output because the terminal typically doesn't use the right encoding.

Issues

The FIX Converter has hard coded knowledge of the relationship between FIX versions (eg: FIX.5.0SP2) and the enumerated value used in the ApplVerID (1128) tag to represent them (eg: 9). The enumeration's symbolicName attribute in the repository is not sufficient for this purpose. A better solution is for the <fix/> elements to have an additional attribute, such as appVerID="9".

The FIX Converter has hard coded knowledge of how the namespaces work in FIX.4.x .. FIX.5.0, FIX.5.0SP1 .. FIX.5.0SP2 and how it has been stated that they will work for FIX.5.0SP3 onwards. A better solution is for the <fix/> elements to have an attribute, such as fixmlNamespace="http://www.fixprotocol.org/FIXML-5".

The 2010 Edition of the Unified FIX Repository does not identify which tags contain encoded text. The code assumes the 2011 Edition will include an encoded="1" attribute on such fields. For the 2010 Edition, it assumes that if the field name starts with Encoded and doesn't end in Len, then it's encoded. Looking at the existing 2010 FixRepository.xml content, this should probably be ok.
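
The name based rule for the 2010 Edition amounts to something like the following sketch. The class and method names are hypothetical, and this is not the converter's own code.

```java
// Hypothetical sketch of the 2010 Edition naming heuristic.
final class EncodedFieldSketch {
    /** True if the field name starts with Encoded and doesn't end in Len. */
    static boolean looksEncoded(String fieldName) {
        return fieldName.startsWith("Encoded") && !fieldName.endsWith("Len");
    }

    public static void main(String[] args) {
        for (String name : new String[] { "EncodedText", "EncodedTextLen", "Text" }) {
            System.out.println(name + ": " + looksEncoded(name));
        }
    }
}
```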

The FIX Converter has hard coded knowledge of the relationship between FIX character encodings (eg: Shift_JIS) and the equivalent Java Charset (eg: SJIS). This is technology dependent, and may even be JVM vendor dependent, so I don't suppose it's a good idea to extend the Unified FIX Repository to contain this information.

The FIX Repository defines certain fields as data, in that they have a length tag and a data tag of that length. The FIX Specification does not specify how such fields are represented within FIXML attributes. Today, the FIX Converter stores them base64 encoded, as it is content preserving and precedent for this exists in other XML files.
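
To show why base64 is content preserving, here is a minimal round trip using java.util.Base64 from the JDK (Java 8 onwards). The class and method names are hypothetical, and the exact base64 variant the converter emits (line breaks, padding) is not confirmed here.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch, not the FIX Converter's own code.
final class DataFieldSketch {
    /** Encodes raw data field bytes so they can be held in an XML attribute. */
    static String toAttribute(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Recovers the raw bytes from the attribute value. */
    static byte[] fromAttribute(String attr) {
        return Base64.getDecoder().decode(attr);
    }

    public static void main(String[] args) {
        byte[] raw = { 1, 0, (byte) 0xFF, '<', '"' }; // bytes that are awkward in XML
        String attr = toAttribute(raw);
        System.out.println(attr + " round trips: " + Arrays.equals(raw, fromAttribute(attr)));
    }
}
```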

XMLData fields contain XML documents, and the root element becomes a nested element of the enclosing component. Given what is in FIXimate, this looks reasonable, but I've not seen any samples to compare against.

Note: beware xercesImpl-2.6.2.jar, as we've seen org.apache.xerces.dom.DocumentImpl cannot be cast to org.apache.xerces.dom.DeferredDocumentImpl when processing XMLData fields. This is a Xerces-J bug, claimed to be fixed in 2.8.0. We've tried 2.8.1 and it appears fixed. At time of writing, 2.11.0 is current. Note that we also observe that Java 1.6.0_20 internally includes Xerces-J 2.6.2, but this doesn't show the problem. Presumably Sun included a patch in this version.

Release notes

This code is supplied as-is and without warranty or any assertion of fitness for purpose (either by the author or his employer). It is as good as the test-pack it is delivered with. It is made available for use by any member of FIX Protocol Limited in good standing.

Please do not distribute directly; refer interested parties to http://www.fixprotocol.org/.

If you make any modifications to this code, please clearly annotate any modifications and label the resulting code as clearly distinct from the original. If they're bugfixes, please tell the author.

I would like to recognise and acknowledge the assistance of Jim Northey in clarifying aspects of FIX and the Unified FIX Repository, thus allowing me to create this code.


The documentation is written and maintained by the FIX Converter author, Andy Key
andy.z.key@googlemail.com