Last updated on 13th Jun 2011.
This code is an attempt to create a reference implementation of the translation between tag=value and FIXML. It attempts to convert valid messages correctly, but it doesn't pay detailed attention to what is valid and what is not. It will spot and reject certain invalid messages, but not all (use the FIXML XSDs to confirm the validity of a message). It is not performance optimised, but does use appropriate data structures (such as hash maps) to avoid obvious problems.
The definition of the FIX protocol is held in the FIX Repository. Since 2009 there has been a newer and easier to use version of this, known as the Unified FIX Repository, which is understood to be the strategic direction.
Vendor supplied tools can import, amend and export this format. They can also generate FIXML XSD schemas from it.
The Unified FIX Repository is therefore the ideal configuration from which to drive FIX tools, such as this FIX Converter. Indeed, an earlier attempt to write a FIX Converter driven from the FIXML XSDs failed when it was realised that they lack information relating to repeating groups and other areas.
FIX has two representations: tag=value and FIXML.
tag=value is a compact representation. A typical message might start with :-
8=FIX.4.4<SOH>9=176<SOH>35=D<SOH>49=AFUNDMGR<SOH>56=ABROKER<SOH>34=521...
where <SOH> is ASCII character 1.
FIX Engines typically speak tag=value over network links between each other.
The FAST wire protocol is a compressed version of the tag=value format. We could imagine other compressed variants of tag=value, perhaps using ASN.1, but note that some of the recent FIX HFT discussion has proposed measures that would result in a different protocol, rather than a different representation of the current protocol.
FIX Engines typically present an API to the programs they are embedded into in which a message object is basically an ordered collection of tags with values.
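As a rough illustration of "an ordered collection of tags with values", here is a minimal sketch that splits raw tag=value bytes on SOH. The class and method names are hypothetical and this is not the FIX Converter's own parser. It also assumes no value contains an embedded SOH, which is not true of length-prefixed data fields.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch, not the FIX Converter's own parser. */
final class TagValueSketch {

    static final byte SOH = 0x01;

    /** One field: an integer tag and its raw value bytes. */
    static final class Field {
        final int tag;
        final byte[] value;

        Field(int tag, byte[] value) {
            this.tag = tag;
            this.value = value;
        }
    }

    /** Splits tag=value bytes into fields, preserving their order. */
    static List<Field> parse(byte[] msg) {
        List<Field> fields = new ArrayList<>();
        int pos = 0;
        while (pos < msg.length) {
            int eq = indexOf(msg, (byte) '=', pos);
            int end = (eq < 0) ? -1 : indexOf(msg, SOH, eq + 1);
            if (end < 0) {
                throw new IllegalArgumentException("malformed field at offset " + pos);
            }
            String tagText = new String(msg, pos, eq - pos, StandardCharsets.US_ASCII);
            fields.add(new Field(Integer.parseInt(tagText), Arrays.copyOfRange(msg, eq + 1, end)));
            pos = end + 1;
        }
        return fields;
    }

    private static int indexOf(byte[] a, byte target, int from) {
        for (int i = from; i < a.length; i++) {
            if (a[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] msg = "8=FIX.4.4\u00019=176\u000135=D\u0001".getBytes(StandardCharsets.US_ASCII);
        for (Field f : parse(msg)) {
            System.out.println(f.tag + " -> " + new String(f.value, StandardCharsets.US_ASCII));
        }
    }
}
```

Values are kept as byte arrays rather than strings, since some fields carry text in encodings other than ASCII.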
FIXML is an alternative XML based representation of FIX. A typical example might look like this :-
<FIXML xmlns="http://www.fixprotocol.org/FIXML-4-4" v="4.4">
<Order ID="123456" Side="2" TxnTm="2001-09-11T09:30:47-05:00"
Typ="2" Px="93.25" Acct="26522154">
<Hdr Snt="2001-09-11T09:30:47-05:00" PosDup="N" PosRsnd="N"
SeqNum="521" SID="AFUNDMGR" TID="ABROKER"/>
<Instrmt Sym="IBM" ID="459200101" Src="1"/>
<OrdQty Qty="1000"/>
</Order>
</FIXML>
FIX in this representation can be easier to process, due to the ready availability of standard and open XML tools.
Many corporate messaging systems are XML based, in order to handle ISO 20022, FpML, XBRL and SWIFT (as MT/XML). Being able to represent FIX as FIXML can enable the processing of FIX by these messaging systems.
This further enables the use of the FIXML XSDs for message validation, the use of XSLT, Schematron or ISO Schematron for cross-field validation, and even the use of XQuery rules for cross-field validation. Routing and triggering based upon XPath expression is also enabled.
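To give a flavour of XPath based routing, the following sketch pulls the target CompID out of a FIXML document using only the standard JDK XML APIs. The class name and the choice of routing key are illustrative assumptions. The expression uses local-name() so that it matches whether or not the elements are in a FIXML namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical sketch of XPath based routing on a FIXML document. */
final class XPathRoutingSketch {

    // Matches the Hdr of a message directly under FIXML (or the Hdr of a Batch),
    // regardless of namespace.
    private static final String TID_EXPR =
        "string(/*[local-name()='FIXML']/*/*[local-name()='Hdr']/@TID)";

    /** Returns the TID attribute, or an empty string if there is none. */
    static String targetCompId(String fixml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(fixml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(TID_EXPR, doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String fixml = "<FIXML xmlns=\"http://www.fixprotocol.org/FIXML-4-4\" v=\"4.4\">"
            + "<Order ID=\"123456\"><Hdr SID=\"AFUNDMGR\" TID=\"ABROKER\"/></Order></FIXML>";
        System.out.println("route to: " + targetCompId(fixml));
    }
}
```

A messaging system could use the returned value as a key into its routing table.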
The use of FIXML instead of tag=value is usually at a non-trivial performance cost, and may not be for everyone. Therefore FIXML may play less in the pre-trade space, and more in the post-trade space.
Prior to FIX.4.4, the FIXML representations are defined by DTD, not by XSD.
The Unified FIX Repository annotates them with fixml="0"
suggesting that schemas shouldn't be generated from it.
These earlier forms of FIXML seem to use elements where the modern form
uses attributes.
This converter only understands valid FIXML as of version 4.4 onwards.
The example shown previously is an example of this, and a good way to spot
such messages is to note that the CompIDs are SID and
TID attributes of the Hdr element.
Unfortunately, the FIX Converter doesn't support the older 4.2 style of FIXML.
If you were to add metadata to the FIX.4.2 entry in the Unified FIX Repository, you could generate FIXML schemas in the new 4.4 style (although of course, these wouldn't match the official 4.2 DTD). FIX Converter would then be able to handle this alternative XML representation of FIX.4.2. It would also be necessary to ensure messages didn't directly include repeatingGroups, but instead used componentRefs to components containing repeatingGroups.
Unfortunately, it is common practice to send FIX messages in which the elements are not in the appropriate FIX namespace (or any namespace for that matter). The next two examples break the rules in this way. FIX Converter can cope with this (subject to the versioning related constraints described below). It always generates FIXML with the right namespace.
http://fixwiki.fixprotocol.org/fixwiki/FPL:FIXML_Syntax
has a couple of FIXML examples, reproduced here.
It shows an example of "FIXML 4.2 Version", which is somewhat verbose.
A good way to spot such messages is that they have a FIXMLMessage
element and the CompIDs are CompID elements within
Sender and Target elements.
The example looks like this :-
<FIXML>
<FIXMLMessage>
<Header>
<PossDupFlag Value="N"/>
<PossResend Value="N"/>
<SendingTime>20020103-12 00 01</SendingTime>
<Sender>
<CompID>AFUNDMGR</CompID>
</Sender>
<Target>
<CompID>ABROKER</CompID>
</Target>
</Header>
<ApplicationMessage>
<Order>
... stuff
</Order>
</ApplicationMessage>
</FIXMLMessage>
</FIXML>
Note that in the example, the SendingTime should actually
be 2002-01-03T12:00:01.
The wiki also shows something it calls "FIXML 4.4 Schema Version".
However, in the FIXML 4.4 schema, the CompIDs are not elements
called Sndr or Tgt, they are attributes
called SID and TID of the Hdr element.
The example therefore incorrectly looks like this :-
<FIXML>
<Order ClOrdID="123456" Side="2"
TransactTm="2001-09-11T09:30:47-05:00"
OrdTyp="2" Px="93.25" Acct="26522154">
<Hdr Snt="2001-09-11T09:30:47-05:00"
PosDup="N" PosRsnd="N" SeqNum="521">
<Sndr ID="AFUNDMGR"/>
<Tgt ID="ABROKER"/>
</Hdr>
<Instrmt Sym="IBM" ID="459200101" IDSrc="1"/>
<OrdQty Qty="1000"/>
</Order>
</FIXML>
Note that even the fixml-schema-4-4-examples-20040109.zip
file that can be downloaded from the FIX Protocol site includes a file
called PositionReportExample1.xml which has this incorrect
Hdr structure.
I have also seen variants of the above where the sender CompID
element was called Snd instead of Sndr.
In fixml-schema-4-4-examples-20040109.zip
the included AllocationInstruction.xml file has no namespace,
has bad dates, and has the wrong repeating group structure.
As incorrect messages like this don't agree with the FIX Repository, or schemas generated from it, the FIX Converter doesn't support them.
Just as the SWIFT MT (a compact binary format) is transported over the SWIFT network between organisations and SWIFT MT/XML is sometimes transported between applications within an organisation, we can foresee FIX tag=value being spoken between organisations (such as brokers and fund managers) and FIXML being used within these organisations.
The FIX Converter defines two classes for describing a tag=value message.
The FixTag class has an integer ID and a byte array as its value.
Getters and setters are provided to access these.
The FixTagMessage class is basically a Vector of
FixTags.
FIXML documents are manipulated in org.w3c.dom.Document form.
Your FIX Engine may already be providing you with a tag=value
representation of your message, in which case, use the methods on
FixTag and FixTagMessage to make your
FixTagMessage.
If not, read on...
To use the FIX converter to parse an array of bytes into its representation of a FIX message in tag=value format, use :-
import org.fixprotocol.contrib.converter.*;

Document d = parsed FixRepository.xml
FixConv fc = new FixConv(d);
byte[] b = the FIX message
FixTagMessage ftm = fc.bytesToFixTagMessage(
  b,
  extMaj, // eg: "4", "5", or null
  extMin, // eg: "4", "0", or null
  extSP,  // eg: "SP1", "SP2", or null
  extCv   // eg: "jpmc-to-lse-custom-version", or null
  );
If the tags (BeginString (8), then possibly the
ApplVerID (1128) and then possibly the
CstmApplVerID (1129) tag) identify a version of FIX
that is not present in the repository, then the FIX message may
not be decoded properly.
If the message doesn't fully define its version, and non-null external values are passed in, then these are used.
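The "parsed FixRepository.xml" step in the snippet above can be done with the standard JDK DOM parser. The following is a minimal sketch. The class name is hypothetical, and whether FixConv needs a namespace-aware parse is an assumption on my part.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical helper for loading FixRepository.xml into a DOM Document. */
final class RepositoryLoaderSketch {

    /** Parses the stream into an org.w3c.dom.Document. */
    static Document load(InputStream in) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Assumption: a namespace-aware parse is harmless for the repository
            // and is what you would want when parsing FIXML documents too.
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(in);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

The result of RepositoryLoaderSketch.load(new FileInputStream("FixRepository.xml")) could then be passed to the FixConv constructor.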
To convert a FixTagMessage back to bytes, you can either :-
byte[] b = fc.fixTagMessageToBytes(ftm);
or :-
byte[] b = ftm.toBytes();
The FixTagMessage also has a handy .toString()
method which produces a printable string representation, which is intended
primarily for debug purposes.
In this printable form, the \xHH notation is used for unprintable
ASCII characters, so the SOH characters for example appear as
\x01.
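The \xHH notation can be sketched as follows. This is an illustration rather than the actual FixTagMessage.toString() code. In particular, the use of upper case hex digits and the treatment of bytes above 0x7E are assumptions.

```java
/** Hypothetical sketch of a \xHH style debug rendering of raw message bytes. */
final class PrintableSketch {

    /** Printable ASCII is copied as-is, every other byte becomes \xHH. */
    static String toPrintable(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            int c = b & 0xFF; // treat the byte as unsigned
            if (c >= 0x20 && c < 0x7F) {
                sb.append((char) c);
            } else {
                sb.append(String.format("\\x%02X", c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] msg = "8=FIX.4.4\u000135=D\u0001"
            .getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(toPrintable(msg));
    }
}
```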
To convert a FixTagMessage to FIXML :-
Document d = fc.tagToXml(
  ftm,    // the FixTagMessage
  extMaj, // eg: "4", "5", or null
  extMin, // eg: "4", "0", or null
  extSP,  // eg: "SP1", "SP2", or null
  extCv   // eg: "jpmc-to-lse-custom-version", or null
  );
The FixTagMessage must obey a minimum of FIX rules,
including :-
The first tag must be BeginString (8)
The second tag must be BodyLength (9), although it's not checked
The third tag must be MsgType (35)
For FIX.5.0 onwards, the ApplVerID (1128) tag should define the FIX application version, including service pack number
For a custom version, the CstmApplVerID (1129) tag must identify that version, and that same ID must match the customVersion attribute in the Unified FIX Repository
If encoded text is used (eg: with Shift_JIS encoding), then the MessageEncoding tag has to be present
For FIX.5.0 onwards, if the ApplVerID tag is not found,
the extMaj and extMin values are used
(possibly with the extSP appended, if defined).
If the CstmApplVerID tag is not found, the
extCv is used if present.
If these external values are needed, and are not supplied, then an
exception is thrown.
The generated XML will have the elements in the correct namespace, eg:
http://www.fixprotocol.org/FIXML-4-4
http://www.fixprotocol.org/FIXML-5-0
http://www.fixprotocol.org/FIXML-5-0-SP2
http://www.fixprotocol.org/FIXML-5
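The pattern behind these namespaces can be sketched as below. The rule is inferred from the four example URIs above (major and minor up to FIX.5.0, a service pack suffix for SP1 and SP2, and major only from FIX.5.0SP3 onwards). The class and method names are hypothetical and this is not the converter's own code.

```java
/** Hypothetical sketch of how a FIXML namespace might be derived from a FIX version. */
final class NamespaceSketch {

    private static final String BASE = "http://www.fixprotocol.org/FIXML-";

    /**
     * @param major eg: 4 or 5
     * @param minor eg: 4 or 0
     * @param sp    eg: "SP1", "SP2", or null if there is no service pack
     */
    static String namespaceFor(int major, int minor, String sp) {
        if (sp == null) {
            return BASE + major + "-" + minor;
        }
        // Assumption: the service pack is always written as "SP" followed by a number.
        int spNumber = Integer.parseInt(sp.substring(2));
        if (spNumber >= 3) {
            // From SP3 onwards the namespace is assumed to carry the major number only.
            return BASE + major;
        }
        return BASE + major + "-" + minor + "-" + sp;
    }

    public static void main(String[] args) {
        System.out.println(namespaceFor(4, 4, null));
        System.out.println(namespaceFor(5, 0, "SP2"));
        System.out.println(namespaceFor(5, 0, "SP3"));
    }
}
```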
If the input tag=value message has the ApplExtID (1156) tag
then the resulting FIXML message has the xv attribute.
Certain tags in the FixTagMessage must be in the places
described by the rules above.
The converter will accept other tags in messages and components in an order
different from that defined in the Unified FIX Repository.
Rules relating to repeating groups NoXxx tags, and the first tag in each
group must still be adhered to (else the message can't be processed).
To convert a FixTagMessage to a human readable string :-
String s = fc.fixTagMessageToPretty(
  ftm,
  extMaj, // eg: "4", "5", or null
  extMin, // eg: "4", "0", or null
  extSP,  // eg: "SP1", "SP2", or null
  extCv   // eg: "jpmc-to-lse-custom-version", or null
  );
This produces a string with one line per field, with each tag annotated with its name. Sometimes it can be easier to read a FIX tag=value message this way.
This method has to operate on an instance of a FixConv
as it needs access to the Unified FIX Repository that has been loaded.
If the message doesn't fully define its version, and non-null external values are passed in, then these are used.
To convert from FIXML to tag=value :-
List<FixTagMessage> ftms = fc.xmlToTag(
  d,      // the FIXML document
  extMaj, // eg: "4", "5", or null
  extMin, // eg: "4", "0", or null
  extSP,  // eg: "SP1", "SP2", or null
  extCv   // eg: "jpmc-to-lse-custom-version", or null
  );
The FIXML document is expected to have its elements in the correct namespace. The code still works if no namespace is employed, but you may have to help it determine the version (see the versioning algorithm below for details).
The conversion algorithm inspects the namespace to work out what it can
about the major, minor and service pack number of the message.
It then looks for the v attribute on the root FIXML
element, and also the cv attribute for the custom version.
If this is insufficient, it tries to complete the picture by using the
external version information passed into the xmlToTag method.
The version algorithm is as follows :-
if namespace missing or doesn't define major and minor
  if FIXML v attribute present
    use it
  else
    if extMaj and extMin supplied
      use them
    else
      error
    append extSP if supplied
else if namespace defines major only // this is the new FIX.5.0SP3 onwards style
  if FIXML v attribute present
    use it
    if namespace disagrees with major number in FIXML v attribute
      error
  else
    if extMaj and extMin is supplied
      use them
    else
      error
    append extSP if supplied
else namespace defines major and minor (and possibly service pack too)
  use them
  if FIXML v attribute present
    if namespace disagrees with info in FIXML v attribute
      error
Apologies for the above, but I didn't define the standard. This is an attempt to try to do the right thing by default.
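One reading of the version algorithm above, expressed as Java. This is a sketch rather than the converter's own code, and it makes several assumptions: the namespace has already been split into its major, minor and service pack parts (null where absent), extSP is only appended when the external values are used, the v attribute is compared after removing spaces, and any difference between namespace and v attribute counts as a disagreement.

```java
/** Hypothetical sketch of the FIXML version resolution rules. */
final class VersionSketch {

    /** Returns a version such as "4.4" or "5.0SP2", or throws if it cannot be determined. */
    static String resolve(String nsMaj, String nsMin, String nsSP, String v,
                          String extMaj, String extMin, String extSP) {
        String fromV = (v == null) ? null : v.replace(" ", "");
        if (nsMaj == null) {
            // Namespace missing, or it defines neither major nor minor.
            return (fromV != null) ? fromV : fromExternal(extMaj, extMin, extSP);
        } else if (nsMin == null) {
            // Namespace defines major only (the FIX.5.0SP3 onwards style).
            if (fromV == null) {
                return fromExternal(extMaj, extMin, extSP);
            }
            if (!fromV.startsWith(nsMaj + ".")) {
                throw new IllegalArgumentException("namespace and v attribute disagree on major");
            }
            return fromV;
        } else {
            // Namespace defines major and minor, and possibly service pack too.
            String fromNs = nsMaj + "." + nsMin + (nsSP == null ? "" : nsSP);
            if (fromV != null && !fromV.equals(fromNs)) {
                throw new IllegalArgumentException("namespace and v attribute disagree");
            }
            return fromNs;
        }
    }

    private static String fromExternal(String extMaj, String extMin, String extSP) {
        if (extMaj == null || extMin == null) {
            throw new IllegalArgumentException("version cannot be determined");
        }
        return extMaj + "." + extMin + (extSP == null ? "" : extSP);
    }

    public static void main(String[] args) {
        System.out.println(resolve("4", "4", null, "4.4", null, null, null));
        System.out.println(resolve(null, null, null, null, "5", "0", "SP2"));
    }
}
```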
The custom version algorithm is as follows :-
if FIXML cv attribute present
  use it
else if extCv supplied
  use it
else
  normal, non-customised version
If the input FIXML message has the xv attribute, then the
resulting tag=value message will have the ApplExtID (1156) tag.
You may have noticed the result is a list of tag=value messages. This is because FIXML documents may contain a Batch of messages, eg:
<FIXML xmlns="http://www.fixprotocol.org/FIXML-4-4" v="4.4">
<Batch>
<!-- This header applies to all the messages -->
<Hdr Snt="2001-09-11T09:30:47-05:00" PosDup="N" PosRsnd="N"
SeqNum="521" SID="AFUNDMGR" TID="ABROKER"/>
<Order ID="123456" Side="2" TxnTm="2001-09-11T09:30:47-05:00"
Typ="2" Px="93.25" Acct="26522154">
<Instrmt Sym="IBM" ID="459200101" Src="1"/>
<OrdQty Qty="1000"/>
</Order>
<!-- This second message is a copy of the first, with different ID -->
<Order ID="123457" Side="2" TxnTm="2001-09-11T09:30:47-05:00"
Typ="2" Px="93.25" Acct="26522154">
<!-- I've also included a message specific header -->
<Hdr PosDup="Y"/>
<Instrmt Sym="IBM" ID="459200101" Src="1"/>
<OrdQty Qty="1000"/>
</Order>
</Batch>
</FIXML>
The converter will ensure each output message has tags from the
Batch Hdr, with any values added or overwritten in the
individual message Hdrs.
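The header merging described above can be sketched with plain DOM calls. This is an illustration, not the converter's code. The class and method names are hypothetical, and only Hdr attributes are merged here (any child elements of Hdr are ignored).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical sketch of combining a Batch Hdr with a message specific Hdr. */
final class BatchHdrSketch {

    /** Batch values first, then message values added or overwriting. Either may be null. */
    static Map<String, String> effectiveHeader(Element batchHdr, Element msgHdr) {
        Map<String, String> merged = new LinkedHashMap<>();
        copyAttributes(batchHdr, merged);
        copyAttributes(msgHdr, merged);
        return merged;
    }

    private static void copyAttributes(Element hdr, Map<String, String> out) {
        if (hdr == null) {
            return;
        }
        NamedNodeMap attrs = hdr.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            out.put(a.getNodeName(), a.getNodeValue());
        }
    }

    /** Convenience for trying the sketch out: parses a fragment and returns its root. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

For the Batch example above, the second Order would end up with PosDup="Y" while keeping the SeqNum, SID and TID of the Batch Hdr.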
The FIX converter cannot convert a list of tag=value messages into a single FIXML document.
The generated FixTagMessages have their tags in the same order
as they are defined in the Unified FIX Repository.
Assuming run.sh contains (run.cmd is similar) :-
#!/bin/ksh
java -cp fixconv.jar org.fixprotocol.contrib.converter.FixConvTest "$@"
Usage :-
$ ./run.sh
usage: FixConvTest xmlToTag FixRepository.xml file.fixml [file.tag [extMaj [extMin [extSP [extCv]]]]]
   or: FixConvTest tagToXml FixRepository.xml file.tag [file.fixml [extMaj [extMin [extSP [extCv]]]]]
   or: FixConvTest repair FixRepository.xml file.tag [file2.tag [extMaj [extMin [extSP [extCv]]]]]
   or: FixConvTest pretty FixRepository.xml file.tag [extMaj [extMin [extSP [extCv]]]]
   or: FixConvTest validate file.fixml fixml-main-X-X.xsd
This can be used to convert either way, eg: to convert from FIXML to tag=value :-
$ ./run.sh xmlToTag FixRepository.xml file.fixml
To do the same, and keep the result in a file :-
$ ./run.sh xmlToTag FixRepository.xml file.fixml file.tag
If the input FIXML file has a Batch containing
more than one message, then several output files are written, with filenames
ending in -1, -2, etc.
To convert back :-
$ ./run.sh tagToXml FixRepository.xml file.tag
and keep the result (without overwriting the original) :-
$ ./run.sh tagToXml FixRepository.xml file.tag file.fixml2
The test program displays the message before translation,
some debug tracing (which isn't normally shown when using the
FixConv classes), and the output message(s).
tag=value messages are shown in a debug printable form, ie: using the
.toString() method described earlier, thus resulting
in \xHH notation in the output.
The repair command reads a tag=value message and, if the
BodyLength (9) field is missing, calculates and inserts it.
If the CheckSum (10) field is missing, it calculates and appends it.
This is handy because sometimes people write sample test messages
using text editors and omit these fields (or get them wrong).
Some text editors also append newline characters (\n or
\r\n) to the end of the line - this test program silently
truncates these unwanted additions.
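For reference, the two calculated fields follow the standard FIX definitions: BodyLength counts the bytes after the BodyLength field up to and including the SOH before CheckSum, and CheckSum is the sum of all bytes before the CheckSum field, modulo 256, written as three digits. The sketch below frames a message body accordingly. It is not the repair command's code, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical sketch of calculating BodyLength (9) and CheckSum (10). */
final class RepairSketch {

    /**
     * @param beginString eg: "FIX.4.4"
     * @param body        the fields from MsgType (35) onwards, each ending in SOH
     */
    static byte[] frame(String beginString, byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] head = ("8=" + beginString + "\u0001" + "9=" + body.length + "\u0001")
            .getBytes(StandardCharsets.US_ASCII);
        out.write(head, 0, head.length);
        out.write(body, 0, body.length);

        // CheckSum covers every byte written so far, including the final SOH.
        int sum = 0;
        for (byte b : out.toByteArray()) {
            sum += b & 0xFF;
        }
        byte[] trailer = String.format(Locale.ROOT, "10=%03d\u0001", sum % 256)
            .getBytes(StandardCharsets.US_ASCII);
        out.write(trailer, 0, trailer.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = "35=0\u0001".getBytes(StandardCharsets.US_ASCII);
        String framed = new String(frame("FIX.4.4", body), StandardCharsets.US_ASCII);
        System.out.println(framed.replace('\u0001', '|'));
    }
}
```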
The pretty command dumps the fields in a tag=value
message, one to a line, annotating each with the tag name.
If you have the schemas handy, the validate command
can be used to check your FIXML file for validity.
Validation errors are displayed, and silence is good.
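Validation of this kind can be done with the standard javax.xml.validation API. The sketch below shows the general shape. It is not the code behind the validate command, and the class and method names are hypothetical. For the real FIXML schemas, the schema source would presumably need to be created from a file, eg: new StreamSource(new File("fixml-main-X-X.xsd")), so that the files it refers to can be resolved relative to it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical sketch of XSD validation, collecting error messages. */
final class ValidateSketch {

    /** Returns the validation errors found; an empty list means the document is valid. */
    static List<String> validate(Source xml, Source xsd) {
        final List<String> errors = new ArrayList<>();
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Validator validator = sf.newSchema(xsd).newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    // warnings are ignored in this sketch
                }
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e;
                }
            });
            validator.validate(xml);
        } catch (SAXException | IOException e) {
            // schema problems, fatal errors and I/O failures are reported too
            errors.add(e.getMessage());
        }
        return errors;
    }
}
```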
If a FIXML message is valid, this program should be able to convert it
to tag=value.
When tag=value messages are converted to FIXML, they should be valid.
Note that the FIX Converter can convert session tag=value messages
to FIXML, but these will fail to validate against the FIXML XSDs.
This is not because there is anything wrong with them structurally,
it's just that the folks behind the Unified FIX Repository have decided
that FIX session messages are not required in FIXML.
Looking at the 2010 Edition of the repository, I can see a problem
in the way the FIX.4.4 Logon message is defined (it has a
NoMsgTypes repeatingGroup directly within the
message, rather than nested within a component
which is componentRefd from the message).
This prevents the correct generation of FIXML for that message.
The test pack can be run :-
$ ./test.sh
For each test, an input file is read, a log of the conversion is written,
and one or more output files are written.
Input tag=value files have .tag extension and input
FIXML files have .fixml extension.
The files are named, and sometimes have content within them, to give an
indication of the expected result,
eg: fix44-malformed-tag.tag.
Test messages have been difficult to source, and in many cases have been created by getting sample or real messages and cleaning them up so that they validate. Some came from the FPL site and some are real messages with names changed to protect the guilty. Some are designed to exercise specific functionality and some are designed to be defective in specific ways.
The test data covers :-
data fields
Note that with the encoded data samples, although the data is right in the input and output files, it won't look right on the standard output because the terminal typically doesn't use the right encoding.
The FIX Converter has hard coded knowledge of the relationship between
FIX versions (eg: FIX.5.0SP2) and the enumerated value used
in the ApplVerID (1128) to represent them (eg: 9).
The enumeration's symbolicName attribute in the repository
is not sufficient for this purpose.
A better solution is for the <fix/> elements to have an
additional attribute, such as appVerID="9".
The FIX Converter has hard coded knowledge of how the namespaces work
in FIX.4.x .. FIX.5.0, FIX.5.0SP1 .. FIX.5.0SP2 and how it has been stated that
they will work for FIX.5.0SP3 onwards.
A better solution is for the <fix/> elements to have an
attribute, such as
fixmlNamespace="http://www.fixprotocol.org/FIXML-5".
The 2010 Edition of the Unified FIX Repository does not identify which
tags contain encoded text.
The code assumes the 2011 Edition will include an encoded="1"
attribute on such fields.
For the 2010 Edition, it assumes that if the field name starts with
Encoded and doesn't end in Len, then it's encoded.
Looking at the existing 2010 FixRepository.xml content, this
should probably be ok.
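The 2010 Edition naming heuristic described above amounts to something like the following sketch (the class and method names are hypothetical):

```java
/** Hypothetical sketch of the field name heuristic for spotting encoded text fields. */
final class EncodedFieldSketch {

    /** True for names such as EncodedText, false for EncodedTextLen or Text. */
    static boolean isEncodedTextField(String fieldName) {
        return fieldName.startsWith("Encoded") && !fieldName.endsWith("Len");
    }

    public static void main(String[] args) {
        System.out.println(isEncodedTextField("EncodedText"));
        System.out.println(isEncodedTextField("EncodedTextLen"));
    }
}
```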
The FIX Converter has hard coded knowledge of the relationship between
FIX character encodings (eg: Shift_JIS) and the equivalent
Java Charset (eg: SJIS).
This is technology dependent, and may even be JVM vendor dependent, so
I don't suppose it's a good idea to extend the Unified FIX Repository to
contain this information.
The FIX Repository defines certain fields as data, in that they have a length tag and a data tag of that length. The FIX Specification does not specify how such fields are represented within FIXML attributes. Today, the FIX Converter stores them base64 encoded, as it is content preserving and precedent for this exists in other XML files.
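The base64 representation can be illustrated with java.util.Base64 (Java 8 onwards). This is a sketch only. The class name is hypothetical, and whether the converter uses exactly this base64 variant (padding, no line breaks) is an assumption.

```java
import java.util.Base64;

/** Hypothetical sketch of storing a data field's bytes in an XML attribute value. */
final class DataFieldSketch {

    /** Arbitrary bytes (including SOH or non-ASCII) become attribute-safe text. */
    static String toAttributeValue(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** The reverse, used when converting FIXML back to tag=value. */
    static byte[] fromAttributeValue(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] data = { 0x01, 0x02, (byte) 0xFF };
        System.out.println(toAttributeValue(data));
    }
}
```

The length tag does not need to be carried separately in such a scheme, as it can be recalculated from the decoded bytes.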
XMLData fields contain XML documents, and the root element
becomes a nested element of the enclosing component.
Given what is in FIXimate, this looks reasonable, but I've not seen any
samples to compare against.
Note: beware xercesImpl-2.6.2.jar, as we've seen
org.apache.xerces.dom.DocumentImpl cannot be cast to org.apache.xerces.dom.DeferredDocumentImpl
when processing XMLData fields.
This is a Xerces-J bug, claimed to be fixed in 2.8.0.
We've tried 2.8.1 and it appears fixed.
At time of writing, 2.11.0 is current.
Note that we also observe that Java 1.6.0_20 internally includes
Xerces-J 2.6.2, but this doesn't show the problem.
Presumably Sun included a patch in this version.
This code is supplied as-is and without warranty or any assertion of fitness for purpose (either by the author or his employer). It is as good as the test-pack it is delivered with. It is made available for use by any member of FIX Protocol Limited in good standing.
Please do not distribute directly; refer interested parties to
http://www.fixprotocol.org/.
If you make any modifications to this code, please clearly annotate any modifications and label the resulting code as clearly distinct from the original. If they're bugfixes, please tell the author.
I would like to recognise and acknowledge the assistance of Jim Northey in clarifying aspects of FIX and the Unified FIX Repository, thus allowing me to create this code.