Building an XML Parser for Symbian

Part 1

Introduction

As any Symbian developer will tell you, there is no native, standards-compliant XML parser that can be tailored for use in your applications. To do anything sensible with XML you really need a solid, W3C-compliant XML parser that can handle most things thrown at it.

Over the course of three articles I will describe how I ported a well-known parser to the Symbian operating system, and then what I had to do to make it fit the normal Symbian programming idioms and styles.

Background

Whilst it's not strictly true to say that Symbian does not have an XML parser, it is true that there is not one developers can use. The SMIL parser supplied as part of the MMS editor SDK does not allow customization, and the XML parser in the WAP plugin is not intended for end users: it is marked as internal and is undocumented.

Requirements

Before the candidate parsers can be evaluated, a list of requirements needs to be drawn up so that each one can be judged against the same criteria:

 Low memory usage: The device does not have a lot of memory and applications should treat it as a scarce resource. This means that the runtime requirements of the parser should be low and that it should not require the whole XML file to be in memory at once.

 Incremental parsing: To reduce the runtime memory requirements and fit the asynchronous processing model favoured by Symbian, the parser must accept the document a piece at a time and keep its state between calls, rather than recursing through the whole document in a single call.

 Low stack usage: The Symbian operating system has a limit of 8KB of stack per thread, so the parser must have low stack usage, which more or less rules out recursive parsers.

 W3C standards compliant: The parser should ideally be W3C compliant. It should support entities, namespaces and code pages, and be able to detect whether or not a document is well-formed.

 Symbian style API: The parser should have an API with a Symbian "look and feel", built around one or more classes. It should also adhere to the Symbian Coding Standards document available from the Symbian developer website.

 Available as a Symbian DLL: The parser should be usable across multiple applications so that its runtime cost is kept to a minimum. Ideally there should be one instance of the parser binary shared across multiple applications. Moving the parser into a separate DLL then reduces the footprint of every application that uses it.

 Small compiled code size: Because the device has a slow processor and a relatively small amount of memory compared to a desktop, the compiled code size should be kept as small as possible. This means that some of the more esoteric XML-related features, such as XPath, can possibly be omitted from the device-side parser. Ideally we want something less than 150KB.
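To make the incremental parsing and low stack usage requirements concrete, here is a minimal sketch of a push-style scanner written as a flat state machine. It is an illustration only: the class name ChunkedTagScanner and its methods are hypothetical, it is plain standard C++ rather than Symbian C++, and it recognises only simple start, end and self-closing tags (no comments, CDATA sections, processing instructions or tag name matching). The point is that all state lives in a few member variables, so input can arrive in arbitrary pieces and nesting depth costs an integer rather than stack frames.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only: not Expat and not a Symbian API.
// Recognises <a>, </a> and <a/> and nothing else.
class ChunkedTagScanner {
public:
    ChunkedTagScanner()
        : state_(State::Text), lastChar_('\0'),
          startTags_(0), depth_(0), error_(false) {}

    // Feed the next piece of the document. May be called any number of
    // times; a tag may be split across two calls.
    void Feed(const char* data, std::size_t length) {
        assert(data != nullptr || length == 0);
        for (std::size_t i = 0; i < length; ++i) {
            Step(data[i]);
        }
    }

    int StartTags() const { return startTags_; }
    int Depth() const { return depth_; }

    // True when no tag is half-read, every start tag has been closed
    // and no stray end tag was seen.
    bool Balanced() const {
        return state_ == State::Text && depth_ == 0 && !error_;
    }

private:
    enum class State { Text, TagOpen, StartTag, EndTag };

    // Advance the state machine by one character. No recursion: nesting
    // is tracked by the depth_ counter, not by the call stack.
    void Step(char c) {
        switch (state_) {
        case State::Text:
            if (c == '<') state_ = State::TagOpen;
            break;
        case State::TagOpen:
            if (c == '/') {
                state_ = State::EndTag;
            } else if (c == '>') {
                error_ = true;              // "<>" is not a tag
                state_ = State::Text;
            } else {
                state_ = State::StartTag;
                ++startTags_;
                ++depth_;
                lastChar_ = c;
            }
            break;
        case State::StartTag:
            if (c == '>') {
                if (lastChar_ == '/') --depth_;   // <a/> closes itself
                state_ = State::Text;
            } else {
                lastChar_ = c;
            }
            break;
        case State::EndTag:
            if (c == '>') {
                if (depth_ == 0) error_ = true;   // end tag with nothing open
                else --depth_;
                state_ = State::Text;
            }
            break;
        }
    }

    State state_;
    char lastChar_;
    int startTags_;
    int depth_;
    bool error_;
};
```

A caller would create one scanner, call Feed() each time a buffer of data arrives (for example from an asynchronous read), and check Balanced() once the input is exhausted. A real parser keeps far more state than this, but the shape is the same.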

With the requirements established, the marketplace can now be examined to see which products best match them. Without spoiling the ending too much, there is not a product currently on the market that matches all of the requirements above!

Parsers Available on the Market:

The criteria used to evaluate the parsers on the market were as follows:

 LOC The "Lines of Code" metric gave a good measure of how much work was likely to be involved in porting. A large number is bad.

 Operating Systems The more operating systems the base code will run on the better. It is especially useful to look for parsers already ported to devices, as this will make them easier to port.

 C or C++ The Symbian operating system has really poor support for languages that are not C++ based. Even some regular C programs will not run on the device due to the unconventional programming model. Whilst the ANSI C header files are to a large extent supported, there are still eccentricities and glitches that will confound developers.

 Industry Acceptance The more widely deployed and tested the parser is, the more likely it is to work. The key metric for this is the number of add-on classes and extensions available for it. The more there are, the more committed developers there are out there who know how the internals work and may be willing to assist with porting.

 Active Development The level of support likely to be available was a key factor. The parser should have an active forum and mailing list, as well as, ideally, an open bug list and access to the revision history for the product.

What we are looking for is a relatively mature product, widely deployed, with an active developer base providing timely support and bug fixes.

The only parser available for Symbian is one marketed by Digia in their Enabling Technologies package. I did not look further at this package, however, mainly because of its commercial nature and the question of how well it would operate outside of the HTTP stack, to which it seemed to be rather tightly bound. There are a number of free packages on the market which I felt should provide the same level of support as a commercial package without the additional cost.

Ideally, I would have chosen the Xerces package, but its design is only suited to server and desktop environments. It fully supports all the features of the W3C specification, but it is huge! Whilst I would have liked the opportunity to port the Xerces parser to Symbian, the sheer scope and size of the project was too daunting. If anyone has attempted this, however, I would be interested to see how they fared.

There are a number of additional parsers out there written in C or C++ that would possibly be suitable for the device. It would be impractical to list all of the ones evaluated over the course of a month. However, the Symbian asynchronous programming style and idioms, especially the requirement that there be no writable global data in the DLL, and the fact that the Symbian programming model differs from the traditional C++ model employed by most applications, make it difficult to assess which parser will port with the minimum amount of effort. I know, I tried it!
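The "no writable global data" requirement is the one that most often forces changes to existing C or C++ code. A common way to deal with it, sketched below under the assumption that the parser's globals can be gathered into one place, is to move every writable static variable into a context structure that the caller owns and passes to each function. The names ParserContext, InitContext, ReportError and ConsumeBytes are hypothetical and are not taken from any of the parsers evaluated.

```cpp
#include <cassert>
#include <cstddef>

// The pattern being replaced, shown as a comment because it relies on
// writable static data:
//
//     static int gErrorCount = 0;
//     void ReportError() { ++gErrorCount; }

// Hypothetical per-parser state; the field names are illustrative only.
struct ParserContext {
    int errorCount;
    std::size_t bytesSeen;
};

// The owner of the context puts it into a known state before first use.
inline void InitContext(ParserContext& ctx) {
    ctx.errorCount = 0;
    ctx.bytesSeen = 0;
}

// Each function takes the context explicitly instead of touching a global.
inline void ReportError(ParserContext& ctx) {
    ++ctx.errorCount;
}

inline void ConsumeBytes(ParserContext& ctx, const char* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    ctx.bytesSeen += length;
}
```

Because the state now lives in an object supplied by the caller, two applications (or two parsers in one application) sharing the same DLL code cannot interfere with each other. How easily a given parser can be reshaped this way is a large part of the porting effort.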

Conclusion:

The choice ultimately fell to Expat. This is a widely ported and deployed parser written by one of the founders of XML (James Clark). It has an active open source project site with forums and mailing lists. The only downside is that it is written in C and not C++, but the next article will show that this is not too significant a drawback.
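One reason a C library need not be a serious obstacle for C++ code is that callback-based C interfaces usually carry an opaque "user data" pointer, which a C++ class can use to route callbacks back to itself. The sketch below illustrates that general pattern with a stand-in C-style interface; CParser, CParserEmitStart and ElementCollector are hypothetical names invented for this example. They are not Expat's actual API, and this is not the wrapper class described later in this series.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a C library's callback interface (hypothetical).
typedef void (*StartHandler)(void* userData, const char* name);

struct CParser {
    StartHandler onStart;
    void* userData;
};

// Stand-in for the library reporting a start tag to its client.
inline void CParserEmitStart(CParser* parser, const char* name) {
    if (parser->onStart) {
        parser->onStart(parser->userData, name);
    }
}

// C++ class that receives the C callbacks as ordinary member calls.
class ElementCollector {
public:
    ElementCollector() {
        parser_.onStart = &ElementCollector::Trampoline;
        parser_.userData = this;   // lets the static function find us again
    }

    // userData points at this object, so copying would leave it dangling.
    ElementCollector(const ElementCollector&) = delete;
    ElementCollector& operator=(const ElementCollector&) = delete;

    // In real use the library would call the handler while parsing;
    // here the event is triggered by hand.
    void SimulateStart(const char* name) { CParserEmitStart(&parser_, name); }

    const std::vector<std::string>& Names() const { return names_; }

private:
    // Static function with a plain function-pointer signature that
    // forwards to the member function.
    static void Trampoline(void* userData, const char* name) {
        assert(userData != nullptr);
        static_cast<ElementCollector*>(userData)->OnStart(name);
    }

    void OnStart(const char* name) { names_.push_back(name); }

    CParser parser_;
    std::vector<std::string> names_;
};
```

The static Trampoline function is the only place that deals with the untyped pointer; everything else in the class is normal C++. Expat offers a comparable user-data mechanism for its handlers, which is what makes this style of wrapping possible.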

In the next article the porting of the parser to Symbian will be discussed and the issues found will be addressed. In the final article the wrapper class for the Symbian library will be explained and its design revealed, along with a full download of the source code and binary DLLs.


> Building an XML Parser for Symbian

Thanks for the informative overview of the subject. However, I just wanted to point out that our XML parser (which is a level 1 SAX parser) has no dependencies to/from our HTTP stack :-)

Br, Heikki Pora / Digia

> Building an XML Parser for Symbian

Sorry, I must apologize, you are right,

I missed the link on the right of the Enabling page which details the parser.

I have included the corrected link at the bottom of this note.

Paul

> Building an XML Parser for Symbian

You may be interested to know that I did a port of the Expat library to Symbian as part of the port of the Simkin scripting language.

It can be downloaded from the site.

The port was fairly straightforward - the main problems were with handling uninitialized static data in the DLL.

> Building an XML Parser for Symbian

What I am looking to build is a parser that can be reused in binary form and shared amongst a number of applications - the actual code and docs will be explained in part 3.

One obvious use for this would be web services, for example.

I also wanted to document the porting of the lib and parser from a developer's point of view.