Monday, January 29, 2007

So, you want to implement an XML Schema Processor?

I was recently asked to look into the design, and later the implementation, of XML processors that can do XML parsing, XML Schema validation, XPath/XQuery queries, WSDL and SOAP analysis and enforcement, and much more.

When one has to implement a tool that supports complex, detailed standards, which in turn rely on other complex standards, and especially when short on time, one tries to sort the features into two main categories: "frequently used" and "rarely used". The motivation is that the "frequently used" features are considered and implemented first, while the "rarely used" ones get time and attention later on.

Starting with XML Schema (with the aid, of course, of the very useful book Definitive XML Schema), I was faced with the need to perform such a categorization. Not having enough time to read the standard thoroughly, follow commentary about it on mailing lists such as xml-dev, and complement that knowledge with explanations from books and, of course, actual practice (examining freely available XML Schemas, for example), I was forced into an ad hoc approach.

I browsed the web for an available summary of XML Schema, ideally one that included the categorization mentioned above. Happily, I found one: the article Profiling XML Schema by Paul Kiel, published on xml.com on September 20th, 2006.

My short-term approach will be to try to confirm the results and conclusions presented in this article against several examples of WSDLs and XML Schemas available in the wild (e.g., Google's XML-based interfaces for its services).
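One rough way to start such a confirmation is to tally which XML Schema constructs actually appear in a given schema document. The following is only a minimal sketch in Python, not a description of any particular tool: the function name count_xsd_constructs and the SAMPLE schema are made up for illustration, and a real survey would also have to follow xs:import and xs:include and pull out schemas embedded in WSDL files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace of XML Schema 1.0 constructs (xs:element, xs:complexType, ...).
XSD_NS = "http://www.w3.org/2001/XMLSchema"


def count_xsd_constructs(schema_text):
    """Count how often each XML Schema element appears in one schema document.

    Hypothetical helper for illustration; it does not resolve imports/includes.
    """
    root = ET.fromstring(schema_text)
    prefix = "{" + XSD_NS + "}"
    counts = Counter()
    for node in root.iter():
        # Only count elements in the XML Schema namespace.
        if isinstance(node.tag, str) and node.tag.startswith(prefix):
            counts[node.tag[len(prefix):]] += 1
    return counts


# A tiny made-up schema, used only to exercise the function.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:int"/>
        <xs:element name="note" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

if __name__ == "__main__":
    # Print constructs from most to least frequent.
    for name, n in count_xsd_constructs(SAMPLE).most_common():
        print(name, n)
```

Summing these counts over many schemas would give a first, crude split between "frequently used" and "rarely used" constructs, which could then be compared with the article's profile.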

Let's see how it goes.