On Thu, 12 Dec 2002 16:29:08 +0530 Kapil Karekar wrote:
Congrats, Tahir, on the first release of Xqueeze!
Thanks a lot, Kapil! Well, I'd have announced this here and on the PRC list, but I'm waiting for the reference implementation to be uploaded. Some clean-up and source documentation remain, and I hope to get them done within 2 weeks.
Sameer, I have yet to go through the kuro5hin article, but there have been discussions on XML-Dev[1] about why "binary" XML and other attempts at reducing the complexity are not acceptable to a majority. A compilation of that discussion is on xml.com[2]. The main reservations raised were:
1. Compression adds a computation overhead that may offset any performance gains
2. Usability of XML tends to get restricted due to these attempts
3. Some of the "compact" formats are more difficult to parse than XML
4. Even if everything is fine, the performance benefit obtained may be too insignificant to warrant a switch to the new format.
I can't do much to help with point 4. The only thing needed to switch to xqML is to replace the software that generates and parses XML with software that generates and parses xqML; nothing changes in the software layers above it.
Though I don't have any benchmarking results, I'm expecting xqML to parse much faster than XML. I wrote a parser from scratch in 5 hrs and 250-odd lines of C++ code. While parsing XML, a parser has to look for at least 2 characters (< and &) per character of input, and for PDATA it has to look for *at least* 6 characters (< & > ' " [whitespace chars]). So the number of comparisons required to parse a document grows quickly. In xqML, the *max* comparisons required are 1 for CDATA (<) and 6 for PDATA. Moreover, the parser knows beforehand where those 6 comparisons need to be made, so it doesn't spend time doing useless comparisons.
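To make the XML side of that argument concrete, here is a minimal C++ sketch of the inner loop a parser runs over plain character data. It is only an illustration: the names ScanResult and scanCharData are made up for this mail and are not taken from the Xqueeze sources, and it models nothing about how the xqML parser itself works. It simply counts the two comparisons (< and &) made for every ordinary character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type: result of scanning one run of XML character data.
struct ScanResult {
    std::size_t stop;         // index of the first '<' or '&', or text.size()
    std::size_t comparisons;  // number of character comparisons performed
};

// Hypothetical helper: scan character data from `start` until markup ('<')
// or a reference ('&') begins. Each ordinary character costs two
// comparisons, because both delimiters must be ruled out.
ScanResult scanCharData(const std::string& text, std::size_t start = 0) {
    ScanResult r{start, 0};
    while (r.stop < text.size()) {
        const char c = text[r.stop];
        ++r.comparisons;          // is it '<' ?
        if (c == '<') break;
        ++r.comparisons;          // is it '&' ?
        if (c == '&') break;
        ++r.stop;                 // ordinary character, keep going
    }
    return r;
}
```

For example, scanCharData("hello<b>") stops at index 5 after 11 comparisons: two for each of the five letters, plus one to recognise the '<'.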
Point 2 has been taken care of and xqML aims to be XML for all practical purposes, except in the way it looks.
Point 1 was a seriously valid point IMHO, so I avoided compression completely. BTW, xqML also compresses nicely, though the "niceness" is proportional to the amount of CDATA in the file.
[1] http://lists.xml.org/archives/xml-dev/200104/threads.html#00205
[2] http://www.xml.com/pub/a/2001/04/18/binaryXML.html
PS: If you're interested, please join the xqueeze-users mailing list: http://lists.sourceforge.net/lists/listinfo/xqueeze-users