TclXML
XML Parsing For Tcl
TclXML is an API for parsing XML documents using the Tcl scripting language. It is also a package with several parser implementations. The goal of the TclXML package is to provide an API for Tcl scripts that allows "Plug-and-Play" parser implementations; i.e., an application can use different parser implementations without any change to the application code.

The TclXML package provides a streaming, or "event-based", interface to an XML document. An application using TclXML creates a parser "object", sets a number of callback scripts and then instructs the parser to parse an XML document. The parser scans the XML document's text and, as it finds certain constructs such as the start or end of an element or character data, it invokes the appropriate callback script.

This processing model is very similar to SAX. However, TclXML does not use exactly the same method names, etc., as SAX and so it is not 100% SAX compliant. Functionally, TclXML is equivalent to SAX2.

TclXML v2.X's architecture separates the application interface from the parser implementation, in the same fashion as SAX. Currently there are two parser implementations available: tcl-parser and expat. Version 3.0 of TclXML has the same architecture and introduces a new parser class: libxml2.

The TclXML package provides a parser implementation written purely as a Tcl script. When no other parser is available, this one can always be used. Apart from the obvious advantage of not having to compile any code, this parser implementation is more configurable, can provide better error reporting and recovery, and can also be modified at run-time to achieve various special features.

expat is a C-based XML parser, originally written by James Clark. Its advantage is very high performance.

Now available! The Gnome libxml2 library is also a C-based XML parser. It offers high performance and can also validate documents against a DTD. The latest version of libxml2 can also perform WXS (W3C XML Schema) validation and RELAX NG validation, but that functionality will be provided by the TclDOM package. A wrapper for this library is available as part of version 3.0 of the TclXML package.

Q: I heard that there has been work on a Tcl wrapper for the Apache Xerces-C parser. What's happening with that?

A: (SRB) I started writing a module to wrap the Xerces-C++ parser, but got bogged down, side-tracked and eventually abandoned the effort :-( However, now that Xerces-C++ supports both DTD and WXS validation, it's probably worth trying to resurrect this subproject. A volunteer is needed!

TclXML v3.1 is the current release. This version is known to work with Tcl v8.1 and above[1].

Download from SourceForge:

Pull the CVS tree to get the latest-and-greatest code. The module name is "tclxml". There are no guarantees that the code is in any particular state (or even working at all).

Download the package in one of the forms above and unpack the downloaded file. Detailed instructions on how to build and install TclXML may be found in the

You can never have enough...
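
As a rough illustration of the event-based model described earlier, the sketch below creates a parser object, registers three callback scripts and parses a small document. It assumes that a TclXML installation makes the "xml" package loadable and that the -elementstartcommand, -elementendcommand and -characterdatacommand options behave as described in the TclXML reference documentation. The procedure names (onStart, onEnd, onText), the "events" variable and the sample document are invented for this example.

```tcl
# Assumes TclXML is installed; this loads whichever parser class is available.
package require xml

# Events recorded by the callbacks below (illustrative only).
set events {}

# Invoked at each start tag with the element name and a list of
# attribute name/value pairs; any further arguments are ignored here.
proc onStart {name attlist args} {
    lappend ::events [list start $name $attlist]
}

# Invoked at each end tag with the element name.
proc onEnd {name args} {
    lappend ::events [list end $name]
}

# Invoked for character data found between tags.
proc onText {data} {
    lappend ::events [list text $data]
}

# Create the parser "object" and set its callback scripts.
set parser [xml::parser \
    -elementstartcommand onStart \
    -elementendcommand onEnd \
    -characterdatacommand onText]

# Parse a small document; the callbacks fire as constructs are found.
$parser parse {<greeting lang="en">Hello</greeting>}

# Show the events in the order they were reported.
foreach event $events {
    puts $event
}
```

A parser may deliver the character data of one element in more than one piece, so an application that needs the complete text would typically append each piece to a buffer rather than rely on a single callback.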
Contributions of any kind to the project are welcome. An important contribution you can make is to test the package and report any bugs. Use SourceForge bug tracking to report problems or request features. Please make sure you set the category to "TclXML".

There are three mailing lists for the TclXML project:

There are instructions on SourceForge for subscribing to or unsubscribing from any of these mailing lists. You might like to view, and contribute to, the TclXML page on the Tcl/Tk Wiki. There are many other pages on the Wiki discussing XML and its use with Tcl. Discussion also takes place on the comp.lang.tcl USENET newsgroup.