Xmlgen Writing XML with Tcl xmlgen


Xmlgen — Writing XML with Tcl


package require xmlgen

::xmlgen::declaretag funcname ?tagname?

::xmlgen::doTag tagname ?attr=value ...? ?control? ?body ...?

::xmlgen::buffer varname body

::xmlgen::channel chan body

::xmlgen::put ?arg ...?

::xmlgen::esc ?arg ...?

::xmlgen::setTagformat which format

::xmlgen::makeTagAndBody tagname stuff ?attrary?


Thanks to its clear and concise bracing style, Tcl is at least as well suited as XML for writing text markup. Instead of surrounding pieces of text with tags such as <boo> and </boo>, a Tcl command with the name boo is defined. The command takes as arguments a list of attribute-value pairs and the body of text to be marked up with tags. The text body can in turn contain markup commands.

You will normally first call ::xmlgen::declaretag several times to declare the markup commands. With the generated commands you write your XML.

::xmlgen::declaretag funcname ?tagname?

Creates a markup command with the name funcname. By default the generated tag will be the same as funcname, but it can also be explicitly specified with the 2nd argument. The details about the generated markup commands are described below.

::xmlgen::doTag tagname ?attr=value ...? ?control? ?body ...?

Instead of first declaring a markup command with declaretag, doTag can be called with the tag name to do exactly what the markup command for that tag name would have done. This is typically used if the tag is needed just once in a document.

::xmlgen::buffer varname body

Normally, markup commands write their output immediately to standard output. However, when run within the context of buffer, the output is captured instead in varname.

::xmlgen::channel chan body

Normally, markup commands write their output immediately to standard output. However, when run within the context of channel, the output is sent to the output channel chan.
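The following is a minimal usage sketch of buffer and channel, built only from the commands documented on this page. The tag name doo and the variable name out are chosen for illustration, and the exact white space around the generated tags depends on the current tag format settings.

```tcl
package require xmlgen
namespace import ::xmlgen::*
declaretag doo

# Capture the markup in the variable "out" instead of printing it.
::xmlgen::buffer out {
    doo - captured text
}
puts stderr "captured [string length $out] characters"

# Send the markup to an explicitly chosen channel, here stderr.
::xmlgen::channel stderr {
    doo - text sent to stderr
}
```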

::xmlgen::put ?arg ...?

A convenience function that just prints a string. In the context of buffer or channel its output is redirected accordingly. In contrast to Tcl's puts, put accepts any number of arguments and does not automatically append a newline.

::xmlgen::esc ?arg ...?

Concatenates all arguments and then replaces characters according to the following table:

character replacement
& &amp;
< &lt;
> &gt;
" &#34;
] &#93;

As a consequence, the result of applying esc can safely be used as character data for XML elements.
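The replacement table can be reproduced in plain Tcl with string map. This is only a sketch of the mapping, not xmlgen's own implementation; the name escSketch is hypothetical, and it takes a single string so that no assumption is needed about how esc joins several arguments.

```tcl
# Hypothetical stand-in that applies the replacement table shown above.
proc escSketch {text} {
    # Key/value pairs in table order. string map does not rescan
    # replaced text, so the "&" of "&lt;" is not escaped a second time.
    set map {& &amp; < &lt; > &gt; \" &#34; ] &#93;}
    return [string map $map $text]
}

puts [escSketch {if (a < b && c[0] > 1) say "hi"}]
```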

::xmlgen::setTagformat which format

This proc allows some control over the white space placed around generated XML tags in order to pretty-print the resulting XML. The formatting can be controlled individually for the open and the close tag for each of the four control characters "+-.!". You may have to read the section below on markup commands to understand the meaning of the control characters.

Parameter which is a combination of one control character and either "o" or "c". In format you should pass a string which, when passed through [subst], creates the tag with some surrounding white space or indentation. The substitution will take place in a context where $tag contains the opening or closing tag and $indent contains just the right amount of space characters for proper indentation.

The default settings for tag formatting are as follows:

which format     which format
!o \n$indent$tag !c \n$indent$tag
+o \n$indent$tag +c \n$indent$tag
-o \n$indent$tag -c $tag
.o $tag .c $tag

To implement a totally different indentation scheme, you may override the proc

::xmlgen::formatTag which indent tag

It must return the formatted tag.
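To see what a format string does, it helps to run it through subst by hand. The sketch below uses a hypothetical helper formatSketch that only mimics the substitution context described above (variables tag and indent in scope); it is not the code of ::xmlgen::formatTag.

```tcl
# Hypothetical helper: substitute $tag and $indent into a format string.
proc formatSketch {format indent tag} {
    # subst sees the proc's local variables "indent" and "tag"
    # and also expands backslash sequences such as \n.
    return [subst $format]
}

# The default "-o" format puts the open tag on a new, indented line,
# while the default "-c" format emits the close tag unchanged.
set open  [formatSketch {\n$indent$tag} "  " "<doo>"]
set close [formatSketch {$tag} "  " "</doo>"]
puts "${open}some text${close}"
```

A format passed to setTagformat would be written the same way, for example {$tag} to suppress the line break before a tag.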

::xmlgen::makeTagAndBody tagname stuff ?attrary?

This is the function called by a markup command to analyse its command line. Parameter stuff is the list of arguments which were passed to the markup command. It is parsed into attribute-value pairs, an optional control character and the body. The tagname and the attribute-value pairs are assembled into an XML open tag. Finally a four-element list like

[list $opentag $control $body $closetag]

is returned.

Parameter attrary is not used by the normal markup commands. It supports the implementation of so-called markup macros, which look like markup commands but generate more elaborate markup than just a pair of tags and a body. When calling makeTagAndBody, the macro can pass the name of an array in attrary. If makeTagAndBody finds an attribute which is a key (index) of the array, it does not include it in the opening tag but stores the value in the array. As an example of a macro implementation have a look at ::htmlgen::tab.

An Example

Load xmlgen, import the commands and declare two tags.

package require xmlgen
namespace import ::xmlgen::*
declaretag voo
declaretag doo

Generate and write some tagged text, attributes included.

voo color=red align=left ! {
  doo - some text for doo
  doo - another doo element
  put text on the voo-level
}

The result will look like this:

<voo color="red" align="left">
  <doo>some text for doo</doo>
  <doo>another doo element</doo>
  text on the voo-level
</voo>

Note how the two markup commands handle their body arguments differently depending on the control characters "!" and "-" just before the body. Command voo evaluates its body argument, while doo takes the argument list as is and prints it out. The details about the control characters are described below.

Generated Markup Commands

When you run

::xmlgen::declaretag boo

the markup command boo is defined as

boo ?attr=value ...? ?control? ?body ...?

The command accepts three kinds of arguments which must appear in the order given:

  1. attribute-value pairs,
  2. a control character,
  3. body text.

All arguments are optional. How they are identified and handled is described below.

Attribute Value Pairs

An argument is considered an attribute-value pair if one of the following conditions holds.

  1. It is the empty string.
  2. It satisfies the following regular expression:

    ^ *([A-Za-z_:][a-zA-Z0-9_.:-]*)=(.*)

    This requires an attribute-value pair to consist of optional leading blanks, an attribute name that starts with a letter, underscore or colon and continues with any of the characters "a-zA-Z0-9_.:-", an immediately following equal sign and an arbitrary, possibly empty, rest. Everything after the equal sign is taken as the value.

The first argument which does not satisfy the above conditions stops processing of attribute-value pairs. Later arguments matching the expression are not taken as attribute-value pairs.
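The rules above can be tried out in plain Tcl. The sketch below is not xmlgen code: splitAttrs is a hypothetical helper that applies the regular expression to an argument list and stops at the first argument that is not an attribute-value pair. Silently skipping empty-string arguments is an assumption of this sketch.

```tcl
# Hypothetical helper: separate leading attribute-value pairs from the rest.
proc splitAttrs {arglist} {
    set re {^ *([A-Za-z_:][a-zA-Z0-9_.:-]*)=(.*)}
    set attrs {}
    set i 0
    foreach arg $arglist {
        if {$arg eq ""} {
            # Condition 1: the empty string counts as a pair (skipped here).
            incr i
            continue
        }
        # Condition 2 fails: attribute processing stops for good.
        if {![regexp $re $arg -> name value]} break
        lappend attrs $name $value
        incr i
    }
    # Return the name/value list and the remaining, unparsed arguments.
    return [list $attrs [lrange $arglist $i end]]
}

lassign [splitAttrs {color=red {style=border: solid 2px red;} - x=1 text}] attrs rest
puts "attributes: $attrs"
puts "remaining:  $rest"
```

Note that x=1 ends up in the remaining arguments: it matches the expression but appears after the first non-matching argument.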

Every attribute-value pair must be a word in the Tcl sense. Enclose it in quotes or braces if it contains blanks or other characters which trigger special Tcl-behaviour. For example an HTML style attribute must be written as:

boo {style=border: solid 2px red;} {body stuff}

The whole attribute-value pair is enclosed in braces to make sure that boo sees it as one argument.

The resulting XML will be:

<boo style="border: solid 2px red;">body stuff</boo>

Note how the attribute's value is automatically enclosed in quotes in the XML output. In fact you must not supply the quotes around the value yourself, unless you really want them to be part of the value. Nevertheless you can put quotes into the value if they belong there. They will be sufficiently escaped to not corrupt the XML syntax, i.e.

boo quote=\" - some text

will result in

<boo quote="&#34;">some text</boo>

Control Character

If the first argument which is not an attribute-value pair consists of a single character, it is taken as the control character. It decides how the body is handled (see below).

Currently the control characters "!+-." are defined. If the first argument after the attribute-value pairs contains only a single character which is not defined as a control character, an error is generated. To get rid of the error, add the default control character "." (dot) as an extra argument just in front of the first body argument.
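The detection rule can be summarised in a few lines of plain Tcl. This is a sketch of the rule as described here, not the xmlgen source; the name splitControl and the wording of the error message are hypothetical.

```tcl
# Hypothetical helper: take the arguments left over after the
# attribute-value pairs and split them into control character and body.
proc splitControl {rest} {
    set first [lindex $rest 0]
    if {[llength $rest] > 0 && [string length $first] == 1} {
        if {[string first $first "!+-."] < 0} {
            error "unknown control character \"$first\""
        }
        return [list $first [lrange $rest 1 end]]
    }
    # No single-character argument in front: the default "." applies.
    return [list . $rest]
}

puts [splitControl {- some text}]
puts [splitControl {some text}]
# A one-character body needs an explicit "." in front of it.
puts [splitControl {. x}]
```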

Handling the Body

In general, two independent dimensions of handling the body can be distinguished:

  1. The body can be evaluated, passed through [subst], or taken as is.
  2. The resulting text can be either printed or returned.

Taken together, there are six possible combinations. Four of the six are supported by every markup function. The control character decides which one to use as summarized in the following table.

         print                     return
eval     "!" (exclamation mark)    not supported
subst    "+" (plus)                not supported
as is    "-" (minus)               "." (dot)

Control Characters and their Function.

If no control character is specified, "." (dot) is assumed.

The functions of the four control characters are described below in more detail.

Eval and Print: "!"

This is most useful for markup commands which contain a lot of inner structure, like HTML's table, form or body.

html ! {
  body ! {
    # Tcl commands go here
  }
}

Due to the "!", the body of html is evaluated which in particular means that the command body is run. Since body is again followed by "!", the commands in its body are evaluated and whatever output they produce will be enclosed in <body> and </body>.

Another example use of "!" is:

tr ! td - the only column in this row

where tr runs td while enclosing its output in <tr> and </tr>.

Subst and Print: "+"

When the control character is a plus sign, the body is passed through [subst]. You will most frequently use it for tags which contain text which needs markup. The typical example is HTML's tag "p".

p + {
  Subst allows to [b put things in bold].
}

You could in fact produce the same result with

p - Subst allows to [b put things in bold]

but the former makes it easier to write long paragraphs, because you do not have to escape the end of each line in the source code.

Just Print: "-"

The minus as the control character instructs the markup function to print the body argument as is. Note however, that Tcl has its chance to expand the markup function's parameter before the function is even called.

boo - Some text [doo with markup inline]

will generate

<boo>Some text <doo>with markup inline</doo></boo>

because when boo is actually called, doo was already run by the Tcl interpreter.

Just Return: "."

When the control character is the dot, the body is not changed in any way and not even printed. Instead the markup command returns it as is:

set phrase [big [big some big text]]

will set the variable phrase to

<big><big>some big text</big></big>


Harald Kirsch

See Also

Overview, Htmlgen, Notebook Tabs, Navigation Bar

Xmlgen 1.4 xmlgen