Xmlgen — Writing XML with Tcl
package require xmlgen
::xmlgen::declaretag funcname ?tagname?
::xmlgen::doTag tagname ?attr=value ...? ?control? ?body ...?
::xmlgen::buffer varname body
::xmlgen::channel chan body
::xmlgen::put ?arg ...?
::xmlgen::esc ?arg ...?
::xmlgen::setTagformat which format
::xmlgen::makeTagAndBody tagname stuff ?attrary?
Thanks to its clear and concise bracing style, Tcl is at least as well suited as XML for writing text markup. Instead of surrounding a piece of text with, e.g., <boo> and </boo>, a Tcl command named boo is defined. The command takes as arguments a list of attribute-value pairs and the body of text to be marked up with tags. The text body can in turn contain markup commands.
You will normally first call ::xmlgen::declaretag several times to declare the markup commands. With the generated commands you write your XML.
::xmlgen::declaretag funcname ?tagname?
Creates a markup command with the name funcname. By default the generated tag will be the same as funcname, but it can also be explicitly specified with the 2nd argument. The details about the generated markup commands are described below.
::xmlgen::doTag tagname ?attr=value ...? ?control? ?body ...?
Instead of first declaring a markup command with declaretag, doTag can be called with the tag name to do exactly what the markup command for that tag name would have done. This is typically used if the tag is needed just once in a document.
::xmlgen::buffer varname body
Normally, markup commands write their output immediately to standard output. However, when run within the context of buffer, the output is captured in varname instead.
::xmlgen::channel chan body
Normally, markup commands write their output immediately to standard output. However, when run within the context of channel, the output is sent to the output channel chan.
::xmlgen::put ?arg ...?
A convenience function that just prints a string. In the context of buffer or channel its output is redirected accordingly. In contrast to Tcl's puts, put accepts any number of arguments and does not automatically append a newline.
::xmlgen::esc ?arg ...?
Concatenates all arguments and then replaces characters according to the following table:
character | replacement |
---|---|
& | &amp; |
< | &lt; |
> | &gt; |
" | &quot; |
] | &#93; |
As a consequence, the result of applying esc can safely be used as character data for XML elements.
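The replacement step can be imitated in plain Tcl. The following is only a sketch, not xmlgen's own code: escSketch is a hypothetical name, the arguments are assumed to be joined with single spaces, and the replacement for "]" is assumed to be the numeric character reference &#93;.

```tcl
# Hypothetical stand-in for ::xmlgen::esc; not the package's own code.
proc escSketch {args} {
    # Assumption: the arguments are joined with single spaces.
    set text [join $args " "]
    # [string map] scans the input only once, so the "&" of an entity it
    # has just produced is never escaped a second time.
    set mapping [list & "&amp;" < "&lt;" > "&gt;" \" "&quot;" \] "&#93;"]
    return [string map $mapping $text]
}

puts [escSketch {if a<b & c>d} {say "x[1]"}]
```

A single-pass mapping is used here so that the order of the replacements does not matter.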
::xmlgen::setTagformat which format
This proc allows some control over the white space put around generated XML tags in order to pretty-print the resulting XML. The formatting can be controlled individually for the open and the close tag of each of the four control characters "+-.!". You may have to read the section below on markup commands to understand the meaning of the control characters.
Parameter which is a combination of one control character and either "o" or "c", for example "!o" or "-c". In format you should pass a string which, when passed through [subst], creates the tag with some surrounding white space or indentation. The substitution takes place in a context where $tag contains the opening or closing tag and $indent contains just the right amount of space characters for proper indentation.
The default settings for tag formatting are as follows:
which | format | which | format |
---|---|---|---|
!o | \n$indent$tag | !c | \n$indent$tag |
+o | \n$indent$tag | +c | \n$indent$tag |
-o | \n$indent$tag | -c | $tag |
.o | $tag | .c | $tag |
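How a format string turns into a formatted tag can be tried out without loading xmlgen. The sketch below only imitates the substitution step described above; formatSketch is a hypothetical helper, and the four-space indent is an arbitrary value chosen for the illustration.

```tcl
# Hypothetical stand-in for the formatting step; xmlgen itself is not loaded.
proc formatSketch {format indent tag} {
    # [subst] resolves \n as well as the local variables $indent and $tag.
    return [subst $format]
}

# The default formats for "-o" and "-c", applied to a doo element.
set opened [formatSketch {\n$indent$tag} "    " "<doo>"]
set closed [formatSketch {$tag} "    " "</doo>"]
puts "${opened}some text${closed}"
```

With these two formats the open tag starts on a fresh, indented line while the close tag follows the body directly.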
To implement a totally different indentation scheme, you may override the proc
::xmlgen::formatTag which indent tag
It must return the formatted tag.
::xmlgen::makeTagAndBody tagname stuff ?attrary?
This is the function called by a markup command to analyse its command line. Parameter stuff is the list of arguments which were passed to the markup command. It is parsed into attribute-value pairs, an optional control character and the body. The tagname and the attribute-value pairs are assembled into an XML open tag. Finally a four-element list like
[list $opentag $control $body $closetag]
is returned.
Parameter attrary is not used by the normal markup commands. It supports the implementation of so-called markup macros, which look like markup commands but generate more elaborate markup than just a pair of tags and a body. When calling makeTagAndBody, the macro can pass the name of an array in attrary. If makeTagAndBody finds an attribute which is a key (index) of the array, it does not include it in the opening tag but stores the value in the array. For an example of a macro implementation, have a look at ::htmlgen::tab.
Load xmlgen, import the commands and declare two tags.
package require xmlgen
namespace import ::xmlgen::*
declaretag voo
declaretag doo
Generate and write some tagged text, attributes included.
voo color=red align=left ! {
    doo - some text for doo
    doo - another doo element
    put text on the voo-level
}
The result will look like this:
<voo color="red" align="left">
<doo>some text for doo</doo>
<doo>another doo element</doo>
text on the voo-level
</voo>
Note how the two markup commands handle their body arguments differently depending on the control characters "!" and "-" just before the body. Command voo evaluates its body argument, while doo takes the argument list as is and prints it out. The details about the control characters are described below.
When you run
::xmlgen::declaretag boo
the markup command boo is defined as
boo ?attr=value ...? ?control? ?body ...?
The command accepts three kinds of arguments which must appear in the order given: attribute-value pairs, a control character and the body.
All arguments are optional. How they are identified and handled is described below.
An argument is considered an attribute-value pair if one of the following conditions holds.
^ *([A-Za-z_:][a-zA-Z0-9_.:-]*)=(.*)
This requires an attribute-value pair to consist of optional blanks, an attribute name, an immediately following equal sign and something else. The name must start with a letter, an underscore or a colon, followed by any number of the characters "a-zA-Z0-9_.:-". Everything after the equal sign is taken as the value.
The first argument which does not satisfy the above conditions stops processing of attribute-value pairs. Later arguments matching the expression are not taken as attribute-value pairs.
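The matching rule and the stop-at-first-mismatch behaviour can be checked in plain Tcl. This is a sketch covering only the regular expression quoted above; leadingAttrsSketch is a hypothetical helper and not part of xmlgen.

```tcl
# Hypothetical helper: collect the leading attribute-value pairs and stop
# at the first argument that does not match the pattern.
proc leadingAttrsSketch {args} {
    # The regular expression quoted in the text above.
    set pattern {^ *([A-Za-z_:][a-zA-Z0-9_.:-]*)=(.*)}
    set pairs {}
    foreach arg $args {
        if {![regexp $pattern $arg -> name value]} {
            break
        }
        lappend pairs $name $value
    }
    return $pairs
}

# "-" does not match, so x=1 is no longer treated as an attribute.
puts [leadingAttrsSketch color=red {style=border: solid 2px red;} - x=1]
```

The helper returns a flat list of alternating names and values, which is merely a convenient shape for the illustration.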
Every attribute-value pair must be a word in the Tcl sense. Enclose it in quotes or braces if it contains blanks or other characters which trigger special Tcl behaviour. For example, an HTML style attribute must be written as:
boo {style=border: solid 2px red;} {body stuff}
The whole attribute-value pair is enclosed in braces to make sure that boo sees it as one argument. The resulting XML will be:
<boo style="border: solid 2px red;">body stuff</boo>
Note how the attribute's value is automatically enclosed in quotes in the XML output. In fact you must not supply the quotes around the value yourself, unless you really want them to be part of the value. Nevertheless you can put quotes into the value if they belong there. They will be sufficiently escaped to not corrupt the XML syntax, i.e.
boo quote=\" - some text
will result in
<boo quote="&quot;">some text</boo>
If the first argument which is not an attribute-value pair consists of a single character, it is taken as the control character. It decides how the body is handled (see below).
Currently the control characters "!+-." are defined.
If the first argument after the attribute-value pairs contains only a single character which is not defined as a control character, an error is generated. To get rid of the error, add the default control character "." (dot) as an extra argument just in front of the first body argument.
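In general, two independent dimensions of handling the body can be distinguished:
The body is either evaluated with [eval], passed through [subst], or taken as is.
The result is either printed to the current output, which can be redirected with, e.g., ::xmlgen::buffer, or returned to the caller.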
Taken together, there are six possible combinations. Four of the six are supported by every markup function. The control character decides which one to use as summarized in the following table.
  | print | return |
---|---|---|
eval | "!" (exclamation mark) | not implemented |
subst | "+" (plus) | not implemented |
as is | "-" (minus) | "." (dot) |
If no control character is specified, "." (dot) is assumed.
The functions of the four control characters are described below in more detail.
!
This is most useful for markup commands which contain a lot of inner structure, like HTML's table, form or body.
html ! {
    body ! {
        # Tcl commands go here
    }
}
Due to the "!", the body of html is evaluated, which in particular means that the command body is run. Since body is again followed by "!", the commands in its body are evaluated and whatever output they produce is enclosed in <body> and </body>.
Another example use of "!" is:
tr ! td - the only column in this row
where tr runs td while enclosing its output in <tr> and </tr>.
When the control character is a plus sign, the body is passed through [subst]. You will most frequently use it for tags whose text itself needs markup. The typical example is HTML's tag "p".
p + {
Subst allows to [b put things in bold].
}
You could in fact produce the same result with
p - Subst allows to [b put things in bold]
but the former makes it easier to write long paragraphs, because line ends in the source code need not be escaped.
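The equivalence of the two spellings comes down to when the bracketed command runs. The plain-Tcl sketch below illustrates this without loading xmlgen; wrapSketch is a hypothetical helper that merely returns its words wrapped in a tag.

```tcl
# Hypothetical helper: returns its words wrapped in a tag, roughly what a
# markup command does when it returns its result instead of printing it.
proc wrapSketch {tag args} {
    return "<$tag>[join $args { }]</$tag>"
}

# Brackets outside braces: the interpreter runs the inner command before
# the outer one is even called.
set direct [wrapSketch p Subst allows to [wrapSketch b put things in bold].]

# Brackets inside braces: nothing runs until [subst] processes the text.
set body {Subst allows to [wrapSketch b put things in bold].}
set deferred [wrapSketch p [subst $body]]

puts $direct
puts $deferred
```

Both variables end up with the same string; the braced form only postpones the substitution, which is what makes multi-line bodies convenient.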
The minus as the control character instructs the markup function to print the body argument as is. Note however, that Tcl has its chance to expand the markup function's parameter before the function is even called.
boo - Some text [doo with markup inline]
will generate
<boo>Some text <doo>with markup inline</doo></boo>
because by the time boo is actually called, doo has already been run by the Tcl interpreter.
When the control character is the dot, the body is not changed in any way and nothing is printed. Instead the markup command returns the tagged body as a string:
set phrase [big [big some big text]]
will set the variable phrase to
<big><big>some big text</big></big>
Overview, Htmlgen, Notebook Tabs, Navigation Bar
Xmlgen | 1.4 | xmlgen |