SAXON Extensions

This page describes the extension functions and extension elements supplied with the SAXON product.

If you want to implement your own extensions, see extensibility.html.

These extension functions and elements are provided because some things are difficult or inefficient to achieve using standard XSLT facilities alone. As always, it is best to stick to the standard if you possibly can: most things are possible, even if the solution is not obvious at first sight.

Contents
Extension attributes
saxon:trace
saxon:allow-avt
saxon:disable-output-escaping
xsl:output method
Extension functions
saxon:difference()
saxon:distinct()
saxon:evaluate()
saxon:has-same-nodes()
saxon:line-number()
saxon:if()
saxon:intersection()
saxon:node-set()
saxon:range()
saxon:system-id()
saxon:tokenize()
Extension elements
saxon:assign
saxon:entity-ref
saxon:group
saxon:handler
saxon:item
saxon:output
saxon:preview
saxon:set-attribute
saxon:while

Extension attributes

An extension attribute is an extra attribute on an XSL-defined element. Following the rules of XSLT, such attributes must be in a non-default namespace. For SAXON extension attributes, the namespace must be the SAXON namespace URI "http://icl.com/saxon".

For example, the saxon:trace attribute can be set as follows:


<xsl:template match="item" saxon:trace="yes" xmlns:saxon="http://icl.com/saxon">

The extension attributes supplied with the SAXON product are as follows:

saxon:trace This attribute may be set on the xsl:stylesheet element or the xsl:template element. If set to the value "yes", it causes execution of template rules to be traced to the standard error output. If present on xsl:stylesheet, all template rules are traced; otherwise only selected templates are traced. When present on xsl:stylesheet, it also outputs a list of all the top-level elements in the expanded stylesheet, along with their import precedence.
saxon:allow-avt This attribute may be set on the xsl:call-template element. If set to the value "yes", it causes the name attribute of xsl:call-template to be interpreted as an attribute value template. This allows the selection of the called template to be decided at run-time. Typical usage is:
<xsl:call-template name="{$tname}" saxon:allow-avt="yes">
saxon:disable-output-escaping This attribute may be set on the xsl:attribute element. If set to the value "yes", it causes the value of the attribute to be output as-is, without any escaping of special characters. This affects both regular XML escaping of characters such as ampersand, less-than, and quotes, and also the special escaping applied to URL attributes in the HTML output method, whereby non-ASCII characters are replaced by %XX. Typical usage:
<xsl:attribute name="href" saxon:disable-output-escaping="yes">servlet?x=2&amp;y=3</xsl:attribute>

The method attribute of xsl:output and saxon:output can take the standard values "xml", "html", or "text", or a QName.

If a QName is specified, the local name may be:

  • the value "fop", which directs output to James Tauber's FOP processor (which must be installed separately)
  • the value "xhtml", which outputs the result tree in XHTML format. This follows the same rules as method="xml", except that it follows the guidelines for making the XML acceptable to legacy HTML browsers. Specifically (a) empty elements such as <br/> are output as <br />, with a space before the closing "/>", and (b) empty elements such as <p/> are output as <p></p>. The indent attribute defaults to "yes", and indenting follows the HTML rather than XML rules. Other attributes may be specified as for XML output, e.g. cdata-section-elements and omit-xml-declaration.
  • the fully-qualified class name of a class that implements either the SAX org.xml.sax.DocumentHandler interface, or the SAX2 org.xml.sax.ContentHandler interface, or the com.icl.saxon.output.Emitter interface. If such a value is specified, output is directed to the user-supplied class.

The prefix of the QName must correspond to a valid namespace URI. It is recommended to use the SAXON URI "http://icl.com/saxon", but this is not enforced.
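
As an illustration of the rules above, the following minimal sketch selects the XHTML output method through a QName. It assumes only what is stated above: the saxon prefix is bound to the recommended SAXON URI and the local name is "xhtml". The class name in the comment is hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon">

  <!-- QName method: prefix "saxon" is bound to the SAXON URI, local name is "xhtml" -->
  <xsl:output method="saxon:xhtml" omit-xml-declaration="yes"/>

  <!-- To send output to a user-supplied class instead, the local name would be
       its fully-qualified class name, for example
       method="saxon:com.example.MyEmitter" (a hypothetical class) -->

  <xsl:template match="/">
    <html><body><p/><br/></body></html>
  </xsl:template>

</xsl:stylesheet>
```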


Extension functions

A SAXON extension function is invoked using a name such as saxon:localname().

The saxon prefix (or whatever prefix you choose to use) must be associated with the SAXON namespace URI "http://icl.com/saxon" or (for backwards compatibility) any URI ending with "/com.icl.saxon.functions.Extensions".

For example, to invoke the node-set function, write:

<xsl:variable name="fragment">value</xsl:variable>
..
<xsl:apply-templates
     select="saxon:node-set($fragment)"
     mode="postprocess"
     xmlns:saxon="http://icl.com/saxon"/>

The extension functions supplied with the SAXON product are as follows:

difference(node-set-1, node-set-2) This returns a node-set that is the difference of the two supplied node-sets, that is, it contains all the nodes that are in node-set-1 that are not also in node-set-2.
distinct(node-set-1) This returns a node-set obtained by eliminating nodes in node-set-1 that have duplicate string-values. If several nodes have the same string-value, all but one of them are discarded; it is not defined which one will be retained.
evaluate(string) The supplied string must contain an XPath expression. The result of the function is the result of evaluating the XPath expression. This is useful where an expression needs to be constructed at run-time or passed to the stylesheet as a parameter, for example where the sort key is determined dynamically. The context for the expression (e.g. which variables and namespaces are available) is exactly the same as if the expression were written explicitly at this point in the stylesheet.
if(condition, value-1, value-2) The first argument is evaluated as a boolean; if it is true, the function returns the value value-1, if it is false, it returns value-2. The value may be of any type. Both the second and third arguments are evaluated even though only one of the values is used.
intersection(node-set-1, node-set-2) This returns a node-set that is the intersection of the two supplied node-sets, that is, it contains all the nodes that are in both sets. Note that the union operation can be done using the built-in operator "|".
has-same-nodes(node-set-1, node-set-2) This returns a boolean that is true if and only if node-set-1 and node-set-2 contain the same set of nodes. Note this is quite different from the "=" operator, which tests whether there is a pair of nodes with the same string-value.
line-number() This returns the line number of the current node in the source document within the entity that contains it. There are no arguments.
node-set($fragment) This takes a single argument that is a result tree fragment. Its function is to convert the result tree fragment to a node-set. The resulting node-set contains a single node, which is a root node (class DocumentInfo); below this are the actual nodes added to the result tree fragment, which may be element nodes, text nodes, or anything else. Note that a result tree fragment is not in general a well-formed document, for example there may be multiple element nodes or text nodes as children of the root.
range(number-1, number-2) The two arguments are converted to numbers and then rounded to integers. A new node-set is constructed containing one node for each integer in the range number-1 to number-2 inclusive; if number-2 is less than number-1 the result will be empty. The string-value of each node will be the relevant number; for example range(2, 5) generates a set of four nodes with string-values "2", "3", "4", and "5". The main intended usage is <xsl:for-each select="saxon:range($from, $to)">, which simulates the conventional for loop of other programming languages.
system-id() This returns the system identifier (URI) of the entity containing the current node in the source document. There are no arguments.
tokenize(string-1, string-2?) The first argument is converted to a string and is treated as a list of separated tokens. If the second argument is present, any character in string-2 is taken as a delimiter character, and any sequence of delimiter characters is taken as a token separator. If the second argument is omitted, any sequence of whitespace is taken as a token separator: or to put it another way, the default for string-2 is '&#x09;&#x0A;&#x0D;&#x20;'.
A new node-set is constructed containing one node for each token; if the string is empty or contains a separator only then the result will be empty. The string-value of each node will be the relevant token; for example tokenize("a cup of tea") generates a set of four nodes with string-values "a", "cup", "of", and "tea".
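
The following sketch shows how several of these functions might be called from a stylesheet. It is an illustration only: the source vocabulary (BOOKLIST, BOOK, AUTHOR, TITLE) is borrowed from the saxon:group example elsewhere on this page, and the sortkey parameter and the result element names are hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon">

  <!-- Hypothetical parameter holding an XPath expression chosen at run-time -->
  <xsl:param name="sortkey" select="'TITLE'"/>

  <xsl:template match="BOOKLIST">
    <summary>
      <!-- saxon:if: both value arguments are evaluated, one is returned -->
      <xsl:value-of select="count(BOOK)"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="saxon:if(count(BOOK) = 1, 'book', 'books')"/>
    </summary>

    <!-- saxon:distinct: one AUTHOR node is kept per distinct string-value -->
    <xsl:for-each select="saxon:distinct(BOOK/AUTHOR)">
      <author><xsl:value-of select="."/></author>
    </xsl:for-each>

    <!-- saxon:evaluate: the sort key is decided by the sortkey parameter -->
    <xsl:for-each select="BOOK">
      <xsl:sort select="saxon:evaluate($sortkey)"/>
      <title><xsl:value-of select="TITLE"/></title>
    </xsl:for-each>

    <!-- saxon:range: one node per integer, with string-values "2" to "5" -->
    <xsl:for-each select="saxon:range(2, 5)">
      <n><xsl:value-of select="."/></n>
    </xsl:for-each>

    <!-- saxon:tokenize: whitespace-separated tokens "a", "cup", "of", "tea" -->
    <xsl:for-each select="saxon:tokenize('a cup of tea')">
      <word><xsl:value-of select="."/></word>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because these are functions rather than extension elements, the saxon prefix only needs to be bound to the SAXON namespace URI.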

The source code of these methods, which in most cases is extremely simple, can be used as an example for writing other user extension functions. It is found in the class com.icl.saxon.functions.Extensions.


Extension elements

A SAXON extension element is invoked using a name such as <saxon:localname>.

The saxon prefix (or whatever prefix you choose to use) must be associated with the SAXON namespace URI "http://icl.com/saxon". The prefix must also be designated as an extension element prefix by including it in the extension-element-prefixes attribute on the xsl:stylesheet element, or the xsl:extension-element-prefixes attribute on any enclosing literal result element or extension element.

However, top-level elements such as saxon:handler and saxon:preview can be used without designating the prefix as an extension element prefix.
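
A minimal sketch of the declarations described above: the saxon prefix is bound to the SAXON namespace URI and named in extension-element-prefixes on xsl:stylesheet, so that saxon elements inside templates are treated as instructions rather than literal result elements. The template body is only a placeholder.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    extension-element-prefixes="saxon">

  <xsl:template match="/">
    <!-- Recognised as an extension element because "saxon" is declared above -->
    <p>one<saxon:entity-ref name="nbsp"/>two</p>
  </xsl:template>

</xsl:stylesheet>
```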


saxon:assign

The saxon:assign element is used to change the value of a local or global variable that has previously been declared using xsl:variable (or xsl:param).

As with xsl:variable, the name of the variable is given in the mandatory name attribute, and the new value may be given either by an expression in the select attribute, or by expanding the content of the saxon:assign element.

Example:

<xsl:variable name="i" select="0"/>
<saxon:while test="$i &lt; 10">
    The value of i is <xsl:value-of select="$i"/>
    <saxon:assign name="i" select="$i+1"/>
</saxon:while>
    

saxon:entity-ref

The saxon:entity-ref element is useful to generate entities such as &nbsp; in HTML output. To do this, write:

        <saxon:entity-ref name="nbsp"/>


saxon:group

The <saxon:group> element causes iteration over the nodes selected by a node-set expression.

There is a mandatory attribute, select, which defines the nodes over which the statement will iterate. This is analogous to the select attribute of <xsl:for-each>.

There is also a mandatory group-by attribute to control grouping. The value of this attribute is a string expression, which is applied to each item selected by the select expression. The XSL statements subordinate to the <saxon:group> element are applied once to each group of consecutive source nodes selected by the select expression that have the same value for the group-by expression.

The <saxon:group> element may have one or more <xsl:sort> child elements to define the order of sorting. The sort keys are specified in major-to-minor order. Note that group-by does not itself cause sorting, but it can conveniently be used in conjunction with sorting. The group-by key will often be the same as the major sort key.

The <saxon:group> element must contain somewhere within it a <saxon:item> element. The XSL instructions outside the <saxon:item> element are executed only once for each group of consecutive elements with the same value for the grouping key; the instructions within the <saxon:item> element are executed once for each individual item in the saxon:group selection.

The context for the select expression is the usual context for expressions within an XSL element, i.e. it is based on the current node and current node list of the containing template body.

The context for the group-by expression is as if the expression were written inside the saxon:group loop. If the select expression selects a node-set S, then for each node N within S, the group-by expression is evaluated with N as the context node, with count(S) as the context size, and with the context position taking the values 1..count(S) in turn. The context position represents the position of the node in the node-set after sorting.

If there is an <xsl:sort> element present, then the context for evaluating the sort key follows exactly the same rules as for <xsl:for-each>. In particular, the context position is the position before sorting.

Within the <saxon:group> element, and also within the <saxon:item> element, the context reflects the full node-set being processed (that is, the node-set selected by the select attribute). The context position is the position of the node within this node-set, and the context size is the size of this node-set. It is not possible to determine the size of an individual group, or the position of the current node within an individual group. The instructions preceding <saxon:item> are executed with the first node of a group as the current node, and the instructions following <saxon:item> are executed with the last node of a group as the current node.

The expressions used for sorting and grouping can be any string expressions. The following are particularly useful:

Example: This example groups the BOOK elements having the same AUTHOR.

<xsl:template match="BOOKLIST">
    <h2>
        <saxon:group select="BOOK" group-by="AUTHOR">
            <xsl:sort select="AUTHOR"/>
            <h3>AUTHOR: <xsl:value-of select="AUTHOR"/></h3>
            <saxon:item>
                <p>TITLE: <xsl:value-of select="TITLE"/></p>
            </saxon:item>
            <hr/>
        </saxon:group>
    </h2>
</xsl:template>

saxon:handler

The saxon:handler element is used at the top level of the stylesheet, in the same way as xsl:template. It takes attributes match, mode, name, and priority in the same way as xsl:template, and is considered along with all XSL templates when searching for a template to execute in response to xsl:apply-templates or xsl:call-template. However, the action performed when a saxon:handler is invoked is to call the user-written Java NodeHandler named in the mandatory handler attribute.

The Java node handler must be written as a subclass of com.icl.saxon.handlers.NodeHandler. It is supplied with a Context parameter, which gives access to a wide range of information and services, including the current context in the source document, any parameters on the call, and the Outputter object used to write to the result tree. The Context parameter also provides access to a method applyTemplates() which allows the Java node handler to make a call back to process XSLT templates in the stylesheet.
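
For illustration, a saxon:handler declaration might look like the sketch below. It uses only the attributes described above; the class name com.example.BookHandler and the mode name are hypothetical, and the class is assumed to be a user-written subclass of com.icl.saxon.handlers.NodeHandler available on the classpath.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon">

  <!-- Top-level declaration: BOOK elements processed in mode "index" are passed
       to the (hypothetical) Java class instead of to an XSL template -->
  <saxon:handler match="BOOK" mode="index" handler="com.example.BookHandler"/>

  <xsl:template match="BOOKLIST">
    <!-- The handler competes with ordinary templates when this instruction runs -->
    <xsl:apply-templates select="BOOK" mode="index"/>
  </xsl:template>

</xsl:stylesheet>
```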


saxon:item

The saxon:item element is always used within a saxon:group element. The XSL instructions outside the saxon:item element are executed once for each group (that is, each group of consecutive items with the same value for the group-by expression), while the XSL instructions within the saxon:item element are executed once for each individual item.

See saxon:group for further details.


saxon:output

The saxon:output element is used to define a new output destination. This element is a proprietary SAXON feature. Output reverts to the previous destination when the saxon:output end tag is encountered.

The file attribute is used to direct the output to a named file. The filename is an attribute value template: it will often be parameterised, e.g. using the current element number.

The next-in-chain attribute is used to direct the output to another stylesheet. The value is the URL of a stylesheet that should be used to process the output stream. In this case the output stream must always be pure XML, and attributes that control the format of the output (e.g. method, cdata-section-elements, etc) will have no effect.
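
A minimal sketch of chaining, assuming a second stylesheet at the hypothetical relative URL render.xsl: the result of applying templates here is passed as XML to that stylesheet rather than being serialized directly.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    extension-element-prefixes="saxon">

  <xsl:template match="/">
    <!-- Everything written inside saxon:output becomes the input of render.xsl -->
    <saxon:output next-in-chain="render.xsl">
      <xsl:apply-templates/>
    </saxon:output>
  </xsl:template>

</xsl:stylesheet>
```

Since the intermediate stream is always pure XML, serialization attributes such as method or indent would belong in the second stylesheet, not on this saxon:output element.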

The user-data attribute can contain any string value; it is made available to a user-defined Emitter class named in the method attribute. The value can be accessed through the getUserData() method of the OutputDetails object passed to the Emitter. It is an attribute value template, so the value can be computed at run-time.

The file and next-in-chain attributes are mutually exclusive. They can both be omitted, but this will normally only be useful if a user-supplied Emitter class is defined in the method attribute.

The indent attribute accepts the values "yes" and "no" or an integer; the default is "3" for HTML and "no" for XML. Setting it to "yes" causes the output to be indented, using an algorithm that doesn't meet the strict rules in the spec for the xsl:output element, because it sometimes adds whitespace in places where it shouldn't. The value can also be set to an integer indicating the amount of indentation, for example indent="2" indents by two spaces. The default is three spaces.

The other attributes are the same as on xsl:output.

Here is an example that uses saxon:output:

<xsl:template match="preface">
    <saxon:output file="{$dir}\preface.html">
        <html><body bgcolor="#00eeee"><center>
            <xsl:apply-templates/>
        </center><hr/></body></html>
    </saxon:output>
    <a href="{$dir}\preface.html">Preface</a>
</xsl:template>

Here the body of the preface is directed to a file called preface.html (prefixed by a constant that supplies the directory name). Output then reverts to the previous destination, where an HTML hyperlink to the newly created file is inserted.


saxon:preview

The saxon:preview element is a top-level element used to identify elements in the source document that will be processed in preview mode. The purpose of preview mode is to enable XSLT processing of very large documents that are too big to fit in memory: the idea is that subtrees of the document can be processed and then discarded as soon as they are encountered.

There are two mandatory attributes: mode identifies the mode in which the relevant templates will be applied, and elements is a space-separated list of element names that will be processed in preview mode.

While the source XML document is being read, if an element end tag is encountered for an element that is in the list of preview elements, the relevant template is found (using the normal matching rules, with mode equal to the specified preview mode). This template is then executed. After the template has completed execution, the child nodes of the preview element (but not the element itself, nor its attributes) are deleted from the tree to save memory.

During the matching of a preview element and during the execution of the preview template, only part of the source document is visible. This part includes the ancestors of the preview element, the descendants of the preview element, and all nodes that precede the preview element in document order, except for nodes that are descendants of another preview element.

Global variables are not available to a preview template. The supplied values of global parameters are available, but not the default values of unsupplied parameters.

A preview template may write to a secondary output destination using saxon:output, or it may set global variables using saxon:assign, or it may write data to the source document using saxon:set-attribute. It may also write directly to the principal output destination, but note that in this case each instantiation of the preview template will produce a subtree immediately below the root of the output tree. Normally this means the output document will have multiple element nodes as children of the root. This is not well-formed XML, but you can easily construct a well-formed XML document by referencing this file as an external entity.

One simple use for saxon:preview is simply to delete unwanted parts of the tree to reduce the amount of memory needed. In this case, just provide a preview template that does nothing.
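
The sketch below shows the general shape of a stylesheet that uses preview mode. The element name BOOK, the mode name "preview" and the result element are illustrative assumptions, not names required by SAXON.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    exclude-result-prefixes="saxon">

  <!-- Process each BOOK element in mode "preview" as soon as its end tag is read -->
  <saxon:preview mode="preview" elements="BOOK"/>

  <!-- Preview template: writes one entry per BOOK to the principal output.
       After it completes, the children of BOOK are deleted from the tree. -->
  <xsl:template match="BOOK" mode="preview">
    <entry><xsl:value-of select="TITLE"/></entry>
  </xsl:template>

</xsl:stylesheet>
```

To discard the BOOK subtrees without producing any output, the preview template would simply be left empty: <xsl:template match="BOOK" mode="preview"/>.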


saxon:set-attribute

The saxon:set-attribute element is used to create or modify the value of an attribute in the input document. It can be used, for example, where an attribute has a default value: the calculation of the default value can then be separated from the use of the attribute. It must be used within a template.

The element affected is always the current element. To modify an attribute on a different element, use saxon:set-attribute within an xsl:for-each that selects the required elements.

The mandatory name attribute defines the name of the attribute that is to be set. If this attribute already exists, it is overwritten.

The new value of the attribute may be specified in the same way as with xsl:attribute: either by a select attribute giving a string expression to be evaluated, or as the content of the saxon:set-attribute element.

The example below sets the SIZE attribute to 100 on all BLOCK elements that do not have a SIZE provided:

        <xsl:for-each select="//BLOCK[not(@SIZE)]">
            <saxon:set-attribute name="SIZE" select="100"/>
        </xsl:for-each>
    

saxon:while

The saxon:while element is used to iterate while some condition is true.

The condition is given as a boolean expression in the mandatory test attribute. Because this expression must change its value if the loop is to terminate, the condition will always reference a variable that is updated somewhere in the loop using a saxon:assign element.

Example:

<xsl:variable name="i" select="0"/>
<saxon:while test="$i &lt; 10">
    The value of i is <xsl:value-of select="$i"/>
    <saxon:assign name="i" select="$i+1"/>
</saxon:while>
    

Michael H. Kay
20 April 2000