Last Update: 7/27/03
XSLT is a transformation language for XML documents. It is expressed in the form of a stylesheet that can be applied to an XML document instance to produce an entirely new document. One can use XSLT to transform an XML document instance into three kinds of output, matching the three output methods of XSLT 1.0: XML, HTML, and plain text.
An XSLT transformation has three participants: the input document instance; the stylesheet, which supplies the information about the transformation to occur; and the XSLT processor, which actually reads these two inputs and produces the output document specified by the stylesheet. There are a number of popular XSLT processors available; we'll discuss two of them, Xalan and libxslt. XSLT 1.0 is limited to one principal input document (though a stylesheet can read further documents with the document() function) and one output document; XSLT 2.0, however, will allow multiple inputs and multiple outputs. A number of fairly stable experimental implementations of 2.0 exist if you need multiple input or output documents in an application.
XSLT is intertwined with another W3C specification, XPath. XPath is a language for selecting parts of an XML document. XPath syntax is quite similar to the navigation commands for the UNIX file system. A typical XPath expression looks something like this:
/message/greeting
This expression selects all greeting elements that are children of the document's top-level message element. XSLT uses XPath expressions to decide which portion of a document it will transform.
XSLT is actually one part of a larger specification by the W3C, XSL (Extensible Stylesheet Language), which is made up of two parts: XSLT and XSL-FO. XSL-FO (Formatting Objects) is an XML vocabulary that supplies formatting information for a printed document. One common application of XSLT is the conversion of an XML-encoded document into XSL-FO so that it can be rendered in printable form.
Two widely used open source XSLT processors are Xalan, from the Apache project, and libxslt, the GNOME project's C library.
A formatting-objects (FO) printer is an application that understands the XSL-FO formatting language and can output formats suitable for printing, such as PDF or PostScript. Apache FOP is a popular open source FO printer. Unfortunately, FOP is an incomplete implementation of the FO specification, so don't be surprised if you receive errors regarding unsupported properties when attempting to process FO files. Check the FOP compliance page for information on supported properties.
I'll go over two brief examples that use instances encoded with two simple DTDs, one for a greeting and another for a poem. The stylesheet associated with each document type is an example of a basic XSLT application. Together they illustrate two fundamental concepts that will get you started with XSLT: the creation of templates, and the use of modes, to manipulate the tree structure of an XML document to produce an output document.
An XSLT stylesheet defines a series of templates that use XPath expressions to select different parts of your document tree. Each template describes how the matching part of the tree should be transformed in the output of a transformation. The author of the stylesheet controls when a particular template will be applied; this control is what allows one to reorder the tree of an XML instance in the output document. XPath, as noted above, is a W3C specification for selecting a particular node or set of nodes in an XML document, and its expressions resemble navigation commands in the UNIX file system. Here is a short stylesheet that produces an HTML version of our greeting.xml file from week 4.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" omit-xml-declaration="yes" indent="yes"
encoding="UTF-8" doctype-public="-//W3C//DTD HTML 4.0//EN" />
<xsl:template match="/">
<html>
<head>
<title>A Greeting</title>
<link rel="stylesheet" href="greeting.css" type="text/css"/>
</head>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="message">
<!-- select the current node -->
<h1><xsl:value-of select="." /></h1>
</xsl:template>
</xsl:stylesheet>
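To make the stylesheet above concrete, here is a minimal sketch of an input instance it could be applied to. The week 4 greeting DTD is not reproduced in these notes, so the element names below are an assumption based on the /message/greeting path shown earlier; the real greeting.xml may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical greeting instance: element names are assumed from the
     /message/greeting path example, not taken from the actual DTD. -->
<message><greeting>Hello, world!</greeting></message>
```

With this input, the body of the result should contain <h1>Hello, world!</h1>, because the value of the message element is the concatenated text of everything inside it. With libxslt's command-line tool, the transformation could be run as xsltproc greeting.xsl greeting.xml, assuming the stylesheet is saved under the hypothetical name greeting.xsl.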
The first template above uses the XPath expression "/" to match the root node of the input document, the node that contains the document element. Most XSLT stylesheets begin with a template matching the root. XSLT processing is recursive: the processor moves down the document tree toward its deepest branches, applying a matching template to each node it is asked to process. A stylesheet author uses the <xsl:apply-templates> element to decide which nodes are processed and when. A bare <xsl:apply-templates /> with no select attribute tells the processor to process every child of the current node, applying whichever template matches each one.
The second template, which matches the element "message" in a greeting, introduces a second important XSLT element, <xsl:value-of>. This element is typically used when one wants to output the textual content of an element in the result tree. In the template matching "message", it uses another familiar UNIX-like convention, the "." notation, to output the value of the currently selected node inside an HTML <h1> tag. Now we'll talk about reordering the tree of the input instance in the output document.
The following templates demonstrate the technique of creating a mode in XSLT. Defining a mode allows one to use the same content of the input instance at multiple points in the output document.
<!-- the root template -->
<xsl:template match="poem">
<html lang="{@lang}">
<xsl:apply-templates select="heading" mode="header" />
<body>
<xsl:apply-templates select="heading" />
<xsl:apply-templates select="verse" />
</body>
</html>
</xsl:template>
<!-- header mode -->
<xsl:template match="heading" mode="header">
<head>
<title><xsl:value-of select="./title" /></title>
<link rel="stylesheet" type="text/css" href="poemhtml.css" />
</head>
</xsl:template>
<!-- standard heading processing -->
<xsl:template match="heading">
<h1><xsl:value-of select="./title" /></h1>
<h2 class="author">
<xsl:text>By: </xsl:text>
<xsl:value-of select="./author" />
</h2>
</xsl:template>
This example uses our poem document type from week 4 to show how the same element from an instance can be placed in both the head and body elements of an HTML document. Again the document element is matched, this time by the poem template, but here the apply-templates elements select specific nodes. Notice that two different apply-templates elements refer to the same point in the input document, the element "heading". The first apply-templates, however, also names the mode "header", indicating that the template for "heading" that belongs to the mode "header" should be applied at this point in the output document.
Notice that the second apply-templates element referring to heading does not name a mode. This indicates that the general (mode-less) matching template for heading should be applied at this point in the document. Also note that you can insert attribute values into the output document: the expression {@lang} copies the root element poem's lang attribute into the lang attribute of the <html> element of the output document. Next week we will discuss more powerful tree-reordering constructs of XSLT, and other constructs that allow XSLT to act like a programming language.
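For reference, a poem instance that the templates above could process might look like the following sketch. Only the names poem, lang, heading, title, author, verse, and line appear in these notes; the nesting of line inside verse and all of the text content are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical poem instance: placeholder content, assumed nesting. -->
<poem lang="en">
  <heading>
    <title>An Example Poem</title>
    <author>A. Writer</author>
  </heading>
  <verse>
    <line>First line of the first verse</line>
    <line>Second line of the first verse</line>
  </verse>
  <verse>
    <line>First line of the second verse</line>
  </verse>
</poem>
```

Because the fragment above defines no template for verse, a processor would fall back on XSLT's built-in rules for those elements, which simply copy their text into the output. A real stylesheet would normally add its own verse and line templates.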
XPath is the language used to select portions (nodes) of the input document for transformation in XSLT. In XSLT you'll typically find XPath expressions in the match attribute of <xsl:template> and in the select attributes of <xsl:apply-templates> and <xsl:value-of>. XPath expressions are most commonly used to select elements, text, or attributes, but they can also select namespace nodes, comments, processing instructions, or the root node of a document. Common XPath expressions:
Like navigation expressions used in UNIX, XPath has both absolute and relative paths. The examples above are all absolute paths, in that they explicitly start from "/", the root node. A relative path is any selection expression that doesn't start with "/"; it is evaluated from the current node. An XPath of "line", for example, selects all of the line elements that are children of the current node, wherever that node happens to be in the document. Some common relative XPaths:
As you've seen in the examples above, a selection expression can be further modified by an expression enclosed in "[]". This is called a predicate; there is a variety of constraints you can place on an expression with a predicate to narrow the group of nodes it selects. XPath also provides a number of built-in functions for math, strings, boolean operations, and other common tasks. Here are some examples of predicates and functions:
XSLT provides elements that give one branching and looping logic. I'll go over the most basic constructs here, but you can also define your own functions (as named templates), pass parameters to them, and do a number of other tasks that one can accomplish with a typical programming language. In fact, there is a library called FXSL that implements functional programming in XSLT.
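As an illustration (not the original list from these notes), the sketch below uses a few absolute path expressions in the select attribute of <xsl:value-of>. It assumes a poem document whose root poem element carries a lang attribute and contains heading/title and line elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8" />

  <!-- Every select below is an absolute path: it starts at "/", so it
       returns the same nodes no matter which node is current. -->
  <xsl:template match="/">
    <!-- the title element inside the poem's heading -->
    <xsl:value-of select="/poem/heading/title" />
    <xsl:text>&#10;</xsl:text>
    <!-- the lang attribute of the poem element -->
    <xsl:value-of select="/poem/@lang" />
    <xsl:text>&#10;</xsl:text>
    <!-- the number of line elements anywhere in the document -->
    <xsl:value-of select="count(//line)" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The text output method is used here only so that each selected value appears on its own line.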
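A minimal sketch of relative paths follows, again assuming a poem document in which verse elements are children of poem and contain line elements. Inside the verse template, every path is evaluated from the verse currently being processed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8" />

  <xsl:template match="/">
    <!-- "poem/verse" is relative to the current node, here the root node -->
    <xsl:apply-templates select="poem/verse" />
  </xsl:template>

  <!-- Inside this template the current node is a verse element -->
  <xsl:template match="verse">
    <!-- "line": the line children of the current verse -->
    <xsl:value-of select="count(line)" />
    <xsl:text> line(s); poem language: </xsl:text>
    <!-- "../@lang": the lang attribute of the parent element -->
    <xsl:value-of select="../@lang" />
    <xsl:text>; title: </xsl:text>
    <!-- "../heading/title": up to the parent, then down to title -->
    <xsl:value-of select="../heading/title" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

As in the UNIX file system, "." stands for the current node and ".." for its parent.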
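The following sketch, which is an illustration rather than the original list, combines predicates with a few XPath 1.0 core functions (last(), count(), normalize-space(), string-length(), starts-with()). It assumes the same poem structure as before: a poem element with a lang attribute, a heading/title, and verse elements containing line elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8" />

  <xsl:template match="/poem">
    <!-- positional predicates: first line of the first verse -->
    <xsl:value-of select="verse[1]/line[1]" />
    <xsl:text>&#10;</xsl:text>
    <!-- last(): last line of the last verse -->
    <xsl:value-of select="verse[last()]/line[last()]" />
    <xsl:text>&#10;</xsl:text>
    <!-- count() inside a predicate: verses with more than one line -->
    <xsl:value-of select="count(verse[count(line) &gt; 1])" />
    <xsl:text>&#10;</xsl:text>
    <!-- string functions: length of the title after trimming whitespace -->
    <xsl:value-of select="string-length(normalize-space(heading/title))" />
    <xsl:text>&#10;</xsl:text>
    <!-- boolean function on an attribute: true when lang starts with 'en' -->
    <xsl:value-of select="starts-with(@lang, 'en')" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Note that ">" is written as &gt; inside the attribute by convention, and that <xsl:value-of> prints a boolean result as the string "true" or "false".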
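The function-like facility mentioned above can be sketched with a named template that takes parameters. The template name "labelled" and its parameters "label" and "text" are hypothetical names chosen for this example, and the heading, title, and author elements are assumed to follow the poem document type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8" />

  <!-- A named template acts like a function; "labelled" and its
       parameters are hypothetical names chosen for this sketch. -->
  <xsl:template name="labelled">
    <xsl:param name="label" />
    <!-- the select value is the default used when no "text" is passed -->
    <xsl:param name="text" select="'(none)'" />
    <p>
      <b><xsl:value-of select="$label" /><xsl:text>: </xsl:text></b>
      <xsl:value-of select="$text" />
    </p>
  </xsl:template>

  <!-- Call the named template twice with different arguments -->
  <xsl:template match="heading">
    <xsl:call-template name="labelled">
      <xsl:with-param name="label" select="'Title'" />
      <xsl:with-param name="text" select="title" />
    </xsl:call-template>
    <xsl:call-template name="labelled">
      <xsl:with-param name="label" select="'Author'" />
      <xsl:with-param name="text" select="author" />
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

<xsl:call-template> invokes the template by name rather than by matching a node, and each <xsl:with-param> supplies one argument. Any other content in the input is handled by the built-in rules, since this sketch defines no further templates.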
XSLT provides <xsl:if> to act as an if statement. This particular template tests to see if a table element has a child element, title, and if it does it outputs the title in an HTML <h4> tag.
<xsl:template match="table">
<!-- provide for formatting of a table title -->
<xsl:if test="title">
<h4>
<xsl:value-of select="title" />
</h4>
</xsl:if>
<!-- ....... -->
</xsl:template>
<xsl:choose> lets you execute something like an if .. else statement. This template tests whether a list element's type attribute is 'unordered' and produces a <ul> if so, or an <ol> otherwise. Each test condition in an <xsl:choose> block is contained in an <xsl:when> element. The default case is contained in an <xsl:otherwise> element.
<xsl:template match="list">
<xsl:choose>
<xsl:when test="@type='unordered'">
<ul><xsl:apply-templates select="item" /></ul>
</xsl:when>
<!-- otherwise -->
<xsl:otherwise>
<ol><xsl:apply-templates select="item" /></ol>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:for-each> emulates a for-each or for loop in a programming language. This template uses <xsl:for-each> to select every child row element of the element tablebody and convert it to an HTML <tr> element; a nested <xsl:for-each> converts each entry in the row to a <td>.
<xsl:template match="tablebody">
<xsl:for-each select="row">
<tr>
<xsl:for-each select="entry">
<td style="padding: 2pt"><xsl:apply-templates /></td>
</xsl:for-each>
</tr>
</xsl:for-each>
</xsl:template>
As you can imagine, XSLT is a useful tool for publishing content that is stored in XML. The following chart is taken from Norman Walsh's DocBook Publishing Model website. DocBook is an XML vocabulary that is used to mark up technical documentation, but one could substitute any XML vocabulary for DocBook in the diamond on the upper right hand side of the chart that denotes the content source of the publishing system.
