LIS 450 DP Final Project Spring 02
Final Project Description
The main portion of this project is a DTD for office documentation.
It is designed to be usable for training manuals, online tutorials,
office handbooks, and other types of office reference materials. The
DTD used for all valid document instances included in the final project
is version 6.0. Metadata for a
document instance can be encoded using the Dublin Core. Metadata can be
included within a document instance or in a separate file. The Dublin Core DTD is included.
In addition to the documentation DTD, I explored three different
XML applications we touched upon this semester: XSLT, XSL-FO, and the
Docbook XML DTD. For part one of the project I created a group of XSL
stylesheets that transform valid document instances of the DTD into HTML
documents for display on the web. The second part of the project
involved creating a group of XSL stylesheets that generate
well-formed XSL-FO (Formatting Objects) documents when
applied to valid document instances of the DTD. The final part
involved creating a fairly simple XSL stylesheet that, when
applied to valid instances of the DTD, produces a valid document
instance of the Docbook DTD. The impetus behind this was to use the
Docbook XSL
distribution available on SourceForge and compare the results of
its FO and HTML transformations to the results
produced by my own stylesheets and DTD.
The two valid document instances used as examples throughout the project are:
- Instance One
- Instance Two, Dublin Core metadata file for instance two.
Please see the README file
for an explanation of the directory structure of the project
and for a more detailed explanation of problems with the output of the
various transformations.
HTML Transformations
Part one of the project is a group of stylesheets that transform valid
document instances into HTML 4.01. The group can be found at this
location, or viewed in a
single file.
Project DTD to HTML
- Instance One
- Instance Two
Issues:
- A CDATA section was needed to embed JavaScript in an XSL stylesheet, since the notation that conventionally opens and closes a JavaScript block resembles an XML comment.
- The HTML produced is not valid 4.01, because a property that suppresses the output of the dc and rdf namespace declarations in the HTML documents still needs to be found.
Additionally, the "type" attribute is invalid, but since some browsers don't recognize the valid language attribute, type has been substituted.
- The FO namespace xmlns attribute also needs to be suppressed in the HTML documents output by the stylesheets.
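The CDATA and namespace issues above might be approached along the lines of the following sketch. It is not taken from the project stylesheets: the "document" and "heading" element names and the script body are hypothetical. The exclude-result-prefixes attribute is a standard part of XSLT 1.0 and is offered here only as a candidate for the property being sought. It affects namespace declarations on literal result elements, so declarations carried over from the source tree by xsl:copy or xsl:copy-of would still appear.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    exclude-result-prefixes="dc rdf fo">

  <!-- The html output method writes script content without escaping it -->
  <xsl:output method="html" version="4.01"
      doctype-public="-//W3C//DTD HTML 4.01 Transitional//EN"
      doctype-system="http://www.w3.org/TR/html4/loose.dtd"/>

  <!-- "document" and "heading" are hypothetical names, not the project DTD's -->
  <xsl:template match="/document">
    <html>
      <head>
        <title><xsl:value-of select="heading"/></title>
        <!-- The CDATA section lets the comment-like markers and the "<"
             operator pass through the XML parser as plain text -->
        <script type="text/javascript"><![CDATA[
<!--
function smaller(a, b) { return a < b ? a : b; }
// -->
]]></script>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Without the CDATA section, the XML parser would treat the lines between the two markers as a real comment and drop them before the XSLT processor ever saw the script.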
FO Transformations
Part two of the project is a group of stylesheets that transform valid
document instances into well-formed instances of XSL-FO. The group can be found at this
location, or viewed in a
single file.
Once the well-formed FO documents were created,
FOP, an XSL formatting
objects printer, was used to transform the FO documents into the PDF format.
Results of the Transformation
Project DTD to FO
- Instance One
- Instance Two
Project DTD to PDF
- Instance One
- Instance Two
Issues:
- FOP does not implement all of the properties specified in the
FO specification, so it is important to review the supported properties at this
page.
- The biggest difficulty in creating the template for document layout
using FO was writing a suitable template to process the <table>
element of the DTD, specifically calculating the correct
width for each column and for individual cells that required a special width
within the general table. This is necessary because FOP does
not support automatic table layout the way a typical web browser does.
As a crude workaround I created a template that assigns a fixed amount
of space to each column, depending on the number specified in the columns
attribute available for use with the <table> element. Basing the
widths on the length of the string residing in each table cell would
likely be the most efficient approach.
- FOP has been upgraded recently; I plan to rewrite some of the stylesheets and test how much the FO property support has improved.
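The fixed-width workaround described above could look roughly like the following sketch. It is an illustration, not the project's actual template: only the <table> element and its columns attribute come from the project DTD as described here, while the "row" and "cell" element names, the body-width parameter, and the 16 cm default are assumptions. Because XSLT 1.0 has no loop construct, a recursive named template emits one fo:table-column per column.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Assumed usable width of the page body, in centimetres -->
  <xsl:param name="body-width" select="16"/>

  <xsl:template match="table">
    <fo:table table-layout="fixed" width="{$body-width}cm">
      <!-- Divide the body width evenly among the declared columns -->
      <xsl:call-template name="emit-columns">
        <xsl:with-param name="remaining" select="number(@columns)"/>
        <xsl:with-param name="width" select="$body-width div @columns"/>
      </xsl:call-template>
      <fo:table-body>
        <xsl:apply-templates select="row"/>
      </fo:table-body>
    </fo:table>
  </xsl:template>

  <!-- Recursive helper: writes one fo:table-column, then calls itself -->
  <xsl:template name="emit-columns">
    <xsl:param name="remaining"/>
    <xsl:param name="width"/>
    <xsl:if test="$remaining &gt; 0">
      <fo:table-column column-width="{$width}cm"/>
      <xsl:call-template name="emit-columns">
        <xsl:with-param name="remaining" select="$remaining - 1"/>
        <xsl:with-param name="width" select="$width"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- "row" and "cell" are hypothetical element names -->
  <xsl:template match="row">
    <fo:table-row>
      <xsl:apply-templates select="cell"/>
    </fo:table-row>
  </xsl:template>

  <xsl:template match="cell">
    <fo:table-cell>
      <fo:block><xsl:apply-templates/></fo:block>
    </fo:table-cell>
  </xsl:template>

</xsl:stylesheet>
```

A content-based variant would replace the even split with a width derived from string-length() of the cell text, at the cost of a pass over every cell in each column before the columns can be written.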
Docbook XSL Exploration
The third portion of the assignment was an exploration of the Docbook
XSL
stylesheet distribution maintained on SourceForge. An XSL stylesheet that transforms
valid instances of the Project DTD to
valid Docbook 4.1.2 was created. The documents resulting from this
transformation were then transformed to HTML and FO using the stylesheets
of the distribution.
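As a rough illustration of what such a Project DTD to Docbook stylesheet involves, the sketch below maps a few source elements onto Docbook 4.1.2 elements and writes the Docbook document type declaration. The source element names ("document", "section", "heading", "para") and the id attribute are hypothetical stand-ins, since the Project DTD's vocabulary is not reproduced here; whether the output validates depends on the actual source structure.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emit the DocBook XML 4.1.2 DOCTYPE so the result can be validated -->
  <xsl:output method="xml" indent="yes"
      doctype-public="-//OASIS//DTD DocBook XML V4.1.2//EN"
      doctype-system="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/>

  <!-- Hypothetical root element of a project document instance -->
  <xsl:template match="/document">
    <article>
      <xsl:apply-templates/>
    </article>
  </xsl:template>

  <!-- Carry any id across as an attribute of the Docbook section -->
  <xsl:template match="section">
    <section>
      <xsl:copy-of select="@id"/>
      <xsl:apply-templates/>
    </section>
  </xsl:template>

  <xsl:template match="heading">
    <title><xsl:apply-templates/></title>
  </xsl:template>

  <xsl:template match="para">
    <para><xsl:apply-templates/></para>
  </xsl:template>

</xsl:stylesheet>
```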
Transformations
Project DTD to Docbook
- Instance One
- Instance Two
Docbook to HTML
- Instance One
- Instance Two
Issues:
- The conversion of my internal document links did not work. These
came up as question marks.
- The stylesheets were unable to deal with the placement of an
anchor before each title element, even though this placement was
valid according to the Docbook v4.1.2 DTD.
Docbook to FO
- Instance One
- Instance Two
Issues:
- The conversion of my internal document links did not work. The processor
gave errors to the effect that the stylesheets were unable to process
the anchors and internal references to those anchors within the
document.
- The stylesheets were unable to deal with the placement of an
anchor before each title element, even though this placement was
valid according to the Docbook v4.1.2 DTD.
- The tables were not processed correctly. In some instances they were
partially processed; in other cases just the text of the table cells was
sent to the FO. This was again problematic because the tables posed
no problems to validation.
Docbook FO to PDF using FOP
- Instance One
- Instance Two
Issues:
In general, it was hard to find consistent documentation for learning
the FO language. This is probably the result of the specification being
so new. Additionally, FOP was difficult to use at times because a number
of FO properties are not supported by the printer at the moment. This made
working through tutorials frustrating, since the examples did not
work when sent to FOP.
Last Update: 7/28/03