22nd April, 2009

Python, libxslt and custom resolvers

I recently had a requirement to read in an XSLT file and transform it using Python.

I didn't write these transforms, and they pulled in other XSL files. Normally this wouldn't be a problem, but the includes assumed the XSL files were being served from a webserver and were of the form:

<xsl:include href="/root/dir/filename.xslt"/>

This caused libxslt to crash as it couldn't resolve the filename.

The solution

The solution turned out to be very basic indeed, although it took quite a lot of Googling. Simply register a custom resolver with libxml2 using the command:

libxml2.setEntityLoader(entity_resolver)

Now you need a function, which will get called on each include. It has to be defined before the setEntityLoader call is made:

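def entity_resolver(url, id, context):
    # Called with the URL of each include that needs to be loaded
    if url.startswith("file:///"):
        # Strip the "file://" scheme to leave an absolute path
        new_filename = url[len("file://"):]
        #
        # Put code here to work out the correct path on the FILESYSTEM
        # that new_filename should really be loaded from
        #
        # Return an open FILE object
        file_buffer = open(new_filename, "rt")
        return file_buffer
    # Any other URL is not handled by this resolver
    return None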
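
As a rough sketch of what the "work out the correct path" step might look like, the example below maps the webserver-style path onto a local directory that holds the stylesheets. The names map_url_to_path and make_resolver are hypothetical helpers of my own, not part of libxml2, and the sketch assumes the includes arrive as file:/// URLs, as in the function above.

```python
import os
from urllib.parse import urlparse, unquote


def map_url_to_path(url, root):
    """Map a webserver-style include URL onto a path beneath root.

    Hypothetical helper: root is the local directory that stands in
    for the webserver's document root.
    """
    # Drop the scheme and decode any percent-escapes in the path
    server_path = unquote(urlparse(url).path)
    # Normalise, then strip the leading separator so the join below
    # lands beneath root rather than at the filesystem root
    relative = os.path.normpath(server_path).lstrip(os.sep)
    return os.path.join(root, relative)


def make_resolver(root):
    """Build a resolver function bound to a particular local root."""
    def entity_resolver(url, id, context):
        if url.startswith("file:///"):
            # Return an open file object for the remapped path
            return open(map_url_to_path(url, root), "rt")
        # Any other URL is not handled by this resolver
        return None
    return entity_resolver
```

With this in place, the registration would become libxml2.setEntityLoader(make_resolver("/path/to/stylesheets")), where the directory is whatever local copy of the webserver tree you have. Building the resolver from a function keeps the root directory out of a global variable, which makes it easier to try out on its own.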
