<p:declare-step type="p:directory-list">
<p:output port="result"/>
<p:option name="path" required="true"/> <!-- anyURI -->
<p:option name="include-filter"/> <!-- RegularExpression -->
<p:option name="exclude-filter"/> <!-- RegularExpression -->
</p:declare-step>
The p:directory-list step generates an XML document describing the contents of the directory specified by the path option.
Example: assume that our path is file:///Users/paul/test/ and that this directory contains a hidden file .DS_Store and three child directories A, B and C. The generated XML is then as follows:
<c:directory xmlns:c="http://www.w3.org/ns/xproc-step" name="test" xml:base="file:/Users/paul/test/">
<c:file name=".DS_Store"/>
<c:directory name="A"/>
<c:directory name="B"/>
<c:directory name="C"/>
</c:directory>
The root element carries the name and the base URI of the directory.
Its child elements are the child directories and files; here the only file is .DS_Store, a hidden file on Mac OS X.
The XProc code used for generating this directory-list in XML is:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
xmlns:cx="http://xmlcalabash.com/ns/extensions" name="myPipeline">
<p:output port="result" sequence="true"/>
<p:variable name="path" select="'file:///Users/paul/test/'">
<p:empty/>
</p:variable>
<p:directory-list>
<p:with-option name="path" select="$path">
<p:empty/>
</p:with-option>
</p:directory-list>
</p:declare-step>
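The exclude-filter option from the step declaration can be tried on the same pipeline, for instance to leave hidden entries such as .DS_Store out of the listing. The following is a minimal sketch rather than a tested recipe. The regular expression is an assumption about how hidden files are named, and it is anchored at both ends so that it does not depend on whether a processor matches the pattern against the whole entry name or only part of it. The version="1.0" attribute is also an addition: the final XProc 1.0 Recommendation requires it on a top-level step, while the pipelines in this post omit it.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
    name="myPipeline" version="1.0">
    <p:output port="result"/>
    <!-- Same directory as above. The exclude-filter pattern (an assumption)
         is meant to match every entry whose name starts with a dot. -->
    <p:directory-list exclude-filter="^\..*$">
        <p:with-option name="path" select="'file:///Users/paul/test/'">
            <p:empty/>
        </p:with-option>
    </p:directory-list>
</p:declare-step>
```

The filter is written as an attribute on the step because its value is a fixed string; the path keeps the p:with-option form used throughout this post.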
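To repeat over the child directories, we add a p:for-each that iterates over the c:directory children of the listing:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"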
xmlns:cx="http://xmlcalabash.com/ns/extensions" name="myPipeline">
<p:output port="result" sequence="true"/>
<p:variable name="path" select="'file:///Users/paul/test/'">
<p:empty/>
</p:variable>
<p:directory-list>
<p:with-option name="path" select="$path">
<p:empty/>
</p:with-option>
</p:directory-list>
<p:for-each name="directoryloop">
<p:output port="result" sequence="true"/>
<p:iteration-source select="/c:directory/c:directory"/>
<p:identity/>
</p:for-each>
</p:declare-step>
The result is a sequence of 3 documents:
<c:directory xmlns:c="http://www.w3.org/ns/xproc-step" name="A"/>
<c:directory xmlns:c="http://www.w3.org/ns/xproc-step" name="B"/>
<c:directory xmlns:c="http://www.w3.org/ns/xproc-step" name="C"/>
This is the reason why the result port has been set to sequence='true'.
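If a single result document is preferred over a sequence, the documents produced by the loop can be collected with the standard p:wrap-sequence step. This is a sketch under assumptions: the wrapper element name directories is made up for illustration, and version="1.0" is added because the final XProc 1.0 Recommendation requires it on a top-level step.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
    name="myPipeline" version="1.0">
    <!-- Only one document leaves the pipeline, so no sequence="true" here -->
    <p:output port="result"/>
    <p:variable name="path" select="'file:///Users/paul/test/'">
        <p:empty/>
    </p:variable>
    <p:directory-list>
        <p:with-option name="path" select="$path">
            <p:empty/>
        </p:with-option>
    </p:directory-list>
    <p:for-each name="directoryloop">
        <p:output port="result" sequence="true"/>
        <p:iteration-source select="/c:directory/c:directory"/>
        <p:identity/>
    </p:for-each>
    <!-- Wrap the sequence coming out of the loop in one element;
         "directories" is a hypothetical name chosen for this example -->
    <p:wrap-sequence wrapper="directories"/>
</p:declare-step>
```

The loop itself still produces a sequence; only the outer result port changes, because p:wrap-sequence accepts a sequence on its source port and emits a single document.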
For each child directory, for example
<c:directory xmlns:c="http://www.w3.org/ns/xproc-step" name="A"/>
we now want to build a new directory list. To do so, we concatenate the path with the name of the child directory found in the XML above; in XProc:
<p:variable name="dirpath" select="p:resolve-uri(concat(c:directory/@name, '/'),$path)"/>
<p:directory-list>
<p:with-option name="path" select="$dirpath"/>
</p:directory-list>
Placing these steps inside the p:for-each gives the following pipeline:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
xmlns:cx="http://xmlcalabash.com/ns/extensions" name="myPipeline">
<p:output port="result" sequence="true"/>
<p:variable name="path" select="'file:///Users/paul/test/'">
<p:empty/>
</p:variable>
<p:directory-list>
<p:with-option name="path" select="$path">
<p:empty/>
</p:with-option>
</p:directory-list>
<p:for-each name="directoryloop">
<p:output port="result" sequence="true"/>
<p:iteration-source select="/c:directory/c:directory"/>
<p:variable name="dirpath" select="p:resolve-uri(concat(c:directory/@name, '/'),$path)"/>
<p:directory-list>
<p:with-option name="path" select="$dirpath"/>
</p:directory-list>
<p:identity/>
</p:for-each>
</p:declare-step>
This gives us, for the first iteration:
<c:directory xmlns:c="http://www.w3.org/ns/xproc-step" name="A" xml:base="file:/Users/paul/test/A/">
<c:file name="A1.xml"/>
<c:file name="A2.xml"/>
</c:directory>
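The file names are relative to the directory, so we turn them into absolute URIs with p:make-absolute-uris:
<p:make-absolute-uris match="c:file/@name">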
<p:with-option name="base-uri" select="$dirpath"/>
</p:make-absolute-uris>
leading to
<c:directory xmlns:c="http://www.w3.org/ns/xproc-step" name="A" xml:base="file:/Users/paul/test/A/">
<c:file name="file:/Users/paul/test/A/A1.xml"/>
<c:file name="file:/Users/paul/test/A/A2.xml"/>
</c:directory>
This gives us enough information to iterate over the files with the XPath expression /c:directory/c:file, to load them and to process them further.
The complete pipeline, including the loop that loads each file, then becomes:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
xmlns:cx="http://xmlcalabash.com/ns/extensions" name="myPipeline">
<p:output port="result" sequence="true"/>
<!-- <p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>-->
<p:variable name="path" select="'file:///Users/paul/test/'">
<p:empty/>
</p:variable>
<p:directory-list>
<p:with-option name="path" select="$path">
<p:empty/>
</p:with-option>
</p:directory-list>
<p:for-each name="directoryloop">
<p:output port="result" sequence="true"/>
<p:iteration-source select="/c:directory/c:directory"/>
<p:variable name="dirpath" select="p:resolve-uri(concat(c:directory/@name, '/'),$path)"/>
<p:directory-list>
<p:with-option name="path" select="$dirpath"/>
</p:directory-list>
<p:make-absolute-uris match="c:file/@name">
<p:with-option name="base-uri" select="$dirpath"/>
</p:make-absolute-uris>
<p:for-each name="fileloop">
<p:iteration-source select="/c:directory/c:file"/>
<p:variable name="file" select="/c:file/@name"/>
<p:load name="file">
<p:with-option name="href" select="$file"/>
</p:load>
<p:identity/>
</p:for-each>
<!-- <p:identity/>-->
</p:for-each>
</p:declare-step>
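p:load fails with a dynamic error when the referenced document does not exist or is not well-formed, so a single non-XML file in one of the child directories stops the whole pipeline. One way to guard against this is to restrict the inner listing with include-filter. The sketch below assumes that all XML files carry the extension .xml; the pattern is an illustration, anchored at both ends so that it does not depend on whether a processor matches the whole entry name or only part of it. The version="1.0" attribute is added because the final XProc 1.0 Recommendation requires it on a top-level step.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
    name="myPipeline" version="1.0">
    <p:output port="result" sequence="true"/>
    <p:variable name="path" select="'file:///Users/paul/test/'">
        <p:empty/>
    </p:variable>
    <p:directory-list>
        <p:with-option name="path" select="$path">
            <p:empty/>
        </p:with-option>
    </p:directory-list>
    <p:for-each name="directoryloop">
        <p:output port="result" sequence="true"/>
        <p:iteration-source select="/c:directory/c:directory"/>
        <p:variable name="dirpath" select="p:resolve-uri(concat(c:directory/@name, '/'),$path)"/>
        <!-- include-filter (pattern is an assumption): keep only entries ending in .xml -->
        <p:directory-list include-filter="^.*\.xml$">
            <p:with-option name="path" select="$dirpath"/>
        </p:directory-list>
        <p:make-absolute-uris match="c:file/@name">
            <p:with-option name="base-uri" select="$dirpath"/>
        </p:make-absolute-uris>
        <p:for-each name="fileloop">
            <p:iteration-source select="/c:directory/c:file"/>
            <!-- The absolute URI is read directly from the current c:file document -->
            <p:load>
                <p:with-option name="href" select="/c:file/@name"/>
            </p:load>
        </p:for-each>
    </p:for-each>
</p:declare-step>
```

Compared with the pipeline above, the intermediate file variable and the trailing p:identity are dropped, since the href can be selected directly from the current document and p:load already provides the loop's output.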
If you see ways to optimize this approach, please comment.
Comments
Manfred Staudinger (unauthenticated)
Aug 21, 2009
Hi Paul,
First let me thank you for this blog entry! I'm running Calumet on Win2k with JDK version 1.6.0_14 from the
command line. When I run your very first example ("directory listing"), it works fine (Exit code: 0).
Differences to your result (significant or not) are:
1. additional xml-declaration plus CRLF
2. on the doc-element, the xml:base attribute is missing
I can run also the 2nd example ("repeat-over"), the third ("Adding a loop within the loop") and
the fourth ("make-absolute-uris"). Still the xml-declaration gets added to each document of the
resulting sequence, but no xml-base appears.
But when I replace (4th example) the
<p:identity/>
with
<p:for-each name="fileloop">
<p:iteration-source select="/c:directory/c:file"/>
<p:variable name="file" select="/c:file/@name"/>
<p:load name="file">
<p:with-option name="href" select="$file"/>
</p:load>
<p:identity/>
</p:for-each>
then I get
Exception in thread "main" com.emc.documentum.xml.xproc.pipeline.model.DynamicError: {http://www.w3.org/ns/xproc-error}XC0026: It is a dynamic error if the document does not exist or is not well-formed.
Original message: {http://www.w3.org/ns/xproc-error}XC0026: It is a dynamic error if the document does not exist or is not well-formed.
Original message: XPROC_ERROR:
Original message: Content is not allowed in prolog.
Original message: Content is not allowed in prolog.
Original message: Content is not allowed in prolog.
Original message: Content is not allowed in prolog.
Original message: XPROC_ERROR:
Original message: Content is not allowed in prolog.
Original message: Content is not allowed in prolog.
Original message: Content is not allowed in prolog.
Original message: Content is not allowed in prolog.
at com.emc.documentum.xml.xproc.util.impl.StepUtil.dynamicError(StepUtil.java:527)
at com.emc.documentum.xml.xproc.pipeline.model.step.impl.AbstractStepImpl.run(AbstractStepImpl.java:324)
... many more ...
Regards, Manfred
manfred.staudinger@gmail.com