Pipelines all the way

A pipeline consists of a chain of processes arranged so that the output of each process is the input of the next.
This principle became especially popular in the UNIX command shell.
It is also used in tools such as Yahoo Pipes.
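To make the principle concrete, here is a minimal sketch in Python. The stage names (grep_error, sort_lines, uniq) and the sample log lines are made up for illustration; they loosely mimic a shell chain like `grep error | sort | uniq` and are not taken from any real tool.

```python
from functools import reduce


def pipeline(*stages):
    """Chain stages so that the output of each one is the input of the next."""
    def run(data):
        return reduce(lambda value, stage: stage(value), stages, data)
    return run


# Hypothetical stages, loosely modelled on `grep error | sort | uniq`.
def grep_error(lines):
    # Keep only the lines that mention "error".
    return [line for line in lines if "error" in line]


def sort_lines(lines):
    return sorted(lines)


def uniq(lines):
    # Drop adjacent duplicates, as the UNIX `uniq` command does.
    result = []
    for line in lines:
        if not result or result[-1] != line:
            result.append(line)
    return result


find_errors = pipeline(grep_error, sort_lines, uniq)

# Made-up sample input.
log = ["error: disk full", "ok", "error: cpu hot", "error: disk full"]
unique_errors = find_errors(log)
print(unique_errors)
```

Each stage knows nothing about its neighbours; it only agrees on the shape of the data passed along (here, a list of lines). That shared contract is what makes the stages freely recombinable.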

The same idea has been ported to XML processing where the XML output of one process is piped to the input of another process.

For example: take an XML file as input, include an external fragment and output the expanded file; validate it; if it is valid, output the PSVI-enhanced XML (PSVI being the post-schema-validation infoset) and use that as input for a schema-aware XSLT transform that outputs an XHTML file. Every process takes XML in and gives XML out.
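The steps above can be sketched with nothing but the Python standard library. This is a simplified illustration under loud assumptions: the document, the fragment name fragment.xml and all function names are hypothetical; validate() is a hand-written structural check standing in for real schema validation (it produces no PSVI); and to_xhtml() is plain Python standing in for an XSLT transform, with the XHTML namespace omitted for brevity. Only the include step uses a real mechanism, XInclude via xml.etree.ElementInclude.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files" so the sketch needs no disk access.
FRAGMENTS = {"fragment.xml": "<para>Included text</para>"}

SOURCE = (
    '<article xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="fragment.xml"/>'
    "</article>"
)


def loader(href, parse, encoding=None):
    # Resolve xi:include targets from the in-memory table.
    if parse == "xml":
        return ET.fromstring(FRAGMENTS[href])
    return FRAGMENTS[href]


def expand_includes(doc):
    # Step 1: replace xi:include elements with the referenced fragment.
    ElementInclude.include(doc, loader=loader)
    return doc


def validate(doc):
    # Step 2: stand-in for schema validation (assumption: a real pipeline
    # would use a schema processor here and pass on the PSVI).
    if doc.tag != "article" or doc.find("para") is None:
        raise ValueError("document is not valid")
    return doc


def to_xhtml(doc):
    # Step 3: stand-in for the XSLT transform; maps each para to a p.
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for para in doc.iter("para"):
        ET.SubElement(body, "p").text = para.text
    return html


def run_pipeline(doc, steps):
    # XML in, XML out: every step takes an element and returns one.
    for step in steps:
        doc = step(doc)
    return doc


result = run_pipeline(ET.fromstring(SOURCE), [expand_includes, validate, to_xhtml])
xhtml = ET.tostring(result, encoding="unicode")
print(xhtml)
```

Because every step has the same element-in, element-out shape, steps can be reordered, dropped or swapped for a real validator or XSLT processor without touching run_pipeline.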

Historically we have seen multiple implementations of this idea.

The bad news was of course that all those implementations were different.
It took many years before a standardisation effort, named XProc, got started; it is now nearing final Recommendation status.
But even then we will have to wait and see whether all the tools will support XProc.
There is of course a new player, Calabash, but I am not yet aware of others.
What will the makers of Netkernel, Orbeon, Stylus Studio, ... do regarding XProc?

For the RDF world I am aware of only one RDF pipelining framework: SPARQLMotion, built into TopBraid Composer Maestro Edition and TopBraid Live, which is becoming a very powerful but proprietary environment.

Rather than waiting too many years, as happened in the XML world, I would like to see an earlier standardisation effort for RDF pipelines.

Since I am in wishing mode now anyway, a framework in which we can integrate XML and RDF pipelines sounds good. Something for Netkernel 4.0?
