
Some SPARQL extension function tricks

Identifiers with qualifiers in round brackets

DBpedia uses qualifiers in round brackets in its identifiers to distinguish between entities that share the same name.
An example of such an identifier is http://dbpedia.org/resource/Belle_de_jour_(film).

[Image: Belle de jour in Tabulator]

Using the ARQ (Jena) functions to grab the namespace and the local name of this identifier doesn't seem to work as hoped for.
SELECT ?ln ?ns
WHERE {
LET (?ln := afn:localname(<http://dbpedia.org/resource/Belle_de_jour_(film)>)) .
LET (?ns := afn:namespace(<http://dbpedia.org/resource/Belle_de_jour_(film)>)) .
}

results in

[Image: Result of Jena functions]

This is how we solved the issue.

SELECT ?ln ?ns ?fn ?li ?name ?namespace
WHERE {
  # Rebuild the full IRI string from the two parts returned by ARQ.
  LET (?ns := afn:namespace(<http://dbpedia.org/resource/Belle_de_jour_(film)>)) .
  LET (?ln := afn:localname(<http://dbpedia.org/resource/Belle_de_jour_(film)>)) .
  LET (?fn := fn:concat(?ns, ?ln)) .
  # Position just after the last '/'.
  LET (?li := (smf:lastIndexOf(?fn, "/") + 1)) .
  # From that position on is the name; everything before it is the namespace.
  LET (?name := afn:substr(?fn, ?li)) .
  LET (?namespace := afn:substr(?fn, 0, ?li)) .
}
We concatenate the two parts back into the complete string and find the position of the last '/' in it.
With this position, the substring after it is the name and the substring up to and including it is the namespace.

Functions with the prefix 'fn' are the XQuery/XPath functions and operators supported by ARQ.
Functions with the prefix 'afn' are ARQ extension functions.
The function 'lastIndexOf', however, is an extension available only in TopQuadrant's SPARQLMotion.
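
For readers who want to check the last-slash logic outside a SPARQL engine, here is a minimal sketch of the same idea in Python. The helper name split_iri is made up for this illustration; it is not part of ARQ or SPARQLMotion, and it only mirrors the substring steps of the query above.

```python
from typing import Tuple


def split_iri(iri: str) -> Tuple[str, str]:
    """Split an IRI into (namespace, name) at its last '/'.

    Hypothetical helper mirroring the query: find the last '/',
    add 1, then take the substrings before and after that position.
    """
    # str.rfind returns -1 when there is no '/', so cut becomes 0:
    # the namespace is then empty and the whole string is the name.
    cut = iri.rfind("/") + 1
    return iri[:cut], iri[cut:]


namespace, name = split_iri("http://dbpedia.org/resource/Belle_de_jour_(film)")
print(namespace)
print(name)
```

Because the split only looks for '/', the round brackets in the identifier play no role in where the string is cut.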

Constructing dataset descriptions using voiD

voiD is a vocabulary for describing datasets. A dataset here is a collection of data that is published and maintained by a single provider as RDF and is accessible through dereferenceable URIs, SPARQL endpoints, data dumps, or a combination of these.

Besides general metadata about the dataset, such as dcterms:title and dcterms:creator, a voiD description can also include statistical data about the dataset.
Examples are:
  • the number of triples
  • the number of resources
  • the number of distinct subjects
  • the number of distinct objects
Here is an example SPARQL query that can be used for this purpose.

CONSTRUCT {
  :X a void:Dataset .
  # Total number of triples
  :X void:statItem _:b0 .
  _:b0 scovo:dimension void:numberOfTriples .
  _:b0 rdf:value ?triples .
  # Number of resources of type skos:Concept
  :X void:statItem _:b1 .
  _:b1 scovo:dimension void:numberOfResources .
  _:b1 scovo:dimension skos:Concept .
  _:b1 rdf:value ?concepts .
  # Number of resources of type dcterms:Agent
  :X void:statItem _:b2 .
  _:b2 scovo:dimension void:numberOfResources .
  _:b2 scovo:dimension dcterms:Agent .
  _:b2 rdf:value ?agents .
  # Number of distinct subjects
  :X void:statItem _:b3 .
  _:b3 scovo:dimension void:numberOfDistinctSubjects .
  _:b3 rdf:value ?nrsubj .
  # Number of distinct objects
  :X void:statItem _:b4 .
  _:b4 scovo:dimension void:numberOfDistinctObjects .
  _:b4 rdf:value ?nrobj .
}
WHERE {
  LET (?triples := smf:countMatches(?s, ?p, ?o)) .
  LET (?concepts := smf:countMatches(?concept, rdf:type, skos:Concept)) .
  LET (?agents := smf:countMatches(?agent, rdf:type, dcterms:Agent)) .
  LET (?nrsubj := smf:countResults("SELECT DISTINCT ?s WHERE {?s ?p ?o.}")) .
  LET (?nrobj := smf:countResults("SELECT DISTINCT ?o WHERE {?s ?p ?o.}")) .
}
Once again, this relies heavily on SPARQL extension functions specific to TopQuadrant's SPARQLMotion.
For an overview of the SPARQL extension functions available in different products, see the SPARQL Extension Function Survey by Leigh Dodds.
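
To make clear what each of these statistics counts, independent of any particular SPARQL engine, here is a small Python sketch over an in-memory list of triples. Everything in it is an assumption for illustration: the function name dataset_statistics, the ex: identifiers and the sample data are invented, and prefixed names are treated as plain strings rather than resolved IRIs.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings

RDF_TYPE = "rdf:type"


def dataset_statistics(triples: Iterable[Triple]) -> Dict[str, int]:
    """Compute the same five counts as the CONSTRUCT query (hypothetical helper)."""
    graph = set(triples)  # an RDF graph is a set, so duplicate triples collapse
    return {
        "triples": len(graph),
        "concepts": len({s for s, p, o in graph
                         if p == RDF_TYPE and o == "skos:Concept"}),
        "agents": len({s for s, p, o in graph
                       if p == RDF_TYPE and o == "dcterms:Agent"}),
        "distinct_subjects": len({s for s, _, _ in graph}),
        "distinct_objects": len({o for _, _, o in graph}),
    }


# Invented sample data: two concepts that share one creator.
sample = [
    ("ex:c1", "rdf:type", "skos:Concept"),
    ("ex:c2", "rdf:type", "skos:Concept"),
    ("ex:a1", "rdf:type", "dcterms:Agent"),
    ("ex:c1", "dcterms:creator", "ex:a1"),
    ("ex:c2", "dcterms:creator", "ex:a1"),
]

stats = dataset_statistics(sample)
for key, value in stats.items():
    print(key, value)
```

Note that the distinct-object count includes class resources such as skos:Concept, just as the SELECT DISTINCT ?o query does.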

Hope

I hope that the upcoming SPARQL 1.1 specification will include a long list of standardized functions.
For now, too many of our solutions still depend on tool-specific SPARQL extensions.
