Christopher Guy Yocum

ORCID: 0000-0002-7241-3264

The first post in this series covered the select query form and used individuals with the name Báeth as a guide. We will continue our sojourn through SPARQL with the list of individuals named Báeth collected in the first post in this instalment where the construct query form will be introduced and discussed.

Simply but opaquely stated, the construct query form takes one sub-graph and returns a new sub-graph which is transformed from the original. This means that you can search the dataset for a particular pattern then transform it into another pattern that the user supplies. While this might not sound very useful for IrishGen and, by extension, Báeth, what it allows the user to do is to extract interesting and useful parts of the overall dataset such as Báeth’s children or, even more interestingly, Báeth’s genealogy as reconstructed by the reasoner.

We will start by examining Báeth’s children so that we can show their relationship to Báeth. In this, we will need to choose a particular Báeth, using the query below:

prefix foaf:  <http://xmlns.com/foaf/0.1/>
prefix rel: <http://purl.org/vocab/relationship/>

select ?x (count(distinct ?child) as ?children)
from <tag:stardog:api:context:all>
where {
	?x foaf:name ?y;
        rel:parentOf ?child
    filter regex(?y, "^B[áa]eth$", "i")
}
group by ?x
order by desc(?children)

The above query (with reasoning enabled) returns 15 results as shown below:

x children
http://example.com/Rawl_B502/clann_aingeda.trig#Báeth 12
http://example.com/Rawl_B502/genelach_úa_m-bairrche.trig#Bóeth 2
http://example.com/Rawl_B502/genelach_benntraige.trig#Báeth 1
http://example.com/LL/de_genelach_dail_messi_corbb.trig#Baeth-a245b020 1
http://example.com/LL/genelach_h_n-enechglais.trig#Baeth-50d79733 1
http://example.com/Rawl_B502/genelach_úa_m-bairrche.trig#Báeth 1
http://example.com/Rawl_B502/item_úi_máeli_rubae_la_laigniu.trig#Báeth 1
http://example.com/Rawl_B502/genelach_ciannachta.trig#Báeth 1
http://example.com/LL/genelach_h_mugroin_i_m-maig_liphi.trig#Baeth 1
http://example.com/Rawl_B502/genelach_ceníuil_báeth.trig#Báeth 1
http://example.com/LL/forthart_fea.trig#Baeth 1
http://example.com/Rawl_B502/úib_luchta.trig#Báeth 1
http://example.com/Rawl_B502/do_primforslointib_Lagen_inso.trig#Báeth-604282d0 1
http://example.com/Rawl_B502/genelach_úa_rossa_i_úa_n_dícolla.trig#Báeth 1
http://example.com/LL/eoganachta_casil.trig#Buith 1

We discover that http://example.com/Rawl_B502/clann_aingeda.trig#Báeth has 12 children that are directly related to that URL, which means that they are all described as his children in a particular text and in one particular MS for which see the URL above, and who is well suited to demonstrate the basic construct facility. The construct query below will return all twelve of these and Stardog Studio will render the graph into a nice visual format.

prefix rel: <http://purl.org/vocab/relationship/>

construct {
    <http://example.com/Rawl_B502/clann_aingeda.trig#Báeth> rel:parentOf ?child
} 
from <tag:stardog:api:context:all>
where {
    <http://example.com/Rawl_B502/clann_aingeda.trig#Báeth> rel:parentOf ?child
}

construct queries generally have have two bodies. The first is just after the construct keyword. This first body is the target RDF the user wishes to create. The from keyword here serves a similar purpose as it does in the select query form and denotes that the user is interested in having the Triplestore consider the entire dataset when running the query. The where keyword denotes the pattern that the Triplestore should match. In this case, the pattern is the same as the RDF that the construct clause constructs. The effect of this is to extract the sub-graph of all of Báeth’s children from the entire graph of all of IrishGen. This makes it possible to visualise. It is no use attempting to visualise the entire graph as a user would most likely get lost or confused. Extracting and visualising sub-graphs allows the user to target persons of interest and their genealogical information without becoming overloaded by extraneous information. Additionally, attempting to render visualisations of vast graphs is a huge computational task and most user’s machines would not be powerful enough to accomplish it.

As an aside, the above query can be written more concisely by using the single body form of construct, which can be useful when the user wishes to have a small subset extracted from the graph then used in visualisation. The single body form of construct can only be used with the constructed graph pattern exactly matches the pattern used in the where clause:

prefix rel: <http://purl.org/vocab/relationship/>

construct 
from <tag:stardog:api:context:all>
where {
    <http://example.com/Rawl_B502/clann_aingeda.trig#Báeth> rel:parentOf ?child
}

The visualisation that is returned by Stardog Studio is below.

As one can see, this produces a star-like structure of children around Báeth. At this point one can start exploring the wider graph by right-clicking on the nodes and choosing “Expand from node”, which will pull from the entire graph (the construct query being just a starting point). However, in Stardog Studio there is a problem where unless the “Query All Graphs” property is set to true in Stardog, it will not expand. Briefly, GraphDB has “Query All Graphs” essentially on by default and is not a property that the user can set. Stardog has the property off by default which can cause confusion when a user is encountering both Triplestores. A second and more pressing problem is that expanding nodes in the face of reasoning can cause a query that a user’s laptop cannot finish in a reasonable time or finish at all in the face of the query timeout restrictions. Although, in larger server based installations, it would be possible. In Stardog, this can be overcome by turning reasoning off when doing an exploration of the graph if the query does not return in a reasonable time but this limits what is possible. It is up to the user’s discretion to understand what their query involves and what compromises they are willing to tolerate.

To set the Stardog parameter “Query All Graphs” to true in Stardog Studio, the user will need to go to “Databases” then select their database from the list of databases. In the right-hand pane, the user will need to choose “Properties”. In the search box directly below the “Properties” header, type “query all”. This will cause the option to appear and the user can then check the checkbox to enable “Query All Graphs”.

To contrast this with a select query that returns the same information but in a textual format, the query below recreates the construct query.

prefix rel: <http://purl.org/vocab/relationship/>

select ?child
from <tag:stardog:api:context:all>
where {
    <http://example.com/Rawl_B502/clann_aingeda.trig#Báeth> rel:parentOf ?child
}

with the result being:

child
http://example.com/Rawl_B502/úib_luchta.trig#Comdellach
http://example.com/Rawl_B502/úib_luchta.trig#Fogartach
http://example.com/Rawl_B502/úib_luchta.trig#Díummassach
http://example.com/Rawl_B502/úib_luchta.trig#Flann
http://example.com/Rawl_B502/úib_luchta.trig#Flanngus
http://example.com/Rawl_B502/úib_luchta.trig#Lassirne
http://example.com/Rawl_B502/clann_eichlich.trig#Nárgusa
http://example.com/Rawl_B502/úib_luchta.trig#Aingid
http://example.com/Rawl_B502/úib_luchta.trig#Eichlech
http://example.com/Rawl_B502/úib_luchta.trig#Airechtach
http://example.com/Rawl_B502/úib_luchta.trig#Dóelgus
http://example.com/Rawl_B502/úib_luchta.trig#Abiél

The same set of URLs has been returned, but merely as a list which cannot be further visualised or queried.

Continuing with the graph visualsation created above, one can explore the graph a bit at a time as one can see below, with reasoning turned off to ensure that the query can finish:

A much more ambitious graph to construct is the graph of Báeth’s ancestors or to reconstruct his genealogical line from the database. This query introduces a few new constructs which are handy when working with SPARQL.

prefix rel: <http://purl.org/vocab/relationship/>

construct {
    ?s rel:childOf ?parent.
    ?y rel:parentOf ?baeth.
} 
from <tag:stardog:api:context:all>
where {
    values ?baeth {<http://example.com/Rawl_B502/clann_aingeda.trig#Báeth> }

    {
      ?s rel:ancestorOf ?baeth;
         rel:childOf ?parent
    } union {
      ?y rel:parentOf ?baeth
    }
}

The construct form first shows what will be the product of the query. The values keyword is used to store static URLs which will be used repeatedly within the query. This cuts down on the visual complexity of the query overall. The next part of the query is the meat of the process. Expressed informally, this says: “return all people who are the ancestors of Báeth and their parent-child relationship and return the parent of Báeth as well”. These are then reconstructed in the construct form to create the reconstructed genealogy of Báeth. The union is used to get Báeth themselves attached to the output through their parent which makes it slightly easier to interpret the visualisation.

One will note that the genealogy only goes back to Imchad. This is due to the fact that Imchad is not linked to any other part of the genealogies. The reason for this is outside the scope of this post but could be as simple as a missing owl:sameAs link from elsewhere. The curators tend to be conservative and if there is no clear evidence that two individuals are the same, the policy is to leave them as is. Finally, a note about performance. This query took 21000 milliseconds or about 21 seconds to run. Reconstructing this by hand would take much longer and creating a visualisation even longer still. That it results in an interactive visualisation is just one more additional benefit.

The graph created by the construct query above is an amalgam of data from all the genealogical collections in the corpus. A graph from one single tradition would require the use of the from named portion of the query and will be the subject of another post because from named requires special treatment as it interacts with named graphs in some non-obvious ways.

As we can see, construct queries allow the transformation of RDF graphs found in the Triplestore into other RDF graphs. Using this, we can extract interesting sub-graphs from the dataset and visualize them with both Stardog Studio and GraphDB supporting this functionality. This allows the reconstruction of genealogical lines and visual exploration of the data that is latent in the dataset through a Triplestore’s reasoning capabilities.