SHACL: an introduction

The SHapes Constraint Language (SHACL) is a language for defining constraints over RDF graphs. You can write things like "every person needs to know someone", and a given RDF graph either conforms to that constraint or does not. Such a SHACL constraint is referred to as a shape. In practice you often write a set of these shapes, which together define your expectations of some RDF graph. This set of shapes is referred to as a SHACL shapes graph. The name contains "graph" because the shapes are themselves written as an RDF graph. To distinguish a SHACL shapes graph from the data it validates (which is of course also RDF), we refer to the data as the data graph.

The above description of SHACL immediately brings us to its main function: checking conformance. Given a shapes graph and a data graph, we want to know whether the data graph adheres to all shapes defined in the shapes graph. This is clearly a useful task in practice, as you want to know whether your data is in the right format before you process it.

It is useful to consider the rough anatomy of a shape. It consists of three main components: a name, a constraint, and a target declaration.

(I deviate slightly from the specification, which defines a shape more broadly. For the purpose of this exposition, however, I like my definition better.) Consider the following example:

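@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .
@prefix : <http://example.org/> .   # placeholder namespace for the example

:SocialShape a sh:PropertyShape ;
    sh:path schema:knows ;
    sh:minCount 1 .

:SocialShape sh:targetClass schema:Person .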

Here, we have one shape called :SocialShape, which states that its focus nodes in the data graph need to have at least one outgoing schema:knows edge. Informally, a "focus node" is just a node that is checked against this shape. The focus nodes here are given by the target declaration on the last line. This targeting statement selects all nodes that are of rdf:type schema:Person. So every node in the data graph that is declared to be of type schema:Person is considered a focus node for :SocialShape.
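To make this concrete, here is a small, made-up data graph that :SocialShape could be checked against. This is only a sketch: the nodes :alice, :bob and :acme and the example.org namespace are my own assumptions for illustration, and the schema: prefix is assumed to be bound to the same IRI as in the shapes graph.

```turtle
@prefix schema: <http://schema.org/> .
@prefix : <http://example.org/> .   # placeholder namespace, an assumption

# :alice is typed schema:Person, so she is a focus node.
# She has one schema:knows edge and therefore satisfies :SocialShape.
:alice a schema:Person ;
    schema:knows :bob .

# :bob is also a focus node, but he has no schema:knows edge,
# so he does not satisfy :SocialShape.
:bob a schema:Person .

# :acme is not typed schema:Person, so it is not a focus node
# and :SocialShape says nothing about it.
:acme a schema:Organization .
```

Because :bob fails the shape, this data graph as a whole does not conform to the shapes graph. Adding a triple such as :bob schema:knows :alice would make it conform.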

Conformance captures the spirit of SHACL. However, when your graph does not conform to your constraints, you want to know where the problems occur. A plain yes-or-no conformance answer is not good enough for that. We need a report stating which nodes fail to satisfy which shapes in the shapes graph. The task of generating such a report is called validation.
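As a rough illustration of what such a report can look like, suppose a hypothetical data graph contains a node :bob that is typed schema:Person but has no schema:knows edge. A validation report for :SocialShape might then look roughly like the sketch below. The node :bob and the example.org namespace are assumptions of mine, and real SHACL engines typically add further details (such as a human-readable message) that I leave out here.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .
@prefix : <http://example.org/> .   # placeholder namespace, an assumption

[ a sh:ValidationReport ;
    # The overall yes/no answer: this is the conformance check.
    sh:conforms false ;
    # One result per problem found: this is what validation adds.
    sh:result [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Violation ;
        sh:focusNode :bob ;                 # the node that failed
        sh:resultPath schema:knows ;        # the path that was checked
        sh:sourceShape :SocialShape ;       # the shape that was violated
        sh:sourceConstraintComponent sh:MinCountConstraintComponent
    ]
] .
```

Note that the report is itself an RDF graph, just like the shapes graph and the data graph. The sh:conforms value is exactly the answer to the conformance question, while the sh:result entries tell you where things went wrong.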

To be continued… I initially wrote this post just to give people some idea of what SHACL is, as the specification is too cryptic to be useful as a short introduction.