SemwidgQL
SemwidgQL is transcompiled into SPARQL – the standard query language for Linked Data – and can thus be used to query almost any public Linked Data endpoint. The intention behind SemwidgQL is not to replace or extend SPARQL. It is merely intended to simplify requesting instance data for users who are not familiar with the techniques of the Semantic Web.
As a path query language, SemwidgQL traverses the Linked Data graph. The traversal is indicated by dot notation, which is reminiscent of the syntax of object-oriented programming. The image below shows the simplified basic structure of a SemwidgQL query. Usually, a query starts with a resource followed by one or more properties. To further filter the result set, properties can be restricted. Filters are enclosed in parentheses and appended to the property they restrict. The left-hand side of a filter expression is typically a property (or a property path) that refers to the restricted property outside of the parentheses. The right-hand side specifies a filter value that can be a literal, an IRI, or even a nested query. Between them stands a relational operator. Several filter expressions can be combined by logical operators. Furthermore, SemwidgQL allows wildcard selectors, inverse property selections, and multiple property selections.
Additional Links
- Formal Syntax definition (EBNF and Railroad Diagram)
- SemwidgQL query examples and their translation into SPARQL
- Pseudocode of the SemwidgQL-to-SPARQL transcompiler
- ANTLR v3 Grammar of SemwidgQL's syntax
- Download
Getting Started
Basic Features
The basic features of SemwidgQL are introduced below. All examples can be used to query data from DBpedia’s public SPARQL endpoint.
Named Resources: SemwidgQL supports a naming mechanism for resources. Instead of fully qualified resource URIs or URIs prefixed by a namespace, a user can specify a substitute name and use it within all SemwidgQL queries. For our examples we will define a resource named rome with the URI http://dbpedia.org/resource/Rome. The definition is not part of a SemwidgQL query; it belongs to a SemwidgJS configuration or can be passed directly as a parameter to the SemwidgQL library.
rome = http://dbpedia.org/resource/Rome
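As a rough illustration of the naming mechanism, the following Python sketch resolves the first segment of a query against a table of named resources. The table NAMED_RESOURCES and the function resolve_resource are hypothetical names chosen for this example. They are not part of the SemwidgQL library, whose actual lookup may differ.

```python
# Hypothetical name table; in practice the definitions would come from a
# SemwidgJS configuration or a parameter passed to the SemwidgQL library.
NAMED_RESOURCES = {"rome": "http://dbpedia.org/resource/Rome"}


def resolve_resource(token, named_resources=NAMED_RESOURCES):
    """Return a SPARQL term for the first segment of a SemwidgQL query."""
    if token in named_resources:
        # A substitute name is replaced by its fully qualified IRI.
        return "<" + named_resources[token] + ">"
    if token.startswith(("http://", "https://")):
        # A fully qualified IRI only needs angle brackets.
        return "<" + token + ">"
    # Assumption: anything else is already a prefixed name such as dbpedia:Rome.
    return token


print(resolve_resource("rome"))
```

With this table, rome and http://dbpedia.org/resource/Rome resolve to the same SPARQL term, while a prefixed name such as dbpedia:Rome is passed through unchanged.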
Path Navigation: The following SemwidgQL query requests all labels of the resource Rome. Below the corresponding translation into SPARQL is shown. In order to save space and increase readability, we replaced the fully qualified resource name of Rome by its prefixed name (dbpedia:Rome) in the SPARQL query.
rome.rdfs:label
SELECT DISTINCT (dbpedia:Rome AS ?uri) ?label
WHERE {
dbpedia:Rome rdfs:label ?label .
}
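To try the translated query against DBpedia, it can be sent as the query parameter of an HTTP GET request, as defined by the SPARQL 1.1 Protocol. The Python sketch below only builds the request URL and does not send it. The function name build_request_url is hypothetical, and the explicit PREFIX declarations are added on the assumption that the endpoint does not predefine these prefixes.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # DBpedia's public SPARQL endpoint

# Prefix declarations needed by the query below (added for this sketch).
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX dbpedia: <http://dbpedia.org/resource/>\n"
)

# The SPARQL translation of the SemwidgQL query rome.rdfs:label.
QUERY = (
    "SELECT DISTINCT (dbpedia:Rome AS ?uri) ?label\n"
    "WHERE {\n"
    "  dbpedia:Rome rdfs:label ?label .\n"
    "}\n"
)


def build_request_url(query, endpoint=ENDPOINT):
    """Build a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": PREFIXES + query})


url = build_request_url(QUERY)
print(url)
```

The resulting URL can be opened with any HTTP client. The result format is usually negotiated through the Accept header, for example application/sparql-results+json.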
Inverse Property Selection: Linked Data is represented by a directed graph. Therefore, it is possible that a resource is not aware of all connections that exist between itself and other resources. In order to obtain properties of resources that are linked unidirectionally to the initial resource, it is possible to invert the property selection by a prepended caret symbol (^). For example, the DBpedia database does not contain any links between a city and the people who were born in it. But it contains connections between people and their places of birth. To request a list of people who were born in Rome, the property selection must be inverted. The following query requests the labels of all individuals whose birthplace is Rome.
rome.^dbpedia-owl:birthPlace.rdfs:label
SELECT DISTINCT (dbpedia:Rome AS ?uri) ?birthPlace ?label
WHERE {
?birthPlace dbpedia-owl:birthPlace dbpedia:Rome .
?birthPlace rdfs:label ?label .
}
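The two translations shown so far follow a simple pattern that can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual SemwidgQL-to-SPARQL transcompiler: the functions local_name and translate_path are hypothetical, the path is passed as a list of steps rather than parsed from a query string, filters are not supported, and variable name collisions are not handled.

```python
def local_name(prop):
    """Derive a variable name from a prefixed property (rdfs:label -> label)."""
    return prop.split(":", 1)[-1]


def translate_path(subject, steps):
    """Translate a list of property steps into a SPARQL SELECT query.

    A step starting with '^' is an inverse selection: the new variable
    becomes the subject of the triple pattern instead of its object.
    """
    current = subject
    variables = []
    patterns = []
    for step in steps:
        inverse = step.startswith("^")
        prop = step[1:] if inverse else step
        var = "?" + local_name(prop)
        if inverse:
            patterns.append(f"{var} {prop} {current} .")
        else:
            patterns.append(f"{current} {prop} {var} .")
        variables.append(var)
        current = var  # the next step continues from the new variable
    select = f"SELECT DISTINCT ({subject} AS ?uri) " + " ".join(variables)
    return select + "\nWHERE {\n" + "\n".join(patterns) + "\n}"


print(translate_path("dbpedia:Rome", ["^dbpedia-owl:birthPlace", "rdfs:label"]))
```

For the steps ^dbpedia-owl:birthPlace and rdfs:label, the sketch produces the same triple patterns as the translation above. The only difference between a forward and an inverse step is the position of the new variable in the triple pattern.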
Property Filter: Object properties can be filtered by values of linked properties. The left-hand side of a filter expression specifies a property that refers to the previously defined object property or resource in the query path. The rest corresponds to the usual SPARQL syntax for filter expressions. As a multilingual database, DBpedia contains the name of Rome in different languages (Rome, Roma, Rom, 罗马). Consequently, one of the most common actions when querying multilingual texts for display purposes is filtering for a specific language. Therefore, we integrated a simple language filter directly into our query language. The language filter is introduced by the keyword @lang. The following query contains two filters and requests the English labels of all soccer players who were born in Rome.
rome.^dbpedia-owl:birthPlace(rdf:type = dbpedia-owl:SoccerPlayer).rdfs:label(@lang = 'en')
SELECT DISTINCT (dbpedia:Rome AS ?uri) ?birthPlace ?label
WHERE {
?birthPlace dbpedia-owl:birthPlace dbpedia:Rome .
?birthPlace rdf:type ?filtertype .
FILTER (?filtertype = dbpedia-owl:SoccerPlayer) .
?birthPlace rdfs:label ?label .
FILTER (lang(?label) = '' || langMatches(lang(?label), 'en')) .
}
Multiple Property Selection: In some cases, certain properties belong together, such as the latitude and longitude of a place. Hence, it is possible to select them simultaneously in a single query. Selection of multiple properties is only allowed as the last part of a SemwidgQL query.
rome.[geo:lat, geo:long]
SELECT DISTINCT (dbpedia:Rome AS ?uri) ?lat ?long
WHERE {
dbpedia:Rome geo:lat ?lat ;
geo:long ?long .
}
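The shape of this translation can be sketched in a few lines of Python. The function select_properties is a hypothetical helper written for this illustration and is not taken from the SemwidgQL implementation. It assumes that each variable is named after the local part of its property.

```python
def select_properties(subject, properties):
    """Build a SPARQL query that selects several properties of one subject."""
    # Name each variable after the local part of the property (geo:lat -> ?lat).
    variables = ["?" + prop.split(":", 1)[-1] for prop in properties]
    pairs = [f"{prop} {var}" for prop, var in zip(properties, variables)]
    # All predicate-object pairs share one subject and are joined by ';'.
    body = f"{subject} " + " ;\n".join(pairs) + " ."
    select = f"SELECT DISTINCT ({subject} AS ?uri) " + " ".join(variables)
    return select + "\nWHERE {\n" + body + "\n}"


print(select_properties("dbpedia:Rome", ["geo:lat", "geo:long"]))
```

Because all selected properties share the same subject, they end up in a single SPARQL predicate-object list, so each result row contains one value for every selected property.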
Advanced Features
Wildcards: Resources and properties can be replaced by a wildcard (*). A wildcard request is often combined with a type filter. The following query requests the resource URIs of all cities.
*(rdf:type = dbpedia-owl:City)
SELECT DISTINCT ?wildcard
WHERE {
?wildcard rdf:type ?type .
FILTER (?type = dbpedia-owl:City) .
}
Nested Queries: The right-hand side value of a conditional expression inside a filter expression can be represented by an independent nested SemwidgQL query. The following query requests the resource URIs of all cities that have a higher population than Rome.
*(@type = dbpedia-owl:City && dbpedia-owl:populationTotal > {rome.dbpedia-owl:populationTotal})
SELECT DISTINCT ?wildcard
WHERE {
?wildcard rdf:type ?type .
?wildcard dbpedia-owl:populationTotal ?populationTotal_wildcard .
dbpedia:Rome dbpedia-owl:populationTotal ?populationTotal_Rome .
FILTER (?type = dbpedia-owl:City && ?populationTotal_wildcard > ?populationTotal_Rome) .
}
Filters and Pseudo-Filters: Several filter and pseudo-filter keywords exist which, among other things, simplify restricting the language of string literals or allow aggregation of results. They also facilitate querying of time-sequential data with flexibly specified sampling intervals. Filter and pseudo-filter keyword expressions can be combined with normal SemwidgQL filter expressions and with each other. While filter expressions in SemwidgQL result in filter expressions in SPARQL, pseudo-filter expressions can have an impact on different parts of the translated query. An overview of these expressions is given below.
Filter Expressions:
- @lang: With this keyword, the values of the property can be filtered by the given language code.
- @self: This keyword refers to the restricted property itself. Instead of filtering a property that is related to the restricted property, the restricted property can be filtered directly.
- @timestart / @timeend: These keywords allow the filtering of values after, before, or (when combined) between two points in time. The right-hand side of the expression can be an absolute date or a point in time relative to the time of query execution. The expression is parsed as an equation whose first part is a timestamp or the term now, followed by the amount of time to be added or subtracted. This amount can be expressed in seconds, minutes, hours, days, weeks, or a combination of these (e.g. now - 1h 5m).
- @type: This keyword is equivalent to the property rdf:type.
Pseudo-Filter Expressions:
- @aggregate: This keyword applies an aggregate function to the variable of the property within the SELECT statement. Allowed values are COUNT, SUM, MIN, MAX, AVG, and SAMPLE.
- @hide: If set to true, the variable of the property will not be part of the SELECT statement.
- @optional: If set to true, the triple pattern generated for the property will be enclosed in an OPTIONAL statement.
- @predicate: Typically, the predicate of a triple pattern is not part of the SELECT statement. If set to true, the predicate of the triple pattern generated for the property will be added to the SELECT statement.
- @timeinterval: This keyword is used to group and aggregate time-sequential values. On the right-hand side of the expression, a sampling interval can be defined. All returned values within this interval will be aggregated. By default, the SAMPLE aggregate function is applied to all variables, but different functions can be specified with the @aggregate keyword. Similar to @timestart and @timeend, the length of the interval can be expressed in seconds, minutes, hours, days, weeks, or a combination of these.
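The time expressions accepted by @timestart, @timeend, and @timeinterval can be illustrated with a small Python sketch. It is not the parser used by SemwidgQL. Only the unit letters h and m appear in the example above, so the letters s, d, and w are assumptions, as are the ISO 8601 format for absolute timestamps, the required spaces around the operator, and the function names parse_offset and resolve_time.

```python
import re
from datetime import datetime, timedelta

# Assumed unit letters; only 'h' and 'm' are confirmed by the example above.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_offset(text):
    """Parse a duration such as '1h 5m' into a timedelta."""
    total = timedelta()
    for amount, unit in re.findall(r"(\d+)\s*([smhdw])", text):
        total += timedelta(**{UNITS[unit]: int(amount)})
    return total


def resolve_time(expression, now):
    """Resolve 'now - 1h 5m' or '<ISO timestamp> + 2d' to a datetime."""
    match = re.fullmatch(r"(\S+)(?:\s+([+-])\s+(.+))?", expression.strip())
    if match is None:
        raise ValueError(f"cannot parse time expression: {expression!r}")
    base_text, sign, offset_text = match.groups()
    # The first part is either the term 'now' or an absolute timestamp.
    base = now if base_text == "now" else datetime.fromisoformat(base_text)
    if sign is None:
        return base
    offset = parse_offset(offset_text)
    return base + offset if sign == "+" else base - offset


reference = datetime(2014, 1, 1, 12, 0, 0)
print(resolve_time("now - 1h 5m", reference))  # 2014-01-01 10:55:00
```

Passing the reference time explicitly keeps the relative form reproducible. The parse_offset helper could equally be applied to the interval length given to @timeinterval.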
Further Query Examples
We provide further SemwidgQL query examples and their translation into SPARQL on a separate page.