Knowledge Graphs interlink entities
RDF for describing Knowledge Graphs
-
RDF
Resource Description framework
-
Recommended by W3C since 1999
-
Recommended by W3C since 2014
-
Upcoming in 2026
Facts are represented as RDF triples

Multiple RDF triples form RDF datasets
:Alice :knows :Bob .
:Alice :knows :Carol .
:Alice :name "Alice" .
:Bob :name "Bob" .
:Bob :knows :Carol .
SPARQL querying over RDF datasets
- SPARQL: language to read and update RDF datasets via declarative queries.
-
Different query forms:
SELECT: selecting values in tabular form
CONSTRUCT: construct new triples
- …
SPARQL 1.1: https://www.w3.org/TR/sparql-query/
SPARQL 1.2: Upcoming in 2026
Find all artists born in York
| name | deathDate |
| Albert Joseph Moore | |
| Charles Francis Hansom | 1888 |
| David Reed (comedian) | |
| Dustin Gee | |
| E Ridsdale Tate | 1922 |
Basic Graph Patterns enable graph pattern matching
Steps in SPARQL query processing
-
1. Parsing
Transform a SPARQL query string into an algebra expression
-
2. Optimization
Transform algebra expression into a query plan → Our focus
-
3. Evaluation
Executes query plan to obtain query results
Why is optimization needed?
-
Queries are declarative
They say what needs to be done, not how.
Transformation into executable code can happen in many different ways.
→ different query plans
Example: query optimization for a BGP
How, and in what order should triple patterns be executed?
1. ?person a dbpedia-owl:Artist.
2. ?person rdfs:label ?name.
3. ?person dbpedia-owl:birthPlace ?birthPlace.
Obtain cardinality of each pattern
1. ?person a dbpedia-owl:Artist.
200
2. ?person rdfs:label ?name.
10.000
3. ?person dbpedia-owl:birthPlace ?birthPlace.
1.000
Can be pre-computed, or estimated at runtime.
Select appropriate join algorithms
Usually happens in pairs of two operators (n, m)
-
Nested-loop join
Double for-loop → O(n*m)
Works well for small triple patterns due to the lack of preprocessing.
-
Hash join
Precomputes hash tables for all mappings in one triple patterns.
Iterates over other triple patternss with lookups to hash table → O(n+m)
Not possible with OPTIONAL
-
Sort-merge join
Sorted triple patterns.
Iterates over both triple patterns in order → O(n+m)
Sorting can become expensive with larger triple patterns
Join orders and algorithms influence worst-case complexity
-
All nested-loop join
((1 ⋈ 2) ⋈ 3) = ((200 * 10.000) * 1.000) = 2.000.000.000
-
All hash join
((1 ⋈ 2) ⋈ 3) = ((200 + 10.000) + 1.000) = 11.200
-
Nested-loop join and hash join
((1 ⋈ 3) ⋈ 2) = ((200 * 1.000) + 10.000) = 210.000
→ Query engine aims to identify query plan with lowest cost
Conclusions
-
Basic Graph Patterns (BGPs): foundational building block for SPARQL
They enable graph pattern matching over RDF datasets
-
SPARQL is a declarative query language
Different query plans are possible for the same query
-
Query optimization is crucial for performance
Identifies the query plan with the lowest cost
Competences
-
Initial competences
Understanding Web fundamentals (TCP/IP, DNS, …)
Understanding of algorithms and datastructures (trees, complexities, …)
?Basic understanding of Knowledge Graphs?
-
Final competences
?Arguing the need for Knowledge Graphs?
?Contrasting SPARQL to other query languages?
Understanding how queries are parsed
Understanding how to optimize queries
Critically interpreting existing query plans
Understanding how to evaluate a query plan