An Introduction to Querying over Knowledge Graphs on the Web

Ruben Taelman

Data & Software Engineering, Universitas Gadjah Mada, Indonesia, 27 August 2025 (announcement)

An Introduction to Querying
over Knowledge Graphs
on the Web

Ghent University – imec – IDLab, Belgium

Storing data as Knowledge Graphs

1989, CERN Switzerland

Tim Berners-Lee
inventor of the World Wide Web

The Web's foundational ideas

1990-... World-wide adoption

Not just for researchers anymore

The Web is a global information space

a.k.a. The World Wide Web (WWW)

News
Social
Media

Web is focused on humans

Achieving tasks requires manual effort

What if intelligent agents
could do these tasks for us?

I Have a Dream

A Web where intelligent
agents are able to achieve
our day-to-day tasks.
2001

The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.

The dream: Getting there step by step

2015: Agents are being introduced that can perform simple tasks

Google Assistant Alexa Siri
Google Assistant, Alexa, Siri, ...

How do these agents work?

→ Execute Queries over Knowledge Graphs

Query = Structured Question
Knowledge Graph = Structured information

→ Structured questions over structured information

A Knowledge Graph is
a collection of structured information

Semantic Web technologies

Generative AI models

2022: LLMs offer human-like conversations based on crawled Web data

Differences to Knowledge Graphs:

ChatGPT Copilot DeepSeek

RDF enables data exchange on the Web

Facts are represented as RDF triples

RDF triple
RDF triple

Multiple RDF triples form RDF datasets

:Alice :knows :Bob .
:Alice :knows :Carol .
:Alice :name "Alice" .
:Bob :name "Bob" .
:Bob :knows :Carol .

Multiple RDF triples form RDF datasets

RDF Resources are identified by IRIs

https://example.org/Alice

Multiple syntaxes exist for RDF

Turtle

PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

dbr:Alice foaf:knows dbr:Bob .
dbr:Alice foaf:name "Alice" .

JSON-LD

{
    "@context": {
        "dbr": "http://dbpedia.org/resource/",
        "foaf": "http://xmlns.com/foaf/0.1/"
    },
    "@id": "dbr:Alice",
    "foaf:knows": "dbr:Bob",
    "foaf:name": "Alice"
}

SPARQL querying over RDF datasets

Specification: https://www.w3.org/TR/sparql-query/

Find all artists born in York

namedeathDate
Albert Joseph Moore
Charles Francis Hansom1888
David Reed (comedian)
Dustin Gee
E Ridsdale Tate1922

How do query engines process a query?

RDF dataset + SPARQL query

...

query results

How do query engines process a query?

RDF dataset + SPARQL query

SPARQL query processing

query results

Basic Graph Patterns enable graph pattern matching

Steps in SPARQL query processing

Publishing Knowledge Graphs as SPARQL Endpoints

SPARQL endpoint: API that accepts SPARQL queries, and replies with results.

SPARQL processing over centralized data

Centralization not always possible

How to query over decentralized data?

Approaches for querying over decentralized data

Client distributes query over query APIs

Federation over SPARQL endpoints

SELECT ?drug ?title WHERE {
  ?drug db:drugCategory dbc:micronutrient.
  ?drug db:casRegistryNumber ?id.
  ?keggDrug rdf:type kegg :Drug.
  ?keggDrug bio2rdf:xRef ?id.
  ?keggDrug purl:title ?title.
}


SELECT ?drug ?title WHERE {
  SERVICE <http://example.com/drb> {
    ?drug db:drugCategory dbc:micronutrient.
    ?drug db:casRegistryNumber ?id.
  }
  SERVICE <http://example.com/kegg> {
    ?keggDrug rdf:type kegg :Drug.
    ?keggDrug bio2rdf:xRef ?id.
    ?keggDrug purl:title ?title.
  }
}

Federation over heterogeneous sources

Limitations of federated querying

Example: decentralized address book

Example: Find Alice's contact names

SELECT ?name WHERE {
    <https://alice.pods.org/profile#me>
        foaf:knows ?person.
    ?person foaf:name ?name.
}

Query process:

  1. Start from Alice's address book
  2. Follow links to profiles of Bob and Carol
  3. Query over union of all profiles
  4. Find query results: [ { "name": "Bob" }, { "name": "Carol" } ]

Notable SPARQL engines

Popular SPARQL endpoints

Example Federated queries

Requires prior knowledge of sources.

Example Link Traversal queries

Discover sources on the fly.

Conclusions