Link Traversal-based Query Processing

Ruben Taelman

Web Fundamentals course 2023, 4 May 2023

Link Traversal-based
Query Processing

Query execution by following links between documents

Ghent University – imec – IDLab, Belgium

SPARQL processing over centralized data

How to query over decentralized data?

Example: decentralized address book

Example: Find Alice's contact names

        foaf:knows ?person.
    ?person foaf:name ?name.

Query process:

  1. Start from Alice's profile
  2. Follow links to profiles of Bob and Carol
  3. Query over union of all profiles
  4. Find query results: [ { "name": "Bob" }, { "name": "Carol" } ]

Link Traversal has practical problems

But solutions exist!

Process query during link queue handling

→ Cope with infinitely growing link queues

Pre-filter links before they enter queue

→ Cope with large number of links

Different filtering strategies exist

Follow if triple matches query: example

    <> foaf:knows ?person.
    ?person foaf:name ?name.

Contents of
✅ </profile#me> :knows <>.
✅ </profile#me> :knows <>.
❌ </profile#me> :likes <>.

Filtering links and query semantics

Zero-knowledge query planning

→ Cope with existing query planning algorithms not working

Score-based heuristics for determining triple pattern order in BGPs

Slow query execution

Query never terminates

Potential security issues