Requirements and Challenges for Query Execution across Decentralized Environments

Ruben Taelman

DeSeRe 2024, 13 May 2024

Ghent University – imec – IDLab, Belgium

Decentralized Knowledge Graphs (DKGs)

[Figure: Solid, Mastodon, and Bluesky logos]

Difficulties for app developers

Abstract access to DKGs

Hide the complexities of reading and writing for app developers

[Figure: application, gear, and globe icons]

How can query engines abstract access to DKGs?

Focus on Solid

Personal data pods

Full control of where your pod is stored and who can access it

Solid Pods

Pods can store any kind of data

Personal data, photos, friends, ...

→ Massive decentralization of data across documents and pods

Solid Pods Storage

Requirements

  1. Execution of arbitrary structured queries
  2. Discovery of data within pods
  3. Discovery of data across pods
  4. Handling location heterogeneity
  5. Handling schema heterogeneity
  6. Handling API heterogeneity
  7. Authentication
  8. User-perceived performance

1. Execution of structured queries

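# The snvoc: and pods: prefix declarations are not shown on the slide.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?messageId ?messageCreationDate ?messageContent WHERE {
  ?message snvoc:hasCreator pods:6597069767117/profile/card#me;
    rdf:type snvoc:Post;
    snvoc:content ?messageContent;
    snvoc:creationDate ?messageCreationDate;
    snvoc:id ?messageId.
}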
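
To make the idea of a structured query concrete, the following minimal sketch evaluates a set of triple patterns (like those in the query above) against an in-memory list of triples. It is an illustration only, not how any particular engine is implemented. The `ex:` identifiers, the sample data, and the function names `match_pattern` and `evaluate` are assumptions made for this example.

```python
# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:post1", "snvoc:hasCreator", "ex:alice"),
    ("ex:post1", "rdf:type", "snvoc:Post"),
    ("ex:post1", "snvoc:content", "Hello"),
    ("ex:post2", "snvoc:hasCreator", "ex:bob"),
    ("ex:post2", "rdf:type", "snvoc:Post"),
    ("ex:post2", "snvoc:content", "Hi"),
]

# Triple patterns; terms starting with "?" are variables.
PATTERNS = [
    ("?message", "snvoc:hasCreator", "ex:alice"),
    ("?message", "rdf:type", "snvoc:Post"),
    ("?message", "snvoc:content", "?messageContent"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def evaluate(patterns, triples):
    """Join all patterns with a simple nested-loop strategy."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings
```

Calling `evaluate(PATTERNS, TRIPLES)` yields one solution in which `?message` is bound to `ex:post1`. In a decentralized setting the hard part is not this join but obtaining the triples, which the remaining requirements address.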
      

2. Discovery of data within pods

Solid Pods Storage
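
As a rough sketch of discovery within a pod, the code below walks a container hierarchy breadth-first, starting from the pod root, and collects every document it finds. It assumes containers list their members (as `ldp:contains` does in Solid) and that container URLs end with a slash. The `pod.example` URLs and the in-memory `POD` dictionary are hypothetical stand-ins for HTTP requests.

```python
from collections import deque

# Hypothetical pod layout: container URL -> members it lists.
# A real engine would fetch each container over HTTP instead.
POD = {
    "https://pod.example/alice/": [
        "https://pod.example/alice/profile/",
        "https://pod.example/alice/posts/",
    ],
    "https://pod.example/alice/profile/": [
        "https://pod.example/alice/profile/card",
    ],
    "https://pod.example/alice/posts/": [
        "https://pod.example/alice/posts/2024-05-13",
    ],
}


def discover_documents(root):
    """Return all non-container documents reachable from the root container."""
    documents = []
    queue = deque([root])
    seen = {root}
    while queue:
        url = queue.popleft()
        for member in POD.get(url, []):
            if member in seen:
                continue
            seen.add(member)
            if member.endswith("/"):
                # Nested container: explore its members later.
                queue.append(member)
            else:
                documents.append(member)
    return documents
```

This exhaustive walk finds everything but may fetch many irrelevant documents; indexes that a pod exposes, where available, could narrow the search.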

3. Discovery of data across pods

Follow links
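
Following links across pods can be sketched as a bounded traversal: start from a seed document, collect its triples, and enqueue every document that those triples point to. The example below is a simplified illustration under stated assumptions: the `alice.example` and `bob.example` pods, the in-memory `WEB` dictionary (standing in for HTTP dereferencing), and the function names are all hypothetical.

```python
from collections import deque
from urllib.parse import urldefrag

# Hypothetical documents in two pods: document URL -> triples it contains.
WEB = {
    "https://alice.example/profile/card": [
        ("https://alice.example/profile/card#me", "foaf:name", "Alice"),
        ("https://alice.example/profile/card#me", "foaf:knows",
         "https://bob.example/profile/card#me"),
    ],
    "https://bob.example/profile/card": [
        ("https://bob.example/profile/card#me", "foaf:name", "Bob"),
        ("https://bob.example/profile/card#me", "foaf:knows",
         "https://alice.example/profile/card#me"),
    ],
}


def traverse(seed, max_documents=10):
    """Collect triples from documents reachable from the seed via links."""
    triples = []
    queue = deque([urldefrag(seed).url])  # strip "#me" to get the document URL
    visited = set()
    while queue and len(visited) < max_documents:
        doc = queue.popleft()
        if doc in visited or doc not in WEB:
            continue
        visited.add(doc)
        for s, p, o in WEB[doc]:
            triples.append((s, p, o))
            # Follow every IRI in subject or object position.
            for term in (s, o):
                if term.startswith("https://"):
                    queue.append(urldefrag(term).url)
    return triples


def names(triples):
    """Extract all foaf:name values from the collected triples."""
    return sorted(o for _, p, o in triples if p == "foaf:name")
```

Starting from Alice's profile, the traversal also reaches Bob's pod through the `foaf:knows` link. The `max_documents` bound is one simple way to keep traversal from running indefinitely on an open Web of documents.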

4. Handling location heterogeneity

Pod structure

5. Handling schema heterogeneity

Vocabularies
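
One possible way to cope with pods that use different vocabularies is to rewrite a triple pattern into alternatives, one per equivalent predicate. The sketch below illustrates that idea only; the alignment between `foaf:name`, `schema:name`, and `vcard:fn` is an assumption made for this example, and `expand_pattern` is a hypothetical helper.

```python
# Assumed alignment for illustration: groups of predicates treated as equivalent.
EQUIVALENT = [
    {"foaf:name", "schema:name", "vcard:fn"},
]


def expand_pattern(pattern):
    """Rewrite one triple pattern into alternatives across vocabularies."""
    s, p, o = pattern
    for group in EQUIVALENT:
        if p in group:
            # One alternative per equivalent predicate, in a stable order.
            return [(s, q, o) for q in sorted(group)]
    # Predicates without a known alignment are left untouched.
    return [pattern]
```

A query engine could evaluate the alternatives as a union, so that an application written against one vocabulary still finds data expressed in another.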

6. Handling API heterogeneity

[Figure: Linked Data Fragments axis]

7. Authentication

Solid Pods Sharing

8. User-perceived performance

Response time limits

Link Traversal Query Processing for Solid

Taelman, R., Verborgh, R.: Link Traversal Query Processing over Decentralized Environments with Structural Assumptions. In: Proceedings of the 22nd International Semantic Web Conference (2023).

SPARQL link traversal

ESPRESSO: keyword search over pods

Ragab, M., Savateev, Y., et al.: ESPRESSO: A Framework for Empowering Search on Decentralized Web. In: International Conference on Web Information Systems Engineering. pp. 360–375. Springer (2023).

ESPRESSO

POD-QUERY: query agent on pod

Vandenbrande, M., et al.: POD-QUERY: Schema Mapping and Query Rewriting for Solid Pods. In: ISWC2023, the International Semantic Web Conference (2023).

POD-QUERY

No approach meets all requirements

Requirement LTQP Solid ESPRESSO POD-QUERY
Execution of arbitrary structured queries
Discovery of data within pods
Discovery of data across pods ~
Handling location heterogeneity
Handling schema heterogeneity
Handling API heterogeneity
Authentication
User-perceived performance ~

Open challenges for the future

Conclusions