a Web-based decentralization ecosystem
Users are in full control of their own data
Built on Web standards
→ How to query efficiently?
Personal data pods
Full control of where your pod is stored and who can access it
Pods can store any kind of data
Personal data, photos, friends, ...
Data becomes decoupled from apps
Today: data and app are tightly coupled
No choice over where and how data is stored, and who can access it
Solid: data and app are decoupled
Apps require read/write permissions from the user
A paradigm shift in app design
Storage of data is decentralized
Data is stored in the user's pod instead of in the app
Combining multiple data pods
Apps become views over one or more data pods
Explicit access control
Apps can only view or modify (parts of) your data after explicit approval
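As a rough illustration of explicit approval, the sketch below checks an app's request against a table of user-approved grants. The app URLs, the `GRANTS` table, and `is_allowed` are hypothetical; real Solid servers use their own access control mechanisms, whose data model is not reproduced here.

```python
# Hypothetical grants: (app, resource prefix) -> modes approved by the user.
GRANTS = {
    ("https://photos.example/app", "/photos/"): {"read"},
    ("https://notes.example/app", "/notes/"): {"read", "write"},
}

def is_allowed(app, resource, mode):
    """An app may act only if some grant covers both the resource and the mode."""
    return any(
        granted_app == app and resource.startswith(prefix) and mode in modes
        for (granted_app, prefix), modes in GRANTS.items()
    )

can_read = is_allowed("https://photos.example/app", "/photos/cat.jpg", "read")
can_write = is_allowed("https://photos.example/app", "/photos/cat.jpg", "write")
can_read_notes = is_allowed("https://photos.example/app", "/notes/todo.ttl", "read")
```

Anything that was not granted is denied by default, which matches the idea that apps see only the parts of a pod the user approved.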
Solid spreads data across many sources.
How to build efficient Solid apps?
Queries as abstraction layer
Hide the complexities of reading and writing for app developers
Say what needs to happen, not how
Declarative queries hide complexities of data retrieval
Queries are reusable
Queries are not API-specific
Execution via a generic, reusable query engine
Abstracts away complexities for executing queries
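To make the "what, not how" idea concrete, here is a minimal sketch of a declarative pattern query evaluated by a tiny generic matcher. The triples, the `ex:` names, and the `match` helper are invented for illustration; a real engine evaluates full SPARQL and does far more.

```python
# Toy illustration only: the data, names, and matcher are hypothetical.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
]

def is_var(term):
    return term.startswith("?")

def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for p_term, t_term in zip(first, triple):
            if is_var(p_term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)

# The app states WHAT it wants: the names of people Alice knows.
QUERY = [
    ("ex:alice", "foaf:knows", "?friend"),
    ("?friend", "foaf:name", "?name"),
]

names = [b["?name"] for b in match(QUERY, TRIPLES)]
```

The query itself contains no retrieval logic, so the same `QUERY` could be handed to any engine that understands the pattern syntax.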
Comunica is a meta query engine
Collection of building blocks
Independent modules implement specific functionality
Runs anywhere (Web browser, server-side, ...)
Suitable for both research and production use
Open governance under the Comunica Association
How to query within the Solid ecosystem?
Link Traversal-based Query Processing
Query engine follows links based on the Linked Data principles
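A minimal sketch of link traversal, assuming a hypothetical in-memory "Web" in place of real HTTP requests: the engine starts from seed documents, collects their triples, and queues the links it finds. The URLs, the `WEB` table, and `traverse` are illustrative assumptions.

```python
from collections import deque

# Hypothetical in-memory "Web": URL -> (triples in the document, outgoing links).
WEB = {
    "https://alice.example/profile": (
        [("alice", "knows", "bob")],
        ["https://bob.example/profile"],
    ),
    "https://bob.example/profile": (
        [("bob", "knows", "carol")],
        ["https://carol.example/profile", "https://alice.example/profile"],
    ),
    "https://carol.example/profile": ([("carol", "name", "Carol")], []),
}

def traverse(seeds, fetch, max_documents=100):
    """Breadth-first link traversal: dereference, collect triples, queue links."""
    queue, seen, triples = deque(seeds), set(seeds), []
    fetched = 0
    while queue and fetched < max_documents:
        url = queue.popleft()
        document = fetch(url)
        fetched += 1
        if document is None:  # unreachable or unparsable document
            continue
        doc_triples, links = document
        triples.extend(doc_triples)
        for link in links:
            if link not in seen:  # never fetch the same document twice
                seen.add(link)
                queue.append(link)
    return triples

found = traverse(["https://alice.example/profile"], WEB.get)
from_carol = traverse(["https://carol.example/profile"], WEB.get)
```

Starting from Alice's profile reaches all three documents, while starting from Carol's reaches only one, so the choice of seed changes what the query can see.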
Traditional link traversal has problems
Many documents on the Web with many links to follow
Worst case: download the full Web for each query execution
Following links is expensive: each requires setting up a TCP/HTTP connection
No indexing and query planning
Depending on seed documents, different results may be obtained
Link traversal works quite well for Solid
Exploit structural properties of Solid pods
Pods use Linked Data Platform
Guaranteed to find all files in a user's pod
Make use of document-based indexes
Solid type index: links to all resources of a certain class
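The two structural properties above can be sketched as follows, assuming a hypothetical pod layout held in memory. In a real pod, container membership would come from `ldp:contains` triples and the class registrations from the type index document; the paths, `CONTAINERS`, and `TYPE_INDEX` here are made up.

```python
# Hypothetical pod layout: container URL -> members (ldp:contains).
CONTAINERS = {
    "/": ["/profile/", "/photos/", "/settings/"],
    "/profile/": ["/profile/card"],
    "/photos/": ["/photos/2021/", "/photos/cat.ttl"],
    "/photos/2021/": ["/photos/2021/trip.ttl"],
    "/settings/": ["/settings/publicTypeIndex.ttl"],
}

# Hypothetical type index: class -> locations registered for that class.
TYPE_INDEX = {
    "schema:Photograph": ["/photos/"],
    "vcard:Individual": ["/profile/card"],
}

def list_documents(url):
    """Recursively follow container membership down to plain documents."""
    if url not in CONTAINERS:  # not a container, so a plain document
        return [url]
    documents = []
    for member in CONTAINERS[url]:
        documents.extend(list_documents(member))
    return documents

def documents_for_class(rdf_class):
    """Use the type index to visit only locations registered for a class."""
    documents = []
    for location in TYPE_INDEX.get(rdf_class, []):
        documents.extend(list_documents(location))
    return documents

all_docs = list_documents("/")
photo_docs = documents_for_class("schema:Photograph")
```

Walking from the pod root enumerates every document the engine is allowed to see, while the type index lets a query about photographs skip the profile and settings containers entirely.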
Inefficient query plans
Traditionally: number of links is bottleneck for link traversal
Due to structural properties of Solid pods, this is less of a problem
Query engines must use heuristics for query planning
No statistics available prior to query execution
Hartig, Olaf. "Zero-knowledge query planning for an iterator implementation of link traversal based query execution."
Need for adaptive query planning
Modify query plan during traversal
Discovery of cardinality estimates and indexes
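A rough sketch of the adaptive idea, under simplifying assumptions: the planner orders triple patterns by estimated cardinality, falls back to a variable-count heuristic when nothing is known, and is simply re-run once counts are discovered during traversal. The heuristic is a simple stand-in, not the rules from the cited paper, and the counts are invented.

```python
def count_vars(pattern):
    """Number of variables in a triple pattern."""
    return sum(term.startswith("?") for term in pattern)

def plan(patterns, counts, default=1000):
    """Order patterns by estimated cardinality, then by number of variables.

    Predicates without a discovered count get the same default estimate,
    so with no statistics the order is decided by the heuristic alone.
    """
    return sorted(patterns, key=lambda p: (counts.get(p[1], default), count_vars(p)))

PATTERNS = [
    ("?person", "foaf:knows", "?friend"),
    ("?friend", "foaf:name", "?name"),
    ("?person", "rdf:type", "foaf:Person"),
]

# Before execution: no statistics, so the heuristic decides.
initial_plan = plan(PATTERNS, {})

# During traversal: counts are discovered (hypothetical numbers), so re-plan.
discovered = {"rdf:type": 5000, "foaf:knows": 12, "foaf:name": 40}
revised_plan = plan(PATTERNS, discovered)
```

The heuristic puts the `rdf:type` pattern first because it has the fewest variables, but the discovered counts show it is the least selective, so the revised plan moves it last. A real engine would reorder only the part of the plan that has not yet executed.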
Hybrid query execution
Solid pods are currently document-based
Collection of Linked Data documents
Pods could expose more expressive interfaces
SPARQL endpoints, TPF, SPF, ...
Need for query execution over heterogeneous sources
How to do this in an adaptive manner?
Query engines only discover these interfaces during query execution
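One way to picture hybrid execution is a dispatcher that decides, per source, how much work to delegate once the interface is known. The sketch below assumes the interface type has already been detected at run time; the `Source` class, the interface labels, and the request tuples are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    interface: str  # "document", "tpf", or "sparql" (labels assumed for this sketch)

def plan_access(source, patterns):
    """Return the requests to send, based on the interface found at run time."""
    if source.interface == "sparql":
        # Expressive interface: delegate the whole query in one request.
        return [("query", source.url, tuple(patterns))]
    if source.interface == "tpf":
        # Triple pattern interface: one request per pattern, join locally.
        return [("pattern", source.url, p) for p in patterns]
    # Plain document: download it and evaluate everything locally.
    return [("fetch", source.url, None)]

PATTERNS = [("?p", "foaf:knows", "?f"), ("?f", "foaf:name", "?n")]
SOURCES = [
    Source("https://alice.example/sparql", "sparql"),
    Source("https://bob.example/tpf", "tpf"),
    Source("https://carol.example/profile", "document"),
]

request_counts = [len(plan_access(s, PATTERNS)) for s in SOURCES]
```

Because the interface is only known once a source has been reached, this decision has to be made per source during execution rather than in an up-front plan.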
Exploit structural information
Users can structure their pod in a certain way
Place all photos in directories based on country
Query engines may exploit this information
If pods expose this information
Relevant for query planning
Pruning of documents and prioritization
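As a sketch of pruning, suppose a pod advertises that photos live under a path template such as `/photos/{country}/`. The template syntax, the paths, and the `prune` helper are assumptions made for this example, not an existing Solid feature.

```python
import re

def template_to_regex(template):
    """Turn '/photos/{country}/' into a regex with one named group per variable."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        # Odd positions hold variable names, even positions hold literal text.
        pattern += f"(?P<{part}>[^/]+)" if i % 2 else re.escape(part)
    return re.compile("^" + pattern)

def prune(documents, template, **wanted):
    """Keep only documents whose path matches the template with the wanted values."""
    regex = template_to_regex(template)
    kept = []
    for doc in documents:
        m = regex.match(doc)
        if m and all(m.group(key) == value for key, value in wanted.items()):
            kept.append(doc)
    return kept

DOCUMENTS = [
    "/photos/belgium/atomium.ttl",
    "/photos/japan/fuji.ttl",
    "/notes/todo.ttl",
]

belgian = prune(DOCUMENTS, "/photos/{country}/", country="belgium")
```

A query for photos taken in Belgium then needs to fetch one document instead of three; the same match could also be used to prioritize documents rather than discard them.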
Reasoning at query time
Different pods/apps may use different vocabularies
Schema.org, FOAF, Wikidata, ...
Apps issue queries in a single vocabulary
Query engine should perform schema alignment
Reasoning over partial and streaming knowledge
How to do this efficiently?
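A very small sketch of schema alignment by query rewriting: the app asks in one vocabulary, and the engine expands each predicate to the alternatives it treats as equivalent. The `ALIGNMENT` table is a hypothetical mapping chosen for illustration, and this is far simpler than reasoning over partial or streaming knowledge.

```python
# Hypothetical alignment: query predicate -> predicates treated as equivalent.
ALIGNMENT = {
    "schema:name": ["schema:name", "foaf:name"],
    "schema:knows": ["schema:knows", "foaf:knows"],
}

def rewrite(pattern):
    """Expand one triple pattern into its vocabulary-specific alternatives."""
    subject, predicate, obj = pattern
    return [(subject, alt, obj) for alt in ALIGNMENT.get(predicate, [predicate])]

def values(triples, query_predicate):
    """Evaluate a single-predicate query over triples from mixed vocabularies."""
    accepted = {p for (_, p, _) in rewrite(("?s", query_predicate, "?o"))}
    return sorted(o for (_, p, o) in triples if p in accepted)

POD_A = [("ex:alice", "foaf:name", "Alice")]   # this pod uses FOAF
POD_B = [("ex:bob", "schema:name", "Bob")]     # this pod uses Schema.org

names = values(POD_A + POD_B, "schema:name")
```

The app only ever mentions `schema:name`, yet it receives the name stored under `foaf:name` as well.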
Summarization across multiple pods
Data may be aggregated across multiple Solid pods
Usage within family context, work place, ...
Query engines can exploit these summaries
Query planning and source selection
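To illustrate how summaries could help source selection, the sketch below summarizes each pod as a count of triples per predicate and uses those counts to skip irrelevant pods and order the rest. The pods, the summary shape, and `select_sources` are assumptions; a real summary would likely be more compact and respect access control.

```python
# Hypothetical pods and their triples.
PODS = {
    "https://alice.example/": [
        ("a", "foaf:name", "Alice"),
        ("a", "foaf:knows", "b"),
        ("a", "foaf:knows", "c"),
    ],
    "https://bob.example/": [("b", "foaf:name", "Bob")],
    "https://carol.example/": [("c", "schema:image", "c.jpg")],
}

def summarize(triples):
    """Summary of a pod: number of triples per predicate."""
    counts = {}
    for _, predicate, _ in triples:
        counts[predicate] = counts.get(predicate, 0) + 1
    return counts

def select_sources(summaries, predicates):
    """Keep pods that can contribute to at least one pattern, smallest first."""
    relevant = []
    for pod, summary in summaries.items():
        estimate = sum(summary.get(p, 0) for p in predicates)
        if estimate > 0:
            relevant.append((estimate, pod))
    return [pod for _, pod in sorted(relevant)]

SUMMARIES = {pod: summarize(triples) for pod, triples in PODS.items()}
sources = select_sources(SUMMARIES, ["foaf:name", "foaf:knows"])
```

Carol's pod is never contacted because its summary mentions none of the query's predicates, and the per-predicate counts double as cardinality estimates for query planning.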
Link traversal is promising for Solid
Improvements are required to make it usable in practice
Adaptive query planning, hybrid query execution, ...
Investigating different techniques, but more work is needed
Open for collaborations