Workshop

Towards Hybrid Link Traversal: Challenges and Research Directions for Heterogeneous Dataspaces

In Proceedings of the 4th International Workshop on Semantics in Dataspaces (2026)

Decentralized dataspaces preserve data sovereignty by keeping data at its source, but querying thousands of autonomous sources with heterogeneous interfaces introduces significant challenges. Traditional Federated Query Processing (FQP) engines do not scale to this number of sources, while existing Link Traversal-based Query Processing (LTQP) systems can only handle Linked Data documents. By ignoring the query capabilities of alternative data interfaces, current engines miss the performance gains that more expressive interfaces offer. To address this, we advocate for Hybrid FQP-LTQP, an execution strategy that combines the dynamic runtime discovery of link traversal with the performance benefits of delegating complex sub-queries to capable server-side interfaces. This paper reviews the state of the art in hybrid traversal and identifies the critical challenges hindering its implementation: establishing exclusive groups without prior knowledge, integrating dynamically identified sub-queries into query plans, enabling reliable discovery of data access interfaces, and deduplicating results obtained from these interfaces. Solving these challenges is a prerequisite for making hybrid decentralized querying practically viable across heterogeneous dataspaces. Consequently, future research must prioritize these open problems to engineer the next generation of scalable, hybrid FQP-LTQP engines.
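
The hybrid strategy described in the abstract can be illustrated with a toy, in-memory sketch in Python. It is not the paper's engine: the `Source` class, the `can_answer_patterns` capability flag, the example URLs, and the single-triple-pattern query are all assumptions made for illustration, and both the "delegation" and the "document download" are simulated locally. The sketch only shows the control flow: sources are discovered at runtime by following links, a capable interface evaluates the pattern itself, plain documents are fetched and filtered by the client, and results are deduplicated.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical stand-in for one autonomous data source."""
    triples: list
    links: list = field(default_factory=list)
    # Assumed flag: True if the source exposes an interface that can
    # evaluate a triple pattern server-side (e.g. a query endpoint).
    can_answer_patterns: bool = False


def matches(pattern, triple):
    """A pattern term of None acts as a variable and matches anything."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def hybrid_query(sources, seed, pattern):
    """Traverse links from `seed`, delegating where the source is capable."""
    queue, seen, results = deque([seed]), set(), set()
    stats = {"delegated": 0, "downloaded": 0}
    while queue:
        url = queue.popleft()
        if url in seen or url not in sources:
            continue
        seen.add(url)
        src = sources[url]
        if src.can_answer_patterns:
            # Simulated delegation: only matching triples leave the source.
            found = [t for t in src.triples if matches(pattern, t)]
            stats["delegated"] += 1
        else:
            # Simulated document fetch: all triples are transferred,
            # then filtered on the client.
            document = list(src.triples)
            found = [t for t in document if matches(pattern, t)]
            stats["downloaded"] += 1
        results.update(found)  # the set removes duplicates across sources
        queue.extend(src.links)  # sources are discovered during execution
    return results, stats


# Made-up example data; the same triple appears in two sources.
SOURCES = {
    "https://a.example/doc": Source(
        triples=[("alice", "knows", "bob")],
        links=["https://b.example/sparql", "https://c.example/doc"],
    ),
    "https://b.example/sparql": Source(
        triples=[("bob", "knows", "carol"), ("alice", "knows", "bob"), ("bob", "age", "42")],
        links=["https://c.example/doc"],
        can_answer_patterns=True,
    ),
    "https://c.example/doc": Source(
        triples=[("carol", "knows", "dave")],
        links=["https://a.example/doc"],
    ),
}

results, stats = hybrid_query(SOURCES, "https://a.example/doc", (None, "knows", None))
```

Using a set for deduplication is the simplest possible choice and assumes that identical triples from different interfaces are directly comparable; the abstract lists deduplication, interface discovery, and query-plan integration as open challenges, which this sketch does not attempt to solve.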