Efficient Knowledge Graph Construction Based on Optimized Plans

Freund M, Schmid SJ, Harth A (2025)


Publication Type: Book chapter / Article in edited volumes

Publication year: 2025

Publisher: IOS Press Ebooks

Edited Volumes: Studies on the Semantic Web Volume 62: Linking Meaning: Semantic Technologies Shaping the Future of AI

Series: Studies on the Semantic Web

ISBN: 9781643686165

DOI: 10.3233/SSW250005

Abstract

Purpose:

Existing approaches for generating Knowledge Graphs (KGs) from file-based, non-RDF data using declarative mappings either rely on engines tied to a specific mapping language or use language-independent relational algebra backends that lack optimization, resulting in suboptimal performance. This research proposes an integrated framework that tightly couples logical and physical plan optimizations, enabling high-performance, language-agnostic RDF graph construction.


Methodology:

We formalize the KG construction process using relational algebra with a dedicated RDF term generation function within the projection operator, resulting in one of two canonicalized logical plans: one with a join and one without. We then introduce tightly coupled physical operators used to define concrete execution pipelines. We propose and evaluate two optimizations: logical-level constant folding to reduce redundant computations, and a physical-level heuristic scheduling strategy to optimize concurrent execution. We implemented the optimizations in a new backend engine called konverter and benchmarked the engine with an RML frontend against two comparable engines, Morph-KGC and FlexRML.
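To illustrate the general idea behind logical-level constant folding, the following minimal Python sketch merges the constant parts of a term generation expression once, at plan time, so they are not recombined for every input row. It is not taken from konverter: the `Const`, `Col`, and `Concat` node types, the `fold`, `evaluate`, and `project` functions, and the example IRIs are all hypothetical names chosen for this sketch, and the actual plan representation in the chapter may differ.

```python
from dataclasses import dataclass


# Hypothetical expression nodes for RDF term generation (illustrative only).
@dataclass(frozen=True)
class Const:
    value: str


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Concat:
    parts: tuple


def fold(expr):
    """Merge adjacent constants inside a Concat; runs once per plan."""
    if not isinstance(expr, Concat):
        return expr
    folded = []
    for part in map(fold, expr.parts):
        # Flatten nested concatenations so neighbouring constants can meet.
        items = part.parts if isinstance(part, Concat) else (part,)
        for item in items:
            if isinstance(item, Const) and folded and isinstance(folded[-1], Const):
                folded[-1] = Const(folded[-1].value + item.value)
            else:
                folded.append(item)
    # A concatenation with a single remaining part is just that part.
    return folded[0] if len(folded) == 1 else Concat(tuple(folded))


def evaluate(expr, row):
    """Evaluate a term generation expression against one input row."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Col):
        return row[expr.name]
    return "".join(evaluate(p, row) for p in expr.parts)


def project(rows, subject, predicate, obj):
    """Projection with term generation: fold once, then evaluate per row."""
    s, p, o = fold(subject), fold(predicate), fold(obj)
    return [(evaluate(s, r), evaluate(p, r), evaluate(o, r)) for r in rows]
```

In this sketch, a subject such as `Concat((Const("http://example.org/"), Const("person/"), Col("id")))` is reduced to a single constant prefix followed by the column reference, and a predicate built only from constants collapses to one `Const`, so the per-row work is limited to the parts that actually depend on the data.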


Findings:

Empirical results show that constant folding improves performance by approximately 7.4% and heuristic scheduling by approximately 14.7% compared to a worst-case scenario, with minimal additional memory overhead. Overall, konverter outperforms FlexRML, the current state of the art in performance, reducing execution time by 61.5% and peak memory usage by 25.1%. It currently supports only CSV files, a limitation we aim to address in future work.


Value:

The proposed framework and optimizations provide a formal and practically validated approach to optimizing the execution of declarative mappings for KG construction. The konverter engine demonstrates the potential for building high-performance, language-agnostic engines for enterprise KG construction.

How to cite

APA:

Freund, M., Schmid, S.J., & Harth, A. (2025). Efficient Knowledge Graph Construction Based on Optimized Plans. In Studies on the Semantic Web Volume 62: Linking Meaning: Semantic Technologies Shaping the Future of AI. IOS Press Ebooks.

MLA:

Freund, Michael, Sebastian Josef Schmid, and Andreas Harth. "Efficient Knowledge Graph Construction Based on Optimized Plans." Studies on the Semantic Web Volume 62: Linking Meaning: Semantic Technologies Shaping the Future of AI. IOS Press Ebooks, 2025.
