Haller D, Lenz R (2020)
Publication Language: English
Publication Type: Conference contribution, Original article
Publication year: 2020
Publisher: Springer
City/Town: Cham
Pages Range: 112--124
Conference Proceedings Title: Machine Learning and Knowledge Discovery in Databases: International Workshops of ECML PKDD 2019, Würzburg, Germany, September 16--20, 2019, Proceedings, Part II
ISBN: 978-3-030-43887-6
URI: https://link.springer.com/chapter/10.1007/978-3-030-43887-6_10
DOI: 10.1007/978-3-030-43887-6_10
The practical advantage of a data lake depends on the semantic understanding of its data. This knowledge is usually not externalized, but present in the minds of the data analysts who have used a great deal of cognitive effort to understand the semantic relationships of the heterogeneous data sources. The SQL queries they have written contain this hidden knowledge and should therefore serve as the foundation for a self-learning system. This paper proposes a methodology for extracting knowledge fragments from SQL queries and representing them in an RDF-based knowledge graph. The feasibility of this approach is demonstrated by a prototype implementation and evaluated using example data. It is shown that a query-driven knowledge graph is an appropriate tool to approximate the semantics of the data contained in a data lake and to incrementally provide interactive feedback to data analysts to help them with the formulation of queries.
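The idea in the abstract, turning SQL queries into knowledge-graph statements, can be illustrated with a minimal sketch. It is not the Pharos implementation. It assumes that one kind of "knowledge fragment" is an equality between two qualified columns in a join condition, and it records that as a triple. The namespace `http://example.org/lake/`, the predicate `joinableWith`, and the function names are hypothetical, chosen only for illustration. The regular expressions cover simple queries only and are no substitute for a real SQL parser.

```python
import re

# Hypothetical namespace and predicate, not taken from the paper.
BASE = "http://example.org/lake/"
JOINABLE = "http://example.org/vocab#joinableWith"

# Keywords that may follow a table name and must not be read as an alias.
_KEYWORDS = ("ON|WHERE|JOIN|INNER|LEFT|RIGHT|FULL|CROSS|NATURAL|USING|"
             "GROUP|ORDER|HAVING|LIMIT|UNION")

# Matches "FROM table [AS] alias" and "JOIN table [AS] alias".
ALIAS_RE = re.compile(
    r"\b(?:FROM|JOIN)\s+(\w+)(?:\s+(?:AS\s+)?(?!(?:" + _KEYWORDS + r")\b)(\w+))?",
    re.IGNORECASE,
)

# Matches equality predicates between two qualified columns: t1.c1 = t2.c2
EQ_RE = re.compile(r"([A-Za-z_]\w*)\.(\w+)\s*=\s*([A-Za-z_]\w*)\.(\w+)")


def extract_join_triples(sql):
    """Return sorted (subject, predicate, object) triples, one per column equality."""
    aliases = {}
    for table, alias in ALIAS_RE.findall(sql):
        aliases[table] = table
        if alias:
            aliases[alias] = table

    triples = set()
    for left_tab, left_col, right_tab, right_col in EQ_RE.findall(sql):
        # Resolve aliases to table names; unknown qualifiers are kept as-is.
        subject = f"{BASE}{aliases.get(left_tab, left_tab)}/{left_col}"
        obj = f"{BASE}{aliases.get(right_tab, right_tab)}/{right_col}"
        triples.add((subject, JOINABLE, obj))
    return sorted(triples)


def to_ntriples(triples):
    """Serialize triples of IRIs as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    query = ("SELECT o.id, c.name FROM orders o "
             "JOIN customers AS c ON o.customer_id = c.id "
             "WHERE c.country = 'DE'")
    print(to_ntriples(extract_join_triples(query)))
```

Running this over a log of analyst queries and taking the union of the resulting triples would grow the graph incrementally, in the spirit of the query-driven approach the abstract describes.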
APA:
Haller, D., & Lenz, R. (2020). Pharos: Query-Driven Schema Inference for the Semantic Web. In Machine Learning and Knowledge Discovery in Databases: International Workshops of ECML PKDD 2019, Würzburg, Germany, September 16--20, 2019, Proceedings, Part II (pp. 112--124). Cham: Springer.
MLA:
Haller, David, and Richard Lenz. "Pharos: Query-Driven Schema Inference for the Semantic Web." Proceedings of the International Workshops of ECML PKDD 2019, Würzburg. Cham: Springer, 2020. 112--124.