Title: Automatic Knowledge Graph Entity Refinement Based on Social Networks
Abstract: Knowledge Graphs (KGs) such as Wikidata or DBpedia make it possible to store, process, and visualize facts about real-world entities (nodes) and the interrelations between them (edges). They allow users to visualize consolidated knowledge from heterogeneous data resources in a unified and clustered graph. In this work, we propose several contributions to the fields of Knowledge Graph construction and enrichment. The overarching contribution of this thesis is to introduce new methods to increase the coverage of certain entities in a KG. The four main technical contributions of this thesis are as follows: (1) We present a comparative study of state-of-the-art profile matching methods on Online Social Networks (OSNs). (2) We propose new techniques to refine specific entities such as academic entities (e.g., authors) and social event entities (e.g., festivals). For academic entities, we introduce new methods to identify their corresponding OSN links, such as Facebook links. To address this objective, we investigate methods for matching OSN user profiles based on novel features. For social events, we additionally introduce a new approach to evaluate the overall public sentiment related to these events, as collected from OSNs, over time. We evaluate the performance of our methods on several real-world datasets and show that they outperform the state of the art and produce high-quality results. (3) We introduce a number of measures to explore the user profile scope on multiple OSNs. Through these measures, we analyze three axes: (a) the user profile attributes, (b) the user profile content, and (c) the user social network. (4) We introduce a novel user profile matching method to interlink users across multiple OSNs that leverages two fundamental matching features: (a) life events and (b) profile biographies.
Life events (e.g., graduation) are used to improve the content matching process, and biographies (short descriptions that OSN users write about themselves) are used to improve the attribute matching process. In conclusion, we show that leveraging data from multiple OSNs is important for completing missing information about many entities in KGs.