Idiap Research Institute
Entity Matching Across Small Networks Using Node Attributes
Type of publication: Conference paper
Citation: Ahmadi_ECAI-PAIS2024_2024
Publication status: Accepted
Booktitle: ECAI 2024 - 27th European Conference on Artificial Intelligence, October 19-24, 2024, Santiago de Compostela, Spain - Including 13th Conference on Prestigious Applications of Intelligent Systems (PAIS 2024), Proceedings
Year: 2024
DOI: https://dx.doi.org/10.3233/FAIA241054
Abstract: Entity matching, also known as user identity linkage, is a critical task in data integration. While established techniques primarily focus on large-scale networks, there are several applications where small networks pose challenges due to limited training data and sparsity. This study addresses entity matching in the field of criminology, where small networks are common and the number of known matching nodes is restricted. To support this research, we exploit a multimodal dataset, collected as part of a security-related project, consisting of a network of intercepted telephone calls (i.e., ROXSD data) and a network of social forum interactions (i.e., ROXHOOD data), both collected in a simulated environment that follows a real investigation scenario. To improve accuracy and efficiency, we propose a novel approach for entity matching across these two small networks using node attributes. Existing techniques often focus solely on topology consistency between two networks and overlook valuable information, such as network node attributes, making them vulnerable to structural changes. Inspired by the remarkable success of deep learning, we present UGC-DeepLink, an end-to-end semi-supervised learning framework that leverages user-generated content. UGC-DeepLink encodes network nodes into vector representations, capturing both local and global network structures to align anchor nodes using deep neural networks. A dual learning paradigm and the policy gradient method transfer knowledge and update the linkage. Additionally, node attributes, such as call contents and texts exchanged on the forum, enhance the ranking of matching nodes. Experimental results on ROXSD and ROXHOOD demonstrate that UGC-DeepLink surpasses baselines and state-of-the-art methods in terms of identity-match ranking.
Keywords:
Projects EC H2020-ROXANNE
Authors Ahmadi, Zahra
Zhang, Zijian
Nguyen, Hoang H.
Burdisso, Sergio
Madikeri, Srikanth
Motlicek, Petr
Dikici, Erinc
Backfried, Gerhard
Kovac, Marek
Kudenko, Daniel
Attachments
  • Ahmadi_ECAI-PAIS2024_2024.pdf