CONF OtroshiShahreza_MLSP_2025/IDIAP
Title: Generating Synthetic Face Recognition Datasets Using Brownian Identity Diffusion and a Foundation Model
Authors: Otroshi Shahreza, Hatef; Marcel, Sébastien
Keywords: Face Recognition; Foundation Model; Synthetic Data
Attachment (EXTERNAL, PUBLIC): https://publications.idiap.ch/attachments/papers/2025/OtroshiShahreza_MLSP_2025.pdf
Venue: 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP)
Year: 2025
URL: https://ieeexplore.ieee.org/abstract/document/11204248
DOI: 10.1109/MLSP62443.2025.11204248
Abstract: Training face recognition models requires large numbers of identity-labeled face images, which are often collected by crawling the web and therefore raise ethical and privacy concerns. Recently, generating synthetic face datasets and training face recognition models on them has emerged as a viable solution. This paper presents BIF-Face, a new framework for generating synthetic face recognition datasets. We use Brownian identity diffusion to generate synthetic identities, and then build synthetic face recognition datasets by generating different samples for each identity using a foundation model. In our experiments, we use the generated face datasets to train face recognition models and evaluate them on several real benchmark datasets. Our experimental results show that face recognition models trained with BIF-Face achieve performance competitive with that of face recognition models trained on state-of-the-art synthetic face recognition datasets.