PersonaBench: Evaluating AI Models on Understanding Personal Information through Accessing (Synthetic) Private User Data

Abstract

Personalization is essential for AI assistants, especially in private AI settings where models are expected to interpret users' personal data (e.g., conversations, app usage) to understand their background, preferences, and social context. However, due to privacy concerns, existing academic research lacks direct access to such data, making benchmarking difficult. To fill this gap, we propose a synthetic data pipeline that generates realistic user profiles and private documents, enabling the creation of PersonaBench—a benchmark for evaluating models' ability to understand personal information. Using this benchmark, we assess Retrieval-Augmented Generation (RAG) pipelines on personalized questions and find that current models struggle to accurately extract and answer questions even when provided with the full set of user documents, highlighting the need for improved personalization methods.
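The evaluation described above follows the standard RAG pattern: retrieve candidate documents from a user's private collection, then answer a personalized question from the retrieved context. A minimal sketch of that loop, with hypothetical documents, a toy word-overlap retriever, and a placeholder in place of an actual model call (none of this reflects PersonaBench's real data or metrics):

```python
# Sketch of a RAG-style evaluation loop over (synthetic) private user
# documents. The documents, question, and scoring below are hypothetical
# stand-ins; a real pipeline would use a learned retriever and an LLM.

def retrieve(question, documents, k=2):
    """Rank documents by word overlap with the question (toy retriever)."""
    q_words = set(question.lower().split())
    return sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )[:k]

def answer_with_context(question, context):
    """Placeholder for a model call: echoes the retrieved context."""
    return " ".join(context)

# Hypothetical private user documents and a personalized question.
documents = [
    "Chat log: user mentioned their sister lives in Toronto.",
    "Calendar: weekly guitar lesson on Tuesdays.",
    "Notes: favorite cuisine is Thai food.",
]
question = "Where does the user's sister live?"

context = retrieve(question, documents)
prediction = answer_with_context(question, context)
correct = "Toronto" in prediction  # toy exact-match scoring
print(correct)
```

The benchmark's finding is that even when retrieval is bypassed entirely (the model sees all documents), answer accuracy remains low, so the bottleneck is extraction and reasoning, not just retrieval.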

Type
Publication
Findings of the Association for Computational Linguistics: ACL 2025
Liangwei Yang
Research Scientist at Salesforce Research

My research interests include Agents, Data Mining, and Efficient Modeling.