LLM

PersonaBench: Evaluating AI models on understanding personal information through accessing (synthetic) private user data
Personalization is essential for AI assistants, especially in private AI settings where models are expected to interpret users’ …
APIGen: Automated pipeline for generating verifiable and diverse function-calling datasets
The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an …