This paper puts real AI agents into a safe, live playground and asks expert testers to mess with them to see what breaks.
Benign fine-tuning meant to make language models more helpful can accidentally make them overshare private information.