RHELM Long-Term Memory Assessment Dataset
Date
Paper URL
License
CC BY 4.0
RHELM is a long-term memory assessment dataset released by Microsoft in 2026. The related research paper is "Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory". The dataset aims to advance the long-term memory, multi-hop reasoning, and temporal information synthesis capabilities of large language models in complex, dynamic scenarios. It is used in research on long-term temporal memory evaluation of large language models, verification of the long-term interaction capabilities of AI assistants, multi-hop reasoning, temporal information fusion, and hallucination detection. The dataset contains 10 fictional character profiles, 1,305 question-answer pairs, 629 conversations in JSON format, 625 email threads in TXT format, and 1,053 attachment documents in MD and HTML formats. The accompanying questions cover seven core types: attachment referencing, mixed reasoning, fact-finding, hallucination detection, information aggregation, time-series analysis, and misleading questions.
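As a quick sanity check after downloading, the file counts listed above (JSON conversations, TXT email threads, MD and HTML attachments) could be compared against a tally of files by extension. The following is a minimal sketch only: the function name `count_by_extension` and the idea that all files sit under one local root directory are assumptions for illustration, not the dataset's documented layout.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root: str) -> Counter:
    """Count files under `root` (recursively) by lowercase file extension.

    Hypothetical helper: `root` is assumed to be a local copy of the
    dataset; the real directory structure may differ.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Normalize case so ".JSON" and ".json" are counted together.
            counts[path.suffix.lower()] += 1
    return counts
```

Calling `count_by_extension("rhelm_data")`, where `rhelm_data` is a placeholder path, returns a `Counter` keyed by extension such as `.json`, `.txt`, `.md`, and `.html`, which can then be compared with the figures in the description.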