Date

13 days ago

Organization

Paper URL

2605.31086

License

CC BY 4.0

Tags

LLM

Intelligent Question Answering

Benchmarks

RHELM is a long-term memory assessment dataset released by Microsoft in 2026, accompanying the paper "Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory". It aims to improve the long-term memory, multi-hop reasoning, and temporal information synthesis capabilities of large models in complex, dynamic scenarios. It is intended for research scenarios such as evaluating the long-term temporal memory of large language models, verifying the long-term interaction capabilities of AI assistants, multi-hop reasoning, temporal information fusion, and hallucination detection.

The dataset contains 10 sets of virtual character profiles, 1,305 question-answer pairs, 629 conversations in JSON format, 625 email threads in TXT format, and 1,053 attachment documents in MD and HTML formats. The accompanying questions cover seven core types: attachment referencing, mixed reasoning, fact-finding, hallucination detection, information aggregation, time-series analysis, and misleading questions.
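As a quick sanity check after downloading, the file counts quoted above can be compared against a local copy. The sketch below is only illustrative: the directory layout of the release is not described here, so the root path is hypothetical, and the helper names `count_by_extension` and `attachment_total` are introduced for this example. Other files in the release (for instance the question-answer pairs, whose format is not stated) may also share these extensions, so the tallies are a rough guide rather than an exact match.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root):
    """Count regular files under `root`, grouped by lowercase file extension."""
    return Counter(
        p.suffix.lower() for p in Path(root).rglob("*") if p.is_file()
    )


def attachment_total(counts):
    """Sum the .md and .html files, the two attachment formats named above."""
    return counts.get(".md", 0) + counts.get(".html", 0)


if __name__ == "__main__":
    # Hypothetical location of a local copy; adjust to wherever it was unpacked.
    counts = count_by_extension("./rhelm")
    print("JSON files:", counts.get(".json", 0))
    print("TXT files:", counts.get(".txt", 0))
    print("MD + HTML files:", attachment_total(counts))
```

If the local copy matches the description, the MD plus HTML total should be in the region of 1,053, and the JSON and TXT tallies should be at least 629 and 625 respectively.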

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

RHELM Long-Term Memory Assessment Dataset

Related Datasets

Global Climate & Energy Transition 2000 – 2026 Global Climate and Energy Dataset

Verbatim Spans Query Condition Evidence Extraction Dataset

SAM 3D Artist Objects 3D Object Reconstruction Dataset

FigureBench Scientific Illustration Generation Benchmark Dataset

TACK Targeted Chimera Knowledge Base Dataset

EAVSD E-commerce Advertising Video Storyboard Dataset

DeepCrack Infrastructure Crack Detection Dataset

ViMU Video Metaphor Understanding Dataset

MemLens Multimodal Long Context Benchmark Dataset

VisCoR-55K Visual Inference Dataset

MathNet Multimodal Mathematical Benchmark Inference Dataset

Claw-Eval Real-World Benchmark Dataset

QCalEval Quantum Calibration Graph Understanding Dataset

RSRCC Remote Sensing Area Change Understanding Benchmark Dataset

PanScale Remote Sensing Pancolor Sharpening Dataset

Emotion-probes Emotion Detection Dataset

OpenMementos Context Memory Compressed Dataset

MIA Multistep Inference and Decision Trajectory Dataset

GPT-5.4-step-by-step-reasoning Dataset

COCO-2017-Vietnamese Vietnamese Image Detection Dataset
