OpenMementos Context-Memory Compression Dataset
License: MIT
OpenMementos is a context-memory compression dataset released by Microsoft in 2026 for modeling long-chain reasoning and context management in large language models. It is intended for training models to compress their context and continue reasoning, so that complex multi-step tasks fit within a limited context window. It suits research on long-chain reasoning, memory-augmented model training, and efficient generation. The dataset is built on the OpenThoughts reasoning dataset and contains 228,557 structured reasoning traces: 123,333 math, 61,485 science, and 43,739 programming. Traces average 187 sentences each.
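As a quick sanity check on the figures above, the per-domain counts can be summed and turned into shares. This is only arithmetic on the numbers quoted in the description; the dictionary keys are illustrative labels, not confirmed values of any dataset field.

```python
# Per-domain counts as quoted in the description above.
# The key names are illustrative labels chosen for this sketch.
domain_counts = {"math": 123_333, "science": 61_485, "code": 43_739}

# The three domains should add up to the stated total of 228,557.
total = sum(domain_counts.values())

# Share of each domain, in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in domain_counts.items()}

print(total)   # 228557
print(shares)  # {'math': 54.0, 'science': 26.9, 'code': 19.1}
```

Roughly half of the dataset is math, with science and programming making up the remainder.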
Data Structure
This dataset provides two subsets:
default: used for training and supervised fine-tuning (SFT).
- problem (string): The problem statement (input).
- response (string): A Memento-formatted reasoning response containing block/summary tags.
- domain (string): The domain the example belongs to (e.g., code, math, science).
- source (string): The original source of the data (from OpenThoughts-v3).
- difficulty (int): The difficulty level of the problem.
full: used for in-depth research or pipeline processing. In addition to the fields above, it contains detailed information about the intermediate processing steps:
- sentences (list[string]): A list of sentences derived from the response, used for fine-grained modeling and analysis.
- blocks (list[list[int]]): Boundary indices of the reasoning blocks; each element is [start_idx, end_idx], the sentence range covered by that block.
- block_summaries (list[string]): A summary of each block, reflecting the progressive compression and abstraction of the reasoning process.
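The field list above can be sketched as a typed record, together with a helper that pairs each block's sentences with its summary. This is a minimal illustration under stated assumptions, not the dataset's own tooling: `FullRecord`, `block_texts`, and `compressed_context` are hypothetical names, the toy record is invented, and whether `end_idx` is inclusive or exclusive is not stated in the description, so the sketch exposes that choice as a flag (defaulting to exclusive).

```python
from typing import List, Tuple, TypedDict


class FullRecord(TypedDict):
    """Hypothetical type mirroring the fields listed for the `full` subset."""
    problem: str
    response: str
    domain: str
    source: str
    difficulty: int
    sentences: List[str]
    blocks: List[List[int]]
    block_summaries: List[str]


def block_texts(record: FullRecord, end_inclusive: bool = False) -> List[Tuple[str, str]]:
    """Pair the joined sentences of each block with that block's summary.

    ASSUMPTION: end_idx is treated as exclusive unless end_inclusive=True.
    """
    pairs = []
    for (start, end), summary in zip(record["blocks"], record["block_summaries"]):
        stop = end + 1 if end_inclusive else end
        pairs.append((" ".join(record["sentences"][start:stop]), summary))
    return pairs


def compressed_context(record: FullRecord, k: int, end_inclusive: bool = False) -> List[str]:
    """Summaries of the blocks before block k, followed by the full text of block k.

    Illustrates the idea of reasoning from compressed history; it is not
    claimed to be the pipeline used to build the dataset.
    """
    pairs = block_texts(record, end_inclusive)
    earlier_summaries = [summary for _, summary in pairs[:k]]
    return earlier_summaries + [pairs[k][0]]


# Toy record invented for illustration; it is not taken from the dataset.
toy: FullRecord = {
    "problem": "What is 2 + 3 * 4?",
    "response": "",  # the Memento-formatted text is omitted in this sketch
    "domain": "math",
    "source": "toy",
    "difficulty": 1,
    "sentences": ["Multiply first.", "3 * 4 = 12.", "Then add 2.", "2 + 12 = 14."],
    "blocks": [[0, 2], [2, 4]],
    "block_summaries": ["3 * 4 = 12", "Answer: 14"],
}

pairs = block_texts(toy)
context = compressed_context(toy, 1)
print(context)  # ['3 * 4 = 12', 'Then add 2. 2 + 12 = 14.']
```

Before relying on either helper, it is worth checking a few real records to see whether the last block's `end_idx` equals `len(sentences)` (exclusive) or `len(sentences) - 1` (inclusive), and setting the flag accordingly.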