Mixture-of-Thoughts Reasoning Dataset
Mixture-of-Thoughts is a multi-domain reasoning dataset that integrates high-quality reasoning tracks from three major fields: mathematics, programming, and science. It aims to train large language models (LLMs) to perform reasoning step by step. Each sample in this dataset contains messages Fields store the reasoning process in the form of multiple rounds of dialogue (such as: question → thinking steps → answer), supporting the model's ability to learn step-by-step reasoning.
Dataset structure:
- Mathematics: 93.7k math problem reasoning traces
- Programming: 83.1k reasoning tracks for competitive programming problems in Python and C++
- Science: 173k Reasoning tracks for scientific questions
Build AI with AI
From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.