MegaScience Scientific Reasoning Dataset
Date
Size
Paper URL
License
CC BY-NC-SA 3.0
MegaScience is a scientific reasoning dataset released by Shanghai Jiao Tong University in 2025, accompanying the paper "MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning". The dataset contains 1.25 million instances and is designed to support natural language processing (NLP) and machine learning models in scientific research, for tasks such as literature retrieval, information extraction, automatic summarization, and citation analysis.
