DuReader: A Large-Scale Open-Domain Chinese Machine Reading Comprehension Dataset

DuReader is a large-scale open-domain Chinese dataset for machine reading comprehension, which can be used to train or evaluate machine reading comprehension models and systems.
The dataset consists of 200,000 questions, 420,000 answers, and 1 million documents. The questions and documents are drawn from Baidu Search and Baidu Zhidao (Baidu Knows), and the answers are manually generated. Each question is also manually labeled along two dimensions: its answer type (Entity, Description, or YesNo) and whether it asks for a Fact or an Opinion.
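As an illustration of how the question-type annotations might be used, the following minimal sketch tallies questions by label. It assumes a JSON Lines layout in which each record carries a `question_type` field and a `fact_or_opinion` field; these field names, the label spellings, and the sample records are assumptions made for illustration and should be checked against the files actually distributed with the dataset.

```python
import json
from collections import Counter

# Hypothetical sample records in JSON Lines form. The field names
# "question_type" and "fact_or_opinion" are assumptions, not confirmed
# by the description above; the questions are placeholders.
SAMPLE_LINES = [
    '{"question": "q1", "question_type": "ENTITY", "fact_or_opinion": "FACT"}',
    '{"question": "q2", "question_type": "DESCRIPTION", "fact_or_opinion": "OPINION"}',
    '{"question": "q3", "question_type": "YES_NO", "fact_or_opinion": "FACT"}',
    '{"question": "q4", "question_type": "ENTITY", "fact_or_opinion": "FACT"}',
]


def count_question_labels(lines):
    """Count records by (question type, fact-or-opinion) label pair."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        key = (
            record.get("question_type", "UNKNOWN"),
            record.get("fact_or_opinion", "UNKNOWN"),
        )
        counts[key] += 1
    return counts


label_counts = count_question_labels(SAMPLE_LINES)
for (question_type, fact_or_opinion), n in sorted(label_counts.items()):
    print(question_type, fact_or_opinion, n)
```

To run the same tally on a real file, the list of sample lines could be replaced by an open file handle, since the function only iterates over lines of text.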