HotpotQA Question Answering Dataset
License: CC BY-SA 4.0

HotpotQA is a large-scale question-answering dataset collected from English Wikipedia, containing about 113,000 crowdsourced questions. Answering each question requires the introductory paragraphs of two Wikipedia articles. Each question comes with these two gold paragraphs and a list of sentences within them marked as supporting facts, that is, the sentences considered necessary to answer the question.
The dataset has the following characteristics:
- Questions require finding and reasoning over multiple supporting documents to answer;
- The questions are diverse and not constrained by any pre-existing knowledge base or knowledge schema;
- The dataset provides the sentence-level supporting facts required for reasoning, allowing QA systems to reason under strong supervision and explain their predictions;
- The dataset offers a new type of factoid comparison question that tests the ability of QA systems to extract relevant facts and make the necessary comparisons.
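To make the sentence-level supporting facts concrete, here is a minimal Python sketch that resolves them to actual sentences for a single record. The field names (`question`, `answer`, `type`, `context`, `supporting_facts`) are an assumption based on the layout commonly seen in the released JSON files and are not stated on this page. The record itself is a made-up toy comparison question, not real dataset content, and the helper names `gold_sentences` and `gold_titles` are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Toy record for illustration only (NOT taken from the dataset).
# Assumed layout: `context` is a list of [title, [sentence, ...]] pairs and
# `supporting_facts` is a list of [title, sentence_index] pairs.
example: Dict[str, Any] = {
    "question": "Which river is longer, River A or River B?",
    "answer": "River A",
    "type": "comparison",
    "context": [
        ["River A", ["River A is a river in Exampleland.", "It is 900 km long."]],
        ["River B", ["River B flows through Sampletown.", "It is 450 km long."]],
        ["Sampletown", ["Sampletown is a town.", "It has a market."]],
    ],
    "supporting_facts": [["River A", 1], ["River B", 1]],
}


def gold_sentences(record: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (title, sentence) pairs for every supporting fact in the record."""
    paragraphs = {title: sentences for title, sentences in record["context"]}
    facts = []
    for title, sent_idx in record["supporting_facts"]:
        sentences = paragraphs.get(title, [])
        # Skip indices that fall outside the paragraph instead of raising.
        if 0 <= sent_idx < len(sentences):
            facts.append((title, sentences[sent_idx]))
    return facts


def gold_titles(record: Dict[str, Any]) -> List[str]:
    """Return the titles of the gold paragraphs, in order, without duplicates."""
    titles: List[str] = []
    for title, _ in record["supporting_facts"]:
        if title not in titles:
            titles.append(title)
    return titles


if __name__ == "__main__":
    for title, sentence in gold_sentences(example):
        print(f"{title}: {sentence}")
```

In this toy record the two gold paragraphs are "River A" and "River B", while "Sampletown" acts as a distractor. A system trained under strong supervision would be expected to pick out the two length sentences and compare them to reach the answer.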