Question Answering on NarrativeQA
Metrics
- BLEU-1
- BLEU-4
- METEOR
- ROUGE-L
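All four are text-overlap metrics that compare a generated answer against reference answers. The sketch below is only an illustration of how these scores could be computed for a single prediction/reference pair; it is not the official NarrativeQA evaluation script. It assumes the `nltk` and `rouge-score` Python packages, and the answer strings are toy examples.

```python
# Illustrative sketch (not the official NarrativeQA scorer): compute BLEU-1,
# BLEU-4, METEOR, and ROUGE-L for one hypothetical prediction/reference pair.
# Assumes the `nltk` and `rouge-score` packages are installed.
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from rouge_score import rouge_scorer

nltk.download("wordnet", quiet=True)  # METEOR uses WordNet for synonym matching

# Toy answer pair (hypothetical, not taken from the dataset).
reference = "He hides the treasure in the old lighthouse."
prediction = "He hid it in the lighthouse."

ref_tokens = reference.lower().split()
pred_tokens = prediction.lower().split()
smooth = SmoothingFunction().method1  # avoid zero scores on short answers

# BLEU-1: unigram precision only; BLEU-4: uniform weights over 1- to 4-grams.
bleu1 = sentence_bleu([ref_tokens], pred_tokens, weights=(1, 0, 0, 0),
                      smoothing_function=smooth)
bleu4 = sentence_bleu([ref_tokens], pred_tokens,
                      weights=(0.25, 0.25, 0.25, 0.25),
                      smoothing_function=smooth)

# METEOR: unigram matching with stemming/synonyms, recall-weighted F-score.
# Recent NLTK versions expect pre-tokenized references and hypothesis.
meteor = meteor_score([ref_tokens], pred_tokens)

# ROUGE-L: F-measure based on the longest common subsequence.
rougeL = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True) \
    .score(reference, prediction)["rougeL"].fmeasure

print(f"BLEU-1 {bleu1:.3f}  BLEU-4 {bleu4:.3f}  "
      f"METEOR {meteor:.3f}  ROUGE-L {rougeL:.3f}")
```

In practice, NarrativeQA provides multiple reference answers per question, so an evaluation script would score a prediction against each reference (or pass all references at once) and aggregate over the test set.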
Results
Performance of various models on this benchmark, reported as BLEU-1, BLEU-4, METEOR, and ROUGE-L.
| Model | BLEU-1 | BLEU-4 | METEOR | ROUGE-L | Paper Title |
|---|---|---|---|---|---|
| Oracle IR Models | 54.60/55.55 | 26.71/27.78 | - | - | The NarrativeQA Reading Comprehension Challenge |
| Masque (NarrativeQA + MS MARCO) | 54.11 | 30.43 | 26.13 | 59.87 | Multi-style Generative Reading Comprehension |
| Masque (NarrativeQA only) | 48.7 | 20.98 | 21.95 | 54.74 | Multi-style Generative Reading Comprehension |
| DecaProp | 44.35 | 27.61 | 21.80 | 44.69 | Densely Connected Attention Propagation for Reading Comprehension |
| MHPGM + NOIC | 43.63 | 21.07 | 19.03 | 44.16 | Commonsense for Generative Multi-Hop Question Answering Tasks |
| ConZNet | 42.76 | 22.49 | 19.24 | 46.67 | Cut to the Chase: A Context Zoom-in Network for Reading Comprehension |
| BiAttention + DCU-LSTM | 36.55 | 19.79 | 17.87 | 41.44 | Multi-Granular Sequence Encoding via Dilated Compositional Units for Reading Comprehension |
| FiD+Distil | 35.3 | 7.5 | 11.1 | 32 | Distilling Knowledge from Reader to Retriever for Question Answering |
| BiDAF | 33.45 | 15.69 | 15.68 | 36.74 | Bidirectional Attention Flow for Machine Comprehension |
| BERT-QA with Hard EM objective | - | - | - | 58.8 | A Discrete Hard EM Approach for Weakly Supervised Question Answering |