Mandar Joshi; Omer Levy; Daniel S. Weld; Luke Zettlemoyer

Abstract
We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks. A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at distinguishing between related but distinct entities (e.g., President and CEO). However, there is still room for improvement in modeling document-level context, conversations, and mention paraphrasing. Our code and models are publicly available.
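The underlying system pairs the c2f-coref span-ranking model (Lee et al., 2018) with BERT in place of ELMo embeddings. The snippet below is a minimal, illustrative sketch of that general setup rather than the authors' released TensorFlow implementation: it encodes a sentence with a pretrained BERT model via the Hugging Face transformers library and scores a pair of candidate mention spans with a small feed-forward network. The `PairScorer` class, the endpoint-only span representation, and the hard-coded token indices are simplifying assumptions; the actual model additionally uses attention-weighted head features, coarse-to-fine span pruning, and antecedent ranking.

```python
# Illustrative sketch only, not the authors' implementation.
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
encoder = BertModel.from_pretrained("bert-base-cased")

class PairScorer(nn.Module):
    """Scores whether two candidate mention spans corefer (hypothetical, simplified)."""
    def __init__(self, hidden=768):
        super().__init__()
        # Each span is represented by its start and end token embeddings only.
        self.ffnn = nn.Sequential(
            nn.Linear(hidden * 4, 150), nn.ReLU(), nn.Linear(150, 1)
        )

    def span_repr(self, states, span):
        start, end = span
        return torch.cat([states[start], states[end]], dim=-1)

    def forward(self, states, span_a, span_b):
        pair = torch.cat([self.span_repr(states, span_a),
                          self.span_repr(states, span_b)], dim=-1)
        return self.ffnn(pair)

text = "The CEO said she would resign."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    states = encoder(**inputs).last_hidden_state[0]  # (seq_len, 768)

scorer = PairScorer()
# Token indices for the two candidate mentions are assumed for illustration
# (they include the [CLS] offset and depend on the tokenizer's output).
score = scorer(states, span_a=(1, 2), span_b=(4, 4))
print(score.item())  # untrained score; after training, higher means more likely coreferent
```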
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| Coreference Resolution on CoNLL-2012 | c2f-coref + BERT-large | Avg F1: 76.9 |
| Coreference Resolution on OntoNotes | c2f-coref + BERT-large | Avg F1: 76.9 |
| Coreference Resolution on OntoNotes | c2f-coref + BERT-base | Avg F1: 73.9 |
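The "Avg F1" reported above is the standard CoNLL coreference metric: the unweighted mean of the MUC, B³, and CEAF_φ4 F1 scores. The small helper below makes the arithmetic explicit; the per-metric scores passed in are placeholders, not values from the paper.

```python
# The CoNLL "Avg F1" is the unweighted mean of MUC, B-cubed, and CEAF_phi4 F1.
def conll_avg_f1(muc_f1: float, b_cubed_f1: float, ceaf_phi4_f1: float) -> float:
    return (muc_f1 + b_cubed_f1 + ceaf_phi4_f1) / 3.0

# Placeholder scores, for illustration only.
print(conll_avg_f1(80.4, 73.3, 72.9))  # -> 75.53
```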