4 months ago

Dataset for Automatic Summarization of Russian News

Ilya Gusev

Abstract

Automatic text summarization has been studied in a variety of domains and languages. However, this is not the case for the Russian language. To address this gap, we present Gazeta, the first dataset for summarization of Russian news. We describe the properties of this dataset and benchmark several extractive and abstractive models. We demonstrate that the dataset poses a valid task for Russian text summarization methods. Additionally, we show that the pretrained mBART model is useful for Russian text summarization.

Code Repositories

IlyaGusev/summarus (official, PyTorch)

IlyaGusev/gazeta (official)

Benchmarks

Benchmark	Methodology	BLEU	METEOR	ROUGE-1	ROUGE-2	ROUGE-L
text-summarization-on-gazeta	Finetuned mBART	12.4	25.7	32.1	14.2	27.9
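
For readers unfamiliar with the ROUGE-1 and ROUGE-2 columns, the following is a minimal sketch of how a ROUGE-N score can be computed from n-gram overlap between a reference summary and a generated one. It assumes naive lowercase whitespace tokenization and returns values in the range 0 to 1, whereas the table reports scores on a 0 to 100 scale. The function names `ngrams` and `rouge_n` are illustrative, and this is not necessarily the evaluation code or preprocessing behind the reported numbers.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """ROUGE-N precision, recall and F1.

    Assumption: naive lowercase whitespace tokenization, which is a
    simplification and may differ from the tokenization used for the
    scores reported in the benchmark table.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(rouge_n(reference, candidate, n=1))
    print(rouge_n(reference, candidate, n=2))
```

ROUGE-L, also listed in the table, is based on the longest common subsequence rather than fixed-length n-grams and is not covered by this sketch.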

