20 Newsgroups Document Dataset
License: Non-Commercial
20 Newsgroups is a collection of approximately 20,000 newsgroup documents and has become a popular dataset for machine learning experiments on text.
The documents are divided nearly evenly among 20 different newsgroups, and the dataset is a widely used benchmark for text classification, text mining, and information retrieval research.
The 20 Newsgroups dataset was collected by Ken Lang. The related paper, "NewsWeeder: Learning to Filter Netnews", appeared in the Proceedings of the 12th International Conference on Machine Learning in 1995.
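The even spread of documents across newsgroups can be checked on a local copy before training a classifier. The sketch below is illustrative only: it assumes the copy is unpacked as one directory per newsgroup with one file per document, and the helper names `count_documents_per_group` and `balance_ratio` are hypothetical, not part of any official tooling for the dataset.

```python
from collections import Counter
from pathlib import Path


def count_documents_per_group(root):
    """Count the files in each immediate subdirectory of `root`.

    Assumed layout (not guaranteed for every distribution of the dataset):
    <root>/<newsgroup name>/<one file per document>
    """
    counts = Counter()
    for group_dir in sorted(Path(root).iterdir()):
        if group_dir.is_dir():
            counts[group_dir.name] = sum(
                1 for entry in group_dir.iterdir() if entry.is_file()
            )
    return counts


def balance_ratio(counts):
    """Return smallest group size divided by largest; 1.0 means perfectly even."""
    if not counts:
        raise ValueError("no newsgroup directories found")
    largest = max(counts.values())
    if largest == 0:
        raise ValueError("all newsgroup directories are empty")
    return min(counts.values()) / largest
```

Calling `balance_ratio(count_documents_per_group(path))` on the directory that holds the newsgroup folders gives a value close to 1.0 when the groups are similar in size, which is what "evenly distributed" implies for this dataset.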