NLPCC2016 News Dataset
Date
Size
Publish URL
License
Other
Unlike popular news datasets, the NLPCC2016 dataset uses more informal text from Sina Weibo. The training and test data consist of microblogs on different topics, such as finance, sports, and entertainment. The dataset is UTF-8 encoded and can be used for Chinese word segmentation tasks.
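As a rough illustration of how such data might be prepared for a word segmentation model, the sketch below converts a segmented sentence into character-level B/M/E/S tags (begin, middle, end, single), a common labeling scheme for this task. It assumes each line holds one sentence with words separated by whitespace; that layout is an assumption and is not confirmed by this page. The function names `to_bmes` and `read_segmented_file` and the sample sentence are hypothetical.

```python
def to_bmes(segmented_line):
    """Convert a whitespace-segmented sentence into (character, tag) pairs.

    Tags: B = first character of a word, M = middle, E = last,
    S = single-character word.
    """
    pairs = []
    for word in segmented_line.split():
        if len(word) == 1:
            pairs.append((word, "S"))
        else:
            pairs.append((word[0], "B"))
            pairs.extend((ch, "M") for ch in word[1:-1])
            pairs.append((word[-1], "E"))
    return pairs


def read_segmented_file(path):
    """Read a UTF-8 file, assuming one whitespace-segmented sentence per line."""
    with open(path, encoding="utf-8") as handle:
        # Skip blank lines; tag every remaining sentence.
        return [to_bmes(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # Hypothetical sample sentence, not taken from the dataset.
    for char, tag in to_bmes("我 喜欢 看 篮球赛"):
        print(char, tag)
```

Passing `encoding="utf-8"` explicitly avoids depending on the platform default encoding, which matters on systems where the default is not UTF-8.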