PKU Simplified Chinese Word Segmentation Dataset
Date
Size
Publish URL
Paper URL
The SIGHAN 2005 International Chinese Automatic Word Segmentation Evaluation (SIGHAN Evaluation) combines word segmentation datasets from multiple institutions. The collection was jointly released by Microsoft Research China, Peking University, City University of Hong Kong, and Academia Sinica in Taiwan, and is used for training and evaluating Chinese word segmentation models. PKU, the portion contributed by Peking University, is a Simplified Chinese word segmentation dataset.
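As a rough illustration of how a segmentation dataset like this is typically used for evaluation, the sketch below scores one predicted segmentation against a gold one using word-level precision, recall, and F1. It assumes each sentence is stored as one line with words separated by whitespace. The function names `words_to_spans` and `score_segmentation` and the example sentence are hypothetical and are not taken from the corpus or from any official scoring tool.

```python
def words_to_spans(words):
    """Map a list of words to the set of (start, end) character spans they cover."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def score_segmentation(gold_line, pred_line):
    """Return word-level (precision, recall, F1) for a single sentence.

    Assumption: both lines are whitespace-separated words over the same
    underlying character sequence.
    """
    gold_words = gold_line.split()
    pred_words = pred_line.split()
    if "".join(gold_words) != "".join(pred_words):
        raise ValueError("gold and predicted lines must cover the same characters")

    gold = words_to_spans(gold_words)
    pred = words_to_spans(pred_words)
    correct = len(gold & pred)  # a word counts only if both boundaries match

    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative sentence, not drawn from the PKU corpus.
    gold = "北京 大学 生 前来 应聘"
    pred = "北京大学 生 前来 应聘"
    p, r, f = score_segmentation(gold, pred)
    print(f"{p:.2f} {r:.2f} {f:.2f}")  # prints "0.75 0.60 0.67"
```

Comparing character spans rather than word strings means a repeated word is only credited when it appears at the same position in both segmentations. For corpus-level scores, the counts of correct, predicted, and gold words would be summed across sentences before computing the ratios.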