ESD Emotional Speech Dataset
License
Non-Commercial

ESD (Emotional Speech Database) is an emotional speech dataset for voice conversion research. It consists of 350 parallel utterances spoken by 10 native English speakers and 10 native Chinese speakers, covering 5 emotion categories (neutral, happy, angry, sad, and surprised). More than 29 hours of speech were recorded in a controlled acoustic environment. The dataset is suitable for multilingual and cross-lingual emotional voice conversion research.