MDPBench Multilingual Document Parsing Benchmark Dataset

MDPBench is a benchmark dataset for parsing multilingual documents in both digital and photographed form. The associated paper is "MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios". The benchmark aims to evaluate and improve models' ability to parse multilingual documents in complex real-world scenarios. The dataset contains 3,400 document images covering 17 languages, including Simplified Chinese, Traditional Chinese, English, Arabic, German, Spanish, French, Hindi, Indonesian, Italian, Japanese, Korean, Portuguese, Russian, Thai, and Vietnamese. The images were first annotated by expert models, then manually corrected and manually verified to ensure high-quality annotations.

Dataset Example
