OmniParsingBench Multimodal Parsing Capability Evaluation Dataset
Date:
Paper URL:
License: Apache 2.0
OmniParsingBench is a benchmark dataset released by Alibaba in 2026 for evaluating the unified parsing capabilities of multimodal large language models (MLLMs). The associated research paper is the Logics-Parsing-Omni Technical Report. The benchmark aims to move beyond the limitations of traditional single-task evaluation by systematically assessing model capabilities across the full pipeline from perception to cognition, and it is applicable to scenarios such as multimodal understanding, structured information extraction, and research on complex reasoning abilities. The dataset contains approximately 5,294 samples spanning six modalities (natural images, graphics, documents, audio, natural video, and text-intensive video) and introduces three levels of evaluation metrics: perception (Perc.), cognition (Cog.), and overall (Ovr.). Each sample pairs an image, audio, or video input with a corresponding structured parsing task.
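To illustrate how results against the three metric levels might be aggregated, here is a minimal sketch. The record field names (`modality`, `level`, `correct`) and the definition of the overall score (a simple mean of perception and cognition accuracy) are assumptions for illustration, not the benchmark's documented schema or scoring rule.

```python
# Hypothetical scoring sketch for OmniParsingBench-style results.
# Assumed schema: each result has "modality", "level" ("perception" or
# "cognition"), and a boolean "correct". The overall metric is assumed
# here to be the mean of the two level accuracies.

def score(results):
    """Aggregate per-sample results into Perc., Cog., and Ovr. accuracy."""
    buckets = {"perception": [0, 0], "cognition": [0, 0]}  # [correct, total]
    for r in results:
        bucket = buckets[r["level"]]
        bucket[0] += int(r["correct"])
        bucket[1] += 1
    perc = buckets["perception"][0] / max(buckets["perception"][1], 1)
    cog = buckets["cognition"][0] / max(buckets["cognition"][1], 1)
    return {"Perc.": perc, "Cog.": cog, "Ovr.": (perc + cog) / 2}

demo = [
    {"modality": "document", "level": "perception", "correct": True},
    {"modality": "document", "level": "cognition", "correct": False},
    {"modality": "audio", "level": "perception", "correct": True},
    {"modality": "audio", "level": "cognition", "correct": True},
]
print(score(demo))  # {'Perc.': 1.0, 'Cog.': 0.5, 'Ovr.': 0.75}
```

The bucketed layout makes it easy to add per-modality breakdowns later by keying the buckets on `(modality, level)` instead of `level` alone.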