Multi-LMentry Multilingual Basic Task Benchmark Dataset
Multi-LMentry is a multilingual benchmark dataset released in 2025. It is designed to systematically evaluate how well large language models (LLMs) generalize across languages on low-level language understanding and basic reasoning tasks.
The dataset covers nine languages: English, Catalan, German, Spanish, Basque, Galician, Korean, Italian, and Brazilian Portuguese. Native speakers manually redesigned the tasks for each language. The tasks follow the form of the original LMentry framework but are not direct translations, which keeps them natural and culturally appropriate.
Dataset structure
- The dataset is organized into folders by language.
- In each language folder, each task corresponds to a JSON file.
- Each JSON file contains the input prompts and expected outputs for that task.
- The task types include simple sentence construction, contextual vocabulary selection, and letter reasoning.
- Some tasks are language-specific; for example, rhyming tasks are excluded in languages where they are not applicable.
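The folder-per-language, file-per-task layout described above could be read with a short Python sketch like the one below. It is only an illustration under stated assumptions: the function names `load_benchmark` and `missing_tasks` are hypothetical, the actual folder names, file names, and JSON schema are not specified here, and each JSON file is therefore loaded as-is without assuming any field names.

```python
import json
from pathlib import Path


def load_benchmark(root):
    """Return {language: {task_name: parsed_json}}.

    Assumes one sub-folder per language and one JSON file per task,
    with the task name taken from the file name (an assumption).
    """
    data = {}
    for lang_dir in sorted(Path(root).iterdir()):
        if not lang_dir.is_dir():
            continue  # skip stray files such as a README
        tasks = {}
        for task_file in sorted(lang_dir.glob("*.json")):
            with task_file.open(encoding="utf-8") as f:
                # The schema is not assumed; keep whatever the file holds.
                tasks[task_file.stem] = json.load(f)
        data[lang_dir.name] = tasks
    return data


def missing_tasks(data):
    """For each language, list tasks that exist in some other language only."""
    all_tasks = set()
    for tasks in data.values():
        all_tasks.update(tasks)
    return {lang: sorted(all_tasks - set(tasks)) for lang, tasks in data.items()}
```

Calling `missing_tasks(load_benchmark(path))` on a local copy of the dataset (where `path` is wherever the folders were downloaded) would then show which tasks, such as rhyming tasks, are absent for a given language.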