MiniCPM5-1B, Trained Using RL+OPD, Achieves State-of-the-Art (SOTA) Performance on Multiple Complex Tasks; the CHI-Bench Dataset for Evaluating Medical Agents, Designed for Automation of Complex Healthcare Processes, Has Been Released

2 months ago

Information

Artificial Intelligence

MiniCPM5-1B is an open-source language model with 1 billion parameters, designed for edge deployment and resource-constrained scenarios. It is the first model in the MiniCPM5 series. Built on the standard Llama architecture, it introduces features including a hybrid inference paradigm based on tags. The model also uses RL+OPD training to improve core performance while cutting output redundancy, and it natively supports ultra-long contexts of up to 131K characters. It achieves state-of-the-art (SOTA) results among open-source models at the 1B scale on complex tasks such as agent invocation and code synthesis. Because it runs locally, it avoids the latency and privacy trade-offs of cloud-based inference, making it well suited to building an efficient local AI platform.

The HyperAI website now features "MiniCPM5-1B: A High-Efficiency 1B LLM for Edge Applications." Give it a try!

Online use: https://go.hyper.ai/OBlhv

Visit our official website for more information:

https://hyper.ai

A quick overview of updates on the hyper.ai website from May 30th to June 5th:

* High-quality public datasets: 6

* A selection of high-quality tutorials: 5

* Community article analysis: 1 article

* Popular encyclopedia entries: 5

Visit the official website: hyper.ai

Selected Public Datasets

1. chi-bench Medical Intelligent Agent Benchmark Evaluation Dataset

chi-bench is a healthcare agent evaluation dataset released by Actava AI in 2026. This dataset constructs a high-fidelity healthcare business simulation environment, integrating 20 healthcare application systems through the MCP (Model Context Protocol) open interface and providing a knowledge base containing 1,279 healthcare operation documents. The evaluation scenarios cover three major areas in the US healthcare system: pre-authorization management, citation management for health insurance/insurance providers, and population care management.

Online use: https://go.hyper.ai/j8pCr
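As a rough illustration of how an agent might address one of the MCP-connected systems mentioned in the chi-bench entry above, the sketch below builds a tool-call request. MCP messages follow JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool name `lookup_prior_authorization` and its `case_id` argument are hypothetical, invented only for this example; they are not taken from the dataset.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that uses MCP's `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(1, "lookup_prior_authorization", {"case_id": "demo-001"})
print(json.dumps(request, indent=2))
```

An evaluation harness would send a message like this to an MCP server and score the agent on whether the resulting sequence of calls completes the healthcare workflow.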

2. SMOL Multilingual Translation Parallel Dataset

SMOL is a professional translation dataset released by Google in 2025. It contains professionally translated texts in 221 languages, including Amharic, Swahili, and Afar, with a focus on low-resource and regional languages for which annotated data is scarce. It covers a wide range of language pairs, combines professional translations with volunteer-contributed texts, and for some languages adds domain-specific data and factual annotations related to the medical field.

Online use: https://go.hyper.ai/84QS4

3. TACK Targeted Chimera Knowledge Base Dataset

TACK is a standardized knowledge base dataset and benchmark set released by AI Laboratory for Molecular Engineering in 2026. It aims to address the problems of data scarcity, lack of rigorous evaluation, and limited coverage in existing PROTAC machine learning benchmarks. It is widely used in fields such as PROTAC degradation activity prediction, targeted protein degradation (TPD) research, AI-assisted drug discovery (AIDD), computer-aided drug design (CADD), virtual drug screening, multi-task learning, molecular property prediction, graph neural network research, and machine learning benchmark testing.

Online use: https://go.hyper.ai/7gDJu

4. EAVSD E-commerce Advertising Video Storyboard Dataset

EAVSD is an e-commerce advertising video storyboard dataset released by a team from Peking University in 2026. It supports subject-oriented multi-image generation and narrative planning tasks, with a core focus on e-commerce advertising video storyboard generation and research on controllable long-range visual consistency.

Online use: https://go.hyper.ai/hyzLx

5. DeepCrack Infrastructure Crack Detection Dataset

DeepCrack is a benchmark dataset for infrastructure crack detection provided by the Computer Vision and Remote Sensing Laboratory of Wuhan University. It aims to provide standardized and high-precision supervised learning data support for crack detection algorithm research. It can be directly used for training and evaluation of deep learning models such as U-Net, DeepLab, and SegNet, and is widely used in research directions such as structural health monitoring, road inspection, and building defect identification.

Online use: https://go.hyper.ai/88zlH
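Segmentation models such as those named in the DeepCrack entry above are commonly scored with overlap metrics. The sketch below computes intersection over union (IoU) and F1 for a predicted crack mask against a ground-truth mask. It assumes pixel-level binary masks stored as nested lists of 0 and 1, which is an assumption made for this example; the function name `crack_metrics` is also hypothetical and not part of any DeepCrack tooling.

```python
def crack_metrics(pred, truth):
    """Return (IoU, F1) for two binary masks of identical shape."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # crack pixel correctly predicted
            elif p and not t:
                fp += 1  # background predicted as crack
            elif t and not p:
                fn += 1  # crack pixel missed
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


# Toy 3x3 masks: a vertical crack, with one false alarm and one miss.
truth = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
pred = [[0, 1, 0], [0, 1, 1], [0, 0, 0]]
iou, f1 = crack_metrics(pred, truth)
print(iou, round(f1, 4))  # 0.5 0.6667
```

Cracks occupy only a small share of each image, so overlap metrics like these are usually more informative than plain pixel accuracy.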

6. World Air Pollution and AQI Dataset

The World Air Pollution and AQI dataset is a global air quality dataset for research and data analysis. It contains monthly city-level observations from 2014 to 2025, totaling 331,920 records and covering 24 countries across 5 continents, including China, the United States, the United Kingdom, France, Germany, Japan, and South Korea. Each record has 24 features, spanning air pollutant concentrations, air quality index, meteorological variables, and social and environmental indicators.

Online use: https://go.hyper.ai/QL8VK
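The record count quoted for the World Air Pollution and AQI dataset can be sanity-checked with simple arithmetic. The sketch below assumes that the data spans every month from January 2014 through December 2025 and that each city contributes exactly one record per month. Neither assumption is stated in the description, so the implied number of cities is an estimate rather than a documented figure.

```python
records = 331_920
months = (2025 - 2014 + 1) * 12  # 12 years of monthly data

# If every city has one record per month, the record count should
# divide evenly by the number of months.
cities, remainder = divmod(records, months)
print(months, cities, remainder)  # 144 2305 0
```

Under these assumptions the count divides evenly, which is consistent with a complete monthly panel of about 2,305 cities.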

Selected Public Tutorials

1. MiniCPM5-1B: High-efficiency 1B LLM for edge-side applications

MiniCPM5-1B is the first model in the MiniCPM5 series released by the OpenBMB team. It is designed for edge deployment and resource-constrained scenarios. It adopts a dense Transformer architecture with 1B parameters and achieves state-of-the-art performance among open-source models of the same size. It is particularly strong at agentic tool calls, code generation, and challenging reasoning tasks.

Run online: https://go.hyper.ai/OBlhv

2. HiDream-O1-Image Image Generation System

HiDream-O1-Image is a native unified image generation foundation model, launched by the HiDream.ai team in 2026. The model is built on a pixel-level unified Transformer (UiT) architecture. Unlike traditional models, it does not rely on external VAEs or separate text encoders, but instead natively encodes pixels and text within a single, shared token space.

Run online: https://go.hyper.ai/XkyGK

3. X2SAM: A unified model for arbitrary image and video segmentation

X2SAM, released in April 2026 by a team from Sun Yat-sen University, Pengcheng Laboratory, and Meituan, is a multimodal large model for unified image and video segmentation. Its core feature is that it unifies text prompts, visual prompts, and image/video segmentation into a single interactive process.

Run online: https://go.hyper.ai/OAndb

4. LocateAnything-3B: A fast, high-quality visual language localization model

Released by NVIDIA in 2026, LocateAnything-3B is a 3B-parameter visual language localization model in the Eagle VLM series, designed for tasks such as open object detection, point expression localization, OCR text localization, GUI element localization, and pointing in images and videos. The core feature of this model is Parallel Box Decoding: it predicts complete bounding box coordinates as structured blocks in parallel, rather than generating coordinates through token-by-token autoregression, thereby improving localization throughput while maintaining geometric consistency.

Run online: https://go.hyper.ai/DxUFC
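To see why decoding a whole box at once can raise throughput, as described for LocateAnything-3B above, the toy cost model below counts sequential decoding steps under two schemes. It is not NVIDIA's implementation. It assumes four coordinates per box, one token per coordinate, and one sequential step per parallel block; the function names are hypothetical.

```python
def autoregressive_steps(num_boxes: int, tokens_per_box: int = 4) -> int:
    """Sequential steps when each coordinate token is generated one at a time."""
    return num_boxes * tokens_per_box


def parallel_block_steps(num_boxes: int) -> int:
    """Sequential steps when each box's coordinates are emitted together as one block."""
    return num_boxes


boxes = 50
print(autoregressive_steps(boxes))  # 200
print(parallel_block_steps(boxes))  # 50
print(autoregressive_steps(boxes) / parallel_block_steps(boxes))  # 4.0
```

Real speedups depend on model and hardware details that this count ignores. It only shows that the number of sequential steps shrinks by the block size.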

5. Granite 4.1 8B: Supports dialogue, coding, RAG, and tool calls

Granite 4.1 language models are a new generation of open-source foundation models launched by IBM in 2026, with dense decoder architectures at three scales: 3B, 8B, and 30B. Granite 4.1 8B, the high-performance version in this series, delivers the performance required for enterprise applications while keeping a lightweight parameter scale. The model natively supports multilingual use, a wide range of coding tasks, Retrieval-Augmented Generation (RAG), tool usage, and structured JSON output.

Run online: https://go.hyper.ai/Fpzl7

Community Article Interpretation

1. The National University of Singapore proposes an AI-computational chemistry collaborative process to accelerate the repositioning of drugs for diabetic wound healing, reducing the R&D cycle by over 70%!

A research team at the National University of Singapore has proposed a collaborative computational nanomedicine research process that combines artificial intelligence and computational chemistry (AI-CC). This process deeply couples literature mining driven by large language models (qualitative insight) with multi-stage molecular simulation dominated by computational chemistry (quantitative verification), constructing a closed-loop research system for drug-protein nano-interactions and accelerating the repositioning and development of drugs for diabetic wound healing.

View the full report: https://go.hyper.ai/OXs3N

Related News

4-step Image output/4K quality/6x Speedup, PiD Uses Pixel Diffusion to Unify Decoding and super-resolution Output; SA-3DAO: a Dataset Containing 1000 Pairs of Real Images Paired With Handcrafted 3D Meshes by artists.

Tencent open-sources Hy-MT1.5 Translation Model: 440MB Achieves top-tier Translation Capabilities; MIT Jointly Releases MathNet: a Multimodal Mathematical Inference Benchmark Covering 27,000 Real Olympiad Math problems.

Can Emojis Control Speech Generation? Irodori-TTS Is a Japanese TTS Based on the RF-DiT Architecture; Eczema and Tinea Skin Disease Datasets: Supporting Medical Image Classification and Transfer learning.

ICML 26 Outstanding Papers: Tsinghua JustGRPO Overcomes the dLLM Inference Bottleneck; Say Goodbye to Simple Instruction Tests: Agents Last Exam Comprehensively Evaluates the long-range Professional Capabilities of Intelligent agents.

Extremely Lightweight, yet With Undiminished Image Quality! ERNIE-Image-Turbo: Say Goodbye to Long Waits, lightning-fast Speed; Introducing dual-dimensional Metrics of Perception and Cognition: Alibaba's Unified Multimodal Parsing and Evaluation Dataset OmniParsingBench Is Now online.

Google Releases TabFM-1.0.0-PyTorch: a zero-shot Prediction Model Designed for Mixed Tabular Data; NVIDIA open-sources Multinational Synthetic Character Dataset, With Tens of Millions of Characters available.

Dataset Summary | NVIDIA Open Sources Nemotron Datasets: Over 10TB of Tokens + 40M Training Samples, Covering Mathematical Reasoning, Code Generation, and Multilingual dialogue.

MIT/IBM Has Released ChartNet, the Largest Synthetic Chart Dataset to Date, Generating 1.5 Million Diverse Chart samples.

Supports live-action/animation/animal-driven Video Generation; Meituan's open-source multi-style audio-driven Video Generation Framework LongCat 1.5 Enhances VLM's Chart Reconstruction and Table Extraction Capabilities Using the million-level Chart Understanding Dataset ChartNet.
