Papers

Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

© HyperAI

LEAP: Supercharging LLMs for Formal Mathematics with Agentic Frameworks

Text Generation

Po-Nien Kung, Linfeng Song, Dawsen Hwang, et al.

World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning

Visual Question Answering

Yucheng Zhou, Wei Tao, Yiwen Guo, et al.

From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain

Image Generation

Multimodal Representation

Yuval Golbari, Navve Wasserman, Matias Cosarinsky, et al.

A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

Reinforcement Learning

Lei Yang, Siyu Ding, Deyi Xiong

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

Object Tracking

Zekun Qi, Xuchuan Chen, Dairu Liu, et al.

Trust Region On-Policy Distillation

Text Generation

Xingrun Xing, Haoqing Wang, Boyan Gao, et al.

OCC-RAG: Optimal Cognitive Core for Faithful Question Answering

Retrieval-Augmented Generation

Intelligent Question Answering

Maksim Savkin, Mikhail Goncharov, Alexander Gambashidze, et al.

MAI-Thinking-1: Building a Hill-Climbing Machine

$VLM^3$: Vision Language Models Are Native 3D Learners

Depth Estimation

Zhipeng Cai, Zhuang Liu, Yunyang Xiong, et al.

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

Retrieval-Augmented Generation

Pengcheng Jiang, Zhiyi Shi, Kelly Hong, et al.

DeepCrack: A deep hierarchical feature learning architecture for crack segmentation

Semantic Segmentation

Image Segmentation

Yahui Liu, Lian Yao, Xiaohu Lu, et al.

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

Video Generation

Diffusion Model

Hidir Yesiltepe, Jiazhen Hu, Tuna Han Salih Meral, et al.

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Text Generation

Haodi Lei, Yafy Li, Haoran Zhang, et al.

K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts

Nahyun Lee, Dongkeun Yoon, Guijin Son, et al.

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

Tomer Keren, Nitay Calderon, Asaf Yehudai, et al.

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

Mind Lab, Song Cao, Vic Cao, et al.

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Image Generation

Haozhe Zhao, Shuzheng Si, Zhenhailong Wang, et al.

TACK: A statistical evaluation of degradation activity on a novel TArgeting Chimeras Knowledge dataset

Stefano Ribes, Nils Dunlop, Rocío Mercado

Narrative Weaver: Towards Controllable Long-Range Visual Consistency with Multi-Modal Conditioning

Video Generation

Zhengjian Yao, Yongzhi Li, Xinyuan Gao, et al.

Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents

Minhua Lin, Juncheng Wu, Zijun Wang, et al.

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Reinforcement Learning

Nianyi Lin, Jiajie Zhang, Lei Hou, et al.

Trust-Region Behavior Blending for On-Policy Distillation

Reinforcement Learning

Daniil Plyusov, Alexey Gorbatovski, Alexey Malakhov, et al.

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue

Ruiqi Li, Yu Zhang, Changhao Pan, et al.

Representation Forcing for Bottleneck-Free Unified Multimodal Models

Image Generation

Yuqing Wang, Zhijie Lin, Ceyuan Yang, et al.

GrepSeek: Training Search Agents for Direct Corpus Interaction

Alireza Salemi, Chang Zeng, Atharva Nijasure, et al.

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Tianyi Zhou, Dongrui Liu, Leitao Yuan, et al.

Agentic Systems as Boosting Weak Reasoning Models

Varun Sunkaraneni, Pierfrancesco Beneventano, Riccardo Neumarker, et al.

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

Video Generation

Diffusion Model

You-Zhe Xie, Yu-Hsuan Li, Jie-Ying Lee, et al.

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

Video Generation

Diffusion Model

Min Zhao, Hongzhou Zhu, Bokai Yan, et al.

CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation

Diffusion Model

Image Generation

Fangtai Wu, Hailong Guo, Shijie Huang, et al.

OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources

Retrieval-Augmented Generation

Intelligent Question Answering

Jinheon Baek, Soyeong Jeong, Sangwoo Park, et al.

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Qiuyue Wang, Mingsheng Li, Jian Guan, et al.
