HyperAI

Main

GPU

Console
Docs
Pricing

Pulse

News

Resources

Papers
Notebooks
Datasets
Wiki

Benchmarks

SOTA
LLM Models
GPU Leaderboard

Community

Events

Utility

About Terms of Service Privacy Policy
English


Papers

Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Build the Future of Artificial Intelligence

About

About Us Dataset Help

Products

News Papers Notebooks Datasets Wiki

Links

© HyperAI

GitHub Discord X (formerly Twitter)

OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data

Video Generation

Diffusion Model

Jiwen Liu, Shujuan Li, Zhixue Fang, et al.

InterleaveThinker: Reinforcing Agentic Interleaved Generation

Image Generation

Dian Zheng, Harry Lee Manyuan Zhang, Kaituo Feng, et al.

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

Supervised Fine-Tuning

Jiacheng Chen, Xinyu Zhang, Shunkai Zhang, et al.

SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning

Seokju Cho, Ryo Hachiuma, Abhishek Badki, et al.

WEAVEBENCH: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces

Wanli Li, Bowen Zhou, Yunyao Yu, et al.

MiniMax Sparse Attention

Xunhao Lai, Weiqi Xu, Yufeng Yang, et al.

EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments

Jundong Xu, Qingchuan Li, Jiaying Wu, et al.

Flex4DHuman: Flexible Multi-view Video Diffusion for 4D Human Reconstruction

Diffusion Model

Video Generation

Jen-Hao Cheng, Yipeng Wang, Hao Zhang, et al.

Modality Forcing for Scalable Spatial Generation

Diffusion Model

Image Generation

Bardienus Pieter Duisterhof, Deva Ramanan, Jeffrey Ichnowski, et al.

From AGI to ASI

Artificial Intelligence

Tim Genewein, Matija Franklin, Alexander Lerchner, et al.

World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible

Diffusion Model

Hao Zhang, Mohamed El Banani, Jen-Hao Cheng, et al.

Regularized f-Divergence Kernel Tests

Mónica Ribero, Antonin Schrab, Arthur Gretton

Pretraining Recurrent Networks without Recurrence

Trajectory-Refined Distillation

Reinforcement Learning

Li Jiang, Haoran Xu, Yichuan Ding, et al.

MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism

Video Understanding

Cong Chen, Guo Gan, Kaixiang Ji, et al.

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Pu Ning, Quan Chen, Kun Tao, et al.

Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts

Wenbo Pan, Shujie Liu, Chin-Yew Lin, et al.

Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution

Xucong Wang, Ziyu Ma, Shidong Yang, et al.

ABot-Earth 0.5: Generative 3D Earth Model

Ming Qian, Tianjian Ouyang, Mingchao Sun, et al.

Kwai Keye-VL-2.0 Technical Report

Video Understanding

Kwai Keye Team, Bin Wen, Changyi Liu, et al.

TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis

Multimodal Representation

Zhengpeng Feng, Clement Atzberger, Sadiq Jaffer, et al.

If LLMs have human-like attributes, then so does Age of Empires II

Adrian de Wynter

The Last Human-Written Paper: Agent-Native Research Artifacts

Jiachen Liu, Jiaxin Pei, Jintao Huang, et al.

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

Yan Wang, Qifan Zhang, Jiachen Yu, et al.

LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents

Aofan Yu, Chenyu Zhou, Tianyi Xu, et al.

CoVEBench: Can Video Editing Models Handle Complex Instructions?

Video Generation

Jiangtao Wu, Jiaming Wang, Yiwen He, et al.

Latent Spatial Memory for Video World Models

Video Generation

Diffusion Model

Weijie Wang, Haoyu Zhao, Yifan Yang, et al.

On the Geometry of On-Policy Distillation

Zhennan Shen, Yanshu Li, Qingyu Yin, et al.

SWE-Explore: Benchmarking How Coding Agents Explore Repositories

Code Generation

Shaoqiu Zhang, Yuhang Wang, Jialiang Liang, et al.

VoxCPM2 Technical Report

LongCat-Video-Avatar 1.5 Technical Report

Diffusion Model

Video Generation

Meituan LongCat Team

ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding

Visual Question Answering

Jovana Kondic, Pengyuan Li, Dhiraj Joshi, et al.

LongCat-Video-Avatar 1.5 Technical Report
