Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Budget-Aware Tool-Use Enables Effective Agent Scaling

In-Video Instructions: Visual Signals as Generative Control































DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
AICC: Parse HTML Finer, Make Models Better -- A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser
UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
Computer-Use Agents as Judges for Generative User Interface
AutoEnv: Automated Environments for Measuring Cross-Environment Agent Learning
General Agentic Memory Via Deep Research
VIRAL: Visual Sim-to-Real at Scale for Humanoid Loco-Manipulation
MIST: Mutual Information Via Supervised Training
Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
Flow Map Distillation Without Data
Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion
HunyuanOCR Technical Report
PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs
Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine
Solving Spatial Supersensing Without Spatial Supersensing
Parrot: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs
O-Mem: Omni Memory System for Personalized, Long Horizon Self-Evolving Agents
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story
SAM 3: Segment Anything with Concepts
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
HiPO: Hybrid Policy Optimization for Dynamic Reasoning in LLMs
SERES: Semantic-Aware Neural Reconstruction from Sparse Views
SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation
MultiPL-MoE: Multi-Programming-Lingual Extension of Large Language Models through Hybrid Mixture-of-Experts
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models