Papers
Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

World Action Models: The Next Frontier in Embodied AI

World Action Models are Zero-shot Policies
ResearchMath-14K: Scaling Research-Level Mathematics via Agents
Self-Improving Language Models with Bidirectional Evolutionary Search
From Pixels to Words -- Towards Native One-Vision Models at Scale
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning
ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players
AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery
Agent Harness Engineering: A Survey
D^2-Monitor: Dynamic Safety Monitoring for Diffusion LLMs via Hesitation-Aware Routing
Geometry-Aware Representation Denoising for Robust Multi-view 3D Reconstruction
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation
MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research
SpatialBench: Is Your Spatial Foundation Model an All-Round Player?
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding
Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini
Language Models Need Sleep
ECHO: Terminal Agents Learn World Models for Free
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction
Foundation Protocol: A Coordination Layer for Agentic Society
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation
Macaron-A2UI: A Model for Generative UI in Personal Agents
DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning
ViMU: Benchmarking Video Metaphorical Understanding
SMOL: Professionally translated parallel data for 115 under-represented languages
Chi-Bench: Can AI Agents Automate End-to-End, Long-Horizon, Policy-Rich Healthcare Workflows?
Combining On-Policy Optimization and Distillation for Long-Context Reasoning in Large Language Models
Through the Lens of Contrast: Self-Improving Visual Reasoning in VLMs