Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

DTS: Enhancing Large Reasoning Models via Decoding Tree Sketching

Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs
DePass: Unified Feature Attributing by Simple Decomposed Forward Pass
COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence
From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture
EditThinker: Unlocking Iterative Reasoning for Any Image Editor
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
CARE-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson's Disease Gait Assessment
WenetSpeech-Chuan: A Large-Scale Sichuanese Corpus with Rich Annotation for Dialectal Speech Processing
PolypSense3D: A Multi-Source Benchmark Dataset for Depth-Aware Polyp Size Measurement in Endoscopy
PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
OmniSVG: A Unified Scalable Vector Graphics Generation Model
Algorithmic Thinking Theory
Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
VOccl3D: A Video Benchmark Dataset for 3D Human Pose and Shape Estimation under real Occlusions
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail
It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation
Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach
OneThinker: All-in-one Reasoning Model for Image and Video
ViDiC: Video Difference Captioning
PretrainZero: Reinforcement Active Pretraining