Cache-to-Cache (C2C)
Cache-to-Cache (C2C) was proposed in October 2025 by a research team from Tsinghua University, the Chinese University of Hong Kong, and other institutions, including Wuwenxinqiong. The work was published in the paper "Cache-to-Cache: Direct Semantic Communication Between Large Language Models".
C2C is a direct semantic communication paradigm between LLMs: a neural network projects the source model's key-value (KV) cache into the target model's cache representation and fuses the two, transmitting semantics directly between models. Compared with text-based communication, C2C exploits the deep, specialized semantics of both models while avoiding explicit intermediate text generation, making it a practical alternative to token-based exchange and a promising approach for scalable, low-latency multi-LLM systems.
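The projection-and-fusion step described above can be sketched numerically. The following is a minimal illustration, not the paper's actual architecture: the function name `c2c_fuse`, the single linear projection, and the sigmoid gate are all simplifying assumptions standing in for the learned neural projector.

```python
import numpy as np

rng = np.random.default_rng(0)

def c2c_fuse(src_kv, tgt_kv, W_proj, gate_w):
    """Illustrative C2C-style cache fusion (hypothetical, simplified):
    project the source model's KV cache into the target model's
    dimensionality, then blend it into the target cache via a
    per-position gate."""
    projected = src_kv @ W_proj                       # (seq, d_tgt): source dims -> target dims
    gate = 1.0 / (1.0 + np.exp(-(tgt_kv @ gate_w)))  # (seq, 1): sigmoid gate in (0, 1)
    return gate * projected + (1.0 - gate) * tgt_kv  # fused cache in target space

# Toy dimensions: the two models need not share a hidden size.
seq_len, d_src, d_tgt = 8, 64, 48
src_kv = rng.standard_normal((seq_len, d_src))       # source model's cache slice
tgt_kv = rng.standard_normal((seq_len, d_tgt))       # target model's cache slice
W_proj = rng.standard_normal((d_src, d_tgt)) * 0.1   # stand-in for trained projection weights
gate_w = rng.standard_normal((d_tgt, 1)) * 0.1       # stand-in for trained gate weights

fused = c2c_fuse(src_kv, tgt_kv, W_proj, gate_w)
print(fused.shape)  # fused cache has the target model's shape
```

The key property the sketch captures is that the fused output lives in the target model's cache space, so the target can consume the source's semantics with no intermediate token generation.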
