Cache-to-Cache (C2C)
Cache-to-Cache (C2C) was proposed in October 2025 by a research team from Tsinghua University, the Chinese University of Hong Kong, and other institutions, including Wuwenxinqiong. The work was published in the paper "Cache-to-Cache: Direct Semantic Communication Between Large Language Models".
C2C is a direct semantic communication paradigm between LLMs: a neural network projects the source model's key-value (KV) cache into the target model's cache representation and fuses the two, transmitting semantics directly between models. Compared with text-based communication, C2C exploits the deep, specialized semantics of both models while avoiding explicit intermediate text generation, making it a practical alternative to token-based exchange and a promising approach for scalable, low-latency multi-LLM systems.
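The projection-and-fusion step described above can be sketched numerically. The following is a minimal illustration, not the paper's actual architecture: the function name `c2c_fuse`, the single linear projection, and the sigmoid gate are all simplifying assumptions standing in for the learned neural projector.

```python
import numpy as np

rng = np.random.default_rng(0)

def c2c_fuse(src_kv, tgt_kv, W_proj, gate_w):
    """Illustrative C2C-style cache fusion (hypothetical, simplified):
    project the source model's KV cache into the target model's
    dimensionality, then blend it into the target cache via a
    per-position gate."""
    projected = src_kv @ W_proj                       # (seq, d_tgt): source dims -> target dims
    gate = 1.0 / (1.0 + np.exp(-(tgt_kv @ gate_w)))  # (seq, 1): sigmoid gate in (0, 1)
    return gate * projected + (1.0 - gate) * tgt_kv  # fused cache in target space

# Toy dimensions: the two models need not share a hidden size.
seq_len, d_src, d_tgt = 8, 64, 48
src_kv = rng.standard_normal((seq_len, d_src))       # source model's cache slice
tgt_kv = rng.standard_normal((seq_len, d_tgt))       # target model's cache slice
W_proj = rng.standard_normal((d_src, d_tgt)) * 0.1   # stand-in for trained projection weights
gate_w = rng.standard_normal((d_tgt, 1)) * 0.1       # stand-in for trained gate weights

fused = c2c_fuse(src_kv, tgt_kv, W_proj, gate_w)
print(fused.shape)  # fused cache has the target model's shape
```

The key property the sketch captures is that the fused output lives in the target model's cache space, so the target can consume the source's semantics with no intermediate token generation.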
