Leonid Pugachev, Mikhail Burtsev

Abstract
Recent techniques for short text clustering often rely on word embeddings as a transfer learning component. This paper shows that sentence vector representations from Transformers, combined with different clustering methods, can be successfully applied to the task. Furthermore, we demonstrate that the algorithm for enhancing clustering via iterative classification can further improve initial clustering performance with different classifiers, including those based on pre-trained Transformer language models.
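To make the idea of enhancing a clustering through iterative classification more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that sentence vectors are already available (the toy 2-D points stand in for Transformer sentence representations), uses a nearest-centroid rule as a stand-in for the classifier, and selects training points by keeping the fraction of each cluster closest to its mean. The names `enhance_clustering`, `keep_fraction`, and `iterations` are hypothetical and chosen for illustration only.

```python
import math


def centroid(points):
    """Mean vector of a non-empty list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def enhance_clustering(vectors, labels, iterations=10, keep_fraction=0.7):
    """Refine cluster labels by repeatedly fitting a simple classifier on the
    most central points of each cluster and relabelling every point.

    Assumption: a nearest-centroid classifier replaces the stronger
    classifiers (e.g. Transformer-based ones) mentioned in the abstract.
    """
    labels = list(labels)
    for _ in range(iterations):
        # Group point indices by their current cluster label.
        clusters = {}
        for i, lab in enumerate(labels):
            clusters.setdefault(lab, []).append(i)

        # "Train": keep the points closest to each cluster mean and
        # refit the centroid on that trusted subset only.
        model = {}
        for lab, idx in clusters.items():
            mean = centroid([vectors[i] for i in idx])
            idx.sort(key=lambda i: math.dist(vectors[i], mean))
            kept = idx[:max(1, int(len(idx) * keep_fraction))]
            model[lab] = centroid([vectors[i] for i in kept])

        # "Predict": assign every point to the nearest refit centroid.
        new_labels = [
            min(model, key=lambda lab: math.dist(v, model[lab]))
            for v in vectors
        ]
        if new_labels == labels:  # stop once the labelling is stable
            break
        labels = new_labels
    return labels


# Toy data: two well-separated groups with one wrong initial label in each.
vectors = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1], [5.05, 5.05],
]
initial_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
enhanced = enhance_clustering(vectors, initial_labels)
print(enhanced)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

In this toy run the two mislabelled points end up far from their assigned cluster means, so they are excluded from the training subset and are then reassigned by the refit classifier. Real short-text embeddings are far less cleanly separated, so the selection rule and the classifier would both need to be chosen with more care.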
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| short-text-clustering-on-stackoverflow | Deep ECIC | Acc |