Mapping Memes to Words for Multimodal Hateful Meme Classification
Giovanni Burbi, Alberto Baldrati, Lorenzo Agnolucci, Marco Bertini, Alberto Del Bimbo

Abstract
Multimodal image-text memes are prevalent on the internet, serving as a unique form of communication that combines visual and textual elements to convey humor, ideas, or emotions. However, some memes take a malicious turn, promoting hateful content and perpetuating discrimination. Detecting hateful memes within this multimodal context is a challenging task that requires understanding the intertwined meaning of text and images. In this work, we address this issue by proposing a novel approach named ISSUES for multimodal hateful meme classification. ISSUES leverages a pre-trained CLIP vision-language model and the textual inversion technique to effectively capture the multimodal semantic content of the memes. The experiments show that our method achieves state-of-the-art results on the Hateful Memes Challenge and HarMeme datasets. The code and the pre-trained models are publicly available at https://github.com/miccunifi/ISSUES.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| hateful-meme-classification-on-harmeme | ISSUES | AUROC: 92.83; Accuracy: 81.64 |
| meme-classification-on-hateful-memes | ISSUES | ROC-AUC: 0.855 |
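
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how AUROC and accuracy can be computed for a binary hateful/not-hateful classifier. It is not taken from the ISSUES code base: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUROC is computed here in its pairwise form, as the fraction of (hateful, non-hateful) pairs in which the hateful example receives the higher score, and is returned on a 0 to 1 scale. The HarMeme row appears to report the same quantity on a 0 to 100 scale.

```python
def auroc(labels, scores):
    """Pairwise AUROC: probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at the given threshold (assumed 0.5)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 1 = hateful, 0 = not hateful.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.4]

print(f"AUROC:    {auroc(labels, scores):.4f}")
print(f"Accuracy: {accuracy(labels, scores):.4f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration. For a full test set, a rank-based implementation such as `sklearn.metrics.roc_auc_score` is the more practical choice.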