TextDetox — Multilingual Text Detoxification

01 — About

Detoxifying Language, One Sentence at a Time

Text detoxification is a Text Style Transfer (TST) task: given a toxic sentence, produce a rewritten version that is non-toxic, meaning-preserving, and fluent. Our research line, which began with parallel corpora for English and Russian, has grown into the largest multilingual detoxification ecosystem, covering 15 languages and hosting two international shared tasks at CLEF 2024 and 2025.

☠ Toxic Input

"What the hell are you doing, you absolute idiot? Get out of my sight!"

→

✦ Detoxified Output

"What are you doing? Please leave."

15

Languages covered — English, Russian, Ukrainian, German, French, Spanish, Italian, Chinese, Japanese, Arabic, Hebrew, Hindi, Hinglish, Tatar, Amharic

3.6k+

Parallel toxic↔neutral sentence pairs in the multilingual dataset, crowd-sourced and human-validated

11

Open-source models on HuggingFace — classifiers, detoxification baselines, and LLM-based evaluators

3

International shared tasks: RUSSE 2022, TextDetox CLEF 2024 & 2025 — with teams from 20+ countries

Supported Languages

🇬🇧 English · 🇷🇺 Russian · 🇺🇦 Ukrainian · 🇩🇪 German · 🇪🇸 Spanish · 🇫🇷 French · 🇮🇹 Italian · 🇨🇳 Chinese · 🇯🇵 Japanese · 🇸🇦 Arabic · 🇮🇱 Hebrew · 🇮🇳 Hindi · 🇮🇳 Hinglish · Tatar · 🇪🇹 Amharic

02 — Shared Tasks

Benchmarking the Community

2025

TextDetox @ CLEF 2025

The second edition of the multilingual detoxification shared task at PAN/CLEF. Extended coverage, updated parallel corpora, and new LLM-based evaluation protocols. Teams compete across 9 languages with both automatic and human judgments.

→ Task Website → Starter Kit 🤗

2024

TextDetox @ CLEF 2024

First edition of the multilingual detoxification shared task at PAN 2024. Attracted international participants working on English, Russian, Ukrainian, German, Spanish, Chinese, Arabic, Hindi, and Amharic. Benchmark included crowdsourced parallel corpora and human evaluation.

→ Task Website → Overview Paper

2022

RUSSE Detoxification 2022

The first text detoxification shared task in Russian, held at the Dialogue 2022 conference. Featured the first parallel Russian detoxification corpus and manual human evaluation. Pioneered the crowdsourcing methodology later extended to 14 languages.

→ GitHub Repository

03 — Datasets

Parallel Corpora for Detoxification

Multilingual · 15 Languages

MultiParaDetox

The flagship multilingual parallel corpus with toxic↔neutral pairs for 15 languages. Collected via a crowdsourcing pipeline that extends the original ParaDetox methodology.

3.6k+ pairs · 15 languages · Parallel

→ 🤗 textdetox/multilingual_paradetox
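A minimal sketch of how the parallel pairs might be loaded with the Hugging Face `datasets` library. The per-language split name (`"en"`) and the column names `toxic_sentence` and `neutral_sentence` are assumptions to check against the dataset card; `to_pairs` and `load_pairs` are hypothetical helper names.

```python
from typing import Iterable, List, Mapping, Tuple


def to_pairs(
    rows: Iterable[Mapping[str, str]],
    toxic_key: str = "toxic_sentence",      # assumed column name
    neutral_key: str = "neutral_sentence",  # assumed column name
) -> List[Tuple[str, str]]:
    """Turn dataset rows into (toxic, neutral) tuples."""
    return [(row[toxic_key], row[neutral_key]) for row in rows]


def load_pairs(
    lang: str = "en",  # assumed: one split per language code
    repo: str = "textdetox/multilingual_paradetox",
) -> List[Tuple[str, str]]:
    """Download one language split and return its parallel pairs."""
    # Imported lazily so the helper above works without the library installed.
    from datasets import load_dataset  # pip install datasets

    return to_pairs(load_dataset(repo, split=lang))


# Offline demonstration with in-memory rows shaped like the assumed schema.
demo_pairs = to_pairs(
    [{"toxic_sentence": "toxic text", "neutral_sentence": "neutral text"}]
)
```

Calling `load_pairs("en")` requires network access; the key arguments of `to_pairs` can be changed if the actual column names differ.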

Multilingual · Test Benchmark

MultiParaDetox Test Set

Official held-out test benchmark used across CLEF 2024 & 2025 shared tasks. Human-validated gold references for 9+ languages.

9k samples · Multi-reference

→ 🤗 textdetox/multilingual_paradetox_test

English

ParaDetox (EN)

The first parallel English detoxification dataset. 10,000+ toxic sentences paired with human-written neutral paraphrases. Introduced at ACL 2022 — the foundation that started it all.

10k+ pairs · English · ACL 2022

→ 🤗 s-nlp/paradetox → GitHub

Russian

Russian ParaDetox

The first parallel Russian detoxification corpus, enabling the RUSSE 2022 shared task and the first seq2seq detoxification models for Russian.

Russian · Parallel · RUSSE 2022

→ 🤗 s-nlp/ru_paradetox

Multilingual · Toxicity

Multilingual Toxicity Dataset

Large monolingual toxicity classification dataset across multiple languages. Used for training and fine-tuning the multilingual classifiers.

71.4k samples · Classification

→ 🤗 textdetox/multilingual_toxicity_dataset

Multilingual · XAI

Multilingual Toxic Spans & Lexicon

Fine-grained explainability resources: span-level toxic word annotations and a multilingual lexicon of toxic terms across 14 languages — enabling the explainable detox pipeline.

Toxic Spans: 8.79k · Lexicon: 176k

→ Toxic Spans → Toxic Lexicon
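To illustrate how a lexicon and span-level annotations relate, the sketch below marks character spans of words found in a toxic-term lexicon. This is a simple lookup for illustration only, not the annotation procedure behind the datasets; `find_toxic_spans` and the two-word toy lexicon are hypothetical.

```python
import re
from typing import List, Set, Tuple


def find_toxic_spans(text: str, lexicon: Set[str]) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of words present in the lexicon."""
    spans = []
    for match in re.finditer(r"\w+", text):
        # Case-insensitive lookup of each word token in the lexicon.
        if match.group().lower() in lexicon:
            spans.append((match.start(), match.end()))
    return spans


# Toy lexicon (illustration only; the real lexicon is far larger).
toy_lexicon = {"idiot", "hell"}
text = "What the hell are you doing, you absolute idiot?"
spans = find_toxic_spans(text, toy_lexicon)
flagged = [text[start:end] for start, end in spans]
```

Offsets follow Python slicing conventions, so `text[start:end]` recovers each flagged word.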

Multilingual · XAI

Multilingual Toxicity Explained

Human-annotated dataset providing natural language explanations for why a sentence is considered toxic — the first such resource for multiple languages.

8.24k samples · Explanations

→ 🤗 textdetox/multilingual_toxicity_explained

Spanish

ES ParaDetox

Parallel Spanish detoxification corpus collected via the MultiParaDetox pipeline, part of the multilingual expansion effort.

Spanish · Parallel

→ 🤗 textdetox/es_paradetox

04 — Models

Open-Source Model Zoo

Model	Task	Size	Downloads	Link
xlmr-large-toxicity-classifier	Classification	0.3B	1.3k	🤗 HF
xlmr-large-toxicity-classifier-v2	Classification	0.6B	934	🤗 HF
roberta_toxicity_classifier	Classification	0.3B	49k	🤗 HF
russian_toxicity_classifier	Classification	0.3B	7k	🤗 HF
bert-multilingual-toxicity-classifier	Classification	0.2B	591	🤗 HF
glot500-toxicity-classifier	Classification	0.4B	689	🤗 HF
twitter-xlmr-toxicity-classifier	Classification	0.6B	107	🤗 HF
bart-base-detox	Generation	0.1B	615	🤗 HF
ruT5-base-detox	Generation	0.2B	7	🤗 HF
mbart-detox-baseline	Generation	0.6B	9	🤗 HF
mt5-xl-detox-baseline	Generation	4B	8	🤗 HF
Llama-pairwise-toxicity-evaluator	Evaluation	8B	5	🤗 HF
Llama-pairwise-content-evaluator	Evaluation	8B	—	🤗 HF

05 — Evaluation

How We Measure Quality

🎯

Style Transfer Accuracy

STA

Measures whether the output text is non-toxic. Computed via multilingual toxicity classifiers (XLM-R based). Range: 0 → 1.
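As a rough illustration of how STA can be aggregated over a batch, the sketch below computes the fraction of outputs that a classifier judges non-toxic. The `toxic_prob` callable, the 0.5 threshold, and the keyword-based toy classifier are assumptions for illustration; in practice an XLM-R based classifier would supply the toxicity probabilities.

```python
from typing import Callable, Sequence


def style_transfer_accuracy(
    outputs: Sequence[str],
    toxic_prob: Callable[[str], float],
    threshold: float = 0.5,  # assumed decision threshold
) -> float:
    """Fraction of outputs whose toxicity probability is below the threshold."""
    if not outputs:
        raise ValueError("outputs must not be empty")
    non_toxic = sum(1 for text in outputs if toxic_prob(text) < threshold)
    return non_toxic / len(outputs)


def toy_toxic_prob(text: str) -> float:
    """Toy stand-in for a real classifier (hypothetical, illustration only)."""
    return 0.9 if "idiot" in text.lower() else 0.1


sta = style_transfer_accuracy(
    ["What are you doing? Please leave.", "You idiot, leave."],
    toy_toxic_prob,
)
```

Here one of the two outputs is flagged by the toy classifier, so `sta` lands in the middle of the 0 to 1 range.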

📐

Meaning Preservation

SIM

Semantic similarity between the toxic input and the detoxified output, computed using multilingual sentence encoders. A high score indicates that the original content was preserved.
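One common way to compare sentence embeddings is cosine similarity. The sketch below assumes that measure and uses made-up three-dimensional vectors in place of real encoder outputs; the source does not specify which similarity function is applied.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors in this sketch
    return dot / (norm_u * norm_v)


# Made-up embeddings standing in for encoder outputs (illustration only).
toxic_vec = [0.2, 0.7, 0.1]
detox_vec = [0.25, 0.65, 0.05]
sim = cosine_similarity(toxic_vec, detox_vec)
```

Vectors pointing in nearly the same direction, as in this toy pair, give a value close to 1.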

✍️

Fluency

FL

Measures grammaticality and naturalness. Computed as inverse perplexity from a language model. High FL = natural-sounding output.
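One way to read "inverse perplexity" is sketched below: perplexity is the exponential of the average negative log-probability per token, and FL is its reciprocal. The per-token log-probabilities are assumed to come from some language model, and the function names are hypothetical.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)


def fluency(token_logprobs: Sequence[float]) -> float:
    """Inverse perplexity; 1.0 only when every token has probability 1."""
    return 1.0 / perplexity(token_logprobs)


# Four tokens, each assigned probability 0.5 by a hypothetical model.
example_fl = fluency([math.log(0.5)] * 4)
```

With every token at probability 0.5, the perplexity is 2 and the fluency score is 0.5.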

J = STA × SIM × FL

Joint Score: the primary ranking metric across all shared task editions
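The joint score can be sketched as follows. This version multiplies the three scores per sentence and then averages over the corpus; that aggregation order is an assumption, since the formula above does not say whether the product is taken per sentence or over corpus-level means.

```python
from typing import Sequence


def joint_score(
    sta: Sequence[float],
    sim: Sequence[float],
    fl: Sequence[float],
) -> float:
    """Mean over sentences of the per-sentence product STA * SIM * FL."""
    if not sta or not (len(sta) == len(sim) == len(fl)):
        raise ValueError("score lists must be non-empty and equally long")
    products = [a * s * f for a, s, f in zip(sta, sim, fl)]
    return sum(products) / len(products)


# Three hypothetical outputs; the second is still toxic, so it contributes 0.
j = joint_score(
    sta=[1.0, 0.0, 1.0],
    sim=[0.9, 0.95, 0.8],
    fl=[1.0, 1.0, 0.5],
)
```

Because the scores are multiplied, an output that fails on any one dimension contributes little or nothing to J, regardless of how well it does on the other two.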

LLM-as-Judge Evaluation (2025)

TextDetox 2025 introduces pairwise LLM-based evaluation using fine-tuned Llama 3 8B models: one judging toxicity style transfer, another judging content preservation — providing human-like assessment at scale. Models are available open-source at textdetox/Llama-pairwise-toxicity-evaluator.

06 — Publications

Research Papers

CLEF 2025

Overview of the Multilingual Text Detoxification Task at PAN 2025

Daryna Dementieva, Vitaly Protasov, Nikolay Babakov, Naquee Rizwan, Ilseyar Alimova, Caroline Brun, Vasily Konovalov, Arianna Muti, Chaya Liebeskind, Marina Litvak, Debora Nozza, Shehryaar Shah Khan, Sotaro Takeshita, Natalia Vanetik, Abinew Ali Ayele, Florian Schneider, Xintong Wang, Seid Muhie Yimam, Ashraf Elnagar, Animesh Mukherjee, and Alexander Panchenko

The 2025 edition of the shared task, with new languages: French, Italian, Hebrew, Hinglish, Japanese, and Tatar.

PDF

COLING 2025

Multilingual and Explainable Text Detoxification with Parallel Corpora

Daryna Dementieva, Nikolay Babakov, Amit Ronen, Abinew Ali Ayele, Naquee Rizwan, Florian Schneider, Xintong Wang, Seid Muhie Yimam, Daniil Moskovskiy, Elisei Stakovskii, Eran Kaufman, Ashraf Elnagar, Animesh Mukherjee, Alexander Panchenko

Extended multilingual corpora (DE, ZH, AR, HI, AM) + explainable analysis + Chain-of-Thought detox prompting

PDF / ACL

CLEF Working Notes 2024

Overview of the Multilingual Text Detoxification Task at PAN 2024

Daryna Dementieva, Daniil Moskovskiy, Nikolay Babakov, Abinew Ali Ayele, Naquee Rizwan, Florian Schneider, Xintong Wang, Seid Muhie Yimam, Dmitry Ustalov, Elisei Stakovskii, Alisa Smirnova, Ashraf Elnagar, Animesh Mukherjee, Alexander Panchenko

Comprehensive overview of TextDetox CLEF2024: participants, systems, results, and findings

PDF

NAACL 2024

MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages

Daryna Dementieva, Nikolay Babakov, Alexander Panchenko

First automated multilingual parallel corpus collection pipeline; state-of-the-art detox in 9 languages

PDF / ACL

IJCNLP-AACL 2023

Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification

Daryna Dementieva, Daniil Moskovskiy, David Dale, Alexander Panchenko

First study of simultaneous translation + detoxification; new automatic evaluation metrics with higher human correlation

PDF

ACL 2022

ParaDetox: Detoxification with Parallel Data

Varvara Logacheva*, Daryna Dementieva*, Sergey Ustyantsev, Daniil Moskovskiy, David Dale, Irina Krotova, Nikita Semenov, Alexander Panchenko (* equal contribution)

First parallel English detoxification dataset. 10k+ crowdsourced pairs. 98+ citations. The paper that started the field.

PDF / ACL

ACL Student Workshop 2022

Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models

Daniil Moskovskiy, Daryna Dementieva, Alexander Panchenko

First investigation of multilingual and cross-lingual detoxification behavior in large pretrained models

PDF

Dialogue 2022

RUSSE-2022: Findings of the First Russian Detoxification Shared Task Based on Parallel Corpora

Daryna Dementieva, Varvara Logacheva, Irina Nikishina, Alena Fenogenova, David Dale, Irina Krotova, Nikita Semenov, Tatiana Shavrina, Alexander Panchenko

First Russian detox shared task; analysis of automatic vs. human evaluation; crowdsourcing pipeline methodology

GitHub

Multimodal Technologies & Interaction 2021

Methods for Detoxification of Texts for the Russian Language

Daryna Dementieva, Daniil Moskovskiy, Varvara Logacheva, David Dale, Olga Kozlova, Nikita Semenov, Alexander Panchenko

First study of automatic Russian text detoxification using BERT-based editing and GPT-2 seq2seq approaches. 81+ citations.

PDF