A multilingual research initiative to automatically transform toxic language into neutral, fluent text — building safer digital spaces across 15 languages worldwide.
| Model | Task | Size | Downloads | Link |
|---|---|---|---|---|
| xlmr-large-toxicity-classifier | Classification | 0.3B | 1.3k | 🤗 HF |
| xlmr-large-toxicity-classifier-v2 | Classification | 0.6B | 934 | 🤗 HF |
| roberta_toxicity_classifier | Classification | 0.3B | 49k | 🤗 HF |
| russian_toxicity_classifier | Classification | 0.3B | 7k | 🤗 HF |
| bert-multilingual-toxicity-classifier | Classification | 0.2B | 591 | 🤗 HF |
| glot500-toxicity-classifier | Classification | 0.4B | 689 | 🤗 HF |
| twitter-xlmr-toxicity-classifier | Classification | 0.6B | 107 | 🤗 HF |
| bart-base-detox | Generation | 0.1B | 615 | 🤗 HF |
| ruT5-base-detox | Generation | 0.2B | 7 | 🤗 HF |
| mbart-detox-baseline | Generation | 0.6B | 9 | 🤗 HF |
| mt5-xl-detox-baseline | Generation | 4B | 8 | 🤗 HF |
| Llama-pairwise-toxicity-evaluator | Evaluation | 8B | 5 | 🤗 HF |
| Llama-pairwise-content-evaluator | Evaluation | 8B | — | 🤗 HF |
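
As a rough illustration of how one of the classifiers listed above might be loaded, here is a minimal sketch using the `transformers` text-classification pipeline. The table does not state the Hub namespace, so `textdetox` below is an assumption and may need to be replaced (some models may live under a different organization). The helper names `repo_id` and `classify` are hypothetical, and the label names returned depend on each model's own configuration.

```python
from typing import Dict, List

# Assumption: the Hub namespace is not given in the table above; "textdetox"
# is a placeholder and may need to be replaced with the real organization.
HUB_NAMESPACE = "textdetox"


def repo_id(model_name: str, namespace: str = HUB_NAMESPACE) -> str:
    """Build a Hub repository id of the form '<namespace>/<model_name>'."""
    return f"{namespace}/{model_name}"


def classify(
    texts: List[str],
    model_name: str = "xlmr-large-toxicity-classifier",
) -> List[Dict]:
    """Score texts with a classifier from the table (downloads the weights)."""
    # Imported lazily so repo_id() works even without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=repo_id(model_name))
    # For a list of inputs, the pipeline returns one dict per text,
    # each with a "label" and a "score" key.
    return classifier(texts)
```

Calling `classify(["some text"])` would download the model on first use. Check each model card for the meaning of its labels before interpreting the scores.
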
The TextDetox initiative is led by Daryna Dementieva (postdoctoral researcher at TU Munich), in collaboration with an international team of NLP researchers from TU Munich, Skoltech, IIT Kharagpur, University of Hamburg, UAE University, Bar-Ilan University, and others.

All contributors: Daryna Dementieva, Nikolay Babakov, Vitaly Protasov, Elisei Stakovskii, Debora Nozza, Caroline Brun, Chaya Liebeskind, Arianna Muti, Sotaro Takeshita, Alisa Smirnova, Daniil Moskovskiy, Naquee Rizwan, Florian Schneider, Xintong Wang, Seid Muhie Yimam, Abinew Ali Ayele, Dmitry Ustalov, Ashraf Elnagar, Animesh Mukherjee, Alexander Panchenko.
We are happy to extend our research to more languages, cultures, and dimensions! 🌍

Contact: Daryna Dementieva on Hugging Face · TUM Profile · ACL Anthology