Papers (2)

#soft labels

Bielik Guard: Efficient Polish Language Safety Classifiers for LLM Content Moderation

Krzysztof Wróbel, Jan Maria Kowalski et al. · Feb 8 · arXiv

Bielik Guard is a pair of small but strong Polish language safety models that check text for five kinds of risky content: hate/aggression, vulgar language, sexual content, crime, and self-harm.

#Polish NLP, #content moderation, #safety classifier


Grounding and Enhancing Informativeness and Utility in Dataset Distillation

Intermediate

Shaobo Wang, Yantai Yang et al. · Jan 29 · arXiv

This paper tackles dataset distillation by giving a clear, math-backed way to keep only the most useful bits of data, so models can learn well from far fewer images.

#dataset distillation, #data condensation, #Shapley value
