Bielik Guard: Efficient Polish Language Safety Classifiers for LLM Content Moderation
Intermediate | Krzysztof Wróbel, Jan Maria Kowalski et al. | Feb 8 | arXiv
Bielik Guard is a pair of small but strong Polish-language safety classifiers that check text for five categories of risky content: hate/aggression, vulgar language, sexual content, crime, and self-harm.
#Polish NLP #content moderation #safety classifier
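
To make the five-category setup concrete, here is a minimal sketch of how a moderation layer might turn per-category scores into flags. It assumes the classifier returns an independent score between 0 and 1 for each category, which the summary does not state. The short category keys, the `flag_categories` helper, the 0.5 threshold, and the example scores are all hypothetical and are not taken from Bielik Guard itself.

```python
from typing import Dict, List

# The five categories named in the summary; the short keys are hypothetical labels.
CATEGORIES = ["hate", "vulgar", "sex", "crime", "self-harm"]


def flag_categories(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the categories whose score meets the threshold, in CATEGORIES order.

    Assumes one independent score in [0, 1] per category (a multi-label setup).
    """
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return [c for c in CATEGORIES if scores[c] >= threshold]


# Made-up scores for illustration only; not output from any real model.
example_scores = {
    "hate": 0.91,
    "vulgar": 0.62,
    "sex": 0.03,
    "crime": 0.10,
    "self-harm": 0.01,
}
print(flag_categories(example_scores))
```

In practice a deployment would probably tune a separate threshold per category, since the cost of a missed self-harm message differs from the cost of a missed vulgarity.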