Zen3 Guard

Zen3 safety moderation model for multilingual content classification and filtering.

Overview

Zen Guard models provide multilingual content safety classification across 9 safety categories and 119 languages, assigning each input one of three severity tiers: Safe, Controversial, or Unsafe.

Developed by Hanzo AI and the Zoo Labs Foundation.
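The three severity tiers map naturally onto filtering decisions. A minimal sketch of such a policy follows; the allow/review/block mapping is an illustrative assumption, not part of the model:

```python
# Hypothetical moderation policy built on the three Zen3 Guard tiers.
# The allow/review/block mapping below is an assumption for illustration.
ACTIONS = {
    "Safe": "allow",
    "Controversial": "review",  # route borderline content to human review
    "Unsafe": "block",
}

def moderate(label):
    """Map a Zen3 Guard severity label to a filtering action."""
    # Unknown or unparsed labels fall back to human review.
    return ACTIONS.get(label, "review")
```

In practice you would feed `moderate` the label parsed from the model's response, as shown in the Quick Start below.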

Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import re

model_id = "zenlm/zen3-guard"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

def classify_safety(content):
    """Parse the guard model's response into a severity label and category list."""
    safe_pattern = r"Safety: (Safe|Unsafe|Controversial)"
    category_pattern = r"(Violent|Non-violent Illegal Acts|Sexual Content|PII|Suicide & Self-Harm|Unethical Acts|Politically Sensitive|Copyright Violation|Jailbreak|None)"
    safe_match = re.search(safe_pattern, content)
    label = safe_match.group(1) if safe_match else None
    categories = re.findall(category_pattern, content)
    return label, categories

# Classify a user message
messages = [{"role": "user", "content": "How do I learn programming?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt
result = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
label, categories = classify_safety(result)
print(f"Safety: {label}, Categories: {categories}")
```
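The parser can be exercised without loading the model. The snippet below restates the same parsing logic so it runs standalone; the example response string is an assumption about the output format implied by the regexes, not a captured model output:

```python
import re

def classify_safety(content):
    # Same parsing logic as the Quick Start above, repeated here so this
    # snippet runs standalone without downloading the model.
    safe_pattern = r"Safety: (Safe|Unsafe|Controversial)"
    category_pattern = r"(Violent|Non-violent Illegal Acts|Sexual Content|PII|Suicide & Self-Harm|Unethical Acts|Politically Sensitive|Copyright Violation|Jailbreak|None)"
    safe_match = re.search(safe_pattern, content)
    label = safe_match.group(1) if safe_match else None
    categories = re.findall(category_pattern, content)
    return label, categories

# Hypothetical guard response; the exact format is an assumption.
example = "Safety: Unsafe\nCategories: Violent, Jailbreak"
label, categories = classify_safety(example)
print(label, categories)
```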

Model Details

| Attribute    | Value       |
|--------------|-------------|
| Parameters   | 8B          |
| Architecture | Zen MoDE    |
| Context      | 32K tokens  |
| Languages    | 119         |
| License      | Apache 2.0  |

License

Apache 2.0
