Unlike ShieldGemma, it accepts policy at inference time and makes judgments based on reasoning. This means when policy content changes, it can be immediately reflected without model retraining. Flexible and explainable, but slower and higher compute cost using CoT
Introducing gpt-oss-safeguard
New open safety reasoning models (120b and 20b) that support custom safety policies.
https://openai.com/index/introducing-gpt-oss-safeguard/

gpt-oss-safeguard - a openai Collection
gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built-upon gpt-oss
https://huggingface.co/collections/openai/gpt-oss-safeguard

Seonglae Cho