AI Hacking

Creator: Seonglae Cho
Created: 2023 Jul 10 12:23
Editor: Seonglae Cho
Edited: 2024 Dec 21 14:49
Refs: AI Security, AI Alignment
offsecml
5stars217 • Updated 2025 Mar 10 17:13
  • For training data
  • For inference
AI Hacking Methods
AI Red teaming
LLM extraction attack
AI Reward Hacking
AI Hackathon
AI Bias
 
 
 
 
PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news
We will show in this article how one can surgically modify an open-source model, GPT-J-6B, and upload it to Hugging Face to make it spread misinformation while being undetected by standard benchmarks.
https://blog.mithrilsecurity.io/poisongpt-how-we-hid-a-lobotomized-llm-on-hugging-face-to-spread-fake-news/

Copyright Seonglae Cho