<feed xmlns="http://www.w3.org/2005/Atom">
  <id>https://blog.rotem.click/</id>
  <title>Rotem's Security Research</title>
  <subtitle>Research blog exploring LLM safety alignment, bias detection, and AI security. Breaking models to understand them.</subtitle>
  <updated>2026-04-08T20:16:56+03:00</updated>
  <author>
    <name>Rotem Levi</name>
    <uri>https://blog.rotem.click/</uri>
  </author>
  <link rel="self" type="application/atom+xml" href="https://blog.rotem.click/feed.xml"/>
  <link rel="alternate" type="text/html" hreflang="en" href="https://blog.rotem.click/"/>
  <generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator>
  <rights>© 2026 Rotem Levi</rights>
  <icon>/assets/img/favicons/favicon.ico</icon>
  <logo>/assets/img/favicons/favicon-96x96.png</logo>
  <entry>
    <title>Breaking Gemma 4 Safety Alignment</title>
    <link href="https://blog.rotem.click/posts/breaking-gemma4-safety-alignment/" rel="alternate" type="text/html" title="Breaking Gemma 4 Safety Alignment"/>
    <published>2026-04-08T12:00:00+03:00</published>
    <updated>2026-04-08T20:16:16+03:00</updated>
    <id>https://blog.rotem.click/posts/breaking-gemma4-safety-alignment/</id>
    <content type="text/html" src="https://blog.rotem.click/posts/breaking-gemma4-safety-alignment/"/>
    <author>
      <name>Rotem Levi</name>
    </author>
    <category term="AI Security"/>
    <category term="LLM Safety"/>
    <summary>I extended the Silenced Biases (AAAI-26) research to Google's Gemma 4. Activation steering failed. Prompt-level attacks broke every bias category.</summary>
  </entry>
</feed>
