
yueliu1999/Awesome-Jailbreak-on-LLMs
Open Source Project by yueliu1999
A comprehensive, curated collection of state-of-the-art jailbreak research, datasets, and evaluation tools for LLM security.
Awesome-Jailbreak-on-LLMs provides a structured overview of the rapidly evolving field of LLM adversarial robustness. The repository categorizes jailbreak techniques, ranging from prompt engineering and social engineering attacks to gradient-based optimization and automated red-teaming, and shows how models can be manipulated into bypassing safety guardrails. By aggregating papers, evaluation benchmarks, and implementation code, the project helps researchers analyze vulnerabilities and follow the cat-and-mouse game between model developers and adversarial actors. It covers both text-based LLMs and Vision-Language Models (VLMs), so security professionals can track threats across multimodal architectures. By highlighting common failure modes and defensive strategies, the repository supports work on more resilient, secure, and trustworthy AI systems.
Highlights
- Curated SOTA jailbreak research
- Includes datasets and evaluation tools
- Covers both LLMs and VLMs
For
- AI Security Researchers
- Red Teamers
- LLM Developers
Links
- GitHub Repository