Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Seyed Hamed Hassani, Eric Wong, JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models.