From Text to Vision: Ensuring Responsibility and Safety in Modern AI

May 27, 2025
Samuele Poppi
Abstract
As generative AI systems grow in complexity and adoption, ensuring their safety and alignment becomes both more critical and more challenging. This talk explores the fragility of modern AI models, from text-only large language models (LLMs) to multimodal architectures, when exposed to adversarial manipulation. We begin by examining red-teaming techniques that reveal how easily current LLMs can be jailbroken through methods such as prompt injection, character roleplay, and fine-tuning attacks. We then present recent findings showing that multilingual safety alignment can be compromised even by monolingual fine-tuning, highlighting the presence of language-agnostic safety parameters. To investigate this phenomenon, we introduce Safety Information Localization (SIL), a method to identify and manipulate the minimal set of parameters encoding safety-critical behavior. Finally, we transition to multimodal models and demonstrate Safe-CLIP, a framework that directly edits the CLIP embedding space to suppress harmful visual-textual associations. Together, these insights suggest that model safety must be embedded at a fundamental level, across languages, modalities, and representations, to ensure robust and responsible AI systems.
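
As a rough illustration of the embedding-space view behind Safe-CLIP, the minimal sketch below uses the public Hugging Face CLIP API to quantify how strongly an image aligns with a benign versus a harmful text prompt. The model name, image path, and prompts are illustrative assumptions; the sketch only measures visual-textual associations and does not implement Safe-CLIP's actual editing procedure.

```python
# Hypothetical sketch: measuring text-image alignment in CLIP's shared embedding
# space. Safe-CLIP (per the abstract) edits this space to weaken harmful
# visual-textual associations; here we only show how such an association can be
# quantified. Model checkpoint, image path, and prompts are placeholders.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # placeholder: any local image file
texts = ["a safe description", "a harmful description"]  # placeholder prompts

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    text_emb = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])

# Cosine similarity between the image and each prompt. An embedding-space edit
# in the spirit of Safe-CLIP aims to lower the similarity for harmful prompts
# while preserving it for benign ones.
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
similarity = image_emb @ text_emb.T
print(similarity)
```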
Date
May 27, 2025 2:00 PM — 4:00 PM
Event
Responsible Generative AI - National PhD Program in AI For Society
Location
Scuola Normale Superiore di Pisa
7 Piazza dei Cavalieri, Pisa, Tuscany 56126

Authors
Samuele Poppi
Postdoctoral Associate - AI Safety