Ph.D. Candidate @ Dottorato Nazionale in Intelligenza Artificiale, University of Pisa and University of Modena and Reggio Emilia - AImageLab


Job offer info:  Email  |  CV

Social:  Google Scholar  |  LinkedIn  |  GitHub

About me

My name is Samuele Poppi, and I am a dedicated researcher in Responsible AI and AI Safety, with a focus on generative multimodal AI systems. I am about to complete my Ph.D. in Artificial Intelligence at the University of Pisa and the University of Modena and Reggio Emilia, where I have developed extensive expertise in responsible and safe AI.
I also gained valuable experience as a Research Scientist Summer Intern at Meta GenAI Safety Alignment in 2024, where I spent six months advancing safety frameworks for AI models.

Education
  • Ph.D. in Artificial Intelligence, University of Pisa and University of Modena and Reggio Emilia, in progress
  • MSc - 110/110 cum laude, University of Modena and Reggio Emilia, February 2021
  • BSc, University of Modena and Reggio Emilia, December 2017

Relevant Experience
  • Research Scientist Summer Intern @ Meta GenAI Safety Alignment
    Menlo Park, CA 94025, USA, May - November 2024
    Mentors: Cristian Canton Ferrer, Oliver Aobo Yang, Jianfeng Chi
    Topics: Responsible AI for LLMs, Jailbreaking Attacks on LLMs, Fragility of Multilingual LLMs
  • Research Fellow @ HiPeRTLab - University of Modena and Reggio Emilia
    Modena, MO 41125, Italy, June - September 2021
    Mentors: Marko Bertogna, Micaela Verucchi
    Topics: Computer vision algorithms for underwater applications and pick-and-place tasks

Research activities

    Authored publications:


    Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks

    Samuele Poppi*,2,3, Zheng-Xin Yong*,4, Yifei He5, Bobbie Chern1, Han Zhao5, Aobo Yang†,1, Jianfeng Chi†,1

    Findings of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), Albuquerque, NM, USA 2025 - Poster

    Affiliations: 1Meta, 2University of Pisa, 3University of Modena and Reggio Emilia, 4Brown University, 5University of Illinois Urbana-Champaign

    Project Page  |  GitHub  |  📝 arXiv  |  bibtex

    We show that fine-tuning in a single language compromises safety across all languages (cross-lingual generalization). We hypothesize that multilingual LLM safety relies on both language-specific and language-agnostic parameters. To study this, we introduce Safety Information Localization (SIL), which identifies the subset of model weights responsible for safety alignment. Our analysis finds that 20% of the parameters encode most of the language-agnostic safety knowledge, with substantial cross-lingual overlap. Yet freezing these parameters during fine-tuning fails to block attacks, revealing alternative learning pathways [3]. Finally, stitching these parameters into another safety-aligned model is enough to jailbreak it, confirming their effectiveness and transferability.

    * Work done during internship at Meta     † Equal advising
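
    A rough sketch of the localization step in PyTorch: score each weight by how much it moved during safety alignment, keep the top fraction as the "safety mask", and optionally transplant those weights into another model. The difference-magnitude score, the 20% threshold, and the model identifiers are illustrative assumptions, not the exact SIL procedure from the paper.

```python
# Sketch: locate "safety weights" by comparing a base model with its
# safety-aligned counterpart. The magnitude-of-change score is an
# illustrative stand-in for SIL's attribution; model ids are placeholders.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("base-model")        # placeholder
aligned = AutoModelForCausalLM.from_pretrained("aligned-model")  # placeholder

def safety_masks(base, aligned, fraction=0.20):
    """Boolean masks marking the `fraction` of weights that moved most
    during safety alignment (a rough proxy for safety-relevant weights)."""
    masks = {}
    for (name, p_base), (_, p_aligned) in zip(
        base.named_parameters(), aligned.named_parameters()
    ):
        delta = (p_aligned.detach() - p_base.detach()).abs().flatten()
        k = max(1, int(fraction * delta.numel()))
        threshold = delta.topk(k).values.min()
        masks[name] = (delta >= threshold).view_as(p_base)
    return masks

def stitch(target, donor, masks):
    """Transplant only the masked weights from `donor` into `target`."""
    donor_params = dict(donor.named_parameters())
    with torch.no_grad():
        for name, p_target in target.named_parameters():
            mask = masks[name]
            p_target[mask] = donor_params[name][mask]

masks = safety_masks(base, aligned)
```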

    Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models

    Samuele Poppi*,1,2, Tobia Poppi*,1,2, Federico Cocchi*,1,2, Marcella Cornia1, Lorenzo Baraldi1, Rita Cucchiara1

    European Conference on Computer Vision, Milano, Italy 2024 - Poster

    Affiliations: 1University of Modena and Reggio Emilia, 2University of Pisa

    Project Page  |  🤗 Hugging Face  |  GitHub  |  📝 arXiv  |  bibtex

    We tackle toxicity in generative AI by aligning CLIP encoders with safety standards. We build the ViSU dataset by jailbreaking LLaMA2-7B-Chat; it pairs safe and NSFW image-text examples in quadruplets. Leveraging ViSU, Safe-CLIP applies knowledge-editing techniques to remove unsafe associations from CLIP's embedding space, ensuring outputs remain safe even with NSFW inputs and advancing responsible multimodal AI. Thanks to their modularity, the safe encoders can be attached to, or detached from, any generation pipeline.

    * Equal contribution
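
    Because Safe-CLIP keeps the standard CLIP interface, swapping it into an existing pipeline is a drop-in change. A minimal usage sketch with Hugging Face transformers follows; the checkpoint identifier and image path are placeholders (the released weights are linked above).

```python
# Sketch: Safe-CLIP as a drop-in replacement for a vanilla CLIP encoder.
# The checkpoint id and image file below are placeholders.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_id = "safe-clip-checkpoint"  # placeholder: substitute the released id
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder image
texts = ["a photo of a dog", "a photo of a cat"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Same interface as vanilla CLIP: similarity logits come out unchanged in
# shape, but NSFW concepts have been edited out of the embedding space.
probs = outputs.logits_per_image.softmax(dim=-1)
print(probs)
```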

    Unlearning Vision Transformers without Retaining Data via Low-Rank Decompositions

    Samuele Poppi1,2, Sara Sarto1, Marcella Cornia1, Lorenzo Baraldi1, Rita Cucchiara1

    International Conference on Pattern Recognition, Kolkata, India 2024 - Poster

    Affiliations: 1University of Modena and Reggio Emilia, 2University of Pisa

    📝 arXiv  |  bibtex

    The GDPR and CCPA have spurred interest in removing sensitive information from pre-trained models without retraining. Standard unlearning approaches use a two-sided loss with a forgetting term on a forget set (Df) and a retaining term on a retaining set (Dr) to preserve knowledge and maintain accuracy. These methods (1) face scalability challenges, (2) are resource-intensive, and (3) become impractical without access to the retaining training data. This paper introduces a trainable low-rank decomposition that enables targeted information removal without a retaining dataset, significantly reducing computational and memory costs.
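
    The core mechanism can be sketched as a LoRA-style low-rank residual trained only on the forget set. The rank, initialization, and gradient-ascent objective below are illustrative assumptions rather than the paper's exact recipe, and the timm usage is hypothetical.

```python
# Sketch: retaining-free unlearning with a trainable low-rank residual.
# The pre-trained layer stays frozen; only the factors A and B are trained,
# on the forget set alone, to push predictions away from forgotten classes.
import torch
import torch.nn as nn

class LowRankResidual(nn.Module):
    def __init__(self, linear: nn.Linear, rank: int = 8):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        # A starts at zero, so the wrapped layer is initially unchanged.
        self.A = nn.Parameter(torch.zeros(rank, linear.in_features))
        self.B = nn.Parameter(torch.randn(linear.out_features, rank) * 1e-3)

    def forward(self, x):
        return self.linear(x) + x @ self.A.t() @ self.B.t()

# Hypothetical usage with a timm ViT: wrap the MLP layers, then train only
# the low-rank factors with gradient ascent on the forget set Df.
#
#   model = timm.create_model("vit_base_patch16_224", pretrained=True)
#   for block in model.blocks:
#       block.mlp.fc1 = LowRankResidual(block.mlp.fc1)
#   loss = -torch.nn.functional.cross_entropy(model(x_forget), y_forget)
```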