Andrea Cavallaro

Expertise

I focus on machine learning for audio-visual sensing to enable systems to serve agreed purposes and improve their abilities through interactions with the environment, with other systems and with people.

The goal of my research is to create next-generation machine perception models for the effective use of sensory data and the safe operation of autonomous systems that gather information about their environment and make decisions either in partnership with humans or independently. These models are transforming the ability of autonomous systems to see, hear and confidently act in previously unseen scenarios.

Keywords
Machine Learning, Artificial Intelligence, Computer Vision, Audio Processing, Robot Perception, Privacy. 
 

Keynotes (2025)
The pursuit of privacy in the AI age
   ACM SAC, April
Safe and trustworthy AI systems
   WAIC-S, July
Vision-language models for embodied AI
   ICIAP-W, September
Images, perception, and the subjective space of privacy
   EUVIP, October

Invited Talks (2025)
Multi-modal models: from text to images and beyond
   ELLIS Winter School, January
The pursuit of privacy in the AI age
   AMLD, February
Images, perception, and the subjective space of privacy
   Sapienza Univ., May
From data to culture and back: building trustworthy learning systems
   IFOSS Summer School, July 

Panels (2025)
Journalism in the era of AI and cyber threat
   AMLD, February
Agency, autonomy, and recursive self-improvement: governance and safety concerns
   AAEC, November
Challenges & opportunities of emerging technologies in humanitarian action
   SRUTHA, December

Strategic AI briefings (2025)
Telecommunications & multimedia 
Energy & utilities 
Digital education
Clinical decision support
Chambers of Commerce 

Andrea Cavallaro is a Full Professor at EPFL and a Fellow of the Higher Education Academy, the International Association for Pattern Recognition, and the European Laboratory for Learning and Intelligent Systems. His research focuses on machine learning for multimodal perception, computer vision, machine listening, and information privacy.

Andrea has received numerous awards, including a Research Fellowship with British Telecommunications, the Royal Academy of Engineering Teaching Prize, a Turing Fellowship, and four paper awards. He has served as an IEEE Signal Processing Society Distinguished Lecturer, as Chair of the IEEE Image, Video and Multidimensional Signal Processing Technical Committee and chair of its Awards committee, as a member of the Technical Directions Board of the IEEE Signal Processing Society, and as an elected member of the IEEE Multimedia Signal Processing Technical Committee.

Andrea has published over 350 scientific papers, a monograph on video tracking, and three edited books covering topics such as intelligent multimedia, multimedia content analysis, and multi-camera networks.

Awards

Fellow
   European Laboratory for Learning and Intelligent Systems, 2025
Distinguished Lecturer
   IEEE Signal Processing Society, 2020
Turing Fellow
   The Alan Turing Institute, 2018
Fellow
   International Association for Pattern Recognition (IAPR), 2018
Fellow
   Higher Education Academy (HEA), 2016
Best paper award (with N. Anjum)
   IEEE AVSS, 2009
Student paper award (with T. Popkin)
   IEEE ICASSP, 2009
Engineering Teaching Prize
   Royal Academy of Engineering, 2007
Student paper award (with E. Maggio)
   IEEE ICASSP, 2007
Student paper award (with E. Maggio)
   IEEE ICASSP, 2005
Research Fellow
   British Telecom Research and Venturing, 2004

Research

Aligning LLMs with Societal Values

The core of the project AlignAI is the alignment of LLMs with human values, identifying relevant values and methods for alignment implementation. Two principles provide a foundation for the approach: (1) explainability is a key enabler for all aspects of trustworthiness, accelerating development, promoting usability, and facilitating human oversight and auditing of LLMs; (2) fairness is a key aspect of trustworthiness, facilitating access to AI applications and ensuring equal impact of AI-driven decision-making. The practical relevance of the project is ensured by three use cases in education, positive mental health, and news consumption. I am the Local PI and Lead of "Identification of societal values and user preferences" (WP3).
Project Link

CORSMAL

I am the Project Coordinator of CORSMAL (Collaborative object recognition, shared manipulation and learning), which explores the fusion of multiple sensing modalities (touch, sound, and first- and third-person vision) to accurately and robustly estimate the physical properties of objects in noisy and potentially ambiguous environments. We aim to develop a framework for recognising and manipulating objects in cooperation with humans by mimicking the human capability to learn and adapt across different manipulators, tasks, sensing configurations and environments. Our focus is to define learning architectures for multimodal sensory data and for data aggregated from different environments. The goal is to continually improve the adaptability and robustness of the learned models, and to generalise capabilities across tasks and sites.
Project Link

GraphNEx

I am the Project Coordinator of GraphNEx (Graph Neural Networks for Explainable Artificial Intelligence), which employs concepts and tools from graph signal processing and graph machine learning to extrapolate semantic concepts and meaningful relationships from sub-graphs within the knowledge base that can be used for semantic reasoning. By enforcing sparsity and domain-specific priors between concepts, we will promote human interpretability.
Project Link

Teaching & PhD

PhD Students

Yung-Chen Tang, Ante Maric, Cen Lu, Haruki Shirakami, Lei Xu, Dina El Zein, Jose Rafael Espinosa Mena, Shashi Kumar, Martin Schonger, Olena Hrynenko, Darya Baranouskaya

Courses

Deep learning

EE-559

This course explores how to design reliable discriminative and generative neural networks, the ethics of data acquisition and model deployment, as well as modern multi-modal models.

Deep learning: course & project

EE-559 - Group Project.
Theme: Deep learning to foster safer online spaces
Scope: The group mini-project aims to support a safer online environment by tackling hate speech in various forms, ranging from text and images to memes, videos, and audio content.
Objective: To develop deep learning models that help foster healthier online interactions by automatically identifying hate speech across diverse content formats. These deep learning models shall be carefully designed to prioritize accuracy and context comprehension, ensuring they differentiate between harmful hate speech and legitimate critical discourse or satire.
Context: Developing deep learning models that help prevent the surfacing of hateful rhetoric will lead to a more respectful online environment where diverse voices can coexist and thrive.
Open Poster Session on 28 May 2025
Time: 8:30am-1:30pm
Location: MED hall, EPFL
Additional Information
Moodle Page.