Andrea Cavallaro
EPFL STI IEM LIDIAP
ELD 241 (Bâtiment ELD)
Station 11
1015 Lausanne
Web site: https://idiap.epfl.ch/
+41 21 693 90 04
Fields of expertise
The goal of my research is to create next-generation machine perception models for the effective use of sensory data and the safe operation of autonomous systems that gain information about their environment and make decisions in partnership with humans or independently. These models are transforming the ability of autonomous systems to see, hear and act confidently in previously unseen scenarios.
This research is particularly important to maintain trust in machine learning, now that our society is facing an unprecedented pace of technological change.
Keywords: Machine Learning, Artificial Intelligence, Computer Vision, Audio Processing, Robot Perception, Privacy.
Invited Talks and Keynotes (2025)
- Multi-modal models: from text to images and beyond, ELLIS Winter School (January)
- The pursuit of privacy in the AI age, AMLD (February)
- The pursuit of privacy in the AI age, Keynote at ACM SAC (April)
- Images, perception, and the subjective space of privacy, Sapienza Univ. (May)
- From data to culture and back: building trustworthy learning systems, IFOSS Summer School (July)
- Images, perception, and the subjective space of privacy, Keynote at EUVIP (October)
Biography
Andrea Cavallaro is the Director of the Idiap Research Institute and a Full Professor at EPFL. He is a Fellow of the Higher Education Academy, a Fellow of the International Association for Pattern Recognition, and an ELLIS Fellow. His research interests include machine learning for multimodal perception, computer vision, machine listening, and information privacy. Andrea received his PhD in Electrical Engineering from EPFL in 2002. He was a Research Fellow with British Telecommunications in 2004 and was awarded the Royal Academy of Engineering Teaching Prize in 2007; three student paper awards on target tracking and perceptually sensitive coding at IEEE ICASSP in 2005, 2007 and 2009; and the best paper award at IEEE AVSS 2009. In 2010, he was promoted to Full Professor at Queen Mary University of London, where he founded and directed the Centre for Intelligent Sensing and, from 2012 to 2023, was Director of Research of the School of Electronic Engineering and Computer Science. Andrea was a Turing Fellow (2018-2023) at The Alan Turing Institute, the UK National Institute for Data Science and Artificial Intelligence.
He was selected as an IEEE Signal Processing Society Distinguished Lecturer (2020-2021) and served as Chair of the IEEE Image, Video, and Multidimensional Signal Processing Technical Committee (2020-2021). He also served as a member of the Technical Directions Board of the IEEE Signal Processing Society, as an elected member of the IEEE Multimedia Signal Processing Technical Committee, and as Chair of the Awards Committee of the IEEE Signal Processing Society's Image, Video, and Multidimensional Signal Processing Technical Committee.
Andrea served as Senior Area Editor for the IEEE Transactions on Image Processing and as Editor-in-Chief of Signal Processing: Image Communication (2020-2023); as Area Editor for the IEEE Signal Processing Magazine (2012-2014); and as Associate Editor for the IEEE Transactions on Image Processing (2011-2015), IEEE Transactions on Signal Processing (2009-2011), IEEE Transactions on Multimedia (2009-2010), IEEE Signal Processing Magazine (2008-2011) and IEEE Multimedia (2016-2018). He also served as Guest Editor for the IEEE Transactions on Multimedia (2019), IEEE Transactions on Circuits and Systems for Video Technology (2017, 2011), Pattern Recognition Letters (2016), IEEE Transactions on Information Forensics and Security (2013), International Journal of Computer Vision (2011), IEEE Signal Processing Magazine (2010), Computer Vision and Image Understanding (2010), Annals of the British Machine Vision Association (2010), Journal of Image and Video Processing (2010, 2008), and Journal on Signal, Image and Video Processing (2007).
He published a monograph, Video Tracking (2011, Wiley), and three edited books: Multi-camera Networks (2009, Elsevier); Analysis, Retrieval and Delivery of Multimedia Content (2012, Springer); and Intelligent Multimedia Surveillance (2013, Springer).
Research
Aligning LLMs with Societal Values
The core of the AlignAI project is the alignment of LLMs with human values, identifying the relevant values and the methods for implementing alignment. Two principles provide a foundation for the approach: (1) explainability is a key enabler for all aspects of trustworthiness, accelerating development, promoting usability, and facilitating human oversight and auditing of LLMs; (2) fairness is a key aspect of trustworthiness, facilitating access to AI applications and ensuring equal impact of AI-driven decision-making. The practical relevance of the project is ensured by three use cases in education, positive mental health, and news consumption. I am the Local PI and Lead of "Identification of societal values and user preferences" (WP3). Project Link
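As an illustration only, the snippet below sketches one widely used preference-alignment technique, a direct preference optimisation (DPO) style loss over pairs of responses; the project's actual alignment methods, models and data are not specified here, and all tensors in the example are placeholders.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO-style preference loss on sequence log-probabilities.

    Each argument is a (batch,) tensor of summed token log-probabilities of the
    preferred / dispreferred response under the policy being trained and under
    a frozen reference model.
    """
    # Log-ratios of policy to reference for both responses in the pair.
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    # Push the policy to rank the value-aligned response above the other one.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with random log-probabilities standing in for real model outputs.
logp_c, logp_r = torch.randn(4), torch.randn(4)
ref_c, ref_r = torch.randn(4), torch.randn(4)
print(dpo_loss(logp_c, logp_r, ref_c, ref_r))
```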
GraphNEx
Project Coordinator of GraphNEx (Graph Neural Networks for Explainable Artificial Intelligence), which employs concepts and tools from graph signal processing and graph machine learning to extrapolate semantic concepts and meaningful relationships from sub-graphs within the knowledge base that can be used for semantic reasoning. By enforcing sparsity and domain-specific priors between concepts, we promote human interpretability. Project Link.
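The following is a minimal sketch, not the project's architecture, of the general idea of learning a sparse concept graph: a single message-passing layer over a trainable adjacency matrix with an L1 penalty that favours few, interpretable relationships. The class name, dimensions and penalty weight are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SparseConceptGraphLayer(nn.Module):
    """One message-passing step over a learned concept graph (hypothetical sketch)."""

    def __init__(self, num_concepts, dim):
        super().__init__()
        # Trainable edge logits between every pair of concepts.
        self.adj_logits = nn.Parameter(torch.zeros(num_concepts, num_concepts))
        self.linear = nn.Linear(dim, dim)

    def forward(self, x):
        # x: (num_concepts, dim) concept embeddings.
        adj = torch.sigmoid(self.adj_logits)   # soft edge weights in [0, 1]
        h = self.linear(adj @ x)               # aggregate neighbouring concepts
        return torch.relu(h), adj

    def sparsity_penalty(self):
        # L1 prior on edge weights promotes a sparse, human-readable sub-graph.
        return torch.sigmoid(self.adj_logits).sum()

# Toy usage: 10 concepts with 16-dimensional embeddings.
layer = SparseConceptGraphLayer(num_concepts=10, dim=16)
out, adj = layer(torch.randn(10, 16))
loss = out.pow(2).mean() + 1e-3 * layer.sparsity_penalty()  # placeholder task loss + sparsity
loss.backward()
```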
CORSMAL
Project Coordinator of CORSMAL (Collaborative Object Recognition, Shared Manipulation and Learning), which explores the fusion of multiple sensing modalities (touch, sound, and first- and third-person vision) to accurately and robustly estimate the physical properties of objects in noisy and potentially ambiguous environments. We aim to develop a framework for the recognition and manipulation of objects in cooperation with humans, mimicking the human capability of learning and adapting across different manipulators, tasks, sensing configurations and environments. Our focus is on defining learning architectures for multimodal sensory data and for data aggregated from different environments. The goal is to continually improve the adaptability and robustness of the learned models and to generalise capabilities across tasks and sites. Project Link
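To make the multimodal-fusion idea concrete, here is a minimal late-fusion sketch: one encoder per modality, concatenation of the resulting features, and a head that regresses object properties. It is an illustration under assumed feature dimensions and property count, not the CORSMAL pipeline.

```python
import torch
import torch.nn as nn

class MultimodalPropertyEstimator(nn.Module):
    """Late fusion of audio, vision and touch features (illustrative sketch)."""

    def __init__(self, audio_dim=128, vision_dim=512, touch_dim=32, hidden=64, num_properties=3):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.vision_enc = nn.Sequential(nn.Linear(vision_dim, hidden), nn.ReLU())
        self.touch_enc = nn.Sequential(nn.Linear(touch_dim, hidden), nn.ReLU())
        # Regress physical properties (e.g. mass, capacity, fill level) from fused features.
        self.head = nn.Linear(3 * hidden, num_properties)

    def forward(self, audio, vision, touch):
        fused = torch.cat([self.audio_enc(audio),
                           self.vision_enc(vision),
                           self.touch_enc(touch)], dim=-1)
        return self.head(fused)

# Toy usage with pre-extracted feature vectors for a batch of two objects.
model = MultimodalPropertyEstimator()
pred = model(torch.randn(2, 128), torch.randn(2, 512), torch.randn(2, 32))
print(pred.shape)  # torch.Size([2, 3])
```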
Teaching & PhD
Teaching
Electrical and Electronics Engineering
PhD Students
Baranouskaya Darya, El Zein Dina, Espinosa Mena Jose Rafael, Hrynenko Olena, Kumar Shashi, Lu Cen, Maric Ante, Schonger Martin, Shirakami Haruki, Tang Yung-Chen, Xu Lei
Deep learning: course & project
This course explores how to design reliable discriminative and generative neural networks, the ethics of data acquisition and model deployment, as well as modern multi-modal models.
EE-559 - Group Project.
Theme: Deep learning to foster safer online spaces
Scope: The group mini-project aims to support a safer online environment by tackling hate speech in various forms, ranging from text and images to memes, videos, and audio content.
Objective: To develop deep learning models that help foster healthier online interactions by automatically identifying hate speech across diverse content formats. These models shall be carefully designed to prioritize accuracy and context comprehension, ensuring they differentiate between harmful hate speech and legitimate critical discourse or satire (a minimal classifier sketch follows after this list).
Context: Developing deep learning models that help prevent the surfacing of hateful rhetoric will lead to a more respectful online environment where diverse voices can coexist and thrive.
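As a starting point for the mini-project, the sketch below shows a deliberately tiny text classifier: mean-pooled word embeddings followed by a linear layer. It is a baseline illustration only; the vocabulary size, dimensions and class names are assumptions, and real submissions would use far stronger models and multimodal inputs.

```python
import torch
import torch.nn as nn

class HateSpeechClassifier(nn.Module):
    """Tiny baseline: mean-pooled word embeddings followed by a linear classifier."""

    def __init__(self, vocab_size=10_000, emb_dim=64, num_classes=2):
        super().__init__()
        self.embedding = nn.EmbeddingBag(vocab_size, emb_dim, mode="mean")
        self.classifier = nn.Linear(emb_dim, num_classes)

    def forward(self, token_ids, offsets):
        # token_ids: flat tensor of token indices for all documents concatenated;
        # offsets: index where each document starts within token_ids.
        return self.classifier(self.embedding(token_ids, offsets))

# Toy usage: two short "documents" given as token-id sequences.
model = HateSpeechClassifier()
tokens = torch.tensor([1, 5, 42, 7, 7, 99])   # all documents concatenated
offsets = torch.tensor([0, 3])                # document boundaries
logits = model(tokens, offsets)
print(logits.softmax(dim=-1))                 # per-document class probabilities
```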
Open Poster Session on 28 May 2025
Time: 8:30am-1:30pm
Location: MED hall, EPFL
Additional Information
Moodle Page.