Andrea Cavallaro
EPFL STI IEM LIDIAP
ELD 241 (Bâtiment ELD)
Station 11
1015 Lausanne
Web site: https://idiap.epfl.ch/
+41 21 693 90 04
Fields of expertise
Keywords: Machine Learning, Artificial Intelligence, Computer Vision, Audio Processing, Robot Perception, Privacy.
Biography
Andrea Cavallaro is the Director of the Idiap Research Institute and a Full Professor at EPFL. He is a Fellow of the Higher Education Academy, a Fellow of the International Association for Pattern Recognition, and an ELLIS Fellow. His research interests include machine learning for multimodal perception, computer vision, machine listening, and information privacy.
Andrea received his PhD in Electrical Engineering from EPFL in 2002. He was a Research Fellow with British Telecommunications in 2004. He was awarded the Royal Academy of Engineering Teaching Prize in 2007; three student paper awards on target tracking and perceptually sensitive coding at IEEE ICASSP in 2005, 2007 and 2009; and the best paper award at IEEE AVSS 2009. In 2010, he was promoted to Full Professor at Queen Mary University of London, where he was the founding Director of the Centre for Intelligent Sensing and the Director of Research of the School of Electronic Engineering and Computer Science. He was a Turing Fellow (2018-2023) at The Alan Turing Institute, the UK National Institute for Data Science and Artificial Intelligence.
He was selected as an IEEE Signal Processing Society Distinguished Lecturer (2020-2021) and served as Chair of the IEEE Image, Video, and Multidimensional Signal Processing Technical Committee (2020-2021). He also served as a member of the Technical Directions Board of the IEEE Signal Processing Society, as an elected member of the IEEE Multimedia Signal Processing Technical Committee, and as Chair of the Awards Committee of the IEEE Signal Processing Society Image, Video, and Multidimensional Signal Processing Technical Committee.
He serves as Senior Area Editor for the IEEE Transactions on Image Processing. He served as Editor-in-Chief of Signal Processing: Image Communication (2020-2023); as Area Editor for the IEEE Signal Processing Magazine (2012-2014); and as Associate Editor for the IEEE Transactions on Image Processing (2011-2015), IEEE Transactions on Signal Processing (2009-2011), IEEE Transactions on Multimedia (2009-2010), IEEE Signal Processing Magazine (2008-2011) and IEEE Multimedia (2016-2018). He also served as Guest Editor for the IEEE Transactions on Multimedia (2019), IEEE Transactions on Circuits and Systems for Video Technology (2017, 2011), Pattern Recognition Letters (2016), IEEE Transactions on Information Forensics and Security (2013), International Journal of Computer Vision (2011), IEEE Signal Processing Magazine (2010), Computer Vision and Image Understanding (2010), Annals of the British Machine Vision Association (2010), Journal of Image and Video Processing (2010, 2008), and Journal on Signal, Image and Video Processing (2007).
He published a monograph, Video Tracking (Wiley, 2011), and three edited books: Multi-Camera Networks (Elsevier, 2009); Analysis, Retrieval and Delivery of Multimedia Content (Springer, 2012); and Intelligent Multimedia Surveillance (Springer, 2013).
Research
Aligning LLM Technologies with Societal Values
The AlignAI project centres on aligning LLMs with human values: identifying the relevant values and the methods to implement alignment. Two principles provide a foundation for the approach: (1) explainability is a key enabler for all aspects of trustworthiness, accelerating development, promoting usability, and facilitating human oversight and auditing of LLMs; (2) fairness is a key aspect of trustworthiness, facilitating access to AI applications and ensuring equal impact of AI-driven decision-making. The practical relevance of the project is ensured by three use cases in education, positive mental health, and news consumption. I am the Local PI and Lead of "Identification of societal values and user preferences" (WP3).
Project Link
GraphNEx
I am the Project Coordinator of GraphNEx (Graph Neural Networks for Explainable Artificial Intelligence), which employs concepts and tools from graph signal processing and graph machine learning to extract semantic concepts and meaningful relationships from sub-graphs within the knowledge base that can be used for semantic reasoning. By enforcing sparsity and domain-specific priors between concepts, we promote human interpretability.
Project Link
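The sparsity idea can be illustrated with a deliberately tiny sketch (the concept names, edge weights, and threshold below are all invented for illustration; GraphNEx itself operates on learned graph neural network representations): soft-thresholding learned edge weights, as the proximal step of an L1 penalty would, prunes weak concept-to-concept links so that only a small, interpretable sub-graph remains.

```python
# Toy illustration: sparsify a dense learned concept graph by
# soft-thresholding its edge weights (the proximal operator of an
# L1 penalty). All concepts and weights here are invented.
def soft_threshold(w, lam):
    """Shrink w toward zero by lam; weights with |w| <= lam are pruned."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

# A dense set of learned concept-to-concept edge weights.
edges = {
    ("dog", "animal"): 0.9,
    ("dog", "chair"): 0.05,
    ("chair", "furniture"): 0.7,
    ("animal", "furniture"): 0.02,
}

lam = 0.1  # sparsity level: larger values prune more edges
sparse = {e: soft_threshold(w, lam) for e, w in edges.items()}
kept = {e: w for e, w in sparse.items() if w != 0.0}
print(kept)  # only the strong, semantically meaningful links survive
```

The surviving edges form the sparse sub-graph that a human can inspect, which is the kind of interpretability the domain-specific priors are meant to encourage.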
CORSMAL
I am the Project Coordinator of CORSMAL (Collaborative Object Recognition, Shared Manipulation and Learning), which explores the fusion of multiple sensing modalities (touch, sound, and first- and third-person vision) to estimate accurately and robustly the physical properties of objects in noisy and potentially ambiguous environments. We aim to develop a framework for recognising and manipulating objects in cooperation with humans, mimicking the human ability to learn and adapt across different manipulators, tasks, sensing configurations and environments. Our focus is on defining learning architectures for multimodal sensory data and for data aggregated from different environments. The goal is to continually improve the adaptability and robustness of the learned models and to generalise capabilities across tasks and sites.
Project Link
Teaching & PhD
Teaching
Electrical and Electronics Engineering
PhD Students
Baranouskaya Darya, El Zein Dina, Espinosa Mena Jose Rafael, Hrynenko Olena, Kumar Shashi, Lu Cen, Maric Ante, Schonger Martin, Shirakami Haruki, Tang Yung-Chen, Xu Lei
Deep learning: course & project
This course explores how to design reliable discriminative and generative neural networks, the ethics of data acquisition and model deployment, as well as modern multi-modal models.
EE-559 - Group Mini Project
Theme: Deep learning to foster safer online spaces
Scope: The group mini-project aims to support a safer online environment by tackling hate speech in various forms, ranging from text and images to memes, videos, and audio content.
Objective: To develop deep learning models that foster healthier online interactions by automatically identifying hate speech across diverse content formats. These models must be carefully designed to prioritize accuracy and context comprehension, ensuring they differentiate between harmful hate speech and legitimate critical discourse or satire.
Context: Developing deep learning models that help prevent the surfacing of hateful rhetoric will lead to a more respectful online environment where diverse voices can coexist and thrive.
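As a starting point, the classification task described above can be sketched in miniature with a bag-of-words logistic classifier trained by gradient descent. Everything below is an invented toy: the sentences, labels, tokenizer, and hyperparameters are assumptions for illustration only, and the actual mini-projects would use pretrained deep models and real, carefully curated datasets.

```python
# Minimal text classifier sketch: bag-of-words features plus logistic
# regression trained with per-sample gradient descent on toy data.
# Illustrative only; real hate-speech detection needs far richer models.
import math
from collections import Counter

# Invented toy training data: label 1 = hateful, 0 = benign.
TRAIN = [
    ("i hate this group of people", 1),
    ("those people are worthless", 1),
    ("you are all disgusting", 1),
    ("i love this community", 0),
    ("what a beautiful day", 0),
    ("great discussion everyone", 0),
]

def tokenize(text):
    return text.lower().split()

# Vocabulary built from the training set.
vocab = sorted({tok for text, _ in TRAIN for tok in tokenize(text)})

def features(text):
    counts = Counter(tokenize(text))
    return [counts.get(tok, 0) for tok in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=200, lr=0.5):
    w = [0.0] * len(vocab)
    b = 0.0
    for _ in range(epochs):
        for text, y in data:
            x = features(text)
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log-loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, text):
    """Return the probability that `text` is hateful."""
    x = features(text)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

w, b = train(TRAIN)
print(round(predict(w, b, "i hate those people"), 2))
print(round(predict(w, b, "great discussion"), 2))
```

A word-counting model like this cannot, of course, tell satire from genuine hate; that contextual distinction is exactly why the project calls for deep, context-aware architectures.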
Some students' feedback (2025):
"awesome project"
“The professor is great, makes the class really engaging and interesting and is really good at explaining. The fact that the focus is not only on the technical part but also on the ethical and environmental concerns is a great bonus. The exercise sessions are nice and interesting.”
“The teacher is very nice and exactly knows how to answer the students' questions, and what I really like is the fact that we are not just learning about outdated techniques, but that the modern advancements that were discovered some weeks ago are also described and incorporated in the lecture.”
Moodle Page.