Victor Tolulope Olufemi

He/him

Nationality: Nigerian

EPFL > ETU > IN-S > IN-E

EPFL > IC > IINFCOM > LIGHT

Expertise

Victor Tolulope Olufemi is an AI Researcher, Machine Learning Engineer, and the Co-Founder of LyngualLabs. He is currently pursuing his Master of Science in Engineering Artificial Intelligence at Carnegie Mellon University (CMU Africa) and is collaborating as a research project student at the Laboratory for Intelligent Global Health and Humanitarian Technologies (LIGHT) at EPFL.
Victor's research interests lie at the intersection of speech processing, natural language processing, and medical/humanitarian AI. He has co-authored publications at premier academic venues, including the ACM SIGIR main conference (Resources track), the Swiss Data Science Conference (SDS), and workshops at CVPR and NAACL.
As the lead researcher for the WAXAL-Net project, Victor directed the benchmarking of 57 open-source ASR models across 19 African languages, showing that optimized edge models can outperform much larger foundation models used zero-shot. His work is driven by a commitment to democratizing AI through open-source development and to building highly localized technologies for underrepresented communities.

Core Areas of Expertise:
  • Speech & Audio Processing
  • Linguistic Code-Switching
  • Multimodal Deep Learning & Vision

Mission

To co-design and deploy scalable, robust AI systems that prioritize equity, language accessibility, and real-world impact. I am dedicated to bridging representation gaps in AI by building open-source systems for low-resource environments and integrating speech and multimodal models into diverse technologies.

Current Work

I am the Co-Founder and Technical Lead at LyngualLabs, where we build open-source speech and language technologies for low-resource languages.
My active research focuses on:
  • The WAXAL-Net Project: Directing the benchmarking and open-sourcing of 57 ASR models across 19 African languages (featured in our recent publication, "The WAXAL ASR Benchmark: Fine-Tuned Edge Models Across 19 African Languages").
  • Speech-to-Speech (S2S) Translation: Developing end-to-end models that translate spoken languages directly without text bottlenecks, preserving vocal character and tone.
  • Conversational Code-Switching: Tuning acoustic models to robustly capture abrupt code-switching boundaries, and exploring whether code-switching can be modeled using only monolingual data.