Babak Falsafi

EPFL IC IINFCOM PARSA
INJ 233 (Bâtiment INJ)
Station 14
1015 Lausanne

Expertise

Computer architecture, datacenter systems, cloud-native server architecture.
Babak is a Professor in the School of Computer and Communication Sciences and the founder of EcoCloud, an industrial/academic consortium at EPFL investigating scalable and sustainable information technology. He has made numerous contributions to computer system design and evaluation, including a scalable multiprocessor architecture prototyped by Sun Microsystems (now Oracle), snoop filters incorporated into multi-socket x86 servers and IBM BlueGene supercomputers, spatial and temporal memory streaming that appear in ARM cores, and computer system performance evaluation methodologies in use by AMD, HP, and Google PerfKit. He has shown that hardware memory consistency models are neither necessary (in the '90s) nor sufficient (a decade later) to achieve high performance in servers; these results eventually led to fence speculation in modern CPUs. His work on cloud-native CPUs laid the foundation for ThunderX, the first generation of Cavium ARM server CPUs. He is a recipient of an NSF CAREER award, IBM Faculty Partnership Awards, and an Alfred P. Sloan Research Fellowship. He is a Fellow of ACM and IEEE.

NEWS

Online services keep their working sets in memory, but DRAM is not scaling. AstriFlash, at HPCA'23, presents a system that serves data directly out of flash, reducing memory cost by 20x while meeting ms-scale SLOs for online services at 95% of the throughput achieved with DRAM.
Network bandwidth is projected to grow at 20% a year for a decade thanks to optics, while logic density lags behind at 15% a year and is slowing down, resulting in a "datacenter tax". Optimus Prime, a data transformation accelerator; NebuLA, a hardware-terminated network stack; and Cerebros, an RPC processor, are examples of how to mitigate the datacenter tax in the post-Moore era. Great to see that Google followed up with its own data transformation accelerator in 2022.
See our paper on "Rebooting Virtual Memory with Midgard" for a novel approach to future-proof virtual memory. Here is a news snippet.
Numerical training of DNNs is converging on fixed point, with orders-of-magnitude improvements in logic, memory, power, and bandwidth. See our blog.
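As a rough illustration of what fixed-point arithmetic buys over floating point (this is a generic sketch, not the specific numerical scheme used in the ColTraIn work; the bit widths chosen here are illustrative assumptions):

```python
import numpy as np

def to_fixed_point(x, total_bits=8, frac_bits=6):
    """Quantize a float array to signed fixed-point with the given
    total width and number of fractional bits (round-to-nearest,
    saturating at the representable range)."""
    scale = 1 << frac_bits
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    return np.clip(np.round(x * scale), qmin, qmax).astype(np.int32)

def from_fixed_point(q, frac_bits=6):
    """Recover the approximate float values from the quantized integers."""
    return q.astype(np.float32) / (1 << frac_bits)

# Example: 8-bit fixed point represents small weights to within half an LSB,
# while multiply-accumulate becomes cheap integer arithmetic.
w = np.array([0.1234, -0.5, 0.75], dtype=np.float32)
q = to_fixed_point(w)
w_hat = from_fixed_point(q)
```

The payoff is that multiply-accumulate units operating on narrow integers are far smaller and lower-power than floating-point units, and the narrow encodings cut memory footprint and bandwidth proportionally; the research challenge is keeping training convergence intact at these precisions.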

RESEARCH

Data has emerged as a currency for modern society, and datacenters are now the backbone of IT, offering large-scale cloud services at low cost by exploiting economies of scale. With silicon efficiency scaling having dwindled since 2004 and silicon density scaling (Moore's Law) slowing down, future digital platforms will rely on heterogeneous logic and memory to allow for IT scalability. Meanwhile, demand for large-scale cloud services has grown dramatically faster than conventional silicon scaling, making IT platform scalability a grand challenge. Future platforms will need hand-in-hand collaboration between application domain experts and platform designers to improve scalability. With many online services residing in memory and the minimum communication latency between the farthest nodes measured in microseconds, future server platforms will go through revolutionary changes in architecture and systems to enable seamless aggregation of logic and memory resources across nodes, breaking the conventional abstraction layers. Babak's research and educational activities center on post-Moore server design.
He investigates techniques to address these challenges in the context of the following projects:
  • CloudSuite: A Benchmark Suite for Scale-Out Workloads
  • ColTraIn: Co-Located Training and Inference DNN Accelerators
  • HARNESS: Heterogeneous Architectures for Next-Generation Server Systems
  • Midgard: Future-Proofing Virtual Memory
  • QFlex: Fast, Full-System Open-Source Server Simulation/Emulation
  • VISA: Cloud-Native CPUs

Selected Talks

Awards

Elected Fellow of the Association for Computing Machinery (ACM)

2015

Elected Fellow of the Institute of Electrical and Electronics Engineers (IEEE)

2012

Sloan Research Fellowship

Alfred P. Sloan Foundation

2004


Teaching & PhD

PhD Students

Yuanlong Li, Simla Burcu Harma, Shashwat Shrivastava, Alexandros Poupakis, Shanqing Lin, Ayan Chakraborty, Pooria Poorsarvi Tehrani, Ali Ansari

Past EPFL PhD Students

Pejman Lotfi Kamran, Sotiria Fytraki, Dejan Novakovic, Stavros Volos, Djordje Jevdjic, Ilknur Cansu Kaynak, Yusuf Onur Koçberber, Javier Picorel Obando, Alexandros Daglis, Mario Paulo Drumond Lages De Oliveira, Arash Pourhabibi Zarandi, Mark Johnathon Sutherland, Siddharth Gupta, Ognjen Glamocanin, Dina Gamaleldin Ahmed Shawky Mahmoud

Past EPFL PhD Students as codirector

Stanko Novakovic, Tao Lin, Atri Bhattacharyya

Courses

Advanced multiprocessor architecture

CS-471

Multiprocessors are basic building blocks for all computer systems. This course covers the architecture and organization of modern multiprocessors, prevalent accelerators (e.g., GPU, TPU), and datacenters. It includes a research project on multiprocessors and post-Moore era datacenters.

Parallelism and concurrency in software

CS-302

From sensors, to smartphones, to the world's largest datacenters and supercomputers, parallelism and concurrency are ubiquitous in modern computing. Modern platforms also support many forms of parallel and concurrent execution, with varying degrees of ease of programmability, performance, and efficiency.

Topics in Machine Learning Systems

CS-723

This course will cover the latest technologies, platforms, and research contributions in the area of machine learning systems. The students will read, review, and present papers from recent venues across the systems-for-ML spectrum.

Topics on Datacenter Design

CS-728

Modern datacenters with thousands of servers and multi-megawatt power budgets form the backbone of our digital universe. In this course, we will survey a broad and comprehensive spectrum of datacenter design topics, from workloads to server architecture and infrastructure.