Babak Falsafi

babak.falsafi@epfl.ch http://parsa.epfl.ch/~falsafi


EPFL IC IINFCOM PARSA
INJ 233 (Bâtiment INJ)
Station 14
1015 Lausanne

+41 21 693 55 92
+41 21 693 13 93
Office: INJ 233
EPFL › IC › IINFCOM › PARSA

Website: https://parsa.epfl.ch/

EPFL › IC › IC-SIN › SIN-ENS

Website: https://sin.epfl.ch

EPFL › VPA › VPA-AVP-DLE › AVP-DLE-EDOC › EDIC-ENS

EPFL › IC › IC-SSC › SSC-ENS

Website: https://ssc.epfl.ch

EPFL › SB › SB-SMA › SMA-ENS

Website: https://sma.epfl.ch/

Expertise

Computer architecture, datacenter systems, cloud-native server architecture.

Biography

Babak is a Professor in the School of Computer and Communication Sciences and the founder of EcoCloud, an industrial/academic consortium at EPFL investigating scalable, sustainable information technology. He has made numerous contributions to computer system design and evaluation, including a scalable multiprocessor architecture that was prototyped by Sun Microsystems (now Oracle), snoop filters incorporated into multi-socket x86 servers and IBM BlueGene supercomputers, spatial and temporal memory streaming that appears in ARM cores, and computer system performance evaluation methodologies that have been in use by AMD, HP and Google PerfKit. He has shown that hardware memory consistency models are neither necessary (in the 1990s) nor sufficient (a decade later) to achieve high performance in servers. These results eventually led to fence speculation in modern CPUs. His work on cloud-native CPUs laid the foundation for the first generation of Cavium ARM server CPUs, ThunderX. He is a recipient of an NSF CAREER award, IBM Faculty Partnership Awards, and an Alfred P. Sloan Research Fellowship. He is a fellow of ACM and IEEE.

NEWS

Online services are stuck in memory and DRAM is not scaling. AstriFlash at HPCA'23 presents a system to serve data directly out of Flash, reducing memory cost by 20x and meeting ms-scale SLOs for online services at 95% of the throughput of DRAM.
Network bandwidth is projected to grow at 20% a year for a decade thanks to optics. Logic density is lagging behind at 15% a year and slowing down, resulting in a "datacenter tax". Optimus Prime, a data transformation accelerator, NebuLA, a hardware-terminated network stack, and Cerebros, an RPC processor, are examples of how to mitigate the datacenter tax in the post-Moore era. Great to see that Google has followed up with their own data transformation accelerator in 2022.
See our paper on "Rebooting Virtual Memory with Midgard" for a novel approach to future-proof virtual memory. Here is a news snippet.
Numerical training of DNNs is converging on fixed point with orders of magnitude improvement in logic, memory, power and bandwidth. See our blog.
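
The news item above on network bandwidth and logic density quotes two annual growth rates, and the gap between them compounds. As a rough illustration only (not taken from the cited papers), the sketch below compounds both rates over ten years. The helper name `compound_growth` is hypothetical, and holding both rates constant for the whole decade is a simplifying assumption. Since logic density growth is described as slowing down, the real gap would likely be larger.

```python
def compound_growth(annual_rate: float, years: int) -> float:
    """Return the total growth factor after compounding annual_rate for years."""
    return (1.0 + annual_rate) ** years


YEARS = 10  # "for a decade", as quoted in the news item

# Assumption: both rates stay constant over the whole period.
bandwidth_factor = compound_growth(0.20, YEARS)  # network bandwidth, 20% a year
logic_factor = compound_growth(0.15, YEARS)      # logic density, 15% a year

# How much faster bandwidth grows than the logic available to process it.
gap = bandwidth_factor / logic_factor

print(f"bandwidth x{bandwidth_factor:.2f}, logic x{logic_factor:.2f}, gap x{gap:.2f}")
# prints "bandwidth x6.19, logic x4.05, gap x1.53"
```

Under these assumptions, each unit of logic has to handle roughly 1.5 times more network traffic after ten years, which is one way to read the "datacenter tax" mentioned above.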

RESEARCH

Data has emerged as a currency for modern society, and datacenters are now the backbone of IT, offering large-scale cloud services at low cost by exploiting economies of scale. With silicon efficiency scaling having dwindled since 2004 and silicon density scaling (Moore's Law) slowing down, future digital platforms will rely on heterogeneous logic and memory to allow for IT scalability. Meanwhile, the demand for large-scale cloud services has grown dramatically faster than conventional silicon scaling, making IT platform scalability a grand challenge. Future platforms will need hand-in-hand collaboration between application domain experts and platform designers to improve scalability. With many online services being in-memory and the minimum communication latency between the farthest nodes being measured in microseconds, future server platforms will go through revolutionary changes in architecture and systems to enable seamless aggregation of logic and memory resources across nodes, breaking the conventional abstraction layers. Babak's research and educational activities center around post-Moore server design.
He investigates techniques to address these challenges in the context of the following projects:

CloudSuite: A Benchmark Suite for Scale-Out Workloads
ColTraIn: Co-Located Training and Inference DNN Accelerators
HARNESS: Heterogeneous Architectures for Next-Generation Server Systems
Midgard: Future-Proofing Virtual Memory
QFlex: Fast, Full-System Open-Source Server Simulation/Emulation
VISA: Cloud-Native CPUs

Selected Talks

Integration, Specialization and Approximation: the "ISA" of Post-Moore Servers
HPCA Keynote, 2022.

Post-Moore AI Infrastructure
Facebook SysML Talk, 2021.

Post-Moore Server Architecture
ICS Keynote, 2020 (Video on YouTube!).

Server Architecture for the Post-Moore Era
HotDC Keynote, 2017.

Awards

Elected Fellow of the Association for Computing Machinery (ACM)

2015

Elected Fellow of the Institute of Electrical and Electronics Engineers (IEEE)

2012

Sloan Research Fellowship

Alfred P. Sloan Foundation

2004


Teaching & PhD

Courses

Advanced multiprocessor architecture

CS-471

Multiprocessors are basic building blocks for all computer systems. This course covers the architecture and organization of modern multiprocessors, prevalent accelerators (e.g., GPU, TPU), and datacenters. It includes a research project on multiprocessors and post-Moore era datacenters.

Parallelism and concurrency in software

CS-302

From sensors, to smartphones, to the world's largest datacenters and supercomputers, parallelism and concurrency are ubiquitous in modern computing. There are also many forms of parallel and concurrent execution in modern platforms, with varying degrees of ease of programmability, performance and efficiency.

Topics in Machine Learning Systems

CS-723

This course will cover the latest technologies, platforms and research contributions in the area of machine learning systems. The students will read, review and present papers from recent venues across the systems for ML spectrum.

Topics on Datacenter Design

CS-728

Modern datacenters with thousands of servers and multi-megawatt power budgets form the backbone of our digital universe. In this course, we will survey a broad and comprehensive spectrum of datacenter design topics, from workloads to server architecture and infrastructure.