Babak Falsafi

Full Professor
babak.falsafi@epfl.ch +41 21 693 55 92 http://parsa.epfl.ch/~falsafi
EPFL IC IINFCOM PARSA
INJ 233 (Bâtiment INJ)
Station 14
1015 Lausanne
+41 21 693 13 93
Office: INJ 233
EPFL > IC > IINFCOM > PARSA
Web site: https://parsa.epfl.ch/
EPFL > IC > IC-SIN > SIN-ENS
Web site: https://sin.epfl.ch
EPFL > IC > IC-SSC > SSC-ENS
Web site: https://ssc.epfl.ch
EPFL > SB > SB-SMA > SMA-ENS
Web site: https://sma.epfl.ch/
Biography
Babak is a Professor in the School of Computer and Communication Sciences and the founder of EcoCloud, an industrial/academic consortium at EPFL investigating scalable sustainable information technology. He has made numerous contributions to computer system design and evaluation, including a scalable multiprocessor architecture that was prototyped by Sun Microsystems (now Oracle), snoop filters incorporated into multi-socket x86 servers and IBM BlueGene supercomputers, spatial and temporal memory streaming that appear in ARM cores, and computer system performance evaluation methodologies that have been in use by AMD, HP and Google PerfKit. He has shown that hardware memory consistency models are neither necessary (in the 1990s) nor sufficient (a decade later) to achieve high performance in servers. These results eventually led to fence speculation in modern CPUs. His work on cloud-native CPUs laid the foundation for the first generation of Cavium ARM server CPUs, ThunderX. He is a recipient of an NSF CAREER award, IBM Faculty Partnership Awards, and an Alfred P. Sloan Research Fellowship. He is a fellow of ACM and IEEE.
NEWS
Online services are stuck in memory and DRAM is not scaling. AstriFlash at HPCA'23 presents a system that serves data directly out of Flash, reducing memory cost by 20x and meeting ms-scale SLOs for online services at 95% of the throughput achieved with DRAM.
Network bandwidth is projected to grow at 20% a year for a decade thanks to optics. Logic density is lagging behind at 15% a year and slowing down, resulting in a "datacenter tax". Optimus Prime, a data transformation accelerator, NebuLA, a hardware-terminated network stack, and Cerebros, an RPC processor, are examples of how to mitigate the datacenter tax in the post-Moore era. Great to see that Google has followed up with their own data transformation accelerator in 2022.
See our paper on "Rebooting Virtual Memory with Midgard" for a novel approach to future-proof virtual memory. Here is a news snippet.
Numerical training of DNNs is converging on fixed point with orders of magnitude improvement in logic, memory, power and bandwidth. See our blog.
RESEARCH
Data has emerged as a currency for modern society, and datacenters are now the backbone of IT, offering large-scale cloud services at low cost by exploiting economies of scale. With silicon efficiency scaling having dwindled since 2004 and silicon density scaling (Moore's Law) slowing down, future digital platforms will rely on heterogeneous logic and memory to allow for IT scalability. Meanwhile, the demand for large-scale cloud services has grown dramatically faster than conventional silicon scaling, making IT platform scalability a grand challenge. Future platforms will need hand-in-hand collaboration of application domain experts and platform designers to improve scalability. With many online services being in-memory and the minimum communication latency between the farthest nodes being microseconds, future server platforms will go through revolutionary changes in architecture and systems to enable seamless aggregation of logic and memory resources across nodes, breaking the conventional abstraction layers. Babak's research and educational activities center around post-Moore server design. He investigates techniques to address these challenges in the context of the following projects:
- CloudSuite: A Benchmark Suite for Scale-Out Workloads
- ColTraIn: Co-Located Training and Inference DNN Accelerators
- HARNESS: Heterogeneous Architectures for Next-Generation Server Systems
- Midgard: Future-Proofing Virtual Memory
- QFlex: Fast, Full-System Open-Source Server Simulation/Emulation
- VISA: Cloud-Native CPUs
Selected Talks
Integration, Specialization and Approximation: the "ISA" of Post-Moore Servers
HPCA Keynote, 2022.
Post-Moore AI Infrastructure
Facebook SysML Talk, 2021.
Post-Moore Server Architecture
ICS Keynote, 2020 (Video on YouTube!).
Server Architecture for the Post-Moore Era
HotDC Keynote, 2017.
Publications
Infoscience publications
Teaching & PhD
Teaching
Computer Science
Mathematics, Communication Systems