Babak is a Professor in the School of Computer and Communication Sciences and the founding director of EcoCloud, an industrial/academic consortium at EPFL investigating scalable data-centric technologies. He has made numerous contributions to computer system design and evaluation, including a scalable multiprocessor architecture which was prototyped by Sun Microsystems (now Oracle), snoop filters and memory streaming technologies that are incorporated into IBM BlueGene/P and Q and ARM cores, and computer system performance evaluation methodologies that have been in use by AMD, HP and Google PerfKit. He has shown that hardware memory consistency models are neither necessary (in the 90s) nor sufficient (a decade later) to achieve high performance in multiprocessor systems. These results eventually led to fence speculation in modern microprocessors. His latest work on workload-optimized server processors laid the foundation for the first generation of Cavium ARM server CPUs, ThunderX. He is a recipient of an NSF CAREER award, IBM Faculty Partnership Awards, and an Alfred P. Sloan Research Fellowship. He is a fellow of IEEE and ACM.
Medium AI picked our paper (from ColTraIn) on Hybrid Block Floating-Point as among the top at NeurIPS'18. See the news.
We have disclosed a new vulnerability, SMoTherSpectre, affecting all SMT-enabled microprocessors, which, unlike prior side channels focusing on the cache hierarchy, targets shared resources in the pipeline.
CloudSuite 3.0 is now released. CloudSuite 3.0 is Docker-ready and integrated into Google PerfKit.
Cavium ThunderX, a recent ARM-based server processor, is the first scale-out processor which is workload-optimized based on our work in "Clearing the Clouds" and the first version of CloudSuite. See this article in EETimes from EEMBC and Cavium.
The EuroCloud Server project has been hailed by the EU as a "flagship" project to drive innovation in datacenter design. See the ACM Tech report snippet.
Data has emerged as a currency for modern society, and datacenters are now the backbone of IT, offering large-scale cloud services at low cost by exploiting economies of scale. With silicon efficiency scaling having dwindled since 2004 and silicon density scaling (Moore's Law) slowing down, future digital platforms will rely on heterogeneous logic and memory to allow for IT scalability. Meanwhile, the demand for large-scale cloud services has grown dramatically faster than conventional silicon scaling, making IT platform scalability a grand challenge. Future platforms will need hand-in-hand collaboration between application domain experts and platform designers to improve scalability. With many online services being in-memory and the minimum communication latency between the farthest nodes being microseconds, future server platforms will go through revolutionary changes in architecture and systems to enable seamless aggregation of logic and memory resources across nodes, breaking the conventional abstraction layers. Babak's research and educational activities center around post-Moore server design. He investigates techniques to address these challenges in the context of the following projects:
- CloudSuite: A Benchmark Suite for Scale-Out Workloads
- ColTraIn: Co-Located Training and Inference DNN Accelerators
- QFlex: Fast, Full-System Open-Source Server Simulation/Emulation
- Scale-Out NUMA: Rack-Scale Computer Architecture
- VISA: Server Processors for the Dark Silicon Era
Silicon Heterogeneity in the Cloud
DATE, March 2019, PDF.
Datacenter for the Post-Moore Era
Euro-Par Keynote, August 2018, PDF.
Public Clouds will Subsume (Most of) HPC
May 2017, PDF.
Memory-Centric Server Architecture
Talks at Columbia, Edinburgh and HKUST,
Big Data & Dark Silicon: Taming Two IT Trends on a Collision Course
HiPEAC CSW & IEEE CloudNet Keynotes,
October 2014, PDF.
Reliability in the Dark Silicon Era
IOLTS 2011 Keynote,
July 2011, PDF.
Dark Silicon & Its Implications on Server Chip Design
November 2010, PDF, Video.
TRUSS: Reliable, Scalable Server Architecture
Georgia Institute of Technology, College of Computing Colloquia,
April 2006, PDF.
Temporal Memory Streaming
University of Texas, Computer Science Department Colloquia,
December 2005, PDF.
Transactional Execution: Wait-Free Hardware Memory Ordering
Dagstuhl Seminar on "Hardware and Software Consistency Models: Programmability and Performance",
October 2003, PDF.