Scalable Hardware and Software for Thousand-Core Chips

Multi-core chips will soon include a thousand cores and feature deep memory hierarchies with non-uniform latency characteristics. In such systems, the latency and energy overheads of remote memory accesses will dwarf the latency and energy overheads of computation. Hence, we need to revisit how we build and manage on-chip memory systems.

This talk will present key hardware and software techniques towards this goal. First, we will discuss how to build scalable memory hierarchies that provide quality of service (QoS) guarantees under all possible scenarios. In particular, we will focus on Vantage, a technique that implements fine-grain cache partitioning, enabling hundreds of threads to share a cache in a controlled manner while providing configurability and isolation. Next, we will describe GRAMPS, a scheduling and runtime system for pipeline-parallel programs that optimizes memory behavior while performing fine-grain dynamic load balancing with low overhead. Even on today's multi-core chips, GRAMPS outperforms commonly used scheduling approaches such as task-stealing, GPGPU, and static streaming schedulers. Finally, we will present a simple hardware mechanism for core-to-core messaging that allows for the development of low-overhead, software-mostly runtime systems for fine-grain parallelism that scale efficiently to hundreds of hardware threads.

Type of Seminar:
IfA Seminar
Speaker:
Prof. Christos Kozyrakis, Electrical Engineering and Computer Science, Stanford University, Stanford
Date/Time:
Jul 26, 2012, 14:15
Location:
ETZ E 6, Gloriastr. 35
Contact Person:
Prof. John Lygeros
Biographical Sketch:
Christos Kozyrakis is an Associate Professor of Electrical Engineering & Computer Science at Stanford University. He works on architectures, runtime environments, and programming models for parallel computing systems. At Berkeley, he developed the IRAM architecture, a novel media-processor system that combined vector processing with embedded DRAM technology. At Stanford, he co-led the Transactional Coherence and Consistency (TCC) project, which developed hardware and software mechanisms for programming with transactional memory. He also led the Raksha project, which developed practical hardware support and security policies to deter high-level and low-level security attacks against deployed software. Dr. Kozyrakis is currently working on hardware and software techniques for next-generation data centers. He is also a member of the Pervasive Parallelism Lab at Stanford, a multi-faculty effort to make parallel computing practical for the masses.

Christos received a BS degree from the University of Crete (Greece) and a PhD degree from the University of California at Berkeley (USA), both in Computer Science. He is the Willard R. and Inez Kerr Bell faculty scholar at Stanford and a senior member of the ACM and the IEEE. Christos has received the NSF CAREER Award, an IBM Faculty Award, the Okawa Foundation Research Grant, and a Noyce Family Faculty Scholarship.