Leakage Control of Components on SoCs for Power and Reliability Management

Contact:
Nicolas Genko, PhD Student, EPFL-IC-LSI

Partners:
DACYA, UCM Madrid, Spain
Micrel Lab, DEIS, University of Bologna, Italy

Keywords:
System on Chip, Power Management, on-Chip interconnect, Reliability, MTTF.

Presentation:
This project is motivated by the availability of on-chip devices with several power states. Power gating with data retention provides memories that can be put into a sleep mode without losing data. We propose to study a Power Management System (PMS) with a dedicated communication layer. This additional interconnect links components that monitor the traffic at the edges of the communication medium. The components attached to memories can also control the power state of those memories.


 

This PMS gives us a way to globally manage the power of on-chip processors and storage cores while taking into account that waking up a core takes time and incurs an energy overhead. This power management mechanism provides energy gains without performance degradation with respect to power-unaware SoCs.
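The wake-up cost mentioned above is often reasoned about through a break-even time: the shortest idle period for which entering sleep saves energy. The sketch below is only an illustration under a simple energy model that is our assumption, not the project's; the function names and all parameter values are hypothetical.

```python
def break_even_time(p_idle_w, p_sleep_w, e_transition_j, t_transition_s):
    """Shortest idle period (seconds) for which sleeping saves energy.

    Assumed model (not from the project description):
      staying idle for T costs        p_idle_w * T
      sleeping for T costs            e_transition_j + p_sleep_w * (T - t_transition_s)
    where e_transition_j is the total energy of entering and leaving sleep
    and t_transition_s is the total time spent in those transitions.
    """
    if p_idle_w <= p_sleep_w:
        raise ValueError("sleep state must consume less power than idle state")
    # Solve p_idle * T = e_transition + p_sleep * (T - t_transition) for T.
    t_energy = (e_transition_j - p_sleep_w * t_transition_s) / (p_idle_w - p_sleep_w)
    # The idle period must at least cover the transitions themselves.
    return max(t_transition_s, t_energy)


def should_sleep(predicted_idle_s, p_idle_w, p_sleep_w, e_transition_j, t_transition_s):
    """True if the predicted idle period is long enough to amortize the overhead."""
    t_be = break_even_time(p_idle_w, p_sleep_w, e_transition_j, t_transition_s)
    return predicted_idle_s > t_be
```

With such a model, a memory would only be put to sleep when the expected idle period exceeds the break-even time; shorter idle periods would cost more energy than they save.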

In addition to power savings, this project will study the influence of the PMS on the reliability of the chip. In this context, duplication of on-chip cores will also be investigated as a way to increase reliability.

General Steps:
Define a communication system and architecture for the PMS:
To achieve power management without performance loss, we must design a PMS whose latency is very low with respect to the data communication medium and which scales to any SoC configuration.

Assess scalability of the PMS and estimate gains:
We want to develop a methodology that shows that our PMS is scalable and that estimates the gains for any SoC.

Power management policies investigation and implementation:
Our PMS can use global information about the power state of the on-chip components and the incoming traffic to cores. We will develop new policies that exploit this global data.
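To make the idea of a policy driven by global information more concrete, here is a minimal sketch of one decision step. It is not one of the project's policies: the `MemoryStatus` record, the `decide` function and the idle threshold are hypothetical names and parameters chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class MemoryStatus:
    asleep: bool      # power state reported by the component attached to the memory
    idle_cycles: int  # cycles since the last access seen at the memory


def decide(memories, incoming, idle_threshold):
    """One decision step of a hypothetical global policy.

    memories: dict mapping memory name -> MemoryStatus
    incoming: dict mapping memory name -> number of requests heading to it,
              as observed by the traffic monitors at the edges of the medium
    Returns the lists of memories to wake up and to put to sleep.
    """
    commands = {"wake": [], "sleep": []}
    for name, status in sorted(memories.items()):
        pending = incoming.get(name, 0)
        if status.asleep and pending > 0:
            # Traffic is on its way: wake the memory before the requests arrive.
            commands["wake"].append(name)
        elif (not status.asleep and pending == 0
              and status.idle_cycles >= idle_threshold):
            # No traffic is heading to this memory and it has been idle long enough.
            commands["sleep"].append(name)
    return commands
```

The point of the sketch is that the wake-up decision uses traffic observed away from the memory itself, which is what could hide the wake-up latency from the requesting core.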

Thermal and reliability analysis:
Recent studies have shown that lowering the mean temperature of a chip increases its Mean Time To Failure (MTTF). Since power management lowers the mean temperature, we want to develop a power management policy that targets a larger MTTF.
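The link between temperature and MTTF can be illustrated with an Arrhenius-type model, which is commonly used for thermally activated failure mechanisms. Using this model here is our assumption, since the project description does not name one; the activation energy of 0.7 eV and the function name are likewise illustrative.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def mttf_gain(temp_before_c, temp_after_c, activation_energy_ev=0.7):
    """Ratio MTTF(after) / MTTF(before) under an assumed Arrhenius model.

    Assumes MTTF is proportional to exp(Ea / (k * T)) with T in kelvin.
    The default activation energy is an illustrative value, not a measured one.
    """
    t_before_k = temp_before_c + 273.15
    t_after_k = temp_after_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_after_k - 1.0 / t_before_k
    )
    return math.exp(exponent)
```

Under these assumptions, lowering the mean temperature from 85 °C to 75 °C gives a ratio of a little under 2, meaning the MTTF would nearly double. Real failure mechanisms have different activation energies and also depend on factors such as thermal cycling, so this is only a first-order estimate.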

Use of redundancy:
To increase MTTF, we propose to use standby cores. Duplicating cores raises new design challenges for the communication medium. After studying those challenges, we will present the resulting trade-offs between power, area and reliability.
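As a rough illustration of why standby cores can increase MTTF, the sketch below compares two textbook redundancy models. Both rest on assumptions that are ours and not the project's: identical cores with a constant failure rate, perfect failure detection and switching, and, for the powered-off (cold) case, spares that do not age while switched off. The function names are hypothetical.

```python
def mttf_cold_standby(mttf_single, n_cores):
    """MTTF of n identical cores used one after another.

    Assumes powered-off spares do not fail and switching is perfect,
    so the lifetimes simply add up.
    """
    if n_cores < 1:
        raise ValueError("need at least one core")
    return n_cores * mttf_single


def mttf_active_parallel(mttf_single, n_cores):
    """MTTF of n identical cores that are all powered on; one survivor suffices.

    For exponentially distributed lifetimes this is
    mttf_single * (1 + 1/2 + ... + 1/n).
    """
    if n_cores < 1:
        raise ValueError("need at least one core")
    return mttf_single * sum(1.0 / i for i in range(1, n_cores + 1))
```

With one spare, the cold-standby model doubles the MTTF while the always-on model only multiplies it by 1.5, which hints at why combining redundancy with power gating is attractive. The area cost of the duplicated core and the added complexity in the communication medium are not captured here.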
 

 

Download project presentation file (192 KB pdf)