Thesis title: Design of Thermal Management Control Policies for Multiprocessors Systems on Chip
Thermal Aware Policies Design for MPSoCs
Francesco Zanini, PhD Student, EPFL-IC-LSI
Integrated Systems Laboratory, EPFL
Embedded Systems Laboratory, EPFL
Information Systems Laboratory, Stanford University
Automatic Control Laboratory, ETHZ
Micrel Lab, University of Bologna
Keywords: Nanotechnology, thermal management, policy, MPSoC, reliability, DVFS
As technology advances, the number of functional units and cores integrated on a single chip keeps increasing. Several commercial multicore architectures, ranging from a few cores to several tens of cores, are becoming available, such as IBM’s Cell, Sun’s Niagara and Tilera’s 64-core architecture. However, to implement these systems, the semiconductor industry faces serious technological challenges. Peak power dissipation and its thermal consequences are predicted to become a major performance bottleneck for multicore systems in the near future. Temperature gradients and hot-spots not only degrade system performance, but also lead to unreliable circuit operation and shorten the lifetime of the chip. Thermal management for multicore architectures is therefore a critical problem. In recent years, thermal management and balancing techniques have received a lot of attention. Many state-of-the-art thermal control policies manage power through techniques based on dynamic voltage and frequency scaling (DVFS).
Block model representation of a DVFS-based thermal management system
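To make the role of DVFS more concrete, the sketch below uses the standard first-order approximation of CMOS dynamic power, P = a · C · V² · f. The function name, the capacitance and activity values, and the two operating points are illustrative assumptions and are not taken from this project.

```python
def dynamic_power(voltage, frequency_hz, capacitance_f=1e-9, activity=0.5):
    """First-order CMOS dynamic power estimate: P = activity * C * V^2 * f.

    capacitance_f and activity are placeholder values chosen for illustration.
    """
    return activity * capacitance_f * voltage ** 2 * frequency_hz


# Hypothetical operating points (voltage in volts, frequency in hertz).
nominal = dynamic_power(1.2, 2.0e9)
scaled = dynamic_power(0.9, 1.0e9)

# Fraction of the nominal dynamic power consumed at the scaled operating point.
power_ratio = scaled / nominal
```

Because voltage enters quadratically, lowering voltage together with frequency reduces dynamic power, and hence heat generation, faster than lowering frequency alone. This is why DVFS is a common actuator for thermal control policies.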
Most previous work targets power density reduction, which lowers the overall temperature. However, these techniques do not minimize thermal gradients or hot-spots. A very recent approach tackles joint processor power optimization and thermal management using convex optimization. To keep that approach feasible from an implementation perspective, it makes several simplifying assumptions. Assuming a uniform temperature over the chip weakens the quality of the results. Frequent abrupt changes in operating frequency and voltage produce thermal cycling, which raises the failure rate of the system. The effect of thermal cycling on chip reliability can be modelled by the Coffin-Manson relation, a power law that relates the number of cycles to failure to the magnitude of the thermal cycles. In addition, discontinuous power-mode transitions, in both voltage and frequency scaling, waste additional power.
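The Coffin-Manson relation is commonly written as N_f = C0 · (ΔT)^(-q), where N_f is the number of cycles to failure, ΔT is the amplitude of the temperature cycle, and C0 and q are empirical, material-dependent constants. The sketch below evaluates this form. The function name and the values of C0 and q are assumptions chosen only for illustration.

```python
def cycles_to_failure(delta_t_kelvin, c0=1e9, q=2.0):
    """Coffin-Manson estimate: N_f = c0 * delta_T ** (-q).

    c0 and q are empirical constants; the defaults are placeholders,
    not measured values for any real chip or package.
    """
    if delta_t_kelvin <= 0:
        raise ValueError("thermal cycle amplitude must be positive")
    return c0 * delta_t_kelvin ** (-q)


# Compare a mild and a severe thermal cycle (amplitudes in kelvin).
mild = cycles_to_failure(10.0)
severe = cycles_to_failure(20.0)

# How many times longer the part survives under the milder cycling.
lifetime_ratio = mild / severe
```

With the assumed exponent q = 2, doubling the cycle amplitude cuts the expected number of cycles to failure by a factor of four, which illustrates why smoothing frequency and voltage transitions matters for reliability.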
The goal of this project is to investigate thermal-aware policy design for MPSoCs. It will address the difficulties of modelling the thermal properties of two-dimensional and three-dimensional MPSoC structures. This thesis will analyze techniques to extract the key features of MPSoC architectures and to model them while keeping the model as simple as possible. On the one hand, the results of this work will be used to improve the accuracy of existing thermal simulators for MPSoCs. On the other hand, they will be used to improve the mathematical representation that thermal management and balancing policies rely on when optimizing MPSoCs. The thesis will also investigate thermal management and balancing policies themselves, a broad area in which many approaches have been proposed. It will study optimization algorithms that allow online optimization of the MPSoC. A key point in the optimization process is the validity of the prediction of the future workload required by the scheduler. This issue, together with techniques to better optimize thermal management policies, will also be addressed.
F. Zanini, D. Atienza, and G. De Micheli, “A Control Theory Approach for Thermal Balancing of MPSoC”, in Proceedings of the 14th Asia and South Pacific Design Automation Conference, vol. 1, (Yokohama), pp. 7-12, IEEE and ACM, 2009.
F. Zanini, D. Atienza, L. Benini, and G. De Micheli, “Multicore Thermal Management with Model Predictive Control”, in Proceedings of the 19th European Conference on Circuit Theory and Design (ECCTD 2009), vol. 1, (New York), pp. 90-95, IEEE Press, 2009.
F. Zanini, D. Atienza, A. K. Coskun, and G. De Micheli, “Optimal Multi-Processor SoC Thermal Simulation via Adaptive Differential Equation Solvers”, in Proceedings of the 17th Annual IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), vol. 1, (New York), pp. 80-85, IEEE/IFIP Press, 2009.