Integrated CPU and L2 Cache Voltage Scaling using Machine Learning

Authors:

Nevine AbouGhazaleh, Alexandre Ferreira, Cosmin Rusu, Ruibin Xu, Frank Liberato, Bruce Childers, Daniel Mossé, Rami Melhem

Department of Computer Science, University of Pittsburgh


Motivation:

Power management becomes harder as computational and storage capabilities grow. The Multiple Clock Domains (MCD) design enables fine-grained power management through dynamic voltage scaling (DVS), since each domain can run at its own voltage and frequency. This paper presents a Power-Aware Compiler-based approach using Supervised Learning (PACSL), which automatically constructs an integrated DVS policy for the CPU core and the on-chip L2 cache according to system and workload requirements.

Main points:

1. The system state is represented by CPI (cycles per instruction), L2PI (L2 accesses per instruction), MPI (memory accesses per instruction), the CPU-core frequency, and the L2 cache frequency, which together characterize application behavior. The optimal policy is found by exhaustive search of the state space.

2. They use supervised learning to derive the policy from the training data. The propositional rule learner RIPPER is used because its rules are compact, expressive, human-readable, and easy to implement in hardware.

3. The paper discusses design issues such as feature selection, optimization metrics, and the choice of training applications, explaining the choices and trade-offs made in the experiments.

4. The experiments compare PACSL against a policy that applies DVS to the CPU core and the L2 cache independently. PACSL improves the energy-delay product by 22% on average, and by up to 46%, over the independent policy.

5. The paper examines how performance changes with the state description elements, the optimization metric, the processor configuration, and the control granularity. It also analyzes state-space coverage and rule simplification.
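
To make main point 1 concrete, here is a minimal sketch of labeling observed states by exhaustive search over frequency pairs. Everything in it is my own assumption rather than the paper's setup: the frequency levels, the `toy_cost` energy/delay model, its constants, and the function names are all hypothetical. The (state, best setting) pairs it produces are the kind of labeled examples a rule learner such as RIPPER would be trained on.

```python
from itertools import product

# Hypothetical frequency levels in GHz for each clock domain (not from the paper).
CPU_FREQS = [0.5, 0.75, 1.0]
L2_FREQS = [0.5, 0.75, 1.0]

# Made-up constants for the toy model below.
L2_ACCESS_CYCLES = 10     # L2 cycles per L2 access
MEM_LATENCY_NS = 100.0    # fixed main-memory latency per access
STATIC_POWER = 0.5        # leakage, in arbitrary energy units per ns


def toy_cost(cpu_f, l2_f, cpi, l2pi, mpi):
    """Return (energy, delay) per instruction under an invented model.

    Here cpi counts core-only cycles; L2 and memory time are added separately.
    Dynamic energy scales with f**2 as a crude stand-in for voltage scaling.
    """
    delay = cpi / cpu_f + l2pi * L2_ACCESS_CYCLES / l2_f + mpi * MEM_LATENCY_NS
    dynamic = cpi * cpu_f ** 2 + l2pi * L2_ACCESS_CYCLES * l2_f ** 2
    energy = dynamic + STATIC_POWER * delay
    return energy, delay


def energy_delay_product(cpu_f, l2_f, cpi, l2pi, mpi):
    """Energy-delay product (EDP) of one frequency pair for one state."""
    energy, delay = toy_cost(cpu_f, l2_f, cpi, l2pi, mpi)
    return energy * delay


def best_frequencies(cpi, l2pi, mpi):
    """Try every (CPU, L2) frequency pair and keep the one with the lowest EDP."""
    return min(
        product(CPU_FREQS, L2_FREQS),
        key=lambda pair: energy_delay_product(pair[0], pair[1], cpi, l2pi, mpi),
    )


# Label a few hypothetical workload states (cpi, l2pi, mpi) with their best setting.
states = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.05), (1.5, 0.1, 0.01)]
training_set = [(s, best_frequencies(*s)) for s in states]
```

In this toy model the first, compute-bound state favors the highest core frequency, while the second, memory-bound state favors a lower one, because memory stall time does not shrink with the core clock.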

Trade-offs:

1. There is no comparison between PACSL and the optimal policy. As I understand it, a supervised learning algorithm is trained against known correct answers, so its error rate measures how well it performs. Since the training process in this paper explores all combinations of system configurations, the optimal policy should be obtainable (even if only by sampling), yet the paper does not report it.

2. The paper applies RIPPER but offers no quantitative analysis showing why RIPPER is better than other algorithms. Other, more sophisticated algorithms might perform better.

3. The paper evaluates PACSL and the independent DVS policy only on the MiBench and SPEC2000 benchmarks. I am not sure these two suites are representative enough.

4. When state measurements such as CPI are discretized, there is no explanation of why those intervals were chosen. They appear to come from experience.
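
To illustrate point 4, here is a small sketch of how a continuous measurement like CPI could be mapped to discrete bins, first with hand-picked edges and then with edges derived from observed data (equal-frequency binning). The edge values, sample data, and function names are hypothetical; the paper's actual intervals are not reproduced here, and quantile binning is just one simple alternative I am swapping in, not something the paper proposes.

```python
from bisect import bisect_right
from statistics import quantiles

# Hand-picked, hypothetical CPI bin edges (not the paper's intervals).
FIXED_CPI_EDGES = [1.0, 2.0, 4.0]


def discretize(value, edges):
    """Return the bin index: 0 below the first edge, len(edges) at or above the last."""
    return bisect_right(edges, value)


def quantile_edges(samples, bins):
    """Derive bin edges so each bin holds roughly the same number of samples."""
    return quantiles(samples, n=bins)


# Made-up CPI samples standing in for measurements from a training run.
cpi_samples = [0.8, 0.9, 1.1, 1.3, 1.6, 2.2, 3.5, 5.0]
learned_edges = quantile_edges(cpi_samples, bins=4)

fixed_bins = [discretize(c, FIXED_CPI_EDGES) for c in cpi_samples]
learned_bins = [discretize(c, learned_edges) for c in cpi_samples]
```

Fixed edges can leave some bins crowded and others nearly empty when the data is skewed, whereas data-derived edges spread the samples more evenly. That is one reason the choice of intervals deserves a justification.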

- Simin Wang




Comments

  1. Excellent observations. They were lazy about comparisons to optimal (at least for one application). Learning the bins/discretization might be an interesting future area.
