Integrated CPU and L2 Cache Voltage Scaling using Machine Learning

Authors: Nevine AbouGhazaleh, Alexandre Ferreira, Cosmin Rusu, Ruibin Xu, Frank Liberato, Bruce Childers, Daniel Mossé, Rami Melhem

Motivation: 


Multiple Clock Domain (MCD) chips allow fine-grained power management of each domain through dynamic voltage and frequency scaling (DVS). The paper exploits this extra level of control to generate a custom power management policy for embedded processors. The authors propose PACSL, a Power-Aware Compiler-based approach using Supervised Learning, to automatically derive an integrated CPU-core and L2 cache DVS policy. The approach exploits the fact that every application goes through memory-intensive and CPU-intensive phases, which can be identified and used to choose appropriate voltage and frequency settings for the processor core and the L2 cache.



Pros:


1. The input state space takes into account CPI, L2PI, MPI, and the CPU-core and cache frequencies, which together capture both program and architectural behavior.
2. The technique can be used to optimise energy, performance, and energy-delay product.
3. Training accounts for design issues such as feature selection (to avoid overfitting) and the overhead incurred by frequency changes.
4. The technique saves 22% on average (up to 46%) in energy-delay product over a DVS technique that applies independent DVS decisions in each domain.
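
As a rough illustration of points 2 and 4, the sketch below shows how one set of measurements could be reused to pick a frequency pair under different optimisation metrics. It is not the paper's implementation: the `best_pair` helper, the frequency values, and the energy and delay numbers are all made up for the example.

```python
from typing import Dict, Tuple

# Hypothetical (cpu_mhz, l2_mhz) frequency pair; the values are illustrative only.
FreqPair = Tuple[int, int]


def energy_delay_product(energy_j: float, delay_s: float) -> float:
    """Energy-delay product: lower is better."""
    return energy_j * delay_s


def best_pair(samples: Dict[FreqPair, Tuple[float, float]],
              metric: str = "edp") -> FreqPair:
    """Return the frequency pair that minimises the chosen metric.

    `samples` maps each frequency pair to a measured (energy, delay) tuple.
    """
    def score(item):
        energy, delay = item[1]
        if metric == "energy":
            return energy
        if metric == "delay":
            return delay
        if metric == "edp":
            return energy_delay_product(energy, delay)
        raise ValueError(f"unknown metric: {metric}")

    return min(samples.items(), key=score)[0]


# Made-up measurements for a single program state.
samples = {
    (1000, 1000): (2.0, 1.0),  # fastest, most energy
    (1000, 500): (1.6, 1.1),   # slower cache, small slowdown
    (500, 500): (1.2, 1.8),    # least energy, slowest
}

print(best_pair(samples, "edp"))
print(best_pair(samples, "energy"))
print(best_pair(samples, "delay"))
```

The same `samples` dictionary serves all three metrics, which is the sense in which training data only has to be collected once.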


Cons:


1. The quality of the training data is key to the model's performance. The applications used to collect training samples must show enough variation to populate the state table adequately.
2. Although the paper highlights the advantage of collecting training samples only once for any number of optimisation metrics, it does not consider that building the state space is itself an expensive process.
3. No real inference is done in this paper; a simple rule-based algorithm derives rules from the available action space.
4. The paper relies heavily on the initially populated state table, which is built from continuous metrics discretised into bins. The discretisation is not explained in much detail, which matters because binning can lose information.
5. It would be interesting to see whether neural networks with RL could be used instead, as they handle continuous metrics better and could give more insight into the relationship between voltages and states.
6. The choice of RIPPER for learning the rules is not sufficiently justified.
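
To make the concern in point 4 concrete, here is a minimal sketch of how binning continuous metrics into a state key can merge measurably different behaviour. The bin edges and the `state_key` helper are assumptions chosen for illustration; the paper's actual discretisation is not described in this review.

```python
from bisect import bisect_right

# Hypothetical bin edges; not taken from the paper.
CPI_EDGES = [1.0, 2.0, 4.0]
L2PI_EDGES = [0.01, 0.05]
MPI_EDGES = [0.001, 0.01]


def to_bin(value, edges):
    """Index of the bin that `value` falls into, given sorted bin edges."""
    return bisect_right(edges, value)


def state_key(cpi, l2pi, mpi):
    """Discretised state used as a key into a (hypothetical) state table."""
    return (
        to_bin(cpi, CPI_EDGES),
        to_bin(l2pi, L2PI_EDGES),
        to_bin(mpi, MPI_EDGES),
    )


# Two phases with clearly different metrics...
phase_a = state_key(cpi=1.1, l2pi=0.02, mpi=0.002)
phase_b = state_key(cpi=1.9, l2pi=0.04, mpi=0.009)

# ...collapse to the same state, so they would receive the same DVS decision.
print(phase_a, phase_b, phase_a == phase_b)
```

With coarse bins like these, a phase that is nearly twice as memory-bound as another is indistinguishable in the state table, which is the kind of information loss the review is worried about.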

-Aakhila Shaheen

Comments

  1. The paper is a few years old; it could definitely be revisited with deep learning or reinforcement learning approaches. A question might be: how good is good? In other words, how close was their lookup solution to "optimal"?
