CAPES: Unsupervised Storage Performance Tuning using Neural Network-Based Deep Reinforcement Learning


Proposed Solution
The authors propose a Deep Reinforcement Learning (DRL) based general parameter tuning system called CAPES, which needs no prior knowledge of the target system. It is designed to find optimal values for tunable parameters in computer systems where human tuning is costly and often fails to reach optimal performance.
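To make the idea concrete, here is a minimal sketch of the kind of control loop CAPES describes: Monitoring Agents observe performance indicators, the DRL engine picks a tuning action (epsilon-greedy over the Deep Q-Network's outputs), a Control Agent applies it, and observed performance serves as the reward. All function names and the toy action set below are placeholders, not the authors' actual code.

```python
import random

ACTIONS = ["congestion_window+1", "congestion_window-1", "no_op"]
EPSILON = 0.1  # exploration rate

def observe_indicators():
    # Stand-in for the Monitoring Agents: returns a feature vector.
    return [random.random() for _ in range(4)]

def q_values(state):
    # Stand-in for a forward pass of the Deep Q-Network.
    return [random.random() for _ in ACTIONS]

def apply_action(action):
    # Stand-in for the Control Agent changing a tunable parameter.
    pass

def measured_throughput():
    # Stand-in for the reward signal (e.g., aggregate write throughput).
    return random.random()

for step in range(100):
    state = observe_indicators()
    if random.random() < EPSILON:            # explore
        action = random.choice(ACTIONS)
    else:                                    # exploit
        q = q_values(state)
        action = ACTIONS[q.index(max(q))]
    apply_action(action)
    reward = measured_throughput()
    # In CAPES, the (state, action, reward, next state) tuple would be
    # stored in the Replay DB, from which the DRL Engine trains asynchronously.
```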

Good Points:

  1. CAPES works in an unsupervised manner, as it requires no prior knowledge of the target system.
  2. CAPES is easy to set up: it causes little downtime and requires few changes to the target system.
  3. CAPES can adapt to dynamically changing workloads.
  4. CAPES scales to large distributed systems, which can be achieved simply by increasing the size of the Deep Neural Network.
  5. CAPES can tune for multiple objectives at the same time, such as throughput and fairness simultaneously. Adding a new objective to optimize only requires one more unit in the output layer corresponding to that objective.
  6. Algorithms usually either ignore future rewards or weight all future rewards up to some horizon equally, which can increase prediction error. The authors instead discount future rewards according to how far they lie from the current time, which is a smart choice: it preserves the information in the immediately following timestep without letting noisier, more distant rewards bias the estimate (see the discounting sketch after this list).
  7. The delay between an action and its reward does not cause problems: by Bellman's convergence proof, the Q-function the authors build converges to the optimum after iterative training.
  8. The authors include the month, day, hour, and minute as separate performance indicators, rather than treating the date and time as single units, which lets the Deep Neural Network learn the approximate relationship between workload changes and the month/day/hour/minute at which they occur (see the encoding sketch after this list).
  9. CPU and memory utilization and network traffic are minimized because the Monitoring Agent sends a performance indicator only when its value differs from the previous sampling tick (sketched after this list).
  10. Introducing the Interface Daemon decouples the network communication code from the other components, reduces the overhead of locking the Replay DB, and enables independent control of the Monitoring Agent and the DRL Engine.
  11. Using the Action Checker, the Interface Daemon can rule out bad actions or parameter values that would hamper the target system.
  12. The authors provide an extensive evaluation of CAPES and take care to ensure that it is not prone to overfitting.
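Points 6 and 7 refer to the standard discounted Q-learning target. A minimal sketch, assuming an illustrative discount factor of 0.99 (the paper's concrete value may differ): a reward k steps ahead is effectively weighted by GAMMA**k, so the immediate next step dominates while distant, noisier predictions fade out.

```python
GAMMA = 0.99  # discount factor; a reward k steps ahead is weighted by GAMMA**k

def q_learning_target(reward, next_q_values, terminal=False):
    """Bellman target for one transition: r + GAMMA * max_a' Q(s', a')."""
    if terminal:
        return reward
    return reward + GAMMA * max(next_q_values)

# Immediate reward 5.0, best predicted value of the next state 10.0:
print(q_learning_target(5.0, [2.0, 10.0, 7.5]))  # 5.0 + 0.99 * 10.0 = 14.9
```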

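The time encoding from point 8 is easy to illustrate; a short sketch (the function name is mine, not the paper's):

```python
from datetime import datetime

def time_features(ts: datetime):
    # Month, day, hour, and minute become four separate indicators,
    # letting the network correlate workload shifts with when they occur.
    return [ts.month, ts.day, ts.hour, ts.minute]

print(time_features(datetime(2017, 11, 6, 14, 30)))  # [11, 6, 14, 30]
```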

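And the change-suppression behavior from point 9 amounts to remembering the last value sent per indicator; a hypothetical sketch:

```python
class MonitoringAgent:
    def __init__(self, send):
        self.send = send   # callback toward the Interface Daemon
        self.last = {}     # indicator name -> last value sent

    def sample(self, indicators):
        # Send an indicator only if it changed since the previous tick.
        for name, value in indicators.items():
            if self.last.get(name) != value:
                self.send(name, value)
                self.last[name] = value

agent = MonitoringAgent(send=lambda n, v: print(f"send {n}={v}"))
agent.sample({"read_bw": 100, "write_bw": 80})   # both sent
agent.sample({"read_bw": 100, "write_bw": 85})   # only write_bw re-sent
```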
Bad Points:

  1. The authors claim that the proposed architecture does not limit how the system can be deployed; however, they provide no example of an alternative or modified deployment.
  2. The authors do not explain how to include system statuses (performance indicators) that are cumulative in nature, even when these are known to relate to system performance.
  3. The Monitoring Agent collects performance indicators at a fixed, predesignated sampling frequency. Since the system's state is dynamic, the polling frequency should be dynamic too, e.g., lower when the system is overloaded and higher otherwise, to achieve better performance.
  4. To use the Action Checker, one has to manually specify every action or parameter value to be ruled out, which relies on expert understanding of the target system (see the sketch after this list).
  5. CAPES is not suitable for systems where read operations outnumber write operations, and most real-world systems fall into this category.
  6. CAPES is not suitable for mission-critical systems where suboptimal actions are too risky to apply. Although the Action Checker can rule out many known bad actions, validating each predicted action adds overhead without guaranteeing that an accepted action will not turn out badly later.
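To illustrate points 11 (Good) and 4 (Bad) together: an Action Checker reduces to a set of hand-written rules, every one of which must come from expert knowledge of the target system. The parameter name and bounds below are hypothetical.

```python
RULES = [
    lambda a: a["congestion_window"] >= 1,     # never stall I/O entirely
    lambda a: a["congestion_window"] <= 256,   # cap the queue depth
]

def check_action(action):
    """Return True only if the proposed action passes every manual rule."""
    return all(rule(action) for rule in RULES)

proposed = {"congestion_window": 0}
if check_action(proposed):
    print("apply", proposed)
else:
    print("vetoed", proposed)  # the DRL Engine must fall back to another action
```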
- Akash Kulkarni

Comments

  1. Points 5 and 6 are well taken. But it is ok to focus on write optimization provided read performance is not sacrificed. How do you think read performance could be optimized? As far as mission critical systems, are you saying any stochastic solution may be too risky? For example, would we ever want an airplane's auto-pilot to use a DNN or DRL? Good question!
