Automatic Database Management System Tuning Through Large-scale Machine Learning
Authors -
Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, Bohan Zhang
Motivation –
Most prior work on improving database efficiency centers on the logical or
physical design of the database. This paper instead focuses on tuning the low-level
configuration knobs of a Database Management System (DBMS), such as memory cache
sizes, rather than on indexing. Database configuration tuning involves trying out many
combinations of knob values, which is an NP-hard problem. Database vendors such as
IBM, Oracle, and BerkeleyDB have tried to improve efficiency using heuristics or
influence diagrams, but these are manual processes that require domain experts. This
paper presents OtterTune, an ML-based tool that recommends values for DBMS knobs to
improve target performance measures such as latency and throughput for a given workload.
Positive Points -
[1] OtterTune works on any of its supported DBMS types. It recommends
knob settings for a new application by reusing the data stored in its repository: it
finds the best-matching previously seen workload and adapts that workload's knob
settings to the target measures (a rough sketch of this matching step follows).
It also uses the feedback on each recommendation to refine its models.
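A minimal sketch of the repository-matching idea, with assumed function and variable names (the real pipeline bins and normalizes the metrics before comparing workloads):

    import numpy as np

    def match_workload(target_metrics, repository):
        # repository: dict mapping workload id -> averaged runtime-metric vector.
        # Return the id of the previously seen workload whose metrics are
        # closest to the target's (Euclidean distance).
        return min(repository,
                   key=lambda wid: np.linalg.norm(repository[wid] - target_metrics))

    repository = {
        "workload_A": np.array([0.20, 0.90, 0.40]),
        "workload_B": np.array([0.70, 0.10, 0.30]),
    }
    print(match_workload(np.array([0.25, 0.85, 0.45]), repository))  # -> workload_A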
[2] OtterTune is robust to redundant knobs (knobs with different
names or strongly correlated knobs), to different knob granularities, and to the lack
of prior knowledge about which knobs are good or bad, since it considers only global
knobs. It uses factor analysis followed by k-means clustering to prune redundant and
useless metrics (sketched below).
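A rough illustration of that pruning step on toy data (the shapes and parameter values here are assumptions, not the paper's):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 30))   # 100 observed configurations x 30 runtime metrics

    fa = FactorAnalysis(n_components=5).fit(X)
    loadings = fa.components_.T      # one row of factor loadings per metric (30 x 5)

    km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(loadings)
    kept = []
    for c in range(km.n_clusters):
        members = np.where(km.labels_ == c)[0]
        # keep only the metric whose loadings are closest to the cluster center
        dists = np.linalg.norm(loadings[members] - km.cluster_centers_[c], axis=1)
        kept.append(members[np.argmin(dists)])
    print(sorted(kept))              # indices of the representative (pruned) metric set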
[3] OtterTune trains its models on the DBMS's internal runtime metrics, as these
metrics are directly affected by the knob settings. It uses a regularized linear
regression technique, the Lasso path algorithm, to select the most impactful DBMS
knobs, and it includes polynomial features so that dependencies between knobs are
captured, which makes tuning more efficient by reducing the search space (see the
sketch below).
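A simplified sketch of ranking knob features with the Lasso path on synthetic data (the data, the 1e-8 threshold, and the rank-by-entry-step rule are illustrative assumptions):

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures, StandardScaler
    from sklearn.linear_model import lasso_path

    rng = np.random.default_rng(1)
    knobs = rng.uniform(size=(200, 4))   # 200 samples x 4 knob settings
    latency = (3.0 * knobs[:, 0] + 0.5 * knobs[:, 1] * knobs[:, 2]
               + rng.normal(scale=0.1, size=200))

    # Polynomial terms let the model capture pairwise knob dependencies.
    poly = PolynomialFeatures(degree=2, include_bias=False)
    X = StandardScaler().fit_transform(poly.fit_transform(knobs))
    alphas, coefs, _ = lasso_path(X, latency - latency.mean())

    # A feature is more important the earlier it enters the path (at larger alpha).
    nonzero = np.abs(coefs) > 1e-8
    entry = np.where(nonzero.any(axis=1), nonzero.argmax(axis=1), coefs.shape[1])
    for name, step in sorted(zip(poly.get_feature_names_out(), entry), key=lambda t: t[1]):
        print(name, "enters at path step", step)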
[4] OtterTune uses Gaussian Process (GP) regression to recommend
configurations. This technique is not only less computationally expensive than a deep
neural network but also nearly as powerful, which is desirable for on-line tuning
(a toy example follows).
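A toy, one-knob sketch of the recommendation step (synthetic observations; the upper-confidence-bound rule here only approximates the exploration/exploitation trade-off the paper describes):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(2)
    X_obs = rng.uniform(0.0, 1.0, size=(15, 1))        # observed knob settings
    y_obs = np.sin(6 * X_obs[:, 0]) + rng.normal(scale=0.05, size=15)  # e.g. throughput

    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                  normalize_y=True).fit(X_obs, y_obs)

    candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
    mean, std = gp.predict(candidates, return_std=True)
    ucb = mean + 2.0 * std                             # also explore where the model is uncertain
    print("next knob setting to try:", candidates[np.argmax(ucb), 0])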
Negative Points -
[1] OtterTune keeps a list of the knobs that can be
changed dynamically without requiring a restart, and it suggests changing only those
dynamic knobs when restarting the DBMS is costly or undesirable. Since knobs may
be highly correlated, this fails to take into account the consequences of tuning
only a few knobs while keeping the others unchanged, which could potentially decrease
performance.
[2] The paper proposes a tuning solution that is supposed to avoid
heuristics, yet to group similar knobs together it uses k-means
and relies on the heuristic assumption that K = 40 will do a good job.
[3] OtterTune sometimes produces extreme values for some knobs
while trying to optimize the target metrics, because it fails to consider the
negative side effects of such recommendations.
- Saurabh Gupta
For negative point 1, I believe it does try to learn knob dependencies. It is true that K-means does require K, but they point to a heuristic (not sure how good it is, though). I'm not sure of your evidence for point #3? I appreciate your review!
Hello Prof Jon,
Point 3 -
It's mentioned in the paper for knobs such as the number of log files written
between checkpoints and the memory knob. According to the paper, the DBA, the tuning
script, and AWS set the log-file knob to a value between 16 and 64, while OtterTune sets it to 540. This happens because OtterTune tries to improve I/O performance, which causes the recovery time of the DBMS after a crash to increase. Another example is memory provisioning, where OtterTune over-provisions memory at 18 GB while RDS and the DBA suggest a value between 7 and 10 GB. This happens because it tries to optimize the given metrics at any cost.
Point 1 -
Yes, you are right about learning knob dependencies. The paper mentions that polynomial Lasso regression can learn dependencies between knobs, but it says nothing about dependencies across different types of knobs, for example between a dynamic knob and a knob that requires a restart. Since it is not always practical to restart the DBMS (or there may be no root access) for every tuning session, changing only the recommended dynamic knobs could degrade performance because of those correlations.
Point 2 -
The paper mentions that K = 40 is not guaranteed to find the optimal solution; they estimated the value of K using the gap statistic [1], which works well for well-separated clusters. So I am not sure how they chose the value, or what approximation of the clusters' characteristics they used.
[1] https://statweb.stanford.edu/~gwalther/gap
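For context, a small sketch of the gap-statistic procedure from [1] on toy data (picking the K with the largest gap is a simplification of the standard-error rule in [1]):

    import numpy as np
    from sklearn.cluster import KMeans

    def within_dispersion(data, k):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
        return km.inertia_   # sum of squared distances to the assigned centers

    def gap_statistic(data, k, n_refs=10):
        # Compare the data's dispersion to that of uniform reference data drawn
        # over the same bounding box; a large gap suggests k real clusters.
        rng = np.random.default_rng(0)
        lo, hi = data.min(axis=0), data.max(axis=0)
        ref = [np.log(within_dispersion(rng.uniform(lo, hi, size=data.shape), k))
               for _ in range(n_refs)]
        return np.mean(ref) - np.log(within_dispersion(data, k))

    rng = np.random.default_rng(3)
    data = np.vstack([rng.normal(loc=c, scale=0.2, size=(50, 2)) for c in (0.0, 2.0, 4.0)])
    gaps = {k: gap_statistic(data, k) for k in range(1, 7)}
    print("estimated K:", max(gaps, key=gaps.get))   # typically 3 for this well-separated toy data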