Automatic Database Management System Tuning through Large-scale Machine Learning

Authors:

Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, Bohan Zhang

Motivation:

Database performance critically depends on configuration knobs that control how the system handles the application's workload and how it utilizes the underlying hardware. Database tuning is complicated by the sheer number of configurable knobs, the dependencies between knobs, the number of possible values for each knob, and the difficulty of reusing previously determined configurations. The paper introduces OtterTune, an automated database tuning tool that recommends configurations based on previous experience. It compares the performance of OtterTune against DBMS-specific tuning tools and configurations recommended by a human expert.

Pros:

1. Using only global knobs ensures that the metrics collected for one DBMS can be mapped to those of another DBMS. Each tuning session can therefore draw on a larger repository of recorded metrics for training than traditional DBMS-specific tuning tools.

2. Each tuning session has a repository of all previously tested configurations and their effects available for learning. The information generated in each session is used to refine the models, thereby improving recommendation accuracy over time.

3. Pruning metrics using Factor Analysis and K-means clustering significantly reduces the search space of metrics to be considered for tuning and provides a concise representation of the workload (see the sketch after this list).
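A minimal sketch of this pruning idea, assuming a metrics matrix with one row per observed configuration and one column per DBMS metric; the matrix sizes, number of factors, and number of clusters below are illustrative placeholders, not values from the paper:

```python
# Sketch of Factor-Analysis + K-means metric pruning (not OtterTune's actual code):
# Factor Analysis embeds each metric into a few latent factors, then K-means
# groups metrics with similar loadings so one representative per cluster
# can stand in for the rest.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 120))   # hypothetical data: 50 configurations x 120 DBMS metrics

# Fit Factor Analysis; the transposed loadings give one low-dimensional
# embedding per metric (one row per metric).
fa = FactorAnalysis(n_components=5, random_state=0).fit(X)
metric_embeddings = fa.components_.T          # shape: (120 metrics, 5 factors)

# Cluster the metric embeddings; keep the metric closest to each centroid.
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(metric_embeddings)
kept = []
for c in range(km.n_clusters):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(metric_embeddings[members] - km.cluster_centers_[c], axis=1)
    kept.append(int(members[np.argmin(dists)]))

print(f"pruned {X.shape[1]} metrics down to {len(kept)} representatives: {sorted(kept)}")
```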

Cons:

1. The Lasso regression used in OtterTune includes polynomial terms to model the dependencies between knobs. Depending on the number of knobs being modeled, the number of higher-order interaction terms can grow combinatorially, increasing the complexity of the model (see the sketch after this list).

2. Information about the effects of tuning a knob can only be gained through experience. Depending on the knob being tuned, the cost of conducting such a test may be substantial. However, without this prior information, it would be impossible to build a learned tuning approach.

3. The tool is not completely automated: tasks such as blacklisting knobs that should not be tuned still require the intervention of a DBA. It also needs an external hint to discard useless metrics, since it has no programmatic way of determining which metrics are truly uninformative.

4. It sacrifices fine-grained tuning at the table or component level by considering only global knobs, so that similar metrics can be mapped across different databases.
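A minimal sketch of the feature-growth concern from point 1, using scikit-learn's polynomial expansion and Lasso on dummy data; the knob count, sample count, and regularization strength are hypothetical, not OtterTune's settings:

```python
# Sketch of polynomial interaction terms feeding a Lasso model (illustrative only):
# a degree-2 expansion makes the feature count grow quadratically with the number
# of knobs before Lasso's L1 penalty prunes the unimportant ones back.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n_samples, n_knobs = 200, 30                    # hypothetical sizes
knobs = rng.uniform(size=(n_samples, n_knobs))  # sampled knob settings
latency = rng.uniform(size=n_samples)           # observed target metric (dummy data)

# Degree-2 expansion: original knobs, their squares, and all pairwise interactions.
poly = PolynomialFeatures(degree=2, include_bias=False)
features = poly.fit_transform(knobs)
print(f"{n_knobs} knobs expand to {features.shape[1]} features")  # 30 -> 495

# Lasso zeroes out most coefficients, ranking the knobs (and interactions)
# that actually matter for the target metric.
model = Lasso(alpha=0.05).fit(features, latency)
print(f"non-zero coefficients after Lasso: {np.count_nonzero(model.coef_)}")
```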



-Karthik Unnikrishnan








Comments

  1. Excellent analysis. A deeper question is whether this approach can scale. How expensive is it to change configurations for each observation period? How many observation periods seem to be enough? Is there any hope of doing local knob optimization?


