A Sample-Aware Database Tuning System With Deep Reinforcement Learning

Zhongliang Li, Yaofeng Tu, Zongmin Ma
Copyright: © 2024 | Pages: 25
DOI: 10.4018/JDM.333519

Abstract

Based on the relationship between client load and overall system performance, the authors propose a sample-aware deep deterministic policy gradient model. Specifically, they improve sample quality by filtering out sample noise caused by fluctuations in client load, which accelerates model convergence in the intelligent tuning system and improves the tuning effect. In addition, the hardware resources and client load consumed by the database during operation are added to the model for training, which strengthens the model's ability to characterize database performance and improves the quality of the parameters the algorithm recommends. They also propose an improved closed-loop distributed training architecture that combines online and offline training to quickly obtain high-quality samples and improve the efficiency of parameter tuning. Experimental results show that the recommended configuration parameters improve database system performance and shorten tuning time.

Introduction

In recent years, information and communication technologies have developed rapidly, with significant emphasis on cloud computing, big data (Eachempati et al., 2022), artificial intelligence (Wang et al., 2019), and 5G networks. The continuous growth in data volume and the enrichment of data types have made database workloads increasingly diverse and fast-changing, and conventional operational methods are no longer adequate for modern database systems. According to Chen et al. (2019), Kraska et al. (2019), and Li, Zhou & Li (2019), with the recent advancement of artificial intelligence technology, AI-based database operation and maintenance methods (M'barek et al., 2016) are gradually replacing traditional ones.

Improving database performance is a main concern of intelligent operation and maintenance. Database tuning generally refers to increasing the database's throughput per unit time or reducing the latency of individual operations. A database exposes many configurable parameters that significantly affect the performance of its operations. Oh and Sang (2005) and Weikum et al. (2002) showed that database parameter tuning has long been a key concern of DBAs. In recent years, the academic community has also made significant progress on database parameter tuning (Aken, Pavlo, Gordon & Zhang, 2017; Cai et al., 2022; Cereda et al., 2021; Fekry et al., 2020; Gur et al., 2021; Ishihara & Shiba, 2020; Li, Zhou, Li & Gao, 2019; Kanellis et al., 2020; Kanellis et al., 2022; Zhang et al., 2019; Zhang et al., 2021; Zhang et al., 2022). However, existing research has overlooked the following three issues:

  1. Database performance failures are rare and difficult to capture in online environments. It is comparatively easy to construct performance-problem samples in an offline database environment, but because offline simulation environments differ from real online environments, a model trained solely on offline samples may perform poorly online. These two factors make it necessary to combine samples from both online and offline simulation environments to train the model jointly.

  2. Both online and offline environments ignore the influence of hardware environment information when generating samples. If hardware environment information is excluded from model training, the model cannot fully characterize database performance, which reduces its accuracy.

  3. Not all samples collected from the database can be used to train the model; some are invalid and must be filtered out. These invalid samples are not manual input errors, null values, or similar anomalies. Database performance is determined not only by the database parameters but also by the workload: when the workload is light (i.e., the system has not reached its performance bottleneck), overall performance depends only on the intensity of the workload and is unrelated to the parameter settings. Samples generated before the system reaches its bottleneck must therefore be filtered out to avoid misjudging the parameters the model recommends (a minimal filtering sketch follows this list).
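To make the second and third issues concrete, the following is a minimal sketch of workload-aware sample filtering and of a state vector that includes hardware and load features. All names here (Sample, is_bottlenecked, build_state) and the saturation threshold are illustrative assumptions, not the paper's implementation:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Sample:
    """One tuning sample: knob values, DB metrics, hardware/load context, reward signal."""
    knobs: List[float]        # candidate configuration parameters
    db_metrics: List[float]   # internal DB metrics (e.g., buffer hit ratio)
    hw_metrics: List[float]   # hardware resources consumed (CPU, memory, I/O)
    client_load: float        # offered client load (e.g., requests per second)
    throughput: float         # measured throughput under this configuration

def is_bottlenecked(sample: Sample, saturation_load: float) -> bool:
    """A sample is informative only if the offered load pushes the database to
    its performance bottleneck; below that, throughput tracks the load rather
    than the knob settings (hypothetical threshold test)."""
    return sample.client_load >= saturation_load

def filter_samples(samples: List[Sample], saturation_load: float) -> List[Sample]:
    # Drop samples collected before the system reached its bottleneck.
    return [s for s in samples if is_bottlenecked(s, saturation_load)]

def build_state(sample: Sample) -> List[float]:
    # Concatenate DB metrics with hardware and load features so the learned
    # policy can condition on the execution environment, not just DB internals.
    return sample.db_metrics + sample.hw_metrics + [sample.client_load]
```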

At present, the most advanced intelligent parameter tuning approaches are based on reinforcement learning (Li, Zhou, Li & Gao, 2019; Lillicrap et al., 2016; Zhang et al., 2019). The authors introduce a method for determining whether a recommended configuration produced by reinforcement learning is actually effective once applied, and propose a new generation of intelligent database tuning system, DBtune, which uses reinforcement learning to solve the parameter tuning problem. The main work and contributions are as follows:
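As background for this reinforcement-learning formulation, here is a minimal sketch of a DDPG-style tuning loop: the agent observes database state, recommends continuous knob settings, applies them, and receives a reward derived from the measured performance change. The environment, reward, and placeholder actor below are illustrative assumptions (the actor/critic network updates are omitted), not the paper's DBtune implementation:

```python
import random

class TuningEnvironment:
    """Illustrative stand-in for a database under benchmark load; in a real
    system the state would come from live metrics and the reward from the
    measured change in throughput/latency."""
    def __init__(self, num_knobs: int, state_dim: int):
        self.num_knobs = num_knobs
        self.state_dim = state_dim

    def observe(self):
        # Collect DB metrics plus hardware/load features (simulated here).
        return [random.random() for _ in range(self.state_dim)]

    def apply(self, knobs):
        # Apply the recommended knob values, replay the workload, and return
        # the relative performance change as the reward (simulated here).
        return random.uniform(-1.0, 1.0)

def ddpg_style_tuning(env, episodes=10, noise=0.1):
    """Skeleton of the actor-critic interaction loop: the actor maps state to
    continuous knob settings, exploration noise is added, and transitions are
    stored for off-policy training (network updates omitted)."""
    replay_buffer = []
    for _ in range(episodes):
        state = env.observe()
        # Placeholder actor: a real DDPG actor network maps state -> knobs.
        action = [min(1.0, max(0.0, s + random.gauss(0, noise)))
                  for s in state[:env.num_knobs]]
        reward = env.apply(action)
        next_state = env.observe()
        replay_buffer.append((state, action, reward, next_state))
        # A real implementation would sample minibatches from replay_buffer
        # and update the critic (TD error) and actor (policy gradient) here.
    return replay_buffer

buffer = ddpg_style_tuning(TuningEnvironment(num_knobs=4, state_dim=8))
print(f"collected {len(buffer)} transitions")
```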
