Sigmod 2013

jninanyac14 de Junio de 2014

8.730 Palabras (35 Páginas)193 Visitas

Página 1 de 35

Performance and Resource Modeling in Highly-Concurrent

OLTP Workloads

Barzan Mozafari∗ Carlo Curino† Alekh Jindal∗ Samuel Madden∗

∗CSAIL MIT †Microsoft

{barzan,alekh,madden}@csail.mit.edu ccurino@microsoft.com

ABSTRACT

Database administrators of Online Transaction Processing (OLTP)

systems constantly face difficult questions. For example, “What is

the maximum throughput I can sustain with my current hardware?”,

“How much disk I/O will my system perform if the requests per second

double?”, or “What will happen if the ratio of transactions in my

system changes?”. Resource prediction and performance analysis are

both vital and difficult in this setting. Here the challenge is due to

high degrees of concurrency, competition for resources, and complex

interactions between transactions, all of which non-linearly impact

performance.

Although difficult, such analysis is a key component in enabling

database administrators to understand which queries are eating up

the resources, and how their system would scale under load. In

this paper, we introduce our framework, called DBSeer, that addresses

this problem by employing statistical models that provide

resource and performance analysis and prediction for highly concurrent

OLTP workloads. Our models are built on a small amount

of training data from standard log information collected during

normal system operation. These models are capable of accurately

measuring several performance metrics, including resource consumption

on a per-transaction-type basis, resource bottlenecks, and

throughput at different load levels. We have validated these models

on MySQL/Linux with numerous experiments on standard benchmarks

(TPC-C) and real workloads (Wikipedia), observing high accuracy

(within a few percent error) when predicting all of the above

metrics.

Categories and Subject Descriptors

H.2.4 [Systems]: Relational databases

Keywords

OLTP, Performance Predictions, Multi-tenancy

1. INTRODUCTION

Operating a large database management system (DBMS) or

a multi-tenant “database-as-a-service” [16] is a challenging and

stressful task for database administrators (DBA), especially as the

Permission to make digital or hard copies of all or part of this work for

personal or classroom use is granted without fee provided that copies are

not made or distributed for profit or commercial advantage and that copies

bear this notice and the full citation on the first page. To copy otherwise, to

republish, to post on servers or to redistribute to lists, requires prior specific

permission and/or a fee.

SIGMOD’13, June 22–27, 2013, New York, New York, USA.

DBMS starts experiencing heavy concurrent load. Although some

databases provide tools for measuring the run-time of an individual

query, many performance problems are a result of interactions

between concurrent queries, which existing systems are not capable

of modeling. Many transactions that run fine in isolation become

much slower when run together, as they interact in complex

ways and contend for shared resources. Load that is added over

time may cause resources that were previously abundant to become

constrained, and query performance to plummet. Applications may

generate unpredictable, time-varying load that puts strain on different

resources (e.g., RAM, disk, or CPU) at different times. To handle

all these scenarios, we need a way to attribute system load, on a perresource

basis to different queries, transactions, or applications. This

attribution enables a number of useful applications, including:

● Diagnosis / Performance Inspection: Why is a given query

running slow (in the presence of concurrency)? Which transaction

groups are causing the spike of lock wait times in the DBMS?

● Run-time Performance Isolation: In a database supporting

multiple applications, which application/transaction is using more

than its allocated share of resources? At what rate should transactions

of a given application be admitted/ dropped in order to avoid

any SLA (service-level agreement) violations?

● Billing: What is the actual contribution of each workload to

the overall resource consumption?

A second challenge for DBAs is to understand how database resource

consumption and performance vary as load on the system

changes, e.g., when partitioning the database, or when a sudden increase

in popularity of a website imposes unexpected load on the

back-end database To address the second challenge, we need what-if

analysis tools to allowDBAs to answer two other classes of questions:

● Performance Prediction: What will the performance (e.g. latency

or throughput) of a given query or application be if the rate of

‘new order’ transactions doubles?

● Provisioning: Which resource (e.g., disk, CPU, RAM)will bottleneck

first if the load on the system increases? What is the hardware

required to deliver a desired throughput?

To address these challenges, we introduce our approach, called

DBSeer, for statistical performance modeling and prediction. Our

techniques involve collecting a limited set of low-overhead statistics

fromthe DBMS and the operating system,measured during normal

system operation.We then use these statistics to build offlinemodels

for the given DBMS and the workload(s) running on it.Thesemodels

allow us to perform attribution and what-if analysis by assessing

the resource (e.g., CPU, disk, RAM, cache, DB locks) requirements

of individual queries or applications and estimating how those requirements

change as the database grows in size, as queries in the

workloads change, or as allocated resources vary.

In DBSeer, we develop two classes of models and compare their

performance. First, we develop black-box models, that make minimal

assumptions about the nature of the underlying system, and

train statistical regression models to predict future performance

based on past performance statistics. Second, we develop white-box

models that take the major components of the underlying database

system into account, to enable more accurate predictions. Specifically,

we develop white-box models for disk I/O, lock contention,

and memory utilization of MySQL. Although these models are focused

onMySQL, we believe that even solving performance prediction

in the context of this one system represents an important step

forward, as MySQL alone is used by millions of users. Moreover, in

Section 8.5, we report preliminary (but promising) results in applying

our models to another DBMS (PostgreSQL).

As we show in our experiments, the trade-off between these two

classes of models is that black-box models are more general but are

also less effective in making predictions outside of the range of inputs

on which they were trained. Unfortunately, many interesting

questions (including many what-if scenarios) require such “out of

range” predictions, e.g., predicting performance when dramatic and

heretofore unseen changes happen in the workload. This is why developing

white-box models is also necessary: they are less general

than black-boxmodels (as they make assumptions about the nature

of the database) but they provide higher extrapolation power.

Our approach in DBSeer1 is specifically designed for highly concurrent

OLTP (transaction processing) applications . These applications

run lightweight transactions that read or write a few records at

a time.We focus on this class of problems becauseOLTP settings are

most frustrating for DBAs, due to their high levels of concurrency

and the complex interactions between transactions (i.e.,competition

for different resources such as locks, cache, I/O). Such competitions

can lead to non-linear effects, where a small change in load can trigger

a large change in performance. Though some prior work has

addressed performance prediction in the case of OLAP (a.k.a. analytical)

databases [12, 3, 8], this problem has not been well studied in

OLTP.Thus, a key contribution of our work is to develop non-linear

models that capture highly concurrent locking and logging operations

of OLTP workloads.

In summary, we make several contributions towards modeling

transactional workload, including:

● Resource Models: we have developed white and black-box

models for predicting different resources, includingCPU,RAM, network,

disk I/O, and lock contention. Our primary contribution here

is a set of novel white-box models for predicting disk I/O and lock

contention.

● Extracting transaction types: we have developed highly accurate

clustering techniques to automatically extract and summarize

“classes of similar transactions” froma query log that allow us to accurately

group similar transactions. We show that this

...

Descargar como (para miembros actualizados) txt (60 Kb)

Leer 34 páginas más »

Leer documento completo Guardar

Disponible sólo en Clubensayos.com