Article by Ebel Dough.
This article discusses the use of TdBench, a data warehouse query driver, in relation to:
- Competitive benchmarks
- Proof of Concepts
- Migration Benchmarks
- Upgrade Impact Benchmarks
- Application Benchmarks
- Canary Queries
Where to Get It: TdBench V7.2 runs under Linux and has a version that includes the TPC-DS and TPC-H benchmarks. There is also a downloadable Excel reporting template and a set of four videos demonstrating its features at: http://downloads.teradata.com/download/tools/tdbench7-2
Competitive Benchmark
Often when people hear the term “benchmark”, they associate it with a “speeds & feeds” contest between two or more vendors to win a company's business:
- The RFP is issued,
- The contest rules are set,
- The vendors race to convert undocumented legacy queries to their platform,
- The selection committee gathers to hear the presentations,
- And a winner is chosen.
TdBench was developed to make our benchmark analysts more productive in simulating legacy environments, from coordinating multiple streams of queries with different SLAs to injecting data-loading activities or other events into the test scenarios. Queries and ETL activities can be coordinated within a test, and multiple tests can be combined in a single TdBench script, with Linux commands between them to manipulate files, views, and tables for the next test. TdBench:
- helps in quickly assessing whether a test was successful as soon as it completes,
- allows us to track the impact of tuning changes across iterations of the physical data model (PDM), and
- allows us to build suites of tests to be executed automatically at different scales of data, with or without aggregate join indexes (AJIs), or before and after a hardware change.
It automatically tracks interfering workload, TASM rules in effect, Teradata release, and node/amp configuration. When it comes time to pull results for the presentation, the TdBench test metadata helps to select the error-free formal test runs from the dozens (or hundreds) of tests that may have been executed.
Proof Of Concept
Sometimes a company is building a new application: instead of converting legacy queries, a new data model is still evolving and queries are just being constructed. The "benchmark" is then executed within a “Proof of Concept”, which may be competitive or non-competitive. As part of the development activity, a query driver such as TdBench is valuable for tracking changes in the performance of various queries and workloads as the data model evolves. Descriptions of the run conditions and observations can be recorded in each test’s metadata. The TdBench linkage to DBQL and PDCR allows investigation of performance changes, which are sometimes the result of using the wrong views or tables, or of forgetting to recreate an AJI after a PDM redesign. Each query can be labeled with comments in the SQL or with a query band, and using the TdBench analysis views you can investigate:
- Changes in the query source across the test runs
- Changes in the objects referenced
- Changes in performance at the query step level, with indexes highlighting the % of total query runtime, the % error in estimated vs. actual rows, and text messages suggesting performance factors
- Changes in the explain plan
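As a sketch of the labeling described above, a query can carry its name both in a query band and in a SQL comment. `SET QUERY_BAND ... FOR SESSION` is standard Teradata syntax; the key names (`queryname`, `testid`) and the table in the query are illustrative assumptions, not names mandated by TdBench:

```sql
-- Tag the session so DBQL rows for this query carry identifying metadata.
-- Key names and values here are illustrative, not mandated by TdBench.
SET QUERY_BAND = 'queryname=rpt_sales_01;testid=poc_iter3;' FOR SESSION;

-- A comment can also label the query text itself, as an alternative marker
-- that analysis views can parse out of the SQL captured in DBQL.
SELECT /* rpt_sales_01 */ region_id, SUM(sales_amt) AS total_sales
FROM sales_fact               -- hypothetical table
GROUP BY region_id;
```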
Migration Benchmark
An increasing number of customers are asking for Migration Benchmarks to ensure:
- That the platform they intend to migrate to will deliver the incremental Queries-per-hour they need to support their growth
- That as they migrate their workload to the new platform, continuous load processing in particular will be able to catch up after the migration is complete.
- All necessary settings, TASM rules, cost profiles, views, and indexes have been created correctly on the new platform to reduce "Day One" issues
For these benchmarks, queries and other processing are selected from one or more critical periods of production. Data is migrated to the target platform and the queries are rerun. Since these benchmarks typically use live data or copies of current data, they are “one-time-use” benchmarks: the queries reference dates and keys that may not exist in the future.
TdBench analysis tools can be used to compare overall performance, identify the queries that have changed the most based on run time, CPU, I/O, or parsing time, and drill down into the query execution steps and the objects referenced on each platform to determine the cause of the change. TdBench runs on Linux and can define multiple DBMS platforms, so the same test definitions can be run on each platform. A TdBench script can even run a set of tests on one server, then use the "SERVER" command to switch platforms and run the same tests on the other.
Upgrade Impact Benchmark
One of the concerns operations has when software is upgraded or maintained is whether the changes will have an adverse impact on critical SLAs. For example, as Teradata’s optimizer has evolved, each release has delivered higher performance for the overall workload, yet there are situations where a few queries perform worse. This could be because some statistics were not collected and the prior release’s optimizer fell into a “lucky guess”, or because the optimizer now spends more time optimizing a short-running complex SQL query than the query took to run on the prior release. Unfortunately, there are sometimes bugs in creative uses of Teradata features that affect a few customers but not others.
Teradata is moving toward quarterly software releases, which may make it more critical to quickly and productively assess the impact of moving to a new software release. For this type of benchmark, there may be multiple platforms, such as a large production machine, a machine with deep data history, and one or more test and development platforms. Since the benchmark will be run multiple times on different-sized platforms, a different approach to development is required:
- Sample a variety of query functionality from different sources
- Modify queries to run both now and in the future, eliminating date references and reviewing constraints against keys
- For multiple servers, sample data from multiple tables to create seed data as input to extrapolation, ensuring that keys referenced by the queries are included in the sample
- Create a serial test to validate query functionality
- Create a mixed workload test to validate new release’s workload management
The tests can initially be run on a test and development machine before and after maintenance events to validate the software release with the customer’s specific workload. The tests can be run again on a production machine during a maintenance window, before and after restarting the system with the new release, to validate that the performance on that platform is suitable. TdBench’s analysis tools allow you to make a quick Go/No Go decision on leaving the new release in production. The tools can then be used to identify the queries that have changed the most based on run time, CPU, I/O, or parsing time, and to drill down into the query execution steps and objects referenced.
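As an illustration of that kind of drill-down, here is a hedged sketch of a before/after DBQL comparison. `dbc.DBQLogTbl`, the `AMPCPUTime` and `TotalIOCount` columns, and the `GetQueryBandValue` function are standard Teradata DBQL facilities; the `runid` and `queryname` query-band keys are assumed labels standing in for whatever convention the tests use, and TdBench's own analysis views automate this kind of comparison:

```sql
-- Sketch: aggregate CPU and I/O per labeled query, per test run, so runs
-- before and after the upgrade can be compared side by side.
-- Query-band key names ('runid', 'queryname') are assumptions.
SELECT GetQueryBandValue(QueryBand, 0, 'runid')     AS run_id,
       GetQueryBandValue(QueryBand, 0, 'queryname') AS query_name,
       SUM(AMPCPUTime)                              AS cpu_secs,
       SUM(TotalIOCount)                            AS io_count
FROM dbc.DBQLogTbl
WHERE QueryBand IS NOT NULL
GROUP BY 1, 2
ORDER BY 2, 1;
```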
Canary Queries and Application Benchmarks
A very simple benchmark that many shops run from Viewpoint is the “Canary Query”, which should be a lightweight query that validates overall system availability and performance. At one customer site, we found that, as part of their set of “Canary Queries”, they were also testing the performance of their applications to detect changes from data growth, updates to applications, or changes to the PDM. These queries, running every 15 minutes, were consuming 14% of their system capacity.
To reduce the impact on their users, we recommended splitting out the application testing and running only very lightweight queries from Viewpoint. Canary queries are best executed by Viewpoint, which has excellent portlets for analyzing changes in performance.
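A canary in that spirit might be nothing more than a dictionary probe. `dbc.DBCInfo` (with its `InfoKey`/`InfoData` columns) is a standard Teradata dictionary view; the point is that the query measures login, parse, and dispatch latency rather than doing any data-path work:

```sql
-- A deliberately trivial canary query: touches only the data dictionary,
-- returns one row, and exercises login/parse/dispatch, not user data.
SELECT InfoKey, InfoData
FROM dbc.DBCInfo
WHERE InfoKey = 'VERSION';
```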
We recommended that the company create an “Application Benchmark” to run on weekends, containing the queries designed to assess the performance impact of data growth, application query changes, and PDM changes.
This is a good candidate for running under TdBench because TdBench assigns a RunID to each execution, and its analysis views can parse query names from comments in the query text or from a query band to trace the performance changes of the test workload, or of each query, over time. This type of test runs selection queries against live data but executes data-modifying queries against copies of data or volatile tables.
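For the data-modifying side, one pattern consistent with the last sentence above is to copy the target into a volatile table first. The `CREATE VOLATILE TABLE ... WITH DATA ... ON COMMIT PRESERVE ROWS` syntax is standard Teradata; the table and column names are illustrative assumptions:

```sql
-- Copy the live table into a session-local volatile table so the
-- data-modifying test leaves production data untouched. Names are
-- hypothetical; the volatile table vanishes when the session ends.
CREATE VOLATILE TABLE sales_fact_vt AS
  (SELECT * FROM sales_fact)
WITH DATA
ON COMMIT PRESERVE ROWS;

-- The data-modifying test query then runs against the copy.
UPDATE sales_fact_vt
SET sales_amt = sales_amt * 1.10
WHERE region_id = 7;
```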
As the data warehouse evolves to support a larger, more complex workload that is critical to the business, it is important that necessary maintenance and upgrade activities do not impact the user community. Benchmarking is an important tool for assessing the impact of each change. But it isn't enough to clock a thousand queries before and after and compare the total minutes. The benchmarking approach needs to assume that you will have to drill down to a specific query, or even a single step, to recognize that an AJI was missing or that statistics were needed on one particular column among thousands for the new release.
DBQL, ResUsage, and Viewpoint are valuable tools for capturing data about system performance. PDCR and Viewpoint are great tools for looking at system growth and historical performance anomalies. TdBench is a tool that helps you:
- define and automate tests,
- record the conditions in effect at the time of each test,
- analyze the discrete periods when a test was executing,
- compare overall performance between tests, and
- track the performance of individual queries and workloads across tests.
TdBench is available for download at: http://downloads.teradata.com/download/tools/tdbench7-2
Source link: http://developer.teradata.com/tools/articles/benchmarking-and-tdbench-7-2