Why Query Optimization?
The typical goal of SQL optimization is to produce the result (data set) while consuming fewer computing resources and/or delivering a shorter response time. We can follow several methodologies depending on our experience and training, but in the end we have to answer the following questions:
- Is the task inherently heavy, or is the execution of the query simply non-optimal?
- What is/are the weak point(s) of the query execution?
- What can I do to make the execution optimal?
What is our goal?
Optimize a specific query that has run before, so detailed execution information is available
Step-level details clearly show where the big resources were burnt
In general, optimize the non-optimal queries: find them, solve them
Like the previous case, but first we have to find those queries, and then solve them one by one
Optimize a query that has no detailed execution info, just the SQL (and its "explain")
Deeper knowledge of the base data and the "Teradata way of thinking" is required, since no easy and trustworthy resource-peak detection is available. You have to imagine what will happen, and what could be done better.
Optimization: how to do it?
In my opinion, DBQL (DataBase Query Logging) is the fundamental basis of Teradata system performance management, from the SQL optimization point of view. I strongly recommend switching DBQL on comprehensively (SQL, Step, Explain and Object logging are important; XML can be excluded, since it is huge but adds little extra), and archiving the online tables daily – just follow the Teradata recommendations.
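Enabling this logging might look like the following sketch. The exact option names and defaults vary by Teradata release, so verify it against the DBQL documentation and your site's standards before running it:

```sql
-- Log SQL text, step-level details, explain text and object usage
-- for all users; XMLPLAN is deliberately left out (huge, little extra value).
-- LIMIT SQLTEXT=0 keeps the SQL out of DBQLogTbl itself; with the SQL
-- option enabled, the full text still lands in DBQLSqlTbl.
BEGIN QUERY LOGGING WITH SQL, STEPINFO, EXPLAIN, OBJECTS
LIMIT SQLTEXT=0
ON ALL;
```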
Finding good candidate queries
DBQL is an excellent source for selecting "low hanging fruits" for performance tuning. The basic rule: big savings can only be gained on expensive items, so let's focus on the top resource-consuming queries first. But what counts as high resource consumption? I usually check the top queries by one or more of these properties:
- Absolute CPU (CPU totals used by AMPs)
- Impact CPU (CPU usage corrected by skewness)
- Absolute I/O (I/O totals used by AMPs)
- Impact I/O (Disk I/O usage corrected by skewness)
- Spool usage
- Run duration
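A candidate-finding query along these lines might look like the sketch below. It assumes the standard DBC.DBQLogTbl column names (AMPCPUTime, MaxAMPCPUTime, TotalIOCount, MaxAmpIO, SpoolUsage), which can differ slightly between Teradata releases; "impact" figures approximate skew-corrected usage by scaling the hottest AMP's value up to the full AMP count:

```sql
SELECT TOP 20
       UserName
     , QueryID
     , AMPCPUTime                                   AS AbsoluteCPU
     , MaxAMPCPUTime * (HASHAMP() + 1)              AS ImpactCPU    -- hottest AMP * number of AMPs
     , TotalIOCount                                 AS AbsoluteIO
     , MaxAmpIO * (HASHAMP() + 1)                   AS ImpactIO
     , SpoolUsage                                   AS SpoolBytes
     , (FirstRespTime - StartTime) DAY(4) TO SECOND AS Duration
FROM   dbc.DBQLogTbl
WHERE  CAST(StartTime AS DATE) = CURRENT_DATE - 1   -- yesterday's workload
ORDER  BY ImpactCPU DESC;
```

Running the same ranking ordered by ImpactIO, SpoolBytes or Duration surfaces different, equally interesting candidates.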
Finding weak point of a query
Does it have few or many "peak steps" that consume a lot of the resources?
- Which one(s)?
- What type of operations are they?
- Does it have high skewness? That means bad parallel efficiency, which is very harmful
- Does it consume extremely large spool compared to other queries?
Most queries have one "peak step" that consumes most of the total resources. Typical cases:
- "Retrieve step" with redistribution: a large number of rows and/or a skewed target spool
- "Retrieve step" with "duplication-to-all-AMPs": a large number of rows duplicated to all AMPs, typically feeding a join with a huge number of comparisons (N * M)
- Merge or hash join: skewed base or prepared (spool) data; a large data set or a skewed operation; skewness and/or many hash collisions
- Any kind of step: a non-small but strongly skewed result
What can we do?
- Supplement missing statistics / refresh stale ones
- Drop disturbing statistics (it sometimes happens that they do more harm than good)
- Restructure the query
- Break up the query: place a partial result into a volatile table with a good PI, and collect statistics on it
- Correct the primary index of the target/source tables
- Build secondary/join index/indices
Add extra components to the query.
You may know some additional "easy" filter that lightens the work. E.g. if you know that the join will match only the last 3 days' data of a table covering a whole year, you can add a date filter, which costs pennies compared to the join.
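As a sketch of that trick, with hypothetical table and column names:

```sql
-- "transactions" covers a whole year, but only the last 3 days can
-- possibly match the batch; the added date filter lets the optimizer
-- discard the bulk of the rows before the join.
SELECT t.trans_id
     , t.amount
     , b.batch_id
FROM   transactions  t
JOIN   recent_batch  b
  ON   b.trans_id = t.trans_id
WHERE  t.trans_date >= CURRENT_DATE - 3;   -- the added "easy" filter
```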
Restrict the result requirements to the real information demand.
Does the end user really need that huge amount of data, or just a few records of it?
What should we do?
Find the top-consuming step, and determine why it is a high consumer
- Its result is huge
- Its result is skewed
- Its work is huge
- Its input(s) is/are huge
Track the spool flow backwards from the top step, and find
- Low-fidelity results (the actual row count falls far from the estimated row count)
- NO CONFIDENCE steps, especially those with low fidelity
- Skewed spools, especially non-small ones
- Big duplications, especially those with NO CONFIDENCE
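With step-level DBQL logging on, this backward walk can be supported by a query like the following sketch against DBC.DBQLStepTbl. The column names (RowCount, EstRowCount, CPUTime, IOCount) are the commonly documented ones but may differ by Teradata release; fidelity far from 1 points at stale or missing statistics:

```sql
SELECT StepLev1Num
     , StepName
     , CPUTime
     , IOCount
     , RowCount                                           -- actual rows produced
     , EstRowCount                                        -- optimizer's estimate
     , CAST(RowCount AS FLOAT) / NULLIFZERO(EstRowCount) AS Fidelity
FROM   dbc.DBQLStepTbl
WHERE  QueryID = 123456789012345678    -- the suspect query's ID from DBQLogTbl
ORDER  BY CPUTime DESC;
```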
Find the solution
Supplement missing statistics, typically on the PI, join fields or filter conditions
Indicated by NO CONFIDENCE, low fidelity or big duplications
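Collecting those statistics might look like this sketch (hypothetical table and columns; the comma-separated multi-column form requires a reasonably recent Teradata release):

```sql
COLLECT STATISTICS
        COLUMN (customer_id)               -- PI / join column
      , COLUMN (order_date)                -- filter column
      , COLUMN (customer_id, order_date)   -- combined, for correlated predicates
ON orders;
```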
Break up the query
Store the partial result whose fidelity is very bad, or whose spool is skewed, into a volatile table, and choose a better PI for it
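The break-up could be sketched like this, with hypothetical tables and a PI chosen to match the subsequent join:

```sql
-- Materialize the problematic partial result with a deliberately chosen PI
CREATE VOLATILE TABLE part_result AS (
    SELECT customer_id
         , order_date
         , SUM(amount) AS amt
    FROM   orders
    WHERE  order_date >= CURRENT_DATE - 3
    GROUP  BY 1, 2
) WITH DATA
PRIMARY INDEX (customer_id)        -- matches the join key of the next step
ON COMMIT PRESERVE ROWS;

-- Give the optimizer real numbers instead of NO CONFIDENCE guesses
COLLECT STATISTICS COLUMN (customer_id) ON part_result;
```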
Modify the PI of the target table
Indicated by a slow MERGE step with typical hash-collision problems
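Before changing the PI, it is worth checking how badly the candidate column actually collides. A quick probe, on a hypothetical table, using Teradata's HASHROW function:

```sql
-- Rows sharing a row hash collide on insert; many large groups here
-- predict a slow MERGE step on a table with this PI.
SELECT HASHROW(customer_id) AS rowhash
     , COUNT(*)             AS rows_per_hash
FROM   orders
GROUP  BY 1
HAVING COUNT(*) > 100
ORDER  BY 2 DESC;
```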
- Eliminate product joins
- Decompose large product joins
- Supplement missing statistics, typically on the PI, join fields or filter conditions