Collect STATS Scheduling
Stats plays a critical role throughout any query process in a database, It provides vital analytical information to the optimizer of teradata which enables it to prepare the best and cheapest task execution plan.
Considering the criticality of stats , it needs to be latest and updated as nothing can be more worst than having a stale or old information.but Refreshing the information and gathering the information takes time and resources so we need to understand how latest should be the information which will not hamper the performance and at the same time make the best decision for query execution.
Leading a pivotal role in performance and the cost incurred to collect statistics of the table , Its very important to review the size/usage of the table on timely basis and review the stats process . Since stats process incurs time and CPU therefore in a batch process which is SLA/time bound, it plays a critical role to decide the stats process and its frequency of recurrence.
Things to consider for statistics collection on BIG Tables.
1) Understand the Volatility & DML operation performed on the table for which the stats performance needs to be reviewed. If multiple insert/update/delete are performed on a specific table-attribute and considering its vital business usage .
2) Evaluate the size of table on which the stats is being collected, If the table is appx 100 GB ,then evalute the daily , weekly , Monthly growth of the table, Based on size change if the % change in data is around > 10 % then schedule the stat process accordingly.