DBQL Skew Detection Report

If DBQL logging with SQL and STEPINFO has been enabled, the following SQL statement can be executed to detect queries that encountered skewed processing during execution on the previous day. Skewed processing is detected by comparing the CPU time used on the busiest AMP for each query with the CPU time consumed by the step as a whole (the HotAmp1CPU-to-CPUtime ratio filter in the WHERE clause below). Smaller, less significant queries are excluded by a minimum CPU time filter, and a timestamp filter limits the report to queries from the previous day.
SELECT
    SQ.SqlTextInfo
    ,CASE WHEN ST.QueryID = 0 THEN 0
          ELSE ((ST.ProcID / 100000) * 10000000000) + ST.QueryID
     END (NAMED QueryID, DECIMAL(18,0), FORMAT '-Z(17)9')
    ,ST.ProcID
    ,ST.CollectTimeStamp
    ,ST.StepLev1Num
    ,ST.StepLev2Num
    ,ST.StepName
    ,ST.StepStartTime
    ,ST.StepStopTime
    ,((ST.StepStopTime - ST.StepStartTime) HOUR(2) TO SECOND) (NAMED ElapsedTime)
    ,ST.EstProcTime
    ,ST.CPUtime (FORMAT 'ZZ,ZZZ,ZZ9.999')
    ,ST.IOcount
    ,ST.EstRowCount
    ,ST.RowCount
    ,ST.HotAmp1CPU (FORMAT 'ZZ,ZZZ,ZZ9.999')
    ,ST.HotCPUAmpNumber
    ,ST.HotAmp1IO
    ,ST.HotIOAmpNumber
FROM DBC.DBQLStepTbl ST
INNER JOIN DBC.DBQLSQLTbl SQ
    ON  ST.QueryID = SQ.QueryID
    AND ST.ProcID = SQ.ProcID
    AND CAST(ST.CollectTimeStamp AS DATE) = CAST(SQ.CollectTimeStamp AS DATE)
WHERE ST.HotAmp1CPU > 60
  AND ST.HotAmp1CPU / NULLIFZERO(ST.CPUtime) > 0.25
  AND ST.CPUtime >= ST.HotAmp1CPU
  AND SQ.SqlRowNo = 1
  AND CAST(ST.CollectTimeStamp AS DATE) = CURRENT_DATE - 1;
DBQL Product Join Detection Report

If DBQL logging with SQL and STEPINFO has been enabled, the following SQL statement can be executed to detect queries that encountered problematic product joins during execution on the previous day. Large product joins are detected by comparing the CPU time used by the query with the I/O performed by the query (the CPUtime-to-IOCount filter in the WHERE clause below). Smaller, less significant queries are excluded by a minimum CPU time filter (CPUtime > 60), and a timestamp filter limits the report to queries from the previous day.

SELECT
    SQ.SqlTextInfo
    ,CASE WHEN ST.QueryID = 0 THEN 0
          ELSE ((ST.ProcID / 100000) * 10000000000) + ST.QueryID
     END (NAMED QueryID, DECIMAL(18,0), FORMAT '-Z(17)9')
    ,ST.ProcID
    ,ST.CollectTimeStamp
    ,ST.StepLev1Num
    ,ST.StepLev2Num
    ,ST.StepName
    ,ST.StepStartTime
    ,ST.StepStopTime
    ,((ST.StepStopTime - ST.StepStartTime) HOUR(2) TO SECOND) (NAMED ElapsedTime)
    ,ST.EstProcTime
    ,ST.CPUtime (FORMAT 'ZZ,ZZZ,ZZ9.999')
    ,ST.IOcount
    ,ST.EstRowCount
    ,ST.RowCount
    ,ST.HotAmp1CPU (FORMAT 'ZZ,ZZZ,ZZ9.999')
    ,ST.HotCPUAmpNumber
    ,ST.HotAmp1IO
    ,ST.HotIOAmpNumber
FROM DBC.DBQLStepTbl ST
INNER JOIN DBC.DBQLSQLTbl SQ
    ON  ST.QueryID = SQ.QueryID
    AND ST.ProcID = SQ.ProcID
    AND CAST(ST.CollectTimeStamp AS DATE) = CAST(SQ.CollectTimeStamp AS DATE)
WHERE ST.CPUtime * 10 >= ST.IOCount
  AND ST.CPUtime > 60
  AND SQ.SqlRowNo = 1
  AND ST.StepName = 'JIN'
  AND CAST(ST.CollectTimeStamp AS DATE) = CURRENT_DATE - 1;
- ResUsage – looking for the days or times of day when the system is running close to or at 100% capacity.
- AmpUsage – identifying a particularly consumptive application or group of users.
- Database Query Log (DBQL) – measuring the characteristics of queries and identifying the steps that might be causing problems.
- Level 1 – Physical Design Tuning
- Level 2 – Query Level Tuning
- Level 3 – System or App Level Tuning
- Level 4 – Workload Management
DWH performance is considered at these stages: fetching data from the source system, processing data through the ETL layer, feeding data into the DWH, and the time involved in fetching data from the DWH for reporting.
II. PHYSICAL DESIGN TUNING

[3] Physical design tuning deals with the database design of storing business data in normalized form. The standard design is third normal form, which means breaking the Entity Relationship Diagram down further so that:
- Relations become tables
- Attributes become columns
- Relationships become data references, for example primary keys and foreign keys

The database design for Teradata should be implemented as follows:
- De-normalize where appropriate
- Partition tables where appropriate
- Group tables into databases where appropriate
- Determine the use of segments
- Determine the use of devices
- Implement referential integrity constraints
Normalization: [3] Normalization is the technique in which non-key columns depend on the key. A fully normalized design may not provide the best performance, so it is recommended to design for third normal form and then de-normalize if performance issues arise. Normalization has several benefits:
- Searching, sorting and index creation are faster, since the tables are small and compact.
- Index searching is more efficient.
- There are fewer indexes per table, so data modification commands execute faster.
- There are fewer NULL values and less redundant data, making the database more compact.
- Triggers execute more quickly when redundant data is not being maintained.
- DML anomalies are reduced.

In a nutshell, a normalized database is cleaner and easier to maintain. Once the database design is created, de-normalization can be used to improve the performance of a specific query or application. De-normalization offers several performance advantages:
- Minimizing the need for joins.
- Minimizing foreign keys on tables.
- Minimizing the number of indexes, which saves space and reduces DML execution time.
- Pre-computing aggregate values at data modification time rather than at select time.

Common de-normalization techniques include:
- Adding redundant columns: helps eliminate frequent joins.
- Adding derived columns: helps minimize the use of joins and reduces the time needed to produce aggregate values.
- Collapsing tables: if users frequently need the fully joined data from two tables, the tables can be collapsed into one to avoid the join (see the sketch after this list).
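As a brief illustration of the collapsing/derived-data idea, the pre-joined summary table below avoids a repeated join at report time. This is only a sketch; the Sales and CustomerGroup tables and their columns are the illustrative names used in the decision support example later in this document, not objects defined by the original text.

-- Pre-joined, pre-aggregated reporting table (de-normalized).
CREATE TABLE SalesByGroup
(
    GroupCd           INTEGER NOT NULL,
    CustomerGroupName VARCHAR(50),
    Revenue           DECIMAL(18,2)
) PRIMARY INDEX (GroupCd);

-- Populate it once per load cycle so reports no longer join Sales to CustomerGroup.
INSERT INTO SalesByGroup
SELECT t01.GroupCd
      ,t02.CustomerGroupName
      ,SUM(t01.Revenue)
FROM Sales t01
INNER JOIN CustomerGroup t02
    ON t01.GroupCd = t02.GroupCd
GROUP BY 1, 2;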
III. QUERY LEVEL TUNING
- Indexing
- Query Rewriting:
- Real Time Monitoring:
- Comparison of Resource Usage: a measure of resource usage from which we can obtain results such as total CPU usage, spool space needed, the LHR (ratio between CPU and I/O usage), CPU skew, and the skew impact on CPU (a DBQL sketch follows this list).
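These measures can be read from the query log. The following is a minimal sketch, assuming the standard DBC.DBQLogTbl columns AMPCPUTime, TotalIOCount, SpoolUsage, MaxAMPCPUTime and NumOfActiveAMPs, and one common definition of the LHR:

SELECT QueryID
      ,AMPCPUTime
      ,TotalIOCount
      ,SpoolUsage
      ,AMPCPUTime * 1000 / NULLIFZERO(TotalIOCount) AS LHR  -- CPU relative to I/O
      ,MaxAMPCPUTime / NULLIFZERO(AMPCPUTime / NULLIFZERO(NumOfActiveAMPs)) AS CPUSkewFactor  -- hottest AMP vs. average AMP
FROM DBC.DBQLogTbl
WHERE CAST(CollectTimeStamp AS DATE) = CURRENT_DATE - 1;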
IV. SYSTEM OR APP LEVEL TUNING
STEP 1: Identifying Problematic Queries and Areas for Improvement
One way is to analyze ResUsage data, looking for the days or times of the day when the system is running close to or at 100% busy. AmpUsage data can be used to identify a particularly consumptive application or group of users. When it gets down to the real tuning analysis, though, DBQL data is the place to go.
STEP 2: Recording Similar Queries:
STEP 3: Actually Tuning the Query:
STEP 4: Translating Gains into Business Value
Determining business value can be broken into calculations and sub-calculations. Check the impact of making a tuning change: Monthly CPU saved = Total old CPU for a month X the average improvement percent.
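A minimal sketch of this calculation against DBQL data, assuming the DBC.DBQLogTbl AMPCPUTime column and an illustrative 30% average improvement:

SELECT SUM(AMPCPUTime)        AS TotalOldCPU      -- total old CPU for the month
      ,SUM(AMPCPUTime) * 0.30 AS MonthlyCPUSaved  -- assumed 30% average improvement
FROM DBC.DBQLogTbl
WHERE CAST(CollectTimeStamp AS DATE) >= ADD_MONTHS(CURRENT_DATE, -1);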
WORKLOAD MANAGEMENT
- Very High Priority: used for tactical queries, such as short selects where multiple tables are joined. The concept of a join index is used here (a sketch follows this list).
- High Priority: used for mini-batch work, such as daily FASTLOAD and MLOAD jobs.
- Medium Priority: used for strategic complex queries, such as joins, stored procedures and macros.
- Low Priority: used for batch reports.
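Since a join index is mentioned for the tactical tier, here is a minimal sketch of one. The Sales and CustomerGroup tables, their columns and the index name are illustrative (reused from the decision support example later in this document), not objects defined by the original text.

-- A simple (non-aggregate) join index that pre-joins the two tables,
-- letting the Optimizer answer the join from the index instead of the base tables.
CREATE JOIN INDEX Sales_JI AS
SELECT t01.GroupCd
      ,t02.CustomerGroupName
      ,t01.Revenue
FROM Sales t01
INNER JOIN CustomerGroup t02
    ON t01.GroupCd = t02.GroupCd
PRIMARY INDEX (GroupCd);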
This workload management is handled by the Priority Scheduler tool, which is basically used for sharing resources. A resource partition is simply a group we create to distinguish between users and types of queries. Priority Scheduler provides partition zero as the default, and additional resource partitions can be created. It is the resource partition that carries a weight, which is compared with the weights of the other resource partitions.
Performance groups are defined within each additional resource partition.
A performance period is the connection between a performance group and an allocation group; there can be from one to eight performance periods. An allocation group's weight is compared with the weights of the other allocation groups, and it can also limit the amount of CPU used by the sessions under it.
Increasing system capacity is not always the solution to improving system performance. To improve the system's performance and utilize it to the fullest, all of the above levels must be followed continuously and the system kept tuned; being perfect at one level alone does not guarantee that the system is tuned.
—————————————
A holistic view of Teradata performance—gained through the timely collection of data—is a good precursor to application tuning. Many customers have engaged Teradata Professional Services to install the performance data collection and reporting (PDCR) database.
This historical performance database and report toolkit provides diagnostic reports and graphs to help tune applications, monitor performance, manage capacity and operate the Teradata system at peak efficiency. If the PDCR database is not installed for performance tuning, it is imperative to enable Database Query Log (DBQL) detail, SQL and objects data logging for a timeframe that best represents the system workload to identify the optimum queries for tuning.
To optimize performance and extract more value from your Teradata system, follow these application tuning steps:
STEP 1: Identify performance-tuning opportunities
One way is to analyze ResUsage Data looking for the days or times of the day when the system is running close to or at 100% busy. AmpUsage data can be used to identify a particularly consumptive application or group of users. When it gets down to the real tuning analysis, though, DBQL data is the place to go.
The DBQL logs historical data about queries including query duration, CPU consumption and other performance metrics. It also offers information to calculate suspect query indicators such as large-table scans, skewing (when the Teradata system is not using all the AMPs in parallel) and large-table-to-large-table product joins (a highly consumptive join).
STEP 2: Find and record “like queries” with similar problems
While the DBQL is used to find specific incidents of problem queries, it can also be used to examine the frequency of a problem query. In this scenario, a DBA might notice that a marketing manager runs a problem query every Monday morning, and the same problem query is run several times a day by various users. Identifying and documenting the frequency of problem queries offers a more comprehensive view of the queries affecting data warehouse performance and helps prioritize tuning efforts.
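A minimal sketch of such a frequency check against DBQL, assuming the default DBC.DBQLogTbl columns UserName, QueryText and AMPCPUTime, and using the first 100 characters of the SQL text as a rough query signature:

SELECT UserName
      ,SUBSTR(QueryText, 1, 100) AS QuerySignature  -- rough "like query" signature
      ,COUNT(*)                  AS Executions
      ,SUM(AMPCPUTime)           AS TotalCPU
FROM DBC.DBQLogTbl
WHERE CAST(CollectTimeStamp AS DATE) >= CURRENT_DATE - 7
GROUP BY 1, 2
ORDER BY TotalCPU DESC;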
STEP 3: Determine a tuning solution
Performance analysis should be completed by an experienced Teradata DBA with a fundamental understanding of Teradata performance. Identifying problem queries and recording instances of like queries is easy; the difficulty is analyzing and tuning a specific query. This analysis and tuning takes time and attention to detail, and it requires experience with the Teradata system and knowledge of Teradata features, including:
- EXPLAIN. Analyzes a problem query and shows step-by-step text for how Teradata's Optimizer executes a query (an example follows this list).
- Visual Explain Tool. Shows the same EXPLAIN information, but graphically.
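For example, prefixing any statement with the EXPLAIN keyword returns the Optimizer's plan text without executing the query; the tactical query used later in this document serves as the illustration here:

EXPLAIN
SELECT *
FROM Sales
WHERE SalesId = 100;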
When a DBA collects statistics depends on how frequently data is loaded in the data warehouse. Collecting statistics ensures the Optimizer has the most accurate and timely information about the data. Teradata Statistics Wizard analyzes specified workloads or databases and creates a recommended list of statistics collection operations. The collection of missing statistics can help the Optimizer choose the best query plan.
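As an illustration, statistics are collected with statements along these lines (the Sales table and its columns are the illustrative names used elsewhere in this document):

COLLECT STATISTICS ON Sales COLUMN (SalesId);
COLLECT STATISTICS ON Sales COLUMN (GroupCd);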
DBQL Step data can help a DBA drill deeper into a problem query by showing which query step is causing the problem: skewing, product joining or a large scan. This data, when matched up with the EXPLAIN plan, can provide the insight needed to tune the query.
To improve query performance—particularly queries with large-scan indicators—additional indexes or index changes should be considered. Teradata’s various indexing options enable efficient resource use, saving I/O and CPU time and thereby making more resources available for other work. Options such as partitioned primary index (PPI), secondary indexes and join indexes can help reduce resource consumption and make queries more efficient.
Teradata’s Index Wizard can recommend where to add secondary indexes and PPIs in tables based on a particular workload. DBQL Object Data can be used to determine and remove unused secondary indexes, which will reduce maintenance overhead when loading data and help reclaim storage space.
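A minimal sketch of these indexing options, with illustrative table and column names and an assumed partitioning range:

-- A table with a partitioned primary index (PPI) on the sales date,
-- so date-qualified queries touch only the relevant partitions.
CREATE TABLE SalesHistory
(
    SalesId   INTEGER NOT NULL,
    StoreId   INTEGER NOT NULL,
    SalesDate DATE NOT NULL,
    Revenue   DECIMAL(18,2)
) PRIMARY INDEX (SalesId)
PARTITION BY RANGE_N(SalesDate BETWEEN DATE '2020-01-01' AND DATE '2025-12-31' EACH INTERVAL '1' MONTH);

-- A non-unique secondary index (NUSI) to support lookups by store.
CREATE INDEX (StoreId) ON SalesHistory;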
STEP 4: Determine the best solution
To determine the best tuning options, it is important to baseline existing performance conditions (using DBQL data), pilot potential solutions through experimentation and analyze the results. If multiple optimization strategies are found, DBAs should test one strategy at a time by temporarily creating the new scenario, changing the queries to use the new objects, running the queries, and measuring, documenting and analyzing the results.
DBAs must run tests on the same production system and take the following steps to determine the solution with the best cost/benefit and viability of the final performance fixes:
- Test the system, using a user ID with a low workload priority.
- Use each of the optimization strategies and gather the new DBQL data.
- Compare the new DBQL measurements with the original baseline.
STEP 5: Regression testing
Regression testing is an important quality control process to ensure any optimization changes or fixes will not adversely affect data warehouse performance. First, the DBA must determine a representative list of queries that apply to the selected performance fix. From there, a regression test suite is created to gauge the effectiveness of the solution before production.
In regression testing, the new environment is re-created on the same production system, and the effects of the change are measured and documented. The goal is to ensure queries that are not part of the tuning process are not unduly affected by the optimization changes.
To illustrate, suppose the DBA wants to make a primary index (PI) change, a modification the DBA thinks will improve system performance. Regression testing shows that this particular PI change causes another application to perform full-table scans for its most common queries. In this case, changing the PI may not be the optimal choice, and other tuning options should be considered.
STEP 6: Quantify and translate performance gains into business value
CIOs are routinely pressed to show how their IT dollars affect operations and enable cost reduction and business growth. Quantifying the business value of query optimization, or any IT improvement, is an important step to showcasing the value of the data warehouse.
Determining business value can be broken into calculations and sub-calculations. To answer the question “How many CPU seconds equals a node?” use the following calculations:
- Determine per-node CPU seconds in a day: (number of CPUs per node X 86,400) minus 20%, where 86,400 is the number of seconds in a day and 15% to 20% is subtracted to account for system-level work not recorded in DBQL.
- Multiply per-node CPU seconds in a day by 30 to get CPU seconds per node per month. On a four-CPU node, the equation would look something like this: ((4 X 86,400) - ((4 X 86,400) / 5)) X 30 = 8,294,400 CPU seconds.
- Check the impact of making a tuning change: Monthly CPU saved = Total old CPU for a month X the average improvement percent.
To bring these calculations to life, imagine the DBA makes an optimization change where regression testing shows an overall 50% improvement in query CPU consumption. Queries that previously averaged 8 million CPU seconds a month now average 4 million CPU seconds a month. This equates to half a node of processing power savings.
Taking this a step further, if the targeted queries run only during normal business hours (a 12-hour day), the performance improvement would be even greater.
STEP 7: Document and implement
Presenting application tuning recommendations to IT management and business users typically requires more than a spreadsheet of data, although a spreadsheet can be used for backup material or a deeper dive into performance data and options. The presentation should be tailored to a specific audience and should capture the value of application tuning. The presentation might include:
- Query optimization process
- Options found and tested
- Best option
- Options discarded, and why
- Lists of what still needs testing
- Observations and recommendations
- Anticipated savings
Tactical Workload Tuning on Teradata
Tactical workload tuning is a special discipline on Teradata. Tactical workload requires a special skill set and tools.
In this article I will show you the skills needed to optimize tactical workload.
You will learn how to use the right tools and apply the "tactical workload skill set".
The difference between Tactical and Decision Support Workload
Decision support workload typically involves full table scans (FTS) and all-row joins, usually between big tables.
Indexing (NUSI, USI, etc.) can improve performance, but more often all table rows are accessed.
Here is an example of a decision support query:
SELECT CustomerGroupName, SUM(Revenue)
FROM
Sales t01
INNER JOIN
CustomerGroup t02
ON
t01.GroupCd = t02.GroupCd
GROUP BY 1;
Tactical workload is characterized by single-row access or access to a limited number of rows. Here is an example of a tactical query:
SELECT *
FROM Sales
WHERE SalesId = 100;
Optimized tactical queries need indexed access paths, such as a UPI, NUPI, NUSI, USI, or a single row partition. The main goal is to have an execution plan in which each step delivers only one or a few rows.
To achieve this goal, the tactical selection criteria of your query have to be applied as early as possible in the execution plan.
With a proper query design, tactical query execution is stable, and performance does not decrease as table cardinality grows, while decision support queries usually become slower with a growing amount of data.
Tactical query performance does not improve with the number of Teradata nodes. Adding more nodes to your Teradata system will not have any impact on the performance of your single tactical query. Nevertheless, adding nodes to your system allows more tactical queries to be executed at the same time.
The performance of decision support queries improves with the number of nodes. Assuming a perfect data distribution (no skewing), doubling the number of nodes can speed up your query by a factor of two (nevertheless, these are theoretical numbers; no system works 100% in parallel and without any skewing effects).
The Tactical Workload Toolkit
Besides optimizing the retrieve steps with indexed access to single rows (USI, UPI, NUSI, etc.), we have to optimize the joins. Join optimization means minimizing the number of AMPs involved in the join operations.
For example, the following execution plan excerpt shows that the selected row is accessed on only one AMP (a single-AMP join step):
with no residual conditions, which is joined to Sales.SalesDetails by way of the primary index "Sales.sd.SalesId = 100"
with a residual condition of ("Sales.SalesDetails.SalesId = 100"). Sales.Sales and Sales.SalesDetails
are joined using a merge join, with a join condition of ("Sales.Sales.SalesId = Sales.SalesDetails.SalesId").
The result goes into Spool 1 (one-amp), which is built locally on that AMP.
The size of Spool 1 is estimated with low confidence to be 151 rows.
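The SQL text of the query is not reproduced in the excerpt above; a query consistent with the plan fragment (a sketch reconstructed from the table names and join condition it mentions) would be:

SELECT *
FROM Sales.Sales s
INNER JOIN Sales.SalesDetails sd
    ON s.SalesId = sd.SalesId
WHERE s.SalesId = 100;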
The query above will always perform identically (as long as the execution plan does not change). This is exactly what we want to achieve in tactical workload tuning:
A stable direct access to the data.
While the example above was built around a merge join, the most suitable join type for tactical workload is the Nested Join.
When designing your physical data model, you should consider a design which allows the usage of Nested Joins.
Business users often need to retrieve information based on the natural key of a table and not the surrogate keys. It's always a good idea to think about keeping the natural keys in your physical data model when dealing with tactical workload!
Here is an example of a physical data model which allows the usage of Nested Joins.
CREATE TABLE Sales
(
ProductId INTEGER NOT NULL,
SalesId INTEGER NOT NULL
) PRIMARY INDEX (ProductId );
CREATE TABLE Products
(
ProductId INTEGER NOT NULL,
ProductCode CHAR(10)
) PRIMARY INDEX (ProductId )
UNIQUE INDEX (ProductCode);
In the example above, the data is accessed via the natural key "ProductCode" and not the surrogate key (ProductId). Therefore, we use a USI to create a direct data access path via the natural key:
SELECT *
FROM Products p
INNER JOIN Sales s
    ON p.ProductId = s.ProductId
WHERE p.ProductCode = 'Private';
TOOLS TO OPTIMIZE QUERIES
Teradata Visual Explain
The Teradata System Emulation Tool allows users to quickly project a production environment by emulating a large production system in a small test or development setting. This reduces the cost of query plan analysis and development efforts.
Teradata Statistics Wizard
helps with statistics collection. Collecting statistics ensures that the Teradata Optimizer has the most accurate and timely information about the data. But knowing which statistics to collect can be time consuming, and it requires in-depth application and database knowledge. The timing of statistics collection also depends on how frequently data is loaded in the data warehouse. Teradata Statistics Wizard helps this process by analyzing specified workloads or databases and creating a recommended list of statistics collection operations. Statistics collection can help the Teradata Optimizer choose the best query plan.
To improve query performance—particularly for those with large scan indicators—additional indexes or index changes should be considered. Teradata offers indexing options that enable efficient resource use, saving I/O and CPU time, which makes more resources available for other work. Options such as partitioned primary index (PPI), secondary indexes and join indexes can help reduce resource consumption and make queries more efficient.
The Teradata Index Wizard
automates the process of picking secondary indexes for a given database and workload to maximize system performance. It can recommend where to add secondary indexes and PPIs in tables, based on a particular workload. When enabled, DBQL captures all statement types. In addition, macros, views, triggers, stored procedures and user-defined functions (UDFs) are logged in the DBQL Object table (DBQLObjTbl). DBQL Object Data can be used to determine and remove unused secondary indexes, which will reduce maintenance overhead when loading data and help reclaim storage space.
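A minimal sketch of this cleanup follows. The DBQLObjTbl column names (ObjectType, ObjectNum, FreqofUse) and the 'Idx' type code are assumptions that can vary by release, and the index being dropped is the illustrative one created earlier in this document.

-- How often has each logged index been used by the Optimizer?
SELECT ObjectDatabaseName
      ,ObjectTableName
      ,ObjectNum              -- index number within the table
      ,SUM(FreqofUse) AS UseCount
FROM DBC.DBQLObjTbl
WHERE ObjectType = 'Idx'
GROUP BY 1, 2, 3
ORDER BY UseCount;

-- Drop a secondary index that turns out to be unused (illustrative index from earlier).
DROP INDEX (StoreId) ON SalesHistory;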
ENABLING THE CRUCIAL FIRST STEP
Determining the best tuning options for queries involves setting a baseline for existing performance conditions using DBQL data, piloting potential solutions through experimentation, and then analyzing the results. The Teradata Analyst Pack, along with other Teradata tools and utilities, enhances DBAs’ ability to analyze and understand the detailed steps involved in query plans and the influences of the system configuration, data demographics and secondary index structure. The Teradata Analyst Pack lets DBAs take the important first step in identifying ways to improve system performance and test proposed improvements without affecting the production environment.
Heartbeat Query – Performance Tuning
A heartbeat query can be any SQL statement run at specific intervals whose response time is being monitored. Heartbeat queries are used to:
- Measure response time as an indicator of system demand or system/database hangs
- Initiate an alert system if response time degrades so that you can take appropriate action
Heartbeat queries are classified as
- System
- Production
System Heartbeat Queries:
- used to check for overall system/database hangs and to react when response time reaches a certain threshold or stalls, for example by sending an alert and/or capturing system-level information
- should execute diagnostics that capture the state of the system if performance stalls
- are intended to focus on the Teradata core system; they should be short-running, low-impact queries on tables that are normally not write-locked
- are more useful when run frequently (for example, every 5-10 minutes)
- should be run on system nodes (which eliminates other factors such as middle tiers and network connections)
Example of system heartbeat query:
select * from dbc.dbcinfo;
As the query runs, Teradata Manager can monitor the query, logging start and end times.
Production Heartbeat Queries:
- response times play a major role in production heartbeat queries; they need to be monitored and stored for tracking purposes
- response times are an indicator of system demand: when system demand is high, heartbeat response time is high
- they can be run on production user tables
- they can monitor overall response
- they can monitor a specific area of the job mix
- they can be more complex and similar in nature to a particular type of production query, running in the same Priority Scheduler performance group
- they are run less frequently (for example, every 20-60 minutes)
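Example of a production heartbeat query (a sketch reusing the illustrative Sales table from earlier in this document; in practice this would be a query representative of the production workload being tracked):

SELECT SalesId
FROM Sales
WHERE SalesId = 100;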