Data warehousing with Teradata: August 2013

Thursday, August 22, 2013

Index creation in Teradata

There are two types of indexes in Teradata, namely Primary Index and Secondary index. primary index should be created while creating table, whereas secondary index can be created after table has been created.

Secondary index create syntax:

non-unique(NUSI):

Create Index indexname(col1,col2,...) on databasename.tablename;

unique(USI):

Create Unique Index indexname(col1,col2,...) on databasename.tablename;

Tuesday, August 20, 2013

What is Hot Amp in Teradata?

Hot Amp is situation when one or few AMP's in the system work more than other AMP's in the system due to uneven data distribution among the AMP's.

when a select query is executed on skewed tables, few amps with more data will work longer while other AMP's with lower data volume will finish its processing earlier and sit idle waiting for Hot AMP's to finish its processing. in this scenario the overall query performance directly proportional to the Hot AMP's performance.

Find Skewed tables

we can find the skewed tables of a system by using below query, we can use the skewed tables for Primary Index revisit for performance enhancement of overall system.

Lock Dbc.TableSize For Access
Lock Dbc.diskspace For Access
SELECT b.databasename,b.TableName AS TableName,c.tablekind,SUM(CurrentPerm) /1024/1024 AS "CurrentPerm(MB)",
SUM(PeakPerm)/1024/1024 AS "PeakPerm(MB)",(100 - (AVG(CurrentPerm)/MAX(CurrentPerm)*100)) AS SkewFactor
FROM Dbc.TableSize b,dbc.tables c, (
Select
databasename ,zeroifnull((Max(currentperm) - Avg(currentperm))/max(nullifzero(currentperm)) * 100) As SKW
From dbc.diskspace
Group By 1
Having SKW >=10 ) a
WHERE b.DataBaseName=a.databasename
and c.tablename=b.tablename
and c.tablekind='T'
Group By 1,2,3
Having SkewFactor >=20
Order By "CurrentPerm(MB)" desc;

Hash function in teradata

Data distribution of a table on all AMP's depends on Hashvalue calculated on Primary Index value, Teradata has is own algorithm to calculate hash value. There are four type of hash functions in Teradata which simulate the hash algorithm, which can be used to find the data distribution of a table.

1. HASHROW(column1,..)
2. HASHBUCKET(Hashrow)
3. HASHAMP(hashbucket)
4. HASHBAKAMP(hashbucket)

Query to check data distribution of a table:

SELECT HASHAMP (HASHBUCKET (HASHROW (columnName))) AS “AMPNO”, COUNT (*) FROM Tablename GROUP BY 1 ORDER BY 2 DESC;

You can use below query to check on which amp number, amp bucket number a particular value will be mapped to.

SELECT HASHROW ('motivation') AS "Hash Value"
, HASHBUCKET (HASHROW ('motivation')) AS "Bucket No"
, HASHAMP (HASHBUCKET (HASHROW ('motivation'))) AS "AMP No"
, HASHBAKAMP (HASHBUCKET (HASHROW ('motivation'))) AS "AMP Fallback No" ;

you can use HASHROW() function to check hash collision.

SELECT HASHROW ('motivation') AS "Hash Value";

Monday, August 12, 2013

Teradata Viewpoint Services

Teradata viewpoint is web application and is very interactive and efficient monitoring tool, which replace teradata manager and its functionality from version td13, it is installed and run on dedicated server in the network to capture various monitoring and performance data of teradata database system.

viewpoint basically capture data using its data collection service (dcs) and save it to its back-end database runs in postgressql database, its portal service called viewpoint service displays the collected information interactively on viewpoint portlet.

there is mainly 3 services which manage the viewpoint operation
1. viewpoint (portal service).
2. dcs (data collector service)
3. postgresql (back-end database service)

we can check the status of services or start/stop/restart as we face some issue

$/etc/init.d/dcs start/stop/status

$/etc/init.d/postgresql start/stop/status

$/etc/init.d/viewpoint start/stop/status

we have their respective log files to check for any error or warnings.

/opt/teradata/viewpoint/logs/viewpoint.log

/opt/teradata/dcs/logs/dcs.log

* apart from these services there is supporting service called CAM services for all java based services.