TERADATA TPUMP

TPUMP

277.A certain load is being imposed on the table and that too, every hour. The traffic in the morning is relatively low, and that of the night is very high. As per this situation, which is the most advisable utility and how is that utility supposed to be loaded?

The most suggestible utility here has to be Tpump. By making use of packet size decreasing or increasing, the traffic can be easily handled.

278.Explain TPUMP (Teradata Parallel Data Pump) Utility in Teradata?

  • TPUMP allows near real time updates from Transactional Systems into the Data Warehouse.It can perform Insert, Update and Delete operations or a combination from the same source.
  • It can be used as an alternative to MLOAD for low volume batch maintenance of large databases.
  • TPUMP allows target tables to have Secondary Indexes, Join Indexes, Hash Indexes, Referential Integrity, Populated or Empty Table, Multiset or Set Table or Triggers defined on the Tables.
  • TPUMP can have many sessions as it doesn’t have session limit.
  • TPUMP uses row hash locks thus allowing concurrent updates on the same table
  • It is called TPump because you can adjust the rate for which you want the data loaded, from trickle to maximum speed.
  • TPump is used when you want to load during the day while simultaneously querying a table.
  • TPump does it all and is used for maintenance on populated tables one row at a time.
  • TPump is NOT a Block Level Utility and has No session  Limits
  • so it is perfect when you don’t need to load massive amounts of data or you want to simultaneously load and query a table.
  • It is often used if the tables you’re loading to have secondary indexes, join indexes, triggers, or referential integrity.
  • It is also great when you need hundreds of load jobs to run simultaneously.
  • It is not as fast as FastLoad or MultiLoad, but it is pretty darn fast.
  • It is called TPump because you can adjust the rate for which you want the data loaded, from a trickle to maximum speed.
  • TPump is NOT a Block Level Utility and has No Limits
  • FastLoad performs Inserts into empty Teradata tables in 64K blocks.
  • MultiLoad loads to populated Teradata tables in 64K blocks.
  • FastExport – Exports data off of Teradata at the 64K blo
  • No Secondary Indexes (FastLoad)
  • No Unique Secondary Indexes (FastLoad and MultiLoad)
  • No Triggers (FastLoad and MultiLoad)
  • No Join Indexes (FastLoad and MultiLoad)
  • No Referential Integrity (FastLoad and MultiLoad)
  • Only 30 FastLoad and MultiLoad combinations can run at one time.
  • Only 60 FastExport jobs can run simultaneously!
  • TPump has none of the limits above, so it is perfect when you don’t need to load massive amounts of data or you want to simultaneously load and query a table.
  • It is often used if the tables you’re loading to have secondary indexes, join indexes, triggers, or referential integrity.
  • It is also great when you need hundreds of load jobs to run simultaneously.
  • It is not as fast as FastLoad or MultiLoad, but it is pretty darn fast

280. What are TPUMP Utility Limitations?

  • Use of SELECT statement is not allowed.
  • Concatenation of Data Files is not supported.
  • Exponential & Aggregate Operators are not allowed.
  • Arithmetic functions are not supported.

Rule #1: No concatenation of input data files is allowed. TPump is not designed to support this.
Rule #2: TPump will not process aggregates, arithmetic functions or exponentiation. If you need data conversions or math, you might consider using an INMOD to prepare the data prior to loading it.
Rule #3: The use of the SELECT function is not allowed. You may not use SELECT in your SQL statements.
Rule #4: No more than four IMPORT commands may be used in a single load task. This means that at most, four files can be directly read in a single run.
Rule #5: Dates before 1900 or after 1999 must be represented by the yyyy format for the year portion of the date, not the default format of yy. This must be specified when you create the table. Any dates using the default yy format for the year are taken to mean 20th century years.
Rule #6: On some network attached systems, the maximum file size when using TPump is 2GB. This is true for a computer running under a 32-bit operating system.
Rule #7: TPump performance will be diminished if Access Logging is used. TPump uses normal SQL to accomplish its tasks.

281.

AS in MLOAD

SESSIONS Max [Min]
ERRORTABLE Tablename (defaults to jobname_ET)
ERRLIMIT Errcount [errpercent]
CHECKPOINT Frequency (default is 15 minutes)
TENACITY Hours to continue to try and logon to run (default 4)
SLEEP Minutes to sleep between logon attempts

These .BEGIN Parameters are only used in TPump

  • SERIALIZE If ON, this options guarantees that operations on a row occur serially.
  • PACK Statements to pack into a multiple-statement request. (default is 20, max is 600)
  • PACKMAXIMUM Use the max pack factor.
  • RATE Max rate statements are sent to the Teradata RDBMS per minute. (Default unlimited)
  • LATENCY Number (range is from 10-600 seconds) Allows TPump to commit to Teradata any data sitting in the buffer longer than the LATENCY value.
  • NOMONITOR Prevents TPump from checking for statement rate changes from or update status information for the TPump Monitor application. By default the monitoring is ON
  • ROBUST ON|OFF (default is ON). The OFF parameter signals TPump to use “simple” restart logic. In this case, restarts cause TPump to begin where the last checkpoint occurred.
  • MACRODB dbname (default is logtable database). Database if a macro is used in the TPump job.

Five Formats of TPump Files

Binary• Each record is a 2-byte integer, n, that is followed by n bytes of data. A byte is the smallest means of storage for Teradata.
FastLoad• This format is the same as Binary, plus a marker (X ‘0A’ or X ‘0D’) that specifies the end of the record.
Text• Each record has a random number of bytes and is followed by an end of the record marker.
Unformat• The format for these input records is defined in the LAYOUT statement of the MultiLoad script using the components FIELD, FILLER and TABLE.
Vartext• This is variable length text RECORD format separated by delimiters such as a comma. For this format you may only use VARCHAR, LONG VARCHAR (IBM) or VARBYTE data formats in your MultiLoad LAYOUT. Note that two delimiter characters in a row will result in a null value between them.

Sample script

Executing a TPump Script