UNION TRANSFORMATION
1. WHAT IS A UNION TRANSFORMATION?
2. WHAT ARE THE RESTRICTIONS OF UNION TRANSFORMATION?
3. HOW COME UNION TRANSFORMATION IS ACTIVE?
le target table? How do I go about it? Explain in detail through mapping flow.
- We can use joiner, if we want to join the data sources. Use a joiner and use the matching column to join the tables.
- We can also use a Union transformation, if the tables have some common columns and we need to join the data vertically. Create one union transformation add the matching ports form the two sources, to two different input groups and send the output group to the target.
The basic idea here is to use, either Joiner or Union transformation, to move the data from two sources to a single target. Based on the requirement, we may decide, which one should be used.
117.Why update strategy and union transformations are Active? Explain with examples.
- The Update Strategy changes the row types. It can assign the row types based on the expression created to evaluate the rows. Like IIF (ISNULL (CUST_DIM_KEY), DD_INSERT, DD_UPDATE). This expression, changes the row types to Insert for which the CUST_DIM_KEY is NULL and to Update for which the CUST_DIM_KEY is not null.
- The Update Strategy can reject the rows. Thereby with proper configuration, we can also filter out some rows. Hence, sometimes, the number of input rows, may not be equal to number of output rows.
Like IIF (IISNULL (CUST_DIM_KEY), DD_INSERT,
IIF (SRC_CUST_ID! =TGT_CUST_ID), DD_UPDATE, DD_REJECT))
Here we are checking if CUST_DIM_KEY is not null then if SRC_CUST_ID is equal to the TGT_CUST_ID. If they are equal, then we do not take any action on those rows; they are getting rejected.
118. What is a Union Transformation?
Union is an Active, Connected non-blocking multiple input group transformation used to merge data from multiple pipelines or sources into one pipeline branch. Similar to the UNION ALL SQL statement, the Union transformation does not remove duplicate rows.
119. What are the restrictions of Union Transformation?
- All input groups and the output group must have matching ports. The precision, data type, and scale must be identical across all groups.
- We can create multiple input groups, but only one default output group.
- The Union transformation does not remove duplicate rows.
- We cannot use a Sequence Generator or Update Strategy transformation
- upstream from a Union transformation.
- The Union transformation does not generate transactions.
120. How come union transformation is active?
Active transformations are those that may change the number or position of rows in the data stream. Any transformation that splits or combines data streams or reduces, expands or sorts data is an active transformation because it cannot be guaranteed that when data passes through the transformation the number of rows and their position in the data stream are always unchanged.
Union is an active transformation because it combines two or more data streams into one. Though the total number of rows passing into the Union is the same as the total number of rows passing out of it, and the sequence of rows from any given input stream is preserved in the output, the positions of the rows are not preserved, i.e. row number 1 from input stream 1 might not be row number 1 in the output stream. Union
does not even guarantee that the output is repeatable.
For Union, number of input rows does not match with the number of output rows. Consider, we have two sources with 10 and 20 rows individually. For each of this input Source we are getting 30 output rows. We could probably consider this like a Joiner with 10 and 20 rows with Full Outer Join, with no matching columns,
which will give you all the rows as output.
It is a debatable Topic as why UNION transformation is Active. Union Transformation is derived from Multigroup External transformation. As Multigroup External transformation is Active, Union transformationcan be termed as active.