SORTER TRANSFORMATION
1. WHAT IS A SORTER TRANSFORMATION?
2. WHY IS SORTER AN ACTIVE TRANSFORMATION?
3. HOW DOES SORTER HANDLE CASE SENSITIVE SORTING?
4. HOW DOES SORTER HANDLE NULL VALUES?
5. HOW DOES A SORTER CACHE WORK?
6. HOW TO DELETE DUPLICATE RECORDS OR RATHER TO SELECT DISTINCT ROWS FOR FLAT FILE SOURCES?
111. What is a Sorter Transformation?
Sorter is an Active Connected transformation used to sort data in ascending or descending order according to specified sort keys. The Sorter transformation contains only input/output ports.
The Sorter transformation can sort large volumes of data on multiple ports. It works much like the ORDER BY clause in SQL. The Sorter transformation is both Active and Connected.
An Active transformation can change the number of rows that pass through it, whereas a Passive transformation does not change the number of rows.
Most Informatica transformations are connected to the data path of the mapping.
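The ORDER BY comparison above can be sketched in Python. This is only an illustration of sorting on several keys with mixed directions, not how the Integration Service is implemented; the row data, the port names and the sort_rows helper are hypothetical.

```python
from operator import itemgetter

# Hypothetical input rows; the port names are illustrative only.
rows = [
    {"dept": "HR", "name": "Meera", "salary": 5200},
    {"dept": "IT", "name": "Arun", "salary": 7100},
    {"dept": "HR", "name": "Bala", "salary": 6100},
    {"dept": "IT", "name": "Chitra", "salary": 6800},
]

def sort_rows(rows, sort_keys):
    """Sort rows by (port, direction) pairs; the first pair is the most significant."""
    result = list(rows)
    # Python's sort is stable, so sorting on the least significant key first
    # and the most significant key last yields a multi-key ordering.
    for port, direction in reversed(sort_keys):
        result.sort(key=itemgetter(port), reverse=(direction == "desc"))
    return result

# Roughly: ORDER BY dept ASC, salary DESC
ordered = sort_rows(rows, [("dept", "asc"), ("salary", "desc")])
for row in ordered:
    print(row["dept"], row["name"], row["salary"])
```

Every input row appears in the output here, which is why a plain sort on its own does not change the row count.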
112. Why is Sorter an Active Transformation?
This is because we can select the “distinct” option in the Sorter properties. When the Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key, and the Integration Service discards duplicate rows compared during the sort operation. The number of output rows can therefore differ from the number of input rows, and hence it is an Active transformation.
In short, it is an Active transformation because discarding duplicate rows changes the number of rows.
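The effect of the distinct option can be sketched as follows. This is a minimal Python illustration that assumes every column is part of the sort key, as described above; the sort_distinct helper and the sample rows are hypothetical and do not represent the Integration Service's internal logic.

```python
def sort_distinct(rows):
    """Sort rows on all columns and drop any row equal to the one before it."""
    output = []
    for row in sorted(rows):
        # After sorting on every column, duplicates sit next to each other,
        # so comparing with the previous output row is enough.
        if not output or row != output[-1]:
            output.append(row)
    return output

# Hypothetical input with two duplicated rows.
rows = [(2, "B"), (1, "A"), (2, "B"), (3, "C"), (1, "A")]
distinct_rows = sort_distinct(rows)
print(len(rows), "rows in,", len(distinct_rows), "rows out")
```

Five rows go in and three come out, which is the change in row count that makes the transformation Active.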
113. How does Sorter handle Case Sensitive sorting?
The Case Sensitive property determines whether the Integration Service considers case when sorting data.
When we enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.
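The difference between the two settings can be sketched in Python. The key functions below are assumptions made for illustration: the case-sensitive key is one possible reading of "uppercase higher than lowercase" (same letter, lowercase first), and the actual ordering used by the Integration Service may depend on the code page and sort order configured for the session.

```python
names = ["banana", "Apple", "apple", "Banana"]

def case_insensitive_key(value):
    """Ignore case entirely, so 'Apple' and 'apple' compare as equal."""
    return value.lower()

def case_sensitive_key(value):
    """Compare letter by letter; for the same letter, uppercase ranks higher."""
    return [(ch.lower(), ch.isupper()) for ch in value]

insensitive = sorted(names, key=case_insensitive_key)
sensitive = sorted(names, key=case_sensitive_key)

print(insensitive)  # equal keys keep their input order
print(sensitive)    # lowercase form comes before the uppercase form
```

With the case-insensitive key, rows that differ only in case keep their incoming order; with the case-sensitive key, their relative order is decided by case.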
114. How does Sorter handle NULL values?
We can configure the way the Sorter transformation treats null values. Enable the property Null Treated Low if we want to treat null values as lower than any other value when it performs the sort operation. Disable this option if we want the Integration Service to treat null values as higher than any other value.
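The two null-handling behaviours can be sketched in Python, with None standing in for NULL. The sort_with_nulls helper and its null_treated_low parameter are hypothetical names that mirror the property described above; this is not Informatica code.

```python
def sort_with_nulls(values, null_treated_low=False):
    """Sort ascending, placing None first (low) or last (high)."""
    def key(value):
        is_null = value is None
        # The first tuple element decides where nulls go; the second orders
        # the non-null values. Nulls get a dummy 0 so they never need to be
        # compared with real values.
        if null_treated_low:
            return (0 if is_null else 1, 0 if is_null else value)
        return (1 if is_null else 0, 0 if is_null else value)
    return sorted(values, key=key)

values = [30, None, 10, 20, None]
nulls_low = sort_with_nulls(values, null_treated_low=True)
nulls_high = sort_with_nulls(values, null_treated_low=False)
print(nulls_low)
print(nulls_high)
```

In an ascending sort, treating nulls as low puts them at the top of the output, and treating them as high puts them at the bottom.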
115. How does a Sorter Cache work?
The Integration Service passes all incoming data into the Sorter Cache before the Sorter transformation performs the sort operation.
The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure Sorter cache size with a value less than or equal to the amount of available physical RAM on the Integration Service machine.
If the amount of incoming data is greater than the Sorter cache size, the Integration Service temporarily stores data in the Sorter transformation work directory. The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the work directory.
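The sizing rule above can be turned into a quick check. This is a rough Python sketch that only applies the two statements from the text (data spills to the work directory when it exceeds the cache size, and the work directory then needs at least twice the incoming data); the sorter_storage_plan helper and the sample sizes are hypothetical.

```python
MB = 1024 ** 2

def sorter_storage_plan(incoming_bytes, cache_size_bytes):
    """Estimate whether the sort spills to disk and the minimum disk space needed."""
    spills = incoming_bytes > cache_size_bytes
    return {
        "spills_to_disk": spills,
        # At least twice the incoming data is needed once data goes to disk.
        "min_work_dir_bytes": 2 * incoming_bytes if spills else 0,
    }

# Hypothetical session: 3072 MB of incoming data against a 1024 MB cache.
plan = sorter_storage_plan(incoming_bytes=3072 * MB, cache_size_bytes=1024 * MB)
print(plan["spills_to_disk"], plan["min_work_dir_bytes"] // MB, "MB")
```

For this hypothetical session the data does not fit in the cache, so the work directory would need at least 6144 MB free.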
116. How to delete duplicate records or rather to select distinct rows for flat file sources?