1. WHAT IS A LOOKUP TRANSFORM?
2. WHAT ARE THE DIFFERENCES BETWEEN CONNECTED AND UNCONNECTED LOOKUP?
3. WHAT ARE THE DIFFERENT LOOKUP CACHE(S)?
4. IS LOOKUP AN ACTIVE OR PASSIVE TRANSFORMATION?
5. WHAT IS THE DIFFERENCE BETWEEN STATIC AND DYNAMIC LOOKUP CACHE?
6. WHAT ARE THE USES OF INDEX AND DATA CACHES?
7. WHAT IS PERSISTENT LOOKUP CACHE?
8. WHAT TYPE OF JOIN DOES LOOKUP SUPPORT?
9. EXPLAIN HOW LOOKUP TRANSFORMATION WORKS LIKE SQL LEFT OUTER JOIN.
10. WHERE AND WHY DO WE USE UNCONNECTED LOOKUP INSTEAD OF CONNECTED LOOKUP?
11. HOW CAN WE IDENTIFY PERSISTENT CACHE FILES IN INFORMATICA SERVER?
12. HOW TO CONFIGURE A LOOKUP ON A FLAT FILE WITH HEADER?
13. WHAT IS THE DIFFERENCE BETWEEN PERSISTENT CACHE AND SHARED CACHE?
14. DESCRIBE HOW TO RETURN MULTIPLE PORT VALUES FROM UNCONNECTED LOOKUP IN INFORMATICA.
15. HOW TO MAKE THE PERSISTENT LOOKUP CACHE IN SYNC WITH LOOKUP TABLE?
16. IF WE USE PERSISTENT CACHE FOR A DYNAMIC LOOKUP, WILL THE CACHE FILE BE UPDATED OR INSERTED AS REQUIRED?
17. IS THERE ANYTHING WRONG IN SHARING A PERSISTENT CACHE BETWEEN STATIC AND DYNAMIC LOOKUP?
18. WHAT IS THE DIFFERENCE BETWEEN THE TWO UPDATE PROPERTIES – UPDATE ELSE INSERT, INSERT ELSE UPDATE IN DYNAMIC
19. IF THE DEFAULT VALUE FOR THE LOOKUP RETURN PORT IS NOT SET, WHAT WILL BE THE OUTPUT WHEN THE LOOKUP CONDITION
20. HOW CAN WE ENSURE DATA IS NOT DUPLICATED IN THE TARGET WHEN THE SOURCE HAS DUPLICATE RECORDS, USING LOOKUP
46.When do you use SQL override in a lookup transformation?
You should override the lookup query in the following circumstances:
- Override the ORDER BY clause. Create the ORDER BY clause with fewer columns to increase performance. When you override the ORDER BY clause, you must suppress the generated ORDER BY clause with a comment notation.
Note: If you use pushdown optimization, you cannot override the ORDER BY clause or suppress the generated ORDER BY clause with a comment notation.
- A lookup table name or column names contains a reserved word. If the table name or any column name in the lookup query contains a reserved word, you must ensure that they are enclosed in quotes.
- Use parameters and variables. Use parameters and variables when you enter a lookup SQL override. Use any parameter or variable type that you can define in the parameter file. You can enter a parameter or variable within the SQL statement, or use a parameter or variable as the SQL query. For example, you can use a session parameter, $ParamMyLkpOverride, as the lookup SQL query, and set $ParamMyLkpOverride to the SQL statement in a parameter file. The designer cannot expand parameters and variables in the query override and does not validate it when you use a parameter or variable. The integration service expands the parameters and variables when you run the session.
- A lookup column name contains a slash (/) character. When generating the default lookup query, the designer and integration service replace any slash character (/) in the lookup column name with an underscore character. To query lookup column names containing the slash character, override the default lookup query, replace the underscore characters with the slash character, and enclose the column name in double quotes.
- Add a WHERE clause. Use a lookup SQL override to add a WHERE clause to the default SQL statement. You might want to use the WHERE clause to reduce the number of rows included in the cache. When you add a WHERE clause to a Lookup transformation using a dynamic cache, use a Filter transformation before the Lookup transformation to pass rows into the dynamic cache that match the WHERE clause.
Note: The session fails if you include large object ports in a WHERE clause.
- Other. Use a lookup SQL override if you want to query lookup data from multiple lookups or if you want to modify the data queried from the lookup table before the Integration Service caches the lookup rows. For example, use TO_CHAR to convert dates to strings.
47.How Do You Create A Mapping Using Multiple Lookup Transformation?
Use unconnected lookup if same lookup repeats multiple times.
Which Is Better Among Connected Lookup And Unconnected Lookup Transformations In Informatica Or Any Other Etl Tool?
If you are having defined source you can use connected, source is not well defined or from different database you can go for unconnected.
Connected and unconnected lookup depends on scenarios and performance If you are looking for a single value for look up and the value is like 1 in 1000 then you should go for unconnected lookup. Performance wise its better as we are not frequently using the transformation. If multiple columns are returned as lookup value then one should go for connected lookup.
48. What is a Lookup transform?
Lookup transformations are those transformations which have admission right to RDBMS based data set. The server makes the access faster by using the lookup tables to look at explicit table data or the database. Concluding data is achieved by matching the look up condition for all look up ports delivered during transformations
- Lookup transformation is used to look up a source, source qualifier, or target in order to get relevant data.
- It is used to look up a ‘flat file’, ‘relational table’, ‘view’ or ‘synonym’.
- Lookup can be configured as Active or Passive as well as Connected or Unconnected transformation.
- When the mapping contains the lookup transformation, the integration service queries the lookup data and compares it with lookup input port values. One can use multiple lookup transformations in a mapping.
- The lookup transformation is created with the following type of ports:
- Input port (I)
- Output port (O)
- Look up Ports (L)
- Return Port (R)
The result is passed to other transformations and the target.
- Get related value
- Perform a calculation
- Update slowly changing dimension tables.
- It is used for looking up data in a relational table through mapping.
- Lookup definition from any relational database is imported from a source which has tendency of connecting client and server.
- One can use multiple lookup transformation in a mapping.
We have the following types of Lookup.
- Relational or flat file lookup. To perform a lookup on a flat file or a relational table.
- Pipeline lookup. To perform a lookup on application sources such as JMS or MSMQ.
- Connected or unconnected lookup.
- A connected Lookup transformation receives source data, performs a lookup, and returns data to the pipeline.
- An unconnected Lookup transformation is not connected to a source or target. A transformation in the pipeline calls the Lookup transformation with a: LKP expression. The unconnected Lookup transformation returns one column to the calling transformation.
- Cached or un-cached lookup.We can configure the lookup transformation to Cache the lookup data or directly query the lookup source every time the lookup is invoked. If the Lookup source is Flat file, the lookup is always cached.
49. What are the differences between Connected and Unconnected Lookup
So it can be said that an unconnected lookup can be called multiple times in mapping..
Any number of input parameters can exist. For instance, you can provide input parameters like column 1, column 2, column 3, and so on. But the return value would only be one.
50.How you can differentiate between Connected LookUp and Unconnected LookUp?
Connected Lookup is part of the data flow which is connected to another transformation, it takes data input directly from another transformation which performs a lookup. It uses both static and dynamic Cache.
Unconnected Lookup does not take the data input from another transformation but it can be used as a function in any transformation using LKP(LookUp) Expression. It uses the only static cache.
51.Explain the features of Connected and Unconnected lookup.
The features of Connected Lookup can be explained as follows:
- There is a direct source of input from the pipeline for connected lookup.
- It has active participation in data flow and both dynamic as well as the static cache is used as the case is.
- It caches all lookup columns and returns the default values as the output when lookup condition does not match.
- More than one column values can be returned to the output port.
- Multiple output values are passed as well as output ports are linked to another transformation.
- Connected lookup supports user-defined default values.
The features of unconnected lookup can be explained as follows:
- Unconnected lookup uses static cache and its source of input is the result received from the output of LKP expression.
- It caches only the lookup output ports and returns the value as NULL when lookup condition does not match.
- Only one column is returned from each port.
- Only one output value is passed to another transformation.
- User-defined default values are not supported by unconnected lookup.
52. What are the different lookup cache(s)?
- Informatica Lookups can be cached or un-cached (No cache).
- And Cached lookup can be either static or dynamic.
- By default, Informatica cache is static cache.
A lookup cache can also be divided as persistent or non-persistent based on whether Informatica retains the cache even after the completion of session run or deletes it.
- Static Cache
- Dynamic Cache
- Persistent Cache
- Shared Cache
Based on the configurations done at lookup transformation/Session Property level, we can have following types of Lookup Caches.
- Persistent Cache
By default, the Lookup caches are deleted post successful completion of the respective sessions but, we can configure to preserve the caches, to reuse it next time.
- Shared Cache
We can share the lookup cache between multiple transformations. We can share an unnamed cache between transformations in the same mapping. We can share a named cache between transformations in the same or different mappings.
1. No Cache
- To disable cache, un check the “Lookup caching enabled” option in the session properties.
- When Cache is not enabled, For every Row, Look up hits the database for comparison and returns the result.
- Un- cached lookup For each record, it goes to the lookup Source, performs the lookup and returns value. So for 10K rows, it will go the Lookup source 10K times to get the related values.
- In order to reduce the to and fro communication with the Lookup Source and Informatica Server, we can configure the lookup transformation to create the cache.
- In this way, the entire data from the Lookup Source is cached and all lookups are performed against the Caches.
- Static Cache
- In case of static Cache, Look up creates a one-time cache before start reading-writing to the source-target tables.
- The Cache is nothing but the snapshot of the data in the Look up table.
- For every row, look up refers to the cache and returns the result.
- As this is a one time snapshot, so this cache is said to be static cache.
- Any further data modification in the physical table don’t affect the cache, once created.
- Dynamic Cache
- The cache updates or changes dynamically when lookup on target table.
- The dynamic lookup T/R allows for the synchronization of the target lookup table image in the memory with its physical table in the database.
- The dynamic lookup T/R or dynamic lookup cache is operated in only connected mode (connected lookup )
- Dynamic lookup cache support only equality conditions (=conditions)
|New Lookup Row||Description|
|0||The integration service does not update or insert the row in the cache|
|1||The integration service inserts the row into the cache|
|2||The integration service updates the row in the cache|
- Look up creates a one-time cache before start reading-writing to the source-target tables.
- after that, for every row, look up either inserts or updates the target table as per the defined logic/condition.
- So, at one point of time, both the target table as well as the Look up cache is updated.
- Hence , at the run time, the cache is updated.
- so this type of cache is called as Dynamic cache.
- Dynamic caches are helpful while dealing with Duplicate Records.
- The startegy to be set in the session properties to deal with each row for insert/update in the cache.
- Look-up has very powerful usage and should be used wisely to get the optimum performance.
- basing upon the type of data we are delaing, we have to correctly choose, which type of cache to go with.
Based on the types of the Caches configured, we can have two types of caches, Static and Dynamic.
The Integration Service performs differently based on the type of lookup cache that is configured. The following table compares Lookup transformations with an uncached lookup, a static cache, and a dynamic cache:
53. Is lookup an active or passive transformation?
From Informatica 9x, Lookup transformation can be configured as an “Active” transformation.
Find out How to configure lookup as active transformation.
However, in the earlier versions of Informatica, lookup is a passive transformation.
54. What is the difference between Static and Dynamic Lookup Cache?
We can configure a Lookup transformation to cache the underlying lookup table. In case of static or read only lookup cache the Integration Service caches the lookup table at the beginning of the session and does not update the lookup cache while it processes the Lookup transformation. Rows are not added dynamically in the cache.
In case of dynamic lookup cache the Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target. It basically, caches the rows as and when it is passed.
In case you are wondering why we need to make lookup cache dynamic, read this article on dynamic lookup.
55. What are the uses of index and data caches?
The conditions are stored in index cache and records from the lookup are stored in data cache
56. What is Persistent Lookup Cache?
If the cache generated for a Lookup needs to be preserved for subsequent use then persistent cache is used. It will not delete the index and data files. It is useful only if the lookup table remains constant. Lookups are cached by default in Informatica. Lookup cache can be either non-persistent or persistent. The Integration Service saves or deletes lookup cache files after a successful session run based on, whether the Lookup cache is checked as persistent or not.
57. What type of join does Lookup support?
Lookup is just similar like SQL LEFT OUTER JOIN.
58. Explain how lookup transformation works like SQL Left Outer Join.
Lookup means if the source input column value matches the lookup table comparison column value then it will Return valid values from the lookup table else it will return NULL. Let’s consider the EMP table as Source and DEPT table as lookup. We want to extract the location of eachemployee based on his or her department number. So if the Location details are not available in the DEPT table, still we want to have all the other information of the employee coming from the source EMP table, apart from NULL as location and load in our target table.
So the equivalent SQL query looks like below:-
SELECT EMP.*, DEPT.LOC
FROM EMP LEFT OUTER JOIN DEPT
ON EMP.DEPTNO = DEPT.DEPTNO
Hence Lookup is associated with the Source table as Left Outer Join.
59.Where and why do we use Unconnected Lookup instead of Connected Lookup?
The best part of unconnected lookup is that, we can call the lookup based on some condition and not every time. I.e. based on some condition met we can invoke the unconnected lookup in an expression transformation else not. By this we may optimize the performance of a flow. We may consider unconnected lookup as a function in any procedural language. It takes multiple parameters as input and returns one values, and can be used repeatedly. Same way unconnected lookup can be used in any scenario where we need to use the lookup repeatedly either in single or multiple transformation.
With the unconnected lookup, we get the performance benefit of not caching the same data multiple times.
Also it is a good coding practice.
60.How can we Identify Persistent Cache Files in Informatica Server?
- Cache files are generated in the Cache directory of the Informatica Server for transformations like Aggregator, Joiner, Lookup, Rank & Sorter.
- Two types of cache files are generated i.e. the data and index files exception being Sorter transformation.
- Most Important point is that Informatica automatically deletes all the generated .dat and .idx cache files after a session run is finished.
- So the files that are present in the Cache directory are basically the Persistent Cache files of Lookup transformation, Aggregator Cache files of Incremental Aggregation sessions or if the session run was not successfully completed.
Informatica generated cache files are named as:
PMAGG*.idx, PMAGG*.dat, PMJNR*.idx, PMJNR*.dat, PMLKP*.idx, PMLKP*.dat.
Often while handling big data cache Informatica creates multiple index and data files due to paging and appends a number to the end of the files e.g. PMAGG*.dat0, PMAGG*.idx0, PMAGG*.dat1, PMAGG*.idx1.
So if we have followed any particular naming convention for Lookup Persistent Cache Name e.g. table_ name_PC or the table names have a convention like GDW_ then use shell commands accordingly to identify the cache files in server.
In this context you can revisit Lookup Persistent Cache and Incremental Aggregation article
61.How to configure a Lookup on a flat file with header?
When we try to create a lookup transformation, we have the option to select the location of the Lookup Table from any of Source, Target, Source Qualifier, Import from Relational Table or Import from Flat File.
So after selecting the flat file as lookup from the desired location, the edit Transformation tab of the lookup will have the Flat file information to choose between Delimited or Fixed width and advanced properties to modify like Column Delimiters, Code Page and obviously Number of initial rows to skip.Set Number of initial rows to skip as 1. Set the Lookup condition as required.
Apart from that go to the Mapping tab of the corresponding session and select the lookup transformation to configure the Lookup source file directory and filename and Lookup source file type i.e. Direct or Indirect
62.What is the difference between persistent cache and shared cache?
Persistent cache is a type of Informatica lookup cache in which the cache file is stored in disk. We can configure the session to re-cache if necessary. It will be used only if we are sure that lookup table will not change between sessions.
It will be used if your mapping uses any static tables as lookup mostly.
If the persistent cache is shared across mappings, we call it as shared cache (named). We will provide a name for this cache file.
If the lookup table is used in more than one transformation/mapping then the cache built for the first lookup can be used for the others. It can be used across mappings.
For Shared cache we have to give the name in cache file name prefix property. Use the same name it in different lookup where we want to use the cache.
Unshared cache: Within the mapping if the lookup table is used in more than one transformation then the cache built for the first lookup can be used for the others. It cannot be used across mappings.
63.Describe how to return multiple port values from unconnected lookup in Informatica.
Informatica Unconnected Lookup by default supports only one return port.
So alternatively we can write a Lookup SQL override with the required ports values concatenated into a single string as return port value.
Call the Unconnected lookup from the expression transformation and use various output ports to retrieve the lookup values based on the concatenated return value. Use SUBSTR, INSTR functions to extract the column values from the concatenated return field.
64.How to make the persistent lookup cache in sync with lookup table?
To make the persistent cache in sync with the lookup table simply enable Re-cache option of the lookup transformation to rebuild the lookup cache from lookup table again. While loading the target dimension table
we can choose to make the lookup cache dynamic and recache-persistent so that once dimension is loaded the persistent cache file is in sync and available during Fact table loading.
65.If we use persistent cache for a dynamic lookup, will the cache file be updated or inserted as required?
Having persistent cache will not impact the dynamic cache anyway in doing insert & updates to the cache file. Just that cache file will have a proper name assigned using persistent named cache and it can be reused later.
66.Is there anything wrong in sharing a persistent cache between static and dynamic lookup?
Static & Dynamic lookup cannot share the same persistent cache.
67.What is the difference between the two update properties – update else insert, insert else update in dynamic lookup cache?
In Dynamic Cache:
Update else Insert: In this scenario, if incoming record already exists in lookup cache then the record is going to be updated in the cache and also the target else it will be inserted.
Insert else Update: In this scenario, if incoming record does not exist in lookup cache then the record is going to be inserted in the cache and also the target else it will be updated.
These options play a role in the performance part. If we know the nature of the source data we can set the update option accordingly. Suppose if the maximum source data is destined for insert we will select Insert else Update, otherwise we will go for Update else Insert. Also, if the number of duplicate records coming from Source is greater or there are few potential duplicates in source then we go for Update Else Insert or
Insert Else Update respectively for better performance.
68.If the default value for the lookup return port is not set, what will be the output when the lookup condition fails?
NULL will be returned from lookup transformation on lookup condition failure.
69.How can we ensure data is not duplicated in the target when the source has duplicate records, using lookup transformation?
Using Dynamic lookup cache we can ensure duplicate records are not inserted in the target. That is through Using Dynamic Lookup Cache of the target table and associating the input ports with the lookup port and checking the Insert Else Update option will help to eliminate the duplicate records in source and hence loading
unique records in the target.
For more details check, Dynamic Lookup Cache