Wednesday, November 12, 2014

Interview Questions Informatica - 1

1. What is an update strategy transformation?

Update strategy transformation is used to flag source rows for insert, update, delete or reject within a mapping. Based on this flagging each row will be either inserted or updated or deleted from the target. Alternatively the row can be rejected.

2. Why is the update strategy an active transformation?

Because the update strategy transformation can reject rows, and thereby change the number of rows that pass through it, it is classified as an active transformation.

3. What are the constants used in update strategy transformation for flagging the rows?
  • DD_INSERT is used for inserting the rows. The numeric value is 0.
  • DD_UPDATE is used for updating the rows. The numeric value is 1.
  • DD_DELETE is used for deleting the rows. The numeric value is 2.
  • DD_REJECT is used for rejecting the rows. The numeric value is 3.
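
These constants can be illustrated with a small Python sketch (not Informatica code). The row fields and flagging rules below are hypothetical, purely to show how an update-strategy-style expression resolves each row to one of the four numeric flags:

```python
# Illustrative only: the rows, port names, and flagging rules are made up.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row):
    """Mimic an expression like IIF(ISNULL(CUST_ID), DD_REJECT, ...)."""
    if row.get("cust_id") is None:
        return DD_REJECT        # no key: reject the row
    if row.get("is_new"):
        return DD_INSERT        # brand-new row: insert into the target
    return DD_UPDATE            # existing row: update the target

rows = [
    {"cust_id": 101, "is_new": True},
    {"cust_id": 102, "is_new": False},
    {"cust_id": None, "is_new": False},
]
print([flag_row(r) for r in rows])  # [0, 1, 3]
```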

4. If you place an aggregator after the update strategy transformation, how will the output of the aggregator be affected?

The update strategy transformation flags the rows for insert, update, delete, or reject before you perform the aggregate calculation. How you flag a particular row determines how the aggregator transformation treats any values in that row used in the calculation. For example, if you flag a row for delete and then later use the row to calculate the sum, the integration service subtracts the value appearing in this row. If the row had been flagged for insert, the integration service would add its value to the sum.
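
The add/subtract behavior can be sketched in Python. This is a simplified model of how an aggregator treats flagged rows in a SUM, not the actual engine logic:

```python
# Simplified model: insert-flagged values are added to the sum,
# delete-flagged values are subtracted, reject/other rows are ignored here.
DD_INSERT, DD_UPDATE, DD_DELETE = 0, 1, 2

def flagged_sum(flagged_rows):
    total = 0
    for flag, value in flagged_rows:
        if flag == DD_INSERT:
            total += value      # insert: value added to the sum
        elif flag == DD_DELETE:
            total -= value      # delete: value subtracted from the sum
    return total

print(flagged_sum([(DD_INSERT, 100), (DD_INSERT, 50), (DD_DELETE, 30)]))  # 120
```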

5. How to update the target table without using update strategy transformation?

In the session properties, there is an option 'Treat Source Rows As'. Using this option, you can specify whether all the source rows should be inserted, updated, or deleted.

6. If you have an update strategy transformation in the mapping, what should be the value selected for 'Treat Source Rows As' option in session properties?

The value selected for the option is 'Data Driven'. The integration service follows the instructions coded in the update strategy transformation.

7. If you have an update strategy transformation in the mapping and you did not select the value 'Data Driven' for the 'Treat Source Rows As' option in the session, how will the session behave?

If you do not choose Data Driven when a mapping contains an Update Strategy or Custom transformation, the Workflow Manager displays a warning. When you run the session, the Integration Service does not follow instructions in the Update Strategy transformation in the mapping to determine how to flag rows.

8. To which files is the data rejected by the update strategy transformation written?

If the update strategy transformation is configured to Forward Rejected Rows, the integration service forwards the rejected rows to the next transformation and writes them to the session reject file. If you do not select the forward rejected rows option, the integration service drops rejected rows and writes them to the session log file. If you enable row error handling, the Integration Service writes the rejected rows and the dropped rows to the row error logs. It does not generate a reject file.

INFORMATICA INTERVIEW QUESTIONS ON LOOKUP TRANSFORMATION

 

1. What is a lookup transformation?
A lookup transformation is used to look up data in a flat file, relational table, view, or synonym.

2. What are the tasks of a lookup transformation?
The lookup transformation is used to perform the following tasks:
  • Get a related value: Retrieve a value from the lookup table based on a value in the source.
  • Perform a calculation: Retrieve a value from a lookup table and use it in a calculation.
  • Update slowly changing dimension tables: Determine whether rows exist in a target.
3. How do you configure a lookup transformation?
Configure the lookup transformation to perform the following types of lookups:
  • Relational or flat file lookup
  • Pipeline lookup
  • Connected or unconnected lookup
  • Cached or uncached lookup

4. What is a pipeline lookup transformation?
A pipeline lookup transformation is used to perform a lookup on application sources such as JMS, MSMQ, or SAP. A pipeline lookup transformation has a source qualifier as the lookup source.

5. What is connected and unconnected lookup transformation?
  • A connected lookup transformation is connected to other transformations in the mapping pipeline. It receives source data, performs a lookup, and returns data to the pipeline.
  • An unconnected lookup transformation is not connected to the other transformations in the mapping pipeline. A transformation in the pipeline calls the unconnected lookup with a :LKP expression.

6. What are the differences between connected and unconnected lookup transformation?
  • Connected lookup transformation receives input values directly from the pipeline. Unconnected lookup transformation receives input values from the result of a :LKP expression in another transformation.
  • Connected lookup transformation can be configured as dynamic or static cache. Unconnected lookup transformation can be configured only as static cache.
  • Connected lookup transformation can return multiple columns from the same row or insert into the dynamic lookup cache. Unconnected lookup transformation can return one column from each row.
  • If there is no match for the lookup condition, connected lookup transformation returns default value for all output ports. If you configure dynamic caching, the Integration Service inserts rows into the cache or leaves it unchanged. If there is no match for the lookup condition, the unconnected lookup transformation returns null.
  • In a connected lookup transformation, the cache includes the lookup source columns in the lookup condition and the lookup source columns that are output ports. In an unconnected lookup transformation, the cache includes all lookup/output ports in the lookup condition and the lookup/return port.
  • Connected lookup transformation passes multiple output values to another transformation. Unconnected lookup transformation passes one output value to another transformation.
  • Connected lookup transformation supports user-defined default values. Unconnected lookup transformation does not support user-defined default values.

7. How do you handle multiple matches in lookup transformation? or what is "Lookup Policy on Multiple Match"?
The "Lookup Policy on Multiple Match" option determines which row the lookup transformation returns when it finds multiple rows that match the lookup condition. You can configure the lookup to return the first or last matching row, any matching row, or to report an error.

8. What is "Output Old Value on Update"?
This option is used when dynamic cache is enabled. When this option is enabled, the integration service outputs old values out of the lookup/output ports. When the Integration Service updates a row in the cache, it outputs the value that existed in the lookup cache before it updated the row based on the input data. When the Integration Service inserts a new row in the cache, it outputs null values. When you disable this property, the Integration Service outputs the same values out of the lookup/output and input/output ports.

9. What is "Insert Else Update" and "Update Else Insert"?
These options are used when dynamic cache is enabled.
  • Insert Else Update option applies to rows entering the lookup transformation with the row type of insert. When this option is enabled, the integration service inserts new rows in the cache and updates existing rows. When disabled, the integration service inserts new rows but does not update existing rows.
  • Update Else Insert option applies to rows entering the lookup transformation with the row type of update. When this option is enabled, the integration service updates existing rows and inserts a row into the cache if it is new. When disabled, the integration service does not insert new rows.

10. What are the options available to configure a lookup cache?
The following options can be used to configure a lookup cache:
  • Persistent cache
  • Recache from lookup source
  • Static cache
  • Dynamic cache
  • Shared Cache
  • Pre-build lookup cache

11. What is a cached lookup transformation and uncached lookup transformation?
  • Cached lookup transformation: The Integration Service builds a cache in memory when it processes the first row of data in a cached Lookup transformation. The Integration Service stores condition values in the index cache and output values in the data cache. The Integration Service queries the cache for each row that enters the transformation.
  • Uncached lookup transformation: For each row that enters the lookup transformation, the Integration Service queries the lookup source and returns a value. The integration service does not build a cache.

12. How does the integration service build the caches for a connected lookup transformation?
The Integration Service builds the lookup caches for connected lookup transformation in the following ways:
  • Sequential cache: The Integration Service builds lookup caches sequentially. The Integration Service builds the cache in memory when it processes the first row of the data in a cached lookup transformation.
  • Concurrent caches: The Integration Service builds lookup caches concurrently. It does not need to wait for data to reach the Lookup transformation.

13. How does the integration service build the caches for an unconnected lookup transformation?
The Integration Service builds caches for unconnected lookup transformations sequentially.

14. What is a dynamic cache?
The dynamic cache represents the data in the target. The Integration Service builds the cache when it processes the first lookup request. It queries the cache based on the lookup condition for each row that passes into the transformation. The Integration Service updates the lookup cache as it passes rows to the target. The integration service either inserts the row in the cache or updates the row in the cache or makes no change to the cache.

15. When you use a dynamic cache, do you need to associate each lookup port with the input port?
Yes. You need to associate each lookup/output port with the input/output port or a sequence ID. The Integration Service uses the data in the associated port to insert or update rows in the lookup cache.

16. What are the different values returned by NewLookupRow port?
The different values are
  • 0 - Integration Service does not update or insert the row in the cache.
  • 1 - Integration Service inserts the row into the cache.
  • 2 - Integration Service updates the row in the cache.
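
An illustrative Python model (not the real engine) of a dynamic lookup cache and the NewLookupRow value it emits for each row:

```python
# Hypothetical model: the cache is a plain dict keyed by the lookup condition.
def dynamic_cache_lookup(cache, key, row):
    if key not in cache:
        cache[key] = row
        return 1            # NewLookupRow = 1: row inserted into the cache
    if cache[key] != row:
        cache[key] = row
        return 2            # NewLookupRow = 2: row updated in the cache
    return 0                # NewLookupRow = 0: cache unchanged

cache = {}
print(dynamic_cache_lookup(cache, 10, {"name": "A"}))  # 1 (insert)
print(dynamic_cache_lookup(cache, 10, {"name": "A"}))  # 0 (no change)
print(dynamic_cache_lookup(cache, 10, {"name": "B"}))  # 2 (update)
```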

17. What is a persistent cache?
If the lookup source does not change between session runs, you can improve performance by creating a persistent cache for the source. When a session runs for the first time, the integration service creates the cache files and saves them to disk instead of deleting them. The next time the session runs, the integration service builds the memory cache from the cache files.

18. What is a shared cache?
You can configure multiple Lookup transformations in a mapping to share a single lookup cache. The Integration Service builds the cache when it processes the first Lookup transformation. It uses the same cache to perform lookups for subsequent Lookup transformations that share the cache.

19. What is unnamed cache and named cache?
  • Unnamed cache: When Lookup transformations in a mapping have compatible caching structures, the Integration Service shares the cache by default. You can only share static unnamed caches.
  • Named cache: Use a persistent named cache when you want to share a cache file across mappings or share a dynamic and a static cache. The caching structures must match or be compatible with a named cache. You can share static and dynamic named caches.

20. How do you improve the performance of lookup transformation?
  • Create an index on the columns used in the lookup condition
  • Place conditions with equality operator first
  • Cache small lookup tables.
  • Join tables in the database: If the source and the lookup table are in the same database, join the tables in the database rather than using a lookup transformation.
  • Use persistent cache for static lookups.
  • Avoid ORDER BY on all columns in the lookup source. Explicitly specify the ORDER BY clause on only the required columns.
  • For flat file lookups, provide sorted files as the lookup source.

    INFORMATICA INTERVIEW QUESTIONS ON SQL TRANSFORMATION

    1. What is SQL transformation?
    The SQL transformation processes SQL queries midstream in a pipeline. You can use it to insert, update, delete, and retrieve rows from a database.

    2. How do you configure a SQL transformation?
    The following options are required to configure SQL transformation:
    • Mode: Specifies the mode in which SQL transformation runs. SQL transformation supports two modes. They are script mode and query mode.
    • Database type: The type of database that SQL transformation connects to.
    • Connection type: Pass database connection to the SQL transformation at run time or specify a connection object.
    3. What are the different modes in which a SQL transformation runs?
    SQL transformation runs in two modes. They are:
    • Script mode: The SQL transformation runs scripts that are externally located. You can pass a script name to the transformation with each input row. The SQL transformation outputs one row for each input row.
    • Query mode: The SQL transformation executes a query that you define in a query editor. You can pass parameters to the query to define dynamic queries. You can output multiple rows when the query has a SELECT statement.

    4. In which cases does the SQL transformation become a passive or an active transformation?
    If you run the SQL transformation in script mode, it is a passive transformation. If you run it in query mode and the query has a SELECT statement, it is an active transformation.

    5. When you configure an SQL transformation to run in script mode, what are the ports that the designer adds to the SQL transformation?
    The designer adds the following ports to the SQL transformation in script mode:
    • ScriptName: This is an input port. ScriptName receives the name of the script to execute for the current row.
    • ScriptResult: This is an output port. ScriptResult returns PASSED if the script execution succeeds for the row. Otherwise it returns FAILED.
    • ScriptError: This is an output port. ScriptError returns the errors that occur when a script fails for a row.

    6. What are the types of SQL queries you can specify in the SQL transformation when you use it in query mode?
    • Static SQL query: The query statement does not change, but you can use query parameters to change the data. The integration service prepares the query once and runs the query for all input rows.
    • Dynamic SQL query: The query statement can be changed. The integration service prepares a query for each input row.

    7. What types of connections are available to connect the SQL transformation to the database?
    • Static connection: Configure the connection object in the session. You must first create the connection object in the Workflow Manager.
    • Logical connection: Pass a connection name to the SQL transformation as input data at run time. You must first create the connection object in workflow manager.
    • Full database connection: Pass the connect string, user name, password and other connection information to SQL transformation input ports at run time.

    8. How do you find the number of rows inserted, updated or deleted in a table?
    You can enable the NumRowsAffected output port to return the number of rows affected by the INSERT, UPDATE, or DELETE query statements for each input row. The NumRowsAffected option works only in query mode.

    9. What will be the output of NumRowsAffected port for a SELECT statement?
    The NumRowsAffected output is zero for a SELECT statement.

    10. When you enable the NumRowsAffected output port in script mode, what will be the output?
    In script mode, the NumRowsAffected port always returns NULL.

    11. How do you limit the number of rows returned by the select statement?
    You can limit the number of rows by configuring the Max Output Row Count property. To configure unlimited output rows, set Max Output Row Count to zero.

    INFORMATICA INTERVIEW QUESTIONS ON EXPRESSION TRANSFORMATION

    1. What is an expression transformation?

    An expression transformation is used to calculate values in a single row.
    Example: salary+1000

    2. How to generate sequence numbers using expression transformation?
    Create a variable port in the expression transformation and increment it by one for every row. Assign this variable port to an output port.
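
The variable-port trick can be sketched in Python, with `v_seq` playing the role of the variable port and `o_seq` the output port (row contents below are hypothetical):

```python
# v_seq is initialized once and incremented per row, like a variable port.
def add_sequence(rows):
    v_seq = 0
    out = []
    for row in rows:
        v_seq = v_seq + 1                    # v_seq = v_seq + 1, once per row
        out.append({**row, "o_seq": v_seq})  # output port takes the new value
    return out

print(add_sequence([{"name": "A"}, {"name": "B"}, {"name": "C"}]))
```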

    3. Consider the following employees data as the source.

    Employee_id, Salary
    -------------------
    10,          1000
    20,          2000
    30,          3000
    40,          5000
    

    Q1. Design a mapping to load the cumulative sum of salaries of employees into the target table.
    The target table data should look as follows:

    Employee_id, Salary, Cumulative_sum
    -----------------------------------
    10,          1000,      1000
    20,          2000,      3000
    30,          3000,      6000
    40,          5000,      11000
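
One way to model the variable-port solution in Python: `v_cum` carries the running total from row to row, exactly as a variable port would in the expression transformation.

```python
# v_cum plays the role of a variable port holding the running total.
def cumulative_salary(rows):
    v_cum = 0
    out = []
    for emp_id, salary in rows:
        v_cum += salary                      # v_cum = v_cum + salary
        out.append((emp_id, salary, v_cum))  # output port reads the new total
    return out

src = [(10, 1000), (20, 2000), (30, 3000), (40, 5000)]
print(cumulative_salary(src))
# [(10, 1000, 1000), (20, 2000, 3000), (30, 3000, 6000), (40, 5000, 11000)]
```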
    

    Q2. Design a mapping to get the previous row's salary for the current row. If no previous row exists for the current row, then the previous row salary should be displayed as null.
    The output should look as follows:

    Employee_id, Salary, Pre_row_salary
    -----------------------------------
    10,          1000,      Null
    20,          2000,      1000
    30,          3000,      2000
    40,          5000,      3000
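
A Python sketch of the same variable-port idea; here the port evaluation order matters: read the variable port's old value into the output port first, then overwrite it with the current salary.

```python
# v_prev holds the previous row's salary; read old value before updating.
def previous_salary(rows):
    v_prev = None
    out = []
    for emp_id, salary in rows:
        out.append((emp_id, salary, v_prev))  # output gets the OLD v_prev
        v_prev = salary                       # then v_prev := current salary
    return out

src = [(10, 1000), (20, 2000), (30, 3000), (40, 5000)]
print(previous_salary(src))
# [(10, 1000, None), (20, 2000, 1000), (30, 3000, 2000), (40, 5000, 3000)]
```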
    

    4. Consider the following employees table as source

    Department_no, Employee_name
    ----------------------------
    20,               R
    10,               A
    10,               D
    20,               P
    10,               B
    10,               C
    20,               Q
    20,               S
    

    Q1. Design a mapping to load a target table with the following values from the above source.

    Department_no, Employee_list
    ----------------------------
    10,            A
    10,            A,B
    10,            A,B,C
    10,            A,B,C,D
    20,            A,B,C,D,P
    20,            A,B,C,D,P,Q
    20,            A,B,C,D,P,Q,R
    20,            A,B,C,D,P,Q,R,S
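
A Python sketch of one solution, under the assumption that a sorter orders the rows by employee name before the expression; a variable port then appends each name to a running comma-separated list.

```python
# v_list accumulates every name seen so far, across all departments.
def running_list(rows):
    v_list = ""
    out = []
    for dept, name in sorted(rows, key=lambda r: r[1]):  # sort by name
        v_list = name if v_list == "" else v_list + "," + name
        out.append((dept, v_list))
    return out

src = [(20, "R"), (10, "A"), (10, "D"), (20, "P"),
       (10, "B"), (10, "C"), (20, "Q"), (20, "S")]
for row in running_list(src):
    print(row)
```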
    

    Q2. Design a mapping to load a target table with the following values from the above source.

    Department_no, Employee_list
    ----------------------------
    10,            A
    10,            A,B
    10,            A,B,C
    10,            A,B,C,D
    20,            P
    20,            P,Q
    20,            P,Q,R
    20,            P,Q,R,S
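
The same idea in Python, but here the running list resets whenever the department changes: sort by department and name, and compare the current department with the previous row's department held in a second variable port.

```python
# v_prev_dept remembers the previous row's department so the list can reset.
def dept_list(rows):
    v_list, v_prev_dept = "", None
    out = []
    for dept, name in sorted(rows):          # sort by dept, then name
        v_list = name if dept != v_prev_dept else v_list + "," + name
        v_prev_dept = dept
        out.append((dept, v_list))
    return out

src = [(20, "R"), (10, "A"), (10, "D"), (20, "P"),
       (10, "B"), (10, "C"), (20, "Q"), (20, "S")]
for row in dept_list(src):
    print(row)
```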
    

    INFORMATICA INTERVIEW QUESTIONS ON SORTER TRANSFORMATION

    1. What is a sorter transformation?

    Sorter transformation is used to sort the data. You can sort the data either in ascending or descending order according to a specified sort key.

    2. Why is the sorter an active transformation?

    Because the sorter transformation can remove duplicate rows from the source (when the distinct option is enabled), it can change the number of rows and is therefore an active transformation.

    3. How to improve the performance of a session using sorter transformation?

    Sort the data using a sorter transformation before passing it into an aggregator or joiner transformation. As the data is sorted, the integration service uses memory to perform the aggregate and join operations and does not need cache files to process the data.

    INFORMATICA INTERVIEW QUESTIONS ON AGGREGATOR TRANSFORMATION

    1. What is aggregator transformation?
    Aggregator transformation performs aggregate calculations like sum, average, count, etc. It is an active transformation: it changes the number of rows in the pipeline. Unlike the expression transformation, which performs calculations on a row-by-row basis, the aggregator transformation performs calculations on groups of rows.

    2. What is aggregate cache?
    The integration service creates an index cache and a data cache in memory to process the aggregator transformation. It stores the group-by values in the index cache and row data in the data cache. If the integration service requires more space, it stores the overflow values in cache files.

    3. How can we improve performance of aggregate transformation?
    • Use sorted input: Sort the data before passing it into the aggregator. With sorted input, the integration service performs the aggregate calculations as it reads the rows for a group and reduces its use of the aggregate caches.
    • Filter the unwanted data before aggregating.
    • Limit the number of input/output or output ports to reduce the amount of data the aggregator transformation stores in the data cache.

    4. What are the different types of aggregate functions?

    The different types of aggregate functions are listed below:
    • AVG
    • COUNT
    • FIRST
    • LAST
    • MAX
    • MEDIAN
    • MIN
    • PERCENTILE
    • STDDEV
    • SUM
    • VARIANCE

    5. Why can't you use both single-level and nested aggregate functions in a single aggregator transformation?

    A nested aggregate function returns only one output row, whereas a single-level aggregate function can return more than one row. Since the number of rows returned is not the same, you cannot use both single-level and nested aggregate functions in the same transformation. If you include both in the same aggregator, the designer marks the mapping or mapplet as invalid, so you need to create separate aggregator transformations.

    6. Up to how many levels can you nest aggregate functions?

    We can nest up to two levels only.
    Example: MAX( SUM( ITEM ) )

    7. What is incremental aggregation?

    The integration service performs the aggregate calculations and then stores the results in a historical cache. The next time you run the session, the integration service reads only the new data and uses the historical cache to perform the aggregate calculations incrementally.

    8. Why can't we use the sorted input option for incremental aggregation?

    In incremental aggregation, the aggregate calculations are stored in a historical cache on the server, and the data in that cache need not be in sorted order. If you provide sorted input, the records arrive presorted for that particular run, but the data already in the historical cache may not be sorted. That is why this option is not allowed.

    9. How are NULL values handled in the aggregator?

    You can configure the integration service to treat null values in aggregator functions as NULL or zero. By default the integration service treats null values as NULL in aggregate functions.
     
