Top 34 Informatica Data Quality Interview Questions You Must Prepare 19.Mar.2024

Active transformation:

An active transformation can change the number of rows that pass through it in the mapping.

Some of the active transformations are:

  • Sorter transformation
  • Filter transformation
  • Joiner transformation
  • Rank transformation
  • Router transformation

etc.

Passive transformation:

A passive transformation does not change the number of rows that pass through it in the mapping.

Some of the passive transformations are:

  • Expression transformation
  • Sequence Generator transformation
  • Lookup transformation
  • External Procedure transformation
  • Output transformation
  • Input transformation

etc.
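The difference between the two kinds can be sketched outside Informatica. The following is a minimal Python illustration, not Informatica code; the row layout and the function names `filter_rows` and `add_tax` are assumptions made up for the example. A filter-style step may drop rows, while an expression-style step returns exactly one row per input row.

```python
# Illustrative rows; the column names are hypothetical.
rows = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": 35.5},
    {"id": 3, "amount": 80.0},
]

def filter_rows(rows, min_amount):
    """Active-style step: rows that fail the condition are dropped."""
    return [r for r in rows if r["amount"] >= min_amount]

def add_tax(rows, rate):
    """Passive-style step: every input row yields exactly one output row."""
    return [{**r, "tax": round(r["amount"] * rate, 2)} for r in rows]

filtered = filter_rows(rows, 50.0)  # row count can change
taxed = add_tax(rows, 0.1)          # row count stays the same
print(len(rows), len(filtered), len(taxed))
```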

Yes, you can join two flat files together using the Joiner transformation.

The Joiner transformation is an active and connected transformation that is primarily used to join two sources of data. The sources can come from the same origin or from two different origins.
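As a rough illustration of joining two flat files on a common key, here is a small Python sketch. It only mimics the idea of a normal (inner) join; the file contents, the column names, and the helper names `read_rows` and `inner_join` are hypothetical and are not part of Informatica.

```python
import csv
import io

# Two small in-memory "flat files"; names and columns are hypothetical.
customers_csv = "cust_id,name\n1,Asha\n2,Ben\n3,Chen\n"
orders_csv = "order_id,cust_id,total\n10,1,250\n11,3,75\n12,1,40\n13,9,15\n"

def read_rows(text):
    """Parse delimited text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def inner_join(master, detail, key):
    """Join detail rows to master rows on a shared key column."""
    # Index the master source once, similar in spirit to a join cache.
    index = {}
    for m in master:
        index.setdefault(m[key], []).append(m)
    joined = []
    for d in detail:
        # Detail rows without a matching master row are dropped.
        for m in index.get(d[key], []):
            joined.append({**m, **d})
    return joined

joined = inner_join(read_rows(customers_csv), read_rows(orders_csv), "cust_id")
for row in joined:
    print(row["order_id"], row["name"], row["total"])
```

Four order rows go in and three joined rows come out, which is why a join of this kind counts as an active step.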

The Stored Procedure transformation is mainly used because stored procedures are a vital tool for populating and maintaining databases within the environment.

You can validate a mapplet as a rule. A rule is business logic that defines conditions applied to source data when you run a profile.

You can validate a mapplet as a rule when the mapplet meets the following requirements:

  1. It contains an Input and Output transformation.
  2. The mapplet does not contain active transformations.
  3. It does not specify cardinality between input groups.

The PowerCenter Integration Service is an application service that runs sessions and workflows.

The Data Integration Service is an application service that performs data integration tasks for the Analyst tool, the Developer tool, and external clients. The Analyst tool and the Developer tool send data integration task requests to the Data Integration Service to preview or run data profiles, SQL data services, and mappings. Commands from the command line or an external client send data integration task requests to the Data Integration Service to run SQL data services or web services.

There are three types of dimensions available:

  1. Junk dimension
  2. Degenerate dimension
  3. Conformed dimension

A group of workflow tasks collected into a set is called a “worklet”.

The workflow tasks include the following:

  1. Timer
  2. Decision
  3. Command
  4. Event Wait
  5. Email
  6. Session
  7. Link
  8. Assignment
  9. Control

The throughput option is found in the Informatica Workflow Monitor. Within the Workflow Monitor, right-click the session and open the run properties. The throughput is shown under Source/Target Statistics.

The PowerCenter application services and PowerCenter application clients use the PowerCenter Repository Service. The PowerCenter repository has folder-based security.

The other application services, such as the Data Integration Service, Analyst Service, Developer tool, and Analyst tool, use the Model Repository Service. The Model Repository Service has project-based security.

You can migrate some Model repository objects to the PowerCenter repository.

In PowerCenter, you create a source definition to include as a mapping source. You create a target definition to include as a mapping target. In the Developer tool, you create a physical data object that you can use as a mapping source or target.

The Aggregator transformation stores data in the aggregator cache until it has completed all the aggregate calculations.

When you run a session that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation.

If the Informatica server needs more space, it stores the overflow values in cache files.
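The caching behaviour can be pictured with a small Python sketch. This is only an analogy under stated assumptions: the dictionary `cache` stands in for the index and data caches, the function name `aggregate_sum` and the sample columns are hypothetical, and spilling to cache files on disk is not modelled.

```python
from collections import defaultdict

def aggregate_sum(rows, group_key, value_key):
    """Sum value_key per group, holding partial results until input ends."""
    cache = defaultdict(float)  # stands in for the index and data caches
    for row in rows:
        cache[row[group_key]] += row[value_key]
    # Results are only emitted after every input row has been seen.
    return [{group_key: k, "total": v} for k, v in sorted(cache.items())]

sales = [
    {"region": "east", "amount": 10.0},
    {"region": "west", "amount": 5.0},
    {"region": "east", "amount": 2.5},
]
totals = aggregate_sum(sales, "region", "amount")
print(totals)
```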

  1. A mapplet in PowerCenter and in the Developer tool is a reusable object that contains a set of transformations. You can reuse the transformation logic in multiple mappings.
  2. A PowerCenter mapplet can contain source definitions or Input transformations as the mapplet input. It must contain Output transformations as the mapplet output.
  3. A Developer tool mapplet can contain data objects or Input transformations as the mapplet input. It can contain data objects or Output transformations as the mapplet output.

A mapplet in the Developer tool also includes the following features:

  1. You can validate a mapplet as a rule. 
  2. You use a rule in a profile. 
  3. A mapplet can contain other mapplets. 

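Dynamic cache:

With a dynamic cache, the lookup cache is updated while the session runs, so rows are inserted into or updated in the cache as they pass through. This extra work decreases performance when compared to a static cache.

Static cache:

With a static cache, the cache is built once and does not change while the session runs. It doesn’t matter how many times the same data comes through; every row is checked against the same unchanged cache.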
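One way to see the practical difference between the two cache modes is to feed the same key twice in one run. The sketch below is a simplified Python model, not Informatica behaviour in full; the function names and the `insert`/`update` labels are assumptions for illustration.

```python
def flag_rows_static(cache, incoming):
    """Static cache: built once, never modified while rows are processed."""
    return [(key, "update" if key in cache else "insert") for key in incoming]

def flag_rows_dynamic(cache, incoming):
    """Dynamic cache: new keys are added to the cache as rows pass through."""
    cache = set(cache)  # work on a copy so the caller's set is untouched
    flags = []
    for key in incoming:
        if key in cache:
            flags.append((key, "update"))
        else:
            flags.append((key, "insert"))
            cache.add(key)
    return flags

target_keys = {101, 102}
incoming = [101, 103, 103]  # key 103 arrives twice in the same run
static_flags = flag_rows_static(target_keys, incoming)
dynamic_flags = flag_rows_dynamic(target_keys, incoming)
print(static_flags)
print(dynamic_flags)
```

In this model the static version flags key 103 for insert both times, while the dynamic version flags the second occurrence for update because the cache already learned about it.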

  • The reusable transformation concept is widely used in mappings.
  • A reusable transformation differs from a transformation created inside a single mapping because it is stored as metadata and can be used in many mappings.
  • Whenever a reusable transformation is changed, the change applies to the mappings that use it, and some changes can invalidate those mappings.

To partition a session, configure the session to partition the source data, and install the Informatica server on a machine with multiple CPUs.

Target load order is the priority order in which the targets are loaded. Based on this priority, the Informatica server loads the data into the targets.

If a mapping has multiple source qualifiers connected to multiple targets, you can define the order in which the Informatica server loads the data into the targets.

OLAP stands for Online Analytical Processing. It is a method of performing multidimensional analysis on data.

The different tools available in the Workflow Manager are:

  1. Task Developer
  2. Worklet Designer
  3. Workflow Designer

As the name suggests, a predefined event is already defined. It is a file-watch event: the workflow waits for a certain file to arrive at a specific location.
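The file-watch idea can be approximated in plain Python by polling for a file. This is a minimal sketch of the concept rather than how the Event Wait task is implemented; the function name `wait_for_file` and its `timeout_s` and `poll_s` parameters are hypothetical.

```python
import os
import time

def wait_for_file(path, timeout_s=5.0, poll_s=0.1):
    """Return True once path exists, or False if timeout_s elapses first."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_s)
    # One last check in case the file arrived during the final sleep.
    return os.path.exists(path)
```

A caller would typically continue with the next task only when `wait_for_file(...)` returns True, and raise an error or alert otherwise.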

A user-defined event is a flow of tasks in the workflow. These events can be created and raised as and when they are needed.

  • A surrogate key is a replacement for the natural primary key within the database.
  • It is a unique identifier for each row within a table.
  • It is very helpful because the natural primary key can change, which makes the data difficult to update, whereas the surrogate key does not change.
  • A surrogate key is usually a system-generated integer.
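The points above can be sketched in a few lines of Python. The example assumes the natural key is an email address and that surrogate keys are sequential integers starting at 1; the function name `assign_surrogate_keys` and the column name `customer_sk` are hypothetical.

```python
import itertools

def assign_surrogate_keys(rows, natural_key, start=1):
    """Map each distinct natural key to a stable, system-generated integer."""
    counter = itertools.count(start)
    key_map = {}
    out = []
    for row in rows:
        nk = row[natural_key]
        if nk not in key_map:
            key_map[nk] = next(counter)
        out.append({"customer_sk": key_map[nk], **row})
    return out

rows = [
    {"email": "asha@example.com", "city": "Pune"},
    {"email": "ben@example.com", "city": "Leeds"},
    {"email": "asha@example.com", "city": "Delhi"},
]
keyed = assign_surrogate_keys(rows, "email")
print([r["customer_sk"] for r in keyed])
```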

  1. A connected lookup participates in the data flow and receives input values directly from the pipeline.
  2. A connected lookup can use both dynamic cache and static cache.
  3. A connected lookup caches all lookup columns.
  4. A connected lookup supports user-defined default values.

Control-M is an alternative tool for scheduling processes, other than the Workflow Manager and pmcmd.

A session task is a set of instructions that tells the PowerCenter server how and when to transfer data from sources to targets.

  • Within Informatica, data is processed row by row.
  • By default, every row is flagged for insert into the target table.
  • An Update Strategy transformation is used when rows need to be updated or inserted based on a defined condition.
  • Within the update strategy, we specify the condition so that each row is flagged accordingly, i.e. for update or insert.
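The row-flagging logic can be mimicked in Python as follows. The constants mirror the numeric values Informatica uses for `DD_INSERT`, `DD_UPDATE`, `DD_DELETE`, and `DD_REJECT`; everything else (the function name `flag_row`, the `id` column, and the rule that rows without an id are rejected) is an assumption made for the example.

```python
# Numeric flags corresponding to the update strategy constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row, existing_ids):
    """Decide per row whether it should be inserted, updated, or rejected."""
    if row.get("id") is None:
        return DD_REJECT  # hypothetical rule: rows without a key are rejected
    return DD_UPDATE if row["id"] in existing_ids else DD_INSERT

existing_ids = {1, 2}
incoming = [{"id": 1}, {"id": 7}, {"id": None}]
flags = [flag_row(r, existing_ids) for r in incoming]
print(flags)
```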

  • An unconnected lookup receives input values from the result of a :LKP expression in another transformation.
  • An unconnected lookup can return only one column value.
  • An unconnected lookup does not support user-defined default values.
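A loose Python analogy for an unconnected lookup is a function that is called from an expression only when needed and returns a single value. The reference data and the names `lkp_country_name` and `derive_country` below are hypothetical and only illustrate the calling pattern.

```python
# Hypothetical reference data: country code -> country name.
COUNTRY_NAMES = {"IN": "India", "DE": "Germany"}

def lkp_country_name(code):
    """Return a single value for the given input, or None when not found."""
    return COUNTRY_NAMES.get(code)

def derive_country(row):
    """Call the lookup only when the row does not already carry a name."""
    name = row.get("country_name") or lkp_country_name(row["country_code"])
    return {**row, "country_name": name}

print(derive_country({"country_code": "IN"}))
```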

The following components are installed while installing Informatica PowerCenter:

  1. PowerCenter clients
  2. Integration services
  3. Repository service
  4. PowerCenter Domain
  5. Administration console for PowerCenter

Slowly changing dimensions are dimensions whose attributes change slowly over time. Slowly changing dimensions are abbreviated as SCD.

There are three different types of slowly changing dimensions:

  1. Slowly changing dimension Type 1: this type of SCD keeps only current records.
  2. Slowly changing dimension Type 2: this type of SCD keeps both current records and historical records.
  3. Slowly changing dimension Type 3: this type of SCD keeps the current record plus one previous record.
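The three types can be contrasted with a small Python sketch that applies the same change (a customer moving city) in three ways. This is a simplified illustration; the column names (`city`, `previous_city`, `start_date`, `end_date`) and the function names are assumptions, and real implementations usually add surrogate keys and current-row flags.

```python
from datetime import date

def apply_type1(dim, key, new_city):
    """Type 1: overwrite in place, keeping only the current value."""
    dim[key] = {"city": new_city}

def apply_type2(history, key, new_city, today):
    """Type 2: close the current row and append a new current row."""
    for row in history:
        if row["key"] == key and row["end_date"] is None:
            row["end_date"] = today
    history.append(
        {"key": key, "city": new_city, "start_date": today, "end_date": None}
    )

def apply_type3(dim, key, new_city):
    """Type 3: keep the current value plus one previous value."""
    row = dim[key]
    row["previous_city"] = row["city"]
    row["city"] = new_city

t1 = {"C1": {"city": "Pune"}}
apply_type1(t1, "C1", "Delhi")

t2 = [{"key": "C1", "city": "Pune", "start_date": date(2023, 1, 1), "end_date": None}]
apply_type2(t2, "C1", "Delhi", date(2024, 3, 19))

t3 = {"C1": {"city": "Pune", "previous_city": None}}
apply_type3(t3, "C1", "Delhi")
```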

A workflow is a set of instructions that tells the server how to execute the tasks.

  • A mapplet is a reusable object that is created in the Mapplet Designer.
  • A mapplet lets you reuse transformation logic in different mappings.
  • A mapplet consists of a set of transformations.

In Informatica, there are two types of loading:

  1. Normal loading
  2. Bulk loading

Normal loading is a process where the records are loaded one by one and a log is written for them. Compared with bulk loading, normal loading takes more time to load the data into the target.

Bulk loading is a process where a set of records is loaded into the target database at once. Compared with normal loading, bulk loading takes much less time to load the data.

A parameter file is a file created in a text editor such as WordPad.

The following values can be defined in a parameter file:

  1. Mapping parameters
  2. Mapping variables
  3. Session parameters
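To make the idea concrete, the Python sketch below holds a small parameter file as text and parses it into sections. The layout follows the usual convention of a bracketed heading followed by name=value lines, with `$$` prefixing mapping parameters and variables and `$` prefixing session parameters. The folder, workflow, session, and parameter names are all made up for the example, and the parser `parse_parameter_file` is a hypothetical helper, not an Informatica utility.

```python
# Sample content; every name and value here is hypothetical.
SAMPLE = """\
[SalesFolder.WF:wf_load_sales.ST:s_m_load_sales]
$$LoadDate=2024-03-19
$$Region=EMEA
$InputFile1=/data/in/sales.csv
"""

def parse_parameter_file(text):
    """Return {section heading: {parameter name: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            name, value = line.split("=", 1)
            current[name.strip()] = value.strip()
    return sections

params = parse_parameter_file(SAMPLE)
print(params)
```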

The standalone Command task can be used anywhere within a workflow to execute shell commands.

The term transformation itself describes the nature of the activity. It is a repository object that generates, modifies, and passes data.

The following are different types of transformations that are available in Informatica:

  1. Aggregator transformation
  2. Expression transformation
  3. Filter transformation
  4. Joiner transformation
  5. Lookup transformation
  6. Normalizer transformation
  7. Rank transformation
  8. Router transformation