The main difference between Informatica 5.1 and 6.1 is that 6.1 introduces a new component called the Repository Server and, in place of the Server Manager (5.1), introduces the Workflow Manager and Workflow Monitor.
A fact table is always a DENORMALIZED table. It consists of the primary keys of the dimension tables (held as foreign keys) together with the measures.
When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.
The types of dimensions available are:
In a mapping we can use any number of transformations, depending on the project requirements and on which transformations can legitimately be connected to one another.
In the Command task there is a command option in which we can write the appropriate pmcmd command to run a workflow.
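As an illustration, a pmcmd invocation of this kind might look like the sketch below; the service, domain, credential, folder, and workflow names are all hypothetical and must match your own environment.

```shell
# Hypothetical example: start workflow wf_Load_Sales in folder MyFolder
# on Integration Service IS_Dev in domain Domain_Dev.
pmcmd startworkflow -sv IS_Dev -d Domain_Dev -u admin -p admin -f MyFolder wf_Load_Sales
```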
There are many differences between Informatica and Ab Initio:
Use a pre-session SQL statement, but this is a hard-coded method: if you change the column names or add extra columns to the flat file, you will have to change the insert statement.
You can also achieve this by changing the setting in the Informatica Repository Manager to display the column headings. The only disadvantage is that it will be applied to all the files generated by that server.
We can use SCD Type 1/2/3 to load any dimension, based on the requirement. We can also use a stored procedure to populate a Time dimension.
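As a sketch of the stored-procedure approach, the following Oracle PL/SQL (all table and column names here are assumptions, not from the original text) fills a hypothetical DIM_TIME table one day at a time:

```sql
-- Illustrative only: populate a hypothetical DIM_TIME table.
CREATE OR REPLACE PROCEDURE load_dim_time(p_start DATE, p_end DATE) AS
  v_day DATE := p_start;
BEGIN
  WHILE v_day <= p_end LOOP
    INSERT INTO dim_time (date_key, calendar_date, year_no, month_no, day_no)
    VALUES (TO_NUMBER(TO_CHAR(v_day, 'YYYYMMDD')),  -- surrogate key, e.g. 20240131
            v_day,
            EXTRACT(YEAR  FROM v_day),
            EXTRACT(MONTH FROM v_day),
            EXTRACT(DAY   FROM v_day));
    v_day := v_day + 1;
  END LOOP;
  COMMIT;
END;
/
```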
Reusable transformation:
Create a procedure and declare the sequence inside it; finally, call the procedure from Informatica with the help of a Stored Procedure transformation.
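A minimal sketch of this idea in Oracle PL/SQL (the sequence and procedure names are hypothetical); the OUT parameter is what the Stored Procedure transformation maps to an output port:

```sql
CREATE SEQUENCE seq_row_key START WITH 1 INCREMENT BY 1;

-- Called per row via a Stored Procedure transformation;
-- p_key is returned through an output port.
CREATE OR REPLACE PROCEDURE get_next_key(p_key OUT NUMBER) AS
BEGIN
  SELECT seq_row_key.NEXTVAL INTO p_key FROM dual;
END;
/
```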
Correct the rejected data and send it to the target relational tables using the load order utility. Find the rejected data by examining the column indicator and row indicator in the reject file.
The default option for the Update Strategy transformation is DD_INSERT (equivalently the numeric value 0); at the session level, the corresponding setting is Data Driven.
A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate time-consuming tasks that are too complicated for standard SQL statements.
The Normalizer transformation is used mainly with COBOL sources, where the data is most often stored in denormalized format. The Normalizer transformation can also be used to create multiple rows from a single row of data.
PowerCenter Server on Windows can connect to the following databases:
PowerCenter Server on UNIX can connect to the following databases:
The Informatica repository is at the center of the Informatica suite. You create a set of metadata tables within the repository database that the Informatica applications and tools access. The Informatica client and server access the repository to save and retrieve metadata.
If the lookup source is well defined you can use a connected lookup; if the source is not well defined or comes from a different database, you can go for an unconnected lookup.
The choice between connected and unconnected lookup depends on the scenario and on performance. If you are looking up a single value, and only rarely (say for 1 row in 1,000), you should go for an unconnected lookup; performance-wise it is better because the transformation is not invoked for every row. If multiple columns are to be returned as lookup values, go for a connected lookup.
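For reference, an unconnected lookup is invoked from an expression using the :LKP prefix. A sketch in Informatica expression syntax (the lookup name and port are hypothetical):

```
-- Call unconnected lookup lkp_get_dept_name, passing DEPT_ID,
-- and capture its single return value in an output port.
:LKP.lkp_get_dept_name(DEPT_ID)
```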
The Java transformation is available in the 8.x version; it is not available in 7.x.
Normalizer: a transformation used mainly for COBOL sources. It pivots columns into rows.
Normalization: removing redundancy and inconsistency.
Normalizer transformation: can be used to obtain multiple rows from a single row.
Informatica PowerCenter includes the following types of repositories:
Standalone Repository: A repository that functions individually and is unrelated to any other repository.
Global Repository : This is a centralized repository in a domain. This repository can contain shared objects across the repositories in a domain. The objects are shared through global shortcuts.
Local Repository: A local repository is within a domain and is not the global repository. A local repository can connect to a global repository using global shortcuts and can use objects in its shared folders.
Versioned Repository: This can be either a local or a global repository, but it allows version control for the repository. A versioned repository can store multiple copies, or versions, of an object. This feature allows you to efficiently develop, test, and deploy metadata into the production environment.
Partitions can be done on both relational sources and flat files.
Informatica supports the following partition types:
All of these are applicable to relational targets. For flat file targets, all are applicable except database partitioning.
Informatica supports n-way partitioning: you just specify the name of the target file and create the partitions; the rest is taken care of by the Informatica session.
In a STAR schema there is no relationship between any two dimension tables, whereas in a SNOWFLAKE schema there can be relationships between the dimension tables.
In a star schema there are no relationships between the dimension tables; all dimensions are denormalized, so table space is larger but queries need fewer joins and perform better. In a snowflake schema the dimensions are normalized: table space is reduced, but queries require more joins and maintenance cost is higher.
While importing a flat file definition, just specify the scale for the numeric datatype in the mapping; the flat file source supports only the number datatype (there is no separate decimal or integer type). The Source Qualifier associated with that source will then have the decimal datatype for that number port.
source -> number datatype port -> SQ -> decimal datatype. Integer is not supported; hence decimal is used.
Alternatively, import the field as a string and then use an Expression transformation to convert it, so that truncation of decimal places in the source is avoided.
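As a sketch of that conversion, assuming a hypothetical string port AMOUNT_STR and two decimal places, an Expression transformation output port could use:

```
-- Informatica expression syntax: trim the string, then
-- convert it to a decimal with scale 2.
TO_DECIMAL(LTRIM(RTRIM(AMOUNT_STR)), 2)
```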
It is possible to join two or more tables by using a Source Qualifier, provided the tables have a relationship.
When you drag and drop the tables you will get a Source Qualifier for each table. Delete all of those Source Qualifiers and add one common Source Qualifier for all the sources. Right-click the Source Qualifier and choose Edit; on the Properties tab you will find the SQL Query attribute, where you can write your own SQL.
You can also do it in the session: under Mapping -> Source there is an option called User Defined Join where you can write your SQL.
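A sketch of the kind of SQL you might place in the SQL Query override, joining the classic EMP and DEPT sample tables (names assumed for illustration):

```sql
SELECT e.empno,
       e.ename,
       e.sal,
       d.deptno,
       d.dname
FROM   emp  e
JOIN   dept d ON e.deptno = d.deptno
```

Note that the User Defined Join option expects only the join condition (e.g. emp.deptno = dept.deptno), whereas the SQL Query override replaces the whole generated statement.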
There are two methods for creating reusable transformations:
Summary filter: can be applied to a group of records that contain common values.
Detail filter: can be applied to each and every record in a database.
Aggregator performance improves dramatically if records are sorted before passing to the aggregator and "sorted input" option under aggregator properties is checked. The record set should be sorted on those columns that are used in Group By operation.
It is often a good idea to sort the record set at database level, e.g. inside a Source Qualifier transformation, unless there is a chance that the already sorted records from the Source Qualifier can become unsorted again before reaching the Aggregator.
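For example, a Source Qualifier SQL override along the lines of the sketch below (table and column names assumed) delivers rows pre-sorted on the Group By column, so the Aggregator's Sorted Input option can be enabled:

```sql
SELECT deptno, sal
FROM   emp
ORDER  BY deptno   -- must match the Aggregator's Group By ports, in order
```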
When the ETL loads data from the source, we can rank the incoming rows by passing them through a Rank transformation. We cannot declare two ranks on a single source data set. We rank the rows by adding a Rank transformation and designating the rank port.
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. However, a Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. A Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.
If you need to test the same input data based on multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task.
A code page contains the encoding to specify characters in a set of one or more languages. The code page is selected based on the source of the data. For example, if the source contains Japanese text, then the code page should be selected to support Japanese text.
When a code page is chosen, the program or application for which the code page is set, refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.
Relational source: To access a relational source situated in a remote place, you need to configure a database connection to the data source.
File source: To access a remote source file, you must configure an FTP connection to the host machine before you create the session.
Heterogeneous: When your mapping contains more than one source type, the Server Manager creates a heterogeneous session that displays source options for all types.
As far as I know, by using the PowerExchange tool you convert the VSAM file to Oracle tables, then do the mapping as usual to the target table.
Under certain circumstances, when a session does not complete, you need to truncate the target tables and run the session from the beginning. Run the session from the beginning when the Informatica Server cannot run recovery or when running recovery might result in inconsistent data.
If recovery mode is not enabled for the session and the workflow fails in mid-execution, then truncate the target tables and run the session from the beginning.
The Informatica PowerCenter Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning.
GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the PowerCenter Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.
To flag source records as INSERT, DELETE, UPDATE, or REJECT for the target database. The default flag is Insert. This is a must for incremental data loading.
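A typical Update Strategy expression for incremental loading looks like the sketch below (EXISTING_KEY is a hypothetical port, e.g. returned by a lookup against the target): insert rows the target has not seen, update the rest.

```
-- Informatica expression syntax inside an Update Strategy transformation.
IIF(ISNULL(EXISTING_KEY), DD_INSERT, DD_UPDATE)
```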
This is an important transformation; it is used to maintain history data, or just the most recent changes, in the target table.
We can set or flag the records by using these two levels.
It depends upon the Informatica version we are using: Informatica 6 supports only 32 partitions, whereas Informatica 7 supports 64 partitions.
It is a session option. When the Informatica Server performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform the new aggregation calculations incrementally. We use it for performance.
When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes incrementally and you can capture changes, you can configure the session to process those changes. This allows the Integration Service to update the target incrementally, rather than forcing it to process the entire source and recalculate the same data each time you run the session.
It is usually not done at the mapping (transformation) level but at the session level. Create a Command task that executes a shell script (on Unix, or any other script) containing the create index command, and use this Command task in the workflow after the session; alternatively, you can achieve the same with a post-session command.
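The script invoked by the Command task or post-session command would simply run SQL along these lines (the index and table names are assumptions):

```sql
-- Pre-session: drop the index so the bulk load is not slowed down.
DROP INDEX idx_sales_cust;

-- Post-session: rebuild it once the load has finished.
CREATE INDEX idx_sales_cust ON target_sales (customer_id);
```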
The Informatica Server components are the load manager, the data transfer manager, the reader, the temp server, and the writer. First the load manager sends a request to the reader; the reader reads the data from the source and dumps it into the temp server; the data transfer manager manages the load and sends requests to the writer on a first-in, first-out basis; and the writer takes the data from the temp server and loads it into the target.
For relational sources the Informatica Server creates multiple connections, one for each partition of a single source, and extracts a separate range of data through each connection.
Informatica server reads multiple partitions of a single source concurrently. Similarly for loading also informatica server creates multiple connections to the target and loads partitions of data concurrently.
For XML and file sources, informatica server reads multiple files concurrently. For loading the data informatica server creates a separate file for each partition (of a source file). You can choose to merge the targets.
For improved performance, follow these tips:
Inner equi join.
You cannot. If you want to start a batch that resides inside another batch, create a new independent batch and copy the necessary sessions into the new batch.
Specifies the directory used to cache master records and the index to these records. By default, the cache files are created in a directory specified by the server variable $PMCacheDir. If you override the directory, make sure the directory exists and contains enough disk space for the cache files. The directory can be a mapped or mounted drive. There are two types of cache in the Joiner: the index cache and the data cache.
In hash partitioning, the Informatica Server uses a hash function to group rows of data among partitions. The Informatica Server groups the data based on a partition key. Use hash partitioning when you want the Informatica Server to distribute rows to the partitions by group. For example, you need to sort items by item ID, but you do not know how many items have a particular ID number.