The Sort component in Ab Initio re-orders the data. It has two parameters: “Key” and “Max-core”.
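The interplay of the two Sort parameters can be illustrated with a minimal Python sketch. It is not Ab Initio code: the function name and the max_records_in_memory argument are hypothetical, and it assumes that Max-core acts as a memory ceiling (counted here in records rather than bytes) beyond which sorted runs would be spilled and merged later.

```python
import heapq
from operator import itemgetter


def sort_with_limit(records, key, max_records_in_memory):
    """Sort dict records by `key`, buffering at most `max_records_in_memory` at a time."""
    if max_records_in_memory < 1:
        raise ValueError("max_records_in_memory must be at least 1")
    key_fn = itemgetter(key)
    runs = []     # each run is a sorted chunk; a real tool would spill these to disk
    buffer = []
    for rec in records:
        buffer.append(rec)
        if len(buffer) == max_records_in_memory:
            runs.append(sorted(buffer, key=key_fn))
            buffer = []
    if buffer:
        runs.append(sorted(buffer, key=key_fn))
    # Merge the sorted runs into one fully sorted output.
    return list(heapq.merge(*runs, key=key_fn))
```

A larger limit means fewer runs to merge, which is the usual trade-off between memory use and sort speed.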
“Ab initio” is a Latin phrase meaning “from the beginning.” Ab Initio is a tool used to extract, transform, and load data. It is also used for data analysis, data manipulation, batch processing, and GUI-based parallel processing.
Air commands used in Ab Initio include:
Other air commands for Ab Initio include air object cat, air object modify, and air lock show user, among others.
The following are ways to improve the performance of a graph:
The order of evaluation is as follows:
In Ab Initio, dependency analysis is the process through which the EME examines an entire project and traces how data is transferred and transformed, from component to component and field by field, within and between graphs.
To make a graph behave dynamically, PDL (Parameter Definition Language) is used.
For example: define a parameter named myfield with the value “string("|") name;”
A sandbox is a collection of graphs and related files that are stored in a single directory tree and treated as a group for the purposes of navigation, version control, and migration.
The architecture of Ab Initio includes:
Ab Initio is logically divided into two segments:
Ab Initio supports three types of parallelism:
Data Parallelism: The data is divided into partitions, and the same operation runs on every partition at the same time within a single application.
Component Parallelism: Different components process different data at the same time within a single application.
Pipeline Parallelism: Data is passed from one component to the next, and both components work at the same time on different records of the same flow.
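The three kinds of parallelism can be contrasted with a small Python sketch. This is only an analogy, not Ab Initio code: the function names clean, total, data_parallel, component_parallel, and pipeline are hypothetical, threads stand in for parallel processes, and the generator chain models only the record-by-record handoff of a pipeline rather than true concurrency.

```python
from concurrent.futures import ThreadPoolExecutor


def clean(record):
    """A stand-in component: trim and upper-case one record."""
    return record.strip().upper()


def total(amounts):
    """A second, unrelated stand-in component."""
    return sum(amounts)


def data_parallel(partitions):
    # Data parallelism: the same operation runs on every partition at once.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda part: [clean(r) for r in part], partitions))


def component_parallel(names, amounts):
    # Component parallelism: different components work on different data at once.
    with ThreadPoolExecutor() as pool:
        cleaned = pool.submit(lambda: [clean(n) for n in names])
        summed = pool.submit(total, amounts)
        return cleaned.result(), summed.result()


def pipeline(records):
    # Pipeline parallelism: the downstream stage consumes each record as soon
    # as the upstream stage produces it, without waiting for the whole input.
    stripped = (r.strip() for r in records)
    return (r.upper() for r in stripped)
```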
The m_dump command in Ab Initio is used to view the data in a multifile from the Unix prompt. The m_dump commands include:
There are several ways to connect to the Ab Initio server, such as:
In Ab Initio, partitioning is the process of dividing a data set into multiple sets for further processing. The different types of partition components include:
The Ab Initio Co>Operating System provides features such as:
The file extensions used in Ab Initio are:
De-partitioning reads data from multiple flows or operations and rejoins the data records from those flows. The available de-partition components include Gather, Merge, Interleave, and Concatenate.
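The differences between the four de-partition components can be approximated in Python. This is a hedged sketch, not Ab Initio code: each flow is modeled as a list, the function names are borrowed from the component names for readability, and the ordering behavior shown (Gather in arbitrary order, Merge by key on already-sorted flows, Interleave in round-robin order, Concatenate flow after flow) is an assumption about the usual semantics.

```python
import heapq
from itertools import chain, zip_longest

_MISSING = object()  # sentinel for flows that run out early


def concatenate(flows):
    """Append each flow in full, one after another."""
    return list(chain.from_iterable(flows))


def interleave(flows):
    """Take one record from each flow in turn (round-robin)."""
    out = []
    for group in zip_longest(*flows, fillvalue=_MISSING):
        out.extend(rec for rec in group if rec is not _MISSING)
    return out


def merge(flows, key):
    """Combine flows that are already sorted on `key`, keeping the sort order."""
    return list(heapq.merge(*flows, key=key))


def gather(flows):
    """Combine flows with no ordering guarantee; callers must not rely on the order."""
    return concatenate(flows)
```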
The .dbc file provides the GDE with the information needed to connect to the database, including:
The Rollup component enables users to group records on certain field values. It is a multistage function that consists of initialize, rollup, and finalize stages.
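The multistage idea behind Rollup can be sketched in Python. This is an illustration under assumptions, not Ab Initio DML: the stage names initialize, rollup, and finalize follow the usual description of the component, while run_rollup, the amount field, and the output layout are hypothetical.

```python
def initialize(record):
    """Create the temporary accumulator when a new key value is first seen."""
    return {"count": 0, "total": 0}


def rollup(temp, record):
    """Fold one record of the group into the accumulator."""
    temp["count"] += 1
    temp["total"] += record["amount"]
    return temp


def finalize(temp, key_value):
    """Produce one output record per group from the finished accumulator."""
    return {"key": key_value, "count": temp["count"], "total": temp["total"]}


def run_rollup(records, key):
    """Group records on `key` and apply the three stages to each group."""
    groups = {}
    for rec in records:
        k = rec[key]
        if k not in groups:
            groups[k] = initialize(rec)
        groups[k] = rollup(groups[k], rec)
    return [finalize(temp, k) for k, temp in groups.items()]
```

For example, three records for departments A, B, and A collapse into two output records, one per department.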
Look-up
Duplicate records can be avoided by using the following:
EME:
GDE:
Co>Operating System:
Ex:
decimal_strip("-0184o") := "-184"
decimal_strip("oxyas97abc") := "97"
decimal_strip("+$78ab=-*&^*&%cdw") := "78"
decimal_strip("Honda") := "0"
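The behavior shown in these examples can be approximated in Python. This sketch is only an emulation tuned to the four examples above, not the real DML function: it assumes that a minus sign counts only when it appears before the first digit, and it ignores decimal points entirely.

```python
def decimal_strip(text):
    """Keep only the digits of `text`, drop leading zeros, and return "0" if none remain."""
    digits = ""
    negative = False
    for ch in text:
        if ch in "0123456789":
            digits += ch
        elif ch == "-" and not digits:
            # Assumption: a minus sign matters only before the first digit.
            negative = True
    digits = digits.lstrip("0")
    if not digits:
        return "0"
    return "-" + digits if negative else digits
```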
To run a graph indefinitely, the graph's end script should call the graph's own .ksh file. For example, if the graph is named abc.mp, its end script should call abc.ksh. The graph then restarts each time it finishes, so it runs indefinitely.
The following is the process to add default rules in the transformer:
Example: A set of variables, say v1, v2, v3, v4, v5, v6, are all assigned NULL.
Another variable, num, is assigned the value 340 (num = 340).
num = first_defined(NULL, v1, v2, v3, v4, v5, v6, num)
The result of num is 340.
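The same idea can be written as a short Python sketch, with None standing in for NULL. The Python function only mirrors the behavior described in the example above and is not the DML implementation.

```python
def first_defined(*values):
    """Return the first argument that is not None, or None if every argument is None."""
    for value in values:
        if value is not None:
            return value
    return None


v1 = v2 = v3 = v4 = v5 = v6 = None
num = 340
num = first_defined(None, v1, v2, v3, v4, v5, v6, num)
```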
Use a decimal cast with the size in the transform() function when the string and the decimal are the same size.
Ex: If the source field is defined as string(8).
- The destination is defined as decimal(8)
- Let us assume the field name is salary.
- The function is out.field :: (decimal(8)) in.salary
- If the destination field is smaller than the input field, the string_substring() function can be used
Ex: Say the destination field is decimal(5); then use…
- out.field :: (decimal(5)) string_lrtrim(string_substring(in.field,1,5))
- The string_lrtrim function removes leading and trailing spaces from the string
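The truncate, trim, and cast steps above can be mirrored in Python. This is a sketch under assumptions, not DML: the helper to_decimal is hypothetical, string_substring is assumed to take a 1-based start position and a length as in the example, and Python's Decimal stands in for the decimal(n) type.

```python
from decimal import Decimal


def string_substring(text, start, length):
    """Return `length` characters of `text` beginning at 1-based position `start`."""
    return text[start - 1:start - 1 + length]


def to_decimal(field, width):
    """Cut `field` to `width` characters, trim the spaces, then convert to a decimal."""
    trimmed = string_substring(field, 1, width).strip()
    # Decimal raises InvalidOperation if nothing numeric is left after trimming.
    return Decimal(trimmed)
```

For a string(8) source and a decimal(5) target, to_decimal(field, 5) keeps only the first five characters before converting.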
To run a graph infinitely:
Check point:
Phase:
Partitioning by Key / Hash Partition :
Round Robin Partition :
For example: a pack of 52 cards is distributed among 4 players in a round-robin fashion.
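Both partitioning styles can be sketched in Python. This is an illustration, not Ab Initio code: the function names are hypothetical, and CRC32 is used only as a convenient stable hash, with no claim that Ab Initio uses it.

```python
import zlib


def round_robin(records, n):
    """Deal records to n partitions in turn, like dealing cards to players."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[i % n].append(rec)
    return parts


def partition_by_key(records, key, n):
    """Send every record with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for rec in records:
        # Assumption: any stable hash works; CRC32 is chosen for reproducibility.
        h = zlib.crc32(str(rec[key]).encode("utf-8"))
        parts[h % n].append(rec)
    return parts


deck = list(range(52))
hands = round_robin(deck, 4)  # four hands of 13 cards each
```

Round-robin gives evenly sized partitions regardless of content, while key-based partitioning keeps related records together at the cost of possible skew.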