Top 50 Data Warehousing Interview Questions You Must Prepare 24.May.2024

Yes, we can create reports without creating a universe.

The PC Repository is a relational database that stores metadata describing different objects, such as mappings and transformations. It is managed by the Repository Service.

pmcmd>startworkflow -f foldername workflowname

MicroStrategy Desktop is available in English, French, German, Spanish, Korean, Italian, Swedish, Japanese and Portuguese.

A private synonym can be accessed only by its owner.

EIM: Batch-mode integration. Use EIM when the data volume is large.
EAI: Real-time integration. Use EAI when the data volume is small.

a) Poor Performance.
b) Can be used for Variety of Databases.
c) Can handle Stored Procedures.
Plug-In:
a) Good Performance.
b) Database specific (only one database).

The design method consists of two major phases.
During the first phase, you create the underlying database structure of your universe. This structure includes the tables and columns of a database and the joins by which they are linked. You may need to resolve loops which occur in the joins using aliases or contexts. You can conclude this phase by testing the integrity of the overall structure.

During the second phase, you can proceed to enhance the components of your universe. You can also prepare certain objects for multidimensional analysis. As with the first phase, you should test the integrity of your universe structure. You may also wish to perform tests on the universes you create from the BusinessObjects User module. Finally, you can distribute your universes to users by exporting them to the repository or via your file system.

For a universe based on a simple relational schema, Designer provides Quick Design, a wizard for creating a basic yet complete universe. You can use the resulting universe immediately, or you can modify the objects and create complex new ones. In this way, you can gradually refine the quality and structure of your universe.

The explain plan can be reviewed to check the execution plan of a query. It shows whether the expected indexes are being used.
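As a minimal sketch of this check, the snippet below uses SQLite's `EXPLAIN QUERY PLAN` through Python's built-in `sqlite3` module. The table `sales` and the index `idx_sales_customer` are hypothetical names chosen for illustration; other databases (Oracle, SQL Server and so on) expose their plans through their own commands and output formats.

```python
import sqlite3

# Hypothetical table and index, created in memory purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.execute("CREATE INDEX idx_sales_customer ON sales (customer_id)")


def plan_for(sql):
    """Return the plan step descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


def uses_index(plan, index_name):
    """True if any plan step mentions the given index."""
    return any(index_name in step for step in plan)


# Filters on the indexed column, so the index is expected to appear in the plan.
indexed_plan = plan_for("SELECT amount FROM sales WHERE customer_id = 42")
# Filters on a column without an index, so a full table scan is expected.
unindexed_plan = plan_for("SELECT id FROM sales WHERE amount > 100")

print(indexed_plan)
print(unindexed_plan)
```

Reading the two printed plans side by side shows the difference between an index search and a full scan, which is the kind of check the explain plan is used for.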

The hypercube, or multidimensional cube, forms the core of an OLAP system. It consists of measures arranged according to dimensions. Hypercube metadata is created from a star or snowflake schema of tables in an RDBMS. Dimensions are extracted from the dimension tables and measures from the fact table.

A lookup table is a table that is looked up against the target table, based on the primary key of the target. It updates the target by allowing through only modified (new or updated) records, based on the lookup condition.

A fact is a key performance indicator used to analyze the business. A dimension is used to analyze the fact. Without dimensions, a fact has no meaning.

No. OLTP database tables are normalized, which adds time to queries that return results. Additionally, an OLTP database is smaller and does not contain the long-period (many years) data that needs to be analyzed. An OLTP system is basically an ER model and not a dimensional model. If a complex query is executed on an OLTP system, it may cause heavy overhead on the OLTP server that will affect normal business processes.

Pass-through functions are used to call special functions that are specific to particular databases. Some of the pass-through functions available are ApplySimple and ApplyComparison.

A dashboard in business intelligence allows large amounts of data and many reports to be read in a single graphical interface. Dashboards help in making faster decisions by relying on measurable data seen at a glance. They can also be used to drill into the details of this data to analyze the root cause of any business performance issue. A dashboard represents the business data and business state at a high level. Dashboards can also be used for cost control. Example of the need for a dashboard: banks run thousands of ATMs. They need to know how much cash is deposited, how much is left, etc.

Shared connection
Secured connection
Personal connection - used only on a standalone system.

Informatica Repository: The Informatica repository is at the center of the Informatica suite. You create a set of metadata tables within the repository database that the Informatica applications and tools access. The Informatica client and server access the repository to save and retrieve metadata.

The latest version of the GDE is 1.15, and the latest version of the Co>Operating System is 2.14.

a) BO is a reporting tool used to query the database or data warehouse.
b) BO is a query, reporting and analysis tool.
c) BO is an OLAP tool (ROLAP).

You can create different universes for different sources and link them in BO.

a) From the Data Manager tab we can see the number of rows returned and the time taken to run the report.
b) Click on any column in the result set, right-click and select Count All. This will display the total number of columns in the result set.

A Cartesian join will get you a Cartesian product. A Cartesian join is when you join every row of one table to every row of another table. You can also get one by joining every row of a table to every row of itself.
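The row counts involved are easy to verify with a small sketch. The example below uses Python's built-in `sqlite3` module with two hypothetical tables, `colors` (3 rows) and `sizes` (2 rows); the table and column names are assumptions made only for illustration.

```python
import sqlite3

# Two small hypothetical tables, created in memory for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE colors (name TEXT)")
conn.execute("CREATE TABLE sizes (label TEXT)")
conn.executemany("INSERT INTO colors VALUES (?)", [("red",), ("blue",), ("green",)])
conn.executemany("INSERT INTO sizes VALUES (?)", [("S",), ("L",)])

# Every row of colors is paired with every row of sizes: 3 x 2 rows.
cross_rows = conn.execute(
    "SELECT name, label FROM colors CROSS JOIN sizes"
).fetchall()

# Joining a table to itself without a join condition: 3 x 3 rows.
self_rows = conn.execute(
    "SELECT a.name, b.name FROM colors a CROSS JOIN colors b"
).fetchall()

print(len(cross_rows))
print(len(self_rows))
```

The result size is always the product of the two row counts, which is why an accidental Cartesian join on large tables is so expensive.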

Containers: Usage and Types?

A container is a collection of stages used for the purpose of reusability. There are two types of containers.

a) Local Container: Job Specific
b) Shared Container: Used in any job within a project.

TIBCO BusinessWorks offers a variety of types of transactions that can be used in different situations. You can use the type of transaction that suits the needs of your integration project. When you create a transaction group, you must specify the type of transaction. TIBCO BusinessWorks supports the following types of transactions:
• Java Transaction API (JTA) UserTransaction
• XA Transaction

Maxcore is a value (expressed in KB). Whenever a component is executed, it takes the amount of memory that we specified for its execution.

Define three transitions from the JDBC Update activity, with conditions on the number of updates, and call the appropriate child processes.

Dimensions that change over time are called Slowly Changing Dimensions. For instance, a product price changes over time; people change their names for some reason; country and state names may change over time. These are a few examples of Slowly Changing Dimensions, since some changes happen to them over a period of time.

If the data in the dimension table changes very rarely, then it is called a slowly changing dimension.

Example: changing the name and address of a person, which happens rarely.
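One common way to handle such changes is to keep a history of versions with validity dates (often called a Type 2 approach). The sketch below is only an illustration of that idea, not a prescribed design; `CustomerVersion` and `apply_change` are hypothetical names, and real ETL tools provide their own mechanisms for this.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerVersion:
    """One version of a customer row in a hypothetical dimension table."""
    customer_id: int
    name: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_change(history: List[CustomerVersion], customer_id: int,
                 new_name: str, change_date: date) -> None:
    """Close the current version (if the name changed) and append a new one."""
    for row in history:
        if row.customer_id == customer_id and row.valid_to is None:
            if row.name == new_name:
                return  # nothing changed, keep the current version
            row.valid_to = change_date  # expire the old version
    history.append(CustomerVersion(customer_id, new_name, change_date))


history: List[CustomerVersion] = []
apply_change(history, 1, "Ann Smith", date(2020, 1, 1))
apply_change(history, 1, "Ann Jones", date(2023, 6, 1))  # a rare name change
for version in history:
    print(version)
```

Keeping the old row instead of overwriting it means that reports on past periods can still show the name that was valid at that time.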

Drilling can be done as drill down, drill up, drill through, and drill across; the scope is the overall view of the drill exercise.

Yes. In order to perform cross-project operations, the projects involved must originate from the same source project. In other words, the projects can only be related by the duplication of a single project. This ensures that the projects have a similar set of schema and application objects, and that the object IDs in the two projects are the same. MicroStrategy Object Manager uses the object and version IDs across the projects to perform comparisons. MicroStrategy Object Manager prevents the user from attempting operations across unrelated projects.

m_dump command prints the data in a formatted way.

The basic purpose of the scheduling tool in a DW application is to streamline the flow of data from source to target at a specific time or based on some condition.

A cascading report works based on a condition, whereas drill-through works based on the data item that we select as the drill-through option.

A surrogate key is a substitute for the natural primary key. It is just a unique identifier or number for each row that can be used as the primary key of the table. The only requirement for a surrogate primary key is that it is unique for each row in the table.

Data warehouses typically use a surrogate key (also known as an artificial or identity key) for the primary keys of the dimension tables. They can use an Informatica Sequence Generator, an Oracle sequence, or SQL Server identity values for the surrogate key.

It is useful because the natural primary key (e.g. Customer Number in the Customer table) can change, and this makes updates more difficult.

Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys (according to the business users). Not only can these change, but indexing on a numerical value is probably better, so you could consider creating a surrogate key called, say, AIRPORT_ID. This would be internal to the system, and as far as the client is concerned, you may display only the AIRPORT_NAME.
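The AIRPORT_ID idea can be sketched with Python's built-in `sqlite3` module, using SQLite's auto-incrementing integer key as a stand-in for a sequence or identity column. The table names `dim_airport` and `fact_flight` and the sample data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_airport (
        airport_id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        airport_name TEXT NOT NULL                       -- natural (business) key
    )
""")
conn.execute("""
    CREATE TABLE fact_flight (
        airport_id INTEGER REFERENCES dim_airport(airport_id),
        passengers INTEGER
    )
""")
conn.executemany(
    "INSERT INTO dim_airport (airport_name) VALUES (?)",
    [("Heathrow",), ("Schiphol",)],
)

# The fact row stores only the surrogate key, never the name.
heathrow_id = conn.execute(
    "SELECT airport_id FROM dim_airport WHERE airport_name = ?", ("Heathrow",)
).fetchone()[0]
conn.execute("INSERT INTO fact_flight VALUES (?, ?)", (heathrow_id, 180))

# The business name changes; only the dimension row needs an update.
conn.execute(
    "UPDATE dim_airport SET airport_name = ? WHERE airport_id = ?",
    ("London Heathrow", heathrow_id),
)

joined = conn.execute("""
    SELECT d.airport_name, f.passengers
    FROM fact_flight f
    JOIN dim_airport d ON d.airport_id = f.airport_id
""").fetchall()
print(joined)
```

Because the fact table references the numeric key, renaming the airport touches one dimension row and leaves every fact row unchanged.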

A security filter is used to apply security at the database data level. Whenever a user associated with a security filter runs a report, a WHERE clause is always included in the report SQL with the condition defined in the security filter.
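The effect on the generated SQL can be illustrated with a deliberately naive sketch. This is not MicroStrategy's API or its actual SQL engine; `apply_security_filter` is a hypothetical helper that assumes a simple single SELECT statement with no GROUP BY or ORDER BY clause.

```python
def apply_security_filter(report_sql: str, filter_condition: str) -> str:
    """Append a security condition to a simple SELECT statement.

    Naive illustration only: assumes one SELECT with no GROUP BY / ORDER BY.
    """
    if " where " in report_sql.lower():
        # The report already filters rows, so the security condition is ANDed on.
        return f"{report_sql} AND ({filter_condition})"
    return f"{report_sql} WHERE ({filter_condition})"


# Hypothetical report SQL and a filter restricting the user to one region.
print(apply_security_filter("SELECT region, sales FROM fact_sales", "region = 'EMEA'"))
print(apply_security_filter("SELECT * FROM fact_sales WHERE year = 2024", "region = 'EMEA'"))
```

The point is that the restriction is applied to every report the user runs, so the user never sees rows outside the filter, whatever the report definition says.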

The Entity-Relationship (ER) model was originally proposed by Peter Chen in 1976 [Chen76] as a way to unify the network and relational database views. Simply stated, the ER model is a conceptual data model that views the real world as entities and relationships. A basic component of the model is the Entity-Relationship diagram, which is used to visually represent data objects. Since Chen wrote his paper, the model has been extended, and today it is commonly used for database design. For the database designer, the utility of the ER model is that it maps well to the relational model: the constructs used in the ER model can easily be transformed into relational tables. It is simple and easy to understand with a minimum of training. Therefore, the database designer can use the model to communicate the design to the end user. In addition, the model can be used as a design plan by the database developer to implement a data model in specific database management software.

Business Services and Workflows Both
Business Services Only
Workflow Only

We can have multiple domains, but the security domain cannot be multiple; only the document domain and the universe domain can be multiple.

A management reporting tool to gauge how well the organization is performing. It normally uses "traffic lights" or "smiley faces" to indicate the status.

Data analysis: suppose you are running a business and you store its data in some form, say in a register or on a computer. At the end of the year you want to know the profit or loss; working that out is called data analysis. Use of data analysis: you may want to know which product sold the most, or, if the business is running at a loss, find out where things went wrong. For that we do analysis.

Cubes are a logical representation of multidimensional data. The edges of the cube contain dimension members and the body of the cube contains data values.
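A minimal way to picture this is a dictionary whose keys are tuples of dimension members (the edges) and whose values are the data values (the body). The dimensions, members and figures below are hypothetical, and `roll_up` is an illustrative helper rather than any OLAP product's API.

```python
from collections import defaultdict

# Names of the cube's edges, in the order used by the cell keys.
DIMENSIONS = ("product", "region", "quarter")

# Hypothetical cells: dimension members on the edges, a sales value in the body.
cells = {
    ("Laptop", "East", "Q1"): 100,
    ("Laptop", "West", "Q1"): 150,
    ("Phone", "East", "Q1"): 80,
    ("Phone", "East", "Q2"): 120,
}


def roll_up(cells, keep):
    """Aggregate the cube, keeping only the named dimensions."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for members, value in cells.items():
        totals[tuple(members[i] for i in positions)] += value
    return dict(totals)


by_product = roll_up(cells, ["product"])
by_region_quarter = roll_up(cells, ["region", "quarter"])
print(by_product)
print(by_region_quarter)
```

Dropping a dimension and summing over it is the roll-up operation; keeping more dimensions gives a more detailed slice of the same data.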

Essbase supports two different types of attributes:
• User-defined attributes
• Simple attributes
User-defined attributes: The attributes that are defined by the user.
Simple attributes: Essbase supports some attribute types, namely Boolean, date, number, and string.

An active transformation can change the number of rows that pass through it, but a passive transformation cannot change the number of rows that pass through it.
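The distinction can be sketched in plain Python. The functions below are hypothetical stand-ins, not Informatica objects: `filter_positive` behaves like an active transformation because it can drop rows, while `add_tax` behaves like a passive one because it returns exactly one output row per input row.

```python
# Hypothetical input rows flowing through a mapping.
rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": -5},
    {"id": 3, "amount": 20},
]


def filter_positive(rows):
    """Active: rows that fail the condition are dropped, so the count can change."""
    return [row for row in rows if row["amount"] > 0]


def add_tax(rows, rate=0.1):
    """Passive: every input row yields exactly one output row with an extra column."""
    return [{**row, "tax": round(row["amount"] * rate, 2)} for row in rows]


print(len(rows), len(filter_positive(rows)))  # row count changes
print(len(rows), len(add_tax(rows)))          # row count stays the same
```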

As the term suggests, a real-time data warehouse is a system that reflects all changes to its sources in real time. As simple as it sounds, this is still an area of active research in the field. In a traditional DWH, the operational systems are kept separate from the DWH for a good reason.

The operational systems are designed to accept inputs or changes to data regularly, and hence have a good chance of being regularly queried. On the other hand, a DWH is supposed to do just the opposite: it is used to query data for reports only. No changes to data through user actions are expected (or designed for). The only inputs come from the ETL feed at stipulated times. The ETL sources its data from the operational systems just described.

To create a real-time DWH we would have to merge both systems (several ways are being explored), a concept that goes against the reason for creating a DWH. Bigger challenges occur in updating aggregated data in facts in real time while still maintaining the surrogate keys.

Besides, we would need lightning-fast hardware to try this.

A near real-time DWH is a trade-off between the conventional design and the dream of all clients today. The frequency of ETL updates is higher in this case, e.g. once every 2 hours. We can also analyze and use selective refreshes at shorter time intervals, while complete refreshes may still be kept further apart. Selective refreshes would look at only those tables that get updated regularly.

Data mining is used to estimate the future. For example, for a company or business organization, we can use data mining to predict the future of the business in terms of revenue, employees, customers, orders, etc. Traditional approaches use simple algorithms for estimating the future. However, they do not give accurate results when compared to data mining.

Users can receive both tables and charts from the MicroStrategy platform, and content from current information sources such as transaction processing systems, Enterprise Resource Planning systems, databases, XML files, and web servers.

From the Designer you can see the hierarchy already defined, and you can also change or edit it.
Go to Tools -> Hierarchies.
Here you can either add a new one or edit an existing one.

Config file consists of the following.

a) Number of Processes or Nodes.
b) Actual Disk Storage Location.

To convert a 4-way partition to an 8-way partition, we need to change the layout in the partitioning component. There will be separate parameters for each type of partitioning, e.g. AI_MFS_HOME, AI_MFS_MEDIUM_HOME, AI_MFS_WIDE_HOME, etc.

The appropriate parameter needs to be selected in the component layout for the type of partitioning.

There are three types of partitions.
• Transparent partition: A form of shared partition that provides the ability to access and manipulate remote data transparently, as though it were part of your local database. The remote data is retrieved from the data source each time you request it. Any updates made to the data are written back to the data source and become immediately accessible to both local data target users and transparent data source users.
• Replicated partition:
• Linked partition:

Star schema:
A single fact table with N dimensions.
Snowflake schema:
Any dimensions with extended dimensions are known as a snowflake schema.
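A star schema can be sketched with Python's built-in `sqlite3` module: one fact table joined directly to each of its dimension tables. All table names, column names and values here are hypothetical. In a snowflake variant, a dimension such as `dim_store` would itself reference a further table (for example a separate city table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables (the points of the star).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);

    -- Single fact table at the center, holding the measures.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Phone');
    INSERT INTO dim_store   VALUES (1, 'Pune'), (2, 'Delhi');
    INSERT INTO fact_sales  VALUES (1, 1, 1000.0), (1, 2, 1200.0), (2, 1, 300.0);
""")

# Measures come from the fact table; the grouping label comes from a dimension.
totals = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(totals)
```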