Top 50 Sql Server Analysis Services (SSAS) Interview Questions You Must Prepare 19.Mar.2024

Attribute relationships are the way of telling the analysis service engine that how the attributes are related with each other. It will help to relate two or more  attributes to each other.Processing time will be decreased if proper relationships are given. This increases the Cube Processing performance and MDX query performance too.

In Microsoft SQL Server Analysis Services, attributes within a dimension are always related either directly or indirectly to the key attribute. When you define a dimension based on a star schema, which is where all dimension attributes are derived from the same relational table, an attribute relationship is automatically defined between the key attribute and each non-key attribute of the dimension.

When you define a dimension based on a snowflake schema, which is where dimension attributes are derived from multiple related tables, an attribute relationship is automatically defined as follows:

  • Between the key attribute and each non-key attribute bound to columns in the main dimension table.
  • Between the key attribute and the attribute bound to the foreign key in the secondary table that links the underlying dimension tables.
  • Between the attribute bound to foreign key in the secondary table and each non-key attribute bound to columns from the secondary table.

The maximum size of the dimension is 4 gb.

A dimension table contains hierarchical data by which you’d like to summarize. A dimension table contains specific business information, a dimension table that contains the specific name of each member of the dimension. The name of the dimension member is called an “attribute”

The key attribute in the dimension must contain a unique value for each member of the dimension. This key attribute is called “primary key column”

The primary key column of each dimension table corresponding to the one of the key column  in any related fact table.

  • Measure groups : These measure groups can contain different dimensions and be at different granularity  but so long as you model your cube correctly, your users will be able to use measures from each of these measure groups in their queries easily and without worrying about the underlying complexity.
  • Creating multiple measure groups : To create a new measure group in the Cube Editor, go to the Cube Structure tab and right-click on the cube name in the Measures pane and select ‘New Measure Group’. You’ll then need to select the fact table to create the measure group from and then the new measure group will be created; any columns that aren’t used as foreign key columns in the DSV will automatically be created as measures, and you’ll also get an extra measure of aggregation type Count. It’s a good idea to delete any measures you are not going to use at this stage.
  •  Measures :  Measures are the numeric values that our users want to aggregate, slice, dice and otherwise analyze, and as a result, it’s important to make sure they behave the way we want them to. One of the fundamental reasons for using Analysis Services is that, unlike a relational database it allows us to build into our cube design business rules about measures: how they should be formatted, how they should aggregate up, how they interact with specific dimensions and so on.

A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data subject that may be distributed to support business needs. Data marts are analytical data stores designed to focus on specific business functions for a specific community within an organization.

Data marts are often derived from subsets of data in a data warehouse, though in the bottom-up data warehouse design methodology the data warehouse is created from the union of organizational data marts.

They are 3 types of data mart they are:

  1. Dependent
  2. Independent
  3. Logical data mart

Attribute Hierarchy Ordered: Determines whether the associated attribute hierarchy is ordered. The default value is True. However, if an attribute hierarchy will not be used for querying, you can save processing time by changing the value of this property to False.

By selecting  “ AttributeHierarchyEnabled = False”, we can make an  attribute not in process.

  • In 2005 a MAX of 2000 partitions can be created per measure group and that limit is lifted in later versions.
  • In any version the MINIMUM is ONE Partition per measure group.

Star schema: In star schema fact table will be directly linked with all dimension tables. The star schema’s dimensions are denormalized with each dimension being represented by a single table. In a star schema a central fact table connects a number of individual dimension tables.

Snowflake: The snowflake schema is an extension of the star schema, where each point of the star explodes into more points. In a star schema, each dimension is represented by a single dimensional table, whereas in a snowflake schema, that dimensional table is normalized into multiple lookup tables, each representing a level in the dimensional hierarchy. In snow flake schema fact table will be linked directly as well as there will be some intermediate dimension tables between fact and dimension tables.

Star flake: A hybrid structure that contains a mixture of star(denormalized) and snowflake(normalized) schema’s.

A hierarchy is a very important part of any OLAP engine and allows users to drill down from  summary levels hierarchies represent the way user expect to explore data at more detailed level

hierarchies  is made up of multipule levels creating the structure based on end user requirements.
->years->quarter->month->week ,are all the levels of calender hierarchy

They are 2 types of hierarchies they are

  1. Natural hierarchy
  2. Unnatural hierarchy
  • Natural hierarchy: This means that the attributes are intuitively related to one another. There is a clear relationship from the top of the hierarchy to the bottom.
    Example: An example of this would be date: year, quarter and month follow from each other, and in part, define each other.
  • Unnatural hierarchy: This means that the attributes are not clearly related.
    Example: An example of this might be geography; we may have country -> state -> city, but it is not clear where Province might sit.

Just open the datasourceview and on right click we find the option REFRESH. Click the REFRESH then it will add new attributes to the table which can be added to Cube.

Key column of any attribute: Contains the column or columns that represent the key for the attribute, which is the column in the underlying relational table in the data source view to which the attribute is bound. The value of this column for each member is displayed to users unless a value is specified for the NameColumn property.

Name  column of an attribute: Identifies the column that provides the name of the attribute that is displayed to users, instead of the value in the key column for the attribute. This column is used when the key column value for an attribute member is cryptic or not otherwise useful to the user, or when the key column is based on a composite key. The NameColumn property is not used in parent-child hierarchies; instead, the NameColumn property for child members is used as the member names in a parent-child hierarchy.

Value columns of an attribute: Identifies the column that provides the value of the attribute. If the NameColumn element of the attribute is specified, the same DataItem values are used as default values for the ValueColumn element. If the NameColumn element of the attribute is not specified and the KeyColumns collection of the attribute contains a single KeyColumn element representing a key column with a string data type, the same DataItem values are used as default values for the ValueColumn element.

Translation: The translation feature in analysis service allows you to display caption and attributes names that correspond to a specific language. It helps in providing GLOBALIZATION to the Cube.

The data source is the Physical Connection information that analysis service uses to connect to the database that host the data. The data source contains the connection string which specifies the server and the database hosting the data as well as any necessary authentication credentials.

Slowly changing dimensions (SCD) determine how the historical changes in the dimension tables are handled. Implementing the SCD mechanism enables users to know to which category an item belonged to in any given date.

They are 3 types of database schema they are

  1. Star
  2. Snowflake
  3. Starflake

There are three standard storage modes in OLAP applications

  1. MOLAP
  2. ROLAP
  3. HOLAP

An attribute hierarchy is created for every attribute in a dimension, and each hierarchy is available for dimensioning fact data. This hierarchy consists of an “All” level and a detail level containing all members of the hierarchy.

you can organize attributes into user-defined hierarchies to provide navigation paths in a cube. Under certain circumstances, you may want to disable or hide some attributes and their hierarchies.

Attribute Hierarchy Optimized State:

Determines the level of optimization applied to the attribute hierarchy. By default, an attribute hierarchy is FullyOptimized, which means that Analysis Services builds indexes for the attribute hierarchy to improve query performance. The other option, NotOptimized, means that no indexes are built for the attribute hierarchy. Using NotOptimized is useful if the attribute hierarchy is used for purposes other than querying, because no additional indexes are built for the attribute. Other uses for an attribute hierarchy can be helping to order another attribute.

The full form of AMO is Analysis Management Objects. This is used to create or alter cubes from .NET code.

  • Giving attribute relationships
  • Calculations
  • Giving dimension usage (many to many relationship)
  • Analyzing the requirements

A data source view is a persistent set of tables from a data source that supply the data for a particular cube. BIDS also includes a wizard for creating data source views, which you can invoke by right-clicking on the Data Source Views folder in Solution Explorer.

  1. Datasource view is the logical view of the data in the data source.
  2. Data source view  is the only thing a cube can see.

To add a dimension to a cube follow these steps.

  1. In Solution Explorer, right-click the cube, and then click View Designer.
  2. In the Design tab for the cube, click the Dimension Usage tab.
  3. Either click the Add Cube Dimension button, or right-click anywhere on the work surface  and then click Add Cube Dimension.
  4. In the Add Cube Dimension dialog box, use one of the following steps:
  5. To add an existing dimension, select the dimension, and then click OK.
  6. To create a new dimension to add to the cube, click New dimension, and then follow the steps in the Dimension Wizard.

By defining roles we provide security to cubes. Using roles we can restrict users from accessing restricted data.

Procedure as follows: 

  1. Define Role
  2. Set Permission
  3. Add appropriate Users to the role

You can use the Partition Wizard to define partitions for a measure group in a cube. By default, a single partition is defined for each measure group in a cube. Access and processing performance, however, can degrade for large partitions. By creating multiple partitions, each containing a portion of the data for a measure group, you can improve the access and processing performance for that measure group.

This is very important interview question. The “Factless Fact Table” is a table which is similar to Fact Table except for having any measure; I mean that this table just has the links to the dimensions. These tables enable you to track events; indeed they are for recording events.

Factless fact tables are used for tracking a process or collecting stats. They are called so because, the fact table does not have aggregatable numeric values or information. They are mere key values with reference to the dimensions from which the stats can be collected

They are six relation between the dimension and measure group, they are

  1. No Relationship
  2. Regular
  3. Refernce
  4. Many to Many
  5. Data Mining
  6. Fact

A surrogate key is the SQL generated key which acts like an alternate primary key for the table in database, Data warehouses commonly use a surrogate key to uniquely identify an entity. A surrogate is not generated by the user but by the system. A primary difference between a primary key and surrogate key in few databases is that primarykey uniquely identifies a record while a Surrogatekey uniquely identifies an entity.

Ex: An employee may be recruited before the year 2000 while another employee with the same name may be recruited after the year 200@Here, the primary key will uniquely identify the record while the surrogate key will be generated by the system (say a serial number) since the SK is NOT derived from the data.

They are 3 types of dimensions:

  1. confirm dimension
  2. junk dimension
  3. degenerate attribute

AttributeHierarchyEnabled: Determines whether an attribute hierarchy is generated by Analysis Services for the attribute. If the attribute hierarchy is not enabled, the attribute cannot be used in a user-defined hierarchy and the attribute hierarchy cannot be referenced in Multidimensional Expressions (MDX) statements.

It is a concept of STORING Historical Changes and when ever an IT guy finds a new way to store then a new Type will come into picture. Basically there are 3 types of SCD they are given below

  1. SCD type1
  2. SCD type2
  3. SCD type3

Perspectives are a way to reduce the complexity of cubes by hidden elements like measure groups, measures, dimensions, hierarchies etc. It’s nothing but slicing of a cube, for ex we are having retail and hospital data and end user is subscribed to see only hospital data, then we can create perspective according to it.

All the dimensions that are created using NEW DIMENSION Wizard are database dimensions. In other words, the dimensions which are at Database level are called Database Dimensions.

They are 6 types of processing in ssas ,they are

  1. Process Full
  2. Process Data
  3. Process Index
  4. Process Incremental
  5. Process Structure
  6. UnProcess

An attribute is a specification that defines a property of an object, element, or file. It may also refer to or set the specific value for a given instance of such.

Steps to create a cube in ssas:

  1. Create  a data source.
  2. Create a datasource view.
  3. Create Dimensions
  4. Create a cube.
  5. Deploy and Process the cube.

Datawarehouse is complete data where as Data mart is Subset of the same.

Ex: All the organisation data may related to finance department, HR, banking dept are stored in data warehouse where as in data mart only finance data or HR department data will be stored. So data warehouse is a collection of different data marts.

AttributeHierarchyVisible : Determines whether the attribute hierarchy is visible to client applications. The default value is True. However, if an attribute hierarchy will not be used for querying, you can save processing time by changing the value of this property to False.

The basic unit of storage and analysis in Analysis Services is the cube. A cube is a collection of data that’s been aggregated to allow queries to return data quickly.

For example, a cube of order data might be aggregated by time period and by title, making the cube fast when you ask questions concerning orders by week or orders by title.

A named calculation is a SQL expression represented as a calculated column. This expression appears and behaves as a column in the table. A named calculation lets you extend the relational schema of existing tables or views in a data source view without modifying the tables or views in the underlying data source.

Named calculation is used to create a new column in the DSV using hard coded values or by using existing columns or even with both.

MOLAP (Multi dimensional Online Analytical Processing) : MOLAP is the most used storage type. Its designed to offer maximum query performance to the users. the data and aggregations are stored in a multidimensional format, compressed and optimized for performance. This is both good and bad. When a cube with MOLAP storage is processed, the data is pulled from the relational database, the aggregations are performed, and the data is stored in the AS database. The data inside the cube will refresh only when the cube is processed, so latency is high.

Advantages:

  1. Since the data is stored on the OLAP server in optimized format, queries (even complex calculations) are faster than ROLAP.
  2. The data is compressed so it takes up less space.
  3. And because the data is stored on the OLAP server, you don’t need to keep the connection to the relational database.
  4. Cube browsing is fastest using MOLAP.

Impersonation allows SSAS to assume the identity/security context of the client application which is used by SSAS to perform the server side data operations like data access, processing etc. As part of impersonation, the following options are available in SSAS:

  • Use a specific Windows user name and password: This option lets you to specify Windows account credentials which will be used by SSAS to perform operations like source data access, processing etc.
  • Use the service account: When this option is selected, SSAS uses the credentials of the service account under which the Analysis Services service is configured/running for source data access, processing etc.
  • Use the credentials of the current user: When this option is set, SSAS uses the credentials of the current user for performing operations like DMX Open Queries, Local cubes etc. This option cannot be used for performing server side operations like source data access, processing etc.
  • Inherit: This option let's the SSAS server decide which impersonation mode is suitable for each type of operation. When this option is set, by default SSAS will use the service account for operations like processing and the credentials of the current user for operations like Local cubes, querying the data mining models, etc...

They are 2 types of attribute relationships they are

  1. Rigid
  2. Flexible
  • Rigid: In Rigid relationships  where the relationship between the attributes is fixed, attributes will not change levels or their respective attribute relationships.
    • Example: The time dimension. We know that month “January 2009” will ONLY belong to Year “2009” and it wont be moved to any other year.
  • Flexible :   In Flexible relationship between the attributes is changed.
    • Example: An employee and department. An employee can be in accounts department today but it is possible that the employee will be in Marketing department tomorrow.

Bulid: Verifies the project files and create several local files.
Deploy: Deploy the structure of the cube(Skeleton) to the server.
Process: Read the data from the source and build the dimensions and cube structures

We can hide the attribute by selecting “AttributeHierarchyVisible = False” in properties of the attribute.

Role play dimensions: We already discussed about this. This is nothing but CONFIRMED Dimensions. A dimension can play different role in a fact table you can recognize a roleplay dimension when there are multiple columns in a fact table that each have foreign keys to the same dimension table.

Ex1: There are three dimension keys in the factinternalsales,factresellersales tables which all refer to the dimtime table,the same time dimension is used  to track sales by that contain either of these fact table,the corresponding  role-playing dimension are automatically added to the cube.

Ex2 : In retail banking, for checking account cube we could have transaction date dimension and effective date dimension. Both dimensions have date, month, quarter and year attributes. The formats of attributes are the same on both dimensions, for example the date attribute is in ‘dd-mm-yyyy’ format. Both dimensions have members from 1993 to 2010.

A cube dimension is an instance of a database dimension within a cube is called as cube dimension. A database dimension can be used in multiple cubes, and multiple cube dimensions can be based on a single database dimension

Named query in DSV is similar to View in Database. This is used to create Virtual table in DSV which will not impact the underlying database. Named query is mainly used to merge the two or more table in the datasource view or to filter columns of a table.

Actions are powerful way of extending the value of SSAS cubes for the end user. They can click on a cube or portion of a cube to start an application with the selected item as a parameter, or to retrieve information about the selected item.

One of the objects supported by a SQL Server Analysis Services cube is the action. An action is an event that a user can initiate when accessing cube data. The event can take a number of forms. For example, a user might be able to view a Reporting Services report, open a Web page, or drill through to detailed information related to the cube data

Analysis Services supports three types of actions..

  • Report action: Report action Returns a Reporting Services report that is associated with the cube data on which the action is based.
  • Drill through: Drillthrough Returns a result set that provides detailed information related to the cube data on which the action is based.
  • Standard: Standard has five action subtypes that are based on the specified cube data.
    • Dataset: Returns a multidimensional dataset.
    • Proprietary: Returns a string that can be interpreted by a client application.
    • Rowset: Returns a tabular rowset.
    • Statement: Returns a command string that can be run by a client application.
    • URL:  Returns a URL that can be opened by a client application, usually a browser.

ROLAP (Relational Online Analytical Processing) : ROLAP does not have the high latency disadvantage of MOLAP. With ROLAP, the data and aggregations are stored in relational format. This means that there will be zero latency between the relational source database and the cube.

Disadvantage of this mode is the performance, this type gives the poorest query performance because no objects benefit from multi dimensional storage.

Advantages:

  • Since the data is kept in the relational database instead of on the OLAP server, you can view the data in almost real time.
  • Also, since the data is kept in the relational database, it allows for much larger amounts of data, which can mean better scalability.
  • Low latency.