What is the difference between a data mart and a data warehouse?
A data warehouse holds the complete data of an organisation, whereas a data mart is a subset of it.
Example:
A data warehouse stores the data of every department in the organisation, such as finance, HR, and banking, whereas a data mart stores the data of a single department, such as finance or HR. In that sense, a data warehouse can be viewed as a collection of data marts.
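To make the "subset" idea concrete, here is a minimal Python sketch. The rows, the `department` field, and the `build_data_mart` helper are all made up for illustration; a real data mart would be built with ETL tooling, not a list comprehension.

```python
# Hypothetical warehouse rows; the department names follow the example above.
warehouse = [
    {"department": "Finance", "record": "Q1 ledger"},
    {"department": "HR", "record": "Payroll run"},
    {"department": "Banking", "record": "Loan book"},
    {"department": "Finance", "record": "Q2 ledger"},
]


def build_data_mart(rows, department):
    """Return only the rows that belong to one department (illustrative helper)."""
    return [row for row in rows if row["department"] == department]


finance_mart = build_data_mart(warehouse, "Finance")
print(len(warehouse), len(finance_mart))  # 4 2
```

The warehouse keeps every department's rows, while the finance mart sees only its own slice.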
Have you ever worked on performance tuning? If yes, what steps are involved?
First identify the bottlenecks, then address them with the following measures:
Avoid named queries
Avoid unnecessary relationships between tables
Define proper attribute relationships
Design aggregations properly
Partition the data properly
Design dimension usage properly
Avoid unnecessary many-to-many relationships
Avoid unnecessary measures; do not include even a single measure that is not needed
Set AttributeHierarchyEnabled = False on attributes that are not required
What are the difficulties faced in cube development?
Interviewers ask this either to test whether you really have hands-on experience or because they have run out of other questions.
You can name any area you find difficult, but the best answers are usually the following:
Defining attribute relationships
Calculations
Defining dimension usage (many-to-many relationships)
Analyzing the requirements
Explain the flow of creating a cube.
Steps to create a cube in SSAS:
Create a data source.
Create a data source view.
Create dimensions.
Create a cube.
Deploy and process the cube.
What is a data source or DS?
A data source is the physical connection information that Analysis Services uses to connect to the database that hosts the data. It contains the connection string, which specifies the server and the database hosting the data, as well as any necessary authentication credentials.
What is a data source view or DSV?
A data source view is a persistent set of tables from a data source that supplies the data for a particular cube. BIDS includes a wizard for creating data source views, which you can invoke by right-clicking the Data Source Views folder in Solution Explorer.
A data source view is the logical view of the data in the data source.
The data source view is the only thing a cube can see.
What is a named calculation?
A named calculation is a SQL expression represented as a calculated column. The expression appears and behaves as a column in the table. A named calculation lets you extend the relational schema of existing tables or views in a data source view without modifying the tables or views in the underlying data source.
A named calculation is used to create a new column in the DSV from hard-coded values, from existing columns, or from both.
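The idea can be imitated outside SSAS with a plain SQL expression. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the real data source; the `Customer` table, the `FullName` and `Segment` columns, and the sample row are assumptions made up for illustration. SQLite concatenates strings with `||`, whereas a SQL Server source would normally use `+`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (FirstName TEXT, LastName TEXT)")
conn.execute("INSERT INTO Customer VALUES ('Ada', 'Lovelace')")

# A named calculation is just a SQL expression exposed as an extra column.
# One expression uses existing columns, the other a hard-coded value.
named_calculation = "FirstName || ' ' || LastName"
rows = conn.execute(
    f"SELECT FirstName, LastName, {named_calculation} AS FullName, "
    "'Retail' AS Segment FROM Customer"
).fetchall()
print(rows)  # [('Ada', 'Lovelace', 'Ada Lovelace', 'Retail')]

# The underlying table is unchanged: it still has only its two original columns.
columns = [info[1] for info in conn.execute("PRAGMA table_info(Customer)")]
print(columns)  # ['FirstName', 'LastName']
```

The extra columns exist only in the query result, which mirrors how a named calculation lives in the DSV and not in the source table.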
What is a named query?
A named query in a DSV is similar to a view in a database. It creates a virtual table in the DSV without affecting the underlying database. Named queries are mainly used to merge two or more tables in the data source view or to filter the columns of a table.
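A rough way to picture this, again with Python's built-in `sqlite3` as a stand-in for the source database: the query text is kept outside the database and used as a virtual table. The `Product` and `Category` tables, their columns, and the `DimProduct` alias are hypothetical names chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (CategoryKey INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE Product (
        ProductKey INTEGER PRIMARY KEY,
        Name TEXT,
        CategoryKey INTEGER,
        InternalNote TEXT
    );
    INSERT INTO Category VALUES (1, 'Bikes');
    INSERT INTO Product VALUES (10, 'Road Bike', 1, 'do not expose');
""")

# The query text is stored outside the database (in SSAS, inside the DSV).
# It joins two tables and leaves out the InternalNote column.
named_query = """
    SELECT p.ProductKey, p.Name, c.CategoryName
    FROM Product AS p
    JOIN Category AS c ON c.CategoryKey = p.CategoryKey
"""

# Consumers treat the query as if it were a single table.
rows = conn.execute(f"SELECT * FROM ({named_query}) AS DimProduct").fetchall()
print(rows)  # [(10, 'Road Bike', 'Bikes')]

# No view was created in the database itself.
views = conn.execute("SELECT name FROM sqlite_master WHERE type = 'view'").fetchall()
print(views)  # []
```

Because nothing is created in the database, this approach needs only read access to the source tables.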
How do you provide security for a cube?
We secure cubes by defining roles. Roles let us prevent users from accessing restricted data. The procedure is as follows:
Define the role
Set the permissions
Add the appropriate users to the role
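The three steps can be pictured with a toy model in Python. This is not the SSAS API: the `roles` dictionary, the helper functions, and the user and role names are all invented here only to show how a role ties permissions to users.

```python
# Toy model only: real SSAS roles are defined in the Analysis Services database.
roles = {}


def define_role(name):
    """Step 1: create an empty role."""
    roles[name] = {"allowed_departments": set(), "members": set()}


def set_permission(role, department):
    """Step 2: allow the role to read one department's data."""
    roles[role]["allowed_departments"].add(department)


def add_user(role, user):
    """Step 3: add a user to the role."""
    roles[role]["members"].add(user)


def visible_rows(user, rows):
    """Return only the rows the user's roles allow."""
    allowed = set()
    for role in roles.values():
        if user in role["members"]:
            allowed |= role["allowed_departments"]
    return [row for row in rows if row["department"] in allowed]


define_role("FinanceReaders")
set_permission("FinanceReaders", "Finance")
add_user("FinanceReaders", "alice")

data = [
    {"department": "Finance", "amount": 100},
    {"department": "HR", "amount": 50},
]
print(visible_rows("alice", data))  # [{'department': 'Finance', 'amount': 100}]
print(visible_rows("bob", data))    # []
```

A user who belongs to no role sees nothing, which matches the idea that access is granted only through roles.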
How much time does it take to process the cube?
This is a very important question. The answer depends on the size of the database, the complexity of the database, and your server settings. As a rough reference point, a database with 50 crore (500 million) transaction records generally takes about 3.5 hours.
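If the interviewer follows up with "what throughput is that?", the figure above can be turned into a rows-per-second rate. The short Python calculation below only restates the numbers from the answer and assumes nothing about any particular server.

```python
records = 50 * 10_000_000   # 50 crore = 500,000,000 rows
seconds = 3.5 * 60 * 60     # 3.5 hours = 12,600 seconds

rows_per_second = records / seconds
print(round(rows_per_second))  # 39683
```

So the quoted figure works out to roughly 40,000 rows per second on average.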
How many calculations have you done in your project?
My own answer is more than 5000, but if you say the same you will be caught out unless you are very strong in MDX. The best answer for you is "I worked on 50 calculations".
How will you hide an attribute?
We can hide an attribute by setting AttributeHierarchyVisible = False in the properties of the attribute.
How will you exclude an attribute from processing?
By setting AttributeHierarchyEnabled = False, so that no attribute hierarchy is built for that attribute during processing.
Why do we need named queries?
A named query is used to join multiple tables or to remove unnecessary columns from a database table. You can achieve the same thing in the database with views, but named queries are the best bet when you do not have permission to create views in the database.
How will you add a new column to an existing table in the data source view?
By using a named calculation, we can add a new column to an existing table in the data source view. Named calculations are explained above.
What is a dimension table?
A dimension table contains the hierarchical data by which you want to summarize. It holds specific business information, including the name of each member of the dimension. The name of a dimension member is called an "attribute".
The key attribute of the dimension must contain a unique value for each member of the dimension. In the dimension table, this key attribute is the primary key column.
The primary key column of each dimension table corresponds to one of the key columns in any related fact table.
What is a fact table?
A fact table contains the basic information that you wish to summarize. It is the table that stores the detailed values for the measures. Put simply, it is the table that contains the metrics used to analyse the business.
It consists of two sections:
1) Foreign keys to the dimensions
2) Measures/facts (numerical values used to monitor business activity)
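A small sketch of the two sections, using Python's built-in `sqlite3` in place of a real warehouse. The `DimProduct` and `FactSales` tables, their columns, and the sample figures are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE FactSales (
        ProductKey INTEGER REFERENCES DimProduct(ProductKey),  -- foreign key to the dimension
        SalesAmount REAL                                        -- measure
    );
    INSERT INTO DimProduct VALUES (1, 'Road Bike'), (2, 'Helmet');
    INSERT INTO FactSales VALUES (1, 500.0), (1, 700.0), (2, 40.0);
""")

# Summarize the measure by an attribute of the dimension.
totals = conn.execute("""
    SELECT d.ProductName, SUM(f.SalesAmount)
    FROM FactSales AS f
    JOIN DimProduct AS d ON d.ProductKey = f.ProductKey
    GROUP BY d.ProductName
    ORDER BY d.ProductName
""").fetchall()
print(totals)  # [('Helmet', 40.0), ('Road Bike', 1200.0)]
```

The fact table carries only the key and the number; the readable product name comes from the dimension table through the join.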
What is a factless fact table?
This is a very important interview question. A factless fact table is similar to a fact table except that it has no measures; it contains only the links to the dimensions. These tables let you track events; in effect, they exist to record events.
Factless fact tables are used for tracking a process or collecting statistics. They are called factless because they hold no aggregatable numeric values. They contain only key values referencing the dimensions, from which the statistics can be collected.
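For example, class attendance is a typical event with nothing to sum. The sketch below, again using Python's built-in `sqlite3`, assumes hypothetical `DimStudent`, `DimClass`, and `FactAttendance` tables; the statistics come from counting rows, not from adding up a measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DimStudent (StudentKey INTEGER PRIMARY KEY, StudentName TEXT);
    CREATE TABLE DimClass (ClassKey INTEGER PRIMARY KEY, ClassName TEXT);
    -- Factless fact table: only dimension keys, no measure column.
    CREATE TABLE FactAttendance (StudentKey INTEGER, ClassKey INTEGER);
    INSERT INTO DimStudent VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO DimClass VALUES (10, 'Maths'), (20, 'Physics');
    INSERT INTO FactAttendance VALUES (1, 10), (2, 10), (1, 20);
""")

# Each row records one event, so the statistic is a row count per class.
attendance = conn.execute("""
    SELECT c.ClassName, COUNT(*)
    FROM FactAttendance AS f
    JOIN DimClass AS c ON c.ClassKey = f.ClassKey
    GROUP BY c.ClassName
    ORDER BY c.ClassName
""").fetchall()
print(attendance)  # [('Maths', 2), ('Physics', 1)]
```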
What are star, snowflake, and starflake schemas?
Star schema: In a star schema, a central fact table is linked directly to all the dimension tables. The dimensions are denormalized, with each dimension represented by a single table.
Snowflake: The snowflake schema is an extension of the star schema in which each point of the star explodes into more points. In a star schema, each dimension is represented by a single dimension table, whereas in a snowflake schema that dimension table is normalized into multiple lookup tables, each representing a level in the dimensional hierarchy. Some dimension tables are linked directly to the fact table, while others are reached through intermediate dimension tables.
Starflake: A hybrid structure that contains a mixture of star (denormalized) and snowflake (normalized) schemas.
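The difference shows up in how many joins a query needs. The sketch below models the same product dimension both ways with Python's built-in `sqlite3`; all table names, column names, and figures are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star: one denormalized dimension table.
    CREATE TABLE DimProductStar (
        ProductKey INTEGER PRIMARY KEY, ProductName TEXT, CategoryName TEXT
    );
    -- Snowflake: the same dimension normalized into two tables.
    CREATE TABLE DimCategory (CategoryKey INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE DimProductSnow (
        ProductKey INTEGER PRIMARY KEY, ProductName TEXT, CategoryKey INTEGER
    );
    CREATE TABLE FactSales (ProductKey INTEGER, SalesAmount REAL);

    INSERT INTO DimProductStar VALUES (1, 'Road Bike', 'Bikes'), (2, 'Helmet', 'Accessories');
    INSERT INTO DimCategory VALUES (100, 'Bikes'), (200, 'Accessories');
    INSERT INTO DimProductSnow VALUES (1, 'Road Bike', 100), (2, 'Helmet', 200);
    INSERT INTO FactSales VALUES (1, 500.0), (2, 40.0), (1, 700.0);
""")

# Star: the fact table joins straight to the single dimension table.
star = conn.execute("""
    SELECT d.CategoryName, SUM(f.SalesAmount)
    FROM FactSales AS f
    JOIN DimProductStar AS d ON d.ProductKey = f.ProductKey
    GROUP BY d.CategoryName
    ORDER BY d.CategoryName
""").fetchall()

# Snowflake: the category is reached through an intermediate dimension table.
snowflake = conn.execute("""
    SELECT c.CategoryName, SUM(f.SalesAmount)
    FROM FactSales AS f
    JOIN DimProductSnow AS p ON p.ProductKey = f.ProductKey
    JOIN DimCategory AS c ON c.CategoryKey = p.CategoryKey
    GROUP BY c.CategoryName
    ORDER BY c.CategoryName
""").fetchall()

print(star)               # [('Accessories', 40.0), ('Bikes', 1200.0)]
print(star == snowflake)  # True
```

Both layouts return the same totals; the snowflake version simply pays for its normalization with one extra join.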
What is the use of AttributeHierarchyEnabled?
AttributeHierarchyEnabled: Determines whether an attribute hierarchy is generated by Analysis Services for the attribute. If the attribute hierarchy is not enabled, the attribute cannot be used in a user-defined hierarchy and the attribute hierarchy cannot be referenced in Multidimensional Expressions (MDX) statements.
What is the use of AttributeHierarchyOptimizedState?
AttributeHierarchyOptimizedState: Determines the level of optimization applied to the attribute hierarchy. By default, an attribute hierarchy is FullyOptimized, which means that Analysis Services builds indexes for the attribute hierarchy to improve query performance. The other option, NotOptimized, means that no indexes are built for the attribute hierarchy. NotOptimized is useful when the attribute hierarchy is used for purposes other than querying, such as ordering another attribute, because it avoids building indexes that would never be used.
What is the use of AttributeHierarchyOrdered?
AttributeHierarchyOrdered: Determines whether the associated attribute hierarchy is ordered. The default value is True. However, if an attribute hierarchy will not be used for querying, you can save processing time by changing the value of this property to False.
What is the use of AttributeHierarchyVisible?
AttributeHierarchyVisible: Determines whether the attribute hierarchy is visible to client applications. The default value is True. Setting it to False hides the hierarchy from client tools, but the hierarchy is still built and can still be referenced in MDX. To skip building the hierarchy altogether, set AttributeHierarchyEnabled = False instead.
What are the types of storage modes?
There are three standard storage modes in OLAP applications:
MOLAP (Multidimensional OLAP)
ROLAP (Relational OLAP)
HOLAP (Hybrid OLAP)