What are the types of processing and explain each?
They are 6 types of processing in ssas ,they are
Process Full
Process Data
Process Index
Process Incremental
Process Structure
UnProcess
Process Full: Processes an Analysis Services object and all the objects that it contains. When Process Full is executed against an object that has already been processed, Analysis Services drops all data in the object, and then processes the object. This kind of processing is required when a structural change has been made to an object, for example, when an attribute hierarchy is added, deleted, or renamed. This processing option is supported for cubes, databases, dimensions, measure groups, mining models, mining structures, and partitions.
Process Data: Processes data only without building aggregations or indexes. If there is data is in the partitions, it will be dropped before re-populating the partition with source data. This processing option is supported for dimensions, cubes, measure groups, and partitions.
Process Index: Creates or rebuilds indexes and aggregations for all processed partitions. This option causes an error on unprocessed objects. This processing option is supported for cubes, dimensions, measure groups, and partitions.
Process Increment: Adds newly available fact data and process only to the relevant partitions. This processing option is supported for measure groups, and partitions.
Process Structure: If the cube is unprocessed, Analysis Services will process, if it is necessary, all the cube’s dimensions. After that, Analysis Services will create only cube definitions. If this option is applied to a mining structure, it populates the mining structure with source data. The difference between this option and the Process Full option is that this option does not iterate the processing down to the mining models themselves. This processing option is supported for cubes and mining structures.
Unprocess : Drops the data in the object specified and any lower-level constituent objects. After the data is dropped, it is not reloaded. This processing option is supported for cubes, databases, dimensions, measure groups, mining models, mining structures, and partitions.
Process Default: Detects the process state of an object, and performs processing necessary to deliver unprocessed or partially processed objects to a fully processed state. This processing option is supported for cubes, databases, dimensions, measure groups, mining models, mining structures, and partitions.
What is a cube?
The basic unit of storage and analysis in Analysis Services is the cube. A cube is a collection of data that’s been aggregated to allow queries to return data quickly.
For example, a cube of order data might be aggregated by time period and by title, making the cube fast when you ask questions concerning orders by week or orders by title.
What is AMO?
The full form of AMO is Analysis Managament Objects. This is used to create or alter cubes from .NET code.
After creating the cube, if we added a new column to the OLTP table then how you add this new attribute to the cube?
Just open the datasourceview and on right click we find the option REFRESH. Click the REFRESH then it will add new attributes to the table which can be added to Cube.
REAL TIME INTERVIEW QUESTIONS –
What is the size of the Cube in your last Project?
Answer to this question varies from project to project and mainly depends on how BIG is your database and how COMPLEX the database design is. Generally for the database with a TRANSACTION TABLE of 50 crore records, the cube size will be around 100GB. So, better go with 100GB as answer to this question.
What is size of the database in your last Project?
You can expect this question immediately after you answer 100GB to the last question. The database size will be 600 to 800GB for which the cube will come to 100 GB. So go with 800GB for this question.
What is size of the fact(Transaction) table in your last Project?
This will be the next question if you answer 800GB as your dataabase size. Here he is not expecting SIZE in GBs but the interviewer will be expecting NUMBER OF ROWS in the Transaction table. Go with 57Crore records for this question.
How frequently you process the cube?
You have to be very careful here. Frequency of processing cube depends on HOW FREQUENTLY YOU ARE GETTING NEW DATA. Once the new data comes then SSIS team loads it and send a mail to SSAS team after load is completed successfully. Once SSAS team receives the mail then these guys will look for best time to PROCESS.
Typically we get data either Weekly or Monthly. So you can say that the processing of the cube will be done either Weekly or monthly.
How frequently you get DATA from clients?
This answer should be based on your last answer. IF you answered WEEKLY to last question then the Answer to this question also should be WEEKLY. IF MONTHLY for last question then this answer also should be MONTHLY.
What type of Processing Options you used to process the cube in your Project?
This is the toughest question to answer. This depends on DATA you have and CLIENTS requirements. Let me explain here.
If the database is SMALL, let’s say it has only 1 crore records then people do FULL PROCESS as it wont take much time.
If the database is MEDIUM, let’s say it has only 15 crore records then people prefer to do INCREMENTAL PROCESS unless CLIENTS ask us to do FULL PROCESS as it takes little bit of time.
If the database is HUGE, let’s say it has more than 35 to 40 crore records then people prefer to do INCREMENTAL PROCESS unless CLIENTS ask us to do FULL PROCESS as it takes lot of time. In this case we TRY to convince clients for INCREMENTAL and if they don’t agree then we don’t have any other option.
Incremental process will come into picture ONLY when there is no updates to the OLD data i.e no changes to already existing data else NO OTHER OPTION than FULL PROCESS.
What is use of IsAggregatable property?
In Analysis Service we generally see all dimension has All member. This is because of IsAggregatable property of the attribute. You can set its value to false, so that it will not show All member. Its default member for that attribute. If you hide this member than you will have to set other attribute value to default member else it will pick some value as default and this will create confusion in browsing data if someone is not known to change in default member.
What are key, name and value columns of an attribute?
Key column of any attribute: Contains the column or columns that represent the key for the attribute, which is the column in the underlying relational table in the data source view to which the attribute is bound. The value of this column for each member is displayed to users unless a value is specified for the NameColumn property.
Name column of an attribute: Identifies the column that provides the name of the attribute that is displayed to users, instead of the value in the key column for the attribute. This column is used when the key column value for an attribute member is cryptic or not otherwise useful to the user, or when the key column is based on a composite key. The NameColumn property is not used in parent-child hierarchies; instead, the NameColumn property for child members is used as the member names in a parent-child hierarchy.
Value columns of an attribute: Identifies the column that provides the value of the attribute. If the NameColumn element of the attribute is specified, the same DataItem values are used as default values for the ValueColumn element. If the NameColumn element of the attribute is not specified and the KeyColumns collection of the attribute contains a single KeyColumn element representing a key column with a string data type, the same DataItem values are used as default values for the ValueColumn element.
What is hierarchy, what are its types and difference between them?
A hierarchy is a very important part of any OLAP engine and allows users to drill down from summary levels hierarchies represent the way user expect to explore data at more detailed level
hierarchies is made up of multipule levels creating the structure based on end user requirements.
->years->quarter->month->week ,are all the levels of calender hierarchy
They are 2 types of hierarchies they are
Natural hierarchy
Unnatural hierarchy
Natural hierarchy: This means that the attributes are intuitively related to one another. There is a clear relationship from the top of the hierarchy to the bottom.
Example: An example of this would be date: year, quarter and month follow from each other, and in part, define each other.
Unnatural hierarchy: This means that the attributes are not clearly related.
Example: An example of this might be geography; we may have country -> state -> city, but it is not clear where Province might sit.
What is Attribute hierarchy?
An attribute hierarchy is created for every attribute in a dimension, and each hierarchy is available for dimensioning fact data. This hierarchy consists of an “All” level and a detail level containing all members of the hierarchy.
you can organize attributes into user-defined hierarchies to provide navigation paths in a cube. Under certain circumstances, you may want to disable or hide some attributes and their hierarchies.
What is use of AttributeHierarchyDisplayFolder property ?
AttributeHierarchyDisplayFolder: Identifies the folder in which to display the associated attribute hierarchy to end users. For example if I set the property value as “Test” to all the Attributes of a dimension then a folder with the name “Test” will be created and all the Attributes will be placed into the same.
What Is Factless Fact Table?
This is very important interview . The “Factless Fact Table” is a table which is similar to Fact Table except for having any measure; I mean that this table just has the links to the dimensions. These tables enable you to track events; indeed they are for recording events.
Factless fact tables are used for tracking a process or collecting stats. They are called so because, the fact table does not have aggregatable numeric values or information. They are mere key values with reference to the dimensions from which the stats can be collected
What Is Fact Table?
A fact table contains the basic information that you wish to summarize. The table that stores the detailed value for measure is called fact table. In simple and best we can define as “The table which contains METRICS” that are used to analyse the business.
It consists of 2 sections
Foregine key to the dimesion
measures/facts(a numerical value that used to monitor business activity)
What Is Dimension Table?
A dimension table contains hierarchical data by which you’d like to summarize. A dimension table contains specific business information, a dimension table that contains the specific name of each member of the dimension. The name of the dimension member is called an “attribute”
The key attribute in the dimension must contain a unique value for each member of the dimension. This key attribute is called “primary key column”
The primary key column of each dimension table corresponding to the one of the key column in any related fact table.
How Will You Add A New Column To An Existing Table In Data Source View?
By using named calculations we can add a new column to an existing table in the data source view.
Why We Need Named Queries?
A named query is used to join multiple tables, to remove unnecessary columns from a table of a database. You can achieve the same in database using Views but this Named Queries will be the best bet whe you don’t have access to create Views in database.
What Is Named Query?
Named query in DSV is similar to View in Database. This is used to create Virtual table in DSV which will not impact the underlying database. Named query is mainly used to merge the two or more table in the datasource view or to filter columns of a table.
How Many Types Of Dimensions Are There And What Are They?
They are 3 types of dimensions
confirm dimension
junk dimension
degenerate attribute
How Many Types Of Attribute Relationships Are There?
They are 2 types of attribute relationships they are
Rigid
Flexible
Rigid: In Rigid relationships where the relationship between the attributes is fixed, attributes will not change levels or their respective attribute relationships.
Example: The time dimension. We know that month “January 2009” will ONLY belong to Year “2009” and it wont be moved to any other year.
Flexible : In Flexible relationship between the attributes is changed.
Example: An employee and department. An employee can be in accounts department today but it is possible that the employee will be in Marketing department tomorrow.
What Is Attribute Relationships, Why We Need It?
Attribute relationships are the way of telling the analysis service engine that how the attributes are related with each other. It will help to relate two or more attributes to each other.Processing time will be decreased if proper relationships are given. This increases the Cube Processing performance and MDX query performance too.
In Microsoft SQL Server Analysis Services, attributes within a dimension are always related either directly or indirectly to the key attribute. When you define a dimension based on a star schema, which is where all dimension attributes are derived from the same relational table, an attribute relationship is automatically defined between the key attribute and each non-key attribute of the dimension.
When you define a dimension based on a snowflake schema, which is where dimension attributes are derived from multiple related tables, an attribute relationship is automatically defined as follows
Between the key attribute and each non-key attribute bound to columns in the main dimension table.
Between the key attribute and the attribute bound to the foreign key in the secondary table that links the underlying dimension tables.
Between the attribute bound to foreign key in the secondary table and each non-key attribute bound to columns from the secondary table.
They are 6 types of processing in ssas ,they are
Process Full
Process Data
Process Index
Process Incremental
Process Structure
UnProcess
Process Full: Processes an Analysis Services object and all the objects that it contains. When Process Full is executed against an object that has already been processed, Analysis Services drops all data in the object, and then processes the object. This kind of processing is required when a structural change has been made to an object, for example, when an attribute hierarchy is added, deleted, or renamed. This processing option is supported for cubes, databases, dimensions, measure groups, mining models, mining structures, and partitions.
Process Data: Processes data only without building aggregations or indexes. If there is data is in the partitions, it will be dropped before re-populating the partition with source data. This processing option is supported for dimensions, cubes, measure groups, and partitions.
Process Index: Creates or rebuilds indexes and aggregations for all processed partitions. This option causes an error on unprocessed objects. This processing option is supported for cubes, dimensions, measure groups, and partitions.
Process Increment: Adds newly available fact data and process only to the relevant partitions. This processing option is supported for measure groups, and partitions.
Process Structure: If the cube is unprocessed, Analysis Services will process, if it is necessary, all the cube’s dimensions. After that, Analysis Services will create only cube definitions. If this option is applied to a mining structure, it populates the mining structure with source data. The difference between this option and the Process Full option is that this option does not iterate the processing down to the mining models themselves. This processing option is supported for cubes and mining structures.
Unprocess : Drops the data in the object specified and any lower-level constituent objects. After the data is dropped, it is not reloaded. This processing option is supported for cubes, databases, dimensions, measure groups, mining models, mining structures, and partitions.
Process Default: Detects the process state of an object, and performs processing necessary to deliver unprocessed or partially processed objects to a fully processed state. This processing option is supported for cubes, databases, dimensions, measure groups, mining models, mining structures, and partitions.
Tech Mahindra Frequently Asked SSAS Interview Questions Answers |
What is a cube?
The basic unit of storage and analysis in Analysis Services is the cube. A cube is a collection of data that’s been aggregated to allow queries to return data quickly.
For example, a cube of order data might be aggregated by time period and by title, making the cube fast when you ask questions concerning orders by week or orders by title.
What is AMO?
The full form of AMO is Analysis Managament Objects. This is used to create or alter cubes from .NET code.
After creating the cube, if we added a new column to the OLTP table then how you add this new attribute to the cube?
Just open the datasourceview and on right click we find the option REFRESH. Click the REFRESH then it will add new attributes to the table which can be added to Cube.
REAL TIME INTERVIEW QUESTIONS –
What is the size of the Cube in your last Project?
Answer to this question varies from project to project and mainly depends on how BIG is your database and how COMPLEX the database design is. Generally for the database with a TRANSACTION TABLE of 50 crore records, the cube size will be around 100GB. So, better go with 100GB as answer to this question.
What is size of the database in your last Project?
You can expect this question immediately after you answer 100GB to the last question. The database size will be 600 to 800GB for which the cube will come to 100 GB. So go with 800GB for this question.
What is size of the fact(Transaction) table in your last Project?
This will be the next question if you answer 800GB as your dataabase size. Here he is not expecting SIZE in GBs but the interviewer will be expecting NUMBER OF ROWS in the Transaction table. Go with 57Crore records for this question.
How frequently you process the cube?
You have to be very careful here. Frequency of processing cube depends on HOW FREQUENTLY YOU ARE GETTING NEW DATA. Once the new data comes then SSIS team loads it and send a mail to SSAS team after load is completed successfully. Once SSAS team receives the mail then these guys will look for best time to PROCESS.
Typically we get data either Weekly or Monthly. So you can say that the processing of the cube will be done either Weekly or monthly.
How frequently you get DATA from clients?
This answer should be based on your last answer. IF you answered WEEKLY to last question then the Answer to this question also should be WEEKLY. IF MONTHLY for last question then this answer also should be MONTHLY.
What type of Processing Options you used to process the cube in your Project?
This is the toughest question to answer. This depends on DATA you have and CLIENTS requirements. Let me explain here.
If the database is SMALL, let’s say it has only 1 crore records then people do FULL PROCESS as it wont take much time.
If the database is MEDIUM, let’s say it has only 15 crore records then people prefer to do INCREMENTAL PROCESS unless CLIENTS ask us to do FULL PROCESS as it takes little bit of time.
If the database is HUGE, let’s say it has more than 35 to 40 crore records then people prefer to do INCREMENTAL PROCESS unless CLIENTS ask us to do FULL PROCESS as it takes lot of time. In this case we TRY to convince clients for INCREMENTAL and if they don’t agree then we don’t have any other option.
Incremental process will come into picture ONLY when there is no updates to the OLD data i.e no changes to already existing data else NO OTHER OPTION than FULL PROCESS.
What is use of IsAggregatable property?
In Analysis Service we generally see all dimension has All member. This is because of IsAggregatable property of the attribute. You can set its value to false, so that it will not show All member. Its default member for that attribute. If you hide this member than you will have to set other attribute value to default member else it will pick some value as default and this will create confusion in browsing data if someone is not known to change in default member.
What are key, name and value columns of an attribute?
Key column of any attribute: Contains the column or columns that represent the key for the attribute, which is the column in the underlying relational table in the data source view to which the attribute is bound. The value of this column for each member is displayed to users unless a value is specified for the NameColumn property.
Name column of an attribute: Identifies the column that provides the name of the attribute that is displayed to users, instead of the value in the key column for the attribute. This column is used when the key column value for an attribute member is cryptic or not otherwise useful to the user, or when the key column is based on a composite key. The NameColumn property is not used in parent-child hierarchies; instead, the NameColumn property for child members is used as the member names in a parent-child hierarchy.
Value columns of an attribute: Identifies the column that provides the value of the attribute. If the NameColumn element of the attribute is specified, the same DataItem values are used as default values for the ValueColumn element. If the NameColumn element of the attribute is not specified and the KeyColumns collection of the attribute contains a single KeyColumn element representing a key column with a string data type, the same DataItem values are used as default values for the ValueColumn element.
What is hierarchy, what are its types and difference between them?
A hierarchy is a very important part of any OLAP engine and allows users to drill down from summary levels hierarchies represent the way user expect to explore data at more detailed level
hierarchies is made up of multipule levels creating the structure based on end user requirements.
->years->quarter->month->week ,are all the levels of calender hierarchy
They are 2 types of hierarchies they are
Natural hierarchy
Unnatural hierarchy
Natural hierarchy: This means that the attributes are intuitively related to one another. There is a clear relationship from the top of the hierarchy to the bottom.
Example: An example of this would be date: year, quarter and month follow from each other, and in part, define each other.
Unnatural hierarchy: This means that the attributes are not clearly related.
Example: An example of this might be geography; we may have country -> state -> city, but it is not clear where Province might sit.
What is Attribute hierarchy?
An attribute hierarchy is created for every attribute in a dimension, and each hierarchy is available for dimensioning fact data. This hierarchy consists of an “All” level and a detail level containing all members of the hierarchy.
you can organize attributes into user-defined hierarchies to provide navigation paths in a cube. Under certain circumstances, you may want to disable or hide some attributes and their hierarchies.
What is use of AttributeHierarchyDisplayFolder property ?
AttributeHierarchyDisplayFolder: Identifies the folder in which to display the associated attribute hierarchy to end users. For example if I set the property value as “Test” to all the Attributes of a dimension then a folder with the name “Test” will be created and all the Attributes will be placed into the same.
This is very important interview . The “Factless Fact Table” is a table which is similar to Fact Table except for having any measure; I mean that this table just has the links to the dimensions. These tables enable you to track events; indeed they are for recording events.
Factless fact tables are used for tracking a process or collecting stats. They are called so because, the fact table does not have aggregatable numeric values or information. They are mere key values with reference to the dimensions from which the stats can be collected
What Is Fact Table?
A fact table contains the basic information that you wish to summarize. The table that stores the detailed value for measure is called fact table. In simple and best we can define as “The table which contains METRICS” that are used to analyse the business.
It consists of 2 sections
Foregine key to the dimesion
measures/facts(a numerical value that used to monitor business activity)
What Is Dimension Table?
A dimension table contains hierarchical data by which you’d like to summarize. A dimension table contains specific business information, a dimension table that contains the specific name of each member of the dimension. The name of the dimension member is called an “attribute”
The key attribute in the dimension must contain a unique value for each member of the dimension. This key attribute is called “primary key column”
The primary key column of each dimension table corresponding to the one of the key column in any related fact table.
How Will You Add A New Column To An Existing Table In Data Source View?
By using named calculations we can add a new column to an existing table in the data source view.
Why We Need Named Queries?
A named query is used to join multiple tables, to remove unnecessary columns from a table of a database. You can achieve the same in database using Views but this Named Queries will be the best bet whe you don’t have access to create Views in database.
What Is Named Query?
Named query in DSV is similar to View in Database. This is used to create Virtual table in DSV which will not impact the underlying database. Named query is mainly used to merge the two or more table in the datasource view or to filter columns of a table.
How Many Types Of Dimensions Are There And What Are They?
They are 3 types of dimensions
confirm dimension
junk dimension
degenerate attribute
How Many Types Of Attribute Relationships Are There?
They are 2 types of attribute relationships they are
Rigid
Flexible
Rigid: In Rigid relationships where the relationship between the attributes is fixed, attributes will not change levels or their respective attribute relationships.
Example: The time dimension. We know that month “January 2009” will ONLY belong to Year “2009” and it wont be moved to any other year.
Flexible : In Flexible relationship between the attributes is changed.
Example: An employee and department. An employee can be in accounts department today but it is possible that the employee will be in Marketing department tomorrow.
What Is Attribute Relationships, Why We Need It?
Attribute relationships are the way of telling the analysis service engine that how the attributes are related with each other. It will help to relate two or more attributes to each other.Processing time will be decreased if proper relationships are given. This increases the Cube Processing performance and MDX query performance too.
In Microsoft SQL Server Analysis Services, attributes within a dimension are always related either directly or indirectly to the key attribute. When you define a dimension based on a star schema, which is where all dimension attributes are derived from the same relational table, an attribute relationship is automatically defined between the key attribute and each non-key attribute of the dimension.
When you define a dimension based on a snowflake schema, which is where dimension attributes are derived from multiple related tables, an attribute relationship is automatically defined as follows
Between the key attribute and each non-key attribute bound to columns in the main dimension table.
Between the key attribute and the attribute bound to the foreign key in the secondary table that links the underlying dimension tables.
Between the attribute bound to foreign key in the secondary table and each non-key attribute bound to columns from the secondary table.
Post a Comment