January 28, 2019

Srikaanth

Spotify Technology Frequently Asked SSIS Interview Questions Answers

What is Data Flow tab?

This is the tab where all the work of an ETL job is done. It is the tab in SSIS Designer where we extract data from sources, transform it, and load it into destinations.

What is the function of control flow tab in SSIS?

On the Control Flow tab, we arrange and configure tasks (including the Data Flow task), containers, and the precedence constraints that connect them.

What is the function of Event handlers tab in SSIS?

On the Event handlers tab, workflows can be configured to respond to package events.
For example, we can configure a workflow that runs when any task fails, stops or starts.

What is the function of Package explorer tab in SSIS?

This tab provides an explorer view of the package. You can see what is happening in the package. The Package is a container at the top of the hierarchy.

What is Solution Explorer?

It is a place in SSIS Designer where all the projects, Data Sources, Data Source Views and other miscellaneous files can be viewed and accessed for modification.

How do we convert data type in SSIS?

The Data Conversion Transformation in SSIS converts the data type of an input column to a different data type.
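
The effect of a data conversion can be sketched in plain Python. This is only an analogy, not SSIS code: the row layout, the helper `convert_column` and the column names (`qty`, `qty_int`, `price`, `price_dec`) are assumptions made for illustration. Like the transformation, the sketch adds the converted value as a new column and leaves the input column as it was.

```python
from decimal import Decimal

def convert_column(rows, source, target, caster):
    """Return new rows where `target` holds `source` converted by `caster`."""
    converted = []
    for row in rows:
        new_row = dict(row)                    # the input column stays untouched
        new_row[target] = caster(row[source])  # converted copy goes in a new column
        converted.append(new_row)
    return converted

# Hypothetical input: everything arrives as text, e.g. from a flat file.
rows = [{"qty": "3", "price": "9.50"}, {"qty": "12", "price": "0.25"}]
rows = convert_column(rows, "qty", "qty_int", int)
rows = convert_column(rows, "price", "price_dec", Decimal)
```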

How are variables useful in an SSIS package?

Variables can provide communication among objects in the package. Variables can provide communication between parent and child packages. Variables can also be used in expressions and scripts. This helps in providing dynamic values to tasks.

Explain Aggregate Transformation in SSIS?

It aggregates data, much as T-SQL does with GROUP BY and functions such as MIN, MAX, AVG and COUNT. For example, you can get the total quantity and the total line items for each product. In the Aggregate Transformation Editor, you first choose the input columns, then give each output column a name in the Output Alias column of the grid, and pick an operation for each Output Alias in the Operation column of the same grid. Some of the operations are listed below:

• Group By
• Average
• Count
• Count Distinct: counts the distinct, non-null values in a column
• Min, Max, Sum

The Advanced tab offers some optimization settings: Key Scale (low, medium, high), Count Distinct Scale (low, medium, high), Auto Extend Factor and Warn On Division By Zero. If you check Warn On Division By Zero, the component raises a warning instead of an error. Key Scale sizes the transformation cache for an expected number of keys: low targets about 500,000 keys, medium about 5 million keys, and high more than 25 million keys. Alternatively, you can specify an exact number of keys. The default is unspecified. Count Distinct Scale works the same way for the number of distinct values written to memory, and its default is also unspecified. Auto Extend Factor sets the percentage by which the component's memory can be extended during aggregation. The default is 25%.
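
To make the operations concrete, here is a minimal Python sketch of what grouping by one column and aggregating another produces. It is an analogy only, not SSIS code; the `aggregate` helper and the `product` and `qty` columns are assumptions for illustration, and nulls are modelled as `None`.

```python
from collections import defaultdict

def aggregate(rows, group_by, value):
    """Group rows on `group_by` and compute common aggregates over `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[value])

    result = {}
    for key, values in groups.items():
        non_null = [v for v in values if v is not None]  # nulls are ignored
        has_values = bool(non_null)
        result[key] = {
            "sum": sum(non_null),
            "count": len(non_null),
            "count_distinct": len(set(non_null)),
            "min": min(non_null) if has_values else None,
            "max": max(non_null) if has_values else None,
            "avg": sum(non_null) / len(non_null) if has_values else None,
        }
    return result

# Hypothetical order lines: total quantity and line count per product.
order_lines = [
    {"product": "A", "qty": 2},
    {"product": "A", "qty": 3},
    {"product": "B", "qty": 5},
    {"product": "A", "qty": 2},
]
totals = aggregate(order_lines, "product", "qty")
```

Here `totals["A"]` holds a sum of 7 over 3 line items with 2 distinct quantities, which is the kind of per-product result described above.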

Explain Data Mining query Transformation?

This component runs predictions on the data or fills gaps in it. Some good scenarios for this component are:
1. Take input columns such as number of children, household income and marital status to predict whether someone owns a house.
2. Predict what a customer would buy, based on an analysis of the buying patterns in their shopping cart.
3. Fill in blank data with default values when a customer doesn’t fill in some items in a questionnaire.

Explain Derived column Transformation?

Derived Column creates a new column, or writes the result of an expression over several columns into a new column. You can copy an existing column directly, or build a new column from more than one column.
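
As a rough illustration in Python (not SSIS expression syntax), a derived column is a new field computed from the other fields of each row. The helper `add_derived_column` and the column names `first`, `last` and `full_name` are assumptions.

```python
def add_derived_column(rows, name, expression):
    """Return new rows with an extra column computed by `expression(row)`."""
    return [{**row, name: expression(row)} for row in rows]

# Hypothetical input rows.
people = [
    {"first": "Ada", "last": "Lovelace"},
    {"first": "Alan", "last": "Turing"},
]

# Build a new column from more than one existing column.
people = add_derived_column(
    people, "full_name", lambda r: r["first"] + " " + r["last"]
)
```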

Explain Merge Transformation?

Merge transformation merges two paths into a single path. It is useful when you split data off into a path that handles errors and, after the errors are handled, merge the data back into the main flow, or when you want to merge two data sources. It is similar to the Union All transformation, but Merge has some restrictions:
1. The data must be sorted.
2. Data type, data length and other metadata attributes must match before the merge.
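
The sorted-input requirement is easier to see in code. The following Python sketch (an analogy, not SSIS code) interleaves two inputs that are already sorted on the same key, so the output stays sorted without a re-sort. The helper name `merge_sorted` and the `id` column are assumptions.

```python
import heapq
from operator import itemgetter

def merge_sorted(left, right, key):
    """Interleave two inputs already sorted on `key` into one sorted output."""
    return list(heapq.merge(left, right, key=itemgetter(key)))

# Hypothetical paths: rows that passed validation and rows that were repaired.
clean_rows = [{"id": 1}, {"id": 4}]
repaired_rows = [{"id": 2}, {"id": 3}]

merged = merge_sorted(clean_rows, repaired_rows, "id")
merged_ids = [row["id"] for row in merged]
```

If either input were unsorted, the output order would be wrong, which is why the restriction exists.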

Explain Merge Join Transformation?

Merge Join transformation merges the output of two inputs by performing an INNER or OUTER join on the data. If the data comes from a single OLE DB data source, it is better to join in the SQL query than to use the Merge Join transformation. Merge Join is intended for joining two different data sources.
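
For intuition, here is a small Python sketch of a join over two inputs that are both sorted on the join key. It is an illustration under assumptions, not SSIS code: the `merge_join` helper, the `how` parameter and the column names are hypothetical, only inner and left outer joins are covered, and unmatched left rows simply lack the right-hand columns.

```python
from itertools import groupby
from operator import itemgetter

def merge_join(left, right, key, how="inner"):
    """Join two inputs sorted on `key`; `how` is 'inner' or 'left'."""
    get_key = itemgetter(key)
    right_groups = groupby(right, key=get_key)  # consecutive rows per key
    current_key, current_rows, exhausted = None, [], False

    def advance():
        nonlocal current_key, current_rows, exhausted
        try:
            k, group = next(right_groups)
            current_key, current_rows = k, list(group)
        except StopIteration:
            exhausted = True

    advance()
    output = []
    for row in left:
        k = get_key(row)
        # Skip right-hand keys that are smaller than the current left key.
        while not exhausted and current_key < k:
            advance()
        if not exhausted and current_key == k:
            output.extend({**row, **match} for match in current_rows)
        elif how == "left":
            output.append(dict(row))  # no match: keep the left row only
    return output

# Hypothetical inputs from two different sources, both sorted on cust_id.
orders = [
    {"cust_id": 1, "order": "o1"},
    {"cust_id": 2, "order": "o2"},
    {"cust_id": 4, "order": "o3"},
]
customers = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 3, "name": "Bob"},
    {"cust_id": 4, "name": "Cy"},
]

inner = merge_join(orders, customers, "cust_id")
left_outer = merge_join(orders, customers, "cust_id", how="left")
```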

What is an environment variable in SSIS?

An environment variable configuration sets a package property equal to the value in an environment variable.
Environment variable configurations are useful for configuring properties that depend on the computer that is executing the package.
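
The idea can be sketched in Python as follows. This is not how SSIS stores configurations; the helper `apply_env_configuration`, the property name `ServerName` and the variable name `ETL_SERVER_NAME` are all hypothetical and serve only to show a property taking its value from the machine's environment.

```python
import os

def apply_env_configuration(properties, property_name, env_var):
    """Return a copy of `properties` with one value taken from `env_var`.

    If the environment variable is not set, the design-time value is kept.
    """
    value = os.environ.get(env_var)
    if value is None:
        return dict(properties)
    return {**properties, property_name: value}

# Set here only so the demo is self-contained; normally the machine sets it.
os.environ["ETL_SERVER_NAME"] = "prod-sql-01"

package_props = {"ServerName": "localhost"}  # design-time default
configured = apply_env_configuration(
    package_props, "ServerName", "ETL_SERVER_NAME"
)
```

The same package can then run unchanged on different machines, each supplying its own value.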

How to provide security to packages?

We can provide security in two ways

1. Package encryption
2. Password protection.

What are Precedence constraints?

Precedence constraints link executables, containers and tasks within the package control flow, and specify the conditions that determine the order in which executables run and whether they run at all.
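
A constraint's decision can be modelled in a few lines of Python. This is a sketch under assumptions, not SSIS code: the function `should_run` and the lowercase strings are hypothetical, and the optional expression is combined with the execution result using AND only.

```python
def should_run(constraint, upstream_result, expression=True):
    """Decide whether the downstream executable runs.

    constraint:      'success', 'failure' or 'completion'
    upstream_result: 'success' or 'failure' of the preceding executable
    expression:      optional extra condition, combined with AND in this sketch
    """
    if constraint == "success":
        outcome_ok = upstream_result == "success"
    elif constraint == "failure":
        outcome_ok = upstream_result == "failure"
    elif constraint == "completion":
        outcome_ok = upstream_result in ("success", "failure")
    else:
        raise ValueError(f"unknown constraint: {constraint}")
    return outcome_ok and bool(expression)

# A load step that runs only after a successful extract with rows to load.
row_count = 120
run_load = should_run("success", "success", expression=row_count > 0)

# A notification step that runs only when the extract fails.
run_notify = should_run("failure", "success")
```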

What is Design Time Deployment in SSIS?

When you run a package from within BIDS, it is built and temporarily deployed to a folder. By default the package is deployed to the BIN folder in the package’s project folder, and you can configure a custom folder for deployment. When the package’s execution completes and is stopped in BIDS, the deployed package is deleted. This is called Design Time Deployment.

Explain Multicast Transformation?

This transformation sends its input to multiple output paths unconditionally, unlike Conditional Split. It takes one input, makes copies of the data and passes the same data through every output. Put simply: give it one input and take many outputs of the same data.
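
In Python terms, the behaviour looks roughly like this. It is an analogy rather than SSIS code, and the helper `multicast` and the sample rows are assumptions.

```python
import copy

def multicast(rows, output_count):
    """Return `output_count` independent copies of the same input rows."""
    return [copy.deepcopy(rows) for _ in range(output_count)]

# Hypothetical input, fanned out to three downstream paths.
source_rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
outputs = multicast(source_rows, 3)
```

Every output receives every row. There is no condition deciding which rows go where.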

Explain Percentage and row sampling Transformations?

These transformations take data from the source and sample it randomly. Percentage Sampling selects a percentage of the rows, and Row Sampling selects a fixed number of rows. Each gives you two outputs: the first is the selected data and the second is the unselected data. They are used, for example, when you train a data mining model.
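
Both kinds of sampling can be sketched in Python as below. This illustrates the two-output idea only and does not reproduce the algorithm SSIS uses; the helper names, the `seed` parameter and the sample data are assumptions.

```python
import random

def percentage_sample(rows, percent, seed=None):
    """Send each row to `selected` with roughly `percent` percent probability."""
    rng = random.Random(seed)
    selected, unselected = [], []
    for row in rows:
        target = selected if rng.random() * 100 < percent else unselected
        target.append(row)
    return selected, unselected

def row_sample(rows, count, seed=None):
    """Select exactly `count` rows (or all rows if there are fewer)."""
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(rows)), min(count, len(rows))))
    selected = [row for i, row in enumerate(rows) if i in chosen]
    unselected = [row for i, row in enumerate(rows) if i not in chosen]
    return selected, unselected

# Hypothetical input of 100 rows.
rows = list(range(100))
train, holdout = percentage_sample(rows, percent=30, seed=42)
picked, rest = row_sample(rows, count=10, seed=42)
```

Note that the percentage sample returns approximately, not exactly, the requested share of rows, while the row sample returns an exact count.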

Explain Sort Transformation?

This component sorts data, similar to the T-SQL ORDER BY clause. Some transformations need sorted data.
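
A Python equivalent of sorting rows on one or more columns is shown here as a sketch. The helper `sort_rows` and the column names are assumptions.

```python
from operator import itemgetter

def sort_rows(rows, by, descending=False):
    """Sort rows on the columns listed in `by`, like ORDER BY col1, col2."""
    return sorted(rows, key=itemgetter(*by), reverse=descending)

# Hypothetical input rows.
rows = [
    {"region": "West", "id": 2},
    {"region": "East", "id": 3},
    {"region": "East", "id": 1},
]
ordered = sort_rows(rows, by=["region", "id"])
ordered_ids = [row["id"] for row in ordered]
```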

Explain Union All Transformation?

It combines inputs like the Merge transformation but without its restrictions: the inputs need not be sorted, and it can take more than two input paths and combine them into a single output path.
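
In Python terms, this amounts to concatenating any number of inputs, as in the following sketch. It is an analogy only, and the helper `union_all` and the sample inputs are assumptions.

```python
from itertools import chain

def union_all(*inputs):
    """Combine any number of inputs into one output; no sorting is required."""
    return list(chain.from_iterable(inputs))

# Three hypothetical, unsorted inputs with the same columns.
north = [{"id": 7}, {"id": 2}]
south = [{"id": 5}]
west = [{"id": 9}, {"id": 1}]

combined = union_all(north, south, west)
combined_ids = [row["id"] for row in combined]
```

The output order simply follows the inputs, so nothing guarantees a sorted result.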

What are the possible locations to save an SSIS package?

You can save a package to any of the following locations:
SQL Server
Package Store
File System

What is a package?

A package is a discrete executable unit of work composed of a collection of control flow and other objects, including data sources, transformations, process sequence and rules, error and event handling, and data destinations.

What is a workflow in SSIS?

A workflow is a set of instructions on how to execute tasks. In SSIS, the workflow is built on the Control Flow tab, where tasks (for example, sending email or running an external command) are arranged and linked with precedence constraints.

What is the difference between control flow items and data flow items?

The control flow is the highest-level control process. It lets you manage the run-time activities of the data flow and other processes within a package.
When you want to extract, transform and load data within a package, you add an SSIS Data Flow task to the package's control flow.

What are the main components of SSIS (project architecture)?

SSIS architecture has 4 main components:

1. SSIS service
2. SSIS runtime engine and runtime executables
3. SSIS data flow engine and data flow components
4. SSIS clients

What are the different components in an SSIS package?

Control flow
Data flow
Event handler
Package explorer

What are Connection Managers?

A connection manager is a bridge between a package object and physical data. It provides a logical representation of a connection at design time. The properties of the connection manager describe the physical connection that Integration Services creates when the package runs.
