For data engineering workloads in the Microsoft landscape, there are multiple options for extracting data from a myriad of data sources. Three options are currently available:
- SQL Server Integration Services (SSIS): Part of the Microsoft SQL Server suite, SSIS is a well-known, popular ETL tool for data integration with rich built-in transformations such as aggregations, splits, and joins. It was introduced in 2005 and was designed mainly for on-premises use, though packages can now be run in the cloud as well.
- Azure Data Factory (ADF): Unlike SSIS, ADF is an ELT tool as well as a data orchestration tool for building pipelines that move data across different layers, from on-premises to the cloud and within the cloud landscape. Its focus is data movement and orchestration rather than transformation.
  - Data movement & orchestration
  - Extract, Load & Transform
  - Transformation activities
  - People familiar with SSIS can use it, and existing SSIS packages can also be migrated.
- Azure Databricks: Azure Databricks is the latest entry in this space. Unlike SSIS and ADF, which are primarily Extract Transform Load (ETL), Extract Load Transform (ELT), and data orchestration tools, Azure Databricks can handle both data engineering and data science workloads.
Although a few basic factors, such as the volume, variety, and velocity of data, play a vital role in selecting among these options, the following table lists some differences in features between them.
|Features||SSIS||ADF||Azure Databricks|
|Volume of Data||Medium||High||High|
|Variety of Data||Structured||Structured & …||Structured & Unstructured Data|
|Velocity of Data||Batch||Batch, …||…|
|Development Tools||SQL Server Development Tools||Web Browser||Web Browser|
|Development Interface||Drag & Drop Interface||Drag & Drop …||…|
|Development Languages||VB/C#/BIML||.NET/Python/PowerShell||SQL/Python/R & Scala|
|Managed, Scale Up|
|Pricing||Licensed||Pay as you go||Pay as you go|
|Preparation Collaboration AI/ML|
In a nutshell, although you can compare and contrast these tools, they actually complement each other. For example, you can call existing SSIS packages from Azure Data Factory, and you can trigger Azure Databricks notebooks from Azure Data Factory.
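To make that complementary relationship concrete, here is a minimal sketch of what an ADF pipeline definition chaining the two might look like. It only builds the pipeline JSON in Python and does not deploy anything. The activity types `ExecuteSSISPackage` and `DatabricksNotebook` are ADF activity types, but every name used here (the pipeline name, the integration runtime `MyAzureSsisIR`, the linked service `MyDatabricksLinkedService`, the package path, and the notebook path) is a hypothetical placeholder, and a real definition would likely need further properties.

```python
import json


def ssis_package_activity(name, integration_runtime, package_path):
    """Build an activity that runs an existing SSIS package on an Azure-SSIS integration runtime."""
    return {
        "name": name,
        "type": "ExecuteSSISPackage",
        "typeProperties": {
            "connectVia": {
                "referenceName": integration_runtime,  # hypothetical runtime name
                "type": "IntegrationRuntimeReference",
            },
            "packageLocation": {"type": "SSISDB", "packagePath": package_path},
        },
    }


def databricks_notebook_activity(name, linked_service, notebook_path, run_after=None):
    """Build an activity that triggers a Databricks notebook, optionally after another activity succeeds."""
    activity = {
        "name": name,
        "type": "DatabricksNotebook",
        "linkedServiceName": {
            "referenceName": linked_service,  # hypothetical linked service name
            "type": "LinkedServiceReference",
        },
        "typeProperties": {"notebookPath": notebook_path},
    }
    if run_after:
        # Run only once the named upstream activity has succeeded.
        activity["dependsOn"] = [
            {"activity": run_after, "dependencyConditions": ["Succeeded"]}
        ]
    return activity


def build_pipeline():
    """Assemble a two-step pipeline: SSIS package first, then the notebook."""
    load = ssis_package_activity(
        "RunExistingSsisPackage", "MyAzureSsisIR", "Folder/Project/LoadSales.dtsx"
    )
    transform = databricks_notebook_activity(
        "RunDatabricksNotebook",
        "MyDatabricksLinkedService",
        "/Shared/transform_sales",
        run_after=load["name"],
    )
    return {
        "name": "SsisThenDatabricks",
        "properties": {"activities": [load, transform]},
    }


pipeline = build_pipeline()
print(json.dumps(pipeline, indent=2))
```

In this sketch, ADF acts purely as the orchestrator: SSIS does the package-based ETL it already does, Databricks does the notebook-based processing, and the `dependsOn` entry is what sequences the two.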
In the next post in this series, we will look into various usage scenarios and design considerations for choosing among SSIS, ADF, and Azure Databricks.