SSIS vs ADF vs Databricks

For data engineering workloads within the Microsoft landscape, there are multiple options for extracting data from a myriad of data sources. Currently three options are available:

  • SQL Server Integration Services (SSIS): Part of the Microsoft SQL Server suite, SSIS is a well-known, popular ETL tool for data integration with rich built-in transformations such as aggregations, splits and joins. It was introduced in 2005, mainly for on-premises use; SSIS packages can now run in the cloud as well.
  • Azure Data Factory (ADF): Unlike SSIS, ADF is an ELT tool as well as a data orchestration tool for building pipelines that move data across different layers, from on-premises to the cloud and within the cloud landscape. Its focus is movement and orchestration rather than transformations.
    • Data movement & Orchestration
    • Extract, Load & Transform
    • Transformation activities.
    • People familiar with SSIS can use ADF, and existing SSIS packages can also be migrated.

Azure Data Factory – Image Courtesy Microsoft
  • Azure Databricks: Azure Databricks is the latest entry in this space. Unlike SSIS and ADF, which are primarily Extract Transform Load (ETL), Extract Load Transform (ELT) and data orchestration tools, Azure Databricks can handle both data engineering and data science workloads.
Azure Databricks – Image Courtesy Microsoft

Although a few basic factors, such as the volume, variety and velocity of data, play a vital role in which option(s) to select, the following table lists a few differences in the features of each option.

Features               SSIS                                  ADF                                   Azure Databricks
Volume of Data         Medium                                High                                  High
Variety of Data        Structured data                       Structured & unstructured data        Structured & unstructured data
Velocity of Data       Batch                                 Batch, streaming & real-time          Batch, streaming & real-time
Development Tools      SQL Server Data Tools (SSDT)          Web browser                           Web browser
Development Interface  Drag & drop interface                 Drag & drop interface and PowerShell  Code
Development Languages  VB/C#/BIML                            .NET/Python/PowerShell                SQL/Python/R & Scala
Platform               On-premises, own hardware, scale out  Hybrid, managed, scale up             Cloud, managed, auto scale
Pricing                Licensed                              Pay as you go                         Pay as you go
Purpose                Integration, transformation, ETL      Movement, orchestration, ETL/ELT      Preparation, collaboration, AI/ML
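
To make the volume, variety and velocity comparison concrete, the first three rows of the table can be encoded as a small lookup. This is only an illustrative sketch: the `TOOLS` dictionary, the `VOLUME_RANK` ordering and the `candidate_tools` helper are hypothetical names, and the values are simply copied from the table rather than taken from any official sizing guidance.

```python
# Hypothetical encoding of the first three table rows (volume, variety, velocity).
TOOLS = {
    "SSIS": {
        "volume": "medium",
        "variety": {"structured"},
        "velocity": {"batch"},
    },
    "ADF": {
        "volume": "high",
        "variety": {"structured", "unstructured"},
        "velocity": {"batch", "streaming", "real-time"},
    },
    "Azure Databricks": {
        "volume": "high",
        "variety": {"structured", "unstructured"},
        "velocity": {"batch", "streaming", "real-time"},
    },
}

# Assumption: a tool rated "high" also covers "medium" volumes.
VOLUME_RANK = {"medium": 1, "high": 2}


def candidate_tools(volume, variety, velocity):
    """Return the tools whose table row covers all three requirements."""
    matches = []
    for name, row in TOOLS.items():
        if (
            VOLUME_RANK[row["volume"]] >= VOLUME_RANK[volume]
            and variety in row["variety"]
            and velocity in row["velocity"]
        ):
            matches.append(name)
    return matches


print(candidate_tools("medium", "structured", "batch"))
print(candidate_tools("high", "unstructured", "streaming"))
```

A medium-volume, structured, batch workload matches all three tools, so the remaining rows (platform, pricing, development interface) would be the deciding factors in that case.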

In a nutshell, although you can compare and contrast these tools, they actually complement each other. For example, you can call existing SSIS packages from Azure Data Factory and trigger Azure Databricks notebooks from Azure Data Factory.
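
As a rough sketch of how the three tools can be combined, the snippet below builds an ADF-style pipeline definition as a Python dictionary: one activity runs an existing SSIS package, and a second activity triggers a Databricks notebook once the first succeeds. The activity types and field names reflect my understanding of the ADF pipeline JSON format and should be checked against the official schema before use. The pipeline name, the integration runtime `MyAzureSsisIR`, the linked service `MyDatabricksLinkedService` and both paths are hypothetical placeholders.

```python
import json


def build_pipeline(ssis_package_path, notebook_path):
    """Build an ADF-style pipeline definition (a sketch, not a validated schema)."""
    ssis_activity = {
        "name": "RunExistingSsisPackage",
        "type": "ExecuteSSISPackage",
        "typeProperties": {
            "packageLocation": {"packagePath": ssis_package_path},
            # Hypothetical Azure-SSIS integration runtime name.
            "connectVia": {
                "referenceName": "MyAzureSsisIR",
                "type": "IntegrationRuntimeReference",
            },
        },
    }
    notebook_activity = {
        "name": "RunDatabricksNotebook",
        "type": "DatabricksNotebook",
        # Run the notebook only after the SSIS package has succeeded.
        "dependsOn": [
            {
                "activity": "RunExistingSsisPackage",
                "dependencyConditions": ["Succeeded"],
            }
        ],
        # Hypothetical linked service pointing at a Databricks workspace.
        "linkedServiceName": {
            "referenceName": "MyDatabricksLinkedService",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {"notebookPath": notebook_path},
    }
    return {
        "name": "SsisThenNotebook",
        "properties": {"activities": [ssis_activity, notebook_activity]},
    }


pipeline = build_pipeline("MyFolder/MyProject/MyPackage.dtsx", "/Shared/transform")
print(json.dumps(pipeline, indent=2))
```

Here ADF only orchestrates: the existing SSIS package and the Databricks notebook each do their own work, which is the sense in which the tools complement one another.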

In the next post in this series, we will look into various usage scenarios and design considerations for choosing among SSIS, ADF and Azure Databricks.
