Many customers are migrating their existing enterprise data warehouse (EDW) and data workloads to public cloud platforms. The following are a few considerations before moving to a cloud platform:
- Security & Compliance
- Complexity of the overall architecture, etc.
Microsoft Azure offers multiple components, either to set up a greenfield deployment on Azure or to migrate (lift and shift) existing enterprise data workloads onto Azure. The key components for hosting data workloads on Azure are:
- Azure Data Lake
- Azure Blob
- Azure SQL
- Azure Event Hub
- Azure IoT Hub
- Azure Stream Analytics
- Azure SQL Data Warehouse
- Azure SSAS (Tabular)
- Azure Data Catalog
- Azure ML
- Azure Bot Services
- Azure HDInsight
- Power BI
The function of each component is as follows:
- Azure Data Lake: This component primarily acts as a landing area to store data of any shape, size, and format so that data analysts, developers, and data scientists can consume it. It also integrates with Active Directory to secure assets. Data is organized into folders and files, and appropriate access can be set up on each.
It has two components:
- Azure Data Lake Store
- Azure Data Lake Analytics
- Azure Blob: Azure Blob is object storage for data of any size and format. However, it does not have the benefit of integration with Azure AD.
- Azure SQL: Azure SQL Database is a managed PaaS offering from Microsoft. Unlike a traditional SQL Server instance, which you install on a virtual machine and then patch and upgrade yourself, Azure SQL Database is a managed service: Microsoft takes care of patching and updating the database and abstracts away management of the underlying infrastructure. It supports relational, XML, JSON, and spatial data.
- Azure SQL Data Warehouse: Azure SQL Data Warehouse is an elastic-scale data warehouse in the cloud based on a Massively Parallel Processing (MPP) architecture. It leverages MPP to quickly run complex queries across petabytes of data, and can serve as a key component of a big data solution. Data can be imported into SQL DW from multiple data sources using PolyBase, and the power of MPP can then be used to execute analytics workloads.
- Azure SSAS: Analysis Services tabular databases run in-memory or in DirectQuery mode, the latter accessing data directly from backend relational data sources. By using state-of-the-art compression algorithms and a multi-threaded query processor, the analytics engine gives reporting client applications such as Power BI and Excel fast access to tabular model objects and data.
- Azure Data Catalog: Azure Data Catalog enables crowdsourced data catalog management. It helps users register, enrich, discover, understand, and consume the data sources used across the enterprise data warehousing landscape, so that less time is spent searching for the required data before consuming it.
- Azure ML: Machine learning enables computers to learn from data and experiences and to act without being explicitly programmed. Customers can build Artificial Intelligence (AI) applications that intelligently sense, process, and act on information, augmenting human capabilities, increasing speed and efficiency, and helping organizations achieve more.
- Azure Bot Services: Bot Service provides an integrated environment purpose-built for bot development. You can write a bot, connect, test, deploy, and manage it from your web browser with no separate editor or source control required. For simple bots, you may not need to write code at all.
- Azure HDInsight: Azure HDInsight is a fully managed cloud service that makes it easy, fast, and cost-effective to process massive amounts of data. You can use popular open-source frameworks such as Hadoop, Spark, Hive, LLAP, Kafka, Storm, R, and more. Azure HDInsight enables a broad range of scenarios such as ETL, data warehousing, machine learning, and IoT.
- Azure Event Hub: Azure Event Hubs is a hyper-scale telemetry ingestion service that collects, transforms, and stores millions of events. As a distributed streaming platform, it gives you low latency and configurable time retention, which enables you to ingress massive amounts of telemetry into the cloud and read the data from multiple applications using publish-subscribe semantics.
- Azure IoT Hub: Use Azure IoT Hub to easily and securely connect your Internet of Things (IoT) assets. Use device-to-cloud telemetry data to understand the state of your devices and assets, and be ready to take action when an IoT device needs your attention. With cloud-to-device messages, you can reliably send commands and notifications to your connected devices and track message delivery with acknowledgement receipts. Device messages are sent in a durable way to accommodate intermittently connected devices.
- Power BI: The visualization layer, which lets you create interactive visualizations and derive insights from data.
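To make the MPP idea behind Azure SQL Data Warehouse more concrete, here is a minimal in-memory sketch in Python. It is not how the service is implemented and it uses no Azure API. It only illustrates the general scatter-gather pattern: rows are spread across compute nodes by hashing a distribution key, each node aggregates its own slice independently, and the partial results are merged. The names `NUM_NODES`, `distribute`, `local_sum`, `merge`, and `mpp_sum`, as well as the sample `sales` rows, are assumptions made up for this illustration.

```python
from collections import defaultdict
from zlib import crc32

NUM_NODES = 4  # hypothetical number of compute nodes, chosen for illustration


def distribute(rows, key, num_nodes=NUM_NODES):
    """Assign each row to a node by hashing its distribution key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        node_id = crc32(str(row[key]).encode("utf-8")) % num_nodes
        nodes[node_id].append(row)
    return nodes


def local_sum(rows, group_key, value_key):
    """Partial aggregate that one node computes over its own rows only."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def merge(partials):
    """Combine the per-node partial results into the final answer."""
    final = defaultdict(float)
    for partial in partials:
        for group, subtotal in partial.items():
            final[group] += subtotal
    return dict(final)


def mpp_sum(rows, distribution_key, group_key, value_key):
    """Scatter rows to nodes, aggregate locally, then gather the results."""
    nodes = distribute(rows, distribution_key)
    # In a real MPP system each node would do this step in parallel.
    partials = [local_sum(node_rows, group_key, value_key) for node_rows in nodes]
    return merge(partials)


# Made-up sample data for the illustration.
sales = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 40.0},
    {"order_id": 4, "region": "US", "amount": 60.0},
    {"order_id": 5, "region": "APAC", "amount": 25.0},
]

totals = mpp_sum(sales, "order_id", "region", "amount")
print(totals)
```

The final totals do not depend on which node a row lands on, which is why the work can be split across as many nodes as are available.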
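The publish-subscribe and retention behaviour described for Azure Event Hubs can be sketched in the same spirit. The toy class below is an assumption-laden simplification and does not use the Event Hubs SDK: `EventLog` and its methods are hypothetical names, and retention is modelled as a maximum event count, whereas the service retains events for a configurable time. What it shows is that readers do not remove events. Each reader keeps its own offset, so several applications can read the same stream independently.

```python
from collections import deque


class EventLog:
    """Toy append-only event log with count-based retention (illustrative only)."""

    def __init__(self, retention=1000):
        self._events = deque()
        self._first_offset = 0  # offset of the oldest event still retained
        self._retention = retention

    def publish(self, event):
        """Append an event and drop the oldest one once retention is exceeded."""
        self._events.append(event)
        if len(self._events) > self._retention:
            self._events.popleft()
            self._first_offset += 1

    def read(self, offset, max_events=10):
        """Return (batch, next_offset) without removing anything from the log."""
        start = max(offset, self._first_offset)  # skip events already expired
        index = start - self._first_offset
        batch = list(self._events)[index:index + max_events]
        return batch, start + len(batch)


log = EventLog(retention=3)
for reading in range(5):
    log.publish({"device": "sensor-1", "reading": reading})

# Two independent readers, each tracking its own position in the stream.
dashboard_batch, dashboard_offset = log.read(0)
archiver_batch, archiver_offset = log.read(3)

print([e["reading"] for e in dashboard_batch])
print([e["reading"] for e in archiver_batch])
```

Because reading is non-destructive, adding another consuming application only means tracking one more offset. The publisher does not need to know how many readers exist.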
The rest of this series of articles will elaborate on each of these components.