On 5th November 2019, I read an early morning announcement of Azure Synapse Analytics, the service formerly known as Azure SQL Data Warehouse. The announcement took me down memory lane, since my first major project was on Azure SQL Data Warehouse. Back then, it featured cutting-edge technologies like PolyBase and massively parallel processing (MPP).
I have personally loved Azure SQL Data Warehouse since it gels so well with both the traditional, relational world and the modern big data world. As far as the traditional world is concerned, it has integrated well with services like Azure Analysis Services and Power BI. Moreover, it supports most of the SQL constructs that developers have been using, in addition to fast, scalable compute that enables quicker analytics and reporting.
The Big Data World and Azure SQL Data Warehouse
However, Azure SQL DW has been equally effective with modern big data workloads. The key enabler here is the PolyBase technology. With PolyBase, we can read data from multiple sources like Azure Blob Storage and Azure Data Lake Storage. Furthermore, as the name suggests, PolyBase works with data in multiple formats. Hence, we can read files in formats like delimited text (CSV), Parquet, and ORC at very high speeds, because the load is spread across the MPP compute nodes.
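To make the PolyBase idea a little more concrete, here is a minimal sketch in Python that only builds the T-SQL statements typically used to expose CSV files in Blob Storage as an external table. It does not connect to any database. The function name polybase_ddl, the object names demo_blob_source and demo_csv_format, and the storage account, container, folder, and table values are all hypothetical, and the sketch assumes a container that needs no credential.

```python
def polybase_ddl(account, container, folder, table, columns):
    """Build T-SQL that exposes delimited files in Azure Blob Storage as an external table.

    Assumptions (not from the original post): the object names demo_blob_source
    and demo_csv_format are placeholders, and the container is readable without
    a credential. A private container would also need a database scoped credential.
    """
    # Render the column definitions, one per line.
    column_list = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)

    # 1. Where the files live.
    data_source = (
        "CREATE EXTERNAL DATA SOURCE demo_blob_source\n"
        f"WITH (TYPE = HADOOP, LOCATION = 'wasbs://{container}@{account}.blob.core.windows.net');"
    )
    # 2. How the files are laid out (comma-delimited text here).
    file_format = (
        "CREATE EXTERNAL FILE FORMAT demo_csv_format\n"
        "WITH (FORMAT_TYPE = DELIMITEDTEXT, FORMAT_OPTIONS (FIELD_TERMINATOR = ','));"
    )
    # 3. The external table that ties the schema to the source and format.
    external_table = (
        f"CREATE EXTERNAL TABLE {table} (\n    {column_list}\n)\n"
        f"WITH (LOCATION = '{folder}', DATA_SOURCE = demo_blob_source, "
        "FILE_FORMAT = demo_csv_format);"
    )
    return [data_source, file_format, external_table]


# Hypothetical values, for illustration only.
statements = polybase_ddl(
    account="mystorageaccount",
    container="raw",
    folder="/sales/",
    table="dbo.SalesExternal",
    columns=[("SaleId", "INT"), ("Amount", "DECIMAL(10, 2)")],
)
print("\n\n".join(statements))
```

Once statements like these are run against the warehouse, the external table can be queried with ordinary SQL, which is what lets the relational and big data worlds meet.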
As the platform evolved, services like Azure Data Factory and Azure Databricks emerged. While the former is an orchestration engine, the latter is a transformation layer. These technologies complemented Azure SQL Data Warehouse, thus positioning it uniquely in the Lambda Architecture on Microsoft Azure.
Also read: Lambda Architecture with Azure Databricks
However, as we can see from the above diagram, there is a lot of glue in place here. For instance, we use Azure Data Factory as the orchestration engine, while Azure Databricks is the compute engine. As the number of couplings increases, the complexity goes up. In turn, dependencies and security risks go up as well. Hence, in order to bring most of the glue components under one umbrella, Microsoft has come up with Azure Synapse Analytics. But first things first. Let’s understand what we mean by a ‘synapse’.
The Neural Synapse
Human beings often find inspiration in nature when engineering products. Accordingly, Azure Synapse Analytics seems to be inspired by the synapse of a neuron. A synapse allows two neurons to communicate with each other. In other words, it acts as the glue between two neurons in the brain. The diagram below should bring more clarity to the concept of a synapse. Now, let’s see how Azure Synapse is similar to the natural one.
Azure Synapse Analytics = Azure SQL Data Warehouse + Analytics
In the brain, a neuron is a fundamental building block, while a synapse is the link between neurons. Similarly, in Azure big data, we have multiple building blocks and, as mentioned earlier, it would be most desirable to have a system that acts like a synapse between them. That desired synapse is Azure Synapse Analytics. The diagram below should bring more clarity to the concept.
Initial glimpses of Azure Synapse Analytics reveal that it is an end-to-end analytics solution, built around the core component of Azure SQL Data Warehouse. Moreover, it integrates with Azure Data Factory for extracting data from data lakes and blobs, and it has SQL analytics built in to query data on top of Azure Data Lake. Furthermore, we can use provisioned compute to meet production-level performance. Next, we can use Spark to create machine learning models and, more importantly, we can consume those models using SQL. Lastly, it integrates with Power BI to create compelling visuals.
Since these are the initial days of the announcement, I don’t have any demos right now. This post is more of an outpouring of my own excitement about the new service. However, I urge everyone to watch this video to know more: Azure Synapse Analytics – Next-gen Azure SQL Data Warehouse.