Category: Big Data

Databricks Koalas: bridge between pandas and Spark

Imagine that you are an ML engineer. You have a massive task of operationalizing a model trained and tested by your Data Scientists. It is working perfectly well for the...

How to read mismatched schema in Apache Spark

In today’s world, Python dominates the data analysis space, while Apache Spark rules the big data paradigm. The former contains a plethora of libraries like pandas...

A first look at Azure Synapse Analytics

Strongly recommended reading: Azure Synapse Analytics: Azure SQL Data Warehouse revamped. In the aforementioned article, we gave an introduction to Azure Synapse...

Introducing Azure Data Factory Data Flows

They say that life comes full circle, i.e. you start and end at the same point. However, in my humble opinion, that is not true. Life is like an unending spiral. You do...

Managed Identity between Azure Data Factory and Azure storage

Last month Microsoft announced that Data Factory is now a ‘Trusted Service’ in Azure Storage and Azure Key Vault firewall. Accordingly, Data Factory can...

Azure Synapse Analytics: Azure SQL Data Warehouse revamped

Prologue: On 5th November 2019, I read an early morning announcement of Azure Synapse Analytics. It was formerly known as Azure SQL Data Warehouse. This announcement took...

Lambda Architecture with Azure Databricks

Introducing Lambda Architecture: It is imperative to know what a Lambda Architecture is before jumping into Azure Databricks. The Greek symbol lambda (λ) signifies...

Log Analytics with Python Pandas Explode

Prologue to Analytics with Python: I strongly recommend reading this article before you go ahead with this one: Log Analytics using STRING_SPLIT function. This...

Making ADF Web activity synchronous with Logic App

Prologue: It is a common scenario in ETL jobs to send email notifications. With Azure Data Factory, such an example can be found in this article. Accordingly, we use web...

Log Analytics using STRING_SPLIT function

Motivating Log Analytics: Being a Data Engineer comes with its own set of challenges and opportunities, since 80 per cent of the time is spent on cleaning and munging the raw...