Archives post

How to read mismatched schema in Apache Spark

In today’s world, Python dominates the data analysis space, while Apache Spark rules the big data paradigm. The former contains a plethora of libraries like pandas...

A first look at Azure Synapse Analytics

Strongly recommended reading: Azure Synapse Analytics: Azure SQL Data Warehouse revamped. In the aforementioned article, we gave an introduction to Azure Synapse...

Introducing Azure Data Factory Data Flows

They say that life comes full circle, i.e., you start and end at the same point. However, in my humble opinion, that is not true. Life is like an unending spiral. You do...

Managed Identity between Azure Data Factory and Azure storage

Last month, Microsoft announced that Data Factory is now a ‘Trusted Service’ in the Azure Storage and Azure Key Vault firewalls. Accordingly, Data Factory can...

Introduction to SSIS and making it metadata independent

Introducing SSIS: In Microsoft SQL Server Integration Services (SSIS), we have a variety of tasks to perform ETL. However, with a changing data landscape and newer...

Azure Synapse Analytics: Azure SQL Data Warehouse revamped

Prologue: On 5th November 2019, I read an early-morning announcement of Azure Synapse Analytics. It was formerly known as Azure SQL Data Warehouse. This announcement took...

Comparing SSIS Lookup and SQL Joins

Motivating SSIS lookups: Denormalization is the stepping stone of data warehouse creation, usually performed to extract the surrogate key of a dimension to a fact table....

Is Artificial Intelligence a threat to Software Engineers?

The fear of the singularity: Every threat is an opportunity. The Forbes magazine article “Software Ate The World, Now AI Is Eating Software” reminded me of the movie...

Log Analytics with Python Pandas Explode

Prologue to Analytics with Python: I strongly recommend reading the following article before you go ahead with this one: Log Analytics using STRING_SPLIT function. This...

What is a Data Hub: Concepts and Guidelines

The V’s of Big Data: Before Data Hub. Life was simple when we restricted ourselves to spreadsheets and relational data stores. Data was structured, databases were...