Building Analytical System on Azure Data Lake Gen2

We live in the world of Big Data and Analytics. It’s a fast-changing world in which new technologies emerge rapidly. This pace has increased considerably with...

Azure Data Lake Gen2 and Azure Databricks

Before Azure Data Lake Gen2 and Azure Databricks: In our previous articles, we elaborated on two aspects of Azure Data Lake Gen2 migration, i.e. governance and...

Cumulative Distribution in Azure Databricks using Spark SQL

We can solve every problem in multiple ways. In our previous article, we motivated the need to fit cumulative distributions. Moreover, we demonstrated the same in Azure...

Cumulative Distribution in Azure Databricks

Imagine that you receive a requirement to calculate aggregations, such as the average, over a range of percentiles and quartiles for a given dataset. There are two ways to...

Challenges in Modern Data Processing

Having spent 5 years in the space of Data Analytics, I have come across a few challenges that might hamper an organization’s efforts to mature as a Data-Driven...

Why Databricks is gaining popularity?

In February 2020, Gartner released its Magic Quadrant for Data Science. A pleasant surprise, however, was to see Databricks amongst the leaders. Interestingly, it made a...

Databricks Koalas: bridge between pandas and spark

Imagine that you are an ML engineer. You have a massive task of operationalizing a model trained and tested by your Data Scientists. It is working perfectly well for the...

Lambda Architecture with Azure Databricks

Introducing Lambda Architecture: It is imperative to know what a Lambda Architecture is before jumping into Azure Databricks. The Greek symbol lambda (λ) signifies...

Azure Databricks tutorial: end to end analytics

Before jumping to the Azure Databricks tutorial, it is good to know the evolution of the Data and AI space. Knowledge production started in ancient Sumerian...