Category: Big Data

Introducing Machine Learning System Design

Why do we need Machine Learning System Design? In their seminal paper, "Hidden Technical Debt in Machine Learning Systems", Google researchers explain that only a small...

Data Profiling in Power BI (using Azure Databricks)

In Microsoft, there are two worlds, i.e. MS Azure and MS Office 365. They are two different Active Directories in the Microsoft world. Hence, they have their own tools to...

Motivating Entity Resolution for Data Science

Why Entity Resolution? Data is the new oil. Thus, analytical models are the new combustion engines. A combustion engine functions efficiently with good fuel. Similarly,...

An Introduction to Azure Synapse SQL

Evolution of Azure Synapse SQL: Azure Synapse was previously known as Azure SQL Data Warehouse. With the rebranding to Synapse, Microsoft added many more layers on top of...

Azure Databricks source in PowerBI

Microsoft PowerBI is a great tool for Data Visualization. It can connect to a variety of sources. However, databases remain a popular data source. But what if you...

Overview of the exam DP-900 : Azure Data Fundamentals

Motivating DP-900 : Azure Data Fundamentals. Data Engineering is one of the fastest-growing career opportunities for people aspiring to a career in machine learning and AI....

Motivating Databricks Delta in Azure

Exploratory data analysis entails a lot of ad-hoc analysis. To do so, analysts have to rely on either databases or file systems like data lakes. Now, to analyze these...

Tutorial: Hierarchical Clustering in Spark with Bisecting K-Means

In the previous article, we covered the standard K-Means Clustering technique on Spark. Read that article here: Tutorial : K-Means Clustering on Spark. In this article,...

Tutorial : K-Means Clustering on Spark

Analytics is discovering insights using data. Traditionally, statistical and visual techniques dominated the field. But, with advances in Machine Learning and AI,...

Migrating from Azure Databricks to Azure Synapse Analytics

In the changing landscape of technology, new tools emerge. Azure Databricks has been a prominent option for end-to-end analytics in the Microsoft Azure stack. In 2019,...