Category: Data Science

Motivating Entity Resolution for Data Science

Why Entity Resolution? Data is the new oil. Thus, analytical models are the new combustion engines. A combustion engine functions efficiently with good fuel. Similarly,...

Log Loss as a performance metric

Introduction to Log Loss Whenever we talk about performance metrics for classification Machine Learning algorithms, the following names come to mind: Accuracy...

Python Dedupe Library : Machine Learning to De-Duplicate Data

In information systems, the biggest challenge faced by organizations is the quality of data. Unclean, messy, and missing data are a common headache across the...

Azure Databricks source in PowerBI

Microsoft PowerBI is a great tool for Data Visualization. It can connect to a variety of sources. However, databases remain a popular data source. But, what if you...

Overview of the exam DP-900 : Azure Data Fundamentals

Motivating DP-900 : Azure Data Fundamentals Data Engineering is one of the fastest-growing career opportunities for people aspiring to a career in machine learning and AI....

Motivating Databricks Delta in Azure

Exploratory data analysis entails a lot of ad-hoc analysis. To do so, analysts have to rely on either databases or file systems like data lakes. Now, to analyze these...

Careers in Machine Learning and AI

There has been a growing fear that AI would eat away jobs from every sector, including Software Services. Read this article of ours to know more: Is Artificial...

Tutorial: Hierarchical Clustering in Spark with Bisecting K-Means

In the previous article, we covered the standard K-Means Clustering technique on Spark. Read that article here: Tutorial : K-Means Clustering on Spark. In this article,...

Tutorial : K-Means Clustering on Spark

Analytics is discovering insights using data. Traditionally, statistical and visual techniques dominated the field. But, with advances in Machine Learning and AI,...

Koalas Dataframe plotting powered by Plotly

In Data Science, Exploratory Data Analysis is an essential process. And since, as they say, a picture is worth a thousand words, visual tools play a key role in...