K Means Clustering in 15 Minutes | K means clustering explained | K means clustering in python

42,741 views

Unfold Data Science

1 day ago

#KMeansClustering #KMeansInPython #Unfold Data Science
Hello,
My name is Aman and I am a data scientist.
About this video:
In this video, I explain K-means clustering and demonstrate how it is implemented in Python. The following topics are covered:
1. k means clustering algorithm,
2. k means algorithm,
3. k means clustering example,
4. k means clustering python,
5. k means example in python,
6. what is k-means clustering,
7. how k means works,
8. k means clustering algorithm python,
9. clustering algorithms,
10. kmeans python,
11. kmeans sklearn tutorial,
12. kmeans example,
13. k means elbow method python,
14. unsupervised learning,
15. k means clustering python,
16. k means clustering,
17. k means clustering algorithm,
18. k means python,
19. clustering in machine learning,
20. clustering python,
21. k-means in python
About Unfold Data Science: This channel helps people understand the basics of data science through simple examples. Anyone, even without prior knowledge of computer programming, statistics, machine learning, or artificial intelligence, can get a high-level understanding of data science through this channel. The videos are not very technical, so viewers from different backgrounds can follow them easily.
If you need data science training from scratch, please fill in this form (please note: training is chargeable):
docs.google.co...
Book recommendation for Data Science:
Category 1 - Must Read For Every Data Scientist:
The Elements of Statistical Learning by Trevor Hastie - amzn.to/37wMo9H
Python Data Science Handbook - amzn.to/31UCScm
Business Statistics By Ken Black - amzn.to/2LObAA5
Hands-On Machine Learning with Scikit Learn, Keras, and TensorFlow by Aurelien Geron - amzn.to/3gV8sO9
Category 2 - Overall Data Science:
The Art of Data Science By Roger D. Peng - amzn.to/2KD75aD
Predictive Analytics By Eric Siegel - amzn.to/3nsQftV
Data Science for Business By Foster Provost - amzn.to/3ajN8QZ
Category 3 - Statistics and Mathematics:
Naked Statistics By Charles Wheelan - amzn.to/3gXLdmp
Practical Statistics for Data Scientists By Peter Bruce - amzn.to/37wL9Y5
Category 4 - Machine Learning:
Introduction to machine learning by Andreas C Muller - amzn.to/3oZ3X7T
The Hundred Page Machine Learning Book by Andriy Burkov - amzn.to/3pdqCxJ
Category 5 - Programming:
The Pragmatic Programmer by David Thomas - amzn.to/2WqWXVj
Clean Code by Robert C. Martin - amzn.to/3oYOdlt
My Studio Setup:
My Camera : amzn.to/3mwXI9I
My Mic : amzn.to/34phfD0
My Tripod : amzn.to/3r4HeJA
My Ring Light : amzn.to/3gZz00F
Join Facebook group :
www.facebook.c...
Follow on medium : / amanrai77
Follow on quora: www.quora.com/...
Follow on twitter : @unfoldds
Get connected on LinkedIn : / aman-kumar-b4881440
Follow on Instagram : unfolddatascience
Watch Introduction to Data Science full playlist here : • Data Science In 15 Min...
Watch python for data science playlist here:
• Python Basics For Data...
Watch statistics and mathematics playlist here :
• Measures of Central Te...
Watch End to End Implementation of a simple machine learning model in Python here:
• How Does Machine Learn...
Learn Ensemble Model, Bagging and Boosting here:
• Introduction to Ensemb...
Build Career in Data Science Playlist:
• Channel updates - Unfo...
Artificial Neural Network and Deep Learning Playlist:
• Intuition behind neura...
Natural Language Processing playlist:
• Natural Language Proce...
Understanding and building recommendation system:
• Recommendation System ...
Access all my codes here:
drive.google.c...
Have a different question for me? Ask me here : docs.google.co...
My Music: www.bensound.c...

Comments: 115
@karthikraman810 3 years ago
This is the one video I was searching for to understand the baseline of the K-means algorithm clearly. Thank you very much for your detailed explanation of the K-means algorithm in simple terms.
@UnfoldDataScience 3 years ago
Thanks for watching, Karthik.
@sandipansarkar9211 2 years ago
Finished watching.
@AnandKrishnanPuravangara 1 month ago
Thank you Aman! You are the best teacher!
@balakrishnareddithala8718 1 year ago
professor from moon.....fly full environment....super sir
@himanshupathak3090 3 years ago
Best explanation of K-means clustering. Your videos are simple and easy to understand. Waiting for more videos in this series.
@UnfoldDataScience 3 years ago
Thanks Himanshu.
@JainmiahSk 3 years ago
@UnfoldDataScience Hi Aman. Are you using the same data for training and testing K-means? Also, if we use two or more clustering algorithms, how do we decide which one performs better?
@soniyajinu6578 2 years ago
Really nice explanation sir
@UnfoldDataScience 2 years ago
Thank you
@sureshkumar-cn5jr 8 months ago
Thanks Aman! Great narration, the subject is getting clear for beginners.
@snehalhon 1 year ago
Hello sir, thank you very much. You are the best and are making data science easy for students like me. 10000 likes
@UnfoldDataScience 1 year ago
Thanks Snehal
@megalaramu 3 years ago
Hi Aman, your explanation is easy to understand, especially the elbow plots. Could you please talk about performance evaluation metrics for unsupervised algorithms?
@UnfoldDataScience 3 years ago
Sure, in the next video I'll cover this topic in detail.
@ajayreddyvanga6556 3 years ago
Useful 🙌❤️
@UnfoldDataScience 3 years ago
Glad you think so Ajay.
@btkcodedev 3 years ago
Amazing Explanation, great
@UnfoldDataScience 3 years ago
Thanks for watching, Bala.
@johnfernandes455 2 years ago
Nice video. Simple n clean
@UnfoldDataScience 2 years ago
Thanks for your positive feedback
@vedprakash8846 3 years ago
Thank You! this was very helpful
@UnfoldDataScience 3 years ago
Thanks for watching Ved.
@9902152322 2 years ago
keep exploring sir, explanation is excellent. waiting for the next video. thank you
@UnfoldDataScience 2 years ago
Welcome
@johndaniel7569 2 years ago
Really a good info on K Means!! Thanks
@UnfoldDataScience 2 years ago
Glad it was helpful!
@dilnawazahmed949 2 years ago
You are just awesome, you explained difficult things in an easy way ✅
@soheilaahmadi4807 1 year ago
you are great. May God bless you
@vishnujatav6329 3 years ago
Really good
@UnfoldDataScience 3 years ago
Thanks Vishnu.
@achumohan5908 2 years ago
Thank you bro for your detailed explanation 🙂 Kudos!! 👏
@UnfoldDataScience 2 years ago
Welcome Mohan.
@ahmadjaradat3011 15 days ago
thanks
@krishnamishra8598 2 years ago
When you say the mean of the data points, will it be the mean of the differences between the randomly initialised centroid and the data points?
@ML_Engineerr 5 months ago
Very helpful
@milliesadie486 2 years ago
Thank you
@amalageorge394 1 year ago
excellent
@mikuniaroz 1 year ago
Great explanation! love it
@dees900 1 year ago
great video. thank u.
@dr.shambhujha3999 1 year ago
Rather than choosing random centroids, it is better to choose centroids with maximum distance.
@rusiraliyanage6643 2 years ago
very clear explanation sir :)
@sainathpawar9655 3 years ago
Very nice explanation. Can you please explain DBSCAN as well, and the difference between KMeans and DBSCAN?
@UnfoldDataScience 3 years ago
I will add a DBSCAN video as well, Sainath. Thanks for watching.
@ganeshgunjal4220 1 year ago
explanation is very nice and understandable. please provide dataset link also. i am stuck there.
@UnfoldDataScience 1 year ago
check description
@vaishalikadwey4457 2 years ago
wonderful explanation, very informative video. Sir please make video on PAM CLARA also
@UnfoldDataScience 2 years ago
Sure I will
@sangeethaagoudar2175 2 years ago
Thank you sir!
@nandinimatamacedatascince1407 3 years ago
Basically, in clustering we need the intra-cluster distance to be minimum and the inter-cluster distance to be maximum. How does the clustering take care of both at the same time? Could you explain? Thanks in advance :)
@UnfoldDataScience 3 years ago
To achieve that, points are shifted between clusters over the iterations of model training, as explained. The allocation where those two criteria are satisfied is chosen as the final allocation.
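The shifting of points between clusters described in this reply can be sketched in a few lines of plain Python. This is a minimal illustration of the standard assign-then-update loop (Lloyd's algorithm), not the code from the video; the function name `kmeans`, the toy `points`, and the random-sample initialisation are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    # Assumption: start from k data points picked at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the allocation is final
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical toy data: two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
```

The loop only minimises the distance of points to their own centroid; separation between clusters is a side effect of that, not a second objective being optimised.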
@shaikhuzma786 4 months ago
@datascienceworld7041 2 years ago
In case of inertia, it will sum up the distances. For example, suppose k = 2: it will create 2 clusters and add up the 2 clusters to give the inertia value? Is that correct?
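On this question: as inertia is usually defined for K-means (and as scikit-learn documents its `inertia_` attribute), it is a single number, the sum of squared distances from every point to its own cluster's centroid. With k = 2 the two per-cluster sums are simply added together. A small sketch with made-up points and centroids, assuming the clusters are already known:

```python
import math

# Hypothetical toy data: centroid -> points assigned to that centroid.
clusters = {
    (1.0, 1.0): [(0.0, 1.0), (2.0, 1.0)],
    (5.0, 5.0): [(5.0, 4.0), (5.0, 6.0)],
}

# Sum of squared distances inside each cluster.
per_cluster = {
    centroid: sum(math.dist(p, centroid) ** 2 for p in members)
    for centroid, members in clusters.items()
}

# Inertia is the total over all clusters.
inertia = sum(per_cluster.values())
```

Note that it is the squared distances that are summed, not the raw distances.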
@travelofftradition 1 year ago
Hi Aman!! I'm currently studying in Germany. Thanks a lot for explaining K-means in plain English. This is by far the simplest video for understanding the concept. However, I have one doubt. Suppose we have 5 or 10 variables in a table. How does K-means work then? In your case there were only two variables, so the scatter plot could easily be made. If there are 5 variables, does K-means still build the scatter plot first and then determine the Euclidean distance, or how does it work? I have another doubt as well: I have data related to bank customers in 5 tables. How would you suggest applying K-means when we have multiple tables? Thanks, Mohit
@kunals.1161 3 years ago
Hello Sir, thanks for such an informative video. I have a request: will you please do a session on how to implement Logistic Regression with regularization (Ridge or Lasso) and cross-validation (K-fold)? I always get stuck there!
@UnfoldDataScience 3 years ago
Thanks Kunal for asking. I will try to create a video on it.
@santoshr1708 1 year ago
Thank you sir
@karthickkarthi2401 3 years ago
Hi sir, I hope you are fine. I have a doubt regarding Python. I know the basics needed for ML, DS, and DL. Is that enough? I don't know core Python. Can you please tell me whether it is sufficient?
@UnfoldDataScience 3 years ago
For starting out it's OK, Karthik, don't count it as a roadblock. You can learn many things on the job. Just be mentally prepared to do so.
@karthickkarthi2401 3 years ago
@UnfoldDataScience ok sir tq
@ratheeshmsuresh7368 1 year ago
Brother, the K value I got from the elbow method and the one from the silhouette score seem to be different. What does that tell us? Am I wrong?
@alkabuxi8592 1 year ago
Very well explained. Can I use the concept of K-means clustering in the R language?
@UnfoldDataScience 1 year ago
Yes, of course
@shreyjain6447 2 years ago
If I get a real-world dataset where I need to perform clustering, should I first split it into train, test, and validation sets, then scale, and then run the clustering algorithm? Also, if I want to evaluate how accurate my clusters are, how should I go about it?
@UnfoldDataScience 2 years ago
Answer to the first question: yes. For the second question, please watch this video: kzbin.info/www/bejne/lZvKYoh8pMqZZ8U
@shreyjain6447 2 years ago
@UnfoldDataScience Do you have a video on clustering where you split the data into train, test, and validation sets? Every video/tutorial I have seen does not perform splitting for a clustering algorithm. A link to a video/article where this is done would be enough!
@vanamanu4283 2 years ago
Brother, will you do a video for the adaptive K-means algorithm?
@UnfoldDataScience 2 years ago
Sure
@GhostRider.... 2 years ago
very nice explanation and implementation sir, please provide the Excel file also (the file is not present in Google Drive)
@UnfoldDataScience 2 years ago
Oops, I searched and could not find it. It's just a simple Excel file; please populate some numbers in two columns and you can use it as customer data.
@61_shivangbhardwaj46 3 years ago
Thnx sir😊
@UnfoldDataScience 3 years ago
Most welcome
@hardikvegad3508 3 years ago
Amazing explanation Aman. Would you please show us with code how to get a performance metric such as silhouette, or any other you consider worth sharing? Thank you.
@UnfoldDataScience 3 years ago
Thanks Hardik. Yes, good question. I will discuss all of these as part of the interview questions on K-means, which is the next video planned.
@gowthamjeevanantham6144 3 years ago
Hi, just one doubt: in the second step, are the centroids initialized randomly, or is there a specific reason for selecting them?
@suryakanth1000 2 years ago
In this example, you have used two columns in the dataset for clustering. At the end when visualizing the clusters, the plotting was done between these two columns. But if we have more than 2 columns in our data, how do we visualize the clusters after clustering?
@rajathslr 2 years ago
Try to imagine a collection of points levitating in your room (x, y, z axes), and you use a balloon to cover them all without moving the points. That balloon is the cluster and the points are spread across xy and zy axis.
@UnfoldDataScience 2 years ago
Very good answer Rajath.
@nishah4058 1 year ago
Didn't get it, Rajath. How do you decide when you have more than two variables?
@archanamaurya89 3 years ago
How do we determine the number of iterations needed to move the centroid, and what if that is still not enough to assign the data points to the correct clusters?
@UnfoldDataScience 3 years ago
We do not know in advance. We need to use trial and error.
@archanamaurya89 3 years ago
@UnfoldDataScience thanks for the reply
@datafuturelab_ssb4433 3 years ago
If there are more than 2 features, how do we implement K-means? Which features are considered for clustering?
@UnfoldDataScience 3 years ago
Good question. If there are more features, every feature goes into the algorithm as a dimension. There are also feature reduction techniques such as PCA that are used in this scenario.
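To make "every feature goes in as a dimension" concrete: the Euclidean distance that K-means relies on is defined for any number of coordinates, so no scatter plot is needed for the algorithm to work. A minimal sketch using the standard library's `math.dist`; the five-feature customer and the two centroids are invented values for illustration only.

```python
import math

# Hypothetical customer and centroids, each described by five numeric features.
customer = (35.0, 52000.0, 3.0, 0.4, 12.0)
centroid_a = (30.0, 50000.0, 2.0, 0.5, 10.0)
centroid_b = (60.0, 90000.0, 5.0, 0.1, 40.0)

# Euclidean distance uses every feature, however many there are.
dist_a = math.dist(customer, centroid_a)
dist_b = math.dist(customer, centroid_b)

# The point is assigned to whichever centroid is closer.
nearest = "a" if dist_a <= dist_b else "b"
```

In this toy data the second feature is on a much larger scale than the others and dominates the distance, which is why features are commonly scaled before clustering.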
@kunals.1161 3 years ago
@UnfoldDataScience Hello Aman Sir, will you please take such a dataset as an example and explain it with a Python practical?
@datafuturelab_ssb4433 3 years ago
@UnfoldDataScience Thanks for the guidance. A related query: if we do PCA in that scenario, how can we draw conclusions from the result?
@UnfoldDataScience 3 years ago
PCA will make us lose the meaning of the original variables, so any analysis on the original variables after PCA will be difficult. However, PCA can be useful for better clustering in a multi-variable scenario.
@datafuturelab_ssb4433 3 years ago
@UnfoldDataScience thank you very much
@artificial-intelligence5753 2 years ago
Is there a formula for inertia?
@UnfoldDataScience 2 years ago
I am not very sure about the inertia formula in this context. I will get back to you on it.
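For reference, the definition commonly used for K-means inertia (the within-cluster sum of squares, which is also how scikit-learn documents `inertia_`) can be written as follows, where the symbols are chosen here for illustration: $x_1, \dots, x_n$ are the data points and $\mu_1, \dots, \mu_k$ are the cluster centroids.

```latex
\text{inertia} = \sum_{i=1}^{n} \min_{1 \le j \le k} \lVert x_i - \mu_j \rVert^2
```

Each point contributes the squared Euclidean distance to its closest centroid, so lower inertia means tighter clusters.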
@gadmuhirwa5226 1 year ago
from request import PandaRequest ModuleNotFoundError: No module named 'request'
@ajmalbashaa881 3 years ago
Apply K-Means clustering with K=2,3,4,5,6,7,8,9,10 for all features of 56 datasets and find the optimal number of clusters using the Silhouette Coefficient and Davies-Bouldin index. 2. Store your results in a single Excel file with multiple rows, i.e., one row for each project and columns for the Silhouette Coefficient and Davies-Bouldin index. 3. Represent your results using visualization techniques. Note: the 56 datasets are 56 Excel sheets with 125 rows and 20 columns. The 21st column indicates the class. Kindly help me with this.
@UnfoldDataScience 3 years ago
This looks like your project/assignment. What help do you expect from me?
@ajmalbashaa881 3 years ago
@UnfoldDataScience Yes. Can you please share your mail ID? I can show the details to you.
@jamesa.esquivel4158 2 years ago
Excellent Tutorial! May I know where I can download the CustomerData.xlsx dataset? Thanks!
@UnfoldDataScience 2 years ago
drive.google.com/drive/folders/1XdPbyAc9iWml0fPPNX91Yq3BRwkZAG2M
@jamesa.esquivel4158 2 years ago
@UnfoldDataScience Hello sir! The folder does not include the CustomerData.xlsx file. Where can I download a copy of it? Thanks!
@ganeshgunjal4220 1 year ago
@jamesa.esquivel4158 Same here. If you found it, can you paste the link here please?
@battingbaba7077 2 years ago
Make sure to tell your concept in normal language, it is more complicated
@UnfoldDataScience 2 years ago
Ok
@gadmuhirwa5226 1 year ago
But I always get this error: ModuleNotFoundError: No module named 'request'. After installing the requests module, the problem remains.
@UnfoldDataScience 1 year ago
Modules get updated from time to time, please check the latest documentation.
@rajankp5735 3 years ago
Sir, how is it applicable to machines??
@UnfoldDataScience 3 years ago
The same way as typical ML algorithms
@akashghuge8235 3 years ago
Why K means, why not M means or A means?
@UnfoldDataScience 3 years ago
We will discuss this in the interview questions video.
@pushpapawar4045 3 years ago
Sir, Naive Bayes learning please
@UnfoldDataScience 3 years ago
Will add :)
@hermannalidje6442 2 years ago
Hey guys, I am a new student in data science. Is there somebody who can train me? I will pay. Thanks
@gadmuhirwa5226 1 year ago
thank you
@muhammedthayyib9202 2 years ago
Consider 2 clusters. What if a point is at an equal distance from both clusters?
@UnfoldDataScience 2 years ago
Highly unlikely. With multiple dimensions, the distance calculation goes to many decimal places.
@muhammedthayyib9202 2 years ago
Ok. Thank you 💟