Three Clustering Algorithms You Should Know: k-means clustering, Spectral Clustering, and DBSCAN

12,403 views

Comments: 17

@502amvideos8 8 months ago

I read elsewhere that the normalized Laplacian is Lnorm = D^(-1/2) L D^(-1/2) with L = D - W. Can you clarify why it is different in your explanation, please? Thanks for your videos.

@DrDataScience 8 months ago

It's the same thing! If you simplify that expression, you get the form used in the video.
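The thread does not say which form the video uses. Assuming it is the common alternative I - D^(-1/2) W D^(-1/2), the equivalence follows from D^(-1/2) (D - W) D^(-1/2) = I - D^(-1/2) W D^(-1/2). A minimal numerical check, with a made-up similarity matrix W:

```python
import numpy as np

# Small symmetric similarity matrix W (made-up values, zero diagonal).
W = np.array([
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.5, 0.2],
    [0.1, 0.5, 0.0, 0.9],
    [0.0, 0.2, 0.9, 0.0],
])

deg = W.sum(axis=1)                       # node degrees
D = np.diag(deg)                          # degree matrix
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # D^(-1/2)

L = D - W                                 # unnormalized Laplacian
form_a = D_inv_sqrt @ L @ D_inv_sqrt      # D^(-1/2) L D^(-1/2)
form_b = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt  # I - D^(-1/2) W D^(-1/2)

same = np.allclose(form_a, form_b)
print(same)
```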

@copaceanubobi6101 3 years ago

I have Raman spectra for a brain tumor. Is it suitable to apply spectral clustering to a 3D tensor (60*60*1735) in which the frequencies of the spectrum are stored?

@DrDataScience 3 years ago

Good idea, but you need to convert the 3D tensor into 1D so you can define the similarity matrix.
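One possible reading of this reply, sketched under assumptions: treat each of the 60*60 pixels as one data point whose features are its 1735 spectral values, so the tensor is flattened into a 3600 x 1735 matrix before similarities are computed. The random data, the axis order, the 300-row subset, and the median heuristic for `sigma` are all placeholders, not details from the thread.

```python
import numpy as np

# Placeholder data with the shape from the question (spectral axis assumed last).
rng = np.random.default_rng(0)
cube = rng.random((60, 60, 1735))

# Flatten the two spatial axes so each pixel becomes one row (one data point).
X = cube.reshape(-1, cube.shape[-1])      # shape (3600, 1735)

# Gaussian-kernel similarities, computed on a subset only to keep the example light.
sample = X[:300]
sq_norms = (sample ** 2).sum(axis=1)
sq_dists = np.maximum(
    sq_norms[:, None] + sq_norms[None, :] - 2.0 * sample @ sample.T, 0.0
)
sigma = np.sqrt(np.median(sq_dists))      # hypothetical heuristic; tune in practice
W = np.exp(-sq_dists / (2.0 * sigma ** 2))

print(X.shape, W.shape)
```

With all 3600 pixels, W would be a 3600 x 3600 matrix.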

@yasserothman4023 3 years ago

How do we check convergence in k-means?

@DrDataScience 3 years ago

Good question! You can plot the value of the cost function vs the number of iterations.
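A minimal sketch of that idea, assuming the question is about k-means and the cost is the sum of squared distances to the assigned centers. The function name `kmeans_with_history` and the synthetic data are illustrative only.

```python
import numpy as np

def kmeans_with_history(X, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm that records the cost after every iteration."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    history = []
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: move each non-empty cluster's center to the mean of its points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
        # Cost: sum of squared distances to the assigned center.
        history.append(float(((X - centers[labels]) ** 2).sum()))
    return labels, centers, history

# Synthetic data: two Gaussian blobs (made up for the example).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(6.0, 1.0, (100, 2))])

labels, centers, history = kmeans_with_history(X, k=2)
for it, cost in enumerate(history, start=1):
    print(it, round(cost, 3))
```

The cost never increases from one iteration to the next, so the curve flattens once the assignments stop changing. Passing `history` to a plotting call such as matplotlib's `plt.plot` gives the cost-versus-iteration plot.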

@yasserothman4023 3 years ago

How do we assess the performance of k-means? What performance metrics should be used?

@DrDataScience 3 years ago

Great question as well! I will post another video on how to evaluate any clustering method. A popular metric is normalized mutual information (NMI).
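For reference, a small sketch of NMI in plain Python. It compares predicted cluster labels against ground-truth labels, so it only applies when true labels are available. Several normalizations exist; this one divides by the geometric mean of the two entropies, which is an assumption and may differ from a given library's default. The function names are illustrative.

```python
from collections import Counter
from math import log, sqrt

def entropy(labels):
    """Shannon entropy of a labeling, in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def nmi(true_labels, pred_labels):
    """NMI = I(T; P) / sqrt(H(T) * H(P))."""
    n = len(true_labels)
    h_t, h_p = entropy(true_labels), entropy(pred_labels)
    if h_t == 0 and h_p == 0:
        return 1.0   # both labelings put everything in a single cluster
    if h_t == 0 or h_p == 0:
        return 0.0   # one labeling carries no information
    joint = Counter(zip(true_labels, pred_labels))
    count_t, count_p = Counter(true_labels), Counter(pred_labels)
    # Mutual information from joint and marginal label counts.
    mi = sum(
        (c / n) * log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    return mi / sqrt(h_t * h_p)

# Same grouping with swapped label names: NMI ignores the names.
score_same = nmi([0, 0, 1, 1], [1, 1, 0, 0])
# A grouping that is independent of the true labels.
score_indep = nmi([0, 0, 1, 1], [0, 1, 0, 1])
print(score_same, score_indep)
```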

@iancheung3587 3 years ago

@Dr. Data Science Hey, I am wondering if you can help me out with a question. Let's say I have the empirical distributions of n groups and I want to cluster the "distributions". Is it possible to calculate the pairwise earth mover's distances, put them all in an adjacency matrix, and then use the clustering algorithm?

@DrDataScience 3 years ago

It depends on the distribution of those clusters. If you can model them using a Gaussian distribution, then use a Gaussian Mixture Model. However, I am wondering: if you know the distribution of each cluster or group, why do you want to cluster the data points?

@iancheung3587 3 years ago

@DrDataScience I want to cluster the distributions of the groups. I have n groups, and each group comes with its own distribution. The n distributions are all roughly exponential, but probably with different parameters. The data is tipping in different countries.
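The idea in the question can be sketched as follows. This is the asker's proposal rather than the approach recommended in the reply, and every concrete choice is an assumption: the synthetic exponential samples, equal sample sizes (which make the 1-D earth mover's distance a mean of sorted differences), the Gaussian kernel with a median-based `sigma`, and a two-way split using the sign of the second eigenvector.

```python
import numpy as np

def emd_1d(a, b):
    """Earth mover's (Wasserstein-1) distance between two equal-size 1-D samples."""
    return np.abs(np.sort(a) - np.sort(b)).mean()

rng = np.random.default_rng(0)
# Hypothetical data: 6 groups of exponential samples with two kinds of scale.
scales = [1.0, 1.1, 0.9, 5.0, 5.5, 4.5]
groups = [rng.exponential(s, size=500) for s in scales]

# Pairwise distance matrix between the groups' empirical distributions.
n = len(groups)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = emd_1d(groups[i], groups[j])

# Distances become similarities through a Gaussian kernel (sigma needs tuning).
sigma = np.median(dist[dist > 0])
W = np.exp(-dist ** 2 / (2.0 * sigma ** 2))

# Two-way spectral split: sign of the second eigenvector of the normalized Laplacian.
d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
L_norm = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
eigvals, eigvecs = np.linalg.eigh(L_norm)   # eigenvalues in ascending order
labels = (eigvecs[:, 1] > 0).astype(int)
print(labels)
```

For more than two clusters, the first k eigenvectors would be clustered with k-means instead of thresholding a single eigenvector.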

@priyadharshini4078 1 year ago

Hello sir... I didn't get any output, and there was no error either.

@yasserothman4023 3 years ago

For spectral clustering: 1) How do you create the similarity matrix? Do you mean we connect all data points with each other and assign weights based on the Gaussian kernel? 2) If so, what is the variance of the Gaussian distribution? 3) I cannot imagine how to carry out k-means on U. Can you elaborate more? Thanks.

@DrDataScience 3 years ago

1) Yes, we use the Gaussian kernel to compute similarities. 2) That's a hyperparameter that should be tuned. 3) You just need to give the matrix U as the input to k-means clustering, i.e., cluster the n rows of the matrix U.
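The three points can be put together in a short sketch. It is not the code from the video: the function names, the farthest-point initialization (chosen only to make the run deterministic), the row normalization of U (a common variant), and the synthetic data are all assumptions.

```python
import numpy as np

def gaussian_similarity(X, sigma):
    """Fully connected graph: w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kmeans(X, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma=1.0):
    W = gaussian_similarity(X, sigma)                 # 1) similarity matrix
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_norm = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_norm)               # eigenvalues ascending
    U = eigvecs[:, :k]                                # n x k matrix of eigenvectors
    U = U / np.linalg.norm(U, axis=1, keepdims=True)  # row normalization (variant)
    return kmeans(U, k), U                            # 3) k-means on the n rows of U

# Synthetic data: two well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(10.0, 0.5, (20, 2))])
labels, U = spectral_clustering(X, k=2, sigma=1.0)    # 2) sigma is tuned by hand here
print(U.shape, labels)
```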

@yasserothman4023 3 years ago

@DrDataScience So for U of dimension 4x3, do we need to cluster the 12 points we have in U into 3 clusters?

@DrDataScience 3 years ago

Let's say you want to find k=2 clusters and U is 4x3. Then, you want to cluster 4 data points each represented by 3 features into 2 groups.
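A tiny illustration of this last point, with a made-up 4x3 matrix U and a few hand-rolled k-means steps: the result is one label per row (4 labels), not one per entry.

```python
import numpy as np

# Made-up U: 4 rows (data points), 3 columns (features).
U = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.7],
    [0.0, 0.8, 0.9],
])
n_points, n_features = U.shape   # 4 points, 3 features each

k = 2
centers = U[[0, 3]].copy()       # initialize from the first and last rows
for _ in range(10):
    # Assign each row to its nearest center, then recompute the centers.
    d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    centers = np.array([U[labels == j].mean(axis=0) for j in range(k)])

print(labels.tolist())           # prints [0, 0, 1, 1]
```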