You and Dr. Grande are making this topic actually understandable. Thank you for the quality content.
@machadolopes 3 months ago
The best and concise lesson about K-means. Thanks.
@xeniavlasenko9830 3 years ago
This is the 5th video I've watched on K-Means and it FINALLY made sense. Thank you so much!
@DataDemystified 3 years ago
I'm so glad to hear that! Is there something in particular that made the content here more understandable? I ask so that I can make sure to incorporate that type of teaching in my other videos. Thanks!
@xeniavlasenko9830 3 years ago
@@DataDemystified I guess commenting along the way on how to interpret the results/ how all these program steps and numbers in the tables are part of the "story" was particularly helpful :)
@DataDemystified 3 years ago
@@xeniavlasenko9830 Thank you for the feedback! I will make sure to incorporate it into new tutorial videos!
@nawilliam2754 2 years ago
After a long search, finally something easy to understand.
@bernardoluca6613 3 years ago
Fantastic explanation! Nothing like all those other videos out there! Keep going like this!
@DataDemystified 3 years ago
Thanks!
@jessicamartin1446 2 years ago
Great! I was able to complete my entire assignment, using only this video
@anass2243 a year ago
I really thank you for this great series of videos; they have been so useful in my research.
@dsavkay 10 months ago
Great advanced info, subscribed!
@lydialim1993 3 years ago
Wonderful series! Keep it up!
@DataDemystified 3 years ago
Thank you! Any topics you'd specifically like to see covered?
@lydialim1993 3 years ago
@@DataDemystified Any chance you'll do one on Structural Equation Modelling? Like I know it's a bunch of regressions under the hood, but it would be nice to see a proper demo of how to use one in real life.
@DataDemystified 3 years ago
@@lydialim1993 Great idea, but I don't know if that'll happen any time soon. The challenge is that you need the AMOS package for SPSS, which most people don't have (including me, at the moment). That said, I'll look into how much demand there is for something like this! Thanks for the suggestion!
@StevenWang82 a year ago
Thank you very much, this video is very easy to understand !!
@vindaflyfox 8 months ago
Hello, I want to follow this process by doing a hierarchical cluster analysis to determine the k for my k-means analysis. My question is: my variables are not all on the same scale, so in the hierarchical cluster analysis I will need to convert them into z-scores or something similar so they are comparable. How does this impact the k-means cluster analysis? Do I need to do an extra step here, or will my variables already be converted and able to be used again after the hierarchical analysis?
@tacs3 a year ago
thank you so much! for this one and the hierarchical one!
@thaibinhtran5083 15 days ago
Hi Prof, thank you for this video. Could I please ask how I can explore clusters of categorical data such as clusters of diseases or medications? Thank you in advance.
@zahraalinam62 9 months ago
If the Sig. value for some variables is bigger than .001, what should we do? Should we screen and remove them and do the cluster analysis again?
@lingkan1984 a year ago
For a cluster analysis of multimorbidity, is there any special format for arranging the data?
@zahraalinam62 9 months ago
Which method, hierarchical or K-means, is the most appropriate for dichotomous variables with binary coding (0, 1) showing the presence or absence of a variable?
@GenuineReciprocity 3 years ago
Your videos are so easy to understand and it's so amazing how many people your kindness has been helping! I have a small question and was wondering if you can share your insight about it if you have time available. A study that I am trying to replicate has categorized individuals based on whether they score above or below the mean on two variables (i.e., high high, high low, low high, low low - 4 categories). I was advised that that technique was crude and that I should instead use a cluster analysis to categorize the groups. Why would cluster analysis be a better statistical analysis than what the original authors did in categorizing the variables? Sorry to trouble you! I look forward to more of your incredibly helpful videos!
@divyajaiswal4330 a year ago
Can k-means clustering results be represented graphically? If yes, how?
@naren1705 4 months ago
Thank you for the excellent videos on clustering. I have a dataset in which the majority of the variables are categorical. Which clustering method would be best for categorical variables? If I convert the categorical variable to have values '1', '2', '3' and so on, could I then use this 'numerised' dataset in hierarchical or k-means clustering? I would greatly appreciate your thoughts on these questions.
@GhadeerShm a year ago
Hi, can I find references for the way you selected the variables? Or what is it called?
@LXiao33 3 years ago
brilliant! thank you for uploading this video!
@DataDemystified 3 years ago
My pleasure!
@LXiao33 3 years ago
@@DataDemystified I wonder whether I should choose cluster analysis in SPSS or perform latent class analysis using Mplus to identify the underlying groups in my data. I am still a bit confused. Can you kindly provide some advice? Thank you.
@DataDemystified 3 years ago
@@LXiao33 That entirely depends on your research question. Without knowing that, I really can't answer your question. Sorry!
@erikailles9598 3 years ago
You are a hero!
@DataDemystified 3 years ago
Ha. Thank you!
@tacs3 a year ago
How can we plot this data in SPSS the way R does? Is there a way?
@musiknation7218 2 years ago
I need to do an assignment comparing k-means and improved k-means cluster analysis. Can you please tell me how to do that?
@diborateferi4191 2 years ago
How can I run the k-means clustering algorithm on 40 samples with four variables to group the samples into two clusters?
@joycethegreat9259 9 months ago
During my conjoint analysis, there are no importance values or utilities because SPSS stated "no analysis is performed because there are no valid cases". How do I solve this? I did a cluster analysis to get the utilities and std. error of each cluster, but after performing the conjoint analysis on one of my clusters, it won't show results. Please help. I have no missing values and no duplicates whatsoever.
@roshikaranjan 4 months ago
Can K-means cluster analysis be performed with all categorical variables?
@dr.sethrose 3 years ago
Thank you for a great explanation! I was wondering how to view demographic characteristics between each established cluster. For example, viewing percentage breakdowns of age, gender, etc. in each cluster. Thanks!
@miakirk7010 3 years ago
Very clear explanations. Thank you.
@DataDemystified 3 years ago
Thanks!
@martinpeikert6746 a year ago
So clear, thank you so much!
@rabeeyafarooq2788 9 months ago
How do we define the names as to what is increasing and what is not?
@musiknation7218 2 years ago
How do I do improved k-means cluster analysis?
@elhamtorkashvand a year ago
Thank you for the great video. Would you please explain how to apply the elbow method to find the number of clusters?
@sachikogaming1137 2 years ago
Is it necessary to correlate the variables first before proceeding to clustering? Is it important to select only correlated variables for the analysis?
@DataDemystified 2 years ago
Nope. Clustering does not require variables to be correlated.
@mariabecker1803 3 years ago
Hi, I was wondering how to read in cluster centers from an external file (after having done the hierarchical clustering) as SPSS always shows error messages (not correct format or one variable name is incorrect). Do you have a video for that? or any solution to my problem?
@DataDemystified 3 years ago
Sorry you're having trouble with that. I don't have a video on the topic and don't often import cluster centers from an external file. Is there a reason you are doing it that way rather than natively running the analysis on the data?
@mariabecker1803 3 years ago
@@DataDemystified Yes, I am using k-means clustering in order to validate the cluster centers/numbers of clusters that I have calculated with hierarchical clustering. Therefore, I want to use the cluster centers that I have (from the hierarchical clustering) as a starting point and see what changes when I do the k-means clustering. However, no matter what I do (even when I do everything according to the literature) I get error messages and SPSS has trouble reading in the cluster centres from an external file. Would you know what I could do to avoid the error messages and get my results?
@DataDemystified 3 years ago
@@mariabecker1803 Got it. One option is to just re-run your hierarchical clustering with the original data and then, in the same data file, run the k-means clustering. Save the cluster membership for both analyses, and then do your comparison. If that's not possible and the import isn't working, you can always do it manually. As in, sort the data by some identifier and copy and paste the column of data from your original data (where the hierarchical analysis is) into the new data file (where you plan to run k-means). I hope that helps!
@mariabecker1803 3 years ago
@@DataDemystified Thank you! I have tried that already and it works to compare the two in the same data file. This is not the problem. However, I saw that the cluster memberships are completely different (hierarchical and kmeans), therefore I wanted to do the k-means clustering with the same cluster centers as I discovered in the hierarchical in order to see where the difference is when both have the same starting point, if that makes sense? It is just that there is no other way in order to put in the starting points (cluster centers) manually and only do it with the read in, I guess? which in my case is not working. Therefore, I do not know how to proceed.
@DataDemystified 3 years ago
@@mariabecker1803 My only suggestion at this point is to make sure you are using Ward's Method in your hierarchical clustering. That tends to give results closest to k-means. Good luck!
@mahdifareghi3916 a year ago
Hello, is there any video that analyzes k-means results in more depth?
@mehmettolgataner8878 9 months ago
Is it the same in SPSS 29?
@ezeugochukukere1538 2 years ago
This is very helpful. Oddly enough, the reason I came across this video was because I was searching for how to calculate the initial cluster centers in SPSS. I need them for my R script to perfectly replicate the k-means cluster analysis I run in SPSS. Inputting the initial cluster centers calculated in SPSS provides the exact same results for the final cluster solution in R. It was the first thing you said we don't need, but I am pretty desperate in my search to find out how those initial cluster centers are calculated. Any help you could provide would be huge.
@abdullahisani9746 a year ago
Thanks for the demonstration
@mariabecker1803 3 years ago
Dear Jeff, I was wondering if I could ask you one more question. As I am working with z-scores and trying to compare the means (of z-scores) at the end of the cluster analysis in order to show the difference of variables within and between the clusters, I encountered very high means of z-scores ranging up to 4 or 5. Could this be an indication of outliers? Would you suggest me to remove all the outliers before the analysis or would this change the dataset too much and you would just report it as it is? Thank you!!
@DataDemystified 3 years ago
4-5 on a z-score is pretty high. We typically consider statistical outliers as being more than 3 standard deviations from the mean (which translates to a z-score of 3 or more). The choice to remove data based on outliers, however, is a lot more complex. Did you pre-specify that you would do so? Are you doing it because your results, inclusive of the outliers, don't "look good"? The point is to make sure that your exclusion isn't going to artificially inflate Type 1 error (p-hacking). Good luck!
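As a rough illustration of the 3-standard-deviation rule of thumb in the reply above, here is a minimal Python sketch that converts a variable to z-scores and flags cases beyond a threshold. The data, the function names, and the use of the sample standard deviation are assumptions made for the example; they are not taken from the video or from SPSS output.

```python
from statistics import mean, stdev


def z_scores(values):
    """Return the z-score of each value, using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def flag_outliers(values, threshold=3.0):
    """Return (index, z) pairs for cases whose |z| exceeds the threshold."""
    return [(i, z) for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


# Hypothetical data: 19 typical scores and one extreme score at the end.
scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 10,
          11, 9, 10, 12, 10, 11, 9, 10, 10, 40]

outliers = flag_outliers(scores)
for index, z in outliers:
    print(f"case {index}: z = {z:.2f}")
```

Flagging a case this way only identifies it. Whether to drop it is the separate judgment call discussed in this thread.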
@mariabecker1803 3 years ago
@@DataDemystified I did not pre-specify that I would do that. Just compared to other cluster analysis, with other data, and their results (mean z-scores), mine are very high, so I thought that I might have made a mistake and that it would be best to remove the outliers. However, I do not want to manipulate my data. Maybe it is enough to just mention the high z-scores but leave them in the data? Thank you!
@DataDemystified 3 years ago
@@mariabecker1803 I don't know what context you're reporting in (academic paper, school assignment, etc...) but transparency is always a good thing. At minimum, add a footnote with the explanation. Better yet is a robustness check that is explicitly exploratory: see what happens when you drop those outliers. Do the results meaningfully change? If they do, report that and speculate as to why. If they don't, report that as well with a note about how your results are robust to their removal.
@mariabecker1803 3 years ago
@@DataDemystified Dear Jeff, it's part of my dissertation so I really want to do a thorough job. I will definitely do a robustness check and am curious to see what will change. So thank you for your advice!
@aviralbhatt1664 3 years ago
Hello, I have a doubt and I would really appreciate it if you could clarify it. So do we use Hierarchical Cluster Analysis to identify the potential clusters and then K-Means to understand how those clusters are different from each other?
@DataDemystified 3 years ago
We use Hierarchical Cluster analysis to identify the most likely # of clusters. We then use k-means to actually create those clusters and explore them. Hope that helps!
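To make the second step in the reply above concrete, here is a minimal pure-Python sketch of the standard k-means iteration (Lloyd's algorithm), assuming the number of clusters has already been chosen from the hierarchical step. The data points, the starting centers, and the function name `kmeans` are hypothetical, and SPSS's own K-Means procedure may pick initial centers and stop differently.

```python
def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest center, then move
    each center to the mean of the points assigned to it, until stable."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the closest center (squared Euclidean distance).
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])))
            for point in points
        ]
        # Update step: recompute each center as the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[k])  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # centers stopped moving, so the solution has converged
        centers = new_centers
    return labels, centers


# Hypothetical two-variable data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
start = [(0.0, 0.0), (10.0, 10.0)]  # assumed starting centers, k = 2

labels, centers = kmeans(points, start)
print(labels)
print(centers)
```

Because the starting centers are passed in explicitly, the same function could be seeded with the per-cluster means from a hierarchical solution to see how far k-means moves away from it.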
@aviralbhatt1664 3 years ago
@@DataDemystified Yes, it does. Thanks a lot 🙌
@tracyquetzal9477 2 years ago
Hi Professor, very good presentation. I would like to know how can you understand your cluster in order to label them? What patterns do you find to classify your cluster?
@katiesharp8080 3 years ago
Hi, I love your videos, they are really helping me analyze my dissertation data :) I was wondering if you had any videos that touched on how to identify the characteristics of your clusters? i.e. age, gender, those sort of things?
@DataDemystified 3 years ago
I don’t, but basically you’re just going to run either t-tests/ANOVA or cross tabs. You’d use the cluster number as the independent variable and your demographic as the dependent variable. I have a bunch of videos on those techniques in the SPSS playlist on this channel. Good luck!
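For the cross-tab route described in the reply above, here is a minimal Python sketch that tabulates a categorical demographic within each cluster as row percentages. The cluster memberships, the gender values, and the function name are made up for illustration. In SPSS itself the equivalent would be a Crosstabs run on the saved cluster membership variable.

```python
from collections import Counter, defaultdict


def crosstab_percentages(clusters, categories):
    """Percentage of each category within each cluster (row percentages)."""
    counts = defaultdict(Counter)
    for cluster, category in zip(clusters, categories):
        counts[cluster][category] += 1
    table = {}
    for cluster, counter in counts.items():
        total = sum(counter.values())
        table[cluster] = {cat: 100.0 * n / total for cat, n in counter.items()}
    return table


# Hypothetical saved cluster memberships and a gender variable for ten cases.
cluster_id = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
gender = ["F", "F", "F", "M", "M", "M", "M", "F", "M", "M"]

table = crosstab_percentages(cluster_id, gender)
for cluster in sorted(table):
    row = ", ".join(f"{cat}: {pct:.1f}%" for cat, pct in sorted(table[cluster].items()))
    print(f"cluster {cluster} -> {row}")
```

This only describes the breakdown. Testing whether the clusters differ on the demographic would still need a chi-square test for categorical variables, or a t-test/ANOVA for continuous ones such as age.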
@aarinwood4522 2 years ago
Great series of videos -- thank you! I do have one follow up question: What are the sample size requirements for Cluster Analysis? Thank you!