Plotting TF-IDF and K-Means Data with Matplotlib (Topic Modeling in Python for DH 02.04)

9,748 views

Python Tutorials for Digital Humanities

Comments: 15
@luca-o-sharp 2 years ago
This is such a beauty! Thanks for sharing it all in such an easy-to-understand way!
@python-programming 2 years ago
Thanks! No problem at all! So happy others are finding it useful!
@felixgiles2617 1 year ago
Great video, thank you! In case of overlapping data points, I think you would also get better results if you call model.fit_predict() with scatter_plot_points from line 112 (the reduced vectors) to get the kmean_indices. It also helps with plotting the centroids of the clusters.
@python-programming 1 year ago
Thanks for mentioning this! I will check this out!
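The suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions: the toy corpus `docs`, the helper `plot_clusters`, and the choice of `true_k = 2` are made up for the example, and the names `scatter_plot_points` and `kmean_indices` are borrowed from the comment, not copied from the video's code.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus; the video works with a much larger dataset.
docs = [
    "the king ruled the kingdom",
    "the queen ruled the kingdom",
    "the king and queen held court",
    "python code for plotting data",
    "plotting data with python code",
    "python code and data plots",
]

# TF-IDF vectors, then reduced to two dimensions for plotting.
vectors = TfidfVectorizer().fit_transform(docs)
scatter_plot_points = PCA(n_components=2).fit_transform(vectors.toarray())

# Cluster the *reduced* points, as the comment suggests, so the labels
# and the centroids live in the same 2D space as the scatter plot.
true_k = 2  # assumed number of clusters for this toy example
model = KMeans(n_clusters=true_k, n_init=10, random_state=42)
kmean_indices = model.fit_predict(scatter_plot_points)
centroids = model.cluster_centers_  # shape (true_k, 2)


def plot_clusters(points, labels, centers):
    """Scatter the 2D points colored by cluster and mark each centroid."""
    fig, ax = plt.subplots()
    ax.scatter(points[:, 0], points[:, 1], c=labels, cmap="tab10")
    ax.scatter(centers[:, 0], centers[:, 1], marker="x", s=120, c="black")
    return ax
```

Calling `plot_clusters(scatter_plot_points, kmean_indices, centroids)` followed by `plt.show()` would draw the figure. Note the trade-off: clustering after the reduction keeps the plot tidy, but the clusters are then based on two PCA components and not on the full TF-IDF vectors.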
@MrSerazzo 2 years ago
You saved my life
@rafaanimir8820 1 year ago
Thank you so much for this series. I have learned so much from you.
@python-programming 1 year ago
I am so happy to hear that!! Thanks!
@miketueller5916 3 years ago
You mention in the video that the description will have a link to stackoverflow. I don't see it right now. Can you add that? Thanks!
@python-programming 3 years ago
Thanks for catching that! Here you are: stackoverflow.com/questions/27494202/how-do-i-visualize-data-points-of-tf-idf-vectors-for-kmeans-clustering -- I'm adding it to the description now.
@maxrichardson5252 1 year ago
Hello, every time I remake the graph, the colours of the plot change. Why does it do this, and how would I fix it?
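One likely cause, assuming the colours are picked by cluster index, is that K-Means starts from random centroids, so the same group of points can receive a different cluster number on each run. A minimal sketch of pinning the result with `random_state` follows; the synthetic `points` and the helper `cluster` are hypothetical and only stand in for the reduced vectors from the video.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 2D data: three loose blobs standing in for reduced TF-IDF vectors.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=center, scale=0.3, size=(20, 2))
    for center in ((0, 0), (3, 3), (0, 3))
])


def cluster(data, seed):
    """Run K-Means with a fixed seed so the cluster numbering is repeatable."""
    model = KMeans(n_clusters=3, n_init=10, random_state=seed)
    return model.fit_predict(data)


first = cluster(points, seed=42)
second = cluster(points, seed=42)

# With the same seed and the same data, both runs assign identical labels,
# so colors[label] picks the same colour for each point every time.
labels_match = bool((first == second).all())
```

If the colours still change with a fixed seed, the cause is probably elsewhere, for example a colour list that is itself generated randomly.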
@KatorZChannel 2 years ago
What do I need to do with a large dataset?
@fancynex 2 years ago
I have an IndexError: list index out of range =(
@irreadings 2 years ago
Same here. colors[d] doesn't work because d sometimes has a value larger than the length of the colors list.
@irreadings 2 years ago
Oh, I got what the problem was. He changed the "true_k" variable to 5 (whereas formerly it was 20).
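A way to avoid this mismatch, sketched under assumptions, is to derive the colour list from `true_k` so the two cannot drift apart. The palette `base_colors`, the helper `colors_for_clusters`, and the example labels are hypothetical and not taken from the video's code.

```python
from itertools import cycle, islice

# Hypothetical five-colour palette, shorter than the number of clusters.
base_colors = ["r", "b", "c", "y", "m"]


def colors_for_clusters(true_k, palette=base_colors):
    """Return exactly true_k colours, repeating the palette if it is too short."""
    return list(islice(cycle(palette), true_k))


true_k = 20
colors = colors_for_clusters(true_k)

# Example cluster labels; every label below true_k now has a colour,
# so colors[d] can no longer raise an IndexError.
kmean_indices = [0, 7, 19, 3]
point_colors = [colors[d] for d in kmean_indices]
```

Repeating a short palette means several clusters share a colour. For up to 20 clusters, a larger built-in matplotlib colormap such as "tab20" may be a better choice.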
@fancynex 2 years ago
@irreadings Thank you! I'll even go back to the code and fix it so that it finally works =)