Natural Language Processing (Part 5): Topic Modeling with Latent Dirichlet Allocation in Python

  83,418 views

A Dash of Data

5 years ago

This six-part video series walks through an end-to-end Natural Language Processing (NLP) project in Python that compares stand-up comedy routines.
- Natural Language Processing (Part 1): Introduction to NLP & Data Science
- Natural Language Processing (Part 2): Data Cleaning & Text Pre-Processing in Python
- Natural Language Processing (Part 3): Exploratory Data Analysis & Word Clouds in Python
- Natural Language Processing (Part 4): Sentiment Analysis with TextBlob in Python
- Natural Language Processing (Part 5): Topic Modeling with Latent Dirichlet Allocation in Python
- Natural Language Processing (Part 6): Text Generation with Markov Chains in Python
All of the supporting Python code can be found here: github.com/adashofdata/nlp-in...

Comments: 72
@sanketg10 5 years ago
Really nice to see all the failed attempts, so many instructors jump to final solution!
@insidetopo101 4 years ago
This is the best LDA video I've ever seen. It is always easier to understand with examples.
@eugenekhcha 5 years ago
You’re amazing. Everything was so brilliantly explained. Thank you so much
@claudiuclement 4 years ago
This is by far the best LDA explanation video. Awesome job!
@basantmounir 1 year ago
What a high quality video. Please make more content, you're talented and you very clearly know what you're talking about in depth. The intuition of LDA in 10 minutes just shows pure mastery.
@ritikgupta6825 1 year ago
yes she's talented and so do you. ....hi myself Ritik
@hsweezytube 4 years ago
Thank you, Alice, so so so much for this amazing illustration and application! 🌻
@ekhtiarsyed 5 years ago
Simply amazing!
@dogukancerrahoglu2970 3 years ago
I've watched many videos and was very confused, but you helped me a lot. You explained everything very well. Thank you so much!
@kushalreddy_09 3 years ago
One of the best tutorials. I can't say more. Thank you
@TheJaeheehwang 4 years ago
I had fun watching your videos. They were very real, applicable and informative! Thank you for these videos and I'll look forward to more videos from you. :)
@douzigege 4 years ago
Much better than DeepLearning's NLP series!!! What a gem on KZbin. Thank you!
@JoyBoseRoy 2 years ago
Thanks for making it so clear and showing how LDA is actually tuned in practice. Really helped!
@junecnol79 5 years ago
wow~ excellent explanation. thanks
@srikanta80 4 years ago
Thanks for simplifying the LDA method. This is the first time even a layman can understand how LDA works.
@sahil0094 4 years ago
What a great tutorial. Thanks for this
@ramnareshraghuwanshi516 4 years ago
Explained.. thanks for uploading video!!
@akshitbhalla874 3 years ago
Thanks for the simple explanation!
@madhavimourya1157 4 years ago
Really loved your way of explaining. Thanks for making LDA topic modelling simple to understand.
@ritikgupta6825 1 year ago
I hope you're not married yet............availabe to be us
@joseradowvisky6292 5 years ago
Genius!!! Thanks!!!
@KayYesYouTuber 4 years ago
Thank you Alice. Very useful video
@acentropuebla8928 4 years ago
Thank you so much. I learned a lot from your videos.
@xevenau 6 months ago
Love these simple learning tutorials. Keep it up!
@Caroline-ro2jf 3 years ago
Great tutorial. Highly recommend
@akramsystems 4 years ago
Bless your soul!
@kanimozhikalaichelvan6927 5 years ago
Excellent explanation. Very useful video. Thanks a lot :-)
@himanshukumarsharma9098 3 years ago
Amazing. 🔥 🔥 🔥 🔥 🔥
@Gokulhraj 2 years ago
Thanks for making it easy to understand :)
@hxvideo 2 years ago
Such a nice and great summary, keep it up!!
@trustinginekwe9273 2 years ago
Nothing to say, other than a massive THANK YOU!
@jayktharwani9822 5 years ago
The explanation is crystal clear, but I would be more thankful if you explained the mathematical part.
@praveenkumarmaduri4126 4 years ago
Great explanation and nice work
@swapnatalari4323 4 years ago
Excellent explanation. Thank you
@4abdoulaye 3 years ago
Great, thanks.
@Pidamoussouma 2 years ago
Nice video. Showcasing what did not work is so important.
@tulsipatro4662 3 years ago
An amazing video! Just curious to know how you created the "cv_stop" pickle, including the whole vocabulary of the corpus.
@poizn5851 1 year ago
really helpful thank you😄
@indurthi 3 years ago
Thank you that was great
@usharambalik6046 5 years ago
Excellent video!! Can the approach model opinions or argumentative text?
@susovandey1875 4 years ago
Great...You explained it in an awesome way. Btw..Can this be used in text clustering??
@lucyh711 3 years ago
Thank you so much for presenting this LDA technique, very informative and practical. I wonder if I could use the Python code to analyze documents in other languages, such as Chinese? Thanks!
@soumyadrip 4 years ago
🔥🔥🔥
@solopackerpodcast2093 4 years ago
Nice! Really enjoyed the explanation! We're googling around for a technique to identify a SEQUENCE of topics within documents, to test the hypothesis that most of "these" documents follow a similar order of topics. If you happen to know a resource we can check out, we'd appreciate the nudge =) Best wishes and stay safe
@RajaSekharaReddyKaluri 5 years ago
LDA is an unsupervised technique. After each word is randomly assigned to one of the topics in the first iteration, how does its assignment change to other topics in subsequent iterations? On what basis does the assignment change? Please provide an explanation.
@abhilashrajiv7270 3 years ago
First of all, let's say LDA has randomly assigned a set of words and topics to document 1. Then LDA goes back, reads the actual document 1, and checks whether the words it randomly assigned are present at a decent frequency in that document. If not, it changes the assignments.
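Editor's note: the random-assign-then-update intuition discussed in this thread resembles collapsed Gibbs sampling, so here is a minimal toy sketch of that procedure. It is an illustration under stated assumptions, not the video's code: the function name `gibbs_lda`, the hyperparameters `alpha` and `beta`, and the tiny word-id corpus are all made up for the example. As far as I know, gensim's `LdaModel` fits the model with a variational method rather than this sampler, so treat this only as a way to see on what basis a word changes topic.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, iterations=50,
              alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler. `docs` is a list of lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # words per topic, per document
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # count of each word, per topic
    topic_total = [0] * num_topics                               # total words per topic
    assignments = []

    # Step 1: give every word a random topic and record the counts.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            z = rng.randrange(num_topics)
            row.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(row)

    # Step 2: revisit each word, take it out of the counts, and redraw its topic.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Weight for topic k = (how common k is in this document)
                #                    * (how common this word is in topic k).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return assignments, doc_topic, topic_word


# Hypothetical word ids: 0=banana, 1=kale, 2=kitten, 3=puppy
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 2, 1, 3]]
assignments, doc_topic, topic_word = gibbs_lda(docs, num_topics=2, vocab_size=4)
print(doc_topic)   # topic counts per document
print(topic_word)  # word counts per topic
```

Both factors in the weight come from the current counts over the whole corpus, not from prior knowledge: "this topic is rare in document 1" and "banana is rare in topic B" are read off the other words' assignments at that moment, and they sharpen as the iterations proceed.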
@anshumanguha1753 1 year ago
Amazing material! Late in 2022, I am listening to this. As Hugging Face/deep learning-based systems take a lot of GPU/CPU dollars, it is not economically feasible to deploy them frequently. The statistical learning techniques must retract.
@fintech1378 8 months ago
wow this is NLP before LLM, unbelievable
@PierLim 3 years ago
This is great, for some reason the volume is a little low though watching on iPhone
@xruan6582 3 years ago
24:13 "animals don't occur often in Document #1" --> how can you be sure? It is a random assignment! "banana doesn't occur much in Topic B" --> how is this information obtained? Is it based on prior knowledge, or derived from document #1?
@kyriakoskyriakou 4 years ago
First of all, thanks for the really explanatory video. Really enjoyed that! I have a question: how did you generate the cv_stop.pkl file? Thanks in advance.
@kushalreddy_09 3 years ago
Hey did you figure this out yet?
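Editor's note: the comments do not say how cv_stop.pkl was produced, so the following is only a plausible sketch. It assumes the file holds a fitted scikit-learn `CountVectorizer` (the later code reads `cv.vocabulary_` from it) that was built with an extended stop-word list and saved with `pickle`. The sample `transcripts` and `extra_stop_words` are hypothetical placeholders; check the linked repository for the author's actual steps.

```python
import pickle

from sklearn.feature_extraction import text
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-in for the cleaned transcripts (one string per document).
transcripts = [
    "people like jokes about dogs and cats",
    "people like stories about family and friends",
]

# Hypothetical extra stop words added on top of scikit-learn's English list.
extra_stop_words = ["like", "people"]
stop_words = list(text.ENGLISH_STOP_WORDS.union(extra_stop_words))

# Fit the vectorizer so that it learns the corpus vocabulary.
cv = CountVectorizer(stop_words=stop_words)
dtm = cv.fit_transform(transcripts)  # sparse document-term matrix

# Save the fitted vectorizer for reuse in the topic modeling step.
with open("cv_stop.pkl", "wb") as f:
    pickle.dump(cv, f)

# Later: load it back and build the id -> word lookup that gensim expects.
with open("cv_stop.pkl", "rb") as f:
    cv_loaded = pickle.load(f)
id2word = {idx: word for word, idx in cv_loaded.vocabulary_.items()}
print(sorted(id2word.values()))
```

The pickle is only a convenience for carrying the fitted vocabulary between notebooks; if you fit the vectorizer in the same script, you can use `cv.vocabulary_` directly and skip the file.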
@pr1395 1 year ago
Is there a function or some way to say what topic is being discussed without the user's input?
@Barneyismyname 3 years ago
Would it make sense to remove curse words and profanity from the corpus, since they are not really topics? Perhaps if a comedian just uses a lot of curse words, the actual topic his or her show is about gets lost.
@balasubramaniamdakshinamoo2694 4 years ago
I tried to run your topic modelling code on my own CSV file and I am getting this error: TypeError: no supported conversion for types: (dtype('O'),). I changed your code as shown below since I don't have the pickle file:
import pandas as pd
data = pd.read_csv('C:/Users/tbadi/TestIncidentDataCSV.csv')  # my local file
from gensim import matutils, models
import scipy.sparse
tdm = data.transpose()
tdm.head()
sparse_counts = scipy.sparse.csr_matrix(tdm)
corpus = matutils.Sparse2Corpus(sparse_counts)
#cv = pickle.load(open("cv_stop.pkl", "rb"))
id2word = dict((v, k) for k, v in cv.vocabulary_.items())
@dataruncoach 1 year ago
Was there a solution to this?
@saibhargavLanka21 4 years ago
Nice video, but the voice is very low. Please get a better mic.
@liyuqi3779 2 years ago
12:48 The Coding Example
@douzigege 4 years ago
You might need these installs for it to work:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
@VijayBhaskarSingh 5 years ago
Hi Alice, Topic A: Bananas 40%?, Kale 30%, Breakfast ..? Topic B: Kittens 30%, Puppies 20%. How did you assign the percentages? Also, when a word is randomly assigned to a topic, the probability of the word in the topic is 50%. With a 50% chance, how is a topic for a word determined in the next iteration? So my question is: how does reassignment happen in the next iteration?
@jifanz8282 4 years ago
I got the same question.
@rachhek 4 years ago
Haha, the words in the topics are so inappropriate. But great video!
@dok3820 4 years ago
???: "you can look at nouns and adjectives.......and now they are looking pretty good!"
topic1: joke, mom, ass, hell, dog
topic2: mom, door, dick, stupid
topic3: friend, n****, gay, long
I dunno about that... Haha jk, I get that it's just a small dataset and you were just showing some techniques. Great vid! Thx
@noli-timere-crede-tantum 1 year ago
18:15 "only nouns" shows words like "I" and "thank" and "living" and "lets". Not a great example.
@BiologyIsHot 3 years ago
Anyone else planning to look into Ali Wong after this?
@jameslucas5590 3 months ago
Dirichlet is German; it's pronounced dee-ree-kley
@prateekjaiswani8755 2 years ago
In India I can't Show These Fucking Words To My Mentors ...
@patrykkoakowski4357 1 year ago
Key words in your topics are ridiculous
@amynguy 5 years ago
ML and NLP are the most boring and useless subjects ever