Code With Me : Building a Spam Filter !

9,860 views

ritvikmath

3 years ago

Ever wonder how a spam filter works? Let's code one together!
Bag of Words Video: • Bag of Words : Natural...
Link to Code : github.com/ritvikmath/KZbin...
Link to Data : www.kaggle.com/uciml/sms-spam...
My Patreon : www.patreon.com/user?u=49277905

Comments: 17
@tamojitmaiti 3 years ago
I have been consistently watching your videos for a while now. It's amazing how clearly and concisely you manage to explain everything. Just a suggestion: it'd be massively helpful if you could link or mention relevant books on each topic in the video description. In the age of SEO-laden blog posts, filtering out the noise in Google searches is a pain.
@drsandeepvm5622 2 years ago
Thanks for your exceptionally simple explanation 👍
@gracikk 2 years ago
Really good explanation. Thank you!
@kdhlkjhdlk 3 years ago
In my experience, using word embeddings (fastText is good for the character n-grams) plus gradient boosting is far better than Naive Bayes.
@photonicsauce7729 1 year ago
This is just Naive Bayes, maybe. Cool.
@YohaneesHutagalung 3 years ago
Could you make a video about stock predictions with an RNN/LSTM?
@Blaze098890 1 year ago
spoiler: it doesn't really work
@ccuuttww 3 years ago
LOL, I think something went wrong, either with your dataset or elsewhere. In most situations Naive Bayes has around 93% accuracy, even if you apply techniques such as lemmatization, stemming, stop-word removal, and n-grams. Anyway, this video gives us an idea of how it works in Python.
@Islamic_Tv984 1 year ago
link please
@jessicatriplev9802 3 years ago
I think you've made a mistake in: test_spam_df = spam_df.iloc[int(len(spam_df)*0.7):] The testing set should have been 30% not 70% of the data. That's perhaps why the validation result was so good.
@gracikk 2 years ago
test_spam_df = spam_df.iloc[int(len(spam_df)*0.7):] This basically means that the test set keeps only those observations whose position is greater than or equal to len(spam_df)*0.7.
@jessicatriplev9802 2 years ago
@gracikk Right. But that means that 70% of the data will end up in the testing set. He must have meant 30%.
@gracikk 2 years ago
@jessicatriplev9802 I can't send you a Colab link with a simple example; YouTube deletes it. But you can try it on your own to be sure that everything is fine.
@joshuasigelman8141 1 year ago
I think it's fine because he is taking the first slice UNTIL the 70% mark and then the testing set FROM the 70% mark.
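The disagreement in this thread can be checked with a small sketch. The 10-row `spam_df` below is a hypothetical stand-in for the video's data, and the `train_spam_df` line is an assumption based on the comment above about taking the first slice until the 70% mark; only the `test_spam_df` line is quoted from the thread.

```python
import pandas as pd

# Hypothetical stand-in for the video's spam_df: 10 rows keep the percentages easy to read.
spam_df = pd.DataFrame({"text": [f"message {i}" for i in range(10)]})

split = int(len(spam_df) * 0.7)  # position of the 70% mark

# Assumed training slice: everything before the 70% mark.
train_spam_df = spam_df.iloc[:split]

# Line quoted in the thread: everything from the 70% mark to the end.
test_spam_df = spam_df.iloc[int(len(spam_df) * 0.7):]

print(len(train_spam_df), len(test_spam_df))
print(test_spam_df.index.tolist())
```

Because `iloc[start:]` keeps the rows from position `start` onward, the test slice here holds the last 30% of the rows, not 70%.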
@GeyzsonKristoffer 9 months ago
Bypass the model by repeating, multiple times, words that are biased towards non-spam.
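The padding attack described in this comment can be illustrated with a toy bag-of-words Naive Bayes scorer. Everything below is a sketch under stated assumptions: the training messages are made up (not the Kaggle SMS data), the helper names are hypothetical, and the model in the video may differ in its details.

```python
import math
from collections import Counter

# Made-up toy training data, for illustration only.
spam_msgs = ["win free prize now", "free cash win", "claim free prize"]
ham_msgs = ["see you at lunch", "lunch tomorrow at noon", "are you home tonight"]

def word_counts(msgs):
    """Count how often each word appears across a list of messages."""
    return Counter(word for msg in msgs for word in msg.split())

spam_counts, ham_counts = word_counts(spam_msgs), word_counts(ham_msgs)
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(msg, counts):
    """Sum of log P(word | class) with add-one smoothing; unknown words are skipped."""
    total = sum(counts.values())
    return sum(math.log((counts[w] + 1) / (total + len(vocab)))
               for w in msg.split() if w in vocab)

def spam_score(msg):
    """Log-odds of spam vs. non-spam; the class priors are equal here and cancel."""
    return log_likelihood(msg, spam_counts) - log_likelihood(msg, ham_counts)

original = "win free prize"
padded = original + " lunch" * 5  # repeat a word that leans towards non-spam

print(spam_score(original) > 0)  # positive score: classified as spam
print(spam_score(padded) > 0)    # padding pushes the score below zero
```

Each repeated non-spam word multiplies the odds by the same factor below one, so enough repetitions outweigh the spam-leaning words in this toy model.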
@BurkenProductions 2 years ago
Wrong language for this. Need PHP.
@Lizergus 1 month ago
lmao