Text Preprocessing | NLP Course Lecture 3

97,147 views

CampusX

A day ago

In this video, we'll break down the steps involved in getting text data ready for analysis. Think of it as cleaning and organizing text so that it's easier to understand and work with. This process helps us get valuable insights when we're dealing with large amounts of text information.
Code used: www.kaggle.com/campusx/text-p...
Assignment Links:
api.themoviedb.org/3/movie/to...
api.themoviedb.org/3/genre/mo...
✨ Hashtags✨
#DataScience #TextPreprocessing #Stemming #Tokenization
⌚Time Stamps⌚
00:00 - Intro
1:01 - Introduction
4:03 - Lowercasing
7:53 - Remove HTML Tags
12:44 - Remove URLs
15:16 - Remove Punctuation
23:29 - Chat word treatment
26:20 - Spelling Correction
28:11 - Removing Stop words
31:25 - Handling Emojis
34:11 - Tokenization
49:18 - Stemming
57:50 - Lemmatization
1:01:33 - Assignment
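
The first few steps in the list above (lowercasing, removing HTML tags, removing URLs, removing punctuation, and a basic tokenization) can be chained into a single function. The following is only a minimal sketch using the Python standard library; the regular expressions and the `preprocess` name are illustrative assumptions, not the code from the lecture notebook.

```python
import re
import string

# Assumption: simple illustrative patterns, not the notebook's actual code.
HTML_TAG = re.compile(r"<[^>]+>")            # anything between < and >
URL = re.compile(r"https?://\S+|www\.\S+")   # http(s) links and bare www links
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> list[str]:
    """Lowercase, strip HTML tags, URLs and punctuation, then split on whitespace."""
    text = text.lower()
    text = HTML_TAG.sub("", text)
    text = URL.sub("", text)           # remove URLs before punctuation is stripped
    text = text.translate(PUNCT_TABLE)
    return text.split()                # naive whitespace tokenization


tokens = preprocess("<p>Check https://example.com NOW, it's GREAT!</p>")
print(tokens)
```

The order matters here: URLs are removed before punctuation, because stripping punctuation first would break the `://` that the URL pattern relies on.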

Comments: 144
@harinair3002 7 months ago
To anyone following this playlist: please do the assignment. I was shocked at how little we learn by just watching. I got stuck many times while doing it, but I completed it in the end, and now I regularly practice text preprocessing on datasets I build from Rapid APIs. Working on a dataset you created yourself gives you so much flexibility.
@surajnikam3327 6 months ago
Ma'am, can you explain, or point me to some notes or videos on, using APIs to create your own DataFrame?
@komalkumbhare4789 5 months ago
Hey Hari! The assignment links given above do not lead to the TMDB website, and searching for TMDB directly on Google doesn't work either. Can you tell me how you did it?
@sampath4150 17 days ago
Hello, have you saved that code? It has been removed and I need it immediately.
@venualli3917 10 hours ago
Would you please let me know some resources for practice?
@usmanhaider5255 17 minutes ago
The session was so good, and the assignment was amazing to do. Thank you for your hard work, sir.
@GamerBoy-ii4jc 2 years ago
Again, sir, you are a great person on YouTube. Your explanation in every domain and for every topic is great. I followed your ML playlist from A to Z and have now started watching NLP. I hope you will complete your ML series and this one soon, and make more great series for us on new and emerging things. Thanks a lot, sir!
@shikhasoni9346 2 years ago
Your lectures really help me understand NLP text preprocessing. Thank you so much!
@sukantb1980 2 years ago
You are a rare gem; I can simply put it in clear, short words ❤️❤️
@bhanu0925 2 years ago
Exactly, the rarest!
@sachi-4750 2 years ago
You are really a great teacher. Thank you so much for coming up with such informative videos. Thanks a lot.
@Riya-zb1iz 1 year ago
This series is amazing!
@prashantlakde 2 years ago
Your way of explaining shows your conceptual clarity and the effort you put into preparing this topic. Keep it up.
@siddharth4251 11 months ago
Thanks a lot, Nitish. I don't have enough words to express my gratitude.
@siddharthbhardwaj7664 2 years ago
Hi, could you please make the next video on the same IMDB dataset and show us how to analyze the linguistic features of the training data? I recently went through your previous NLP (Movie Review Sentiment Analysis) videos, and I am interested in how we can analyze the linguistic features and which algorithms, apart from Naive Bayes, we can apply to the same IMDB dataset. PS: your videos are amazing! The way you teach the concepts has helped me understand the basics of NLP. Thank you so much!
@sarithajaligama9548 4 months ago
Very good explanation. You explain every single detail, which is very helpful for beginners, and the assignments are also very interesting. I wish I had found your channel earlier, but I'm lucky to have it now.
@miteshkumar7739 2 years ago
Your lectures are really helpful. All the concepts are very clear.
@mohaiminrahat4974 2 years ago
Sir, you are a lifesaver. Thank you!
@NishantKumar-dw5er 1 year ago
Very detailed explanation. Kudos to you.
@raj-nq8ke 2 years ago
Gold content. Thanks for the video.
@raj4624 2 years ago
So far so good. Awesome x 100.
@samt5682 2 years ago
Literally all in one!
@manishachaurasia3405 1 year ago
The series is amazing, sir 👏 Kindly provide the regex lecture in the description.
@pralaymondal3324 2 years ago
Thank you, you are just awesome. I waited a long time for this video. You explain things better than other YouTubers. Keep it up!
@rajeevranjan5007 2 years ago
Nice assignment, sir. Thank you.
@BTStechnicalchannel 1 year ago
Thanks for the great content! One small suggestion: can you also give us some time to write the code you are explaining? Otherwise it becomes theoretical.
@shipradhiman08 2 years ago
Awesome lecture 🤗🤗🤗❤️❤️❤️❤️
@pankajbeldar9799 1 year ago
You are God for me in learning data science.
@cipher4811 2 years ago
Sir, I have been following you for a long time. I am glad that I found your channel, I am learning so much from you, and I am grateful for that; thank you from the bottom of my heart. Until now I have been working with Google Colab, but as I am moving towards deep learning, I think it's time for me to buy a high-end laptop. I am at a loss about which one to pick: if I go for an RTX 3080, the price is way too much for me. I have been confused about this for the past few weeks. Can you please suggest a laptop for ML, AI, and DL learning projects? My budget is $1400-1500. I would be grateful. Or you could make a video on this topic.
@stunninghealer7442 4 months ago
You are the best, sir 😊.
@MRBAM 2 years ago
It's helpful for me ❤️
@rafibasha4145 2 years ago
Please link the notebook in the description, and please also complete the NLP playlist.
@abhishekpathak9654 1 year ago
Your videos are full of knowledge. Thanks a lot for this 🙏 You deserve more subscribers. You could attract more viewers if you divided your videos into smaller parts; people generally don't want to engage with long lectures.
@shaiksalavuddin5976 2 years ago
Sir, thank you so much 😊
@tanmayshinde7853 2 years ago
Does anyone know how to apply a word or sentence tokenizer to columns? If you know, please reply.
@jandaabdulla9335 2 years ago
Congrats, sir, on the third video 🥳🥳
@deepankarmullick3121 2 years ago
Amazing video, but where can I download the notebooks? I would also request that you share the notebook URLs in the video description.
@saurabhdeshmane8714 1 year ago
Sir, could you please share the notebook? It is not available at the given link.
@NaryVip 2 years ago
You didn't link the regular expression video in the description. Can you update it?
@abdulqadar9580 1 year ago
You are amazing, sir. Love from Pakistan.
@dilipkumarbk7657 1 year ago
The way of teaching is cool; I loved it. One doubt: at 12:00, remove_html_tags() only removes the tags themselves, but when we scrape data from a real website it contains elements such as style and script whose contents aren't needed for text mining or NLP. Is there a better approach or method that handles this? Thanks in advance to everyone who tries to answer.
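
One possible way to handle the script and style question above is to parse the HTML instead of only deleting the tags, and to skip everything inside those two elements. The following is a minimal sketch using the standard library's `html.parser`; the class and function names are hypothetical, and it is not the `remove_html_tags()` function from the video.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def get_text(self):
        # Join the pieces and collapse runs of whitespace.
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()   # flush any buffered text
    return parser.get_text()


sample = (
    "<html><head><style>p {color: red;}</style>"
    "<script>var x = 1;</script></head>"
    "<body><p>Great <b>movie</b>!</p></body></html>"
)
clean = html_to_text(sample)
print(clean)
```

A limitation of this sketch is that text from adjacent block elements with no whitespace between them runs together. For real scraping work, a dedicated HTML parsing library such as Beautiful Soup is a common alternative.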
@gautampatadiya6096 5 months ago
Thanks!
@jasonbourn29 10 months ago
I checked both methods of removing punctuation, but they are similar in speed, and sometimes the second one is slower. Why is that?
@satyamtiwari7680 1 year ago
An easy way to remove punctuation:
import string
import re

def remove_punctuation(text):
    # Define the set of punctuation characters
    punctuations = string.punctuation
    # Remove punctuation using a regular-expression character class
    text_no_punct = re.sub('[' + re.escape(punctuations) + ']', '', text)
    return text_no_punct
@unknown-ho4wk 7 months ago
That was an awesome tutorial. Can you please link to your regular expression video?
@pankajnaik1574 1 year ago
You are the best.
@pradumankumar7607 2 years ago
Sir, can you please share the link to the "chatword" list used in the chat word treatment?
@rishabhvarshney2234 2 years ago
Can we get a PDF of the code that you wrote in this video?
@trackbackresearch 2 years ago
Thank you, sir.
@bhushanbowlekar4539 1 year ago
Sir, at timestamp 3:30 you said you would provide the notebook. Can you please provide it? Thank you.
@youtubekumar8590 1 year ago
Thank you, brother.
@Akashphs7217 2 months ago
Hi sir. Regarding the assignment, how can we merge the genre ID and genre type with the movies DataFrame? I got stuck there.
@shrutianand285 2 years ago
How do I use TextBlob for a large dataset?
@kislaykrishna5599 2 years ago
Great content.
@SurajitDas-gk1uv 6 months ago
Thank you.
@ritakathrotiya 4 months ago
In the assignment, does anyone have a solution for changing the genre IDs to their names?
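
For the genre questions above, one approach is to build an ID-to-name lookup from the genre list and apply it to each movie. This is a minimal sketch that assumes the two assignment endpoints return a `genres` list of `id`/`name` pairs and movie records carrying a `genre_ids` list; the sample data and the `add_genre_names` helper are illustrative, not real API output or the official solution.

```python
# Assumption: hand-written sample data shaped like the two API responses.
genre_response = {
    "genres": [
        {"id": 28, "name": "Action"},
        {"id": 18, "name": "Drama"},
        {"id": 35, "name": "Comedy"},
    ]
}
movies = [
    {"title": "Movie A", "genre_ids": [18, 28]},
    {"title": "Movie B", "genre_ids": [35, 999]},  # 999 is deliberately unknown
]

# Build the lookup once: {28: "Action", 18: "Drama", ...}
id_to_name = {g["id"]: g["name"] for g in genre_response["genres"]}


def add_genre_names(movie, lookup):
    """Return a copy of the movie with a comma-separated 'genres' field."""
    names = [lookup.get(gid, "Unknown") for gid in movie["genre_ids"]]
    return {**movie, "genres": ", ".join(names)}


rows = [add_genre_names(m, id_to_name) for m in movies]
for row in rows:
    print(row["title"], "->", row["genres"])
```

A list of dictionaries like `rows` can then be passed to `pandas.DataFrame(rows)` to get one row per movie.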
@piyushpathak7311 2 years ago
Sir, when will you start the series on deep learning?
@ahmedullahkhan9166 11 months ago
Where is the notebook link? The link above only shows a CSV file.
@surajnikam3327 6 months ago
Can anyone explain how to create a DataFrame for the assignment using this API? Please! 🙏
@riiyyyaaaa 4 months ago
Hi sir, can you please re-add the data links here? I am unable to load them.
@anupprasad695 2 years ago
One suggestion: sir, please make a Udemy course, a data science bootcamp.
@sachin2725 1 year ago
Please link the notebook used in this video in the description.
@bhanuprakash5060 1 year ago
Where is the notebook for this lecture? Could you please upload it?
@anshuman_madhav 2 years ago
While using the lowercase conversion function shown at 7:23, I get the warning below, even though the conversion succeeds. Can you let me know whether there is another way to do the conversion, or whether we can ignore the warning? "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead"
@campusx-official 2 years ago
Ignore
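
For readers who would rather avoid the warning than ignore it: it typically appears when the DataFrame being modified was itself created by slicing another one. The following is a minimal sketch of two common ways around it, assuming pandas is installed; the `review` and `sentiment` column names and the sample rows are hypothetical, not the lecture's dataset.

```python
import pandas as pd

# Assumption: a tiny stand-in for the reviews dataset.
df = pd.DataFrame(
    {
        "review": ["GREAT Movie", "Bad PLOT", "Just OK"],
        "sentiment": ["positive", "negative", "positive"],
    }
)

# Option 1: take an explicit copy of the subset before modifying it,
# so pandas knows the subset is an independent DataFrame.
subset = df[df["sentiment"] == "positive"].copy()
subset["review"] = subset["review"].str.lower()

# Option 2: assign on the original frame through .loc,
# which is what the warning message itself suggests.
df.loc[:, "review"] = df["review"].str.lower()

print(subset["review"].tolist())
print(df["review"].tolist())
```

Either way the lowercase result is the same; the difference is only that pandas no longer has to guess whether the assignment was meant for the original data or for a temporary copy.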
@manucmgowda 1 year ago
Sir, the notebook link doesn't work. Please upload the notebook discussed in the video.
@ShivaniSharma-tk4bl 1 year ago
@campusX I can't find the code. Can you please give the link?
@kumarabhishek1064 2 years ago
Where is the template notebook?
@rahulrajbhar7012 2 years ago
Please make a video on how a fresher should explain a data science project in an interview.
@ajitkulkarni1702 9 months ago
Hello sir, can you reshare the code? The link you shared has no code. Thanks!
@furry2fun 11 months ago
Can anyone send the link to the notebook? The given link does not work.
@faizahmed8015 10 months ago
At 56:30, it is "probable" with an "e". I understand, but it was confusing me. Thank you, sir, for such a good video ❤
@ashishsom3849 1 month ago
I am not able to find the notebook with the code. Could anyone please help?
@maheshbhatt1505 1 year ago
Could someone please help me convert that chat words file into a dictionary?
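
A chat words file can be turned into a dictionary by splitting each line on its separator. This is a minimal sketch that assumes one `ABBREVIATION=Expansion` pair per line; the real file may use a different layout, and the sample text and function names here are illustrative only.

```python
import io

# Assumption: the file holds one "ABBREVIATION=Expansion" pair per line.
# io.StringIO stands in for an opened text file in this sketch.
sample_file = io.StringIO(
    "AFAIK=As Far As I Know\n"
    "ASAP=As Soon As Possible\n"
    "\n"
    "IMO=In My Opinion\n"
)


def load_chat_words(lines):
    """Build {abbreviation: expansion} from an iterable of text lines."""
    chat_words = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue                       # skip blank or malformed lines
        short, full = line.split("=", 1)   # split on the first '=' only
        chat_words[short.strip().upper()] = full.strip()
    return chat_words


def expand_chat_words(text, chat_words):
    """Replace each whitespace-separated word that is a known abbreviation."""
    return " ".join(chat_words.get(word.upper(), word) for word in text.split())


chat_words = load_chat_words(sample_file)
expanded = expand_chat_words("imo he is the best", chat_words)
print(expanded)
```

With a file on disk, the same function works on the file object directly, for example `with open(path) as f: chat_words = load_chat_words(f)`, where `path` is wherever the list was saved.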
@bhushanbowlekar4539 1 year ago
Can you please share the Colab file?
@siddharthkarale3100 4 months ago
I am having trouble with the assignment, as I have no idea how to get data into a DataFrame using an API.
@anitabhandari3886 3 months ago
@campusX: can you please suggest how we can use text for regression (e.g., using comments to predict the number of subscribers)?
@abhishekvashistha2398 26 days ago
The code used is not available at the link. If anyone has it, please share.
@piyushpawar75 7 months ago
I got an OSError when using the spaCy library.
@zkhan2023 2 years ago
Thanks, sir.
@snrmedia8965 2 years ago
Nice video 👍
@waqaralam7519 1 year ago
Sir, I can't find the code page on Kaggle. Can anyone help?
@allwithinone1345 2 years ago
Thanks, sir.
@vijayraghuwanshi4486 11 months ago
Has anyone tried the assignment? If so, please reply; I have a few doubts.
@imamasafeer4536 5 months ago
Where is the video on regular expressions?
@mridang2064 1 year ago
Thank you. Can you also start a series on web development? You're just an excellent teacher.
@Codingon_lup 1 year ago
Hey
@Codingon_lup 1 year ago
Are you working in NLP or something else in Python? I need your help. Can you help me?
@dipeshsilwal8098 2 years ago
Hello sir, your code is unavailable. Please make it available.
@user-sk6hn9jm3f 1 year ago
How do I make this dataset?
@adityasoni1639 2 years ago
The notebook/code is not available!
@gauravverma3700 2 years ago
Awesome
@SLADE-VA 5 months ago
Couldn't find the notebook link!
@tapanpati9452 1 year ago
Can anyone share the notebook?
@miteshkumar7739 2 years ago
Hello sir, please make a video on the R programming language.
@anooshkaa 4 months ago
No saved version of the notebook is showing.
@anshumanmahabhoi5771 7 months ago
Where is the notebook?
@tusarmundhra5560 9 months ago
Awesome
@mdaliarmaghan8292 2 months ago
Can you please provide the solution for this assignment?
@swet_gokugod9382 8 months ago
Great
@potjason2132 4 months ago
Actually, tokenization doesn't work on the dataset. Can you write code to tokenize only the reviews in your dataset?
@shyamtyagi95 2 years ago
Nice video
@KumR 4 months ago
Done.
@freshersadda8176 2 years ago
❤️
@usmanyousaaf 1 year ago
Sir, where is the notebook link?
@AshishSharma-tf3fy 2 months ago
Sir, the TMDB website is blocked in India.
@MRBAM 2 years ago
👍
@yashjain6372 1 year ago
Best