Related Article: www.geeksforgeeks.org/twitter-sentiment-analysis-using-python/
@shlokkumar62577 ай бұрын
Siddharthan sir, congratulations on joining GeeksforGeeks.
@alexramos58711 ай бұрын
Great content from Siddhardhan.
@Sany19205 ай бұрын
you got it bro!
@Sagardeep_Das2 ай бұрын
Thank you so much, GfG. I even included neutral tweets in this sentiment analysis. I will try to do this for other social media datasets.
@codehub832611 ай бұрын
Please make a video on career recommendation after secondary school using ML.
@bharathikathiresan8 ай бұрын
Very clear explanation.
@sunshineandrainbow54534 ай бұрын
Beautifully explained!!! Thanks a lot sir 🙏🏻🙏🏻🙏🏻 PS: How can we use this to make a project that takes input from the user?
@janaraj2958 ай бұрын
Nice explanation, completed the project👍
@rasdhdg8 ай бұрын
Hi brother. Can you please provide me the code? I will pay whatever amount you ask. 🙏🙏🙏🙏🙏🙏
@1anu_ra-dha.8 ай бұрын
Have you run this project??
@janaraj2958 ай бұрын
@@1anu_ra-dha. Yes
@reema90886 ай бұрын
Have you run the project?
@rachanikhilrnr11 ай бұрын
But what if I want to test on new data?
@sumitsamanta7416Ай бұрын
You mentioned overfitting, and we can clearly see the model is overfitted. So can we improve the accuracy and make the model a little less overfit?
@rahulmahatha908611 ай бұрын
Link to the notebook, please.
@AbhimanyuKaushik-i5p3 ай бұрын
Stemming would remove the comma as well as the apostrophe, but we need to keep the apostrophe. So what should the regex be for that?
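For anyone else wondering: one possible tweak is to add the apostrophe to the allowed character class in the cleaning regex. A minimal sketch (the function name is hypothetical, not from the video):

import re

# Hypothetical variant of the tutorial's cleaning step: keep apostrophes,
# drop everything else that is not a letter.
def clean_keep_apostrophes(content):
    cleaned = re.sub(r"[^a-zA-Z']", ' ', content)  # commas, digits, etc. become spaces
    return cleaned.lower()

print(clean_keep_apostrophes("Don't worry, it's fine!"))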
@ravindarmadishetty73611 ай бұрын
Nice explanation
@janmjaysingh27197 ай бұрын
How can we use our own tweet or comment to test the model? For now we are only testing by picking a row from the dataset by its index. Tell me, can I actually write my own sentence and check whether it's negative or positive? 😢
@ymmich21434 ай бұрын
I have the same doubt. Please let me know as well if you have found the answer.
@ChithraiSelvan-d8y2 ай бұрын
@@ymmich2143 I also have the same doubt.
@DeionDz12 күн бұрын
Convert your tweet into a vector using the process shown in the video, and then follow the same prediction pattern.
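For everyone asking the same thing, a minimal sketch of that idea, assuming the stemming function, the fitted TF-IDF vectorizer, and the trained model from the tutorial are already in memory (those variable names, and the 0 = negative / 1 = positive label mapping, are assumptions):

# Classify one hand-written tweet with the already-trained objects
my_tweet = "I really enjoyed this tutorial, everything was explained clearly"

processed = stemming(my_tweet)                  # same preprocessing as the training data
features = vectorizer.transform([processed])    # transform, NOT fit_transform
prediction = model.predict(features)[0]

print("Positive" if prediction == 1 else "Negative")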
@crickethighlights71858 ай бұрын
Thanks for the video
@saicharanchepuri81454 ай бұрын
Just a suggestion: it would have been simpler to call the library methods directly rather than storing each one in a separate variable. It makes the code look heavy, confusing, and difficult to follow. Thanks for the tutorial.
@prasadmahadik35143 ай бұрын
Hi Sir, can we keep customer_Id together with the text when splitting X, so we can see which customers left negative reviews?
8 ай бұрын
I implemented the above code and it took some time to stem; then I implemented another version with lemmatization and it was fast. In general, lemmatization is more computationally expensive than stemming. When I checked this code again I found a potential change: the loop calls stopwords.words('english') for every row; instead, you can define it once in a variable and use that variable inside the loop. After making that change, stemming with the above code took me 2 minutes.
@soukarya_ghosh66128 ай бұрын
Can you give the dataset link?
8 ай бұрын
@@soukarya_ghosh6612 You can find it in the video itself at 3:30.
@thegeeks20026 ай бұрын
Can you please show what this part of the code looks like after the changes you mentioned?
6 ай бұрын
@@thegeeks2002 Before the changes it took around 1 hour, I guess; afterwards it took 2 minutes. However, I ended up using lemmatization even though it also took about 2 minutes. This tutorial is really helpful for getting started with this use case and then exploring more advanced concepts.
@shreyanshgupta15876 ай бұрын
Do you have the code with you right now?
@saurabhkushwah53883 ай бұрын
Great explanation, sir.
@syedasabeeh-g4j8 ай бұрын
Very useful, but my code is not showing stemmed_content, and it takes only 2 to 3 minutes to run instead of 55 minutes. What could the error be, sir?
@sizzrizz60748 ай бұрын
You didn't add a blank space between the quotation marks in the .join statement.
@tejiriamrasa32587 ай бұрын
@@sizzrizz6074 that isn't the cause of the problem.
@anuragpaul4538 күн бұрын
Sir, there are other options to make this step faster. It took me only 5 minutes to perform stemming. Here's the code:

import pandas as pd
import re
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
from concurrent.futures import ThreadPoolExecutor
from tqdm import tqdm

# Initialize stopwords
stop_words = set(stopwords.words('english'))

def stemming(content):
    try:
        port_stem = PorterStemmer()  # Instantiate inside the function
        stemmed_content = re.sub('[^a-zA-Z]', ' ', content).lower()
        return ' '.join(port_stem.stem(word) for word in stemmed_content.split() if word not in stop_words)
    except Exception as e:
        print(f"Error processing content: {content}. Error: {e}")
        return ""  # Return an empty string on error

def process_data(df):
    with ThreadPoolExecutor() as executor:
        return list(tqdm(executor.map(stemming, df['text']), total=len(df)))

# Process the DataFrame in chunks
chunk_size = 50000  # Adjust based on your memory capacity
num_chunks = len(twitter_data) // chunk_size + 1

stemmed_contents = []
for i in tqdm(range(num_chunks)):
    start = i * chunk_size
    end = min((i + 1) * chunk_size, len(twitter_data))
    chunk = twitter_data.iloc[start:end]
    stemmed_chunk = process_data(chunk)
    stemmed_contents.extend(stemmed_chunk)

# Add the stemmed content back to the DataFrame
twitter_data['stemmed_content'] = stemmed_contents
@eedemallika5055 ай бұрын
Great Explanation
@gaganm663911 ай бұрын
How can we increase the accuracy of the model?
@shaikhahsan1007 ай бұрын
Try different algorithms; maybe some other algorithm works better and gives better accuracy. Also, if your dataset is not as vast as the one in the video, try gathering at least 1,000-10,000 samples and train again to check the accuracy.
@albingeorgekurian43967 ай бұрын
@@shaikhahsan100 Do you have any other suggestions?
@Revathi-hb9rs4 ай бұрын
Sir, please provide a complete end-to-end neural style transfer project with a web application.
@subratbhattarai35186 ай бұрын
Is there a way to convert the vectorized numbers to string again?
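You can't recover the original sentence (TF-IDF drops word order, and stemmed words stay stemmed), but scikit-learn vectorizers can show which terms a row contains. A sketch, assuming vectorizer and X_train are the fitted TF-IDF vectorizer and feature matrix from the tutorial (names are assumptions):

# inverse_transform returns, for each row, the terms with a non-zero weight
terms_per_row = vectorizer.inverse_transform(X_train[:1])
print(terms_per_row[0])                      # stemmed terms present in the first tweet

# The column-index-to-term mapping is also available
print(vectorizer.get_feature_names_out()[:10])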
@GooSoft3 ай бұрын
Which library is used as the tool here?
@growingfire4 ай бұрын
Siddharthan Sir
@ritvikraj83846 ай бұрын
Can you send the PPT file that you showed at the beginning of the video?
@tarandeep68615 ай бұрын
Great explanation, thanks a lot.
@Gharwali_Baatein10 ай бұрын
Sir, I have tried the same dataset with the same code, but the accuracy score is much different from yours; it's causing an overfitting condition. How can it be improved?
@saurabhojha283210 ай бұрын
How much did you get, bro?
@saurabhojha283210 ай бұрын
Tell me.
@Gharwali_Baatein10 ай бұрын
@@saurabhojha2832 55%
@sohailaborhamy78556 ай бұрын
I have the same problem. Did you solve it or not?
@ssenthilnathan3 Жыл бұрын
Is this Siddharth? The guy who has a Telegram channel named Machine Learning and also a YouTube channel? If so, great to see you, buddy ❤.
@FarrelHaryomukti2 ай бұрын
thanks.
@mdafroj98469 ай бұрын
Hi, can anyone help me? I got an error at line 20 of the code: twitter_data['stemmed_content'] = Twitter_data['text'].apply(stemming)
@srijantiwari92329 ай бұрын
Did it get solved?
@raunakkakkar12318 ай бұрын
It should be twitter_data, not Twitter_data (no capital T).
@suyashsawant89287 ай бұрын
@@raunakkakkar1231 Hi, can you help me too? I had an error at the same line, twitter_data['stemmed_content'] = twitter_data['text'].apply(stemming), and the error was NameError: name 'twitter_data' is not defined.
@bindhuaddepalli5803 ай бұрын
@@suyashsawant8928 Because there is no DataFrame named twitter_data defined in your session; you need to run the cell that loads the dataset into twitter_data first.
@elbechir3511 ай бұрын
Sir! If we don't have the target column in our Twitter dataset, what should we do?
@albingeorgekurian43967 ай бұрын
Yes, we do. Check whether you have selected the Sentiment140 dataset from Kaggle.
@jananeejananee278610 ай бұрын
Siddharthan is here (as the instructor).
@valac20857 ай бұрын
The way he saved the model is not enough; you have to set up a data pipeline for new data and also save the vectorizer.
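A minimal sketch of persisting both pieces, assuming model and vectorizer are the trained objects from the tutorial and the stemming function is defined (names and file names are assumptions):

import pickle

# Save both the trained classifier and the fitted TF-IDF vectorizer
with open('trained_model.sav', 'wb') as f:
    pickle.dump(model, f)
with open('vectorizer.sav', 'wb') as f:
    pickle.dump(vectorizer, f)

# In a fresh session, load both before predicting on new text
loaded_model = pickle.load(open('trained_model.sav', 'rb'))
loaded_vectorizer = pickle.load(open('vectorizer.sav', 'rb'))

new_text = stemming("some new tweet text")          # same preprocessing as training
features = loaded_vectorizer.transform([new_text])
print(loaded_model.predict(features))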
@karishmarajput18714 ай бұрын
How do I add the kaggle.json path in a Jupyter notebook? Please reply.
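One way that should work locally: the Kaggle API looks for kaggle.json in ~/.kaggle by default. A sketch (the source path of kaggle.json is an assumption):

import os, shutil

# Copy the downloaded kaggle.json into the folder the Kaggle API checks by default
kaggle_dir = os.path.expanduser('~/.kaggle')
os.makedirs(kaggle_dir, exist_ok=True)
shutil.copy('kaggle.json', kaggle_dir)   # assumes kaggle.json sits next to the notebook
os.chmod(os.path.join(kaggle_dir, 'kaggle.json'), 0o600)

# Alternative: point the API at any folder that already contains kaggle.json
# os.environ['KAGGLE_CONFIG_DIR'] = '/path/to/folder'

After that, the usual kaggle datasets download command run from the notebook should find the credentials.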
@AbhimanyuKaushik-i5p3 ай бұрын
What are upsampling and downsampling?
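Downsampling drops rows from the over-represented class and upsampling duplicates (or synthesizes) rows of the under-represented class, so the classes end up balanced. A rough pandas sketch, assuming the twitter_data DataFrame with a 'target' column as in the tutorial:

import pandas as pd

# Downsampling: shrink the majority class to the size of the minority class
counts = twitter_data['target'].value_counts()
minority = twitter_data[twitter_data['target'] == counts.idxmin()]
majority = twitter_data[twitter_data['target'] == counts.idxmax()].sample(
    n=len(minority), random_state=42)

balanced = pd.concat([minority, majority]).sample(frac=1, random_state=42)  # shuffle
print(balanced['target'].value_counts())

# Upsampling would instead duplicate minority rows, e.g.
# minority.sample(n=len(majority), replace=True, random_state=42)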
@rishikeshnanaware9 ай бұрын
X has 179715 features, but LogisticRegression is expecting 460873 features as input. I am getting this error while executing the predict function on the test data.
@archana24677 ай бұрын
How did you solve this error?
@rishikeshnanaware7 ай бұрын
@@archana2467 For training data use the fit_transform function, and for testing data use the transform() function.
@vildanbozkurt27626 ай бұрын
@@archana2467 I had the same mistake. random_state should be equal to 2, or you should check that you have written the X_test conversion correctly in the "converting the textual data to numerical data" section. I mean: X_test = vectorizer.transform(X_test)
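For anyone still hitting the feature-count mismatch: the root cause is fitting a second vocabulary on the test set. A sketch of the intended pattern, assuming X_train and X_test hold the split tweet text as in the tutorial:

from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()

# Learn the vocabulary from the training text only...
X_train_features = vectorizer.fit_transform(X_train)

# ...and reuse that same vocabulary for the test text.
X_test_features = vectorizer.transform(X_test)   # transform, NOT fit_transform

# Calling fit_transform on X_test builds a second, differently sized vocabulary,
# which is exactly what triggers "X has N features, but LogisticRegression
# is expecting M features as input".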
@AbhimanyuKaushik-i5p3 ай бұрын
What should we do when null values are found in the dataset?
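A quick pandas sketch for checking and handling missing values, assuming the twitter_data DataFrame and column names from the tutorial:

# Count missing values per column
print(twitter_data.isnull().sum())

# Option 1: drop rows missing the tweet text or the label
twitter_data = twitter_data.dropna(subset=['text', 'target'])

# Option 2: keep the rows and replace missing text with an empty string
# twitter_data['text'] = twitter_data['text'].fillna('')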
@manjukumari33135 ай бұрын
Please share the Colab notebook link.
@sizzrizz60748 ай бұрын
Can anyone help? My program says X_test is not defined when running the pickled model in another window. What should I do?
@UchihaItachiKonohagakure5 ай бұрын
Check the line of code where you split the dataset into training and testing sets, where random_state = 2 was mentioned. Also check whether you actually executed that snippet; that could also be the cause of the error.
@divyaagarwal356311 ай бұрын
While fetching the dataset through the Kaggle API I am getting a KeyError: content length. Please help.
@garima21587 ай бұрын
The stemming process is taking too long to execute, because of which I'm not able to get to the modelling step.
@albingeorgekurian43967 ай бұрын
We need to optimize the stemming function. For example, in the list comprehension we call stopwords.words('english') every time; instead, create a variable outside the function and use that variable. I guess that will make the code much faster than the previous version.
@jkscout17 күн бұрын
I didn't know 1.6 million was actually 16 million.
@shivanshmaurya85202 ай бұрын
The Twitter API is not free; what should we do now?
@sohailaborhamy78556 ай бұрын
The output of value_counts() is not the same as your output. Why is that?
@rahulmahatha908611 ай бұрын
Can you provide the source code, please?
@navneet-dubey75Ай бұрын
Here from Seasons of AI 2024👋🏼
@SWAMINATHANS-id4so5 ай бұрын
Sir, how do I get the Colab link?
@yhd080810 ай бұрын
1:08:00
@charchitsharma3535Ай бұрын
Can anybody tell me where the Colab file link is?
@SWAMINATHANS-id4so5 ай бұрын
Sir, how do I get the Colab notebook for this project? You said the link is in the description; I searched for it but did not find it.
@deepikaravi92924 ай бұрын
Go and search for Google Colab on Google and then click on New Notebook; that way you will get it.
@mganesh84739 ай бұрын
Sir, why are we not performing lemmatization?
@javeedma27648 ай бұрын
It depends on the requirement; lemmatization is somewhat slow compared to stemming.
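For comparison, a minimal lemmatization sketch with NLTK's WordNetLemmatizer, mirroring the tutorial's cleaning steps (the function name is an assumption, not from the video):

import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download('wordnet')
nltk.download('stopwords')

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))

def lemmatize_text(content):
    # Same cleanup as the stemming version, but lemmatize instead of stem.
    # Note: without a POS tag, WordNetLemmatizer treats every word as a noun.
    words = re.sub('[^a-zA-Z]', ' ', content).lower().split()
    return ' '.join(lemmatizer.lemmatize(w) for w in words if w not in stop_words)

print(lemmatize_text("These movies were not as good as the earlier ones"))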
@yhd080810 ай бұрын
51:27
@letseducate1233 ай бұрын
Where is the code??
@SWAMINATHANS-id4so5 ай бұрын
How do I get the Colab link?
@T1s_kashyap3 ай бұрын
ho siddharthan sir 😁😁😁😁😁😁😁😁
@154yashraj34 ай бұрын
The multinomial Naive Bayes model has a higher accuracy.
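If anyone wants to try that comparison, a sketch assuming the TF-IDF feature matrices and labels from the tutorial (X_train, X_test, Y_train, Y_test are assumed names):

from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

nb_model = MultinomialNB()
nb_model.fit(X_train, Y_train)   # X_train here = the vectorized TF-IDF features

print('NB train accuracy:', accuracy_score(Y_train, nb_model.predict(X_train)))
print('NB test accuracy:', accuracy_score(Y_test, nb_model.predict(X_test)))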
@monemtajbid3363Ай бұрын
Can you please give me the full source code of this project?
@AbhishekKumar-wc4nx7 ай бұрын
Can anyone tell me where we can find all the code for this project?
@GeeksforGeeksVideos7 ай бұрын
You can find everything in this article : www.geeksforgeeks.org/twitter-sentiment-analysis-using-python/
@ak47gamer8109 ай бұрын
28:14
@boleto74672 ай бұрын
Can you share the code?
@yuktasharma7921Ай бұрын
Can you please provide the source code?
@BTS__Army182 ай бұрын
Where is the code
@deepikat45208 ай бұрын
My training data accuracy is 99% and my test data accuracy is 50%; my model is overfitted. How do I avoid that?
@BrainyiFY_o08 ай бұрын
L1 and L2 regularization are techniques used to prevent overfitting by adding a penalty term to the loss function during training.
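In scikit-learn's LogisticRegression that penalty is controlled by C (smaller C = stronger regularization). A sketch of tightening it, assuming the tutorial's TF-IDF feature matrices and labels (assumed names):

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Smaller C than the default 1.0 = stronger L2 penalty,
# which usually narrows the gap between training and test accuracy
model = LogisticRegression(max_iter=1000, C=0.1, penalty='l2')
model.fit(X_train, Y_train)

print('Train accuracy:', accuracy_score(Y_train, model.predict(X_train)))
print('Test accuracy:', accuracy_score(Y_test, model.predict(X_test)))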
@Vkg_Universe8 ай бұрын
I also face the same problem
@deepikaravi92924 ай бұрын
50:22
@1anu_ra-dha.8 ай бұрын
How can I deploy this model on the web? Can anybody please help? 🙏🙏
@1anu_ra-dha.8 ай бұрын
@cll1670 thanks for help
@albingeorgekurian43967 ай бұрын
What about creating an interface? This is just a suggestion; if you have any other ideas, please share them.
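A bare-bones Flask sketch of such an interface, assuming the model and vectorizer were pickled as trained_model.sav and vectorizer.sav and that the preprocessing matches the tutorial (all file names and routes here are assumptions):

import pickle
import re
from flask import Flask, request, jsonify
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer

app = Flask(__name__)
model = pickle.load(open('trained_model.sav', 'rb'))
vectorizer = pickle.load(open('vectorizer.sav', 'rb'))

port_stem = PorterStemmer()
stop_words = set(stopwords.words('english'))

def stemming(content):
    # Same preprocessing that was applied to the training tweets
    words = re.sub('[^a-zA-Z]', ' ', content).lower().split()
    return ' '.join(port_stem.stem(w) for w in words if w not in stop_words)

@app.route('/predict', methods=['POST'])
def predict():
    tweet = request.json.get('tweet', '')
    features = vectorizer.transform([stemming(tweet)])
    label = int(model.predict(features)[0])
    return jsonify({'sentiment': 'positive' if label == 1 else 'negative'})

if __name__ == '__main__':
    app.run(debug=True)

You could then POST a JSON body like {"tweet": "loving this"} to /predict and get back a sentiment label.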
@chahatverma40857 ай бұрын
The code at 22:25 shows an error. Can anyone help?
@albingeorgekurian43967 ай бұрын
Can you elaborate more?
@Alex-oz5kn Жыл бұрын
It's ML not YemYel
@jubayer16408 ай бұрын
😄
@OfficeUse-oo4kx8 ай бұрын
What's the point you're making? 😅
@officialteflon59717 ай бұрын
😂😂😂
@Zainab-en9fd8 ай бұрын
Can anyone tell me, is the Twitter API free?
@thedailymotivational7 ай бұрын
No
@ChandraShekarYadav-qp8pz3 ай бұрын
Please develop a brain tumor detection project using CNN.
@albingeorgekurian43967 ай бұрын
Hello friends, I have completed more than half of the tutorial, still more to go. In the tutorial I saw that the stemming is taking too much time (almost 50 minutes), so I optimized it a little bit. Code:

pattern = re.compile('[^a-zA-Z]')
english_stopwords = stopwords.words('english')
port_stemmer = PorterStemmer()

def stemming(content):
    # Regex, stopword list and stemmer are created once, outside the function
    stemmed_content = re.sub(pattern, ' ', content)
    stemmed_content = stemmed_content.lower()
    stemmed_content = stemmed_content.split()
    stemmed_content = [port_stemmer.stem(word) for word in stemmed_content if not word in english_stopwords]
    stemmed_content = ' '.join(stemmed_content)
    return stemmed_content

twitter_data['stemmed_content'] = twitter_data['text'].apply(stemming)

Mine completed in just 6 minutes.
@infinity89826 ай бұрын
I can't thank you enough 🛐finally I can continue the video😭
@amogu_076 ай бұрын
@@infinity8982 Did you complete it, bro? Is it working?
@vanshikajain30143 ай бұрын
Can you provide a link to the source code or share it with me?
@yuktasharma7921Ай бұрын
Hey, can you please provide the whole source code if you have completed it?
@ashwinipatil57335 ай бұрын
Hello, I am making a web app on top of this using Flask, but I am facing this error (tweepy.errors.Unauthorized: 401 Unauthorized 89 - Invalid or expired token.) because I do not have a Basic-tier account. Can you help me solve this error? Please reply.