Extracting Reddit Data With R and the package RedditExtractoR (2023 Update)

Views: 4,070

James Cook

Comments: 36
@_nikolas 5 months ago
James Cook Sir you cooked with this video
@robertc2121 6 days ago
Really helpful demo, thank you!!
@owlofjune 8 months ago
This was super helpful for someone who has never coded anything before. Thank you lots!
@seanlandsman7516 1 year ago
As a post-secondary educator myself, I have a knack for identifying strong teachers and you have a gift for explanation! Your students are lucky to learn from you. Thanks for all these resources.
@JamesCookUMA 1 year ago
That's so kind. I appreciate your encouragement!
@jadesierra7691 1 year ago
This is super great work that you're doing. I really appreciate how you explain every part of the code in such a simple way!
@caspercascade4623 10 months ago
oh my gosh this video is gonna save my entire MSc project THANK YOU
@marcelmertens8341 11 months ago
Awesome, thank you, it helped me a lot during my MSc thesis
@KevinVan-pu7cr 7 months ago
Thank you for sharing your interesting tutorial!
@JamesCookUMA 7 months ago
Thank you! Cheers!
@mahtabtalafian7369 8 months ago
Hi James! Thank you, it was super helpful. Just one quick question. You limited the period to "day", but R extracted data with different dates. How did that happen?
@JamesCookUMA 8 months ago
Hi! Two points: 1) yes, Reddit sometimes returns interesting results like that; it's important to examine and sort through data to ensure that you have the information collected within the bounds that you'd intended to collect. 2) data with different dates can appear because the posts are what's selected, and then comments are collected that are associated with the posts.
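The "examine and sort through data" step above can be sketched in base R. This is only an illustration on toy data: the helper name keep_within and the column names "date" and "comment" are assumptions, so check names() on your own data frame before adapting it.

```r
# Toy stand-in for a collected comments table. The column names are
# assumptions for illustration, not guaranteed package output.
comments <- data.frame(
  date = c("2023-03-01", "2023-03-14", "2023-03-15", "2023-04-02"),
  comment = c("too early", "in range", "in range", "too late"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only rows whose date falls inside [start, end].
keep_within <- function(df, start, end, date_col = "date") {
  d <- as.Date(df[[date_col]])
  df[!is.na(d) & d >= as.Date(start) & d <= as.Date(end), , drop = FALSE]
}

in_window <- keep_within(comments, "2023-03-10", "2023-03-31")
print(in_window)
```

Filtering after collection like this keeps the original download intact, so the date window can be adjusted later without querying Reddit again.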
@jutsie 7 months ago
Thanks a lot!
@selmakoso5625 7 months ago
Thanks a lot for this insightful video! It truly is helping me with my master's thesis. I do have a question. How do you extract the data for the whole year? (for example, 2023)
@JamesCookUMA 7 months ago
Sadly (but understandably), Reddit limits the amount of information that one can obtain through methods like these. I've seen examples in a low-communication subreddit where it's possible to go back a few months, but in high-communication subreddits your search may obtain only a few weeks' worth of information. Pushshift has some stored data here -- academictorrents.com/details/89d24ff9d5fbc1efcdaf9d7689d72b7548f699fc -- but use at your own risk.
@Alberto-ng2gs 1 year ago
Hi James! Thanks so much for this video, it's so helpful and well explained!! I just have one question: I'm trying to search threads and subsequent comments by keyword, would it be possible to do such a thing? I guess it would be somewhere with comment=get_thread_content(dataset$url)? Btw I went to UNC, so this was a perfect example :D
@JamesCookUMA 1 year ago
Hi, Alberto! Glad to hear this example hit home substantively. Regarding the option, see here: cran.r-project.org/web/packages/RedditExtractoR/RedditExtractoR.pdf -- and look at the find_thread_urls command, which has an option for... yes, a keyword search! You're in luck.
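A minimal sketch of that keyword search might look like the following. The wrapper name search_threads, the keyword, and the subreddit are placeholders rather than values from the video, and the result column names (url, threads, comments) are assumed from version 3 of the package, so confirm them against the manual linked above.

```r
# Hypothetical wrapper around the package's keyword search.
# RedditExtractoR must be installed; it is only called when the function runs.
search_threads <- function(keyword, subreddit, period = "month") {
  RedditExtractoR::find_thread_urls(
    keywords = keyword,
    subreddit = subreddit,
    sort_by = "top",
    period = period
  )
}

# Run only in an interactive session, because this queries Reddit.
if (interactive()) {
  thread_urls <- search_threads("basketball", "UNC")
  # Collect the matching threads and the comments attached to them.
  content <- RedditExtractoR::get_thread_content(thread_urls$url)
  str(content$threads)
  str(content$comments)
}
```

Note that the keyword is matched when finding threads; the comments are then collected for whichever threads were returned.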
@An-kz1ef 1 year ago
Is it still possible to construct a SNS visualization like in one of your older videos? I don't see the construct_graph option on the new version of the RedditExtractoR package. Thank you for your time!
@JamesCookUMA 1 year ago
Thanks for this great question. You are right. The authors of that package have removed that option. What you would have to do to visualize a network now is to integrate another package, such as igraph.
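One possible way to do that with igraph is sketched below on toy data. It assumes the comments table has an author column and a comment_id column that encodes nesting with underscores (for example "1_1" as a reply to "1"). That structure is an assumption to verify on your own data, and reply_edges is a hypothetical helper, not part of either package.

```r
# Toy comments table. The comment_id nesting scheme is an assumption.
comments <- data.frame(
  author = c("ann", "bob", "cat", "dan"),
  comment_id = c("1", "1_1", "1_1_1", "2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one edge per reply, from the replying author
# to the author of the parent comment.
reply_edges <- function(df) {
  has_parent <- grepl("_", df$comment_id, fixed = TRUE)
  # Drop the last "_n" segment to get the parent comment's id.
  parent_id <- sub("_[^_]*$", "", df$comment_id[has_parent])
  edges <- data.frame(
    from = df$author[has_parent],
    to = df$author[match(parent_id, df$comment_id)],
    stringsAsFactors = FALSE
  )
  # Discard replies whose parent comment was not collected.
  edges[!is.na(edges$to), , drop = FALSE]
}

edges <- reply_edges(comments)
print(edges)

# Build the directed graph only if igraph is installed.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(edges, directed = TRUE)
  print(g)  # call plot(g) in an interactive session to draw it
}
```

Top-level comments reply to the thread itself rather than to another comment, so they produce no edge here; adding the thread author as a node would be a design choice for your own analysis.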
@zacks9897 10 months ago
Hi James, amazing video! Just wondering if this still works with Reddit's new API terms?
@JamesCookUMA 10 months ago
Hi, @zacks9897 -- and thanks for writing. APIs change all the time, so this is no guarantee for the future, but the good news is that yes, this approach still works!
@melaniekoskie1914 6 months ago
Can I use this method to extract the contents of one specific post in a subreddit that has been archived (9 months ago)?
@anitejbharadwaj2865 1 year ago
How do you export the dataset from R?
@JamesCookUMA 1 year ago
Hi, and thanks for writing. The command write.csv is one standard way to do that. Here's a resource on that command: www.statology.org/export-data-frame-to-csv-in-r/
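As a short illustration of write.csv, the sketch below exports a toy data frame and reads it back to confirm the round trip. The data frame contents and the file name are made up for the example; substitute your own data frame and path.

```r
# Toy data frame standing in for a collected dataset.
threads <- data.frame(
  title = c("First post", "Second post"),
  comments = c(12L, 3L),
  stringsAsFactors = FALSE
)

# Write to a temporary folder; replace with a path of your choice.
out_file <- file.path(tempdir(), "reddit_threads.csv")

# row.names = FALSE avoids an extra unnamed index column in the file.
write.csv(threads, out_file, row.names = FALSE)

# Read the file back to check that it was written as expected.
check <- read.csv(out_file, stringsAsFactors = FALSE)
print(check)
```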
@antoninmoryson7400 1 year ago
Hello James, I am trying to extract data for my research. The only thing is that mine stops at page 3 instead of going on like yours does. Any idea why that is?
@JamesCookUMA 1 year ago
Hi, Antonin. There are many possibilities, and without looking at your script I can't say for sure what might be going on with you. Perhaps the top possibility is that what you're looking for is simply less frequent in occurrence. If this is what's happening, consider changing the period="day" option to period="week" and seeing what happens.
@Alokparekh 2 months ago
Sir, when I run the command for URLs, I am not getting a new data set. Could you please help me with that?
@JamesCookUMA 2 months ago
Hello! The fix depends on what you've written for your line of code and what you've received as an error code. You'd need to share those in order for me to help. :)
@neloyislam4002 1 year ago
How do I extract the whole subreddit?
@JamesCookUMA 1 year ago
The only way I know of to do this would be to go outside the Reddit API, which has limits that prevent extracting an entire subreddit, and to instead write a computer program to load every thread in turn and then “scrape” (read and convert into constituent parts) each thread. You could write a Python program for that.
@element4studios 1 year ago
@JamesCookUMA Especially with Reddit's new policies, this may be a good thing to pursue. There is a lot of interesting data there that would be a shame to lose.
@ey5722 11 months ago
@JamesCookUMA Can you give more information or direction on the Python program? Thank you.
@JamesCookUMA 11 months ago
Hi, @ey5722. This video describes a method for R, a different programming language from Python, used largely in research methodology.
@ey5722 11 months ago
@JamesCookUMA So can I use the R code to get the whole subreddit? I have a paper in which I need this kind of data. Thanks