Extracting Reddit Data With R and the package RedditExtractoR (2023 Update)

4,497 views

James Cook

1 day ago

Comments: 37
@_nikolas 8 months ago
James Cook Sir you cooked with this video
@seanlandsman7516 1 year ago
As a post-secondary educator myself, I have a knack for identifying strong teachers and you have a gift for explanation! Your students are lucky to learn from you. Thanks for all these resources.
@JamesCookUMA 1 year ago
That's so kind. I appreciate your encouragement!
@owlofjune 11 months ago
This was super helpful for someone who has never coded anything before. Thank you lots!
@jadesierra7691 1 year ago
This is super great work that you're doing. I really appreciate how you explain every part of the code in such a simple way!
@robertc2121 2 months ago
Really helpful demo, thank you!!
@JamesCookUMA 2 months ago
Glad to hear it!
@caspercascade4623 1 year ago
oh my gosh this video is gonna save my entire MSc project THANK YOU
@KevinVan-pu7cr 9 months ago
Thank you for sharing this interesting tutorial!
@JamesCookUMA 9 months ago
Thank you! Cheers!
@marcelmertens8341 1 year ago
Awesome, Thank you, it helped me a lot during my MSc Thesis
@mahtabtalafian7369 10 months ago
Hi James! Thank you, it was super helpful. Just one quick question. You limited the Period to Day, but R extracted data with different dates. How did it happen?
@JamesCookUMA 10 months ago
Hi! Two points: 1) yes, Reddit sometimes returns interesting results like that; it's important to examine and sort through data to ensure that you have the information collected within the bounds that you'd intended to collect. 2) data with different dates can appear because the posts are what's selected, and then comments are collected that are associated with the posts.
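One way to do the checking described in this reply is to tabulate the dates that came back and keep only the rows inside the intended window. The following is a minimal sketch on a made-up data frame; the column name `date_utc` is an assumption about what the thread table looks like, and the titles and dates are invented for illustration.

```r
# Hypothetical stand-in for a table of collected threads.
# The column name date_utc is an assumption, and the rows are invented.
threads <- data.frame(
  date_utc = c("2023-05-01", "2023-05-02", "2023-04-20"),
  title    = c("Thread A", "Thread B", "Thread C"),
  stringsAsFactors = FALSE
)

# Convert the text dates into Date objects so they can be compared.
threads$date_utc <- as.Date(threads$date_utc)

# See which dates actually came back, and how many rows each has.
date_counts <- table(threads$date_utc)
print(date_counts)

# Keep only the rows that fall on the day we meant to collect.
target_day <- as.Date("2023-05-02")
in_window  <- threads[threads$date_utc == target_day, ]
print(in_window)
```

The same comparison works for a range of days by replacing the `==` test with `threads$date_utc >= start_day & threads$date_utc <= end_day`, where both bounds are Date objects you define.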
@jutsie 10 months ago
Thanks a lot!
@An-kz1ef 1 year ago
Is it still possible to construct a SNS visualization like in one of your older videos? I don't see the construct_graph option on the new version of the RedditExtractoR package. Thank you for your time!
@JamesCookUMA 1 year ago
Thanks for this great question. You are right. The authors of that package have removed that option. What you would have to do to visualize a network now is to integrate another package, such as igraph.
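As a rough illustration of the igraph route, here is a sketch that turns a comments table into a who-replied-to-whom network. It runs on an invented data frame, not real Reddit output. It assumes that each comment has an `author` and a `comment_id`, and that the id encodes nesting (a reply to comment `1` is `1_1`, a reply to that is `1_1_1`); check your own data before relying on that.

```r
library(igraph)

# Hypothetical comments from a single thread; names and ids are invented.
comments <- data.frame(
  author     = c("ann", "bob", "cat", "dan"),
  comment_id = c("1", "1_1", "1_1_1", "2"),
  stringsAsFactors = FALSE
)

# Assumption: a reply's parent id is its own id minus the last "_n" piece.
# Top-level comments (no underscore) have no parent comment.
comments$parent_id <- ifelse(
  grepl("_", comments$comment_id),
  sub("_[0-9]+$", "", comments$comment_id),
  NA
)

# Look up who wrote each parent comment.
comments$parent_author <- comments$author[
  match(comments$parent_id, comments$comment_id)
]

# Keep only actual replies: one edge from the replier to the person replied to.
edges <- comments[!is.na(comments$parent_author), c("author", "parent_author")]

# Build and draw a directed graph from the edge list.
reply_graph <- graph_from_data_frame(edges, directed = TRUE)
plot(reply_graph)
```

With comments from several threads in one table, the ids would repeat across threads, so the matching step would need to be done separately for each thread URL.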
@Alberto-ng2gs 1 year ago
Hi James! Thanks so much for this video, it's so helpful and well explained!! I just have one question: I'm trying to search threads and subsequent comments by keyword, would it be possible to do such a thing? I guess it would be somewhere with comment=get_thread_content(dataset$url)? Btw I went to UNC, so this was a perfect example :D
@JamesCookUMA 1 year ago
Hi, Alberto! Glad to hear this example hit home substantively. Regarding the option, see here: cran.r-project.org/web/packages/RedditExtractoR/RedditExtractoR.pdf -- and look at the find_thread_urls command, which has an option for... yes, a keyword search! You're in luck.
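A keyword search along those lines might look like the sketch below. The helper name `search_threads` is made up for this example, and the keyword and subreddit in the sample call are placeholders. Running it needs the RedditExtractoR package installed and a live connection to Reddit, and the results depend on what Reddit returns at the time.

```r
# Sketch only: calling this function needs the RedditExtractoR package
# and a live connection to Reddit.
# search_threads is a hypothetical helper name, not part of the package.
search_threads <- function(keyword, subreddit = NA, period = "month") {
  # Find the URLs of threads that match the keyword.
  threads <- RedditExtractoR::find_thread_urls(
    keywords  = keyword,
    subreddit = subreddit,
    period    = period
  )

  # Fetch each matching thread along with its comments.
  content <- RedditExtractoR::get_thread_content(threads$url)

  # Return both tables so they can be inspected separately.
  list(threads = threads, comments = content$comments)
}

# Example call (placeholder keyword and subreddit):
# results <- search_threads("basketball", subreddit = "UNC", period = "week")
# head(results$comments)
```

As far as I can tell, the keyword is matched against threads, not against individual comments, so the comments returned are simply those attached to the matching threads. Widening `period` (for example from "day" to "week") is one way to get more results when a search comes back thin.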
@selmakoso5625 10 months ago
Thanks a lot for this insightful video! It truly is helping me with my master's thesis. I do have a question. How do you extract the data for the whole year? (for example 2023)
@JamesCookUMA 10 months ago
Sadly (but understandably), Reddit limits the amount of information that one can obtain through methods like these. I've seen examples in a low-communication subreddit where it's possible to go back a few months, but in high-communication subreddits your search may obtain only a few weeks' worth of information. Pushshift has some stored data here -- academictorrents.com/details/89d24ff9d5fbc1efcdaf9d7689d72b7548f699fc -- but use at your own risk.
@zacks9897 1 year ago
Hi James, amazing video! Just wondering if this still works with Reddit's new API terms?
@JamesCookUMA 1 year ago
Hi, @zacks9897 -- and thanks for writing. APIs change all the time, so this is no guarantee for the future, but the good news is that yes, this approach still works!
@anitejbharadwaj2865 1 year ago
How do you export the dataset from R?
@JamesCookUMA 1 year ago
Hi, and thanks for writing. The command write.csv is one standard way to do that. Here's a resource on that command: www.statology.org/export-data-frame-to-csv-in-r/
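For anyone who wants to see the write.csv step in action, here is a small self-contained sketch. The data frame is an invented stand-in for collected comments, and the file goes to R's temporary folder so nothing on your computer is overwritten; swap in your own data frame and file path.

```r
# Hypothetical stand-in for a data frame of collected comments.
comments <- data.frame(
  author  = c("ann", "bob"),
  comment = c("First comment", "Second, with a comma"),
  stringsAsFactors = FALSE
)

# Write to a temporary folder; replace with a path of your choosing.
out_file <- file.path(tempdir(), "reddit_comments.csv")

# row.names = FALSE leaves out the extra column of row numbers.
write.csv(comments, file = out_file, row.names = FALSE)

# Read the file back in to confirm that it was saved as expected.
check <- read.csv(out_file, stringsAsFactors = FALSE)
print(check)
```

Text that contains commas is wrapped in quotation marks in the file, so it comes back as a single value when the file is read again.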
@melaniekoskie1914 9 months ago
Can I use this method to extract the contents of one specific post in a subreddit that has been archived (9 months ago)?
@Alokparekh 4 months ago
Sir, when I run the command for URLs, I am not getting a new data set. Could you please help me with that?
@JamesCookUMA 4 months ago
Hello! The fix depends on what you've written for your line of code and what you've received as an error code. You'd need to share those in order for me to help. :)
@antoninmoryson7400 1 year ago
Hello James, I am trying to extract data for my research. The only problem is that mine stops at page 3 instead of continuing like yours does. Any idea why that is?
@JamesCookUMA 1 year ago
Hi, Antonin. There are many possibilities, and without looking at your script I can't say for sure what might be going on with you. Perhaps the top possibility is that what you're looking for is simply less frequent in occurrence. If this is what's happening, consider changing the period="day" option to period="week" and seeing what happens.
@neloyislam4002 1 year ago
How do I extract the whole subreddit?
@JamesCookUMA 1 year ago
The only way I know of to do this would be to go outside the Reddit API, which has limits that prevent extracting an entire subreddit, and to instead write a computer program to load every thread in turn and then “scrape” (read and convert into constituent parts) each thread. You could write a Python program for that.
@element4studios 1 year ago
@JamesCookUMA Especially with Reddit's new policies, this may be a good thing to pursue. There is a lot of interesting data there that would be a shame to lose.
@ey5722 1 year ago
@JamesCookUMA Can you give more information or direction on the Python program? Thank you.
@JamesCookUMA 1 year ago
Hi, @ey5722. This video describes a method for R, a different programming language from Python used largely in research methodology.
@ey5722 1 year ago
@JamesCookUMA So can I use the R code to get the whole subreddit? I have a paper in which I need this kind of data. Thanks