Man, your videos are so useful and easy to follow, you should have way more views and subscribers! Keep up the great work, man, you're the best!
@MuhammedShameel-qd3oi 15 days ago
Man, this means a lot. I have a project coming up but no knowledge in this field, and since I'm interested in football I chose a topic related to it, but I had no clue how I would get the data.
@oscarmelendezcodina6225 10 days ago
I'm doing the same. I want to build a site like SofaScore, but I don't really know how to do it, so if you have knowledge in this, please contact me.
@YTykh 4 months ago
Thanks a lot for this man! I've been trying to do a little personal project on data analytics and this was really useful. Subbed! :)
@robertc2121 8 months ago
Outstanding, your content is really, really good and incredibly useful. I didn't know you could pass attrs within a read_html call... liked and subscribed :)
@brunosaintclair2911 8 months ago
You have the best content on YouTube, for sure.
@Ndhming 5 days ago
When I set the id for the squad & player stats category, the player shooting table id says it can't find the table. How can I solve this problem?
@DreamTim 10 months ago
Hey, great video! read_html is a great solution, but I think it might run into a problem with edge cases like duplicate names (when different people have the same name) and could treat them as one. I've added the href element to the table as a unique identifier, but I'm not sure if that can be done through read_html.
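For what it's worth, newer pandas can pull the hrefs through read_html itself. A minimal sketch, assuming pandas 1.5+ (where extract_links was added); the URL and the "Player" column name here are just placeholders:

import pandas as pd

# extract_links="body" (pandas 1.5+) turns every body cell into a
# (text, href) tuple, so each player's URL comes along for free
url = "https://fbref.com/en/comps/9/stats/Premier-League-Stats"  # example page
df = pd.read_html(url, extract_links="body")[0]

# split one column's tuples into text + link; the link can then act as a
# unique id even when two players share a name ("Player" stands in for
# whatever the scraped header actually is)
df[["Player", "Player_url"]] = pd.DataFrame(df["Player"].tolist(), index=df.index)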
@Indiancitizen577 10 months ago
Hi, really love the tutorial. I have one question: when trying to read the HTML into a dataframe, I'm hitting an HTTP Error 403: Forbidden, which I have seen many times when trying to scrape data from FBref. Would really appreciate any workarounds for this, thanks.
@McKayJohns 10 months ago
Can you try something like this?

import pandas as pd

headers = {"User-Agent": "pandas"}
df = pd.read_html("https://fbref.com/", storage_options=headers)
@McKayJohns 10 months ago
Or pass in an actual User-Agent
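For example, a minimal sketch assuming a pandas version where read_html accepts storage_options; the User-Agent string is just a sample browser value:

import pandas as pd

# sample desktop-browser User-Agent; the stock urllib agent is what
# usually triggers fbref's 403
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
}
df = pd.read_html("https://fbref.com/en/comps/9/Premier-League-Stats",
                  storage_options=headers)[0]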
@ArmanTheAnalyst 7 months ago
@@McKayJohns I have tried many times, even using a User-Agent, but it's not working. Can you make a detailed video on it?
@babababa201 9 months ago
Hey, I'm curious about your course - how are there reviews if it's only available for pre-order?
@TheMizgan 2 months ago
Thanks mate. But what do we do if we want to scrape the table after pressing the per-90 toggle on the FBRef website? Generally, the non-toggled (not per-90) version is what gets scraped.
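For reference, FBref's per-90 view just divides each counting stat by the "90s" column (minutes played / 90), so the same numbers can be rebuilt after scraping. A rough sketch, assuming the scraped table keeps that column; the stat names in the usage line are placeholders:

import pandas as pd

def add_per90(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    # the per-90 toggle divides each counting stat by the "90s" column
    # (minutes played / 90), so the same math works on the scraped table
    out = df.copy()
    for col in cols:
        out[col + "_per90"] = out[col] / out["90s"]
    return out

# usage, with placeholder stat columns:
# df = add_per90(df, ["Gls", "Ast", "Sh"])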
@Eduardo17Oliveira 9 months ago
Excellent content!!
@sandipanibasu9240 1 month ago
Hey McKay, I was reading through the Use of Data page on fbref. I'm in two minds about whether we can still scrape the data, use it for visualizations, and share them on social media, or whether we have to request the data officially?
@Iamoscar08 5 months ago
your videos are so easy to follow
@McKayJohns 5 months ago
Thank you!
@Parrahimovic 8 months ago
Hey man, I've been following you on Twitter for a couple of years now and I'm a big fan; you inspired me to start my first data analytics course a couple of months ago. My knowledge is quite limited right now, so what prior knowledge do I need in order to successfully complete the course?
@McKayJohns 8 months ago
You should be fine with no prior experience! There's a discord community where you can get help from me and others as well if you get stuck :)
@blyatcyker99 8 months ago
@@McKayJohns Hey man, can I have the invite for the Discord?
@FootballObserver-n2x 7 months ago
Great video. I noticed it's only the first 6 rows that get pulled; how do you get the entire table to be pulled?
@McKayJohns 6 months ago
Hm which page are you looking at?
@veeckhout9410 8 months ago
Hi man, amazing video! I tried it myself and it worked for the table you used and for other team-related tables. But when I tried, for example, the passing table of Bundesliga 2 (all the players and their stats), it didn't work because it says it can't find any tables, while I'm quite sure I'm using the right ID. I tried it for different leagues etc. but it didn't work. Do you know what could have gone wrong and how to solve it?
@McKayJohns 8 months ago
With some tables fbref is kinda weird.. Maybe try this code to see if it works:

import requests
import bs4
import pandas as pd

response = requests.get('https://fbref.com/en/comps/32/stats/Primeira-Liga-Stats')
soup = bs4.BeautifulSoup(response.content, 'html.parser')

# fbref ships some of these tables inside HTML comments, so read_html
# can't see them on the live page; parse the comments instead
comments = soup.find_all(text=lambda text: isinstance(text, bs4.Comment))
commented_out_tables = [bs4.BeautifulSoup(cmt, 'html.parser').find_all('table') for cmt in comments]
commented_out_tables = [tab[0] for tab in commented_out_tables if len(tab) == 1]

df = pd.read_html(str(commented_out_tables[0]))[0]
@veeckhout9410 8 months ago
@@McKayJohns Thanks man! I appreciate the help and your videos!
@pramodhsairam9670 5 months ago
@@McKayJohns I was facing the same issue and couldn't match the table id either. You saved me, man. Thanks a ton for your work. Keep rocking 🔥
@stargalextr5400 9 months ago
Hey, thanks for the tutorial, but I'm facing an urllib.error.HTTPError: HTTP Error 403: Forbidden error. How did you navigate that?
@YTykh 4 months ago
I had the same issue. Please let me know if you've found a solution. ;_;
@Tonitaco 8 months ago
How could you export that data into an Excel or CSV file from there? And what is your opinion on VS Code?
@McKayJohns 8 months ago
If you turn it into a pandas dataframe you can export it that way, to either CSV or Excel. And I prefer PyCharm over VS Code for Python work, but Jupyter Notebooks is where I usually tell everyone to get started.
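A minimal sketch of the export step (the filenames and stand-in table are just examples, and to_excel needs openpyxl installed):

import pandas as pd

# stand-in for the table you scraped with read_html
df = pd.DataFrame({"Squad": ["Arsenal"], "Pts": [89]})

df.to_csv("league_table.csv", index=False)     # plain CSV
df.to_excel("league_table.xlsx", index=False)  # Excel; needs openpyxl installed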
@kevinpaul5541 9 months ago
Hey, would it be okay if I use this data for my dissertation? I'm trying to use machine learning with this data.
@McKayJohns 9 months ago
Yeah that should be fine
@shourya5420 4 months ago
Hey, fbref have now implemented Cloudflare, which prevents scraping. Is there any workaround?
@Virgilplaydirty 2 months ago
The Bright Data site, but it's paid.
@RahulPathakoti-ig2py 7 months ago
Hi McKay Johns, really appreciate the easy way to scrape tables from websites. I need help scraping data from 2007 to 2024 (Premier League) for all teams. Could you please help me with this? Thanks, Rahul
@McKayJohns 7 months ago
The easiest way to do it is to get a list of all of the URLs you want to scrape, then use a for loop over each one and scrape the table you need off each page.
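A rough sketch of that loop, assuming the usual FBref season URL pattern (verify the exact URLs on the site) and a setup that already works for single pages:

import time
import pandas as pd

# hypothetical season URL pattern; check it against the actual pages
seasons = [
    f"https://fbref.com/en/comps/9/{y}-{y + 1}/{y}-{y + 1}-Premier-League-Stats"
    for y in range(2007, 2024)
]

frames = []
for url in seasons:
    frames.append(pd.read_html(url)[0])  # grab the table you need from each page
    time.sleep(5)  # be polite: fbref answers aggressive scrapers with HTTP 429

all_seasons = pd.concat(frames, ignore_index=True)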
@RahulPathakoti-ig2py 7 months ago
@@McKayJohns Thanks a lot, I'll try to scrape those URLs!!
@wiktoria8437 7 months ago
@@RahulPathakoti-ig2py Let me know if it worked :D
@JuanGonzalez-hs4lf 9 months ago
Does anyone know a video that does this in R??
@reasetilo 6 months ago
Just learn Python, bro 😭 I was on here asking the same question two years ago, gave up, and learned Python.
@satyajeetpatil8177 10 months ago
Great stuff
@tomkmb4120 4 months ago
Did my comment get deleted from this video!? wtf
@samali323 9 months ago
74 for a course is kinda wild, but great video.
@Chris_0303 9 months ago
awesome!!
@SymonsKerwin-d1m 4 months ago
Auer Corners
@andrewchen2349 8 months ago
Thank you for your video! May I know if you encountered this problem with read_html?

```
urllib.error.HTTPError: HTTP Error 429: Too Many Requests
```

Thank you
@andrewchen2349 8 months ago
Oh NVM I figured it out: add 'User-Agent' in headers. Thank you!
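That fix looks roughly like this. A sketch, assuming fbref isn't also blocking at the Cloudflare level; the URL is just an example page:

import requests
import pandas as pd
from io import StringIO

headers = {"User-Agent": "Mozilla/5.0"}  # any browser-like agent string
response = requests.get("https://fbref.com/en/comps/9/Premier-League-Stats",
                        headers=headers)
response.raise_for_status()  # surfaces a 403/429 early if the block persists
df = pd.read_html(StringIO(response.text))[0]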
@McKayJohns 8 months ago
Nice haha sorry for the late reply
@jose99martin 6 months ago
@@andrewchen2349 How did you add the User-Agent? I have tried it and I still get the same error:

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
response = requests.get(url, headers=headers)
df = pd.read_html(response.text, attrs={"id": ID_TABLE_STANDARD})[0]
@jerzyjaneczek9328 6 months ago
@@andrewchen2349 Hi there, what did you change your 'User-Agent' value to in headers?