I just learned this trick two days ago. One of my friends showed me this method, and I was wondering why nobody had uploaded a video on it. And here it comes! Please do keep making these videos, they are really helpful.
@JohnWatsonRooney 3 years ago
Glad you enjoyed it!
@Diego-ry6bo 3 years ago
This is so helpful and educational John! Keep it up mate! Love your work.
@maximuscryptosx9424 2 years ago
Wow. This is exactly what I was looking for. Simply brilliant. Thank you!
@JohnWatsonRooney 2 years ago
Thanks!
@dadimoszhanzhad8440 3 years ago
Bro... what! This is next-level scraping. Beyond the complexities of code, yet with all the features. Thank you very much! I love you!
@JohnWatsonRooney 3 years ago
Thanks, very kind!
@wallstreetx5241 2 years ago
😁 SUPER HELPFUL, one of the best coding tutorial videos I've ever watched!! You've gained a sub for life!
@JohnWatsonRooney 2 years ago
Thank you I’m glad you enjoyed it!
@matiascavalcante4698 1 year ago
Saved a lot of trouble using this method, thanks!
@JohnWatsonRooney 1 year ago
Glad it helped!
@marioandresheviacavieres1923 2 years ago
Thanks for the awesome tip, cheers from Seattle!!
@fatimaelmansouri9338 3 years ago
This is probably the best video I've seen on APIs! This topic is so poorly covered on YouTube! Amazing content, thank you for this!!
@JohnBillot 3 years ago
Superb, clearly presented and explained. Thank you so much.
@JohnWatsonRooney 3 years ago
Thank you!
@VictorVaughn1 6 months ago
Awesome video! Do you have a video on what to do with all the information that you just scraped, examples of how to use it?
@caiopjv 2 years ago
So helpful! Much easier for what I was trying than BeautifulSoup.
@hossamgamal8661 3 years ago
Amazing video as always, keep up the good work.
@wernerbrasil 2 years ago
Excellent tutorial! Big fan of your videos
@JohnWatsonRooney 2 years ago
Thanks!
@ThuyTran-bc2mt 3 years ago
What about making a Scrapy Splash tutorial? I hope you will make one.
@JohnWatsonRooney 3 years ago
I have one on my channel already, but will be doing more as I do more Scrapy videos.
@ThuyTran-bc2mt 3 years ago
@JohnWatsonRooney It's so great to hear that. I have learned a lot from your videos.
@playtune9217 1 year ago
Instead of making new API calls, can I get the data from the browser's network tab when the API returns data to the client's browser?
@JohnWatsonRooney 1 year ago
If I understand you correctly, yes you can. If you use Playwright or Selenium you can access the network events and have it return the JSON data each time it loads a page. I use this method for some sites, depending on what I am doing and how they respond.
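A minimal sketch of the network-event approach described in the reply above, assuming Playwright's sync API. The helper names `collect_json` and `capture`, and the "/api/" URL fragment, are made up for illustration and are not taken from the video; each site will need its own filter.

```python
def collect_json(response, sink, url_fragment="/api/"):
    """Append the parsed body of a matching JSON response to sink."""
    # Playwright exposes response headers with lower-cased names.
    content_type = response.headers.get("content-type", "")
    if url_fragment in response.url and "application/json" in content_type:
        sink.append(response.json())


def capture(page_url, url_fragment="/api/"):
    """Load page_url in a headless browser and return the JSON it fetched."""
    # Imported here so collect_json can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    captured = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Fires once for every response the page receives.
        page.on("response", lambda r: collect_json(r, captured, url_fragment))
        page.goto(page_url, wait_until="networkidle")
        browser.close()
    return captured
```

Calling `capture("https://example.com/live")` (a placeholder URL) would return a list of decoded JSON bodies, one per matching request the page made while loading.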
@vignesh_waran 2 years ago
Thank you so much for this video.
@AidarIsayev 2 years ago
Thank you! This is really a game changer. :)
@nikolamilicevic8665 2 years ago
This is extremely useful, thanks for the tutorial!
@lordmo3416 3 years ago
Your structuring is amazing. Since the website calls data from the API every 10 seconds or so, why did I get banned when I automated an interval to request updated data from the API? Is there a workaround not to get banned? Like, what other criteria does the website use to recognize a bot?
@lordmo3416 3 years ago
@Loja Outweb how did you fix yours?
@leonardoplaza7677 2 years ago
@Loja Outweb He mentioned the website probably uses Cloudflare to protect against DDoS attacks. That's why they will block your IP if you make constant requests. Try rotating IPs like he mentioned, or just lower the request rate by polling once a minute.
@seankw2880 2 years ago
@Parth Kulkarni He has another video on that: kzbin.info/www/bejne/rHvalIponKh2e5Y
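The "lower the request rate" advice in this thread can be sketched as a polling loop that waits between calls and slows down further after a failure. This is only an illustration: the function name `poll`, its parameters, and the doubling back-off are assumptions, not something shown in the video, and no delay guarantees a site will not block you.

```python
import time


def poll(fetch, interval=60.0, max_polls=5, max_backoff=600.0, sleep=time.sleep):
    """Call fetch() up to max_polls times, pausing between calls.

    fetch is any zero-argument callable that returns data or raises
    on failure (for example a blocked or rate-limited request).
    """
    results = []
    delay = interval
    for attempt in range(max_polls):
        try:
            results.append(fetch())
            delay = interval                      # success: normal pace
        except Exception:
            delay = min(delay * 2, max_backoff)   # failure: back off
        if attempt < max_polls - 1:
            sleep(delay)
    return results
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting; in normal use the default `time.sleep` applies.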
@abdul2651 1 year ago
OMG, it's so useful!!!!! Subscribed. Thanks!
@JohnWatsonRooney 1 year ago
Thanks for the sub!
@grahamlindsay9798 1 year ago
That is really useful, thank you for that.
@JohnWatsonRooney 1 year ago
Glad it was helpful!
@kamaleshpramanik7645 2 years ago
Thank you very much, sir. Learning so much.
@A.R.-rs 8 months ago
Can I apply this method to the Flashscore website? I guess that site doesn't have an API URL.
@football-scalper 3 months ago
I can't determine the game minute - is there a solution?
@justinjchambers 1 year ago
Thanks so much for this tutorial. I was wondering if there is a workaround when a site isn't returning any such XHR data, regardless of what links and buttons you click to try to trigger a response?
@i701Dev 3 years ago
Thanks for this video!
@ppena120 2 years ago
Super helpful. Thanks
@abhijeetbonde8635 3 years ago
Can you please try to make a video on how to scrape websites that use Cloudflare protection?
@goncalosilva4974 9 months ago
How could I get the current minutes?
@hobo_1616 2 years ago
Thank you so much!
@munyaradzijeche7365 1 year ago
How do I prefix team names with their log position for upcoming soccer fixtures? How do I add a points-per-game (PPG) column? Please assist.
@black_platypus 2 years ago
Why have I wasted so much time manually reading out HTML results? 🤯 I guess I feared the XHR requests might be too inscrutable or there might be too many hurdles, like cookie management, request tokens/nonces etc. How often do you run into trouble with those?
@JohnWatsonRooney 2 years ago
It’s often down to the individual site, but it’s usually just a new cookie needed. Sometimes parsing the HTML is the best way though! Explore the site first and then decide your approach.
@black_platypus 2 years ago
@JohnWatsonRooney I will! Thank you for being so helpful in the name of empowering the users again! ❤
@bisratgetachew8373 3 years ago
Great Video
@mth6311 1 year ago
So I'm trying to create a live events feed as a personal project for Premier League games: goals, cards, assists, etc. Would it be possible to use this method without getting banned? What if I made 6 different scripts to scrape 6 different score websites, so that I'd only be sending 1 request per minute to each site? Could this work?
@matheosmattsson2811 2 years ago
Could you do a video on something similar but where the API wants a key? I copied the request into Insomnia like you did, but I cannot replicate it there. The response says "no API key provided". I am unable to figure out how the client code in the browser embeds the API key without it showing up in the request on the network tab... The site I am trying to scrape seems to use Vue, if that makes any difference. I tried to inspect the "initiator" JavaScript file, but obviously it is minified and unreadable.
@JohnWatsonRooney 2 years ago
I usually find adding the full headers works, we are then telling the backend we are a browser and we need the information - I'd have to check the site example you mentioned though to see. You can email it to me if you like, email on my YT page.
@matheosmattsson2811 2 years ago
@JohnWatsonRooney Yeah, I thought I had left something out when I tried it a couple of weeks ago. I then saw your video and figured I would give it another shot by copying everything "automatically" (copy -> cURL cmd), but it did not help (earlier I built the request myself "from scratch"). I will email you the site and details. Thanks!
@XiagraBalls 2 years ago
@matheosmattsson2811 This method will only work for public APIs, where private API keys aren't required. With a protected API, the client typically encrypts or signs its key details together with a timestamp, sends the result over, and the server decrypts or verifies it to recover the key. This means that all an anonymous user sees in the headers in the Network tab is the resulting hash, and you can't just reuse an existing hash because it also includes a timestamp.
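For reference, the "add the full headers" suggestion earlier in this thread might look roughly like the sketch below. It uses only the standard library; the header values, the example.com URLs and the function names are placeholders, and the real values would be copied from the browser's Network tab. As the reply above points out, this will not help when the API expects a signed or time-limited key.

```python
import json
import urllib.request

# Placeholder values: copy the real ones from the request in the Network tab.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "application/json",
    "Referer": "https://example.com/",
}


def build_request(url, cookie=None):
    """Build a GET request that carries browser-like headers."""
    headers = dict(BROWSER_HEADERS)
    if cookie:
        headers["Cookie"] = cookie
    return urllib.request.Request(url, headers=headers)


def fetch_json(url, cookie=None, timeout=10):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(build_request(url, cookie), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```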
@void-qy4ov 3 years ago
For a protected API, do you think it is possible to make the first call with Selenium, grab a token, and from that point use it in calls to the API using requests?
@JohnWatsonRooney 3 years ago
Yes, I think so. You can take the cookie from your Selenium request and reuse it in other parts of the code.
@void-qy4ov 3 years ago
@JohnWatsonRooney It seems that it is easier with selenium-wire, since you actually get access to all requests/responses including the headers.
@kissoffire76 2 years ago
@JohnWatsonRooney Did you mean mimicking logging in with Selenium in the first place, so as to make this secret part of the header (call it a token, a cookie, or whatever the site owner calls it) accessible? I am working out a strategy for scraping API-protected reviews stored as JSON, sliced by company name, for my master's thesis. However, without the Bearer Authorization code (which you can ONLY see by analyzing a JSON GET request in Postman while logged in), it returns only JSON page 0 (regardless of how many pages there might be per company) with only 2 reviews (out of 10 per JSON page when logged in). So if I put all the code from Postman, including the Bearer code, into my web scraping script's headers and skip the Selenium login, I am afraid I would miss some part of the server communication protocol and be blocked or banned (robots.txt doesn't state that anything is forbidden, though). What do you advise? By the way, you make awesome tutorials, dude! I am literally living in them these days!
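A rough sketch of the cookie reuse discussed in this thread: let Selenium load the page once, read its cookies, and turn them into a `Cookie` header for later plain HTTP calls. The function names are made up for illustration, and the login step is left as a comment because it differs per site; whether the cookie alone is enough depends on the site.

```python
def cookie_header(selenium_cookies):
    """Join Selenium's cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def grab_cookies(start_url):
    """Open start_url in a real browser and return the cookies it set."""
    # Imported here so cookie_header works without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(start_url)
        # Site-specific step: log in, or wait until the page has set its cookies.
        return driver.get_cookies()  # list of dicts with "name", "value", ...
    finally:
        driver.quit()
```

The resulting string can then be sent with any HTTP client, for example `requests.get(api_url, headers={"Cookie": cookie_header(cookies)})`.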
@sharankrishna9815 3 years ago
Hey!! Thanks for this! It's very informative! :)) I have a question about scraping, could you help me with it?? Question: I have a list of 100 products (X0, X1, X2, ..., X99) along with their prices (P0, P1, P2, ..., P99). Is it possible to scrape the Google Shopping price data for all 100 products? And if the price of an individual product on Google Shopping, say product X0's, is greater than the given price (P0), update that as the new price in a new column? Your input would be much appreciated! Thank you!! :)
@nnld218 2 years ago
Hi sir, is there any way to scrape a live football video stream?
@plavali_znaem 2 years ago
I was trying to scrape Internet speeds from Speedtest with this method, but got only 2 tick boxes in the "Name" section under the "Fetch/XHR" tab in the inspector. The "Response" contains only a few characters: "1d" for the first tick box and "1gfi" for the other. Is anyone knowledgeable enough to help me find a way around this? Or does the Speedtest webpage not use an API and tables in the first place? (There are speed tests I would like to scrape, and the speeds themselves are plotted on graph curves, so I was thinking the graphs are auto-generated from some table.)
@kuhicop 8 months ago
Any ideas for bet365? :(
@RicardoMilbrath 2 years ago
Is it possible to get statistics in real time? Sorry for my bad English (Brazilian boy) :)
@coalitea 8 months ago
This is exactly what I was looking for to scrape off live data on bitcoin etc. But sir, is this illegal?
@mgmyo7066 2 years ago
Is that possible with Node.js, sir?
@JohnWatsonRooney 2 years ago
Yes of course, I don’t know Node or JavaScript that well though I’m afraid!
@CrazyFanaticMan 2 years ago
Cloudflare didn't even give me a chance, blocked my IP instantly 😂😂
@JohnWatsonRooney 2 years ago
Ah yeah, that’s a real possibility. I use my VPN for testing usually, but even then a lot of those IPs are blocked already, so it’s much harder.
@TheDzideek1 2 years ago
@John Watson Rooney I got banned by SofaScore "The system identified you as a scraper and banned the IP. To use the data on the website contact the owner and request permission"
@JohnWatsonRooney 2 years ago
Unfortunately that's a part of it. You'll ideally need to use proxies to continue - it kinda turns into an arms race.
@abhijitmondal7831 3 years ago
Wow. That's amazing 🔥 I really like your work.
@JohnWatsonRooney 3 years ago
Thanks!
@DjElio100 1 year ago
Thanks man
@tubelessHuma 3 years ago
You are right. Checking for an API should be the first step, to make our lives easier. Thanks John.💖
@Ionut.C 6 months ago
Hello, it works great. What should I do if I want the odds before the matches start? Let's say that every morning I want to copy the odds. I notice that each match has a numeric event identifier. How do I find this identifier so that I can copy the odds and, the next day, enter the final score next to each event? Thank you and all the best!
@adnan-hz7ed 1 year ago
Can I access the "Statistics" this way too? For example, if I wanted to write code that checks whether the home team has 4 shots on target and the away team has 0, and other conditions like that.
@JohnWatsonRooney 1 year ago
Yes, I think you could. Do the same process, but on the page where the stats load, and find the API.
@adnan-hz7ed 1 year ago
@JohnWatsonRooney Hmm, thanks, I will keep trying. It seems a bit difficult: some live games show live stats when you click on them, but I couldn't find the matching keys in the JSON file; they were all false, although some should have been true.
@andriuslopes6377 1 year ago
Thank you very much!! I was having trouble extracting data from dynamic websites.
@luisparada5443 2 years ago
I hope I can buy you a beer sometime man. I appreciate this video for real. Thank you! +1 Follower
@brunogarcia2336 2 years ago
John! Amazing video! I am just starting out with coding and it was nice to learn a lot from you. Question: how can I set up a filter for live games? For example, show only the live games with a 0-0 score, or where the away team has scored once? Is it possible to filter the live games with parameters? It would be amazing to learn this from you as well. Thank you for your effort!
@XiagraBalls 2 years ago
I think the API would simply return no-score draws as just that - 0 : 0
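Filtering can also be done on the client side once the JSON is in hand. The sketch below assumes a payload shaped like {"events": [{"homeTeam": {"name": ...}, "awayTeam": {"name": ...}, "homeScore": {"current": ...}, "awayScore": {"current": ...}}]}; those field names are an assumption for illustration, so check them against the actual response in the Network tab.

```python
def goalless_live(payload):
    """Return (home, away) team names for events currently at 0-0."""
    matches = []
    for event in payload.get("events", []):
        # Field names are assumed; a missing score yields None and is skipped.
        home = event.get("homeScore", {}).get("current")
        away = event.get("awayScore", {}).get("current")
        if home == 0 and away == 0:
            matches.append((event["homeTeam"]["name"], event["awayTeam"]["name"]))
    return matches
```

Other conditions, such as "away team has scored exactly once", only need a different comparison on the same two values.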
@uttamsharma6358 3 years ago
Will you start a Discord server?
@JohnWatsonRooney 3 years ago
I’ve thought about it, I will at some point and I’ll post it up so you guys know. Just not sure when yet!
@scg565081 2 years ago
Thanks for all the tutorials, John. As a newbie to web scraping and data science (never too old to learn at 58), I’m loving the intuitive, plain-English approach you take in your demonstrations. Having watched the ‘Scraping News’ video and now this one, I wonder how you could refine the script to find the site's search bar and submit a topic to search for. For example, my favourite news feed site has a search bar that lets me narrow down my reading material: I enter, say, ‘Ukraine’, and it fetches all the news from around the world on that topic. That is the point at which I’d like to scrape the news feeds, and where your newsfeed script comes into its own. It would be great if you could make a video that handles the search step before the automated scraping. Thanks and keep up the videos. Easily my favourite go-to learning resource.
@jordanherget4453 2 years ago
Well if you look at the network calls when you search something you should be able to track down the endpoint they use for searches. You should then be able to call that endpoint yourself and scrape the data that way.
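Once the search endpoint has been spotted in the Network tab, calling it yourself mostly comes down to rebuilding its URL with your own search term. A small sketch, assuming the endpoint takes `q` and `page` query parameters; both names and the example.com URL are placeholders, and the real ones are whatever the site's own request shows.

```python
from urllib.parse import urlencode


def search_url(base, term, page=1):
    """Build the search endpoint URL for a given term and page number."""
    # Parameter names "q" and "page" are assumptions for illustration.
    return f"{base}?{urlencode({'q': term, 'page': page})}"
```

For example, `search_url("https://example.com/api/search", "Ukraine")` gives a URL that can be fetched in a loop over page numbers.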
@StormWolf01 2 years ago
Well, scraping data from the actual API server as opposed to the webpage itself is actually a great idea. Thanks for the vid.
@Swqtt 2 years ago
Great video. It is a lot more useful to work with the API than with Selenium. I cut my time to download everything from 5 to about 1 minute. Thanks.
@niccolotomei316 1 year ago
Thank you!
@Analyse_US 3 years ago
Will this approach work with dynamic web pages? Or is requests-html still the best approach for dynamic pages?
@JohnWatsonRooney 3 years ago
Yes it will - it cuts out the need to get the data from the page, I’d recommend checking this way out first and see if it can work for you. If it’s not available then rendering the page is the next option
@boiboi1988 2 years ago
Thanks for this tutorial John. Really appreciate what you are teaching here. It solved my web scraping problem. :)
@JohnWatsonRooney 2 years ago
That’s great I’m glad it helped
@h.screation2817 3 years ago
Sir, which theme do you use in VS Code???
@JohnWatsonRooney 3 years ago
Gruvbox Material - it’s in the extensions.
@h.screation2817 3 years ago
@JohnWatsonRooney Thank you, sir.
@pkpkpk_9811 1 year ago
This is a perfect, simple video. However, if the API call changes, how can you parse it, since the old one returns old data??? Thanks in advance.
@Cubear99 2 years ago
Can you do a new video about Amazon for 2022? Amazon has changed. I tried it, but it does not work anymore and gives me a 504. I tried in Java and it does give me all the info.
@killian.1603 2 years ago
The video is really well explained, thanks for that. However, I'm trying to add a condition for tennis games. How should I get the upcoming set "period" from this API into Python?
@wangdanny178 2 years ago
OK, I think this video solved the problem I posted yesterday under another episode about hidden APIs. THANKS JOHN!
@fernandaalves71 10 months ago
Your work is amazing! Thanks for helping me a lot with these scraping practices!
@craftsntech2500 2 years ago
So helpful. Thanks for the concept shared freely
@JohnWatsonRooney 2 years ago
Thanks glad you enjoyed it
@craftsntech2500 2 years ago
@JohnWatsonRooney Really... You know, I spent lots of time doing this via Selenium in Python, but this just made my life much easier.
@gisleberge4363 3 years ago
A few questions. If you use the API endpoint strategy suggested here, aren't you creating some kind of "imbalance" in the requests that the server (?) could easily detect as automated activity rather than a real person? Is there anything one needs to consider to avoid being blocked when scraping the API this way (apart from the obvious, like not doing it too fast)? Also, I assume CAPTCHA is not an issue here (it can be a hassle sometimes)?
@JohnWatsonRooney 3 years ago
Yes, you can absolutely still be detected and blocked. If scraping lots of data, proxies are a must. With most sites, doing it this way you need the cookie generated by your browser - this cookie data is transferred when we use Insomnia, and that allows us access.
@kuniling 3 years ago
I find your web scraping videos the most useful and user friendly on YouTube. I'm just wondering if there is a way to scrape an HTML file from the local hard drive for practice purposes, since I spend some time travelling with no internet connection. In addition, I think it would be nice to avoid overloading a server when practising.
@JohnWatsonRooney 3 years ago
Sure, save the HTML to a file and open it in Python - it will load into bs4 for scraping practice on the go.
@kuniling 3 years ago
@@JohnWatsonRooney wonderful, thank you so much.
@wangdanny178 2 years ago
I found another problem. When I run scoreslive.py, it raises a JSONDecodeError exception. Would you please help me with that? Thanks in advance.