Comments
@arleendo 3 hours ago
My company doesn't allow access to PATH editing, so the command playwright install does not work. How do I initialize playwright from the command line manually? Thanks!
@FolioCharm 15 hours ago
I'm interested in automating media (video, image...) upload/download, are there any python libraries that tackle such tasks or do you have any tutorial on this subject?? Thanks.
@JohnWatsonRooney 12 hours ago
There are, but it’s not something I’ve ever done, I’m afraid! I know it’s possible, though.
@5508Vanderdekken 15 hours ago
Damn why didn’t I think of automating my job
@drachenfliger1368 1 day ago
Nord
@sylcexport9642 1 day ago
Hi John, it seems the Google API changed the way we write updates. Here's what I had to write to get it to work: sh.update([['third']], 'A1')
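Editor's note: the comment above does not say which library `sh` comes from. Assuming it is a gspread worksheet, one way to sidestep the positional-argument order is to pass both arguments by keyword. The sketch below is hedged accordingly: `write_cell` is a hypothetical helper, and the keyword names `values` and `range_name` reflect my understanding of gspread's `Worksheet.update` parameters, so check them against the version you have installed.

```python
def write_cell(worksheet, cell, value):
    """Write a single value to one cell of a worksheet-like object.

    Assumption: `worksheet` exposes update(values=..., range_name=...),
    as a gspread Worksheet is understood to. Passing keywords means the
    call does not depend on which argument comes first positionally.
    """
    # A single cell is still sent as a 2-D list: one row, one column.
    worksheet.update(values=[[value]], range_name=cell)
```

With a real worksheet this would be called as `write_cell(sh, "A1", "third")`.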
@AjiPirjian 1 day ago
🔥🔥🔥
@BrownStain_Silver 1 day ago
I'm just finishing up Automate the Boring Stuff now. I've been writing some programs for work as well. I'm a few months in and it's a lot of fun. So far I've automated a few work processes. The most meaningful one reduces a potentially 30+ hour task to hitting run on a program, providing 3 user responses, and a 2-5 minute wait.
@potatopc 1 day ago
this is life saving!!!
@joshuadias2468 2 days ago
Fantastic video
@aperxmim 2 days ago
Could you please provide the sample file?
@iitsTech 2 days ago
Great video ty!
@amirsohail2368 3 days ago
Hey John, can you please make a video on scraping desktop application data?
@markuswierer1755 3 days ago
As a person who doesn't understand much about coding: is having your username and password openly in a script the norm? It kinda... "feels" wrong... I assume you could make those two things an interactive prompt to type in every time you start the program instead of having them written in the code permanently. But that won't help much if it is scheduled to run automatically at certain times without user interaction... Again, I don't understand much about programming, so I'm curious if that is a security concern in the "real world"?
@JohnWatsonRooney 2 days ago
You’re right, it isn’t. It’s a security risk. But I wanted to keep it as simple as possible here. I would store the user and pass in environment variables or similar. But as I designed this script to only run locally with me there, I didn’t worry about it.
@markuswierer1755 2 days ago
@JohnWatsonRooney Thank you very much for your reply! Good to know! Then this is something I have to look further into, should it come to a practical application for me. But at least my "feeling" wasn't wrong. That's something, I guess. Keep up the good work, I enjoyed the video. :)
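Editor's note: the environment-variable approach mentioned in this thread can be sketched as follows. This is a minimal illustration, not the script from the video; the variable names `SCRAPER_USER` and `SCRAPER_PASS` and the helper `load_credentials` are hypothetical.

```python
import os


def load_credentials(env=os.environ):
    """Read the login details from environment variables.

    SCRAPER_USER and SCRAPER_PASS are hypothetical names chosen for this
    sketch. `env` defaults to the real environment but accepts any
    mapping, which keeps the function easy to test.
    """
    try:
        return env["SCRAPER_USER"], env["SCRAPER_PASS"]
    except KeyError as missing:
        # Fail early with a clear message instead of logging in with blanks.
        raise RuntimeError(
            f"Set the {missing.args[0]} environment variable before running"
        ) from None
```

Because the values live in the environment rather than the source file, a scheduled run can still pick them up without user interaction, and the script itself can be shared without exposing the password.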
@pcservicelondon 3 days ago
awesome video :) thank you
@kashifrana6798 3 days ago
Amazing content. Is there a video on working with APIs in Postman?
@cpaandy3380 3 days ago
I was shocked how slow BeautifulSoup is compared to cheerio! I thought that because scraping is a big thing in Python, everything would be optimized, but that's not the case!
@muhammadsalmandata 3 days ago
How do you scrape the Zillow website? Please make a video on it.
@alexmckinley79 4 days ago
My god, I love using playwright, but had no idea about codegen! Thank you sir.
@JNET_Reloaded 4 days ago
erv2 $ sudo apt install -y chromium-chromedriver
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package chromium-chromedriver
How do I get this on my RPi 5 (ARM, Raspbian 64-bit)?
@JNET_Reloaded 4 days ago
Can't install from the command line, and you didn't put a link to the site.
@tvcodemate 4 days ago
You are great 👍🏽 I have a question: I am going to make videos about web scraping. Is that OK with KZbin rules? I mean, I want to make tutorials about scraping IMDb or Amazon. KZbin doesn’t have a problem with this? Thanks
@JohnWatsonRooney 4 days ago
Yeah go for it, I’ve got lots of web scraping videos and others do too
@tvcodemate 4 days ago
@JohnWatsonRooney perfect, thanks 👍🏽
@HumanityAsCode 4 days ago
This is awesome but if you have access to APIs rather than UIs it's a lot more efficient to do it without any UI involved at all. Ideally software developers would be involved in helping eliminate so much busy work in the first place. I try to be helpful to other departments in that way where some small changes to a tool someone uses every day can make all the difference.
@Mousoukyou 4 days ago
Very good info, thanks!
@stanTrX 5 days ago
Thanks good man
@dj0men666 5 days ago
Does the system not have an export or an API? That would be much easier than hacking around the front end. If there was no export then I can see its use, or even running things in the UI. I've used test software to automate things for me before :) It's great seeing the computer do what you would be doing normally while you get on with something else more productive.
@cbacca2999 5 days ago
Can you put the tool name "Playwright for Python" in the video description so your video will come up in searches for Playwright? I don't know what automation tools are out there, and I just learned Python, so I'm looking at random videos about automation right now.
@tmb8807 6 days ago
Good stuff. Nice to see Polars being adopted. I would prefer to use Python myself but sadly it’s an absolute no-no for our IT. So I’ve got stuck in with Power Query and I’ve found that it is actually very capable. The concepts are pretty similar. The mashup language is apparently derived from F#. Main difference I’ve found with writing the code is just that commands/functions have to be nested rather than chained. It’s more verbose and can be less readable but ~all the functionality is there.
@JohnWatsonRooney 5 days ago
Yeah, Power Query is great for what it is; I have friends in data jobs who are masters with it. For me it just didn’t make sense, though, as I had the option to learn and use Python and end up with a wider, more useful skill set.
@tmb8807 5 days ago
@JohnWatsonRooney Completely agree. Much more transferrable.
@AshleyMush 6 days ago
Wow, this is much cleaner than Selenium
@arcticslay6012 6 days ago
May I ask why you chose this method over the playwright codegen method you used in your past vid?
@JohnWatsonRooney 6 days ago
Sometimes codegen doesn’t pick the best selectors so I often do it myself
@saurabhjain2437 6 days ago
async for loop would have been more readable…
@capability-snob 6 days ago
input() is a nice alternative to sleep you may like - it waits until you press enter in the shell, so you can wait as little or as much as you like.
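Editor's note: the `input()` tip above can be wrapped in a tiny helper. This is only a sketch; `pause` and its `wait` parameter are hypothetical names, and `wait` exists purely so the function can be exercised without a keyboard.

```python
def pause(prompt="Press Enter to continue... ", wait=input):
    """Block until the user presses Enter, instead of sleeping a fixed time.

    `wait` defaults to the built-in input(); passing another callable
    lets the pause be replaced in unattended runs or tests.
    """
    wait(prompt)
```

Calling `pause()` where a script would otherwise call `time.sleep(...)` lets you inspect the browser for as long or as little as you like before the script carries on.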
@andrewlimmmmm 6 days ago
thank you bro you are the best
@TheTruepikvic 7 days ago
What about captcha?
@njl1282 8 days ago
I did just this and now work remotely making six figures at the same company
@MrMantas211 8 days ago
Did the aii transfer to other languages?
@BreetaiZentradi 8 days ago
At a job I had, one of the more important clerical staff lost 4 hours a week comparing Folder A to Folder B, and Folder B to Folder C, and noting what had changed with what folders were added or moved. I wrote a script that would daily create a directory listing, DIFF it with yesterday's listing, produce a report of ADDs, MOVEs and DELETEs and email it to them daily, then make today's listing "yesterday's listing" for the next run. This saved them 4 wasted work weeks a year. In addition, they had the changes daily now instead of weekly. Once that was available, it turned out there were 7 or 8 other people who benefited from this information. I just changed the program to email them all. It is amazing how one little script can save so much time.
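Editor's note: the listing-and-diff idea described above can be sketched in a few lines. This is not the commenter's script; `snapshot` and `diff_snapshots` are hypothetical names, and treating a deleted path and an added path with the same file name as a MOVE is a simplifying assumption. Emailing the report and saving today's listing for the next run are left out.

```python
from pathlib import Path, PurePosixPath


def snapshot(root):
    """Return the set of all paths under `root`, relative to it."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*")}


def diff_snapshots(yesterday, today):
    """Compare two listings and report ADDs, MOVEs and DELETEs."""
    added = today - yesterday
    deleted = yesterday - today

    # Index the deleted paths by their final name component.
    by_name = {}
    for path in deleted:
        by_name.setdefault(PurePosixPath(path).name, []).append(path)

    # Heuristic (an assumption of this sketch): an added path whose name
    # matches exactly one deleted path is reported as a move.
    moves = []
    for path in sorted(added):
        candidates = by_name.get(PurePosixPath(path).name, [])
        if len(candidates) == 1:
            moves.append((candidates.pop(), path))

    for old, new in moves:
        deleted.discard(old)
        added.discard(new)

    return {"added": sorted(added), "moved": sorted(moves), "deleted": sorted(deleted)}
```

A daily run would call `snapshot()` on the folder, compare it with the listing stored by the previous run, and then store the new listing in its place.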
@Gunat737 8 days ago
Interesting tutorials
@brocktechnology 9 days ago
I didn't follow any of that but it looks really useful so I'm gonna discuss it with ChatGPT and see if I can figger out how to do it.
@alizeshan6432 9 days ago
For anyone who's confused: the slice containing numbers is named "letters" and the letters are named "numbers". @john, thanks for this. It is quite helpful. Dealt with something like this a few weeks ago. Ended up using re.split to solve my problem.
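Editor's note: the comment above does not show the `re.split` call that was used. As one possible illustration, the sketch below splits a string at every boundary between letters and digits; the helper name `split_letters_and_numbers` and the sample inputs are assumptions, and splitting on empty matches like this needs Python 3.7 or later.

```python
import re

# Zero-width pattern: matches the position between a letter and a digit,
# in either order, without consuming any characters.
BOUNDARY = re.compile(r"(?<=[A-Za-z])(?=\d)|(?<=\d)(?=[A-Za-z])")


def split_letters_and_numbers(text):
    """Split `text` wherever letters meet digits, e.g. 'ABC123'."""
    return BOUNDARY.split(text)


print(split_letters_and_numbers("ABC123"))  # ['ABC', '123']
print(split_letters_and_numbers("AB12CD"))  # ['AB', '12', 'CD']
```

Unlike fixed-position slicing, this does not depend on how many letters or digits the value contains.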
@xilllllix 9 days ago
i'm going through ALL of your videos and just finished this one! learning so much it's incredible!
@shaftymaze 9 days ago
You changed my life. Only your videos. No joke.
@shaftymaze 9 days ago
I mean there was a lot of grind. Except these videos made it possible.
@usr829 10 days ago
Very useful. Thank u so much
@Chill018 10 days ago
Nicely explained and all... however, what about when you need to navigate a website once you are logged in? Or when a website has reCAPTCHA or Cloudflare protection? I have been struggling quite a lot with different websites that are not as simple as the dummy site you are using.
@imranelectricandhardwareba9332 10 days ago
Nord
@user-ry5zu1wo4e 10 days ago
Thanks
@dfcastro 10 days ago
I automate some activities that are repetitive and boring. In my case I used a bash script.
@adventurefilms 10 days ago
What resource did you come across at 1:03? Apologies, the title and author all ran together.
@aslmate 11 days ago
Do you think this could work with multiprocessing or multithreading in python? Like opening up many instances of the same process at the same time? How would you change the code so that every error is handled during runtime and to make sure it cannot fail?
@alexkaspi 11 days ago
The CS references really hit home for me. 🤣🤣🤣🤣 great video!
@TheNazbul 11 days ago
Some websites give a 403 error. How can we solve that?