Selenium - Real World Web Scraping - Challenges and Solutions

7,363 views

Code [RE] Code

A day ago

In this video, you will learn how to solve a real-world scrape using Selenium, including export to CSV.
See the same site scraped using Scrapy: • Python and Scrapy - Sc...
Source: gist.github.co...
Visit my site for a free course on Scrapy: coderecode.com/
~-~~-~~~-~~-~
Please watch: "Making Scrapy Playwright fast and reliable"
• How to make Scrapy Pla...
~-~~-~~~-~~-~

Comments: 30
@cadbane9072 2 years ago
omg. that stale element error has been driving me nuts the past few days. Thank you! You explained it reallllllyyyy well. you got a fan!
@mikewaters6980 2 years ago
You’re a great teacher, one of the best I’ve seen.
@codeRECODE 2 years ago
Thank you, Mike. Your words mean a lot!
@meamitacharya 4 years ago
Please upload tutorials on XPath selectors
@codeRECODE 4 years ago
Putting on the list :-)
@engineerbaaniya4846 4 years ago
Please upload more videos
@Song-dc5hl 2 years ago
Hi codeRECODE, thanks for your video. I have an issue with headless mode. I am trying to scrape another website, and I can get the info in normal mode. But when I apply headless, it does not return any info when I use the XPath selector. The reason I want to use headless is that I want to run my code on a Linux server. Thank you again for your video.
@arpitakar3384 A month ago
jai hind sir loved to follow today
@haideralihassan5053 3 years ago
Sir, if we have freelancing platform sites open at the same time while running all this, will it affect our profiles, since Chrome is opening and closing automatically? Please answer this before I start.
@codeRECODE 3 years ago
Ideally, this is what you should do: write the code and run it yourself to see what happens. The less ideal answer to this question is that Selenium uses a different profile, so your other browsing sessions will not be affected. Furthermore, if you set it to run in headless mode, you will not even know when the browser opened and closed.
@haideralihassan5053 3 years ago
@codeRECODE thanks sir.
@haideralihassan5053 3 years ago
Hi sir, I have created a Google Maps scraper with Selenium and used the technique from this video to save the data, and it all works well. The problem is that if the script hits an error just before it finishes, the data is not saved. Is there a technique to save data to the CSV as the script runs, just like Scrapy? Then if the script stops anywhere in the middle, we would get the data collected up to that point instead of nothing.
@codeRECODE 3 years ago
Use try-except-finally and write the data in the finally block.
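The try-except-finally suggestion above could look roughly like the following. This is only a sketch using the standard library: `scrape_rows`, `scrape_to_csv`, and the `name` field are hypothetical stand-ins for the real Selenium loop, and a `None` item simulates a mid-run failure.

```python
import csv


def scrape_rows(items):
    """Hypothetical stand-in for the scraping loop; yields one dict per item."""
    for item in items:
        if item is None:
            # Simulates an error raised partway through a scrape.
            raise ValueError("simulated scraping failure")
        yield {"name": item}


def scrape_to_csv(items, path):
    """Collect rows and write whatever was gathered, even if scraping fails."""
    rows = []
    try:
        for row in scrape_rows(items):
            rows.append(row)
    except Exception as exc:
        print(f"Scrape stopped early: {exc}")
    finally:
        # Runs on success and on failure, so partial results are still saved.
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=["name"])
            writer.writeheader()
            writer.writerows(rows)
    return len(rows)
```

Calling `scrape_to_csv(["A", "B", None, "C"], "schools.csv")` would save the two rows collected before the simulated error. An alternative design is to write each row to the file as soon as it is scraped, which also protects against the process being killed outright.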
@KhalilYasser 3 years ago
Thank you very much. As for the code on GitHub, in the line `school_name_selector = '.school-title ~ h1'`, I think it should be `school_name_selector = '.school-title h1'`
@codeRECODE 3 years ago
I wanted to check this before replying, but I haven't got to that yet. Thanks anyway :-)
@codecafe_bits 4 years ago
Please make a video on using Scrapy with Selenium
@codeRECODE 4 years ago
Good idea! Adding to the list.
@robertsmith4019 2 years ago
It's much less trouble to use WebDriverManager.
@codeRECODE 2 years ago
Yes, that does make things easier.
@manimaranr7509 4 years ago
Please do a video on JavaScript websites that have many hyperlinked pages...
@codeRECODE 4 years ago
Do you have any specific problem that you are facing?
@shabbirahmed9734 3 years ago
Thanks a lot!!!
@codeRECODE 3 years ago
Happy to help
@muhammadrizwan-tz9oz 2 years ago
After printing the third school, I am getting an index out of bounds exception.
@rangabharath4253 4 years ago
awesome sir
@abukaium2106 4 years ago
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
How do I solve this problem?
@codeRECODE 4 years ago
Look into Explicit and Implicit waits in the documentation. selenium-python.readthedocs.io/waits.html
@stalluri11 3 years ago
What's your email address?
@codeRECODE 3 years ago
coderecode.com/contact-us/
@daddyrichie 4 years ago
Very Nice and Very Helpful. Thank you ! May I ask if you have any plan to make a tutorial that includes a site login, navigation and monitor of XHR traffic once on the correct page. (Your $1 course "schools" video has some of the elements I'm referring to.) What I'm trying to solve is logging in, getting to the correct page gracefully with natural pauses and lags like a human does when clicking through. An example could be a live auction where a current bid on an item is served and visible on the UI and on the XHR stack in real time along with other info flowing in. The purpose fulfilled would be to capture time, date, item, sale price and build a "comparable sales" database. To be a bidder in any auction, one most likely has to login and navigate to the specific auction channel of interest, sit there and watch the items come and go and manually record what an item sold for.