Cleaning up 1000 Scraped Products with Polars

5,691 views

John Watson Rooney

Days ago

Comments: 27
@JohnWatsonRooney 8 months ago
To try everything Brilliant has to offer, free, for a full 30 days, visit brilliant.org/JohnWatsonRooney/. You'll also get 20% off an annual premium subscription.
@crissydogg 8 months ago
Brilliant as always
@fredde7356 8 months ago
Hey John, can you please continue the scraping livestream with your test site? Would love to see how to handle the drop-down menus, JavaScript, and stricter Cloudflare rules. Would be happy to hear some news if you plan to continue.
@JohnSmith-y8o 8 months ago
I'm parsing my scraped data into Pydantic models. Never looked into dataframes, pandas, etc. Should I? :D
@JohnWatsonRooney 8 months ago
definitely, it's great for analysis and transforming data
@Rice0987 8 months ago
I got the following message in VS: "Missing required CPU features. Install the `polars-lts-cpu` package instead of `polars` to run Polars with better compatibility." I installed it, and it started working!
@Rice0987 8 months ago
This is an additional option if you hit this issue: install that second package after installing polars.
@historiadeunahora6383 6 months ago
this also helped me, thanks!!
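Editor's note: the fix described in this thread is usually applied with `pip install polars-lts-cpu`. To see which of the two packages is present in the current environment, here is a minimal sketch using only the standard library. The helper name `installed_polars_distributions` is hypothetical, and the two distribution names are taken from the error message quoted above.

```python
from importlib import metadata


def installed_polars_distributions(candidates=("polars", "polars-lts-cpu")):
    """Return {distribution name: version} for each candidate that is installed.

    Hypothetical helper for illustration; the default names come from the
    error message quoted in the thread.
    """
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed in the current environment.
            continue
    return found


if __name__ == "__main__":
    print(installed_polars_distributions())
```

An empty dictionary means neither package is installed in the interpreter that ran the script.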
@prashantbhosale6745 8 months ago
Hi, can you show us how to extract all the data related to a title/field from a PDF file?
@Septumsempra8818 8 months ago
Has anyone built an engine to mimic human behavior? How does one move the mouse, scroll, etc. to mimic human behavior? I have a site that requires a form to be filled in to change location, but it triggers a captcha without fail. Does anyone have tips on mimicking human behavior or on how to type something into a captcha form?
@pypypy4228 8 months ago
What are the advantages over Pandas?
@JohnWatsonRooney 8 months ago
It’s going to be faster but for what I do there aren’t any really - I just wanted to try it out
@gokulyc 8 months ago
Code / repo link?
@einekleineente1 8 months ago
0:20 you promised to link to the other video where you got the data... now I am sad that the link is not there... and I never would get the idea to look at your channel to see which videos you posted in the last days... ;-)
@DM-py7pj 8 months ago
might be video titled "Website to Dataset in an instant" based on quick scan of field names.
@mishmohd 8 months ago
Cleaning data is so passé, we up here dry cleaning data
@DM-py7pj 8 months ago
Looks v similar to PySpark.
@JohnWatsonRooney 8 months ago
i've not used that but looks good
@bakasenpaidesu 8 months ago
.
@JohnWatsonRooney 8 months ago
..
@abdulrahmanharoon3165 8 months ago
Is it faster than pandas?
@bakasenpaidesu 8 months ago
10x
@JohnWatsonRooney 8 months ago
It's written in Rust, and yes, it is generally faster than pandas
@abdulrahmanharoon3165 8 months ago
Thanks @JohnWatsonRooney
@Septumsempra8818 8 months ago
Kinda the same speed. It shines when the data gets big. Once your file is 1 GB+ it makes pandas look old.
@abdulrahmanharoon3165 8 months ago
Thanks @Septumsempra8818