Web Crawler - System Design Interview Question

8,523 views

TechPrep

days ago

This is a solution to the classic web crawler system design interview question. It addresses the main problems most interviewers would want to see handled and covers additional areas that may come up in the interview.
⏰ Time Stamps ⏰
0:00 Use cases
0:42 Requirements
1:15 Estimates
3:06 Architecture overview
9:06 URL frontier
11:21 System flow
12:26 Additional discussion points
Preparing for a technical interview?
👉 Check out techprep.app/yt to nail your next interview

Comments: 11
@Robloxgod4 4 months ago
The KZbin algorithm has picked up your channel. Really good content
@SirDrinksAlot69 4 months ago
Hashes. You can even halve them, for example, and as long as the interviewer doesn't have any rules around specific length, add digits until it clears; you can do things to make that fast as well. Hashes also help with obfuscation, so it's harder to scan and obtain the short URLs, and they make looking up duplicates easier.
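One way to read the "shorten the hash, then add digits until it clears" idea is sketched below. This is only an illustration under stated assumptions: the function name `shorten`, the in-memory dict `store`, the choice of SHA-256, and the default length of 7 are all hypothetical and not taken from the video.

```python
import hashlib


def shorten(long_url, store, min_len=7):
    """Return a short code for long_url, extending the hash prefix on collision.

    `store` is a hypothetical dict mapping short code -> long URL; a real
    system would use a database with a uniqueness constraint instead.
    """
    digest = hashlib.sha256(long_url.encode("utf-8")).hexdigest()
    for length in range(min_len, len(digest) + 1):
        code = digest[:length]
        existing = store.get(code)
        if existing is None:
            # Prefix is free: claim it for this URL.
            store[code] = long_url
            return code
        if existing == long_url:
            # Same URL was shortened before: duplicate lookup is just a hit.
            return code
        # Otherwise another URL owns this prefix, so try one more character.
    raise RuntimeError("full digest collided; extremely unlikely with SHA-256")


store = {}
first = shorten("https://example.com/some/long/path", store)
again = shorten("https://example.com/some/long/path", store)
print(first == again)  # the same URL always maps to the same code
```

Because the code is derived from the URL itself, re-submitting a URL walks the same prefixes and lands on the entry it created earlier, which is the "looking up duplicates" benefit the comment mentions.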
@LouisDuran 2 months ago
I like that these are short and sweet. It shouldn't take an hour to explain TinyURL or web crawler. Thanks!
@TechPrepYT 22 days ago
Exactly 👍
@rajaryanvishwakarma8915 4 months ago
Great video man
@ChimiChanga1337 4 months ago
Excellent! Could you also talk about what kind of network protocols would be used for the services to talk to each other?
@LearningNewThings0407 2 months ago
Is it Font queue prioritizer or Front queue prioritizer?
@jjlee4883 4 months ago
Awesome video. Would it make sense for the URL Seen Detector and URL Filter to come after the HTML parser step?
@TechPrepYT 4 months ago
Thanks for the comment! You would want the duplicate detection to occur directly after the HTML parser, as we don't want to process the same data and extract the same URLs from the same page. That's why the URL Seen Detector and URL Filter happen later on in the system. Hope this makes sense!
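The ordering described in this reply can be sketched roughly as follows. This is a minimal illustration, not the video's implementation: the `Pipeline` and `LinkExtractor` names, the in-memory sets standing in for the content cache and the URL Seen Detector store, and the use of a SHA-256 checksum for content comparison are all assumptions.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


class Pipeline:
    def __init__(self):
        self.seen_content = set()  # stand-in for the content cache (page checksums)
        self.seen_urls = set()     # stand-in for the URL Seen Detector's store

    def process(self, page_url, html):
        """Return the new URLs that a fetched page contributes to the frontier."""
        # 1. Duplicate detection: skip pages whose content was already processed.
        checksum = hashlib.sha256(html.encode("utf-8")).hexdigest()
        if checksum in self.seen_content:
            return []
        self.seen_content.add(checksum)

        # 2. Link extraction, only for content that is actually new.
        extractor = LinkExtractor()
        extractor.feed(html)

        new_urls = []
        for href in extractor.links:
            absolute, _fragment = urldefrag(urljoin(page_url, href))
            # 3. URL filter: keep only http(s) links (an assumed, simplistic rule).
            if not absolute.startswith(("http://", "https://")):
                continue
            # 4. URL seen detector: drop URLs that were already queued.
            if absolute in self.seen_urls:
                continue
            self.seen_urls.add(absolute)
            new_urls.append(absolute)
        return new_urls
```

In this sketch the content check runs once per fetched page, while the URL checks run once per extracted link, which is one reason the two kinds of deduplication sit at different points in the flow.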
@WINDSORONFIRE 20 days ago
How does the design of a web crawler not include geo located servers etc?
@dibll 4 months ago
During the duplicate detection step, how is the Content Cache being used? Could someone please explain?