Data Engineering Interview - Netflix Clickstream Data Pipeline

Views: 3,852

Exponent

A day ago

Comments: 9
@tryexponent 13 days ago
Join the waitlist for Exponent’s Data Engineering Interview Course: bit.ly/4cmpq34
@harisridhar1668 a month ago
I strongly appreciated the trade-offs and architecture insights discussed:
1. A hybrid approach that chooses between Spark Streaming and Apache Flink as the distributed computing platform, based on the latency requirement of each clickstream metric. The justification is that Spark Streaming works well when metrics are generated at intervals of 1 second or more, whereas Apache Flink meets millisecond to sub-second latency.
2. The push model (agents and daemons) versus the pull model (the infra) for large-scale data pipelines: the former is better for real-time needs (even though it may overwhelm the pipeline), whereas the latter is polling-based and may fail to deliver real-time (or close-enough-to-real-time) customer behavior insights.
3. The justification for using a NoSQL DB rather than a SQL DB for compute storage: NoSQL is schema-less and high-performing, with low-latency reads and writes, so it can handle large event volumes at ingestion (e.g. Kafka's 50,000 events/second). He also identified RDBMS storage as a potential pipeline bottleneck.
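The first two trade-offs in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design from the video: the 1-second threshold comes from the comment, while `choose_engine`, `BoundedBuffer`, `push_model`, and `pull_model` are hypothetical names invented here, and the buffer is a toy stand-in for a real ingest layer.

```python
from collections import deque

# Assumption taken from the comment: about 1 second separates the two engines.
SUB_SECOND_THRESHOLD_MS = 1000


def choose_engine(required_latency_ms):
    """Return a hypothetical engine label for a metric's latency requirement."""
    if required_latency_ms < SUB_SECOND_THRESHOLD_MS:
        return "flink"
    return "spark_streaming"


class BoundedBuffer:
    """Toy ingest buffer with a fixed capacity; counts events it has to drop."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def offer(self, event):
        # Reject the event when the buffer is already full.
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(event)
        return True


def push_model(events, buffer):
    """Agents send every event immediately; a burst can overflow the buffer."""
    for event in events:
        buffer.offer(event)


def pull_model(source, buffer, batch_size):
    """One polling cycle: take only what fits; the rest waits at the source."""
    room = buffer.capacity - len(buffer.items)
    take = min(batch_size, room, len(source))
    for _ in range(take):
        buffer.offer(source.popleft())
    return take


# Push: a burst of 10 events hits a buffer of 4, so 6 events are dropped.
push_buffer = BoundedBuffer(capacity=4)
push_model(list(range(10)), push_buffer)

# Pull: nothing is dropped, but 7 events are still waiting after one poll.
source = deque(range(10))
pull_buffer = BoundedBuffer(capacity=4)
taken = pull_model(source, pull_buffer, batch_size=3)
```

In this toy run the push side loses events under a burst, while the pull side loses nothing but leaves events waiting until the next poll, which is the latency cost the comment describes.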
@prashantsalgaocar a month ago
I thought this was too high-level. No non-functional requirements were discussed, and a lot of the complexity was abstracted away by the use of Lambda. There should have been more discussion of the core functional and non-functional requirements, along with some deeper dives, which this system design lacked.
@angelotheman a month ago
We need more of these. However, try to make them suitable for beginners or, better still, state the experience level in the title so we know who the video is aimed at. Thanks
@tryexponent a month ago
Great idea! We're actually working on this. Maybe adding "This is how a junior candidate answers. This is how a senior candidate answers." Hopefully rolling out soon. Stay tuned
@Ikyua a month ago
I love this channel so much keep up the content :)!
@akshayshankar3707 16 days ago
When are you launching the data engineering course?
@tryexponent 7 days ago
Hey akshayshankar3707, we are planning to launch it in 1-2 months' time. Join our waitlist so you get notified when it happens! www.tryexponent.com/courses/data-engineering
@tryexponent 4 days ago
Likely in October! Finishing up some final lessons right now.