1.2 Billion Records Per Hour High Performance Kafka and Spark - End to End Data Engineering Project

Views: 9,103

CodeWithYu

1 day ago

Comments: 50

@CodeWithYu 2 months ago

Please don't forget to like and subscribe! ❣

@ss426 1 month ago

Please share the system requirements to run this project.

@sarahakinbami2959 2 months ago

Thanks!

@CodeWithYu 2 months ago

You’re most welcome! Thanks for the support! ❤️

@danieladeniyi1558 2 months ago

You are the real MVP. Will sink my teeth into this

@CodeWithYu 2 months ago

Great! Don't forget to leave feedback too. Thanks!

@thesungodzzz 2 months ago

This is a great video!!!! This is what I expected to see. A great one, Yu!!! Please keep going. You are inspiring people in this field.

@CodeWithYu 2 months ago

Thank you! Will do!

@shusants 2 months ago

Wonderful content. Thank you so much Yu🙌

@sathyans3852 2 months ago

This is gold. Amazing work and explanations.👌👌👌

@SoarSquads 2 months ago

The view quality is so much better brother keep it up. God bless you and your family

@CodeWithYu 2 months ago

Only the best for you!

@ravitejapothala977 2 months ago

Great video! This is what they look for in resumes and interviews: not just an ETL pipeline, but building scalable and fault-tolerant ones. Requesting that you include Kubernetes in the scalable pipelines. Waiting for part 2.

@CodeWithYu 1 month ago

Great point!

@mohitupadhayay1439 2 months ago

Great work brother

@wiss1998 2 months ago

Thank you again, Youssef!

@CodeWithYu 2 months ago

It’s my pleasure Wiss!

@DExpertz 2 months ago

🔥I see Yu post a video, I comment first, then I watch the video. Keep it up, bro

@CodeWithYu 2 months ago

Legend! You know the drill! 🙌😀

@TAUFIQUESEKH 2 months ago

Very good content. Loved it. Thanks for making this

@CodeWithYu 2 months ago

Only the best for you!

@orafaelgf 2 months ago

I learn so much from your videos, man. Keep going. If possible, bring content about Kubernetes (Airflow, Spark, Strimzi (Kafka), data lake, etc.). Thanks.

@CodeWithYu 2 months ago

Great suggestion!

@NgynAn-dg3kp 2 months ago

Love the way you explain the high-level architecture on the blackboard, great 🔥🔥🧑‍🚒

@CodeWithYu 2 months ago

Glad you like it! More to come! 🔥

@thesungodzzz 2 months ago

Great video!!!!

@mohammadbilalniazi5987 2 months ago

Jazak Allah khayran (may Allah reward you with goodness)

@CodeWithYu 1 month ago

Ameen. You too brother

@parasagrawal1140 24 days ago

What are the system requirements? Can I build this project on my personal laptop?

@ss426 2 months ago

What are the system requirements to run this project on a personal laptop?

@ATHARVA89 2 months ago

thank you

@dhruvingandhi1114 2 months ago

Can you please tell me where you got the Docker Compose YAML file?

@shubhambhandari431 2 months ago

Why do we use checkpoints in Spark? Can we use the group id in Kafka to keep track of records?

@CodeWithYu 2 months ago

Good question! It’s for fault tolerance. Think of failures or crashes during processing: checkpoints let the stream resume reading from Kafka at a reasonable point close to where it failed. Secondly, during stateful processing, checkpoints make the coordination easier. On the other hand, the group id is just for coordinating offsets for message delivery. It has nothing to do with checkpointing or fault tolerance. Hope that helps!
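To make the checkpoint idea in the reply above concrete, here is a minimal sketch in plain Python. It is a toy simulation, not Spark's actual mechanism and not the code from the video: the function names, the JSON checkpoint file, and the simulated crash are all assumptions made for illustration. In Spark Structured Streaming the analogous setting is the `checkpointLocation` option on the stream writer.

```python
import json
import os
import tempfile


def process_with_checkpoint(records, checkpoint_path, fail_at=None):
    """Process records in order, saving the next offset after each one.

    Hypothetical helper for illustration only.
    """
    # Resume from the saved offset if a checkpoint exists; otherwise start at 0.
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["next_offset"]

    processed = []
    for i in range(offset, len(records)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError(f"simulated crash at offset {i}")
        processed.append(records[i].upper())  # stand-in for real processing
        # Write to a temp file, then rename, so a crash mid-write
        # never leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_offset": i + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return processed


def demo():
    """Crash partway through, then restart and resume from the checkpoint."""
    records = ["a", "b", "c", "d", "e"]
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "checkpoint.json")
        try:
            process_with_checkpoint(records, path, fail_at=3)
        except RuntimeError:
            pass  # the first run dies after handling offsets 0, 1 and 2
        # The restarted run reads the checkpoint and continues at offset 3.
        return process_with_checkpoint(records, path)


if __name__ == "__main__":
    print(demo())
```

The restarted run only handles the records after the last saved offset, which is the behaviour the reply describes: progress is recorded by the processing job itself, so a restart does not begin from scratch.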

@PRUTHVIRAJ-wp9vu 2 months ago

Great 👍

@CodeWithYu 2 months ago

Thank you! Cheers!

@nadiiar75 2 months ago

🔥

@thirdojohnson4325 2 months ago

Hello boss, just wanted to know if you are available at your convenience for a quick question on any of your socials, if you don't mind me asking. Thank you for all you do for us beginners.

@CodeWithYu 2 months ago

Sure thing, how can I help?

@flynnryder7823 2 months ago

Can you please make your screen smaller, thanks

@bharathdeva8849 2 months ago

Hello Yu. Great content recently ❤️✨. You value your audience's interests, which is great ❤️✨. I don't see many people doing that. But I see that the source code is only for members, which costs 10 dollars. Coming from a tier 2 city, 10 dollars is very high; it is 10% of my salary 😢, which I can't afford. Could you consider my request to change your membership fee?

@CodeWithYu 2 months ago

It’ll be available to all YouTube members soon

@shusants 2 months ago

I agree with @bharathdeva8849. @CodeWithYu, that would be really helpful. Been your subscriber, really loved your content, keep on

@salvityagi2288 1 month ago

@CodeWithYu when will it be available?

@jay_wright_thats_right 2 months ago

I like blackboarding but only if you CAN WRITE legibly. 🤣🤣

@CodeWithYu 2 months ago

lol sorry… will improve next time. I sometimes wonder if I write those! 😂😂😂😂

@chirantharavishka5918 9 days ago

Sir, my Kafka configuration works correctly, because the producer sent 5 lakh messages without problems, but when I run the Spark file I get the error below:

25/02/02 05:34:30 INFO BlockManagerMasterEndpoint: Registering block manager 13567d606e23:44337 with 434.4 MiB RAM, BlockManagerId(driver, 13567d606e23, 44337, None)
25/02/02 05:34:30 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20250202053430-0001/2 on worker-20250202035206-172.21.0.6-36449 (172.21.0.6:36449) with 1 core(s)
25/02/02 05:34:30 INFO StandaloneSchedulerBackend: Granted executor ID app-20250202053430-0001/2 on hostPort 172.21.0.6:36449 with 1 core(s), 1024.0 MiB RAM
25/02/02 05:34:30 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 13567d606e23, 44337, None)
25/02/02 05:34:30 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 13567d606e23, 44337, None)
25/02/02 05:34:30 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
25/02/02 05:34:31 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20250202053430-0001/1 is now RUNNING
25/02/02 05:34:31 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20250202053430-0001/2 is now RUNNING
25/02/02 05:34:31 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20250202053430-0001/0 is now RUNNING
25/02/02 05:34:42 WARN ResolveWriteToStream: spark.sql.adaptive.enabled is not supported in streaming DataFrames/Datasets and will be disabled