I have been looking for a nice end-to-end streaming data pipeline tutorial forever. Every tutorial was missing something; this one isn't. It has it all: it's clear, complete, and really well explained. Could not recommend it enough.
@oussamafahd4205 · a year ago
Thank you Davis for this beautiful mini project; it has everything I've wanted to learn as a beginner in big data technologies.
@hkrishnan91 · 2 years ago
The world is a better place because of people like you. Very nice presentation of the topic and a clear explanation of the various steps involved. Thank you very much!
@wewatchyourwebsite1 · 3 years ago
Very well done! I like the fact that you show what you installed and how, and explain every step in perfect detail. Outstanding. I'd avoided Scala because it seemed too complicated; after seeing your video, I moved my project from Python to Scala, since the API is almost identical. I loved that you show your steps for creating the build.sbt, creating the jar file, and how to submit it. I also learned how to get rid of the annoying INFO output via the log4j file. You also showed how to add the paths so I can run my project from a different folder. Some of this I already knew, but the fact that you included it all is outstanding. I'm sure I could have found all this without your video, but the fact that you did it all is just amazing. Thank you so much for doing this.
@bigdataprofessional1880 · 2 years ago
Super presentation. Practical and thorough guidance for a Kafka novice.
@jayc4849 · 4 years ago
This is great. Exactly what I needed to see, and well explained. I absolutely recommend making the change to the log4j properties file; otherwise Spark's output will make it really hard to find your data when you first test in the console.
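For anyone following along, a minimal sketch of an alternative with the same effect: lower the log level from the driver instead of editing conf/log4j.properties. The app name below is just a placeholder, not from the video.

    // Minimal sketch (app name is a placeholder): suppress Spark's INFO chatter so
    // the console sink's output stays readable. Equivalent in effect to lowering
    // log4j.rootCategory from INFO to WARN in conf/log4j.properties.
    import org.apache.spark.sql.SparkSession

    object QuietLogsExample {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("kafka-streaming-demo")
          .master("local[*]")
          .getOrCreate()

        // Only WARN and above reach the console from this point on.
        spark.sparkContext.setLogLevel("WARN")
      }
    }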
@marcosoliveira8731 · 2 years ago
Very good example. It gave me a lot of ideas. Thanks a lot.
@taimoorabbasi169 · 2 years ago
Just the video I needed. Brilliant work. Really appreciate it 👏👏
@gasmikaouther6887 · 2 years ago
Hi Mohammed, can we reimplement it together?
@taimoorabbasi169 · 2 years ago
I haven't worked with Scala, but I've implemented it in Python. We can do it together if you want. 💯
@guilhermesilveira2742 · 2 years ago
This is exactly what I was looking for. Thanks!!
@faaizshah5830 · 3 years ago
This is fantastic work!
@ambar752 · 4 years ago
Nicely explained, Davis... Thanks!!
@My_Tahar · 4 years ago
Thank you @Davis for sharing; the tutorial is very clear.
@sankarreddy82 · 3 years ago
Superb. Very much helpful. Thanks a lot.
@savacano2858 · 3 years ago
this is exactly what I am looking for!! Thanks!!!!!
@abdulelahaljeffery6234 · 3 years ago
quality content ... thank you Davis! subscribed
@jagadeeshporalla4041 · 4 months ago
Thanks Davis. If there are multiple writeStream queries running on different datasets derived from the same Kafka readStream, as you did… is each query going to grab the data independently?
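For what it's worth, each started query checkpoints its own Kafka offsets, so yes, every writeStream pulls from the source independently. A rough Scala sketch; the topic name, checkpoint paths, and console sinks are assumptions, not what the video uses, and the spark-sql-kafka package must be on the classpath.

    import org.apache.spark.sql.SparkSession

    object MultiQuerySketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("multi-query").master("local[*]").getOrCreate()
        spark.sparkContext.setLogLevel("WARN")

        // One streaming DataFrame built from Kafka ("events" is a hypothetical topic).
        val raw = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "events")
          .load()
          .selectExpr("CAST(value AS STRING) AS value")

        // Query 1: raw records to the console; keeps its own offsets in its own checkpoint.
        val q1 = raw.writeStream
          .format("console")
          .option("checkpointLocation", "/tmp/chk-console")
          .start()

        // Query 2: a derived aggregate with a separate checkpoint, so it reads Kafka independently.
        val q2 = raw.groupBy("value").count().writeStream
          .outputMode("complete")
          .format("console")
          .option("checkpointLocation", "/tmp/chk-counts")
          .start()

        spark.streams.awaitAnyTermination()
      }
    }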
@eatonleo · 3 years ago
Could you please upload a PySpark version instead of Scala? Really appreciated.
@neutrinos401 · 4 years ago
thank you so much for your great work
@ДмитрийСиницкий-р2щ · 3 years ago
My thanks! Great tutorial!
@devon299 · 4 years ago
thank you! this is helpful to get started.
@sumitkhattar9849 · 3 years ago
Great, and thank you. Keep it up.
@aomo5293 · 3 years ago
Great tutorial. Thank you, bro.
@amitmiglani5256 · 4 years ago
Hi, I am getting this error: Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.cassandra.package$. Can you help?
@nadiafekih2619 · 4 years ago
Did you find a solution to this problem? I'm facing the same issue!!
@dji5028 · 4 years ago
Same here. Did you try to run the app in cluster mode?
@mouadriali3282 · 3 years ago
Same Here >_
@ryanrickerts5982 · 3 years ago
See 34:10
@srhreza · 3 years ago
Find the proper dependency for your Spark version.
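That ClassNotFoundException usually means the DataStax Spark Cassandra Connector isn't on the classpath at runtime. A hedged build.sbt sketch; the project name and version numbers are placeholders and the connector release must match your Spark and Scala versions.

    // build.sbt sketch -- versions below are examples only, not from the video.
    name := "kafka-spark-cassandra"
    version := "0.1"
    scalaVersion := "2.12.15"

    libraryDependencies ++= Seq(
      "org.apache.spark"   %% "spark-sql"                 % "3.1.2" % "provided",
      "org.apache.spark"   %% "spark-sql-kafka-0-10"      % "3.1.2",
      "com.datastax.spark" %% "spark-cassandra-connector" % "3.1.0"
    )

Alternatively, pass the same coordinates to spark-submit via --packages so they don't have to be bundled into your jar.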
@alisahin1466 · 4 years ago
You're amazing, man... thank you so much.
@mouhammaddiakhate3546 · 3 years ago
How would we do the spark-submit if it were with PySpark?
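It works much the same way, except you hand spark-submit the .py file instead of a jar and pull in the connectors with --packages. A sketch only; the script name and package versions below are assumptions.

    spark-submit \
      --master local[*] \
      --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.2,com.datastax.spark:spark-cassandra-connector_2.12:3.1.0 \
      streaming_app.py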
@RajmohanBalachandran · 3 years ago
thanks a lot Davis
@imosolar · 3 years ago
Can I use a UI / Jupyter notebook instead of the command line to run it?
@samuelfernandoperezarpi4251 · 2 years ago
Hi, is this lambda architecture?
@nihedattia9477 · 4 years ago
Which type of machine/VM do you work on?
@TechGeniusMinds · 4 years ago
Nice explanation
@Cal97g · 3 years ago
Bro, have you heard of docker-compose for installing Kafka?
@shrikantbhadane478 · 3 years ago
Good session. Can you teach the same thing with PySpark?
@kunjalsujalshah1992 · 3 years ago
Amazing!
@srhreza · 3 years ago
Awesome.
@Raaj_ML · 3 years ago
Very helpful tutorial! What is that Trigger.ProcessingTime("5 seconds") and when did you add it?
@hkrishnan91 · 2 years ago
He didn't talk about it. It determines the batch interval. From the manual: "The trigger settings of a streaming query define the timing of streaming data processing, whether the query is going to be executed as a micro-batch query with a fixed batch interval or as a continuous processing query."
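For context, the trigger is set on the writeStream builder. A minimal, self-contained sketch using the built-in rate source (not the video's Kafka source), just to show where Trigger.ProcessingTime goes:

    // The rate source generates timestamp/value rows for testing; with this trigger,
    // Spark plans a micro-batch every five seconds instead of as fast as possible.
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.streaming.Trigger

    object TriggerSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("trigger-demo").master("local[*]").getOrCreate()
        spark.sparkContext.setLogLevel("WARN")

        val query = spark.readStream
          .format("rate")
          .option("rowsPerSecond", "5")
          .load()
          .writeStream
          .format("console")
          .trigger(Trigger.ProcessingTime("5 seconds"))
          .start()

        query.awaitTermination()
      }
    }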
@suryanshsingh6906 · a year ago
thanks
@enesuguroglu3542 · 3 years ago
It would have been great if all the code were implemented in Python.