Comments
@F-Zone1 3 days ago
Hello sir, can you please explain how to connect to a DB2 database from PySpark? I have searched a lot of websites and other sources, but I couldn't connect to my company's IBM DB2 to access tables. Please help me with the supported JAR files or whatever I need to use to access it. Thanks.
@gurumoorthysivakolunthu9878 5 days ago
Very detailed... best ever explanation of a topic, Sir... This is amazing... Thank you, Sir....
@Sauravsuman11005 5 days ago
Data nodes = 10, 16 CPUs per node, 64 GB memory per node. Please tell me which cluster config we should choose?
@shyamyadav-qk4zb 5 days ago
Very good video, sir. My doubt is clear now. Very helpful 🙏
@arnabghosh21 7 days ago
For the same 10 GB file, suppose we have the following resources: 38 GB worker memory with 10 cores, 8 GB driver memory with 2 cores, and manually configured shuffle partitions = 80. How will it behave?
@ashutoshkandpal6792 14 days ago
How do I become a member to unlock videos?
@soumikdas7709 15 days ago
Nice explanation
@user-uo8jg2qm8q 16 days ago
Will creating a workspace incur charges?
@SanjayKumar-rw2gj 17 days ago
Great explanation, to the point, no exaggeration. Thanks for the video.
@Wilmar-xz5io 18 days ago
best explanation ever!
@VivekKBangaru 18 days ago
Very informative video. Thanks
@ayeshakhan8726 24 days ago
Easy explanation👍
@PANKAJKUMAR-fe8zn 28 days ago
Wonderful explanation. I was studying Data Cloud in Salesforce, and they mentioned this data format multiple times. I was clueless, but I got clarity from your video. Thank you, sir.
@rjrmatias 1 month ago
Excellent video, thank you Master
@nisarirshad8366 1 month ago
I have completed this course on Udemy and highly recommend it. It's very well explained and easy to understand.
@abhilashvasanth700 1 month ago
Hello, is this page updated? Can we rely on it to stay updated by becoming a member? If not, where are all your courses updated? I took your PySpark course on Udemy. Though the beginning was really good, the later part of the course did not have a continuous flow. How do I enroll in your batch course?
@srikanthk-yp4wj 1 month ago
To watch all the videos of the Databricks course playlist, should we subscribe at 199/- or 399/-?
@AhsanTirmiziVlogs 1 month ago
Such engaging content, you don't lose me for a second... amazing explanation... bless you, brother... in my language, "SAADAA KHUSHBU"
@sonurohini6764 1 month ago
Great, but the interviewer's follow-up question for this is: how do we arrive at 4x memory per executor?
@amlansharma5429 1 month ago
Spark reserved memory is 300 MB, and executor memory should be at least 1.5x the Spark reserved memory, i.e. 450 MB. That is why we take executor memory per core as 4x, which comes to 512 MB per executor per core.
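The arithmetic in this reply can be checked with a short sketch. It assumes the 128 MB partition size and the 5 cores per executor quoted elsewhere in this thread. The variable names are illustrative only and are not Spark configuration keys.

```python
# Assumed inputs, taken from figures quoted in the comments (not from Spark itself).
RESERVED_MB = 300          # Spark reserved memory, per the reply above
MIN_FACTOR = 1.5           # executor memory must be at least 1.5x reserved
PARTITION_MB = 128         # assumed default partition size
MULTIPLIER = 4             # the "4x" rule of thumb discussed in the thread
CORES_PER_EXECUTOR = 5     # assumed cores per executor

# Smallest executor memory implied by the 1.5x rule.
min_executor_mb = RESERVED_MB * MIN_FACTOR

# Memory budgeted per core under the 4x rule.
per_core_mb = PARTITION_MB * MULTIPLIER

# Memory per executor if every core gets that budget.
executor_mb = per_core_mb * CORES_PER_EXECUTOR

print(min_executor_mb, per_core_mb, executor_mb)
```

Under these assumptions the per-core budget of 512 MB clears the 450 MB floor, and 5 cores give 2560 MB (2.5 GB) per executor, which rounds up to the 3 GB figure mentioned in another comment.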
@justinmurray8313 1 month ago
Is there still a coupon to get this course for free?
@ABQ... 1 month ago
Please provide prerequisites
@AmitCodes 1 month ago
How to certify
@saisivamadhav8338 1 month ago
Awesome sir
@anuragjaiswal1399 1 month ago
Thanks man it worked.
@ongn1611 1 month ago
Very simple and precise. Thank you
@nwanebunkemjika7822 1 month ago
THANKS
@deevjitsaha3168 1 month ago
Is this course suitable for Scala users, or do we need to have Python knowledge?
@tridipdas9930 1 month ago
What if the cluster size is fixed? Also, shouldn't we take the per-node constraint into account? For example, what if the number of cores in a node is 4?
@Lakshvedhi 1 month ago
Very, very good and valuable course.
@veerendrashukla 1 month ago
In the last step, you ran kinit, which pulled the TGT, and then the dev user could list the files. At what point did the client interact with the TGS using this TGT?
@user-dx9pj6bp3w 1 month ago
The course is very well organized
@vvsekhar1 1 month ago
Thank you so much. Well explained about Root user.
@federico325 1 month ago
Incredible, thanks
@vaibhavtyagi9885 2 months ago
In the last question, every value you took was a default (128 MB, 4, 512 MB, 5 cores). So if the question is for 50 GB of data, would 3 GB still be the answer?
@HIMANSHUMISHRA-yg8dc 2 months ago
I get the error "ModuleNotFoundError: No module named 'pyspark.streaming.kafka'" when running the command spark-submit --packages org.apache.spark:spark-streaming-kafka-0-10_2.13:3.5.1 live_processing.py. Can you help, please?
@Amarjeet-fb3lk 2 months ago
If the number of cores is 5 per executor, then at shuffle time Spark creates 200 partitions by default. How will those 200 partitions be created if the number of cores is smaller, since 1 partition is stored on 1 core? Suppose my config is 2 executors, each with 5 cores. How will it create 200 partitions if I do a group-by operation? There are 10 cores, and 200 partitions need to be stored on them, right? How is that possible?
@navdeepjha2739 1 month ago
You can set the number of partitions equal to the number of cores for maximum parallelism. Of course, you cannot create 200 partitions in this case.
@DUFFERMEHUL 2 days ago
In your case, if 200 partitions are created, your degree of parallelism will be 10, which means 10 partitions will be processed at a time, and once those slots are free, the next 10 partitions will be processed.
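The scheduling described in this reply can be sketched as simple arithmetic. The helper name shuffle_waves is hypothetical, and the model assumes one task slot per core and equally sized tasks, so it is a rough estimate rather than a description of Spark's scheduler internals.

```python
import math


def shuffle_waves(num_partitions: int, executors: int, cores_per_executor: int):
    """Return (parallel task slots, number of task waves) for a shuffle stage.

    Assumes one task per partition and one task slot per core.
    """
    slots = executors * cores_per_executor
    waves = math.ceil(num_partitions / slots)
    return slots, waves


# Figures from the question: 2 executors with 5 cores each, 200 shuffle partitions.
slots, waves = shuffle_waves(200, 2, 5)
print(slots, waves)
```

With these numbers there are 10 slots and 20 waves, so all 200 partitions are still created and are simply processed 10 at a time. The default of 200 comes from the spark.sql.shuffle.partitions setting, which can be lowered if fewer, larger tasks are preferred.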
@NandhaKumar1712 2 months ago
Hi, thanks for the explanation. It really helps. In the above example, let's say we are getting impressionId=4 in the right stream, and we didn't get matching events for id=4 on the left stream for a long time. Is it possible to get this record inside the foreachBatch() function before it gets dropped by Spark?
@prasannakumar7097 2 months ago
Very well explained
@robertakid727 2 months ago
That is an extraordinary explanation. Thank you.
@oleg20century 2 months ago
Best video about these three abstractions
@Mado44555 2 months ago
Thank you for explaining. I was looking for a starter example to understand what it is, but the other videos seemed to be explaining to experts. I managed to follow your steps, but after running the code and the ncat command I am getting errors, and the first one is: "chk-point-dir". Any help?
@CloudandTechie 2 months ago
C:\kafka\bin\windows>kafka-console-producer.bat --topic test2 --broker-list localhost:9092 < ..\data\sample1.csv
The system cannot find the path specified.
How do I fix this error?
@rajibinus 2 months ago
Insightful explanation. Thanks for the video.
@rajat_ComedyCorner 2 months ago
Great job, Sir