How to run Spark with Minio in Kubernetes

  11,074 views

Brad Sheppard

This video discusses how to set up MinIO object storage in Kubernetes and integrate it with Apache Spark for running analytics jobs.
VIDEO RESOURCES
Music: Day One from YouTube Audio Library
Twitter: / bradsheppard5
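
As a rough illustration of the Spark-to-MinIO integration the video describes, the sketch below collects the Hadoop S3A properties that are typically set so Spark can read and write `s3a://` paths on a MinIO endpoint. The helper names `minio_s3a_conf` and `build_session`, the endpoint `http://minio:9000`, and the placeholder credentials are assumptions made for this example and are not taken from the video. It also assumes the S3A connector classes (the `hadoop-aws` package matching your Hadoop version) are already on the Spark classpath.

```python
from typing import Dict


def minio_s3a_conf(endpoint: str, access_key: str, secret_key: str,
                   use_ssl: bool = False) -> Dict[str, str]:
    """Return Spark properties that point the Hadoop S3A connector at MinIO.

    Hypothetical helper: the property names are standard S3A settings,
    but the values are whatever your own MinIO deployment uses.
    """
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # MinIO is usually addressed as host/bucket rather than bucket.host.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_ssl).lower(),
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
    }


def build_session(app_name: str, conf: Dict[str, str]):
    """Create a SparkSession with the given properties (requires pyspark)."""
    # Imported lazily so minio_s3a_conf can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    # Assumed in-cluster service address and placeholder credentials.
    conf = minio_s3a_conf("http://minio:9000",
                          "ACCESS_KEY_PLACEHOLDER",
                          "SECRET_KEY_PLACEHOLDER")
    for key, value in sorted(conf.items()):
        print(f"{key}={value}")
```

With a session built this way, a job would address objects with paths of the form `s3a://<bucket>/<key>`. Keeping the settings in a plain dictionary makes it easy to reuse the same values locally and in a containerized job.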

Comments: 29
@anmfaisal964 1 year ago
I was looking for such an in-depth description for so long. Take a bow. Many, many thanks. Loved it.
@citormussa 3 years ago
First video I've found that truly goes into the details on how to actually build a Spark cluster (with the MinIO bonus) with Kubernetes, instead of showing everything already built. Thank you very much!
@recs8564 2 months ago
EXACTLY WHAT I HAVE BEEN LOOKING FOR
@th9679 2 years ago
00:14 Set up Data Lake (MinIO) on Kubernetes
03:03 Explore MinIO on Kubernetes and its options
04:20 Explore MinIO web UI (contains kubefwd part)
08:02 Define objective (write analytics job on top of MinIO)
08:34 Install local Spark
13:59 Start writing PySpark job
15:05 Install PySpark
16:00 Writing analytics job in Python
20:24 Run the PySpark job
20:56 Missing Spark dependency (expected) error
21:20 Add the missing dependencies
24:00 Rerun the PySpark job
24:53 Containerize the PySpark job
25:28 Spark-operator setup
27:33 Writing Dockerfile for the Spark job
34:18 Build image from Dockerfile
34:49 Push image to registry
36:10 Deploy Spark job
40:27 Check job outcome
Thank you Brad!
@321andyy 8 months ago
By far the best tutorial that I have seen on this topic! Thank you!!
@DevelopersHubChannel 1 year ago
Underrated tutorial and YouTuber! Love this. Such a practical, end-to-end demo. Pro level. I work on K3s daily and sometimes K8s.
@marekkucak6581 3 years ago
I'm planning to run Spark on a K8s cluster made of a few Raspberries. This was very helpful.
@bradsheppard6650 3 years ago
Thanks, much appreciated! Good luck with the Raspberry cluster.
@marekkucak6581 3 years ago
@bradsheppard6650 Thanks
@GamerPCForever 3 years ago
I figured out how to achieve this in a Mesos cluster, also with MinIO. It was funny to see this, because it would have saved me a lot of time, haha. Great content, sir.
@jijosunny8626 3 years ago
Hi Brad, this is a great video. There are few videos or articles that describe in detail how to connect from Spark to MinIO. Keep it up.
@nikschuetz4112 4 months ago
Nice video. I am trying to publish MinIO events to Kafka and connect them to a Spark Streaming app. The video helps a lot.
@优雅世界 2 years ago
The lesson is very interesting, simple and easy to follow, with a clear topic. Good.
@rahul-qo3fi 1 year ago
thank you so much, this was very informative!!
@anshuman9 3 months ago
Nice and informative video.
@ylcnky9406 3 years ago
This is a great video. Thanks for the great effort. Deserves a subscription. Expansion of this mini-datalake platform with more components such as kafka streaming, spark streaming, etc. would be awesome.
@tratkotratkov126 1 year ago
Thank you for the great explanation!
@특이점이온다-l6d 1 year ago
This is a very helpful video thank you
@hi-kp7jg 1 year ago
you are a gigachad for doing this
@johnmason9788 3 years ago
Thanks Brad. This was helpful. What do you think of submitting jobs via Jupyter notebook vs spark operator?
@bradsheppard6650 3 years ago
Hey John. Great question! Personally I find Jupyter Notebooks are super helpful for messing around and trying things out, but they are less intended for production code. Once I have code working in a notebook, I'll generally containerize it after the fact and then run it similar to how I showed it in the video. The other great thing about the Spark Operator is its scheduling capabilities (via the ScheduledSparkApplication CRD). So if you have a job that you want to run nightly for example, then you can very easily get that automation from the Operator.
@sahmed0211 2 years ago
really good video, you gave me a lot of cool things to work with
@fmfvieira 3 years ago
That was great, man. Very informative!
@kimted3272 3 years ago
thanks bro. saved my day
@afshinyavari7422 2 years ago
Awesome video!
@rahul-qo3fi 1 year ago
21:04 Spark S3 dependencies
25:13 Spark on K8s
@iamindigamer 3 years ago
Is there a GitHub URL with the commands to apply?
@amarvakacharla 2 years ago
This helped a great deal, Brad, thanks a ton. I got an exception while running the spark-pi example because of the latest updates to the repo:
spark-pi-driver is forbidden: error looking up service account default/spark: serviceaccount spark not found
I resolved this with a small change to the helm install command:
helm install spark-operator spark-operator/spark-operator --set serviceAccounts.spark.name=spark-user
and by using the same serviceAccount (spark-user) in spark-pi.yaml.
@tratkotratkov126 2 years ago
Great explanation! Thank you!