
2.2 - Submit Dataproc job via Cloud Console | Apache Spark on Dataproc | Google Cloud Series

12,687 views

Sushil Kumar

A day ago

In this video we'll see how to submit a Spark job on Dataproc cluster via Google Cloud Console.
Spark Job Driver : gist.github.co...
This video is part of the course Apache Spark on Dataproc. You can find all the videos for this course in the following playlist.
• Apache Spark on Datapr...
I regularly blog and post on my other social media channels, so make sure to follow me there as well.
Medium : / sushil_kumar
Linkedin : / sushilkumar93
Github : github.com/kay...
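For reference, the Console workflow shown in the video has a command-line equivalent. A minimal sketch using the gcloud CLI (the cluster name, region, and GCS path below are placeholders, not the ones used in the video):

```shell
# Submit a PySpark job to an existing Dataproc cluster.
# The driver script must already be in a GCS bucket.
gcloud dataproc jobs submit pyspark \
    gs://my-dataproc-bucket/jobs/spark_driver.py \
    --cluster=my-cluster \
    --region=us-central1
```

The same flags (main Python file, cluster, region) map directly onto the fields you fill in on the Console's "Submit a job" form.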

Comments: 11
@anshikayadav5718
@anshikayadav5718 3 years ago
Great learning!!
@naveendayyala1484
@naveendayyala1484 A month ago
Why are we adding jobs one at a time here? Can't we use this cluster the way we use Databricks? And what platform can we use to run queries interactively, like we do in Databricks?
@sharadshrivastava4285
@sharadshrivastava4285 2 years ago
Hi Sushil, I am getting an error while submitting the Dataproc PySpark job: "Job failed with error: task was not acquired." On Google, I found this may be due to out-of-memory or a network issue. None of my PySpark jobs run. Do you know how to check for a network issue here? The master and worker nodes show as up and running in the GCP console.
@kaysush
@kaysush 2 years ago
Hey Sharad, did you check in the YARN UI whether all the nodes are healthy and have resources? Is this a shared cluster, or is it exclusive to the job you are running? This error is usually caused by YARN being unable to provide the required resources. Also check that your per-executor configuration does not exceed a single node (e.g. an executor asking for 10 GB of memory while you are using smaller nodes). Let me know if this works. Thanks.
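The checks suggested above can also be done from the command line. A rough sketch, assuming you SSH into the cluster's master node (the cluster and zone names are placeholders):

```shell
# SSH into the Dataproc master node (name/zone are placeholders)
gcloud compute ssh my-cluster-m --zone=us-central1-a

# On the master: list all YARN NodeManagers with their state,
# containers, and memory/vcore usage -- unhealthy nodes show up here
yarn node -list -all

# Cluster-wide resource summary, similar to the YARN UI overview
yarn top
```

If a node reports UNHEALTHY or the available memory per node is smaller than what one executor requests, that would explain the "task was not acquired" failure.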
@parasbansal6069
@parasbansal6069 2 years ago
How to add the python file on the mentioned path?
@kaysush
@kaysush 2 years ago
It's at the start of the video. We are using gsutil to put the file in the bucket.
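For anyone looking for the exact commands, a minimal sketch of uploading a local driver script with gsutil (the file and bucket names are placeholders):

```shell
# Copy the local PySpark driver file into a GCS bucket
gsutil cp spark_driver.py gs://my-dataproc-bucket/jobs/spark_driver.py

# Verify the upload by listing the bucket path
gsutil ls gs://my-dataproc-bucket/jobs/
```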
@parasbansal6069
@parasbansal6069 2 years ago
@@kaysush Got it. Can I copy the spark py file from github to GCS directly?
@kaysush
@kaysush 2 years ago
@@parasbansal6069 I don’t think you can. You’ll have to download that file.
@neha-fp6kp
@neha-fp6kp A year ago
@@parasbansal6069 You can clone the Git repo and then copy the file from it to a bucket.
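A sketch of the clone-then-copy approach described above (the repo URL, file path, and bucket name are placeholders). As an alternative, gsutil can also read from stdin, so you can stream a raw GitHub file into GCS without saving it locally:

```shell
# Option 1: clone the repo, then copy the script into the bucket
git clone https://github.com/example-user/spark-examples.git
gsutil cp spark-examples/spark_driver.py gs://my-dataproc-bucket/jobs/spark_driver.py

# Option 2: stream the raw file straight into GCS ("-" reads from stdin)
curl -sL https://raw.githubusercontent.com/example-user/spark-examples/main/spark_driver.py \
  | gsutil cp - gs://my-dataproc-bucket/jobs/spark_driver.py
```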
@duskbbd
@duskbbd A year ago
It would be better if you also showed how to view the Spark UI DAG after job submission on Dataproc, and how to write logs to GCS buckets. @sushilkumar