Google Cloud Tutorial - Hadoop | Spark Multinode Cluster | DataProc

98,887 views

Learning Journal

Comments: 152
@ScholarNest 3 years ago
Want to learn more Big Data technology courses? You can get lifetime access to our courses on the Udemy platform. Visit the link below for discounts and coupon codes. www.learningjournal.guru/courses/
@benjamincabalona9014 4 years ago
This video is underrated.
@matheusmota108 5 years ago
Great job! I'm impressed. Straight to the point.
@robind999 6 years ago
Thank you LJ, another great demo. I am working on GCP now and glad to see your great instruction.
@satyenmehta9749 5 years ago
Very nice!! One suggestion - it would be very helpful to get access to each of these window commands (instead of typing them).
@vmahi111 6 years ago
Thanks!!! Clear-cut explanation.
@sonjoysengupto 7 years ago
Fantastic tutorial, Sir ... I was sweating this out on VirtualBox Ubuntu 16.04 VMs for a POC demo, but this tutorial made my life super simple🤗
@sumitchakraborty2475 3 years ago
Nice video. Do you have a video on Presto accessing data from a GCS bucket?
@dipanshutyagi3984 3 years ago
Hello sir, after running "c:\program files\Google\chrome\Application .......... I am getting "This site can't be reached. The webpage at 0.0.0.0/ might be temporarily down or it may have moved permanently to a new web address."
@Kaaz- 3 years ago
Simple and clear explanation for a great service. Thanks a lot
@dharmeswaranparamasivam5498 5 years ago
Thanks, Prashanth, for giving detailed tutorials. This is helping me a lot to learn without any issues. Great!!
@saurav0777 4 years ago
Is Oozie also available in Dataproc? How do you set up automated data pipelines in GCP?
@emiliod90 5 years ago
Great presentation, thank you
@writtikdey4309 2 years ago
I cannot find the other videos in this series... please assist.
@nxtbigthing3012 7 years ago
Sir, can you show peering of two VPCs from different projects for a Cloudera or Hortonworks multinode cluster?
@thilinamadush1 4 years ago
Great job! Very easy to understand
@ganeshsundar1484 5 years ago
Awesome, Prashanth. Could you make a few more videos on GCP, especially on Big Data services?
@vrowmuria 6 years ago
As at 3:43, I hit the create button and the cluster was created. Can I now configure it for Jupyter notebook?
@parthdayala7519 5 years ago
Amazing... God bless you always, sir...
@vishalteli7343 5 years ago
Excellent Sir..
@mahdip.4674 1 year ago
How do I open and configure Jupyter with Dataproc on a VPC?
@hemanttoday 5 years ago
Outstanding, amazing. You have made the subject easy and interesting.
@abhishekchoudhary247 5 years ago
Great video sir. Just what was needed. Thanks. You rock!
@niketkumar18 4 years ago
You are great..
@sudharsanganesh8190 6 years ago
Very informative... crystal clear.
@DataPro360 4 years ago
Great video, straight to the result.
@PierLim 6 years ago
Thank you for the step-by-step explanation, very clear.
@rahulberry4806 4 years ago
Is there any way to access Dataproc via a UI (similar to a Linux system)?
@nileshparekh1569 4 years ago
Excellent 👍 Can we have more such tutorials on GCP?
@cleverclover7 4 years ago
Great video.
@chiruvarma6457 6 years ago
Thanks a lot!!! Crystal clear :)
@ShawVishal 4 years ago
Can you make a tutorial on installing Ambari the same way?
@imammuhajir2200 3 years ago
I can connect to port 8088, but if I change to the Jupyter notebook port 8123, I get an error.
@npraba187praba4 3 years ago
How do I join your Dataproc course? Please let me know; I am interested.
@oxpeople 5 years ago
Thanks for your wonderful tutorial. However, the SSH tunnel can't connect. Do you know how to debug this? Any help appreciated.
@ScholarNest 5 years ago
It works as long as you give all parameters correctly.
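For reference, the two commands from the video follow this pattern - a minimal sketch assuming a master node named spark-6-m in zone us-east1-c (substitute your own cluster name, zone, and a free local port):

# Terminal 1: open the SOCKS tunnel to the master node (with -N it stays open and prints nothing on success)
gcloud compute ssh --zone=us-east1-c --ssh-flag="-D" --ssh-flag="10000" --ssh-flag="-N" "spark-6-m"

# Terminal 2 (Windows): launch Chrome through the tunnel; the user-data-dir must be a path Chrome can create
"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" "spark-6-m:8088" --proxy-server="socks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=c:\tmp\spark-6-m

If the node name or zone is wrong, SSH cannot reach the master and the proxy will not work.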
@amitbaderia4194 5 years ago
Excellent
@azimkangda8545 6 years ago
I'm not able to open the Resource Manager UI. I tried the steps given in your video, but Chrome still shows a "site can't be reached" error. I'm able to open an SSH session. Please guide.
@b6654prakash 6 years ago
Thank you for the nice explanation!!!!!!
@sujoykumarbatabyal7809 5 years ago
Just awesome.
@rudraactivities7162 6 years ago
Sir, please upload a video on the difference between Hive and HBase.
@sshiv908 4 years ago
Can I install Apache Flume on Dataproc?
@hemilpatel925 4 years ago
Perfect, sir. Thank you so much
@hubstrangers3450 6 years ago
Could you kindly invoke these activities using GCP Cloud Shell and explain the flags, please, including the delete option within a time interval?
@rajeshm039 5 years ago
When I execute the command to launch the browser - "%ProgramFiles(x86)%\Google\Chrome\Application\chrome.exe" ^ --proxy-server="socks5://localhost:1080" ^ --user-data-dir="%Temp%\spark-cluster-m" spark-cluster-m:8088 - the browser opens but displays "No internet". Could you help with this error?
@ThePrimosTeam 4 years ago
After executing this command - "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" "spark-6-m:8088/cluster" --proxy-server="socks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmp/spark-6-m - the browser opens up but it says "This site can't be reached. spark-6-m's server IP address could not be found. Try running Windows Network Diagnostics. DNS_PROBE_FINISHED_NXDOMAIN". Can you please help me here?
@tejaswikt4166 4 years ago
Dear Sir, can you please copy-paste the contents of the commands? We will edit them to our own hostname settings.
@anshulelegant 4 years ago
Hello Mr. Prashant, thank you very much! When I run the gcloud command on Mac to establish the SOCKS tunnel, it just doesn't do anything. Do I need to have PuTTY on my machine? Is there a way to validate whether the tunnel is open? Thanks.
@ScholarNest 4 years ago
Now you don't have to open a tunnel. Google Dataproc now offers direct access to the web UI without creating a tunnel.
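As a sketch of the newer approach, assuming a recent gcloud SDK: create the cluster with the Component Gateway enabled, and the console then shows direct links to the YARN and other web UIs on the cluster's "Web Interfaces" tab - no SSH tunnel or SOCKS proxy needed.

# Component Gateway exposes the cluster web UIs directly from the console
gcloud dataproc clusters create spark-6-m \
  --region=us-east1 \
  --enable-component-gateway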
@anshulelegant 4 years ago
Learning Journal, thanks for your reply! I am trying to connect an IDE on my local machine to the GCP VM, and that's why I thought a SOCKS5 tunnel might come in handy.
@HAMZA-he9nd 5 years ago
Can I use Hive, Pig, and Sqoop on this?
@alkaarora730 5 years ago
Hello Sir, wonderful explanation. Can you guide us on creating a cluster on Databricks?
@ScholarNest 5 years ago
Databricks Community Edition is an on-demand, ephemeral, single-node instance for learning. Creating a multinode cluster needs a commercial plan, and the process is well documented and straightforward.
@alkaarora730 5 years ago
Thanks Sir
@jsregar 6 years ago
Thank you for the crystal clear tutorial!! I was wondering, if I'm going to process some data, say in CSV format, where should I put it? And how do I put the data there?
@ScholarNest 6 years ago
You can upload your file to any of the nodes and then copy it to HDFS or Google Cloud Storage. I have explained and demonstrated this in my Spark tutorials playlist.
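A minimal sketch of that flow, assuming a master node named spark-6-m in zone us-east1-c and a bucket named my-project-bucket (both placeholder names):

# Copy the file from your machine to the master node
gcloud compute scp data.csv spark-6-m:~/ --zone=us-east1-c

# On the master node, put it into HDFS...
hdfs dfs -put ~/data.csv /user/your-user/data.csv

# ...or copy it to Google Cloud Storage instead
gsutil cp ~/data.csv gs://my-project-bucket/data.csv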
@srinivasgoleti1396 5 years ago
Thank you so much for your reply
@vmahi111 7 years ago
How is it cost efficient?
@harshitgupta2515 6 years ago
If I have a Pentium processor and 2 or 4 GB RAM, can I work on Hadoop by doing as you said in the video? Will I be able to perform every task - Pig, Hive, HBase, everything?
@ScholarNest 6 years ago
I suggest you use a cloud VM. You will be able to do everything. No need to buy a new laptop.
@harshitgupta2515 6 years ago
Learning Journal, how can I contact you? Sir, you are my last hope. I really can't afford a new laptop but want to learn a lot... Please, sir, help me. How can I contact you?
@MrNarendra440 5 years ago
While running Chrome and connecting through the proxy, I am getting the error "Chrome can't read or write to its directory /tmp/hadoop-cluster-m". Do you have any suggestions? Commands used: 1) gcloud compute ssh hadoop-cluster-m ^ --project= ^ --zone=us-east1-b -- -D 1080 -N (this succeeds) 2) "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" ^ --proxy-server="socks5://localhost:1080" ^ --user-data-dir="/tmp/hadoop-cluster-m" hadoop-cluster-m:8088
@ScholarNest 5 years ago
If you are on Windows, try changing /tmp to c:/tmp or c:\tmp.
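In other words, a corrected sketch of the second command above, keeping everything else the same:

"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" ^ --proxy-server="socks5://localhost:1080" ^ --user-data-dir="c:\tmp\hadoop-cluster-m" hadoop-cluster-m:8088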
@MrNarendra440 5 years ago
@@ScholarNest Thanks a lot. Your site is really awesome. No need to spend $$ on all those MOOCs.
@srinivasgoleti1396 5 years ago
If we run the job in production, we need the cluster every day. How do we handle that case?
@ScholarNest 5 years ago
It depends on your requirement. 1. Real-time, always-running jobs will need a persistent cluster. 2. Short-lived scheduled jobs will launch the cluster, finish the job, and release the resources. You can do both.
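For the second pattern, a hedged sketch of the launch-run-release cycle using the gcloud CLI (the cluster name, region, and job JAR below are placeholders):

# Create a short-lived cluster
gcloud dataproc clusters create etl-cluster --region=us-east1

# Submit the job and wait for it to finish
gcloud dataproc jobs submit spark --cluster=etl-cluster --region=us-east1 \
  --class=com.example.MyJob --jars=gs://my-bucket/my-job.jar

# Release the resources
gcloud dataproc clusters delete etl-cluster --region=us-east1 --quiet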
@sudharsanganesh8190 6 years ago
How do I install other Hadoop tools, such as Sqoop and HBase, in the cluster?
@ScholarNest 6 years ago
One easy option is Dataproc initialization actions. I have an example in one of the Spark videos.
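As a sketch, an initialization action is just a shell script in a GCS bucket that Dataproc runs on every node at creation time; the script path below is a placeholder:

gcloud dataproc clusters create spark-6-m \
  --region=us-east1 \
  --initialization-actions=gs://my-bucket/install-extra-tools.sh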
@sriharshaalapati 6 years ago
Sir, I want to use RStudio as a front end for SparkR. How do I do that on Dataproc?
@ScholarNest 6 years ago
I have not done that, but you may want to try installing RStudio Server on the master node and configuring the Spark home. Ideally it should work.
@sriharshaalapati 6 years ago
Sir, please share a tutorial video of your solution; it would be helpful for my project. Thank you for your reply.
@deepuinutube 6 years ago
Google SDK installation issue/error: unresolved hostname. Can someone help me, please!!
@nationviews6760 7 years ago
Thank you so much, Sir, for such a nice explanation. I would like to know how to load a CSV file and access it in spark-shell. Actually, I am not able to give the exact path for loading the CSV file through SSH in spark-shell.
@ScholarNest 7 years ago
Move the file into HDFS, and you should be able to load it. However, I will show some examples in upcoming videos.
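As a sketch of the loading step: once the file is in HDFS (or in a GCS bucket), spark-shell on Dataproc can read it by URI, since HDFS is the cluster's default filesystem and the GCS connector is preinstalled. The paths below are placeholders:

// Read from HDFS
val df1 = spark.read.option("header", "true").csv("hdfs:///user/your-user/data.csv")

// Or read directly from Google Cloud Storage
val df2 = spark.read.option("header", "true").csv("gs://my-bucket/data.csv")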
@pankajjha08 6 years ago
I have moved the data to the master node and then used hdfs dfs -cp to move it into HDFS. What location do we need to pass to access this file in spark-shell?
@anuragguleria 6 years ago
Hi sir, I have made a cluster in GCP. I want to add the Python package datacompy, but when I try, it shows an environment error. Can you please help me with this? I created the same cluster as you explained in this video. I have to install the datacompy package for PySpark.
@ScholarNest 6 years ago
datacompy can be installed using pip, and you need conda for that purpose. You can use the conda initialization action to configure conda and some additional Python packages. Check the link below for more details. github.com/GoogleCloudPlatform/dataproc-initialization-actions/blob/master/conda/README.MD
@anuragguleria 6 years ago
Hi sir, I have done that like this, but I am getting an error: mymasterworld@spark-hadoop-m:~$ gcloud dataproc clusters create foo \ --metadata 'CONDA_PACKAGES="numpy pandas",PIP_PACKAGES=pandas-gbq' \ --initialization-actions \ gs://dataproc-initialization-actions/conda/bootstrap-conda.sh,gs://dataproc-initialization-actions/conda/install-conda-env.sh Did you mean zone [asia-east1-a] for cluster: [foo] (Y/n)? Y ERROR: (gcloud.dataproc.clusters.create) PERMISSION_DENIED: Request had insufficient authentication scopes. Can you please tell me how to fix this?
@mohitbansal6360 7 years ago
This is completely new. Great job, keep going!! I want to ask: suppose I have 300 MB of data (a CSV file), and I want to distribute the data across the cluster and run a command to fetch some data based on a given condition, just like a query. Can I do this, and where do I need to run my query? I want to know the flow through the cluster and the time taken for the complete operation. Please let me know about this.
@ScholarNest 7 years ago
I will use this cluster in my upcoming Spark videos. I will do a lot of similar things (loading data and crunching it). You will get that soon.
@mohitbansal6360 7 years ago
Okay. Thanks.
@SahandPi 6 years ago
Could you give the commands (for the SSH and the Chrome steps) in the description section of the video, to make them more copy-paste friendly? :-) Thanks
@ScholarNest 6 years ago
+Sahand Razzaghi, check out the Spark foundation course page on my website. All commands and code are available there for copying. www.learningjournal.guru/courses/spark/spark-foundation-training/multi-node-spark-setup/
@SahandPi 6 years ago
Thanks
@harshitgupta2515 6 years ago
How can I contact you? I really need your help
@vivek77 6 years ago
While using /usr/bin/open -a "/Applications/Google Chrome.app" "viveks-hadoop-cluster-m:8088" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmp/http:viveks-hadoop-cluster-m, I'm getting the error below on my Mac: unrecognized option `--proxy-server=socks5://localhost:10000'. Do you think I made a mistake?
@ScholarNest 6 years ago
All the commands that I used are listed on the website. Check the link below to compare your commands. www.learningjournal.guru/courses/spark/spark-foundation-training/spark-zeppelin-jdbc-client-interfaces/ I used a Windows command line to launch the Chrome browser. Which OS are you using? /usr/bin/open doesn't appear to be Windows. You can also refer to this link in the documentation. cloud.google.com/solutions/connecting-securely#socks-proxy-over-ssh
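On macOS, a likely cause of the "unrecognized option" error above is that /usr/bin/open consumes the flags itself unless they come after --args. A hedged sketch, reusing the names from the question:

# -n forces a new Chrome instance so the flags are not ignored by an already-running one
/usr/bin/open -na "Google Chrome" --args \
  --proxy-server="socks5://localhost:10000" \
  --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" \
  --user-data-dir=/tmp/viveks-hadoop-cluster-m \
  "http://viveks-hadoop-cluster-m:8088"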
@rajkumarp7784 6 years ago
Sir, please create a video showing how to move data from a local machine to the Hadoop cluster.
@ScholarNest 6 years ago
Do you mean your local computer to your cloud VM?
@rajkumarp7784 6 years ago
Learning Journal, yes sir. I found the procedure to copy. Thanks for your reply.
@rajkumarp7784 6 years ago
Sir, can you please create a video on how to store data and code in GCS?
@boussifsalima8147 6 years ago
What is the procedure, please?
@andraindonesia 6 years ago
Thanks a lot!
@anandrao1341 6 years ago
Hello Sir, please suggest whether I will be able to use this cluster for Sqoop practice, and if it is possible, please let me know how I can invoke MySQL in this cloud: "mysql -h "hostname" -u root -p". What will be the password to invoke MySQL? Thanks in advance :)
@wandicui8516 6 years ago
Hi Sir, after I ran the command gcloud compute ssh --zone=us-east1-c --ssh-flag="-D" --ssh-flag="10000" --ssh-flag="-N" "spark-6-m", I was asked to create an SSH key, and I did so. Then I ran the command again and input my passphrase, but it seemed to get into a dead loop, and I could only use Ctrl+C to kill it. How can I fix it?
@ScholarNest 6 years ago
Check your zone name and master node name and change them accordingly. I think you are using an incorrect name and SSH is not able to reach the master node.
@wandicui8516 6 years ago
Thank you for your quick response, but I did write the correct zone name and master name. So I'm really confused right now :(
@wandicui8516 6 years ago
Is it because I skipped the VPC network and firewall setting steps?
@ScholarNest 6 years ago
If you are opening an SSH tunnel, you don't need to change the firewall settings.
@abhishektripathi3216 5 years ago
Windows users may face an issue. Please use these commands, as Google updated the API: 1) gcloud compute ssh spark-6-m --project=premium-origin-234510 -- -D 1080 -N 2) "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --proxy-server="socks5://localhost:1080" --user-data-dir="C:\Users\Abhis\AppData\Local\Temp\tmpspark-6-m" spark-6-m:8088
@isabelwzupzphzkapusu2925 6 years ago
Thanks. Hello, I need some help. I'm new to Hadoop. I created a cluster with 4 nodes: 1 master and 3 slaves. Now I need to configure GridMix2. My biggest problem is that I cannot find the src folder where GridMix is, and I do not know what to do. I'm using hadoop-2.9.1 on Ubuntu 16.
@SONY2121995 6 years ago
Dear sir, after I run the command to connect over SSH, "gcloud compute ssh --zone=us-east1-b --ssh-flag="-D" --ssh-flag="10000" --ssh-flag="-N" "centos-1"", it works. But the next day, when I run the command again, it fails with a notification like "network error: Connection timeout".
@ScholarNest 6 years ago
Have you started your cloud VM? If the VM is in a stopped state, you may get a network error.
@analogylibrary 7 years ago
I am facing a problem when creating the SSH tunnel: gcloud compute ssh --zone=us-east1-c --ssh-flag="-D" --ssh-flag="10000" --ssh-flag="-N" "cluster-spark" ERROR: (gcloud.compute.ssh) Could not fetch resource: - The resource 'projects/united-base-186010/zones/us-east1-c/instances/cluster-spark' was not found
@ScholarNest 7 years ago
What is the name of your master node? Check it and give the correct name in this command. I am sure it is not "cluster-spark", because GCP automatically adds -m at the end. I guess it would be "cluster-spark-m". Make sure you are giving the correct zone name and cluster name, and retry.
@analogylibrary 7 years ago
Yes, you are right: I had to add -m for the master, and now there is no problem at that step. Thanks for the help, sir. By the way, you explain very well. I follow each of your tutorials; please keep making tutorials for big data.
@SantoshSingh-ki8bx 7 years ago
Sir, after installing the SDK, I executed the command below but am getting the following issue. Please guide: C:\Program Files (x86)\Google\Cloud SDK>gcloud compute ssh --zone=us-east1-c --ssh-flag="-D" --ssh-flag="10000" --ssh-flag="-N" "spark-6-m" ERROR: (gcloud.compute.ssh) The required property [project] is not currently set. You may set it for your current workspace by running: $ gcloud config set project VALUE or it can be set temporarily by the environment variable [CLOUDSDK_CORE_PROJECT]
@ScholarNest 7 years ago
+Santosh Singh, the solution to your problem is already given in the error message. I think you missed setting the project name during the installation.
@SantoshSingh-ki8bx 7 years ago
Sir, after executing the following command I am getting "Google Chrome cannot read or write to its directory /tmp/spark-6-m". Command: C:\Users\hp>"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" "spark-6-m:8088" --proxy-server="socks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmp/spark-6-m. Kindly help me. I tried after reinstalling Chrome too.
@ScholarNest 7 years ago
+Santosh Singh, it doesn't look like a Chrome problem. Instead of /tmp, give c:\test and try again. The test directory should not already exist.
@SantoshSingh-ki8bx 7 years ago
Thank you, sir. It is working.
@zaheerbeg4810 2 years ago
Precise and adorable video. Keep posting. #Thanks a lot. #Subscribing
@smitshah1737 6 years ago
How do I shut down a Dataproc cluster?
@ScholarNest 6 years ago
You can't. Just delete it and recreate it when you need it later. You might want to keep your data in Google Cloud Storage to avoid losing data when you delete your cluster.
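A hedged sketch of that cycle (bucket and region names are placeholders):

# Keep results in GCS so they survive cluster deletion (or write to gs:// paths directly from your jobs)
gsutil cp -r output/ gs://my-project-bucket/output/

# Delete the cluster when you are done...
gcloud dataproc clusters delete spark-6-m --region=us-east1

# ...and recreate it with the same create command when you need it again
gcloud dataproc clusters create spark-6-m --region=us-east1 --num-workers=6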
@smitshah1737 6 years ago
Thanks for the quick response!!! Waiting for your Spark videos!!
@vivek2319 6 years ago
@Smit Shah: Enable Google Compute Engine from APIs (the one with the blue sign/label); then you'd be able to stop those cluster VMs from running.
@vmahi111 6 years ago
What if I use Hive to store data? I will lose the data in its internal tables.
@jean-christopheclavier3249 6 years ago
A good way to create clusters quickly is to prepare a gcloud script. This way, you don't lose time on clicks and don't have to think about each parameter every time. Here is a very minimal example:
gcloud dataproc clusters create spark-6-m \
  --async \
  --project=my-project-id \
  --region=us-east1 \
  --zone=us-east1-b \
  --bucket=my-project-bucket \
  --image-version=1.2 \
  --num-masters=1 \
  --master-boot-disk-size=10GB \
  --master-machine-type=n1-standard-1 \
  --worker-boot-disk-size=10GB \
  --worker-machine-type=n1-standard-1 \
  --num-workers=6 \
  --initialization-actions=gs://dataproc-initialization-actions/jupyter2/jupyter2.sh
The machine types can be found here: cloud.google.com/compute/docs/machine-types#predefined_machine_types
@4a6vamsi 6 years ago
How do I install and practice Hive and Pig on Google Cloud? Is it free, or do I need to pay anything?
@ScholarNest 6 years ago
I think this cluster comes with Hive and Pig. Just start using them directly.
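A quick way to verify, as a sketch: SSH into the master node and start the shells, which ship preinstalled on the standard Dataproc image (cluster name and zone are placeholders):

# From your machine
gcloud compute ssh spark-6-m --zone=us-east1-c

# On the master node
hive -e "show databases;"
pig -e "fs -ls /"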
@4a6vamsi 6 years ago
Thanks for the quick reply. One small clarification: how is this video related to "Google Cloud Tutorial - Get Free Virtual Machine"? Do I need to follow that tutorial before proceeding with this video [Google Cloud Tutorial - Hadoop | Spark Multinode Cluster]? I did not understand the connection between these two videos.
@ScholarNest 6 years ago
You need a GCP account to create the cluster that is explained in this video. The Google Cloud tutorial video explained some basics of setting up your GCP account. If you are new to GCP, I recommend watching that first.
@milansahu88 7 years ago
Awesome!!! That's what I was looking for. Thank you so much. Could you please explain how to use the initial $300 credit provided by GCP for new accounts?
@ScholarNest 7 years ago
You don't have to do anything. They will automatically credit it to your GCP account and adjust it based on your usage.
@KP33767 6 years ago
I'm unable to connect to the web UI. Can anyone help me?
@MsHarsha009 6 years ago
Were you able to connect to the web UI? I have the same problem.
@charuskulkarni 6 years ago
Any idea how to implement something like Cloud Dataproc on AWS?
@KarthikSrinivasanTucson 6 years ago
Great video! I was wondering, can't we go to Compute Engine and stop the cluster VMs instead of deleting them each time? The VMs take more than 2 minutes to get initialized at times...
@ScholarNest 6 years ago
Yes, you can do that, but charges for disks and any other resources may still apply.
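A hedged sketch of that approach with the Compute Engine CLI - Dataproc names the underlying VMs <cluster>-m and <cluster>-w-0, <cluster>-w-1, and so on (disk charges continue while the VMs are stopped):

# Stop the master and workers instead of deleting the cluster
gcloud compute instances stop spark-6-m spark-6-w-0 spark-6-w-1 --zone=us-east1-c

# Start them again later
gcloud compute instances start spark-6-m spark-6-w-0 spark-6-w-1 --zone=us-east1-c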
@nationviews6760 7 years ago
I am getting the following error while creating the SSH tunnel. Please help me with this. C:\Users\Raj>gcloud compute ssh --zone=us-east1-c --ssh-flag="-D" --ssh-flag="10000" -ssh-flag="-N" "spark-3-m" ERROR: (gcloud.compute.ssh) unrecognized arguments: -ssh-flag=-N Usage: gcloud compute ssh [USER@]INSTANCE [optional flags] [-- SSH_ARGS ...] optional flags may be --command | --container | --dry-run | --force-key-file-overwrite | --help | --plain | --ssh-flag | --ssh-key-file | --strict-host-key-checking | --zone For detailed information on this command and its flags, run: gcloud compute ssh --help
@ScholarNest 7 years ago
Change -ssh-flag=-N to --ssh-flag=-N. There are two hyphens before ssh-flag.
@nationviews6760 7 years ago
Thank you, Sir... Now it's working fine.
@rutugandhi7921 5 years ago
I have a problem opening and running the SSH tunnel. This is what I get after running the "gcloud compute ssh ..." command: Updating project ssh metadata...Updated [www.googleapis.com/compute/v1/projects/dsp-kali]. Updating project ssh metadata...done. Waiting for SSH key to propagate. Warning: Permanently added 'compute.5317879760501945220' (ECDSA) to the list of known hosts.
@gangaprasad21 5 years ago
Using the command below, I am unable to open the Resource Manager web UI: "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" "hadoop-cluster-6-m:8088" --proxy-server="scoks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmp/hadoop-cluster-6-m The error says that Google Chrome doesn't have read/write permissions on /tmp/hadoop-cluster-6-m.
@MsHarsha009 6 years ago
Very useful videos. Can anyone help me? When I try to create the socket connection using the command explained in the video - "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" "spark6-m-m:8088" --proxy-server="socks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmp/spark6-m-m - I receive the error "Google Chrome cannot read and write to the directory /tmp/spark6-m-m".
@MsHarsha009 6 years ago
It is not identifying spark6-m-m:8088.
@ScholarNest 6 years ago
Check your master node name. Is it spark6-m-m or spark6-m?
@P2P_Relations 5 years ago
Thank you so much, Sir, for this great step-by-step video. I am new to this course and learning from your videos. I got stuck: I could create the SSH tunnel but couldn't start a browser session that uses the SOCKS proxy through this tunnel, using the command below: "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" "spark-6-m:8088" --proxy-server="socks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmp/spark-6-m Error: Failed to create data directory: Google Chrome cannot read and write to its data directory: /tmp/spark-6-m Can anybody help me resolve this issue? It would be a great help. Thank you
@ScholarNest 5 years ago
Change /tmp/spark-6-m to a valid directory name.
@P2P_Relations 5 years ago
Thank you for your help, sir. I did as you said, but I am getting the same error, "Google Chrome cannot read and write to its data directory /tmp/mydata". Is something wrong with my browser settings? Thank you
@arupnaskar3818 4 years ago
Great tutorials... Thank you!