I have tried several SPARK series and never get very far. I have gone through all of yours in a row so far and think you do a really good job. Thanks for putting this together, cheers!
@SidIndian082 1 year ago
Excellent lecture, sir. Truly adorable...
@ampcode 11 months ago
Thank you so much! Subscribe for more content 😊
@albertopedro8632 1 year ago
Wonderful. I'm not a native English speaker, so the fact that I've been able to understand all the sessions is top notch! You are the greatest, thanks for sharing with us. From Angola.
@ampcode 11 months ago
Thank you so much! Subscribe for more content 😊
@sriponnirealestates3259 4 months ago
Very useful for a beginner, and a clear-cut explanation.
@avinash7003 1 year ago
RuntimeError: Java gateway process exited before sending its port number -- how to solve?
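This error generally means PySpark could not start a JVM, and a frequent cause is that Java is not installed or that JAVA_HOME points somewhere that does not exist. A minimal diagnostic sketch using only the standard library is below; the function name java_diagnostics is made up for illustration, and the check is a starting point under that assumption, not a guaranteed fix.

```python
import os
import shutil


def java_diagnostics():
    """Report the Java-related settings PySpark relies on to launch a JVM."""
    java_home = os.environ.get("JAVA_HOME")
    return {
        # Value of JAVA_HOME, or None if the variable is not set.
        "JAVA_HOME": java_home,
        # True only if JAVA_HOME is set and is an existing directory.
        "JAVA_HOME_exists": bool(java_home) and os.path.isdir(java_home),
        # Full path of the `java` executable found on PATH, or None.
        "java_on_PATH": shutil.which("java"),
    }


if __name__ == "__main__":
    for name, value in java_diagnostics().items():
        print(f"{name}: {value}")
```

If both JAVA_HOME_exists is False and java_on_PATH is None, installing a JDK and setting JAVA_HOME before creating the SparkSession is the usual next step.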
@riomorder 1 year ago
Very useful for me. I use Databricks at my job, but I want to practice my queries on my personal laptop. Thanks to you, I know how.
@ampcode 11 months ago
Thank you so much! Subscribe for more content 😊
@sachindubey4315 1 year ago
I like the way you explain the code.
@ampcode 11 months ago
Thank you so much! Subscribe for more content 😊
@ashishveer4591 1 year ago
How do I run a Spark application on a cluster?
@jankipatel118 1 year ago
I cannot download the CSV file. Can you please check why, or give the website link so that we can download it directly from there?
@ampcode 1 year ago
Sorry for the late response. Sure, I'll check the URL and provide you with the updated one.
@mahendranaidu8758 1 year ago
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[14], line 2
      1 data_2=data.select("industry","value").\
----> 2 filter(Col("value")>1000).\
      3 orderBy(desc("value"))

NameError: name 'Col' is not defined
@xx-pn7it 10 months ago
I'm getting the same error. How can I fix it?
@pramodgupta5492 10 months ago
Try using data.value, as you have not created or imported 'col' anywhere in the code.
data2 = data.select('industry', 'value').filter(data.value > 10000).orderBy('value')
Also, an easy way to create the DataFrame is:
data = spark.read.csv('operations_management.csv', inferSchema=True, header=True)
Not sure why the instructor went a different way, which makes the code look more complex.
@varshaamuruganandam 8 months ago
@pramodgupta5492 Thanks, this worked! Also, 'desc' is not working.
@patrickwheeler7107 5 months ago
I had the same error. I used this to get it to work.
# Start SparkSession first
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
data_2 = data.select("industry", "value").\
    filter(col("value") > 10000).\
    orderBy("value", ascending=[False])
@flosrv3194 10 months ago
It's throwing errors at me from everywhere, claiming col and desc are not recognized names. How on earth did you make your code work without issues?
@robyp 3 months ago
If you look at the cell numbers, you can see he removed the lines where he imports the symbols :/
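For anyone hitting the NameError in this thread: PySpark has no name `Col` (the function is lowercase `col`), and neither `col` nor `desc` is in scope until it is imported with `from pyspark.sql.functions import col, desc`. As a rough sketch, the same query can also be written with Column methods, which need no extra imports. The helper name top_values and its threshold parameter are made up for illustration; `data` is assumed to be the DataFrame from the video with "industry" and "value" columns.

```python
def top_values(data, threshold=1000):
    """Select industry/value rows above `threshold`, largest value first.

    Equivalent in intent to the video's query, which would need
    `from pyspark.sql.functions import col, desc` to run as written.
    """
    selected = data.select("industry", "value")
    return (
        selected
        # Bracket access returns a Column, so `col("value")` is not needed.
        .filter(selected["value"] > threshold)
        # Column.desc() replaces the standalone desc("value") helper.
        .orderBy(selected["value"].desc())
    )
```

Calling `top_values(data).show()` in the notebook should then print the filtered rows in descending order of value.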
@nayanagrawal9878 1 year ago
My Spark is treating all the columns as strings.
root
 |-- description: string (nullable = true)
 |-- industry: string (nullable = true)
 |-- level: string (nullable = true)
 |-- size: string (nullable = true)
 |-- line_code: string (nullable = true)
 |-- value: string (nullable = true)
I have written the same code as you did in the video.
# Creating DataFrame
# as our dataset already had header, therefore, we provided inferSchema as True and header as true
data = spark.read.format('csv').\
    option('inferScheme', 'true').\
    option('header', 'true').\
    option('path','operations_management.csv').\
    load()
Can anyone please help?
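A likely cause, judging only from the snippet in this comment: the option is spelled 'inferScheme', while the CSV reader's option is named 'inferSchema'. A misspelled option does not appear to raise an error, so the reader would fall back to reading every column as a string. A minimal sketch of the corrected read follows; the helper name read_csv_inferred is made up for illustration, and `spark` is assumed to be an existing SparkSession.

```python
def read_csv_inferred(spark, path):
    """Read a CSV with a header row and let Spark infer the column types."""
    return (
        spark.read.format("csv")
        # Note the spelling: inferSchema, not inferScheme.
        .option("inferSchema", "true")
        .option("header", "true")
        .load(path)
    )
```

Usage would be `data = read_csv_inferred(spark, 'operations_management.csv')` followed by `data.printSchema()` to check whether numeric columns such as value are no longer reported as string.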