Hands-on Spark RDDs, DataFrames, and Datasets

  41,167 views

Nathaniel Osgood


Comments: 8
@learnsharegrow7294 7 years ago
This is the only video on YouTube that explains RDDs, DataFrames, and Datasets with hands-on exercises. Quite good material to get started. Thanks for sharing.
@surajsheshadri 6 years ago
Can we always use a Dataset instead of a DataFrame, or are there cases where we should prefer a DataFrame over a Dataset?
@wenyian4845 1 year ago
I like this class
@ThePrashant13feb 7 years ago
Thanks for the video, it was good learning. I got stuck in one of the parts:
    val dsPopulation = populationRDD.toDS
    import org.apache.spark.sql.SQLContext
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._
    val dfPopulation2 = sqlContext.createDataFrame(populationRDD)
    val dsPopulation = dfPopulation2.as[Person]
    dsPopulation.filter((p: Person) => p.age > 40).groupBy(round($"age"/5)* 5)
----> Works fine till here.
    dsPopulation.filter((p: Person) => p.age > 40).groupBy(round($"age"/5)* 5).agg(avg($"income"))
----> Gives an error: "Type mismatch; found: org.apache.spark.sql.Column, required: org.apache.spark.sql.TypedColumn[Person,?]".
Please help me out.
@gauravjain5705 6 years ago
Same here.
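On the type mismatch reported by @ThePrashant13feb above: the message is consistent with the experimental Dataset API in Spark 1.6, where `groupBy` on a Dataset returns a typed grouping whose `agg` expects `TypedColumn` arguments rather than plain `Column` expressions. One likely workaround is to run the aggregation on a DataFrame instead. The sketch below assumes Spark 2.x or later with `SparkSession`; the `Person` field types and the sample rows are made up for illustration and are not taken from the video.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, round}

// Assumed schema: the field names come from the comment, the types are a guess.
case class Person(age: Int, income: Double)

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("avg-income-by-age-band")
  .getOrCreate()
import spark.implicits._

// Made-up sample rows standing in for populationRDD.
val dsPopulation = Seq(
  Person(38, 20000.0),
  Person(41, 30000.0),
  Person(42, 50000.0),
  Person(44, 70000.0)
).toDS()

// Typed filter on the Dataset, then switch to the untyped DataFrame view,
// where groupBy(...).agg(...) accepts ordinary Column expressions.
val avgIncomeByAgeBand = dsPopulation
  .filter((p: Person) => p.age > 40)
  .toDF()
  .groupBy((round($"age" / 5) * 5).as("ageBand"))
  .agg(avg($"income").as("avgIncome"))
  .orderBy("ageBand")

avgIncomeByAgeBand.show()
```

On Spark 1.6 itself, two things worth trying are calling `toDF()` before `groupBy` in the same way, or turning the aggregate into a typed column with `avg($"income").as[Double]`. Neither was checked against the exact setup used in the video.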
@Canada1989 6 years ago
Please explain clearly.
@dalwindersingh9282 5 years ago
It was an OK session.
@vadiyaraju8853 6 years ago
It's a nice video.