Apache Spark Transformation and Actions

18,759 views

AIEngineering

Day ago

Comments: 32
@IsaiahShadE 4 years ago
Probably the only person who tells you facts and reality in the data science community.
@HaridasJanjire 4 years ago
Very well done. Very helpful for learning Apache Spark with a real end-to-end business case.
@ayeshababar-fl4ev 11 months ago
Very elaborate and well-explained! Can you please share the code and notebook?
@ranjanirajamani7565 4 years ago
Thank you, Sir. My learning curve with Spark has taken an exponential trend after watching your videos. It has been a rich learning experience, and I have been practicing in parallel. I have a question about DataFrames in PySpark. When I tried to create the variable "bad_loan" using withColumn and when (for the various cases of loan_status), the variable doesn't get created in the table, though I can see it in the DataFrame. When I try to access this column with a select statement, I get an error. Can you please throw some light on this?
@AIEngineeringLife 4 years ago
Thanks Ranjani. Did you assign the result to a DataFrame and use that DataFrame to save? In my video I think I saved the old DataFrame object and not the one I assigned the new columns to. Can you please validate it?
@ranjanirajamani7565 4 years ago
@@AIEngineeringLife Thank you for the response, Sir. I was able to resolve this issue. It was related to the way the when function was to be used.
@saurabhjain1626 4 years ago
Thank you for the wonderful video. I have a question: you mentioned that sortWithinPartitions avoids an expensive transformation when you know that the particular data is in one partition. How would you know that? I am assuming it is only possible when you partition the data based on the values of that particular column.
@AIEngineeringLife 4 years ago
Yes, you would have to custom-partition the data in that case.
@IsaiahShadE 4 years ago
Sir, you are an inspiration.
@sudippandit1 4 years ago
Excellent presentation sir!!
@mukeshkesavan4852 3 years ago
Thanks a ton! You made Spark easy. Please make a video on how to optimize Spark code and handle data skew.
@nagarajuch2412 4 years ago
The videos are all very informative. Is there any way we can sort on more than one attribute? E.g., Country ascending and Date descending.
@nagarajuch2412 4 years ago
Ans: orderBy(col("City").asc(),col("Date").desc())
@AIEngineeringLife 4 years ago
@@nagarajuch2412 You got the answer :) It is covered in one of my data engineering videos as well.
@KishoreKumar-yx4nw 4 years ago
Thanks Srinivasan for the wonderful explanation
@AIEngineeringLife 4 years ago
Thanks Kishore
@mateen161 4 years ago
Thanks Srivatsan...Nice explanation!
@rajeevrajeev5244 3 years ago
Do you have this Databricks page somewhere in git?
@viBeotamil 4 years ago
Amazing video sir.
@AkshayKumar-xo2sk 3 years ago
@AIEngineering - Thanks a lot for your video. May I check whether all the code in your Spark videos is based on Python? You don't use Scala/Java? Can whatever we do in Scala/Java also be done in Python?
@AIEngineeringLife 3 years ago
All of my videos use PySpark, so Python is what I have used, but the same can easily be done in Scala as well.
@AkshayKumar-xo2sk 3 years ago
@@AIEngineeringLife - Do you think the CCA175 Cloudera certification for Apache Spark and Hadoop Developer is a good one to attempt for someone working as a Data Engineer? Do you recommend any other certifications? And can the certification be done using PySpark as well? Your help is highly appreciated.
@designwithpicmaker2785 4 years ago
Thank you, bro, for this wonderful video.
@taliacohen7872 2 years ago
Amazing video thank you!!!!
@naveenreddythirugudu 3 years ago
Best video 👍
@kkckvr 4 years ago
Thanks a lot
@deepakparamesh8292 4 years ago
Very nice explanation, sir. Could you please upload the code?
@AIEngineeringLife 4 years ago
Deepak, the Spark videos are not yet in my git repo. It will take time to get there. Below is my repo, which has the code for my other videos at this time: github.com/srivatsan88/KZbinLI
@kketanbhaalerao 3 years ago
Please provide your GitHub link and also provide corona data and twitter data
@AIEngineeringLife 3 years ago
You can find all the code here - github.com/srivatsan88/Mastering-Apache-Spark
@Cricketpracticevideoarchive 4 years ago
Grateful for this series. Day 3: colab.research.google.com/drive/1yTDcFFcUAynSXqZxjmu6UJ8bFAkEgnqV?usp=sharing&authuser=1#scrollTo=O9naSW-WLWR5