19. Databricks & Pyspark: Real Time ETL Pipeline Azure SQL to ADLS

  Views: 46,924

Raja's Data Engineering

Comments: 98
@maheshwarkuchana190 2 years ago
A one-stop solution for understanding the basics of ETL in Databricks. Thanks, Mr. Raja, for such amazing tutorials on your channel. We really benefit from them. Many thanks again.
@rajasdataengineering7585 2 years ago
Thank you Maheshwar!
@gopireddybhargavi2844 11 months ago
Thank you so much, sir. I had support-type experience in Azure; I followed all of your videos and have now been placed in 2 MNCs.
@rajasdataengineering7585 11 months ago
Thanks, Gopi, for sharing your experience. It is really amazing to know that you got placed in MNCs. All the very best! If you find this channel helpful, please spread the word across data engineering communities so that more people can benefit.
@OurOrangeKitchen 2 years ago
Beautiful explanation and a very good example of ETL. Thanks a lot for this video. It helped a lot in gaining a clear picture of the ETL process in Databricks.
@rajasdataengineering7585 2 years ago
Thank you
@archichaudhari9283 7 months ago
Best video series.
@rajasdataengineering7585 7 months ago
Thank you
@mitusa1234566 5 months ago
Great video with nice and clear instructions. Keep it up. Thanks.
@rajasdataengineering7585 5 months ago
Glad it was helpful! Keep watching.
@Rk-mv8sz 1 year ago
It's really very helpful. Please make a video on an end-to-end project with ADF and ADB. Thank you for the wonderful videos.
@rajasdataengineering7585 1 year ago
Sure, will do
@ranajaymondal390 2 years ago
Nice explanations, and this series is really awesome. Please create more videos on Databricks that solve real-time ingest/export requirements using PySpark.
@rajasdataengineering7585 2 years ago
Sure, will do more videos in this series.
@priyankapushp8271 5 months ago
Wow, very nicely explained. Thanks a lot for your efforts.
@rajasdataengineering7585 5 months ago
You are most welcome
@tejaswinisunkavalli2086 8 months ago
Great explanation and easily understandable 👏👏
@rajasdataengineering7585 8 months ago
Glad you liked it! Thanks for your comment
@sriharichennupati5 4 months ago
Thanks for sharing this info
@rajasdataengineering7585 4 months ago
My pleasure! Welcome
@alwalravi 1 year ago
Great, thanks for sharing the video
@rajasdataengineering7585 1 year ago
Glad it helps!
@navk4960 1 year ago
It would be really amazing if links to the topics you mention for reference were added in the description, as you are an amazing tutor.
@mustafakamal5945 3 months ago
This is a very informative video. Do you also have a video on connecting to SQL DB via managed identity?
@rajasdataengineering7585 3 months ago
Thank you for your comment! No, I don't have a video on managed identity.
@mustafakamal5945 3 months ago
@rajasdataengineering7585 I am struggling to find a resource for implementing it. Do let me know if you know of any resources for guidance on this topic apart from the MS Learn site.
@pawanukey2990 1 year ago
Hi Raja, you have explained this in detail. Thanks for that. But can you please provide the dataset for hands-on practice?
@ranaumershamshad 3 months ago
I read data from a huge table in Azure SQL DB and wrote it to ADLS. It created one file of 900 MB instead of partitions. Is there any parameter we can change to create the partitions?
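One possible answer to the question above, offered as a sketch and not as the video author's method: a JDBC read with no partitioning options comes back as a single partition, so the write produces a single file. Spark's JDBC source accepts `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` to split the read into parallel range queries. The code below only builds that options dictionary; the URL, table, column, and bounds are hypothetical placeholders, and credentials still have to be supplied separately.

```python
def partitioned_jdbc_options(url, table, column, lower, upper, num_partitions):
    """Build Spark JDBC options that split one table read into parallel range queries."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    # Spark passes JDBC options as strings; the partitioning options are set together.
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": column,  # must be a numeric, date, or timestamp column
        "lowerBound": str(lower),
        "upperBound": str(upper),
        "numPartitions": str(num_partitions),
    }


def read_partitioned(spark, options):
    """Run the read on an existing SparkSession (not called in this sketch)."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical values for illustration only.
opts = partitioned_jdbc_options(
    url="jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>",
    table="SalesLT.SalesOrderDetail",
    column="SalesOrderDetailID",
    lower=1,
    upper=1_000_000,
    num_partitions=8,
)
print(opts["numPartitions"])
```

Alternatively, calling `df.repartition(8)` before the write changes the number of output files without changing how the data is read from the database.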
@dianadai4616 4 months ago
Do you have your code posted somewhere? It is very important for us to follow along.
@prathapganesh7021 10 months ago
Thank you for the content
@rajasdataengineering7585 10 months ago
Thanks Prathap! Glad it helps
@arrooow9019 2 years ago
No words to express my feelings. What a great tutorial, sir. Thanks for this video 👍 Also, could you make a video on how to remove leading spaces and unwanted characters from strings while cleansing data?
@rajasdataengineering7585 2 years ago
Thank you. Sure, will post a video on your requirement.
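Until such a video exists, here is a minimal sketch of the string-cleansing request above. It assumes the goal is to strip leading whitespace and drop any character that is not a letter, digit, or space; that rule (`UNWANTED`) and the sample value are hypothetical. The plain-Python helper runs anywhere; `clean_column` applies the same rule with PySpark's `ltrim` and `regexp_replace` and needs a Spark DataFrame.

```python
import re

# Hypothetical rule: keep letters, digits, and spaces; drop everything else.
UNWANTED = r"[^A-Za-z0-9 ]"


def clean_text(value):
    """Plain-Python version: strip leading whitespace, then remove unwanted characters."""
    return re.sub(UNWANTED, "", value.lstrip())


def clean_column(df, column):
    """Same rule on a Spark DataFrame column (requires pyspark; not called here).

    Note: F.ltrim removes leading space characters, while str.lstrip also
    removes tabs and newlines.
    """
    from pyspark.sql import functions as F

    trimmed = F.ltrim(F.col(column))
    return df.withColumn(column, F.regexp_replace(trimmed, UNWANTED, ""))


print(clean_text("  #Road-150 Red"))
```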
@VanakkamSQL 8 months ago
Great work!
@rajasdataengineering7585 8 months ago
Thank you! Cheers!
@VanakkamSQL 8 months ago
How can I reach you?
@phanisrikrishna 1 year ago
Hi Sir, it is a great, well-structured series with regular topics and interview questions. Can you also share the notebooks for reference and practice? Thanks a lot in advance.
@paarthiban8452 1 year ago
A nice video. Can you create another video on automating this pipeline using Airflow?
@rajasdataengineering7585 1 year ago
Sure Paarthi, will create a video on this requirement
@mohsintamboli4394 2 years ago
Best example
@rajasdataengineering7585 2 years ago
Thanks
@mujeebrehman1146 1 year ago
Great tutorial, but I have a question. Can we connect an Oracle database to Databricks?
@rajasdataengineering7585 1 year ago
Yes, we can connect to an Oracle database too.
@anonymous-254 1 year ago
Instead of ADLS, can we put that data in a Synapse dedicated SQL pool?
@rajasdataengineering7585 1 year ago
Yes, we can load into Azure Synapse and Azure SQL as well. Please watch video no. 87 on this channel.
@anonymous-254 1 year ago
@rajasdataengineering7585 OK, thanks
@susobhanghosh5093 1 year ago
It would be very helpful if you could also share the notebook in HTML format.
@what_worldgot3063 7 months ago
Where can I find the dataset for this table?
@manikiranthota2811 2 years ago
In Azure SQL DB itself we can do the null handling, join, and duplicate deletion. Is there any specific reason why we are using DataFrames? Thanks in advance.
@rajasdataengineering7585 2 years ago
The requirement is to move the data from Azure SQL to ADLS after performing these transformations. Could you please explain how you would do these transformations in Azure SQL itself as part of this requirement?
@manikiranthota2811 2 years ago
@rajasdataengineering7585 Yes sir, I got it. I thought of using the ISNULL operation on product, deleting the duplicates in fact, and joining the 2 tables in Azure SQL DB itself, but that would take too much time. Thanks.
@shot_freeze 2 years ago
@rajasdataengineering7585 Hi Raja, could you please share a few requirements like this for us to think through logically, so that we have a clear idea of what we will get in our projects?
@rajasdataengineering7585 2 years ago
Sure, will do
@shot_freeze 2 years ago
@rajasdataengineering7585 You can just share it here, or comment at the top level and pin it.
@sravankumar1767 2 years ago
Superb bro 👌 👏
@chinna4549 1 year ago
Nice anna
@rajasdataengineering7585 1 year ago
Hope it helps!
@BhakthiYoutube 1 year ago
Can you please provide some big data end-to-end projects involving all components?
@rajasdataengineering7585 1 year ago
Hi, I have already created a video on an end-to-end project using multiple components. Please refer to video number 87: kzbin.info/www/bejne/mqnbiYqbacyBitE
@zubairmushtaq7912 3 months ago
Please make a video on reading from Azure SQL DB and writing back to Azure SQL DB.
@rajasdataengineering7585 3 months ago
Sure, will create a video on this requirement
@AdexDurojaiye 1 year ago
Do you have a class where you can train me? Thanks for your video.
@PriyaDarshiniNeverSayNever 1 year ago
Getting error driver not found @ send step. Please help how to solve this?
@gunasekar_vs 2 years ago
Thanks for the great explanation 🙏🙏 I wasn't able to see the code clearly, so if you don't mind, can you please share the code? We can also try it ourselves by following your videos 🙏🙏
@nagulmeerashaik5336 1 year ago
When we learn Azure, is it also required to learn PySpark?
@rajasdataengineering7585 1 year ago
Not really. Azure has many data engineering services, such as ADF, Synapse Analytics, and Databricks. PySpark is mainly needed for Databricks developers and for Spark pools inside Synapse. If your project is on ADF, PySpark does not play any role.
@nagulmeerashaik5336 1 year ago
Thanks.
@anil6328 2 years ago
Explained very well 👌 Can someone help with how to set up JDBC without showing the password in the code?
@rajasdataengineering7585 2 years ago
Hi Anil, yes, that is possible. I have already explained that concept in the Key Vault integration video. Please go through it once: kzbin.info/www/bejne/mWOonoeJlNhosNk
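As a rough sketch of the idea behind that reply (not a transcript of the linked video): Databricks exposes secrets through `dbutils.secrets.get(scope, key)`, and a secret scope can be backed by Azure Key Vault. The helper below takes the secret getter as a parameter so it also runs outside a notebook; the scope name, key names, and the stand-in secret store are all hypothetical.

```python
def jdbc_credentials(get_secret, scope, user_key, password_key):
    """Fetch the SQL login from a secret store instead of hard-coding it."""
    return {
        "user": get_secret(scope=scope, key=user_key),
        "password": get_secret(scope=scope, key=password_key),
    }


def fake_get_secret(scope, key):
    """Stand-in for dbutils.secrets.get so the sketch runs outside Databricks."""
    store = {
        ("demo-scope", "sql-user"): "etl_user",
        ("demo-scope", "sql-password"): "not-a-real-password",
    }
    return store[(scope, key)]


props = jdbc_credentials(fake_get_secret, "demo-scope", "sql-user", "sql-password")
print(sorted(props))

# In a Databricks notebook (scope and key names are hypothetical):
# props = jdbc_credentials(dbutils.secrets.get, "demo-scope", "sql-user", "sql-password")
# df = spark.read.jdbc(url=jdbc_url, table="SalesLT.Product", properties=props)
```

With this shape the password never appears in the notebook source; only the scope and key names do.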
@kneelakanta8137 1 year ago
How did you get the tables in the Azure SQL database?
@rajasdataengineering7585 1 year ago
We can create tables using the CREATE TABLE statement. Otherwise, you can use the ready-made AdventureWorks database by choosing it in additional settings while creating the database.
@alluchandrasekhar2992 2 years ago
What if multiple tables (more than 10) need to be copied from Azure SQL DB to the data lake?
@rajasdataengineering7585 2 years ago
We can create multiple DataFrames by reading multiple tables and load them into ADLS.
@alluchandrasekhar2992 2 years ago
@rajasdataengineering7585 Yes, but can it be parameterized? If yes, then how?
@rajasdataengineering7585 2 years ago
Yes, parameters can be set up using widgets.
@alluchandrasekhar2992 2 years ago
@rajasdataengineering7585 Okay, and BTW your videos are very informative. Keep making such great videos.
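The widget-based parameterization mentioned in this thread could look roughly like the sketch below. It assumes the table list arrives as one comma-separated string (for example from `dbutils.widgets.get`) and that each table is written to its own folder; the widget name, table names, mount path, and folder naming are hypothetical choices, not taken from the video.

```python
def plan_copies(table_list, base_path):
    """Turn a comma-separated table list into (table, target_path) pairs."""
    tables = [name.strip() for name in table_list.split(",") if name.strip()]
    root = base_path.rstrip("/")
    return [(name, f"{root}/{name.replace('.', '_')}") for name in tables]


def copy_tables(spark, jdbc_url, properties, plan):
    """Read each table over JDBC and write it as Parquet (not called in this sketch)."""
    for table, path in plan:
        df = spark.read.jdbc(url=jdbc_url, table=table, properties=properties)
        df.write.mode("overwrite").parquet(path)


plan = plan_copies("SalesLT.Product, SalesLT.Customer", "/mnt/adls/raw/")
for table, path in plan:
    print(table, "->", path)

# In a notebook the list would come from a widget (names are hypothetical):
# dbutils.widgets.text("tables", "SalesLT.Product,SalesLT.Customer")
# plan = plan_copies(dbutils.widgets.get("tables"), "/mnt/adls/raw")
```

Keeping the planning step separate from the Spark calls makes it easy to check which tables and paths a run will touch before any data moves.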
@RAHULGUPTA-wy2zb 1 year ago
Hi Raja, would you please explain why you used a left outer join rather than an inner join?
@pavanaditya2309 5 months ago
Since this is only sample logic, I think for demonstration purposes it doesn't matter which join he uses.
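For readers weighing the two join types, the difference can be shown in plain Python without a cluster. The rows below are hypothetical, not the video's tables: one sale refers to a product that is missing from the product table. In PySpark the same choice is the `how` argument of `DataFrame.join` (`"left"` or `"inner"`).

```python
def left_outer_join(left_rows, right_rows, key):
    """Keep every left row; fill right-side columns with None when there is no match."""
    right_columns = {c for row in right_rows for c in row if c != key}
    joined = []
    for left in left_rows:
        matches = [r for r in right_rows if r[key] == left[key]]
        if not matches:
            matches = [{c: None for c in right_columns}]
        for match in matches:
            extra = {c: v for c, v in match.items() if c != key}
            joined.append({**left, **extra})
    return joined


def inner_join(left_rows, right_rows, key):
    """Keep only left rows that have at least one match on the right."""
    return [
        {**left, **{c: v for c, v in right.items() if c != key}}
        for left in left_rows
        for right in right_rows
        if left[key] == right[key]
    ]


# Hypothetical rows: product 9 has a sale but no product record.
sales = [{"ProductID": 1, "Qty": 2}, {"ProductID": 9, "Qty": 5}]
products = [{"ProductID": 1, "Name": "Helmet"}]

print(len(left_outer_join(sales, products, "ProductID")))
print(len(inner_join(sales, products, "ProductID")))
```

The left outer join keeps the unmatched sale with a null product name, while the inner join silently drops it, so the choice matters whenever the fact table can reference missing dimension rows.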
@Umerkhange 1 year ago
How do we apply multiple aggregations in a single statement, like SUM("UnitPrice"), SUM("TOTALLINE"), AVG(PRICE), etc.?
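One way to approach this question, sketched under assumptions: several aggregates can be passed to a single `agg()` call. The helper below builds Spark SQL expression strings from a small spec and hands them to `agg()` through `pyspark.sql.functions.expr`. The column names are copied from the comment as written, and the aliases and the allowed-function list are hypothetical.

```python
def aggregate_expressions(spec):
    """Turn {alias: (function, column)} into Spark SQL expression strings."""
    allowed = {"SUM", "AVG", "MIN", "MAX", "COUNT"}
    expressions = []
    for alias, (function, column) in spec.items():
        if function.upper() not in allowed:
            raise ValueError(f"unsupported aggregate: {function}")
        expressions.append(f"{function.upper()}({column}) AS {alias}")
    return expressions


def aggregate(df, group_columns, spec):
    """Apply all aggregates in one agg() call (requires pyspark; not called here)."""
    from pyspark.sql import functions as F

    columns = [F.expr(e) for e in aggregate_expressions(spec)]
    return df.groupBy(*group_columns).agg(*columns)


spec = {
    "total_unit_price": ("SUM", "UnitPrice"),
    "total_line": ("SUM", "TOTALLINE"),
    "avg_price": ("AVG", "PRICE"),
}
for expression in aggregate_expressions(spec):
    print(expression)
```

The more direct form is `df.agg(F.sum("UnitPrice"), F.sum("TOTALLINE"), F.avg("PRICE"))`; the spec-driven version is only useful when the list of aggregates needs to be configurable.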
@Learn2Share786 2 years ago
@raja, nice explanation. Can you please share the notebook?
@srinivasanmadeshwaran1130 2 years ago
Hi Sir, I am getting the error below when I try to connect via JDBC: java.sql.SQLException: No suitable driver. Please help with this.
@ayushbhatt9469 3 months ago
Add .option("driver", jdbcDriver) in the second step as well to resolve your issue.
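To make that suggestion concrete, here is a small sketch that names the driver class explicitly in the JDBC options. `com.microsoft.sqlserver.jdbc.SQLServerDriver` is the class name of Microsoft's SQL Server JDBC driver; the server, database, and table names are hypothetical, and the value of `jdbcDriver` in the video is assumed to be this class. A malformed JDBC URL can produce the same "No suitable driver" message, so the URL prefix is worth checking too.

```python
SQLSERVER_DRIVER = "com.microsoft.sqlserver.jdbc.SQLServerDriver"


def sqlserver_jdbc_options(server, database, table, driver=SQLSERVER_DRIVER):
    """Build JDBC options with the driver class named explicitly."""
    url = f"jdbc:sqlserver://{server}:1433;database={database}"
    return {"url": url, "dbtable": table, "driver": driver}


# Hypothetical server and database names.
options = sqlserver_jdbc_options("myserver.database.windows.net", "mydb", "SalesLT.Product")
print(options["driver"])

# On a cluster, with a hypothetical `credentials` dict holding user and password:
# df = spark.read.format("jdbc").options(**options).options(**credentials).load()
```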
@sujitunim 2 years ago
👍👍
@hozefakanchwala8720 2 years ago
It's a nice use case for batch processing, but you shouldn't call it Real Time ETL.
@rajasdataengineering7585 2 years ago
"Real time" does not mean streaming data here. It means one of the real-time (real-world) use cases for an ETL requirement.
@shalinikumari-qx9tn 2 years ago
I want the dataset that you used. How can I get it?
@rajasdataengineering7585 2 years ago
Hi Shalini, I used the sample database AdventureWorks in this exercise. It is an open-source dataset.
@kartikeshsaurkar4353 2 years ago
While creating an Azure SQL DB you'll see an option to include sample tables/databases. After creation, you'll see that a few tables are present by default.
@shot_freeze 2 years ago
@kartikeshsaurkar4353 Do those tables have any data in them by default? Have you checked?
@rajasdataengineering7585 2 years ago
Yes Ajay, it has sample data as well to practice with.
@shot_freeze 2 years ago
@rajasdataengineering7585 Thanks, Raja, for your reply! Your videos are really helpful for Azure Data Engineer preparation.
@nivasnivi 2 years ago
Can you please help me with one small assignment, sir?
@MRCyberstriker 1 year ago
No intermediate steps are explained; a beginner will find it difficult to follow! Please make it beginner friendly. Take it as feedback.
@rajasdataengineering7585 1 year ago
The intermediate steps are already explained in videos 17 and 18. Please watch them as a prerequisite to this video: kzbin.info/www/bejne/mIvdmWqhm6eWrpY kzbin.info/www/bejne/rqmxaWt5lpmom9U
@sravankumar1767 2 years ago
How do we handle bad records in Azure Databricks?
@rajasdataengineering7585 2 years ago
Hi Sravan, I have already posted a video on how to handle bad records. You can refer to it: kzbin.info/www/bejne/rZCkiJSOo8Z8b5Y
@sravankumar1767 2 years ago
@rajasdataengineering7585 Thanks
@nagulmeerashaik5336 1 year ago
Most companies are asking for it, that's why.