Delta Live Tables || Introduction
10:39
what to test in ADF Pipeline
16:39
What is spark job
17:24
9 months ago
How to skip rows in pyspark
2:51
9 months ago
Comments
@ramm3020
@ramm3020 7 hours ago
Hi, thanks for the videos on Delta Live Tables. However, the continuity in this playlist is missing; the videos are shuffled. Could you please number the videos so we can follow them one after another and get more clarity? For example, the Introduction video comes 2nd.
@wayneliu7006
@wayneliu7006 5 days ago
Great series!!! I have two questions. We noticed that the two sample files were loaded in sequence and the files stayed in the source folder. With DLT, how could we move old files from the source folder to an archive folder/storage at the end of the pipeline, and what is the better practice for data archiving when using DLT? The other question is how we can implement data retention, e.g. delete data older than 30 days from the tables managed by DLT pipelines. It would be great if you could talk more about managing the data life cycle in a DLT context in the coming videos. Thanks a lot!
@Real-IndianArmyFan
@Real-IndianArmyFan 7 days ago
Every time you define the schema explicitly, but what if we have hundreds of files at the source location? You need to start with the historic load, then the daily load, then apply CDC for all these hundreds of files into their respective tables. How do we handle such a situation? Obviously it is not a good idea to use hundreds of notebooks, right?
@tvyoutube140
@tvyoutube140 9 days ago
good video.
@sarangKhedkar
@sarangKhedkar 10 days ago
Useful content 🎉❤
@sanjeevreddy3691
@sanjeevreddy3691 10 days ago
Is the metastore present in the control plane or the data plane?
@Frank-i2z5c
@Frank-i2z5c 10 days ago
I don't think this was helpful. It didn't really explain the WHY of anything. Why the view? WHY the readStream? How do you know when you should use a materialized view or not?
@mukeshnandy5589
@mukeshnandy5589 21 days ago
@softwaredevelopmentenginee5650 could you fix the sequence?
@akshay11000
@akshay11000 22 days ago
Your videos are very beneficial. I have a use case where we need to process nested JSON, and the nested data has to be saved into multiple tables. Also, we will be receiving two different types of files, like sales and purchase; each file has a different schema and transformations. Would you be able to help with building an end-to-end pipeline?
@anandsahasrabudhe
@anandsahasrabudhe 22 days ago
Hello, this is a great series. One question: why have you not configured keys in the config table like the other options? I tried configuring it, but each character in the string value gets treated as a separate column and hence it fails, e.g. 'pkey' becomes 'p', 'k', 'e', 'y', 's', and then I get an UNRESOLVED_COLUMN.WITH_SUGGESTION error.
@bktumma
@bktumma 22 days ago
Nice video. I have created test cases for a user story, but once the user story is closed, there are no more test cases. How do I link these user-story test cases to a Test Plan? I want to add the user story to a Test Plan and link the already-written test cases. How do I do that?
@Goureesh22
@Goureesh22 23 days ago
Hi sir, where does the data get stored in Databricks?
@markmotarker
@markmotarker 25 days ago
dude your voice
@sirajansari2848
@sirajansari2848 27 days ago
Thanks for the video. Can you please also share the Git repo of the code so we can practice?
@sirajansari2848
@sirajansari2848 27 days ago
Thanks for uploading the videos; they are very helpful. However, can you please rearrange the videos into their logical order? I think this particular video should be the 3rd video, not the 1st one.
@jaguarprovo
@jaguarprovo 1 month ago
Best damn summary I’ve seen about the need for DLT
@neeldarji4185
@neeldarji4185 1 month ago
What is the sequence number of these videos in the playlist?
@ramakrishnareddynomula9123
@ramakrishnareddynomula9123 1 month ago
Hi, I need training in ETL testing on the Azure platform (Azure Synapse Analytics, Azure Data Lake, Azure Databricks, and Azure SQL). If you provide training on these, please share your contact details.
@abhishekbr6681
@abhishekbr6681 1 month ago
How do we search for multiple items in a single search? Example: in the Assigned To field, how do we check for items to which multiple people are assigned, using a list of names?
@andrejbelak9936
@andrejbelak9936 1 month ago
Great series! When can we expect a new video? How many episodes do you plan?
@balajia8376
@balajia8376 1 month ago
Fantastic video. May I know the cluster type/creation steps you are using to run the DLT notebook interactively? Also, can you order/index these 8 videos in sequence, from first to last? Thanks.
@naresh8743
@naresh8743 1 month ago
Thank you for making this. Quick question on DLT: is there a way I can control reading only from the xth year and month in an S3 bucket holding tons of history partitioned by year and month? I know we can apply a filter condition on the DataFrame; however, I do not want DLT to scan the entire history and filter it. I want it to stream from the point I'm interested in. Appreciate your time in responding.
@PierreRoussin
@PierreRoussin 1 month ago
Give DLT-Meta a shot.
@NaveenVuppala-z2l
@NaveenVuppala-z2l 2 months ago
For incremental files we use Auto Loader. Do we still need Auto Loader if we use DLT, since DLT has flows?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 2 months ago
DLT works with Auto Loader.
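To make that concrete, here is a minimal sketch of a DLT table fed by Auto Loader. The paths and table name are hypothetical, and `dlt` and `spark` are only available inside a Databricks DLT pipeline, so this is a sketch rather than standalone-runnable code:

```python
import dlt  # provided by the Databricks DLT runtime, not installable locally

@dlt.table(name="sales_bronze", comment="Raw files ingested incrementally")
def sales_bronze():
    # Auto Loader is the "cloudFiles" streaming source; DLT manages the flow
    # and checkpointing around it, so the two are used together rather than
    # being alternatives.
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")  # hypothetical source format
        .load("/mnt/landing/sales/")          # hypothetical landing folder
    )
```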
@aashishraina2831
@aashishraina2831 2 months ago
One suggestion: the videos are not sequenced properly.
@yvishal519
@yvishal519 2 months ago
Can we use a wildcard path for data loading? In my scenario I have folders like Inc/table_12072024/data.parquet and Inc/table_13072024/data.parquet. Every day new folders get created inside the Inc folder, and the data inside them gets updated. Any suggestions on how to handle this type of scenario incrementally with DLT tables?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 2 ай бұрын
You don't need *, autoloader with file notifications will take care on its own you just need to give path till Inc folder
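Sketching that suggestion (paths are hypothetical, and `spark` comes from the Databricks runtime, so this is not standalone-runnable):

```python
# Point Auto Loader at the top-level Inc folder only; no wildcard needed.
df = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    # File-notification mode: new files anywhere under the path are announced
    # via storage events instead of repeated directory listings.
    .option("cloudFiles.useNotifications", "true")
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/inc")  # hypothetical
    .load("/mnt/source/Inc/")  # dated subfolders underneath are picked up
)
```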
@DilipDiwakarAricent
@DilipDiwakarAricent 3 months ago
I think storing the schema in a table is not a good idea; we should pass a schema location dynamically, managed by Auto Loader 😊. Think about 100 applications and 10k tables 😂
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 2 months ago
Yes, in a real-world project we should use a dynamic schema. Just as I am fetching the metadata, we can store the schema in an external location or in a database and create the schema on the fly based on the source system.
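One way to sketch "schema on the fly": the schema metadata is stored as rows (here a plain Python list standing in for a config table or external file), and a Spark DDL schema string is built per source table at runtime. Table and column names below are hypothetical:

```python
# Schema metadata as it might come from a config table or metadata store.
schema_metadata = [
    {"table": "sales",    "column": "order_id", "type": "INT"},
    {"table": "sales",    "column": "amount",   "type": "DOUBLE"},
    {"table": "sales",    "column": "order_dt", "type": "DATE"},
    {"table": "customer", "column": "cust_id",  "type": "INT"},
    {"table": "customer", "column": "name",     "type": "STRING"},
]

def build_ddl_schema(metadata, table):
    """Build a Spark DDL schema string, usable with spark.read.schema(...)."""
    cols = [m for m in metadata if m["table"] == table]
    if not cols:
        raise ValueError(f"no schema metadata for table {table!r}")
    return ", ".join(f'{m["column"]} {m["type"]}' for m in cols)

print(build_ddl_schema(schema_metadata, "sales"))
# order_id INT, amount DOUBLE, order_dt DATE
```

The resulting string can then be passed to a single parameterised notebook per run, instead of hard-coding a schema in hundreds of notebooks.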
@Real-IndianArmyFan
@Real-IndianArmyFan 7 days ago
@softwaredevelopmentenginee5650 can you show how to create the schema on the fly and make it dynamic?
@aadil8409
@aadil8409 3 months ago
Sir, which storage is the catalog created in by default: the default Databricks storage, or the storage you created with the Databricks connector shown in the second catalog video?
@DilipDiwakarAricent
@DilipDiwakarAricent 3 months ago
It will choose the default metastore location 😊
@aadil8409
@aadil8409 3 months ago
Sir, does Unity Catalog work on any cluster? You said the metastore is connected to ADLS, so what is the role of this Unity Catalog cluster? And if we terminate it, will all the Unity Catalog data be deleted?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 2 months ago
No, it won't delete the data if the cluster is turned off. Also, Unity Catalog works on runtimes greater than 12.2 LTS.
@aadil8409
@aadil8409 3 months ago
You are saying that if the metastore is in one region and the workspace is in another region, we cannot connect the two. Then you say we can have multiple metastores, but two metastores cannot be in the same region. So suppose we create two metastores for a workspace: one is in the same region as the workspace, but the second is in a different region, so we cannot connect to it. Then what is the use of having multiple metastores for one workspace, since we cannot connect to the remaining metastores?
@aadil8409
@aadil8409 3 months ago
This storage ADLS account is outside of the Databricks account; it's not the default Databricks storage.
@aadil8409
@aadil8409 3 months ago
Thanks for such a clear and simple explanation. Please wear a mic while recording: I could not hear you on my laptop speakers, and even with earphones your voice was very low, so I had to connect a wireless speaker to listen to you. Kindly use a mic. Thanks.
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 2 months ago
Thank you, I will
@letsunderstand3824
@letsunderstand3824 3 months ago
How do we track lineage, given that APPLY CHANGES drops the lineage?
@vrsubrahmanyamkollipara7882
@vrsubrahmanyamkollipara7882 3 months ago
Nice explanation. Can we duplicate a test case and make changes in the duplicate?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 2 months ago
Yes, you can
@tanmay9649
@tanmay9649 3 months ago
Thank you so much, bro. You saved me a lot of time and spared me the boring work. I also got good comments from my seniors because of you.
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 3 months ago
Happy to help
@hareeshabm6927
@hareeshabm6927 4 months ago
What if all the steps are accidentally deleted? Is there any way to recover them?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 2 months ago
It depends on the policy.
@mehul5217
@mehul5217 4 months ago
very insightful👌👌👍
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 2 months ago
Glad you liked it
@surajbasha9062
@surajbasha9062 4 months ago
2. Can we use YARN as a cluster manager or resource manager for Spark in Databricks, in real time?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
As far as I know, you cannot use YARN (Yet Another Resource Negotiator) as a cluster manager or resource manager for Spark within Databricks. Databricks uses its own optimized resource and cluster management system, built on top of Apache Spark.
@surajbasha9062
@surajbasha9062 4 months ago
Recent interview questions: 1. If you are using Unity Catalog in your project, can we use service principals to connect ADF to Databricks? Sir, can you please explain in depth?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
Not sure how Unity Catalog even matters here; maybe the interviewer wanted to know whether we can interact with the data present in Unity Catalog tables, or create a table in UC using ADF. What I know is that a service principal in Azure provides a secure and scalable way to handle authentication and authorization for automated processes and applications. Specific to the Azure Data Factory and Databricks integration: if he wants to interact with Databricks Unity Catalog from ADF, unfortunately there is nothing in ADF that can access Unity Catalog tables directly. However, you can use a Databricks Job or a Web Activity to call Databricks SQL. And to use the Databricks REST API from within Azure Data Factory to run the commands that make our table available in Unity Catalog, we need to authenticate, and a service principal is one of the ways to do that. Not sure if I was able to answer your question, but feel free to share more interview questions you faced (with the company name); that will help others as well. Thanks :)
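To make the Web Activity route more concrete, here is a hedged sketch of the two requests involved: acquiring an AAD token as the service principal (client-credentials grant), then posting a statement to the Databricks SQL Statement Execution API (`/api/2.0/sql/statements` on the workspace URL). The tenant ID, workspace URL, and warehouse ID are placeholders; only the request bodies are built here, and no network call is made:

```python
import json

# Placeholder endpoint values; supply your own tenant/workspace/warehouse.
TOKEN_URL = "https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token"
# The scope uses Azure Databricks' well-known application ID.
SCOPE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"

def build_token_request(client_id, client_secret):
    """Form body for the client-credentials grant a service principal uses."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
        "scope": SCOPE,
    }

def build_statement_payload(sql, warehouse_id):
    """JSON body for POST <workspace-url>/api/2.0/sql/statements."""
    return {"statement": sql, "warehouse_id": warehouse_id}

payload = build_statement_payload(
    "SELECT * FROM main.sales.orders LIMIT 10",  # hypothetical UC table
    "<warehouse-id>",
)
print(json.dumps(payload))
```

In ADF this maps to two Web Activities: one POST to the token endpoint, one POST to the SQL statements endpoint with the bearer token in the Authorization header.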
@surajbasha9062
@surajbasha9062 4 months ago
What is the difference between Unity Catalog and the Hive metastore?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
Please don't confuse the Unity Catalog metastore with the Hive metastore. The Unity Catalog metastore is brand-new IP developed by Databricks. The Hive metastore still exists in the Databricks workspace, but the advice is not to use it in a Unity Catalog-enabled workspace. While the Hive metastore focuses on managing metadata within the Hadoop ecosystem with limited governance features, the Unity Catalog metastore offers a unified, advanced governance solution for modern data architectures, streamlining metadata management, security, and compliance. The Hive metastore is there just for backward compatibility and to support projects migrating to Unity Catalog. If you continue to use the Hive metastore, you won't get any of the new functionality offered by Unity Catalog, such as data lineage, audit logs, etc.
@destroyer3706
@destroyer3706 4 months ago
Where are the tables? How can we see the tables that were created, like in SQL Server database style? How can we query our datasets after the pipeline finishes running?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
I have created the tables in Unity Catalog; you can query the data like normal tables.
@venkata928
@venkata928 4 months ago
Thanks for the excellent video. Can you please make a video on what to test in Azure Databricks and how to write test cases for Azure Databricks ETL? If you are planning any training on these, please let me know, or can you give me your contact number so I can talk to you about this? It's a bit urgent for me!
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
Sure, I will try to create one. You can ping me on Telegram.
@sweethad753
@sweethad753 4 months ago
Test case title: validate the login functionality.
@mazharkhatri779
@mazharkhatri779 4 months ago
I was searching for more details about what you explain in your videos. Awesome! How can I reach you if I have any questions?
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
You can ping me on Telegram.
@mazharkhatri779
@mazharkhatri779 4 months ago
The way you explain is awesome, but it would be great if you explained the steps of writing the test cases, or showed test cases that are already written, and then showed how the queries work to link those test cases when creating a test suite.
@itzashik
@itzashik 4 months ago
If you share the code in the description it will be helpful.
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
Sure, I will add the code as well from next time. Thanks for your feedback.
@itzashik
@itzashik 4 months ago
If you share the code and the data file, I can practice; it will be easier for a first-timer to get familiar with the code.
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
Sure, I will add the code as well from next time. Thanks for your feedback.
@Bell4Fun
@Bell4Fun 4 months ago
I have the basic version and don't see any of these options.
@shashankgupta3549
@shashankgupta3549 4 months ago
Amazing, I like the matter covered in great depth here!
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
Glad you liked it!
@prabhatsrivastava6148
@prabhatsrivastava6148 4 months ago
One of the best tutorial videos I have gone through. The explanation is very clear and the pace is really good. Thank you for making this playlist ❤
@softwaredevelopmentenginee5650
@softwaredevelopmentenginee5650 4 months ago
Glad you liked it!