Hi, thanks for the videos on Delta Live Tables. However, the playlist continuity is missing; the videos are shuffled. Could you please number the videos so we can follow them one after another and get more clarity? For example, the introduction video comes 2nd.
@wayneliu7006 · 5 days ago
Great series!!! I have two questions. We noticed that 2 sample files were loaded in sequence and the files stayed in the source folder. With DLT, how can we move old files from the source folder to an archive folder/storage at the end of the pipeline? What is the best practice for data archiving when using DLT? Another question: how can we perform data retention, e.g. delete/remove data older than 30 days from tables managed by DLT pipelines? It would be great if you could talk more about managing the data life cycle in a DLT context in upcoming videos. Thanks a lot!
@Real-IndianArmyFan · 7 days ago
Every time you define the schema explicitly, but what if we have hundreds of files at the source location? You need to first start with the historic load, then the daily load, then apply CDC for all these hundreds of files into their respective tables. How do we handle such a situation? Obviously it is not a good idea to use hundreds of notebooks, right?
@tvyoutube140 · 9 days ago
Good video.
@sarangKhedkar · 10 days ago
Useful content 🎉❤
@sanjeevreddy3691 · 10 days ago
Is the metastore present in the control plane or the data plane?
@Frank-i2z5c · 10 days ago
I don't think this was helpful. It didn't really explain the WHY of anything. Why the view? Why readStream? How do you know when you should use a materialized view or not?
@mukeshnandy5589 · 21 days ago
@softwaredevelopmentenginee5650 Could you fix the sequence?
@akshay11000 · 22 days ago
Your videos are very beneficial. I have a use case where we need to process nested JSON and save the nested parts into multiple tables. Also, we will be receiving 2 different types of files, like sales and purchase; each file has a different schema and transformations. Would you help with building an end-to-end pipeline?
@anandsahasrabudhe · 22 days ago
Hello, this is a great series. One question: why have you not configured keys in the config table like the other options? I tried configuring it, but each character of the string value is being treated as a separate column, so it fails. E.g. 'pkey' becomes 'p', 'k', 'e', 'y', 's', and then I get an UNRESOLVED_COLUMN.WITH_SUGGESTION error.
@bktumma · 22 days ago
Nice video. I created test cases for a user story, but once the user story was closed, the test cases were gone. How do I link these user-story test cases to a Test Plan? I want to add the user story to a Test Plan and link the already-written test cases. How do I do it?
@Goureesh22 · 23 days ago
Hi sir, where does the data get stored in Databricks?
@markmotarker · 25 days ago
dude your voice
@sirajansari2848 · 27 days ago
Thanks for the video. Can you please also share the Git repo of the code with us so we can practice?
@sirajansari2848 · 27 days ago
Thanks for uploading the videos; they are very helpful. However, can you please rearrange the videos into their logical order? I think this particular video should be the 3rd video, not the 1st one.
@jaguarprovo · a month ago
Best damn summary I’ve seen about the need for DLT
@neeldarji4185 · a month ago
What is the sequence number of these videos in the playlist?
@ramakrishnareddynomula9123 · a month ago
Hi, I need training in ETL testing on the Azure platform (Azure Synapse Analytics, Azure Data Lake, Azure Databricks, and Azure SQL). If you provide training on these, please share your contact details.
@abhishekbr6681 · a month ago
How do we search for multiple items in a single search? Example: in the Assigned To field, how do we check for items assigned to multiple people by using a list of names?
@andrejbelak9936 · a month ago
Great series! When can we expect a new video? How many episodes do you plan?
@balajia8376 · a month ago
Fantastic video. May I know the cluster type and creation steps you use to run the DLT notebook interactively? Also, can you order/index these 8 videos in sequence, from first to last? Thanks.
@naresh8743 · a month ago
Thank you for making this. Quick question on DLT: is there a way I can control reading only from the xth year and month in an S3 bucket that has tons of history partitioned by year and month? I know we can apply a filter condition on the DataFrame; however, I don't want DLT to scan the entire history and then filter. I want it to stream from the point I'm interested in. Appreciate your time in responding.
@PierreRoussin · a month ago
Give DLT-Meta a shot.
@NaveenVuppala-z2l · 2 months ago
For incremental files we use Auto Loader. Do we still need Auto Loader if we use DLT, since DLT has flows?
@softwaredevelopmentenginee5650 · 2 months ago
DLT works with Auto Loader.
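To illustrate the reply above, here is a minimal sketch of an Auto Loader read as it would appear inside a DLT pipeline. The paths and table name are made up for illustration, and the streaming read itself only runs inside a Databricks pipeline:

```python
# Hypothetical sketch: Auto Loader feeding a DLT table.
# Paths and names below are assumptions, not from the videos.
autoloader_options = {
    "cloudFiles.format": "json",                       # source file format
    "cloudFiles.schemaLocation": "/mnt/meta/schemas",  # where Auto Loader tracks the inferred schema
}

def bronze_sales(spark, source_path="/mnt/landing/sales"):
    """Incrementally ingest new files; in DLT this function would be
    decorated with @dlt.table so the stream becomes a managed table."""
    return (
        spark.readStream
        .format("cloudFiles")           # Auto Loader source
        .options(**autoloader_options)
        .load(source_path)
    )
```

In a pipeline, DLT tracks which files Auto Loader has already processed, so each run only picks up new arrivals.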
@aashishraina2831 · 2 months ago
One suggestion: the videos are not sequenced properly.
@yvishal519 · 2 months ago
Can we use a wildcard path for data loading? In my scenario I have folders like Inc/table_12072024/data.parquet and Inc/table_13072024/data.parquet. Every day new folders get created inside the Inc folder, and the data folder inside gets updated. Any suggestions on how to handle this type of scenario incrementally with DLT tables?
@softwaredevelopmentenginee5650 · 2 months ago
You don't need *; Auto Loader with file notifications will take care of it on its own. You just need to give the path up to the Inc folder.
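A minimal sketch of that suggestion, assuming the folder layout from the question (the base path and schema location are placeholders):

```python
# Sketch: point Auto Loader at the parent Inc/ folder. New dated subfolders
# like Inc/table_13072024/data.parquet are discovered automatically, so no
# wildcard is needed in the path. All values below are assumed examples.
options = {
    "cloudFiles.format": "parquet",
    "cloudFiles.useNotifications": "true",           # file-notification mode instead of directory listing
    "cloudFiles.schemaLocation": "/mnt/meta/inc_schema",
}

def read_inc(spark, base_path="/mnt/raw/Inc"):
    """Stream every new file that lands anywhere under Inc/."""
    return spark.readStream.format("cloudFiles").options(**options).load(base_path)
```

File-notification mode avoids repeatedly listing a growing directory tree, which matters when a new folder is created every day.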
@DilipDiwakarAricent · 3 months ago
I think storing the schema in a table is not a good idea; we should dynamically pass a schema location that is managed by Auto Loader 😊. Think about 100 applications and 10k tables 😂
@softwaredevelopmentenginee5650 · 2 months ago
Yes, in a real-world project we should use a dynamic schema. Just as I am fetching the metadata, we can store the schema in an external location or in a database and create it on the fly based on the source system.
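One way to sketch the "schema on the fly" idea from the reply above: build a DDL schema string from metadata rows fetched from a config store. The column metadata here is invented for illustration:

```python
# Sketch: construct a schema dynamically from metadata (e.g. rows fetched
# from a config database or a file in an external location).
def build_ddl_schema(columns):
    """columns: list of (name, spark_sql_type) tuples.
    Returns a DDL schema string that can be passed to .schema(...)
    on a Spark reader, avoiding one hard-coded schema per table."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)

# Hypothetical metadata for one source table:
metadata = [("order_id", "INT"), ("customer", "STRING"), ("amount", "DOUBLE")]
schema = build_ddl_schema(metadata)
# schema == "order_id INT, customer STRING, amount DOUBLE"
```

With this, one generic notebook can loop over all tables in the metadata store instead of maintaining hundreds of notebooks with hard-coded schemas.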
@Real-IndianArmyFan · 7 days ago
@softwaredevelopmentenginee5650 Can you show how to create the schema on the fly and build a dynamic schema?
@aadil8409 · 3 months ago
Sir, which storage is the catalog created in by default: the default Databricks storage, or the storage you created with the connector you showed in the second video on catalogs?
@DilipDiwakarAricent · 3 months ago
It will choose the default metastore location 😊
@aadil8409 · 3 months ago
Sir, does Unity Catalog work on any cluster? You said the metastore is connected to ADLS; then what is the purpose of the Unity Catalog cluster, and if we terminate it, will all the Unity Catalog data be deleted?
@softwaredevelopmentenginee5650 · 2 months ago
No, it won't delete the data if the cluster is turned off. Also, Unity Catalog works on runtimes greater than 12.2 LTS.
@aadil8409 · 3 months ago
You say that if the metastore is in one region and the workspace in another, we cannot connect them. You also say we can have multiple metastores, but two metastores cannot be in the same region. So suppose we create two metastores for a workspace: one in the same region as the workspace and one in a different region. Then what is the use of having multiple metastores for one workspace, since we cannot connect to the one in the other region?
@aadil8409 · 3 months ago
This ADLS storage account is outside of the Databricks account; it's not the default Databricks storage.
@aadil8409 · 3 months ago
Thanks for such a clear and simple explanation. Please wear a mic when recording: I could not hear you on my laptop speakers, the volume was still very low with earphones, and I had to connect a wireless speaker to listen. So kindly use a mic. Thanks.
@softwaredevelopmentenginee5650 · 2 months ago
Thank you, I will
@letsunderstand3824 · 3 months ago
How do we track lineage, given that APPLY CHANGES drops the lineage?
@vrsubrahmanyamkollipara7882 · 3 months ago
Nice explanation. Can we duplicate a test case and make changes in the duplicate?
@softwaredevelopmentenginee5650 · 2 months ago
Yes, you can
@tanmay9649 · 3 months ago
Thank you so much, bro. You saved me a lot of time and spared me some boring work. I also got good comments from my seniors because of you.
@softwaredevelopmentenginee5650 · 3 months ago
Happy to help
@hareeshabm6927 · 4 months ago
What if all the steps are accidentally deleted? Is there any way to recover them?
@softwaredevelopmentenginee5650 · 2 months ago
It depends on the policy.
@mehul5217 · 4 months ago
very insightful👌👌👍
@softwaredevelopmentenginee5650 · 2 months ago
Glad you liked it
@surajbasha9062 · 4 months ago
2. Can we use YARN as a cluster manager or resource manager for Spark in Databricks, in real-world projects?
@softwaredevelopmentenginee5650 · 4 months ago
As far as I know, you cannot use YARN (Yet Another Resource Negotiator) as the cluster manager or resource manager for Spark within Databricks. Databricks uses its own optimized resource and cluster management system, built on top of Apache Spark.
@surajbasha9062 · 4 months ago
Recent interview question: 1. If you are using Unity Catalog in your project, can you use service principals to connect ADF to Databricks? Sir, can you please explain this in depth?
@softwaredevelopmentenginee5650 · 4 months ago
Not sure how Unity Catalog even matters here; maybe the interviewer wants to know whether we can interact with the data in Unity Catalog tables, or create a table in UC, using ADF. What I know is that a service principal in Azure provides a secure and scalable way to handle authentication and authorization for automated processes and applications. Specific to the Azure Data Factory and Databricks integration: if he wants to interact with Databricks Unity Catalog using ADF, unfortunately there is nothing directly in ADF that can access Unity Catalog tables. However, you can use a Databricks job or a Web activity to call Databricks SQL. Now, to use the Databricks REST API from within Azure Data Factory to make our table available in Unity Catalog, we need to authenticate, and a service principal is one of the ways to do that. Not sure if I answered your question, but feel free to share more questions you faced in interviews, with the company name; that will help others as well. Thanks :)
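To make the reply above concrete, here is a hypothetical sketch of the two requests involved: getting a token for the service principal from Azure AD (client-credentials flow), then calling the Databricks SQL Statement Execution API. The tenant, workspace URL, and warehouse ID are placeholders, and no request is actually sent here:

```python
# Sketch only: builds the two request payloads a Web activity (or any HTTP
# client) would send. All identifiers below are assumed examples.
DATABRICKS_SCOPE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"  # Azure AD scope for Azure Databricks

def token_request(tenant_id, client_id, client_secret):
    """Form body for the Azure AD client-credentials token endpoint."""
    return {
        "url": f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
        "data": {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": DATABRICKS_SCOPE,
        },
    }

def sql_statement_request(workspace_url, warehouse_id, statement):
    """Payload for POST /api/2.0/sql/statements, e.g. to query or create a
    Unity Catalog table (send with the bearer token from token_request)."""
    return {
        "url": f"{workspace_url}/api/2.0/sql/statements",
        "json": {"warehouse_id": warehouse_id, "statement": statement},
    }
```

In ADF, the first call typically goes in one Web activity, and its access token is passed as the `Authorization: Bearer` header of the second.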
@surajbasha9062 · 4 months ago
What is the difference between Unity Catalog and the Hive metastore?
@softwaredevelopmentenginee5650 · 4 months ago
Please don't confuse the Unity Catalog metastore with the Hive metastore. The Unity Catalog metastore is brand-new IP developed by Databricks. The Hive metastore still exists in the Databricks workspace, but the advice is not to use it on a Unity Catalog-enabled workspace. While the Hive metastore focuses on managing metadata within the Hadoop ecosystem with limited governance features, the Unity Catalog metastore offers a unified, advanced governance solution for modern data architectures, streamlining metadata management, security, and compliance. The Hive metastore is there for backward compatibility and to support projects migrating to Unity Catalog. If you continue to use the Hive metastore, you won't get any of the new functionality offered by Unity Catalog, such as data lineage, audit logs, etc.
@destroyer3706 · 4 months ago
Where are the tables? How can we see the tables that were created, in the style of a SQL Server database? How can we query our datasets after the pipeline finishes running?
@softwaredevelopmentenginee5650 · 4 months ago
I created the tables in Unity Catalog; you can query the data like normal tables.
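As a small illustration of the reply above: Unity Catalog tables are addressed with a three-level `catalog.schema.table` name, so DLT output can be queried like any other table. The catalog, schema, and table names here are examples, not from the videos:

```python
# Sketch: build the fully qualified name of a Unity Catalog table.
def qualified_name(catalog, schema, table):
    """Unity Catalog uses a three-level namespace: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"

# In a Databricks notebook or SQL warehouse (names are hypothetical):
# spark.sql(f"SELECT * FROM {qualified_name('dev', 'dlt_demo', 'orders_silver')} LIMIT 10")
```

The same name also works in the Catalog Explorer UI and from SQL editors, which is how you get the "SQL Server database style" browsing the question asks about.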
@venkata928 · 4 months ago
Thanks for the excellent video. Can you please make a video on what to test in Azure Databricks and how to write test cases for Azure Databricks ETL? If you are planning any training on these, please let me know, or please give me your contact number so I can talk to you about this. It's a bit urgent for me!
@softwaredevelopmentenginee5650 · 4 months ago
Sure, I will try to create one. You can ping me on Telegram.
@sweethad753 · 4 months ago
Test case title: validate the login functionality.
@mazharkhatri779 · 4 months ago
I was searching for more details about what you explain in your videos. Awesome! How can I reach you if I have any questions?
@softwaredevelopmentenginee5650 · 4 months ago
You can ping me on Telegram.
@mazharkhatri779 · 4 months ago
The way you explain is awesome, but it would be great if you explained the steps of writing the test cases, or showed test cases that are already written, and then, at the time of creating a test suite, how the queries work to link those test cases.
@itzashik · 4 months ago
If you share the code in the description, it will be helpful.
@softwaredevelopmentenginee5650 · 4 months ago
Sure, I will add the code as well from next time. Thanks for your feedback!
@itzashik · 4 months ago
If you share the code and data files, I can practice; it will be easy for a first-timer to get familiar with the code.
@softwaredevelopmentenginee5650 · 4 months ago
Sure, I will add the code as well from next time. Thanks for your feedback!
@Bell4Fun · 4 months ago
I have the basic version and don't see any of these options.
@shashankgupta3549 · 4 months ago
Amazing. I like that the material is covered in great depth here!
@softwaredevelopmentenginee5650 · 4 months ago
Glad you liked it!
@prabhatsrivastava6148 · 4 months ago
One of the best tutorial videos I have gone through. The explanation is very clear and the pace is really good. Thank you for making this playlist ❤