DAIS Summary
15:17
3 months ago
Custom Logging in Databricks
17:02
Delta Table - Clone
19:30
1 year ago
Delta Table Transaction Log - Part 2
18:20
Delta Table Transaction Log - Part 1
13:49
Optimization in Spark
13:03
2 years ago
Databricks - Connect to SQL DB
17:41
2 years ago
Window functions in Databricks
18:25
2 years ago
Databricks streaming part 2
12:01
2 years ago
Stream processing with Databricks
26:49
Schedule azure databricks notebook
36:11
Read from Rest API
18:31
3 years ago
Implementing SCD Type 2 using Delta
13:56
Comments
@FerozFerozAbdulhannan 13 days ago
Great presentation, Jithesh. Can you help me get a list of all the jobs that completed or failed, using a dashboard?
@KnowledgeSharingjkb 13 days ago
@FerozFerozAbdulhannan Sure, will let you know.
@ranadip123 3 months ago
Really helpful, thank you for sharing.
@ericjanssens3475 6 months ago
So far off! SCD Type 2 requires a unique surrogate key to join with a fact table FK!
@CrickOGGY 7 months ago
Is this hue?
@Parquet773 7 months ago
Just finding your channel today. You are an AWESOME teacher, presenter, and practitioner. Thanks much for sharing your knowledge!
@lokeswarv 7 months ago
Hey, try running the merge query again and again; it will keep inserting records into the Dim table. Because the join key is always considered null, EmployeeId from the target never matches null, so records keep getting inserted.
@rockingrakesh8197 9 months ago
Can we use this method to read an Excel file by placing the files in Gen2 and reading them using PySpark? I am not able to do the same from the storage account. Please reply.
@KnowledgeSharingjkb 8 months ago
Yes, you can do this by uploading the file into ADLS Gen2 storage.
@y.c.breddy3153 9 months ago
Hi bro, how can I connect Azure Data Studio from Databricks, Databricks to the data lake, and then the data lake to Snowflake? Can you help me?
@KnowledgeSharingjkb 8 months ago
Can I know the reason for connecting to Azure Data Studio from Databricks? I didn't try this method as I don't have a use case.
@sravankumar1767 9 months ago
Superb explanation 👌 👏 👍
@KnowledgeSharingjkb 8 months ago
Glad you liked it
@shilpananda6335 10 months ago
How can I import a notebook along with its visualizations? I have created a notebook and visualizations with the results, and now I want to migrate them to prod.
@KnowledgeSharingjkb 8 months ago
The best approach is to use GitHub.
@janblasko4949 10 months ago
Cmd 4 did not work. I have the Excel file in Microsoft Azure storage.
@CoopmanGreg 10 months ago
I think you should re-title this video "Databricks credential passthrough". That is specifically what I was looking for, and I almost did not click on it because I did not think it was Databricks focused. Just a thought. Thanks!
@KnowledgeSharingjkb 10 months ago
Sure. Thanks for the suggestion.
@SatishKumar-n3t5t 11 months ago
If there is no change in the source data and we run the merge code again as part of a daily run, the null-mergeKey records will be inserted into the target again as active, and we end up with duplicates. How do we solve this?
@KnowledgeSharingjkb 8 months ago
There should not be null values in the key columns; please handle nulls before the insertion.
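A minimal sketch of the idempotency point discussed in this thread, using pandas in place of the video's Delta MERGE (the column names `emp_id`, `address`, `is_current` are hypothetical): an unchanged source row is skipped entirely, so re-running the daily load cannot duplicate the dimension.

```python
import pandas as pd

def scd2_upsert(dim, src, key, tracked):
    """Apply one SCD Type 2 load. Re-running with an unchanged source
    makes no changes, so daily re-runs cannot create duplicates."""
    dim = dim.copy()
    for _, row in src.iterrows():
        cur = dim[(dim[key] == row[key]) & dim["is_current"]]
        if cur.empty:
            # brand-new key: insert as the current version
            dim.loc[len(dim)] = {**row.to_dict(), "is_current": True}
        elif any(cur.iloc[0][c] != row[c] for c in tracked):
            # a tracked column changed: expire the old row, insert a new version
            dim.loc[cur.index, "is_current"] = False
            dim.loc[len(dim)] = {**row.to_dict(), "is_current": True}
        # unchanged row: do nothing, which is what makes re-runs safe
    return dim

# Hypothetical sample data: employee 1 moves from address "A" to "B".
dim = pd.DataFrame({"emp_id": [1], "address": ["A"], "is_current": [True]})
src = pd.DataFrame({"emp_id": [1], "address": ["B"]})

once = scd2_upsert(dim, src, "emp_id", ["address"])    # old + new version
twice = scd2_upsert(once, src, "emp_id", ["address"])  # unchanged: still 2 rows
```

In Delta MERGE terms, the same protection comes from adding a change-detection condition to the `WHEN NOT MATCHED` / insert branch rather than inserting on every run.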
@sankarkumarazad3843 11 months ago
Great explanation. How do we decide which worker and driver types to select, and how many worker instances to use? Are there any rules or calculations to decide?
@KnowledgeSharingjkb 11 months ago
It should be based on the workload. Normally we do not do any work on the driver unless the user runs data science code using pandas. If you add multiple nodes, your parallelism increases. Also note that if high-volume data processing is required from the beginning, you can add more capacity to the nodes. It needs a separate session to explain; let me add a video.
@muthukumar-rj8ik 1 year ago
I need your help. Is it possible to connect with you over a call?
@ORARAR 1 year ago
Is there a way to connect to Databricks from Oracle SQL Developer?
@KnowledgeSharingjkb 1 year ago
Didn't try that, but there should be a way.
@praveenkumarkumawat7203 1 year ago
This method is not working in a Synapse notebook.
@KnowledgeSharingjkb 11 months ago
Oh, OK. I didn't try it in Synapse; will try and let you know.
@subburayadu-bc8jh 1 year ago
Hi sir, I am facing a connectivity issue from Power BI to Azure Databricks. This is the error: Details: "ODBC : ERROR [HY000] [ Microsoft][ThriftExtension] (14) Unexpected response from server during a HTTP Connection: SSL_Connect: Certificate verify failed.". Can you please help me with this issue?
@KnowledgeSharingjkb 11 months ago
How are you connecting? Is this your organization's laptop or a personal one? If it is your office laptop, work with your network team.
@Creativesoulsowmya 4 months ago
Is this issue resolved?
@kenpachi-zaraki33 1 year ago
Can you please write the SCD Type 2 code in a generic way? Currently you have written it for only one column. Please and thank you.
@KnowledgeSharingjkb 11 months ago
Yes, this is just an example. Please let me know your requirement in detail.
@KrishnaGupta-dd1mo 1 year ago
It's helpful. Thanks!
@vinayakbiju932 1 year ago
Can you share the file you have uploaded here, the CSV file?
@KnowledgeSharingjkb 1 year ago
databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/2683250055805286/1609000298248664/3892374572023226/latest.html
@rajsekhargada9212 1 year ago
What if another column is updated, apart from address?
@KnowledgeSharingjkb 11 months ago
Use the columns that you need to consider for SCD Type 2; this is just an example.
@rnunez2496 1 year ago
How did you get your dashboard to look like that? It's not letting me write code in the dashboard.
@maheboobpatel573 1 year ago
Great, but try to mention your LinkedIn and share your notebook link in a repo so that we can get the code.
@eric8188 1 year ago
Hi, can a global view be directly accessed by Power BI?
@dhirajandhere8850 1 year ago
Hi, I watched this video; it is really helpful. One quick question: once we shut down logging, the file is written to storage (ADLS in my case). After that I am unable to write data to the same file; it throws an error. Can you please help with that?
@KnowledgeSharingjkb 1 year ago
Can you please share your code?
@xxczerxx 1 year ago
Is there a reason why you shouldn't do this? I am surprised this isn't encouraged as a best practice, which makes me think I'm missing something.
@KnowledgeSharingjkb 1 year ago
If the data volume is high, it may affect your program's performance. ADF will be your best choice for such scenarios.
@Learn2Share786 1 year ago
Can we read a pivot Excel connected to Azure Analysis Services using this method?
@midhunrajaramanatha5311 1 year ago
Hi, can you make a video about Auto Loader and Structured Streaming?
@KnowledgeSharingjkb 1 year ago
Will create one for Auto Loader. Please watch this video for streaming: kzbin.info/www/bejne/jYq2kmWaiqZ_d8U
@runilkumar3127 1 year ago
Thanks Jithesh for this video. Really helpful!
@Suriya_MSM 1 year ago
Hi sir, what if I want to fill the null values in the salary column with the average of the preceding and succeeding values?
@Suriya_MSM 1 year ago
And if there are consecutive null values, first populate the first null with the average, then use that updated value and the next succeeding value to calculate the average for the second null.
@KnowledgeSharingjkb 1 year ago
@Suriya_MSM I don't think I follow. Can you please paste an example?
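A plain-Python sketch of the scheme described above, assuming the fill walks left to right and averages the already-filled preceding value with the next non-null value (in Databricks this would typically be built from window functions; the function and variable names here are hypothetical):

```python
def fill_with_neighbor_average(values):
    """Fill each None with the average of the previous (already filled)
    value and the next non-None value, walking left to right."""
    out = list(values)
    for i, v in enumerate(out):
        if v is None:
            prev = out[i - 1] if i > 0 else None
            nxt = next((x for x in out[i + 1:] if x is not None), None)
            if prev is not None and nxt is not None:
                out[i] = (prev + nxt) / 2
            elif prev is not None:   # trailing nulls: carry forward
                out[i] = prev
            elif nxt is not None:    # leading nulls: carry backward
                out[i] = nxt
    return out

salaries = [1000, None, None, 2000]
filled = fill_with_neighbor_average(salaries)  # [1000, 1500.0, 1750.0, 2000]
```

Note this iterative scheme differs from linear interpolation for consecutive nulls: each filled value feeds into the next, exactly as the comment describes.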
@arpithasp7500 1 year ago
Thank you for this content
@lucaschiqui 1 year ago
Hi, excellent video. I have a question: is there a way to schedule an email with the dashboard information? For example, receiving a daily email with a PDF or a link where I can see the dashboard with its updated information.
@KnowledgeSharingjkb 1 year ago
I didn't try this option; will try and let you know.
@KomalSingh-mi7ux 1 year ago
Hi, how do we troubleshoot the Spark driver errors "no parent missing" and NullPointerException?
@KnowledgeSharingjkb 1 year ago
You can go to the driver logs and dig deep.
@kanumuriharshith6581 1 year ago
import pandas as pd
df = pd.read_excel("filename.xlsx")
@maheshraccha5957 1 year ago
Thank you so much for the real-time explanation of shallow vs. deep clone. I had been searching for it; it's a great explanation!
@khandoor7228 1 year ago
this is top notch content!! Excellent!!
@mohdtoufique7446 1 year ago
Hi, thanks for the content! I am converting a pandas df to a Spark DataFrame in Databricks but getting a "cannot infer schema" error. I have used the parameter inferSchema=True. The PySpark version is 3.0. Can you please help me with this?
@KnowledgeSharingjkb 1 year ago
Can you please share your code?
@CoopmanGreg 1 year ago
Fantastic video, example and explanation. Thanks!
@eljangoolak 1 year ago
When I use DirectQuery, query folding doesn't happen, so it tries to import everything, which can't work because the database is too large. How can I solve this?
@KnowledgeSharingjkb 1 year ago
What is the data size?
@eljangoolak 1 year ago
@KnowledgeSharingjkb 33 billion rows
@KnowledgeSharingjkb 1 year ago
I believe it is a Power BI issue, as it has size restrictions.
@swarup19051979 1 year ago
Excellent topic and well explained
@MohanKumar-ge4nv 1 year ago
Thank you for the video. How can I create a function based on this example? For example, I have 100 columns in DataFrame1 and 100 columns in DataFrame2, and I want to replace null values in DataFrame1 with values from DataFrame2. Note: both DataFrames have the same column names. Thanks in advance!
@KnowledgeSharingjkb 1 year ago
Are you thinking of creating a function that accepts the columns as parameters and then replaces the values?
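A hedged pandas sketch of one way to do what the question above asks (in PySpark the equivalent would be `F.coalesce(df1.col, df2.col)` per column after a join on the key; the key `id` and the column names are hypothetical): `combine_first` keeps DataFrame1's values and fills only its nulls from DataFrame2, across all shared columns at once, so it scales to 100 columns without listing them.

```python
import pandas as pd

# Hypothetical frames with the same key and column names.
df1 = pd.DataFrame({"id": [1, 2, 3],
                    "salary": [100.0, None, 300.0],
                    "city": ["NY", "LA", None]})
df2 = pd.DataFrame({"id": [1, 2, 3],
                    "salary": [111.0, 200.0, 333.0],
                    "city": ["X", "Y", "SF"]})

# Align both frames on the key, then fill df1's nulls from df2.
# df1's non-null values always win; only its gaps are patched.
filled = (df1.set_index("id")
             .combine_first(df2.set_index("id"))
             .reset_index())
```

Only the null cells in `salary` and `city` of df1 are replaced; all of df1's existing values are kept.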
@13Keerthana 1 year ago
Nice video, clearly explained. I have a blocker: while running dbutils.fs.mount(), I'm getting the error below: Unsupported Azure Scheme: abfss
@shanhuahuang3063 1 year ago
I have encountered an SSL issue; could you help?
@KnowledgeSharingjkb 1 year ago
Please let me know your issue.
@shividhun8675 1 year ago
Is it only me, or does someone else have the same question: while creating the shareanalysis table, the first line is DROP TABLE IF EXISTS, so how can data be there unless we run the INSERT command?
@KnowledgeSharingjkb 1 year ago
Can you please elaborate?
@mranaljadhav8259 1 year ago
Thank you so much. Today I learned a new concept; I will add it to my resume.
@muvvalabhaskar3948 1 year ago
How can I get the dataset for this example?
@KnowledgeSharingjkb 1 year ago
It is available on Yahoo Finance; you can download it from there. Let me check whether I can add it here.
@161vinumail.comvinu6 1 year ago
Sir, is it possible in Databricks Community Edition?
@KnowledgeSharingjkb 1 year ago
Honestly, I didn't try this, but it should work.
@ikernarbaiza2138 1 year ago
Thank you, very well explained. I have an important question: as I saw on the Internet, users have to pay charges for terminated clusters despite them not running. Is there any way to delete the cluster once the execution is done, to save money? I have to schedule a job to run every day for a year, for example. Thank you.
@KnowledgeSharingjkb 10 months ago
There is no charge to you if the cluster is terminated. We can also programmatically delete the cluster.