It is amazing how concisely you put so much information in one video! Great!
@joselitophala5307 1 day ago
Thanks, Bryan, for these wonderful videos. They have really helped me understand these confusing terms that I sometimes use interchangeably (Data Warehouse, Data Lake, Data Lakehouse, Delta Table), as I didn't have a clear distinction between them. Knowing a bit of their history is a great addition!
@BryanCafferky 23 hours ago
Glad it helps.
@ayandapeter1681 6 months ago
Sir, I just want to say thank you so much. I've gone through many videos but was still confused; you made this crystal clear with your conceptual approach.
@BryanCafferky 6 months ago
Thank you for the kind words. I'm so glad my videos are helping you. That's why I do them. I know this technology is not easy to learn, so kudos to you for sticking with it.
@sujithravindran7082 1 year ago
I really enjoyed the perspective you brought into the evolution. Great work. Please keep bringing in these great videos. Thank you very much.
@BryanCafferky 1 year ago
Thank you! And you're welcome.
@potnuruavinash 7 months ago
Can we implement a data lakehouse with open source tools like Spark, Presto, and the Hive metastore? Is there any alternative to Unity Catalog in the open source ecosystem?
@BryanCafferky 7 months ago
Lakehouse is just Delta Lake, i.e., Delta tables, which are available in open source Spark, so yes. Unity Catalog is really just a catalog of catalogs, so you could build your own central catalog by extracting the metadata from local Hive metastores. I believe Spark tends to work one cluster at a time, unlike Databricks, which spins up any number of clusters as needed, so I'm not sure whether UC could be implemented on open source Spark, but perhaps?
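To make the "catalog of catalogs" idea in the reply above concrete, here is a minimal sketch in plain Python. It assumes the table listings have already been extracted from each local Hive metastore into dictionaries; the metastore names, the `build_central_catalog` helper, and the sample data are all hypothetical, and this is not how Unity Catalog itself is implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical listings, as if extracted from two local Hive metastores.
# Shape: {metastore_name: {database_name: [table_name, ...]}}
METASTORES: Dict[str, Dict[str, List[str]]] = {
    "sales_cluster": {"default": ["orders", "customers"]},
    "ops_cluster": {"default": ["orders"], "logs": ["events"]},
}


def build_central_catalog(
    metastores: Dict[str, Dict[str, List[str]]]
) -> Dict[str, Tuple[str, str, str]]:
    """Index every table under a three-level name: catalog.schema.table.

    Each local metastore becomes one catalog, so tables with the same
    database and table name in different metastores do not collide.
    """
    central: Dict[str, Tuple[str, str, str]] = {}
    for metastore_name, databases in metastores.items():
        for database, tables in databases.items():
            for table in tables:
                qualified = f"{metastore_name}.{database}.{table}"
                central[qualified] = (metastore_name, database, table)
    return central


central_catalog = build_central_catalog(METASTORES)
for name in sorted(central_catalog):
    print(name)
```

The sketch only covers name resolution. Access control, lineage, and the other governance features of a central catalog are out of scope here.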
@rich111296 2 years ago
Do you have an example in any of your videos of connecting to an S3 bucket by specifying an endpoint within Databricks? Basically, how do you connect to an S3 bucket from a service other than AWS? Thanks.
@BryanCafferky 1 year ago
Hmmm... No, I have not tried that. Have you googled it?
@rich111296 1 year ago
@BryanCafferky Yeah, ha, I did find a solution eventually, I think somewhere on Stack Overflow. I searched around several places, so I don't have the exact source "sc
@rich111296 1 year ago
and run the function obvi
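The code in the comment above is cut off, so the following is not the commenter's solution. It is only a hedged sketch of one common approach: setting the Hadoop S3A connector properties (`fs.s3a.endpoint` and related keys) so that Spark talks to an S3-compatible service outside AWS. The helper name, the endpoint URL, and the credentials are placeholders chosen for illustration.

```python
from typing import Dict


def s3a_endpoint_conf(
    endpoint: str, access_key: str, secret_key: str, path_style: bool = True
) -> Dict[str, str]:
    """Build Hadoop S3A settings for a custom (non-AWS) S3 endpoint.

    Path-style access is often needed by S3-compatible stores that do not
    support virtual-hosted bucket URLs; that default is an assumption here.
    """
    return {
        "fs.s3a.endpoint": endpoint,
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        "fs.s3a.path.style.access": str(path_style).lower(),
    }


# Placeholder values, not real credentials or a real service.
conf = s3a_endpoint_conf("https://s3.example.com", "EXAMPLE_KEY", "EXAMPLE_SECRET")

# In a notebook where `sc` (the SparkContext) is predefined, the settings
# could then be applied roughly like this (left commented out so the
# sketch runs without Spark installed):
# for key, value in conf.items():
#     sc._jsc.hadoopConfiguration().set(key, value)
# df = spark.read.parquet("s3a://my-bucket/some/path")  # hypothetical path
```

In practice the keys would come from a secret store rather than being written in the notebook.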
@HasanCatalgol 5 months ago
Underrated channel, really quality information.
@sundsrik2154 2 months ago
Beautiful explanation! Loved it
@prarthananeesh 9 months ago
Can we use the lakehouse to replace a transactional system?
@BryanCafferky 9 months ago
See my reply to your question about OLTP.
@gardnmi 2 years ago
I'd love to see an unbiased comparison between Delta Lake, Hudi, and Iceberg.
@BryanCafferky 2 years ago
So would I. lol. Iceberg seems to be Snowflake's version of Lakehouse. Not sure about Hudi.
@BryanCafferky 2 years ago
Looks like Amazon is promoting Hudi.
@wennie2939 1 year ago
Best video on this topic ever!
@prarthananeesh 9 months ago
Is it mainly used for OLAP, or can it be used for OLTP also?
@BryanCafferky 9 months ago
It's meant for data warehousing, i.e., lakehouse = lake + warehouse, so a warehouse on a data lake. OLTP has stringent requirements like high transaction concurrency, referential integrity, etc. Delta logging is done at the file level, whereas SQL databases log at the row level. See my video on Delta logs to get an understanding of what I mean.
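The file-level versus row-level point can be illustrated with a toy model in plain Python. It loosely imitates the `add` and `remove` actions that Delta commit files record, but it is a simplification: real commits carry more fields and action types, and the file names and the `replay_log` helper here are made up for the example.

```python
import json
from typing import List, Set


def replay_log(commits: List[str]) -> Set[str]:
    """Replay newline-delimited JSON commits in order.

    Returns the set of data files that make up the current table version.
    Note that the log tracks whole files, never individual rows.
    """
    live: Set[str] = set()
    for commit in commits:
        for line in commit.splitlines():
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live


# Commit 0: the initial write adds one data file.
commit_0 = json.dumps({"add": {"path": "part-0000.parquet"}})

# Commit 1: updating a single row rewrites the whole file, so the log
# records "remove old file, add new file" rather than a row-level change.
commit_1 = "\n".join(
    [
        json.dumps({"remove": {"path": "part-0000.parquet"}}),
        json.dumps({"add": {"path": "part-0001.parquet"}}),
    ]
)

live_files = replay_log([commit_0, commit_1])
print(sorted(live_files))
```

Because even a one-row change swaps an entire file, many small concurrent writes are costly in this model, which is part of why it suits warehousing workloads better than OLTP.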