Data Lakehouse: An Introduction

22,935 views

Bryan Cafferky

1 day ago

Comments
@DenisGorev-xj5hl 1 year ago
It is amazing how concisely you put so much information in one video! Great!
@joselitophala5307 1 day ago
Thanks, Bryan, for these wonderful videos. This one really helped me understand these confusing terms that I sometimes use interchangeably (Data Warehouse, Data Lake, Data Lakehouse, Delta Table), since I didn't have a clear distinction between them. Knowing a bit of their history is a great addition!
@BryanCafferky 23 hours ago
Glad it helps.
@ayandapeter1681 6 months ago
Sir, I just want to say thank you so much. I've gone through many videos but was still confused; you made this crystal clear with your conceptual approach.
@BryanCafferky 6 months ago
Thank you for the kind words. I'm so glad my videos are helping you. That's why I do them. I know this technology is not easy to learn, so kudos to you for sticking with it.
@sujithravindran7082 1 year ago
I really enjoyed the perspective you brought into the evolution. Great work. Please keep bringing in these great videos. Thank you very much.
@BryanCafferky 1 year ago
Thank you! And you're welcome.
@potnuruavinash 7 months ago
Can we implement a data lakehouse with open source tools like Spark, Presto, and Hive Metastore? Is there any alternative to Unity Catalog in the open source ecosystem?
@BryanCafferky 7 months ago
Lakehouse is just Delta Lake, i.e., Delta tables, which are available in open source Spark, so yes. Unity Catalog is really just a catalog of catalogs, so you could build your own central catalog by extracting the metadata from local Hive metastores. I believe Spark tends to work one cluster at a time, unlike Databricks, which spins up any number of clusters as needed, so I'm not sure whether UC could be implemented on open source Spark, but perhaps?
@rich111296 2 years ago
Do you have an example in any of your videos of connecting to an S3 bucket by specifying an endpoint within Databricks? Basically, how to connect to an S3 bucket from a service other than AWS? Thanks
@BryanCafferky 1 year ago
Hmmmm... No, I have not tried that. Have you googled it?
@rich111296 1 year ago
@BryanCafferky Yeah, ha, I did find a solution eventually, I think somewhere on Stack Overflow. I searched around several places, so I don't have the exact source. "sc
@rich111296 1 year ago
And run the function, obviously.
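The snippet quoted in the reply above is cut off, so the following is not a reconstruction of it. It is only a minimal sketch of one common way to point Spark's S3A connector at a non-AWS, S3-compatible endpoint, using the standard Hadoop `fs.s3a.*` settings. The helper names, the endpoint URL, and the credential placeholders are all hypothetical, and whether setting these through the Hadoop configuration is sufficient on a given Databricks workspace is an assumption to verify.

```python
def s3a_endpoint_settings(endpoint, access_key, secret_key, path_style=True):
    """Build Hadoop S3A settings for an S3-compatible endpoint (hypothetical helper)."""
    return {
        "fs.s3a.endpoint": endpoint,
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        # Many non-AWS object stores expect path-style URLs rather than
        # virtual-hosted bucket names; this default is an assumption.
        "fs.s3a.path.style.access": str(path_style).lower(),
    }


def apply_to_spark(spark, settings):
    """Copy the settings into the Hadoop configuration of an existing SparkSession."""
    hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
    for key, value in settings.items():
        hadoop_conf.set(key, value)


# Placeholder values only; real credentials belong in a secret store, not in code.
settings = s3a_endpoint_settings(
    "https://storage.example.com", "PLACEHOLDER_KEY", "PLACEHOLDER_SECRET"
)
```

In a notebook where a `spark` session already exists, `apply_to_spark(spark, settings)` would be called once, after which paths of the form `s3a://bucket/path` should resolve against the configured endpoint.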
@HasanCatalgol 5 months ago
Underrated channel, really quality information.
@sundsrik2154 2 months ago
Beautiful explanation! Loved it
@prarthananeesh 9 months ago
Can we use the lakehouse to replace a transactional system?
@BryanCafferky 9 months ago
See my reply to your question about OLTP.
@gardnmi 2 years ago
I'd love to see an unbiased comparison between Delta Lake, Hudi, and Iceberg.
@BryanCafferky 2 years ago
So would I. lol. Iceberg seems to be Snowflake's version of Lakehouse. Not sure about Hudi.
@BryanCafferky 2 years ago
Looks like Amazon is promoting Hudi.
@wennie2939 1 year ago
Best video on this topic ever!
@prarthananeesh 9 months ago
Is it mainly used for OLAP, or can it be used for OLTP as well?
@BryanCafferky 9 months ago
It's meant for data warehousing, i.e., lakehouse = lake + warehouse, so a warehouse on a data lake. OLTP has stringent requirements such as high transaction concurrency, referential integrity, etc. Delta logging is done at the file level, whereas SQL databases log at the row level. See my videos on Delta logs to get an understanding of what I mean.
@BryanCafferky 9 months ago
Delta Logs 1: kzbin.info/www/bejne/pnSrkKSEptOgqZI Delta Logs 2: kzbin.info/www/bejne/kIS3e3-cj96SftU
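To illustrate the file-level versus row-level logging point in the reply above, here is a toy sketch in plain Python. It is not Delta Lake's implementation: the class `FileLevelLog`, its methods, and the simplified action records are hypothetical, and real Delta log entries carry more fields. It only shows the idea that changing a single row means rewriting a whole data file and logging a "remove" of the old file plus an "add" of the new one.

```python
import json


class FileLevelLog:
    """Toy Delta-style log: each commit lists whole data files added or removed."""

    def __init__(self):
        self.commits = []    # each commit is a JSON string of file-level actions
        self.files = {}      # file name -> list of rows (stands in for Parquet files)
        self._next_file = 0

    def _write_file(self, rows):
        name = f"part-{self._next_file:05d}.parquet"
        self._next_file += 1
        self.files[name] = list(rows)
        return name

    def append(self, rows):
        """Write new rows as one new file and log a single add action."""
        name = self._write_file(rows)
        self.commits.append(json.dumps([{"add": name}]))
        return name

    def update(self, file_name, key, new_row):
        """Change one row: rewrite the whole file, then log remove + add."""
        rewritten = [new_row if row["id"] == key else row
                     for row in self.files[file_name]]
        new_name = self._write_file(rewritten)
        self.commits.append(json.dumps([{"remove": file_name}, {"add": new_name}]))
        return new_name

    def live_files(self):
        """Replay the log to find the files that make up the current table version."""
        live = set()
        for commit in self.commits:
            for action in json.loads(commit):
                if "add" in action:
                    live.add(action["add"])
                else:
                    live.discard(action["remove"])
        return sorted(live)


log = FileLevelLog()
first = log.append([{"id": 1, "qty": 5}, {"id": 2, "qty": 7}])
second = log.update(first, key=1, new_row={"id": 1, "qty": 6})
print(log.live_files())
print(log.commits[-1])
```

A row-level log, as in a typical OLTP database, would instead record only the change to row 1. Rewriting and swapping whole files for each small change is one reason many small concurrent writes are costly in this design.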
@brokejohnnylive1530 6 months ago
Dude, you are on the money! Agree 100%.
@janni9789 2 years ago
Again, perfectly explained. Thank you
@antwanto 3 months ago
Wow, that was very informative and amazing. Thank you for your efforts.
@ioannisnikolaospappas6703 5 months ago
Life saver 🫡 Thank you sir!
@BhaveshKumar-dz8hq 9 months ago
You are a hidden gem.
@stu8924 1 year ago
Thank you Bryan.
@BryanCafferky 1 year ago
You're welcome Stu.
@WeAreTeamNovus 1 year ago
Amazing stuff, as always!
@BryanCafferky 1 year ago
Thank you!
@jayashreetheagarajan2708 1 year ago
Amazing content. Thank you, Bryan.
@BryanCafferky 1 year ago
You're welcome! Glad it is helpful!
@maheshthati1320 1 year ago
Best explanation
@GILLOS21 1 year ago
Amazing lecture! Thank you!
@BryanCafferky 1 year ago
You're welcome!
@ChristianWDegn 1 year ago
Good presentation. Thanks!
@BryanCafferky 1 year ago
YW!
@avishaysebban1515 1 year ago
You're the best, thank you.
@BryanCafferky 1 year ago
You're welcome! Thanks for watching.
@kamalesht5942 1 year ago
Your videos are really helping me improve my core knowledge of data engineering concepts. Thank you!
@BryanCafferky 1 year ago
Great to hear! You're welcome.