Data Vault vs Traditional Data Warehouse Architectures

  55,557 views

nullQueries

Comments: 39
@nullQueries 3 years ago
What do you think of the data vault compared to the dimensional data warehouse? Have you built both? For more Data warehouse options: kzbin.info/www/bejne/ipfJZGegn8SJY5I
@JimRohn-u8c 2 years ago
I would love to see more videos on how to implement this. Wish there was a Udemy course on how to implement this.
@norpriest521 2 years ago
@JimRohn-u8c I love how he mentioned at the end that data vault may not be the best option for some scenarios. This shows that it's not about which one is better, but about which one is more reasonable to use in a specific scenario.
@willi1978 1 year ago
The idea of Data Vault sounds nice. But with an ETL automation tool like WhereScape, ETLs can be adapted very nicely too, and with less overhead.
@AlbertoSimeoni-wi9wj 9 months ago
I think the main problem is the computational power and time needed to build all the link tables. The fact that in the end you build a reporting layer that is in fact a dimensional model negates all that effort. The clear advantage is having the original keys in a staging area and not having to change the extractors. But all of this was designed with old row-based, disk-based databases in mind. With an in-memory columnstore database (SAP HANA), the link logic is not necessary; it can all be virtual. We have customers whose entire DWH/BI logic runs on the ERP database, with tables of over 100 million rows, all with virtual modeling and no persistence.
@paulheadey265 10 months ago
My data engineering team has built many data vaults but could never quite articulate to me, as a business leader, why. This has been very educational for me in explaining the benefits vs. the complexity. The pace at which business is changing and the number of new data sources becoming available make a data vault seem a more obvious choice. The business still gets its Inmon/Kimball model, but the foundational data structures in the vault provide more capability to make changes to it. That's what I inferred from this. I hope I am on the right mark.
@stephanzhechev141 1 year ago
This is a wonderful video. Unfortunately for me, I read 450 pages of Dan Lindstedt's book introducing the Data Vault 2.0 architecture. This is, hands down, the worst book I have ever read. It is just horrible. However, it does contain about 7 good ideas, and this video captures all of them in a nicely presented, coherent way. Thank you!
@CrazySw3de 3 years ago
I enjoy your videos quite a bit. Just a few pieces of constructive criticism: I feel like a little bit more space between sentences, to let the viewer digest what is being said and shown, would help a lot. I like the clean look of the visuals, but text labels etc. help make things easier to process visually. I think the visual example you did with the tables in this one was good; more real examples like that, showing what these concepts actually look like in the real world, would help drive the points home. Looking forward to seeing your channel grow, keep up the good work!
@nullQueries 3 years ago
Thanks for the feedback. I'm trying to keep these as 5 minute overview videos, which is a challenge with some of these dense topics. Still trying to work out the pacing and how much detail to cram in. I have some ideas for more in depth, slower paced example videos to go along with the overviews. Just need to find the time!
@sued12345 6 months ago
@nullQueries For me you don't need to change anything. I mean, a short video will not replace proper training, but it helps a lot. Thank you for your effort.
@danielolaru2496 2 years ago
I went from the 3NF video to the dimensions one to this one and I feel like the only advantage I see is the dimension/Kimball one. This data vault seems just overkill. The storage will increase exponentially with all the extra keys needed and with very large storage of millions/billions of rows the performance I suppose will be greatly impacted when querying all those keys. Why is this an easier ETL solution? Am I missing something?
@TheR0yalBeast 2 years ago
Hi Daniel, I think a key point to understand about the data vault is that it is exceptionally good at showing lineage. In my view it is only a good solution when you are dealing with many different data sources that need to be combined. A great example is a project I helped on that combined 10 different SAP clients at a manufacturing company. Each is customized slightly: the data may be stored in the same fact table, say sales, but have different indicators or flags etc. modifying it. With the ETL solution you would do a one-off ETL to land it in a standardized table; however, in 4 years you will need to spend weeks of development trying to figure out where the mistakes are and what transformations occurred.
@SamuelLees-jv8ji 1 year ago
I see a lot of advantages with data vault but I just can't see it as an advantage over dimensional warehouse for my business context: e-commerce platform + CRM + billing system + marketing campaign system because all of these sources are quite static. Would be great to get feedback on this.
@husanturdiev 1 month ago
Hi Daniel! Actually, storage cost with Data Vault is on average a lot less than with dimensional data modeling. I would suggest two main factors for moving to Data Vault: 1. an enormous amount of data, 2. complexity of data and business processes. When building a Data Vault, you make a data model that is change-tolerant, i.e. if something changes in the business or in its processes, the data model stays as it is, which is not the case for dimensional data modeling. Data Vault is extremely hard and expensive to create but cheap to maintain; a dimensional data model is easy and cheap to create but expensive to maintain in the long run. That is why there are hybrids, data vault + dimensional modeling, where you first model the data in vaults and then model dimensions and facts on top of the data vault.
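The hybrid approach mentioned above (dimensions and facts modeled on top of the vault) can be sketched roughly as follows. This is only an illustration, not taken from the video: the table names `hub_customer` and `sat_customer`, the column names, and the sample rows are all hypothetical, and plain Python lists stand in for database tables.

```python
from datetime import date

# Hypothetical hub: one row per business key.
hub_customer = [
    {"customer_hk": "a1", "customer_id": "C-100"},
    {"customer_hk": "b2", "customer_id": "C-200"},
]

# Hypothetical satellite: descriptive attributes with full history.
sat_customer = [
    {"customer_hk": "a1", "load_date": date(2023, 1, 1), "city": "Oslo"},
    {"customer_hk": "a1", "load_date": date(2024, 6, 1), "city": "Bergen"},
    {"customer_hk": "b2", "load_date": date(2023, 3, 1), "city": "Lyon"},
]


def build_dim_customer(hub, sat):
    """Derive a current-state customer dimension from a hub and its satellite."""
    # Keep only the most recently loaded satellite row per hub key.
    latest = {}
    for row in sat:
        current = latest.get(row["customer_hk"])
        if current is None or row["load_date"] > current["load_date"]:
            latest[row["customer_hk"]] = row

    # Join each hub row to its latest attributes.
    dim = []
    for h in hub:
        attrs = latest.get(h["customer_hk"], {})
        dim.append({"customer_id": h["customer_id"], "city": attrs.get("city")})
    return dim


dim_customer = build_dim_customer(hub_customer, sat_customer)
```

Because the dimension is derived, it can be rebuilt with different rules (for example, keeping history instead of only the latest row) without touching the vault tables underneath.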
@christopherbronson3275 3 years ago
Can I just say "Dimensional Datamart" is my favorite cyberpunk term
@MrCutlash 2 years ago
Data vault is the curated layer in a data lake, and it has a very specific design... But really it's an Inmon/operational design.
@guillaumegiroux9425 2 months ago
My company is moving from a data lake with a Raw and a Curated zone to a data lake/data vault with a Raw and a Certified zone. We are a huge bank with $9 billion in revenue. I feel it's a big gamble. The current system, while it has governance flaws, isn't that bad, and I wonder whether all that money will genuinely add value. What do you think?
@srikanthmanduri6429 1 year ago
One of the best videos out there regarding Data Vault modelling.
@pedropradocarvalho 2 years ago
Do you happen to have a transcript of this video? Maybe posted in a blog post?
@michaelenriquez_ 3 years ago
Thanks for making this kind of video. I really appreciate it; they are so useful for people like me who are learning about this.
@bytedonor 7 months ago
Well explained in pictorial format. But there should be some use case or an example so the newbies can understand more easily.
@Sam-gj4hf 3 years ago
First time watching your videos and I absolutely love them! Subbed and liked. It'd be even more awesome if you could allow for an extra second to digest what you're saying. It's a lot of useful information. But even if you don't change anything, I'll still be a fan! Thank you for this!
@pb78pb 3 years ago
Hi. Thank you for this overview video. Do you also have a webpage where you can be contacted? We would be happy to get your thoughts about DWH automation (we are the creators of the Datavault Builder tool). Regards
@ardee3949 3 years ago
Great videos, very informative. Can you do a quick comparison between Redshift and Vertica? An overall evaluation?
@treelo11 2 years ago
This video is very good, but I need to clarify the ETL process. Suppose I have a few raw files yet to be stored. They are placed inside the data lake unmodified. From there, I insert the data as hub, link, and satellite tables into the raw vault, creating surrogate keys along the way. Is that right? And what does "since objects in each layer never connect to each other" mean (4:01)?
@ivani3237 2 years ago
It means that there are no hard foreign keys, but logically they are of course connected.
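The load flow asked about above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the method from the video: it uses hash keys as the surrogate keys (a common Data Vault 2.0 convention; sequence numbers are another option), and the table names, column names, and the `split_order_row` helper are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Build a deterministic surrogate key from one or more business keys."""
    # Normalize so that " c-100 " and "C-100" map to the same key.
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def split_order_row(row: dict, record_source: str, load_ts: datetime) -> dict:
    """Split one raw order row into hub, link, and satellite records."""
    customer_hk = hash_key(row["customer_id"])
    order_hk = hash_key(row["order_id"])
    # Every record carries its load time and source, which is what gives lineage.
    meta = {"load_ts": load_ts, "record_source": record_source}
    return {
        "hub_customer": {"customer_hk": customer_hk, "customer_id": row["customer_id"], **meta},
        "hub_order": {"order_hk": order_hk, "order_id": row["order_id"], **meta},
        # The link only stores keys; no foreign key constraint is declared.
        "link_customer_order": {
            "customer_order_hk": hash_key(row["customer_id"], row["order_id"]),
            "customer_hk": customer_hk,
            "order_hk": order_hk,
            **meta,
        },
        # Descriptive attributes are stored as delivered by the source.
        "sat_order": {"order_hk": order_hk, "amount": row["amount"], "status": row["status"], **meta},
    }


raw_row = {"customer_id": "c-100", "order_id": "o-1", "amount": "19.99", "status": "shipped"}
records = split_order_row(raw_row, "shop_export.csv", datetime(2024, 1, 1, tzinfo=timezone.utc))
```

Since each key is computed from the business key alone, hubs, links, and satellites can be loaded independently and in any order. The tables stay logically connected through matching key values even though no hard foreign keys exist between them.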
@yogeshbharadwaj6200 2 years ago
Very well explained... thanks a lot.
@nullQueries 2 years ago
Glad it was helpful!
@vidak92 7 months ago
Really, the best explanation.
@kabirsingh6582 2 years ago
Great content... subscribed!
@moverecursus1337 1 year ago
A little bit complex.
@SjeetjeMineetje 3 years ago
Very well explained with good examples, this is very helpful!
@galeop 3 years ago
Really good video! Thank you! Quick question: what do you mean by "business logic"? Do you mean the kind of logic that would be used with an MDM to control whether new attributes about an entity should be added or ignored (e.g. if we have conflicting phone numbers for a customer)?
@nullQueries 3 years ago
I'm using "business logic" to mean any time some sort of business rule alters source data. Sometimes it's explicit (e.g. phone numbers are always stored in a certain format). And sometimes it's just tribal knowledge (e.g. some sources call it a customerID and some a consumerID, but everyone in the office knows it's referred to as ClientID, so we'll convert to that naming so it's easy for users to consume). A good MDM should handle this, but it depends on how it's implemented, what it catches, and where in the architecture it makes the changes. For the DV this would happen in the business vault layer, as the raw vault should reflect the sources.
@galeop 3 years ago
Thank you!
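The two kinds of business rule described in the answer above (an explicit phone format and the customerID/consumerID naming convention) could look roughly like this when applied in a business vault layer. This is a minimal sketch: the `apply_business_rules` function, the `KEY_ALIASES` mapping, the digits-only phone format, and the sample rows are all assumptions made for illustration.

```python
import re

# Tribal-knowledge rule: different source names map to one agreed name.
KEY_ALIASES = {"customerID": "ClientID", "consumerID": "ClientID"}


def apply_business_rules(raw_record: dict) -> dict:
    """Return a business vault copy of a raw vault record; the input is not modified."""
    out = {}
    for name, value in raw_record.items():
        out[KEY_ALIASES.get(name, name)] = value

    # Explicit rule (assumed format): store phone numbers as digits only.
    if out.get("phone") is not None:
        out["phone"] = re.sub(r"\D", "", out["phone"])
    return out


# Raw vault rows keep the names and formats used by each source.
raw_vault_rows = [
    {"consumerID": "42", "phone": "(555) 010-9999"},
    {"customerID": "43", "phone": "555.010.8888"},
]

business_vault_rows = [apply_business_rules(r) for r in raw_vault_rows]
```

Keeping the rules in a separate step means the raw rows stay exactly as the sources delivered them, so a rule can be changed and replayed later without reloading anything.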
@thghtfl 1 year ago
All those fancy pictures make zero sense without real live examples, just think about it
@mosa36 2 years ago
Nice video, where can we learn about the other data warehouse format?
@juliustuckayo8973 2 years ago
Great video. I stumbled upon this channel by accident today after reading an opinion piece by Bill Inmon on LinkedIn about why Snowflake isn't a data warehouse. After watching your video on Inmon vs Kimball I immediately subscribed. Great content! What software do you use for the video animations? Anyway, you've got a new subscriber from Papua New Guinea. Keep it up, and happy Easter.
@nullQueries 2 years ago
Thanks for the compliment! I use the Adobe suite for all illustrations and animations.