What do you think of the data vault compared to the dimensional data warehouse? Have you built both? For more Data warehouse options: kzbin.info/www/bejne/ipfJZGegn8SJY5I
@JimRohn-u8c 2 years ago
I would love to see more videos on how to implement this. Wish there was a Udemy course on how to implement this.
@norpriest521 2 years ago
@JimRohn-u8c I love how he mentioned at the end that data vault may not be the best option for some scenarios. This shows that it's not about which one is better, but about which one is more reasonable to use in a specific scenario.
@willi1978 1 year ago
The idea of Data Vault sounds nice. But with an ETL automation tool like WhereScape, ETLs can be adapted very nicely too, and with less overhead.
@AlbertoSimeoni-wi9wj 9 months ago
I think the main problem is the computational power and time needed to build all the link tables. The fact that in the end you build a reporting layer that is effectively a dimensional model negates all that effort. The clear advantage is having the original keys in a staging area and avoiding changes to the extractors. But this was all designed with old row-based, disk-based databases in mind. With an in-memory columnstore database (SAP HANA) the link logic is not necessary; it can all be virtual. We have customers whose entire DWH/BI logic runs on the ERP database, with tables over 100 million rows, all with virtual modeling and no persistence.
@paulheadey265 10 months ago
My data engineering team has built many data vaults but could never quite articulate to me, as a business leader, why. This has been very educational for me in explaining the benefits vs. the complexity. The pace at which business is changing and the number of new data sources becoming available make a data vault seem a more obvious choice. The business still gets its Inmon/Kimball model, but the foundational data structures in the vault provide more capability to make changes to them. That's what I inferred from this. I hope I am on the right track.
@stephanzhechev141 1 year ago
This is a wonderful video. Unfortunately for me, I read 450 pages of Dan Lindstedt's book introducing the Data Vault 2.0 architecture. This is, hands down, the worst book I have ever read. It is just horrible. However, it does contain about 7 good ideas, and this video captures all of them in a nicely presented, coherent way. Thank you!
@CrazySw3de 3 years ago
I enjoy your videos quite a bit, just a few pieces of constructive criticism: I feel like a little more space between sentences, to let the viewer digest what is being said/shown, would help a lot. I like the clean look of the visuals, but the text labels etc. help make things easier to visually process. I think the visual example you did with the tables in this one was good; more real examples like that, showing what these concepts actually look like in the real world, even just as examples, would help drive the points home. Looking forward to seeing your channel grow, keep up the good work!
@nullQueries 3 years ago
Thanks for the feedback. I'm trying to keep these as 5 minute overview videos, which is a challenge with some of these dense topics. Still trying to work out the pacing and how much detail to cram in. I have some ideas for more in depth, slower paced example videos to go along with the overviews. Just need to find the time!
@sued12345 6 months ago
@nullQueries For me you don't need to change anything. I mean, a short video will not replace proper training, but it helps a lot. Thank you for your effort.
@danielolaru2496 2 years ago
I went from the 3NF video to the dimensions one to this one and I feel like the only advantage I see is the dimension/Kimball one. This data vault seems just overkill. The storage will increase exponentially with all the extra keys needed and with very large storage of millions/billions of rows the performance I suppose will be greatly impacted when querying all those keys. Why is this an easier ETL solution? Am I missing something?
@TheR0yalBeast 2 years ago
Hi Daniel, I think a key point to understand about the data vault is that it is exceptionally good at showing lineage. In my view it is only a good solution when you are dealing with many different data sources that need to be combined. A great example is a project I helped on, combining 10 different SAP clients at a manufacturing company. Each is customized slightly: the data may be stored in the same fact table, say sales, but have different indicators or flags etc. modifying it. With the ETL solution you would do a one-off ETL to land it in a standardized table; however, in 4 years you will need to spend weeks of development trying to figure out where the mistakes are and what transformation occurred.
@SamuelLees-jv8ji 1 year ago
I see a lot of advantages with data vault but I just can't see it as an advantage over dimensional warehouse for my business context: e-commerce platform + CRM + billing system + marketing campaign system because all of these sources are quite static. Would be great to get feedback on this.
@husanturdiev 1 month ago
Hi Daniel! Actually, storage cost with Data Vault is on average a lot lower than with dimensional data modeling. I would suggest two main factors for moving to Data Vault: 1. an enormous amount of data, 2. complexity of data and business processes. When building a Data Vault, you create a data model that is change tolerant, i.e. if something changes in the business or in its processes, the data model remains valid, which is not the case for dimensional data modeling. Data Vault is extremely hard and expensive to create but cheap to maintain; a dimensional data model is easy and cheap to create but expensive to maintain in the long run. That is why there are hybrids, data vault + dimensional modeling, where you first model the data in vaults and then model dimensions and facts on top of the data vault.
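To make the hybrid idea in this reply more concrete, here is a minimal Python sketch of deriving a current-state dimension from a hub and its satellite history. Everything in it is an assumption for illustration only: the table shapes, the `build_dim_customer` helper, and the column names (`customer_id`, `name`, `city`, `load_ts`) do not come from the video or the thread.

```python
def build_dim_customer(hub_customer, sat_customer):
    """Flatten a hub and its satellite history into current-state dimension rows."""
    dim_rows = []
    for hash_key, hub_row in hub_customer.items():
        history = sat_customer.get(hash_key, [])
        # The most recently loaded satellite row holds the current attributes.
        latest = max(history, key=lambda r: r["load_ts"]) if history else {}
        dim_rows.append({
            "customer_key": hash_key,
            "customer_id": hub_row["customer_id"],
            "name": latest.get("name"),
            "city": latest.get("city"),
        })
    return dim_rows


# Hypothetical vault contents: one customer whose city changed over time.
hub_customer = {"hk1": {"customer_id": "C-100"}}
sat_customer = {"hk1": [
    {"load_ts": 1, "name": "Acme", "city": "Oslo"},
    {"load_ts": 2, "name": "Acme", "city": "Bergen"},
]}
dim_customer = build_dim_customer(hub_customer, sat_customer)
```

Because the vault keeps the full history, a dimension like this can be rebuilt with different rules later without touching the extractors, which is one way to read the "cheap to maintain" argument above.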
@christopherbronson3275 3 years ago
Can I just say "Dimensional Datamart" is my favorite cyberpunk term
@MrCutlash 2 years ago
Data vault is the curated layer in a data lake, and it has a very specific design... But really it's an Inmon/operational design.
@guillaumegiroux9425 2 months ago
My company is moving from a data lake with a Raw and a Curated zone to a data lake/data vault with a Raw and a Certified zone. We are a huge bank with $9 billion in revenue. I feel it's a big gamble: the current system, while it has governance flaws, isn't that bad, and I wonder whether all the money will genuinely add value. What do you think?
@srikanthmanduri6429 1 year ago
One of the best videos out there on Data Vault modelling.
@pedropradocarvalho 2 years ago
Do you happen to have a transcript of this video, maybe posted in a blog post?
@michaelenriquez_ 3 years ago
Thanks for making these kinds of videos, I really appreciate it. They are so useful for people like me who are learning about this.
@bytedonor 7 months ago
Well explained in pictorial format. But there should be some use case or an example so the newbies can understand more easily.
@Sam-gj4hf 3 years ago
First time watching your videos and I absolutely love them! Subbed and liked. It'd be even more awesome if you could allow for an extra second to digest what you're saying. It's a lot of useful information. But even if you don't change anything, I'll still be a fan! Thank you for this!
@pb78pb 3 years ago
Hi. Thank you for this overview video. Do you have also a webpage where you can be contacted? Would be happy to get your thoughts about DWH automation (we are the creators of the Datavault Builder tool). Regards
@ardee3949 3 years ago
Great videos, very informative. Can you do a quick comparison between Redshift and Vertica? An overall evaluation?
@treelo11 2 years ago
This video is very good, but I need to clarify the ETL process. Suppose I have a few raw files yet to be stored. They are placed inside the data lake unmodified. From there, I insert the data as hubs, link tables and satellite tables into the raw vault, creating surrogate keys along the way. Is that right? And what does 'since objects in each layer never connect to each other' mean? 4:01
@ivani3237 2 years ago
It means that there are no hard foreign keys, but logically they are of course connected.
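As a rough illustration of the loading flow discussed in this exchange, here is a minimal Python sketch that splits one source row into hub, link and satellite entries. It is a sketch under stated assumptions, not the method from the video: the use of MD5 hash keys derived from business keys, the `hash_key`, `empty_vault` and `load_order` helpers, and all table and column names are hypothetical. Note that nothing enforces a foreign key between the structures; they are related only by carrying the same key values.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(*business_keys):
    """Derive a deterministic key from one or more business keys (MD5 is an assumption)."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def empty_vault():
    """Hypothetical in-memory stand-in for raw vault tables."""
    return {"hub_customer": {}, "hub_order": {}, "link_customer_order": {}, "sat_order": {}}


def load_order(row, source, vault, load_ts=None):
    """Load one source row into hubs, a link and a satellite."""
    meta = {"load_ts": load_ts or datetime.now(timezone.utc), "record_source": source}
    cust_hk = hash_key(row["customer_id"])
    order_hk = hash_key(row["order_id"])
    link_hk = hash_key(row["customer_id"], row["order_id"])

    # Hubs: one entry per business key, inserted only the first time it is seen.
    vault["hub_customer"].setdefault(cust_hk, {"customer_id": row["customer_id"], **meta})
    vault["hub_order"].setdefault(order_hk, {"order_id": row["order_id"], **meta})

    # Link: records the relationship by key value only, with no enforced constraint.
    vault["link_customer_order"].setdefault(
        link_hk, {"customer_hk": cust_hk, "order_hk": order_hk, **meta}
    )

    # Satellite: append descriptive attributes only when they differ from the latest version.
    attrs = {"amount": row["amount"], "status": row["status"]}
    history = vault["sat_order"].setdefault(order_hk, [])
    if not history or history[-1]["attrs"] != attrs:
        history.append({"attrs": attrs, **meta})
    return vault
```

Loading the same order twice with a changed status leaves the hubs and the link untouched and adds one new satellite version, so history accumulates without rewriting earlier rows.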
@yogeshbharadwaj6200 2 years ago
Very well explained... thanks a lot.
@nullQueries 2 years ago
Glad it was helpful!
@vidak92 7 months ago
Really, the best explanation.
@kabirsingh6582 2 years ago
Great content..subscribed!
@moverecursus1337 1 year ago
a little bit complex
@SjeetjeMineetje 3 years ago
Very well explained with good examples, this is very helpful!
@galeop 3 years ago
Really good video! Thank you! Quick question: what do you mean by "Business logic"? Do you mean that kind of logic that would be used with an MDM, to control whether new attributes about an entity should be added or ignored (eg if we have conflicting phone numbers for a customer)?
@nullQueries 3 years ago
I'm using Business Logic to represent anytime some sort of business rule alters source data. Sometimes it's explicit (ie: phone numbers are always stored in a certain format). And sometimes it's just tribal knowledge (ie: Some sources call it a customerID and some a consumerID. But everyone in the office knows it's referred to as ClientID. So we'll convert to that naming so it's easy for users to consume. ) A good MDM should handle this but it depends on how it's implemented, what it catches, and where in the architecture it makes the changes. But for the DV this would happen in the business vault layer, as the raw vault should reflect the sources.
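The two kinds of business rule mentioned in this reply can be sketched in a few lines of Python. This is only an illustration under assumptions: the `COLUMN_ALIASES` mapping, the `normalize_phone` and `to_business_vault` helpers, and the 10-digit dashed phone format are hypothetical choices, not rules stated in the video. The raw row is left untouched, mirroring the point that the raw vault should reflect the sources while the rules apply in the business vault layer.

```python
import re

# Hypothetical naming rule: different source names map to the one the business uses.
COLUMN_ALIASES = {"customerID": "ClientID", "consumerID": "ClientID"}


def normalize_phone(raw):
    """Explicit rule (assumed format): store 10-digit numbers as 555-123-4567."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        return raw  # Leave anything unexpected as delivered by the source.
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def to_business_vault(raw_row):
    """Return a new row with business rules applied; the raw row is not modified."""
    out = {COLUMN_ALIASES.get(col, col): value for col, value in raw_row.items()}
    if "phone" in out:
        out["phone"] = normalize_phone(out["phone"])
    return out


raw_row = {"consumerID": "42", "phone": "(555) 123-4567"}
business_row = to_business_vault(raw_row)
```

Keeping the transformation in a separate step like this means a rule can be changed and re-run later against the unmodified raw data.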
@galeop 3 years ago
Thank you!
@thghtfl 1 year ago
All those fancy pictures make zero sense without real live examples, just think about it
@mosa36 2 years ago
Nice video, where can we learn about the other data warehouse format?
@juliustuckayo8973 2 years ago
Great video. I stumbled upon this channel by accident today after reading an opinion piece by Bill Inmon on LinkedIn about why Snowflake isn't a data warehouse. After watching your video on Inmon vs. Kimball I immediately subscribed. Great content! What software do you use for the video animations? Anyway, you've got a new subscriber from Papua New Guinea. Keep it up, and happy Easter.
@nullQueries 2 years ago
Thanks for the compliment! I use the adobe suite for all illustration and animations.