I didn't learn how to design a data warehouse from this video. Misleading title. Bad.
@MahadirAhmad 5 months ago
I generally believe this should be called a lakehouse design rather than a warehouse design. Also, replicating the database into the data lake falls under change data capture (CDC). If you push your data into both a queue and a database, it's hard to ensure consistency between the data lake and the database, e.g. in cases like a rollback or a database failure. Typically, the state-of-the-art solution for this type of problem is to rely on the database journal, for instance the binlog or the WAL.
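To make the consistency point concrete, here is a minimal in-memory sketch. It is not a real database or Kafka client: `ToyDatabase`, `dual_write`, and `replay_journal` are hypothetical names made up for illustration. It contrasts pushing to a queue and a database separately with feeding the queue from the database's own journal of committed changes, which is roughly what binlog- or WAL-based CDC does.

```python
class ToyDatabase:
    """Hypothetical stand-in for a database with a commit journal."""

    def __init__(self):
        self.rows = {}
        self.journal = []  # committed changes only, in commit order

    def write(self, key, value, fail=False):
        if fail:
            # Simulates a rollback or crash: nothing is committed or journaled.
            raise RuntimeError("transaction rolled back")
        self.rows[key] = value
        self.journal.append((key, value))


def dual_write(db, queue, key, value, fail_db=False):
    """The application pushes to the queue and the database independently."""
    queue.append((key, value))  # the queue write succeeds first
    try:
        db.write(key, value, fail=fail_db)
    except RuntimeError:
        pass  # the queue already holds an event the database never committed


def replay_journal(db, queue, offset=0):
    """CDC-style: the queue is fed only from the committed journal."""
    queue.extend(db.journal[offset:])
    return len(db.journal)  # next offset to resume from


# Dual write: a rollback leaves a phantom event in the queue.
db_a, queue_a = ToyDatabase(), []
dual_write(db_a, queue_a, "order-1", "paid")
dual_write(db_a, queue_a, "order-2", "paid", fail_db=True)

# Journal-based: the same rollback never reaches the queue.
db_b, queue_b = ToyDatabase(), []
db_b.write("order-1", "paid")
try:
    db_b.write("order-2", "paid", fail=True)
except RuntimeError:
    pass
replay_journal(db_b, queue_b)
```

In the dual-write run the queue ends up with two events while the database holds one row. In the journal-based run the queue mirrors exactly what was committed.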
@interviewpen 5 months ago
Cool, thanks
@MardiLo-l3c 13 days ago
It's indeed a lakehouse design
@richardmccauley9081 5 months ago
Years of experience all packed into 14 minutes. Thank you, sir! As with all your videos, great work.
@interviewpen 5 months ago
Thanks for watching!
@ramielkady938 5 months ago
In today's lesson we explain motor vehicles ... We will go over everything ... But the engine .... DW means show OLTP Schema design vs OLAP
@interviewpen 5 months ago
Our YouTube videos are usually higher level. If you're looking for more in-depth content, we have plenty on interviewpen.com :)
@ramielkady938 5 months ago
@interviewpen I think it is more of a watered-down, average-Joe explanation than a higher-level one. Which is OK, but it should be reflected in the title. "Data warehousing concepts for the average Joe" would be a better title for a generic video whose audience is accountants or libertarians ... If you call it "Design a data warehouse" like you did, professionals will think that you will provide what you said you would ... Which did not materialize ...
@Porkductions 5 months ago
The timing could not be better. I'm about to take on a new role literally about the contents of this video so thank you so much for making this!
@interviewpen 5 months ago
Glad you liked it!
@anegyptiangod7386 11 days ago
Thanks for sharing. But there's one thing I don't understand: what is the point of using a message bus like Kafka or Kinesis if the main goal is to process data at scheduled intervals? Wouldn't it be more cost-efficient to just use an old-school batch processing pipeline?
@interviewpen 8 days ago
Sometimes yes, sometimes no. If we have the ability to process streaming data, this results in less overall data being processed and more up-to-date results. Sometimes this isn't possible, though, and we have to process the data in batch.
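A rough illustration of the "less overall data being processed" point, as a toy sketch rather than a real pipeline. The function names and sample events are invented for this example, and it assumes the batch job recomputes from the full history on each run, which real batch jobs can often avoid by partitioning.

```python
def batch_totals(all_events):
    """Batch style: recompute totals from the full history on every run."""
    totals = {}
    for user, amount in all_events:
        totals[user] = totals.get(user, 0) + amount
    return totals, len(all_events)


def incremental_totals(totals, new_events):
    """Streaming style: fold only the new events into the running totals."""
    for user, amount in new_events:
        totals[user] = totals.get(user, 0) + amount
    return totals, len(new_events)


history = [("ann", 10), ("bob", 5), ("ann", 7)]
arrivals = [("bob", 3), ("cy", 1)]

# Batch job rerun after the new events land: reads all 5 records again.
batch_result, batch_touched = batch_totals(history + arrivals)

# Streaming job: state from the first 3 events is kept, so only 2 new records are read.
state, _ = incremental_totals({}, history)
stream_result, stream_touched = incremental_totals(state, arrivals)
```

Both approaches arrive at the same totals. The difference is how many records each run has to read, and how soon the new events show up in the result.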
@gaberial3361 5 months ago
I'm wondering which app you were using for the demo.
@interviewpen 5 months ago
We use GoodNotes on an iPad
@qazyhn94 15 days ago
Which tech or concept is used to put DB changes into the queue? Does PG natively support this? It's also scary, since the two can't be allowed to get out of sync.
@interviewpen 8 days ago
Usually we'd dump things into Kafka at the same time they're being dumped into the database. But this is highly dependent on the system: some databases support this natively, and sometimes it's easier to just use a batch job.
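One common way to keep the two from drifting apart is the transactional outbox pattern, which the video does not cover. The sketch below assumes it, using SQLite from the Python standard library as a stand-in for the real database. The table names, `place_order`, and `relay` are hypothetical, and `publish` stands in for whatever queue producer is used. On the "natively" question: PostgreSQL does expose its WAL to external consumers through logical decoding, which is what log-based CDC tools build on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox ("
    "seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT, sent INTEGER DEFAULT 0)"
)
conn.commit()


def place_order(order_id, status, fail=False):
    """Write the row and its change event in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, status))
            conn.execute(
                "INSERT INTO outbox (payload) VALUES (?)",
                (f"order:{order_id}:{status}",),
            )
            if fail:
                raise RuntimeError("simulated failure before commit")
    except RuntimeError:
        pass  # both inserts were rolled back together


def relay(publish):
    """Hypothetical relay: forwards unsent outbox rows to a queue producer."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE sent = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(payload)
        conn.execute("UPDATE outbox SET sent = 1 WHERE seq = ?", (seq,))
    conn.commit()


published = []
place_order(1, "paid")
place_order(2, "paid", fail=True)  # rolled back: no row, no event
relay(published.append)
```

Because the event is written in the same transaction as the row, a rollback removes both. A crash between `publish` and the `UPDATE` could still resend an event, so consumers of such a relay would need to tolerate duplicates.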
@decrypt_key 5 months ago
Surprised that dbt was not mentioned since we're talking about a modern approach👀. Appreciated the video otherwise
@interviewpen 5 months ago
Yep, dbt can certainly be used as an alternative to the solutions mentioned. Thanks!
@bhanuprakashrao1460 20 days ago
Do Kafka consumers also generally scale horizontally? Or are the consumers always unique (different from each other, rather than being replicas)?
@interviewpen 17 days ago
Yes, the consumers are stateless, so scaling them just means replicating them.
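One caveat worth sketching: within a Kafka consumer group, each partition is read by only one consumer at a time, so replicas beyond the partition count sit idle. The `assign_partitions` function below is a simplified round-robin illustration invented for this example, not Kafka's actual assignor.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions across the consumer replicas round-robin."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        owner = consumers[p % len(consumers)]
        assignment[owner].append(p)
    return assignment


# Going from 2 to 3 replicas spreads the same 6 partitions more thinly.
two = assign_partitions(6, ["c0", "c1"])
three = assign_partitions(6, ["c0", "c1", "c2"])

# With 8 replicas and only 6 partitions, two replicas get nothing to read.
eight = assign_partitions(6, [f"c{i}" for i in range(8)])
idle = [c for c, parts in eight.items() if not parts]
```

So replicating stateless consumers does scale throughput, but only up to the number of partitions on the topic.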