What Is The Modern Data Stack - Intro To Data Infrastructure Part 1

32,146 views

Seattle Data Guy

A day ago

To put things simply, the modern data stack (MDS) is a set of tools that power data integration. In order, these tools include a fully managed data pipeline for “extract, load, and transform” (ELT) processes, along with a cloud-based data lake or columnar warehouse as the data’s destination. On top of that, the MDS requires a data transformation tool and a visualization or business intelligence (BI) platform so your company can make use of all the numbers.
The primary difference between a modern data stack and a legacy data stack lies in the host. The modern data stack is hosted in the cloud and, as a result, requires very little technical intervention from its users. The modern data stack almost guarantees end-user accessibility, while the company at large can scale quickly without the expensive downtime associated with expanding the server room that supports a legacy data stack.
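As a rough illustration of the ELT order described above, here is a minimal Python sketch. It is not taken from the video: an in-memory SQLite database stands in for the cloud warehouse, and the table names, sample rows, and function names are all hypothetical. The point is only the sequence: raw data is loaded first, then transformed with SQL inside the warehouse, then read by a reporting layer.

```python
import sqlite3

# Hypothetical source records; a managed pipeline would pull these from an app or API.
SOURCE_ROWS = [
    {"order_id": 1, "customer": "acme", "amount_cents": 1250},
    {"order_id": 2, "customer": "acme", "amount_cents": 800},
    {"order_id": 3, "customer": "globex", "amount_cents": 4300},
]


def extract_and_load(conn, rows):
    """EL step: copy raw rows into the warehouse without reshaping them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_id INTEGER, customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :customer, :amount_cents)",
        rows,
    )


def transform(conn):
    """T step: build a reporting table with SQL that runs inside the warehouse."""
    conn.execute("DROP TABLE IF EXISTS customer_revenue")
    conn.execute(
        "CREATE TABLE customer_revenue AS "
        "SELECT customer, SUM(amount_cents) / 100.0 AS revenue "
        "FROM raw_orders GROUP BY customer"
    )


def report(conn):
    """BI step: read the transformed table the way a dashboard query would."""
    query = "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
    return conn.execute(query).fetchall()


def run_pipeline():
    # SQLite in memory is an assumption standing in for a cloud warehouse.
    conn = sqlite3.connect(":memory:")
    extract_and_load(conn, SOURCE_ROWS)
    transform(conn)
    return report(conn)


if __name__ == "__main__":
    for customer, revenue in run_pipeline():
        print(customer, revenue)
```

In a real stack each function would typically be a separate managed tool, which is why the raw table is kept untouched and the transformation is a distinct, repeatable step.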
If you enjoyed this video, check out some of my other top videos.
What Skills Do Data Engineers Need?
• What Skills Do Data En...
Data Engineering Project Ideas
• 5 Data Sources for You...
If you want to learn more about machine learning, check out DataCamp's machine learning course
bit.ly/3BeLEml
If you'd like to read up on my updates about the data field, then you can sign up for our newsletter here.
seattledataguy...
Or check out my blog
www.theseattle...
Tags: Data engineering projects, Data engineer project ideas, data project sources, data analytics project sources, data project portfolio
_____________________________________________________________
Subscribe: / @seattledataguy
_____________________________________________________________
About me:
I have spent my career focused on all forms of data. I have focused on developing algorithms to detect fraud, reduce patient readmission and redesign insurance provider policy to help reduce the overall cost of healthcare. I have also helped develop analytics for marketing and IT operations in order to optimize limited resources such as employees and budget. I privately consult on data science and engineering problems, both solo and with a company called Acheron Analytics. I have experience both working hands-on with technical problems and helping leadership teams develop strategies to maximize their data.
*I do participate in affiliate programs. If a link has an "*" by it, then I may receive a small portion of the proceeds at no extra cost to you.

Comments: 94
@LukeBarousse 2 years ago
If this intro doesn't convince you that Data Engineers are going to be one of the top most needed jobs for the foreseeable future... I don't know what will! Great content, Ben!
@SeattleDataGuy 2 years ago
It's a wild world right now. So many tools, here's to hoping I can make some sense of it all.
@letechnicaljames 2 years ago
True.
@DataProfessor 2 years ago
Exactly and this channel is the place to be for learning about this exciting area 😆
@SeattleDataGuy 2 years ago
@DataProfessor You're too kind!
@sndselecta 2 years ago
@SeattleDataGuy It's funny you say that, because so many companies hiring DEs these days put on this act as though their data stack is the holy grail, when in reality they have just evolved from vendor lock-in to vendor pathing. It's good to be humble and take a step back to actually recognize the onslaught of too many options or ways to do the same thing. It is a good thing (competitive pricing, anti-vendor lock-in, etc.) but also a bad thing (loss of focus, being overwhelmed on where to start, endlessly learning different vendor paths). I think your 3 layers are a great start for the base without getting overwhelmed by vendor marketing BS. Hats off for trying to hold the raging bull by its horns. Looking forward to more material.
@antonkostov1691 2 years ago
Hello again, brother. I want to brag again. After successfully snagging a DE job inspired by you, I have now obtained the DP-203 Microsoft Certified DE. Thanks again, brother, for helping me make the big step a year and a half ago.
@SeattleDataGuy 2 years ago
Congrats. That's so exciting. I am glad you're continuing to grow. How is the new DE job going?
@career-calling 11 months ago
Happy to finally find a channel that talks about data 360 degrees. Thank you for your effort to bring this to the audience.
@SeattleDataGuy 10 months ago
Thank you! Hopefully you're finding it helpful.
@rachelzhang5709 2 years ago
Finally!!!! Been looking forward to this series to have a more concrete and contextual understanding of data infrastructure.
@SeattleDataGuy 2 years ago
Yeah, I am very excited! I actually think the next video might be something like "x types of data stacks" and then the EL video. We shall see.
@ekta_r7417 2 years ago
Really looking forward to this series. Would love it if you would discuss an end-to-end data infrastructure design and walk us through the thought process of selecting the tools for each part across different use cases. Thank you again for guiding us through your videos. They are a big help!☺️
@matiaspirovanovarela1241 2 years ago
Thanks a lot for the video, the series has a lot of potential. May I request that you talk about cost-effective tools? Maybe a sample stack for companies of different sizes or maturity (like the Analysts chart that you showed).
@SeattleDataGuy 2 years ago
Yeah, I love this. There is the open source data stack, for example. If I hear Postgres, Metabase and Airflow, I know the company is trying to keep costs low or likes engineering.
@GuyThompsonFWTX 2 years ago
Great video! Can't wait to see where this series goes.
@SeattleDataGuy 2 years ago
Yeah, so many tools to go over. VCs got to chill
@Clever_viper 2 years ago
Great video, Ben! It is nice to hear this in simple words, so it helps me transmit such ideas to my team and the stakeholders, hopefully in a way the latter will understand.
@SeattleDataGuy 2 years ago
I am glad you found it helpful! I hope your stakeholders can also understand the value of building a reliable data stack.
@pushpanthkumar9028 2 years ago
I love this. Ben, please consider making videos on how you monitor data applications and on best practices for ensuring data quality.
@SeattleDataGuy 2 years ago
Yup! I have a few tools I like in that space.
@reneeh9132 1 year ago
Super useful chart for describing the data stack!!
@SeattleDataGuy 1 year ago
Glad you found it useful!
@eth6706 2 years ago
Perfect choice for a series! Looking forward to it
@SeattleDataGuy 2 years ago
Me too! Thanks for your support.
@andi93007 2 years ago
This is amazing, sir. Thank you in advance for the series! It will definitely help all of us keep up with the changing landscape.
@SeattleDataGuy 2 years ago
Watch half the videos I make be out of date 3 weeks after I put them out.😅
@tethadam4929 2 years ago
Another fantastic video. Thanks in advance for all your hard work!
@SeattleDataGuy 2 years ago
Thank you!
@letechnicaljames 2 years ago
Insightful video. Looking forward to the rest of this series!
@SeattleDataGuy 2 years ago
Thanks, I look forward to it!
@shatandv 2 years ago
Excited for this. Thanks!
@SeattleDataGuy 2 years ago
Me too! I love this high level stuff.
@shatandv 2 years ago
@SeattleDataGuy I'm especially interested in all of this as a startup founder. We're just starting out, but already feel the need for a structured approach to our data.
@SeattleDataGuy 2 years ago
That makes sense. I am also kind of going through a similar walkthrough in my newsletter, and it actually has links to the images. It might be helpful: seattledataguy.substack.com/p/the-baseline-datastack-going-beyond
@shatandv 2 years ago
@SeattleDataGuy Sounds interesting, thanks! I’ll give it a read
@Alez101010 2 years ago
Such a useful video! I’m so excited to see what’s next in this series of videos! Thanks for your work
@SeattleDataGuy 2 years ago
Yup, I am very excited for this series
@GiasoneP 2 years ago
Great video. Looking forward to you finishing this series…and hopefully you finish the DE project series too 🙃
@SeattleDataGuy 2 years ago
Hahaha, the project series I need to restart. However, the next video in this series is completed. I will be posting it towards the end of next week (most likely).
@anathanholland 2 years ago
Looking forward to this series! We're implementing an MDS in my org right now and I think it will greatly improve our efficiency and ability to make data-driven decisions.
@SeattleDataGuy 2 years ago
That's exciting! What tools are you using?
@anathanholland 2 years ago
@SeattleDataGuy Our main tools are Snowflake, Fivetran, dbt, and Looker!
@SeattleDataGuy 2 years ago
Solid stack. Where are you guys at thus far?
@kapuriaritik 2 years ago
Amazing video! Excited for the next part!
@SeattleDataGuy 2 years ago
Me too!
@datawitharslan 2 years ago
Your videos are always very informative and valuable for data lovers. Can you please tell me which you think has the better future for jobs: the Modern Data Stack or the Cloud Data Stack?
@SeattleDataGuy 2 years ago
I think both will have their place. A lot is shifting currently due to funding. It will take 2-3 years for it all to shake out.
@elis8185 2 years ago
Great info! Thanks for the steady stream of super useful videos!!
@SeattleDataGuy 2 years ago
I am doing my best to keep a steady stream going
@N77b44 2 years ago
It would be great to hear more about the testing part of the process. I feel like this gets talked about a lot for more core software engineering but I think data engineering presents unique challenges that make adapting something akin to Test Driven Development far from straightforward (many external dependencies, rapidly changing state, etc).
@SeattleDataGuy 2 years ago
Hmm, I might need to think about this for sure!
@theravitshow 2 years ago
Love it!
@SeattleDataGuy 2 years ago
Thank you!
@obiradaniel 2 years ago
Thank you very much, very insightful.
@SeattleDataGuy 2 years ago
Glad you found it helpful!
@dnn1982 2 years ago
Very valuable video. Do you have plans to make part 2 or continue the series?
@SeattleDataGuy 2 years ago
There are parts 2 and 3 kzbin.info/www/bejne/ooTSdnxun86UabM
@shawnteo3837 2 years ago
Great video! What type of data infrastructure do you recommend for image data?
@SeattleDataGuy 2 years ago
Image data is generally stored in something like S3 or a similar solution. Then you store the URL in the database with metadata.
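A minimal Python sketch of that pattern, for illustration only: the image bytes live in object storage, and the database holds just the URL plus metadata. SQLite stands in for the database, and the table layout, column names, bucket name, and function names are all hypothetical, not something described in the reply.

```python
import sqlite3


def create_catalog(conn):
    """Create a table that stores where each image lives, not the image itself."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "image_id INTEGER PRIMARY KEY, "
        "url TEXT NOT NULL UNIQUE, "
        "content_type TEXT, "
        "width INTEGER, "
        "height INTEGER)"
    )


def register_image(conn, url, content_type, width, height):
    """Record the object-store URL and metadata after the file has been uploaded."""
    cursor = conn.execute(
        "INSERT INTO images (url, content_type, width, height) VALUES (?, ?, ?, ?)",
        (url, content_type, width, height),
    )
    return cursor.lastrowid


def find_wide_images(conn, min_width):
    """Query by metadata; the caller fetches the bytes from the returned URLs."""
    rows = conn.execute(
        "SELECT url FROM images WHERE width >= ? ORDER BY image_id",
        (min_width,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_catalog(conn)
    # Hypothetical bucket and keys; the upload to object storage is not shown.
    register_image(conn, "s3://example-bucket/cats/1.jpg", "image/jpeg", 1920, 1080)
    register_image(conn, "s3://example-bucket/cats/2.jpg", "image/jpeg", 640, 480)
    print(find_wide_images(conn, 1000))
```

Keeping only the URL in the database keeps rows small and lets queries filter on metadata without touching the image files.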
@Marcos-yg2vi 2 years ago
Very nice! This series will help me with what I wrote about on the previous video! Thanks! I don't know if it is the purpose of the series, but if you can show tools for big files, high (distributed and parallel) processing, and professional environments, I would appreciate it! I see a lot of YouTube channels that give just simple (educational) examples, but when you try to apply them in pipelines, in hard life (lol), nothing works!!
@SeattleDataGuy 2 years ago
Yeah, it's always hard to actually create situations where you are processing lots of files, mostly because it starts becoming expensive to store that much data. Overall, it's just always easier to show a "here is a hello world" than a "here is a difficult configuration/scaling issue".
@virginiopancadao 2 years ago
More videos like this!!
@SeattleDataGuy 2 years ago
On a trip this week and I actually have 2 other videos in front of this getting edited...but I will be back next week and filming the next part. You can also check out my newsletter where I am going over similar topics: seattledataguy.substack.com/
@muni7561 2 years ago
Hi! I usually just watch your videos and don't really comment (LOVE your content btw). Not really related to this video, I just wanted to know your opinion about Data Engineering Bootcamps when transitioning from DA to DE. Do you consider that an option for the transition? Thanks!
@SeattleDataGuy 2 years ago
It's hard to say. Any specific bootcamps? They can help. But also, if as a data analyst you can take on data engineering projects, that can also be a great way of shifting careers.
@muni7561 2 years ago
@SeattleDataGuy Thanks for the reply Ben! ☺️ I found LearningFuze’s Data Science Bootcamp appealing since I live in Orange County and could work in person. I'm a recent grad from UCSB with a math major and I am trying to break into the data field as a data analyst and hopefully be a data engineer later in my career. Do you think the bootcamp program would help me break into the field? Thank you for your time and thoughts!! I found your videos to be so inspiring and helpful in guiding me through my tough time!!
@jlm89jlm 2 years ago
Do any of your videos cover managing expectations and timelines as a data engineering consultant? Would love to check that out if it exists!
@SeattleDataGuy 2 years ago
Not currently, but it might have to go on the list
@caseypdx503 2 years ago
Given that the Analytics Engineering role is mostly based on the advent of DBT, do you see that role continuing to be viable? Or in other words, is DBT and the like just a fad? Or should it be considered for the long term?
@loner007 2 years ago
I am interested in this question as well. I have heard that at big companies that already have solid data infrastructure, the data engineers are actually analytics engineers, whereas companies that don't have a solid data infrastructure really need data engineers.
@SeattleDataGuy 2 years ago
I actually plan to respond to a post I read on the data engineering subreddit on some of this.
@jester667 2 years ago
What kind of data observability tools (ideally open source) would you recommend to monitor the data pipelines?
@SeattleDataGuy 2 years ago
I should probably make a video on that!
@Goku-br7yt 2 years ago
It would be great if you could cover something on data observability and how big tech companies implement data quality audit frameworks.
@SeattleDataGuy 2 years ago
It's on the list
@ridhwaans 2 years ago
Where do services like infra-as-code, configuration management, IAM, SSO live in this system?
@SeattleDataGuy 2 years ago
Hopefully with a different team...cries in reality
@caseypdx503 2 years ago
Would it not be true that a business just starting out (low maturity) could just get sources like Fivetran and a cloud data warehouse managed by an analytics engineer and if needed an analyst?
@SeattleDataGuy 2 years ago
There are a whole bunch of ways people can set up their data infrastructure depending on budget, head count, priorities, etc. I do plan to talk through some of these in my next video. I was going to do EL. But I think it would be interesting to discuss different stacks based on priorities, preferences, etc. The example I often give is the "open source stack". This is usually some combo like Postgres, Airflow, Metabase, re_data or datahub...etc vs the analytics engineer data stack. This tends to be Fivetran, Snowflake and Looker. But there are so many other different versions.
@caseypdx503 2 years ago
@SeattleDataGuy Thanks for the reply! If you wouldn't mind one more question? :) Given that the Analytics Engineering role is mostly based on the advent of DBT, do you see that role continuing to be viable? Or in other words, is DBT and the like just a fad? Or should it be considered for the long term? Thanks again!
@SeattleDataGuy 2 years ago
I think there has always been a natural split for data engineers. Some data engineers tend to be more technical and software focused where others are more analytical. So even if the actual title of analytics engineer goes away, I think the role itself will always exist.
@caseypdx503 2 years ago
@SeattleDataGuy Right yeah, I guess a lot of the job descriptions I see are software focused, but I am definitely more analytical. I want to work with the data using technical skills, but I will always prefer to be closer to the data and not so much down the software engineer side.
@dingo-k1n 2 years ago
What does "Core Data" mean? Is it a central data system like EDW?
@SeattleDataGuy 2 years ago
Something of that nature. I generally mean the most granular form of your production tables. So essentially, what everyone else builds their KPIs, reporting, etc off of.
@kjdkmcvkdm 2 years ago
Is the series out yet?
@SeattleDataGuy 2 years ago
Here is part 2 kzbin.info/www/bejne/ooTSdnxun86UabM
@andrewdecotiis-mauro3709 2 years ago
Do you have any resources for data lineage and governance?
@SeattleDataGuy 2 years ago
Do you mean like, which tools are solid? Also for data governance, there is the lightsondata channel.
@andrewdecotiis-mauro3709 2 years ago
@SeattleDataGuy Yeah, what tools, or any sort of books or articles. I'm going to check out lightsondata now
@donchichiumelo2762 1 year ago
Is that a pink quartz crystal.... it's huge!!!!
@ankush_chatterjee 2 years ago
Also, along with the new guys, Informatica went IPO with around a $10B valuation
@SeattleDataGuy 2 years ago
Yeah! For real. And I haven't even ever worked with it. There is so much money going into the space.
@sylwiaanna2423 1 year ago
I haven't seen a single comment about the Naruto reference, so here it is!