Day In The Life Of A Data Engineer - What Do Data Engineers Do?

  74,352 views

Seattle Data Guy

Comments: 127
@SeattleDataGuy 3 years ago
If you enjoyed this video, then consider signing up for my weekly newsletter! seattledataguy.substack.com/p/airbyte-is-open-source-the-way-forward
@harmoniousT 2 years ago
I appreciate this video so much. This is the real "day in the life" video that I'm looking for. These other videos are almost always shown in a glorified manner. I don't care what you had for lunch, tell me what you do at work in said profession. Thanks again!
@אהלןסהלן 7 months ago
Yup, they aren't a day in the life of a professional, but of a general human🤷‍♀️
@FineFlu 2 years ago
Mic so big they call it Michael
@ThiagoAraujo-ie8gb 1 year ago
Thank you for this video. I've been working as a data engineer intern for 6 months now, and this video shows everything I'm seeing in my job.
@SeattleDataGuy 1 year ago
I am so glad it was helpful!
@hrushikeshshukla3021 3 years ago
Amazing video. I watched your video and it helped me a lot. I'm now a data engineer. Thanks a lot!
@SeattleDataGuy 3 years ago
You're making my day! How are you enjoying being a DE?
@jameshizon4861 3 years ago
My fav forgotten print statement.
@SeattleDataGuy 3 years ago
print("did you make it here")
@gnomesukno 3 years ago
Overall, a very easy to understand explanation video. Good job. Currently I work as an engineering specialist, and my role combines data engineering and data analysis. I feel that having some understanding from a data engineering perspective will help you out tremendously when it comes to analyzing said data.
@SeattleDataGuy 3 years ago
I agree. From both perspectives. If you're a data engineer who understands the goals of a data analyst, or a data analyst who understands the data engineering process, then you can communicate better with the other role.
@vyacheslavmasalitin7949 2 years ago
@SeattleDataGuy Hi guys, thanks for opening up this topic: Data Scientist vs Engineer workflow & duties! Very interesting and informative. As a beginner in the data sector who wants to take an educational course, which position is better to start learning from: Data Science or Engineering? Please advise! Looking forward to your support on that!
@vyacheslavmasalitin7949 2 years ago
+ @Chris Ellis - your opinion is very welcome too 🙏
@ianfelton5056 3 years ago
Although building pipelines is more of a challenge, I am really enjoying Azure Synapse Analytics (serverless SQL) for data engineering. Essentially, there is just a data store (with documents) and a data warehouse definition. Data visualization tools just tap the data warehouse definition, and an engine serves the relevant transformed data from the (virtual?) warehouse. It feels much cleaner than other approaches I have worked with. I could see this being where things are going: businesses have large object document storage, and Google/Amazon/Microsoft sell MPP engines that enable businesses to easily consume their data as if everything was a small SQL database with structured data. Pipelines might still populate the data store, but a lot of the transformation can happen on end-user read.
@SeattleDataGuy 3 years ago
I still haven't got to play in the new Microsoft environment. This sounds a little like Looker in terms of how you have defined models and metrics in their LookML files. Would you say that is accurate? I think many companies are switching to the EL(T) process, where they use a low-code solution like Azure Data Factory or Fivetran just to get the data into a data warehouse and then transform. Do you think this puts data governance at risk? Whenever I hear end-users can set up their own transforms, it gets concerning, because then you have 5 definitions of the same field floating around. So many questions.
@ianfelton5056 3 years ago
@SeattleDataGuy Data governance is definitely a concern. The security side can be handled multiple ways, such as native row-level security or dataset truncating by the client (for embedded analytics). As for multiple definitions of the same field, I think that is more likely to cause havoc. It can directly impact the quality of decision-making, along with the confidence to use the information to make decisions. I have not had a chance to try out Looker yet. I am using Power BI, although, from what I have read, I expect Tableau, Power BI and Looker to have a lot of functional overlap. One of the interesting things about Power BI and Synapse is that a person can define Synapse views in a Power BI query. Synapse can be completely empty of any tables, views or anything resulting in data. In this case, an analyst would strictly use Synapse as an engine to query the data store from a data visualization tool. Naturally, Power BI will not let a person connect front-end parameters to these queries :)
@SeattleDataGuy 3 years ago
Yup! I agree about havoc being created. I have now worked on several projects with this issue. Not to mention, in some cases, low-code projects where analysts were just pulling billions of rows into Looker or Power BI and wondering why it was taking so long to pull in the data. But yes, I am loving a lot of these tools. They are making managing all the different data sources so much easier.
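A minimal sketch of the EL(T) idea from this thread, not anyone's actual setup: raw rows are loaded untouched, the transform lives in one shared view so there is a single definition of the metric, and the dashboard reads the small aggregated result instead of every raw row. The in-memory SQLite database stands in for a warehouse, and the table, view, and column names (`raw_orders`, `revenue_by_region`, `amount_cents`) are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse (an assumption for this sketch).
conn = sqlite3.connect(":memory:")

# Extract + Load: land the raw rows as-is, with no transformation yet.
conn.execute(
    "CREATE TABLE raw_orders (order_id INTEGER, region TEXT, amount_cents INTEGER)"
)
raw_rows = [
    (1, "west", 1250),
    (2, "west", 800),
    (3, "east", 4300),
]
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)

# Transform: one shared view holds the single definition of "revenue",
# so analysts do not each re-derive the field in their own tool.
conn.execute(
    """
    CREATE VIEW revenue_by_region AS
    SELECT region, SUM(amount_cents) / 100.0 AS revenue_usd
    FROM raw_orders
    GROUP BY region
    """
)

# A dashboard would query the aggregated view rather than pull raw rows.
revenue_by_region = dict(
    conn.execute(
        "SELECT region, revenue_usd FROM revenue_by_region ORDER BY region"
    )
)
print(revenue_by_region)
```

Keeping the metric in the warehouse view is one way to avoid the "5 definitions of the same field" problem, and the aggregation runs where the data lives, so the BI tool receives two rows here rather than the full table.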
@GuyThompsonFWTX 3 years ago
I'm in a small shop where I started doing Analyst work (i.e. writing queries, preparing reports for dashboard uploads, troubleshooting alignment between the frontend and the backend), and now I'm learning more about the data engineering aspect of how to manage the pipeline. The biggest problem for me is understanding the legacy code (which is honestly great learning). But I constantly feel like there's a better way, I just don't have the knowledge of how to do it yet. And like other creators have mentioned, there are just so many tools out there now.
@SeattleDataGuy 3 years ago
Sounds about right. The legacy code can be great to learn but it can also be difficult to maintain. There honestly is probably a better way, but I assume it would be time consuming to migrate. What is the code base built off of? I have been spending a lot of time moving some companies away from custom code and over to more maintainable solutions like Fivetran and Stitch or, if they still want to code, Astronomer or Google Cloud Composer.
@GuyThompsonFWTX 3 years ago
@SeattleDataGuy The data and pipeline are managed through GCP using Node.js and running Cloud Scheduler/Cloud Functions/Pub/Sub/etc. A lot of what I do is gathering data from another Accounting Software and trying to duplicate or create custom reports from the raw data. Part of my problem is that the Accounting Software is also a black box, in that they don't have much documentation because they're worried about competitors obtaining source information, so I'm constantly flying blind trying to reverse engineer their tables while also maintaining accounting standards. There's also some HubSpot, Slack, and Node-RED stuff in there as well. I feel that there's already some tooling within GCP that could make this easier to manage using Dataflow and Dataproc. But we're also talking about moving to Prefect to manage the pipelines as well. I wish there was another me: one to manage the day-to-day data analysis work, and another to research the pipeline and options for optimization and improvement.
@SeattleDataGuy 3 years ago
I can at least tell you you're not alone! I think Dataflow or Cloud Composer would be good choices. Also, with HubSpot and Slack you could use a low-code solution, as most low-code solutions have data connectors to manage the big business apps. Are they expecting you to build all of this?
@GuyThompsonFWTX 3 years ago
@SeattleDataGuy I won't build all of it. But once a lot of the analysis backlog (mainly updating SQL queries) is fixed, or as I become more knowledgeable and able to fix things faster, then I'll transition over to more of the DevOps/DE solutions. I'm already troubleshooting and monitoring the current pipelines, which are based off BEAM structure (Source > Staging > DW > Creating Views > import into dashboard), so it's not far-fetched to pick up that task as well, right? This is my first Analyst/Engineering role after a decade in Accounting. I feel like I'm doing okay, always could be faster.
@SeattleDataGuy 3 years ago
Sounds like a great move! It's pretty typical. How are you enjoying the shift over to the DE/Analyst world?
@unclegeorgesspeedandfeed 3 years ago
I wasn't expecting a "day in the life" video so soon. Thanks
@SeattleDataGuy 3 years ago
I had been planning to make it for a while. So ta-da!
@unclegeorgesspeedandfeed 3 years ago
@SeattleDataGuy I'm going to sign up for classes at my local community college. They have three programs: general programming, networking, and cyber security. I want to go into the data field but don't know which program to start with. Any suggestions?
@sng94 1 year ago
Currently I am working as a data engineer, and I spend my days on data migration, ETL dev, AM, and report development.
@SeattleDataGuy 1 year ago
Do you like doing that work?
@ChaeWookKim-vd7uy 3 years ago
Better background!
@SeattleDataGuy 3 years ago
Thanks for noticing! We had fun setting it up
@compton8301 3 years ago
Love the background! 🤌
@SeattleDataGuy 3 years ago
Thank you for noticing 🙏
@compton8301 3 years ago
@SeattleDataGuy Please consider doing a "how to set up a YouTube channel" video. Maybe covering what you use to produce your content: software, hardware, etc.
@SeattleDataGuy 3 years ago
Are you planning to start a channel? Yeah, I want to do that when I finally get a real camera. I am just using my phone.
@compton8301 3 years ago
@SeattleDataGuy Yes. Maybe I should just start instead of trying to save for expensive equipment. My phone's camera is actually good. Which affordable mic, on Amazon, would you recommend?
@SeattleDataGuy 3 years ago
Awesome! Also on data engineering? (Let's grow the community.) Also, I am just using a Blue Yeti mic. The classic affordable option.
@JigneshDave 1 year ago
Informative video, thank you!
@SeattleDataGuy 1 year ago
Glad you enjoyed it!
@hanumanthchinna7676 2 years ago
Your explanation is very fast for a beginner, but it is good and clear.
@SaveThatMoney411 1 year ago
The data janitor said that data engineers don't have many meetings, which is why he likes it over data science and machine learning. Not true? Also, can you get hired as a data engineer without a prior IT job (for example, if you did some machine learning as a biologist and passed some SQL certification exams)?
@RafaelLopez-dp1fd 3 years ago
Per your request I smashed the like button ;p
@SeattleDataGuy 3 years ago
Thank you kind sir!
@salimz.8587 3 years ago
I love your content.
@SeattleDataGuy 3 years ago
Thank you!!!
@rreyes1555 1 year ago
Hello! Thank you for the insightful content. I'd like to ask if archiving data is part of a DE's job? If yes, please make a separate video about it. Thanks.
@liy1561 3 years ago
Ben lookin swole af I see you👀
@SeattleDataGuy 3 years ago
The key is the angle...
@SeattleDataGuy 3 years ago
Also, truthfully I have thought it might be funny to do a video "Data Engineer Vs Data Scientist Vs Data Analyst: Who is Stronger?" and have Ken Jee and Luke Barousse and I do some form of tech bro-y challenge. Just because Ken Jee and Luke Barousse have both put clips of them working out in their videos. So far I think Luke wins with his Muscle Ups.
@jameshizon4861 3 years ago
Exactly. He does pipeline curls.
@SeattleDataGuy 3 years ago
curl -X GET -H 'Content-Type: application/json' 0.0.0.0:5001/api/get/swole
@jameshizon4861 3 years ago
@SeattleDataGuy Lol no
@simple_neko 2 years ago
thank you
@SeattleDataGuy 2 years ago
You're welcome!
@eduh7950 1 year ago
Dude, 5:38 -> "Queiries" is an 8-character string, while the correct word has only 7. Look at the English language data to figure that out. We're engineers, but that doesn't mean we can't spell (ironically, this is a word/term data people use quite a lot :D).
@Zioom234 2 years ago
Now I realize that I was hired as a data analyst, but in fact I mainly do a data engineer's job. Lol
@SeattleDataGuy 2 years ago
That happens a lot. Time to ask for a raise
@jaydecanon1314 2 years ago
samedt
@herbsabeast1 7 months ago
Hey, I was curious about what the workload is. Do they expect you to do x amount of data models per week? In other words, do they just give you the data, say "we want to find x, y and z," and you go out and do it? In a given week do you have to do like thousands of datasets? For example, the volume of things they want done. I am looking for the day to day.
@serikusa 3 years ago
Ben, thanks for the video. Do you have an A-Z course on how to become a data engineer? Or can you recommend one? Thank you
@SeattleDataGuy 3 years ago
Hey Serik! Always good to see you. I think you have seen my data engineering roadmap video, correct? That is kind of a hodgepodge of courses. But perhaps I should put together an A-Z video on DEs.
@SeattleDataGuy 3 years ago
Because I don't know if I know of a good Udemy or Coursera course
@serikusa 3 years ago
@SeattleDataGuy That would be amazing, and you can sell the course as well. I feel there is so much information, and it's hard to find an A-Z course for DEs. Thank you
@SeattleDataGuy 3 years ago
I don't know about selling a course, but maybe! It would take some time to put together a course of everything we do
@compton8301 3 years ago
We're basically janitors of data?
@SeattleDataGuy 3 years ago
I prefer plumbers 😂
@compton8301 3 years ago
@SeattleDataGuy Oh yeah, pipelines! 😅
@SeattleDataGuy 3 years ago
The pipelines, data modeling, and occasionally ad hoc queries.
@duixianzhuanyong1858 9 months ago
Thanks for sharing. I wonder if I can dabble in DE without a statistics background or not. Is it a degree-specific occupation?
@DarshilParmar 3 years ago
Basically we do plumbing work with data 😆
@SeattleDataGuy 3 years ago
Thank you Andreas Kretz for pushing the term!
@andreaskayy 3 years ago
Plumbing of data science 💪 Great video Ben!
@DarshilParmar 3 years ago
@andreaskayy hahaha
@SeattleDataGuy 3 years ago
The gang's all here!!
@dmitriyobidin6049 4 months ago
How many DevOps tasks does an average engineer cover? Is it a data engineer's job to monitor pipelines and be on call?
@oresttokovenko 3 years ago
Hey Ben, I was wondering if you'll be doing another resume review in the near future. Thanks
@SeattleDataGuy 3 years ago
Yes, I will plan to make a post about it in the next day or two
@oresttokovenko 3 years ago
@SeattleDataGuy great, I'll be looking for it!
@SeattleDataGuy 3 years ago
@oresttokovenko Thanks for the support!
@DanielLavedoniodeLima_DLL 3 years ago
Quick question, are you a senior data engineer? If so, how long did it take for you to become one from when you started as a junior?
@SeattleDataGuy 3 years ago
It took me about 7 years to become a Senior data engineer. But that really depends on the company. I have seen people get the senior title straight from college because they had a master's degree. I would worry less about titles and more about the work. The more you chase something, the farther it gets from you. Zach Wilson talks about this on LinkedIn. He couldn't get a senior DE position at FB, quit, took a year off, went to Netflix and got hired as a Sr. SWE, and is now a tech lead at Airbnb.
@phuinh9716 3 years ago
How can we stay up to date with technology news about DE?
@SeattleDataGuy 3 years ago
This is always a challenge. I would say don't get too caught up with every new tool. Pick 1-2 a year to learn, but focus more on what you're working on at work.
@caioreis350 3 years ago
Forgotten print statement xD
@SeattleDataGuy 3 years ago
that's me
@nivretech2072 3 years ago
I'm a Computer Engineering student, can I consider choosing Data Engineer as my career path? Thank you Ben :)
@SeattleDataGuy 3 years ago
If you're into data, then yes! What are your goals?
@nivretech2072 3 years ago
@SeattleDataGuy Yes, I can say right now that I am into it! Data is everywhere and it is something that won't expire in the future. I just watched your video about 'Breaking into Data Engineering and How to be DE'. My goal is to be a Data Analyst first to gain experience and then later be a DE. Also, I might consider being a DE intern after I graduate, but upon searching for DE internships here in my country, there are fewer results. Thank you so much Ben!
@gideonblinick6581 2 years ago
Hey Ben, great video. wrt Data Modelling, is The Data Warehouse Toolkit still the best resource for learning? If I'm not mistaken, it was last updated in 2013, so I'm wondering if there's anything more recent. Thanks!
@gideonblinick6581 2 years ago
Also, it would be great if you could summarize your videos at the beginning or split them up so people could find things in the video better and also recap what they learned.
@SeattleDataGuy 2 years ago
Alright, I can work on that! Splitting up my videos, that is.
@gideonblinick6581 2 years ago
@SeattleDataGuy Thanks, prob easier for you to just divide the videos up. Any comment on my question about Data Modelling? The Toolkit is a long book and very detailed, and I'm assuming it's not necessary to go through all of it. Is going through a couple of Udemy courses on Data Modelling enough to do interviews?
@henrique4171 3 years ago
Hi man, I'm Brazilian and I'm doing a 2.5-year degree in data science. I want to be a data engineer but a data analyst too. Have you heard about analytics engineers? I don't want to be overwhelmed with data engineering and data science. What should I do?
@SeattleDataGuy 3 years ago
Yeah, there are a lot of terms out there. I would focus more on the work you like doing and not the title. Do you prefer analytics or engineering? Because even with analytics engineers, I imagine some people will lean towards either the analytics side of that role or the DE side of that role
@henrique4171 3 years ago
@SeattleDataGuy Idk what I like, omg, but for an online job data analytics is better... right?
@SeattleDataGuy 3 years ago
By online, do you mean remote? Data analytics or DE work can be done remotely pretty easily. Been doing it for 2 years now...thanks to the cough
@henrique4171 3 years ago
@SeattleDataGuy thanks
@SeattleDataGuy 3 years ago
@henrique4171 Thank you for the question! I appreciate all the support.
@malcorub 3 years ago
Does a data engineer work often with software engineers and app developers, or is that more of a database administrator role?
@DanielLavedoniodeLima_DLL 3 years ago
Depends on the company's needs. I've already worked in a company that didn't want to use any third-party paid software outside AWS, so I worked a lot with Python creating connections and managing Airflow and Kafka in EKS and Hadoop in EMR, but now I work at a company that uses a lot of third-party tools and software, and my main focus is modeling data using SQL and dbt for Airflow. Data Engineering roles vary a lot from company to company, but what Ben said in this video is mainly the day-to-day routine of all data engineering jobs I've been in.
@SeattleDataGuy 3 years ago
I agree with Daniel. It really does depend company to company. For example, at large tech companies you will likely work closely with software engineers because they are building products you are pulling from. Whereas in other companies you might be working with external third-party solutions, so you talk more with solutions engineers and sales engineers. It all depends on how close you are with the data!
@SeattleDataGuy 3 years ago
Thank you Daniel for responding! I love it when people share their experiences (besides just me). Like you said, the role of a DE depends company to company. You could work a lifetime and not realize that some DEs do completely different work. So having other perspectives is exactly what this community needs!
@rudrakshgandhar3381 3 years ago
How much SQL is required to begin with data engineering?
@SeattleDataGuy 3 years ago
I would say it is very SQL heavy. So you need to know the basics as well as understand how to re-model data with SQL.
@rudrakshgandhar3381 3 years ago
@SeattleDataGuy Thanks for the reply. Can you make one video comparing AWS, GCP and Azure for data engineering (job market, difficulty, etc.)? I am moving on a path with Azure but just want a heads up regarding what is better. And lastly, keep up the good work!
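To make "re-model data with SQL" a bit more concrete, here is a small sketch under stated assumptions: a flat export is split into a customer dimension and a sales fact table, then queried with a join. It uses Python's built-in `sqlite3` as a stand-in for a real warehouse, and every table, column, and row (`flat_sales`, `dim_customer`, `fact_sales`) is invented for illustration rather than taken from the video.

```python
import sqlite3

# In-memory SQLite stands in for a warehouse (an assumption for this sketch).
conn = sqlite3.connect(":memory:")

# A flat export as it might arrive from a source system (made-up rows).
conn.execute(
    "CREATE TABLE flat_sales ("
    "sale_id INTEGER, customer_email TEXT, customer_name TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO flat_sales VALUES (?, ?, ?, ?)",
    [
        (1, "ana@example.com", "Ana", 30),
        (2, "ben@example.com", "Ben", 45),
        (3, "ana@example.com", "Ana", 25),
    ],
)

# Dimension: one row per customer, with a surrogate key assigned by SQLite.
conn.execute(
    "CREATE TABLE dim_customer ("
    "customer_key INTEGER PRIMARY KEY, email TEXT UNIQUE, name TEXT)"
)
conn.execute(
    "INSERT INTO dim_customer (email, name) "
    "SELECT DISTINCT customer_email, customer_name "
    "FROM flat_sales ORDER BY customer_email"
)

# Fact: one row per sale, pointing at the dimension by key instead of
# repeating the customer's details on every row.
conn.execute(
    "CREATE TABLE fact_sales ("
    "sale_id INTEGER PRIMARY KEY, customer_key INTEGER, amount INTEGER)"
)
conn.execute(
    "INSERT INTO fact_sales "
    "SELECT f.sale_id, d.customer_key, f.amount "
    "FROM flat_sales AS f JOIN dim_customer AS d ON d.email = f.customer_email"
)

# Analysts can now aggregate the fact table and join for readable labels.
totals = conn.execute(
    "SELECT d.name, SUM(s.amount) FROM fact_sales AS s "
    "JOIN dim_customer AS d USING (customer_key) "
    "GROUP BY d.name ORDER BY d.name"
).fetchall()
print(totals)
```

The same pattern (deduplicate into dimensions, keep measures in facts) is roughly what dimensional modeling books describe. A production warehouse would add types, constraints, and incremental loads that this sketch leaves out.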
@Aryan-ji2nk 3 years ago
Hey, it was a great video first of all, but a quick question: can a complete fresher (college student) get an entry-level job as a data engineer?
@SeattleDataGuy 3 years ago
This is challenging. I have a few videos talking about how I got my first DE job. There are some internships and jr. positions. But the other route is usually analyst to DE or SWE to DE. Here is a video on the concept kzbin.info/www/bejne/onjdmWCHmtOXgMU
@shauryajain4851 3 years ago
Are there enough openings for data engineers? I'm planning to pursue a master's in the USA.
@SeattleDataGuy 3 years ago
Yes I believe so
@positivist83 3 years ago
You talk so quickly that it's hard to understand
@malcorub 3 years ago
Audio on YouTube can be sped up or slowed down.
@SeattleDataGuy 3 years ago
This and my upspeak are two areas I am trying to work on. You guys don't even see the times I borderline start saying a new sentence before I finish the last one. I am working on it. Thank you for your patience!
@SeattleDataGuy 3 years ago
Which one do you do, malcorub?