I have done absolutely nothing with programming or code, but I’m super interested. My question is: where do I start my journey to become a data engineer? I want to be as efficient as possible. I recently turned 40, and I think I would enjoy this career field, as I want to move to less physically demanding work. Do I take a course? I won’t do well just learning willy-nilly. I need guidance and a path. I’d love to hear back from you. Thanks.
@ayehavgunne9 күн бұрын
I love refactoring
@tutacat29 күн бұрын
It is great that our definition of artificial intelligence is "what computers cannot yet do".
@passportbro904Ай бұрын
Please answer: how much SQL is enough? If I know subqueries, is that enough, or do I also need window functions, etc.?
@Ikilledthebanks2 ай бұрын
Thank you for all the excellent content
@MichHawkeye2 ай бұрын
I've been looking for just this video for a year. I made it required viewing for my team of 300 within a large manufacturing organization to help both analysts and leaders better understand why we are building a data warehouse. Thank you so much. Your content is brilliant!
@hdebbache20002 ай бұрын
Bro you need to start posting again, your channel will lift off at one point
@markshard2 ай бұрын
In general, OBT (or, as I prefer, OBFT) doesn't scale well. What you end up with is a data model for each front end. A simple change would require a reorg of the OBFT, whereas with a star schema you're just adding a dimension attribute. You can always build an OBFT from a star schema, but the reverse is more difficult.
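A minimal sketch of the point above, assuming a tiny star schema in an in-memory SQLite database. The table and column names (`fact_sales`, `dim_customer`, `dim_product`, `obt_sales`, `segment`) are hypothetical and chosen only for illustration. It shows that the wide table is just a join over the star schema, and that a new dimension attribute is a one-line change on the dimension while the wide table stays stale until it is rebuilt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table and two dimensions.
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE fact_sales   (sale_id INTEGER PRIMARY KEY,
                           customer_id INTEGER, product_id INTEGER, amount REAL);
INSERT INTO dim_customer VALUES (1, 'Acme', 'EU'), (2, 'Globex', 'US');
INSERT INTO dim_product  VALUES (10, 'Widget'), (11, 'Gadget');
INSERT INTO fact_sales   VALUES (100, 1, 10, 25.0), (101, 2, 11, 40.0), (102, 1, 11, 15.0);
""")

# Building the one big table from the star schema is a plain join.
conn.executescript("""
CREATE TABLE obt_sales AS
SELECT f.sale_id, c.name AS customer_name, c.region, p.product_name, f.amount
FROM fact_sales f
JOIN dim_customer c ON c.customer_id = f.customer_id
JOIN dim_product  p ON p.product_id  = f.product_id;
""")

obt_rows = conn.execute("SELECT * FROM obt_sales ORDER BY sale_id").fetchall()

# Adding a dimension attribute touches only the dimension table...
conn.execute("ALTER TABLE dim_customer ADD COLUMN segment TEXT")

# ...while the wide table does not see it until it is rebuilt.
obt_columns = [row[1] for row in conn.execute("PRAGMA table_info(obt_sales)")]
```

Going the other way (recovering clean dimensions from `obt_sales`) would mean deduplicating repeated attribute values and inventing keys, which is why the reverse direction is harder.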
@Leo-DatabaseConsultant2 ай бұрын
We can use Data Vault (incremental loading) + Inmon (EDW) + Kimball (star schema) + Data Lake (ELT = Bronze, Silver and Gold data movements) methodologies and use all of them at the same time. The core will be Kimball + Inmon.
@EddieVanWilder2 ай бұрын
What I like about this video is that it's a high-level tutorial, which turns out to be great value for the time invested. I didn't have to spend an hour to understand the process. Great job making it!
@manjeetsingh-uc3cx2 ай бұрын
Good
@nachiketarout3 ай бұрын
Short, simple, crisp and fruitful
@Dipanki-c7k3 ай бұрын
Which software are you using to edit videos?
@Ilex-09223 ай бұрын
Very helpful, thank you so much <3
@jonafic3 ай бұрын
I just came across your channel and loved your content! Underrated, man. You deserve way more subscribers.
@ChatGPT-ef6sr3 ай бұрын
Please don’t stop posting bro. Very real and good content
@Sneeaaakkkkoooo3 ай бұрын
Is the salary a yearly salary or not?
@reanwithkimleng4 ай бұрын
Hello sir what is the difference between digital government and e-government?
@MarkLemmen884 ай бұрын
Thanks for sharing the video. Yesterday I made a short video about the similarities between ice cream and data governance: kzbin.info/www/bejne/Y3KxnneXn7VqiMk
@kvbd27104 ай бұрын
Thank you for such a wonderful video. For some reason, the subtitles for this video got messed up and I am unable to get them in English. They are defaulting to Vietnamese. Please help.
@mahiaravaarava4 ай бұрын
The future of data lies at the intersection of analytics and machine learning. While traditional analytics provides valuable insights through historical data analysis, machine learning offers advanced predictive capabilities and automation for handling complex datasets. Combining both approaches will drive more accurate, actionable insights and innovations in data-driven decision-making.
@JuanHernandez-pf6yg5 ай бұрын
Useful. Thank you.
@alireza22955 ай бұрын
This was the best explanation of Apache Spark architecture that I found on YT. Thank you.
@sebastianlozano77075 ай бұрын
I like refactoring my own mess, hahah
@timoyang74386 ай бұрын
Thanks for the great explanation. I was so overwhelmed by the many concepts in the IBM course on Coursera, and suddenly those concepts make sense.
@passportbro9046 ай бұрын
just found ur channel, doing a data science degree but learning data engineering self taught, this channel is gold. subbed
@crimcrammoo6 ай бұрын
10 years from now we are going to cringe at having called all this “AI”. It’s like calling a plane a rocket ship.
@rmcgraw79436 ай бұрын
I’m an Enterprise Technical Architect with 20+ years of experience, and I have been an architect in many arenas: first Data, then Application, then Network, then Security, then Infrastructure, and so forth. I have NEVER worked at any organization where an architect was hired before the organizational processes were a cluster F, most often caused by a complete lack of process definitions and/or technical implementation knowledge. They always attempt to make a developer do architecture, who predictably fails, before they are willing to incur the cost of an architect.
@jessiehopper6 ай бұрын
I totally agree with diagramming the platform when you're new on a team. I've been doing this and every time I get great feedback from the team!
@jessiehopper6 ай бұрын
Having a good PO / PM is really underrated in my opinion.
@KaiserX20246 ай бұрын
Thank you, Ms. Navarro! The video helped a lot!
@JimRohn-u8c6 ай бұрын
What about 1 - 2 DevOps people?
@abdullahalsqoor28936 ай бұрын
Engineering team duties: architecture, infra, reliability and pipelines
@smrtysam6 ай бұрын
I’ve been building a data team for the past 9 months. It’s been quite challenging, and getting buy-in from the senior leadership team is hard. One of the key roles I’ve managed to fill is a BA to help with our data migrations. I’m still doing the “engineering lead” role alongside my data lead role. Wish me luck for the future.
@tutacat6 ай бұрын
Spreadsheets are cool, but that doesn't make them intelligent. They are using NLP to link ideas together, and sure, it can run through generation like a program interpreter, but that does not make it intelligent; it just does what it is told, based on training and input. We just don't know what it was programmed to do based on training data with self-learning. We think it is just trained to generate text that sounds like a human. Sure, we fed in calculations, etc., but we didn't teach it to think or translate language or anything; those abilities have just emerged from the deep structure of multi-dimensional linking. The training is a bit like a compiler, but we don't truly know what is happening yet.
@malcolmharris73637 ай бұрын
The downside to such openness and transparency (like what is being shown here) is when minorities play back these kinds of videos to each other to show proof of white privilege. (Which I don't believe in.) However, I hope in the end guys like you can be sympathetic to the need for diversity programs, because minorities generally don't have 'origin' stories that include highlights like: 1. "I got the senior job with no skills" and 2. "I job hopped because I was bored and I was viewed by employers as... responsible."
@kamilagendasz31157 ай бұрын
Feels like I finally found a channel with clearly explained data engineering topics in short form. Keep it up!
@firstshield95077 ай бұрын
Excellent video bro!
@HeadStronger-HS7 ай бұрын
I bet the H-1Bs are still there.
@JonathanBiemond7 ай бұрын
Really helpful, practical advice! Thank you
@luisresendez9468 ай бұрын
Your content is great, man, keep it up! You've helped a lot.
@fishsauce74978 ай бұрын
What many fail to realise is that a bad data warehouse is not just bad table structure, but also poor documentation, redundant calculations and unnecessarily complicated ETL (mostly tech debt), all of which make the warehouse unusable and difficult to maintain. I also see the wrong approach taken when creating a warehouse, e.g. just looking at existing reports to create a data model, with no data profiling, no business study, no articulation of data loading rules, and heavy reliance on undocumented assumptions. Eventually the shiny new warehouse built by the modellers is also discarded by analysts as not fit for purpose, because the same mistake is repeated again and again.
@jeanchindeko54778 ай бұрын
4:55 While this is a good introductory video on Apache Iceberg, a few points need to be clarified. Delta Lake is not owned by Databricks; it has been under the Linux Foundation since October 2019, whereas the two other table formats, Hudi and Iceberg, are Apache Software Foundation projects. Apache Kudu, Cassandra and Druid are not table formats, so they cannot and shouldn’t be compared to Apache Iceberg. Databricks doesn’t have a Delta engine. Delta, Hudi and Iceberg are all available on AWS in AWS Glue, Amazon EMR and Amazon Redshift Spectrum; it’s not just Iceberg that is supported there. All three table formats are fully open source, community driven and not owned by any company. All three are supported by the same set of query engines, such as Spark, Flink, Presto, Trino and more. Delta Lake is not supported only by Databricks (a claim you will hear a lot in the Iceberg community) but also by Google BigQuery and Snowflake. The Microsoft Fabric platform is built on top of the Delta Lake table format.
@splashoui37609 ай бұрын
"that's it"
@Solfeggio689 ай бұрын
Everything is “AI” today because greedy people want to monetize it. The same thing happened with “Blockchain” around 2016. Seemingly overnight there were hundreds of startups selling “blockchain” solutions. Don’t buy the hype, literally.
@ol8809 ай бұрын
I dunno man, our management identified that our business may be at risk if nuclear warfare breaks out, so we are building a truly resilient platform which replicates data on Mars. We call it "Storm Data Warehouse", because there's not enough water on Mars to use a Cloud. We're also expecting that every person, dead or yet unborn will be incarnated as an AI consciousness, and thus will be our user, so we have to build high-quality robust pipelines that ingest exabytes of data in real-time, so that the CEO can bask in the glow of twenty-four screens that erupt to life as he wakes up at 4:32am to look at the KPI dashboards to change the world.
@nullQueries9 ай бұрын
But you only have a $3000 budget, and it needs to be done by June
@saisathya77559 ай бұрын
Aio, as a data engineer, idk, but several times I just laughed when you said "moving data from one place to another" 😂 but it was fun
@millionare11929 ай бұрын
Wow..had a lot of fun and it removed anxiety
@Gavin-w4r9 ай бұрын
“In a world deluged by irrelevant information, clarity is power.” Yuval Noah Harari
@TheR0yalBeast9 ай бұрын
Very clean
@pipicovers9 ай бұрын
Steps to make a pipeline better:
1. Good auditing and logging: error handling.
2. Repeatable and identical.
3. Self-healing: find a way to detect the delta, e.g. compare log files, add a data lake before the data warehouse, or add hashes or watermarks before comparing.
4. Decouple EL and T: land in raw format, transform into the DWH, keep reporting tables clean.
5. Always available: truncate and load refreshes faster than update, or build a semantic layer.
6. CI/CD: coded, Git-connected, versioned, rollbacks.
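The "repeatable" and "self-healing" steps above can be sketched roughly as follows. This is only an illustration of hash-based delta detection, not the method from the video: the helper names (`row_hash`, `find_delta`, `apply_delta`), the `id` key and the in-memory dictionary standing in for the warehouse are all assumptions made for the example.

```python
import hashlib
import json


def row_hash(row: dict) -> str:
    """Return a stable fingerprint of a row (keys sorted so order does not matter)."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_delta(source_rows: list, target_hashes: dict, key: str = "id") -> list:
    """Return source rows that are new or whose content differs from the target."""
    return [row for row in source_rows if target_hashes.get(row[key]) != row_hash(row)]


def apply_delta(delta: list, target_hashes: dict, key: str = "id") -> None:
    """Record the loaded rows' hashes; a real pipeline would also upsert the rows."""
    for row in delta:
        target_hashes[row[key]] = row_hash(row)


# Hypothetical source extract and the hashes already stored for the target.
source = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"id": 3, "name": "gamma"},
]
target_hashes = {
    1: row_hash({"id": 1, "name": "alpha"}),     # unchanged
    2: row_hash({"id": 2, "name": "old beta"}),  # changed at the source
}                                                # id 3 is missing in the target

first_run = find_delta(source, target_hashes)
apply_delta(first_run, target_hashes)

# Re-running finds nothing left to do, so a failed or repeated run is safe.
second_run = find_delta(source, target_hashes)
```

Because each run compares against what is actually in the target rather than against "what the last run thought it did", a run that failed halfway simply picks up the remaining rows the next time.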