Functional Data Engineering - A Set of Best Practices | Lyft

  78,712 views

Data Council

A day ago

Comments: 25
@moverecursus1337 a year ago
Interesting approach to slowly changing dimensions. Data storage is cheap nowadays; engineering time and cost are a lot more expensive.
@ChristopherSlattery1980 6 years ago
Interesting talk. Nice to see a tech video where they display the slides as well as the speaker.
@boringmanager9559 4 years ago
wow, I started to understand those concepts so much better now
@ariesykes1432 3 years ago
Really interesting video! It definitely relates to what I am working on currently. I work in IT recruiting. I'm always looking for data engineers for my jobs, and this video helps me understand a little more about what data engineers have to face in their field. Thank you!
@VajoLukic 5 years ago
It is such a fantastic talk! It summarizes all the good practices for modern data management and puts them into the right perspective so that "old school" BI/DW people can understand.
@millerblaine2047 3 years ago
i dont mean to be so off topic but does anyone know of a trick to log back into an Instagram account?? I stupidly lost the login password. I would appreciate any tips you can offer me!
@nhuwlaftooi a year ago
Really great talk! It helped me understand many new concepts.
@cyclogenisis 2 years ago
Overall not a bad presentation. However, I do not think using Presto with 1-million-row daily snapshots of dimensions is viable (specifically out of the box). Presto likes to put everything joined to the main fact table into memory, so modelling this way isn't really in line with how Presto was meant to be used if you're joining all the records each day. But in general, I do like the shift toward keeping everything in dimensions if the technology allows for it. Edit: The Q&A section spoke more to this; he adapted his answer to "apply common sense".
@karangupta_DE 2 years ago
A persistent staging area might only be effective when there is one place where your raw data resides. For example, if you stage your data on AWS S3 and then copy the data again to Snowflake, you will end up with two storage locations holding the same raw data.
@lbb2rfarangkiinok 2 years ago
Talked about a lot of interesting stuff, but I had a hard time really figuring out what is meant by functional engineering. Each individual topic was very clear to me, but not the red thread linking it all together.
@chinmayarankalle4389 2 years ago
Great efforts thanks @maxime
@striker865 6 years ago
Awesome talk! Thanks for taking the time to share some best practices from the top talent!
@lambuth 5 years ago
Great talk. And quite the resemblance to the singer from 311.
@Sowji.kilaru a month ago
Wondering if anyone can help with a question. Would this still be applicable to a star schema model? If I snapshot all my dimensions and create a date partition, should I add the date to the primary key, that is, surrogate key + date? This would be the FK in the fact table. I am not clear how the date from each dimension table would be linked in the fact table.
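One possible reading of the question above, as a minimal sketch: if each dimension is stored as a full daily snapshot, the fact table can keep only the surrogate key plus its own date partition, and the join uses the pair (date partition, key). The table names, columns, and values below are hypothetical, and plain Python dicts stand in for partitioned tables; this is an illustration under those assumptions, not something taken from the talk's slides.

```python
from datetime import date

# Hypothetical daily snapshots of a supplier dimension:
# one full copy of the dimension per date partition, keyed by (ds, supplier_key).
dim_supplier = {
    (date(2024, 1, 1), 10): {"name": "Acme", "tier": "silver"},
    (date(2024, 1, 2), 10): {"name": "Acme", "tier": "gold"},
}

# Hypothetical fact rows: they carry the surrogate key and their own date
# partition (ds), but no separate date column per dimension.
fact_orders = [
    {"order_id": 1, "supplier_key": 10, "ds": date(2024, 1, 1), "amount": 50},
    {"order_id": 2, "supplier_key": 10, "ds": date(2024, 1, 2), "amount": 75},
]


def join_as_of_event(facts, dim):
    """Attach dimension attributes as they were on each fact's own date."""
    return [{**f, **dim[(f["ds"], f["supplier_key"])]} for f in facts]


def join_latest(facts, dim):
    """Attach dimension attributes from the most recent snapshot only."""
    latest = max(ds for ds, _ in dim)
    return [{**f, **dim[(latest, f["supplier_key"])]} for f in facts]


as_of = join_as_of_event(fact_orders, dim_supplier)
current = join_latest(fact_orders, dim_supplier)
```

In this sketch the fact's own partition date doubles as the date part of the join, so no extra date column per dimension is needed; whether that fits a given star schema depends on the warehouse and the query engine.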
@hgiagiamou 6 years ago
Great talk!
@qwaszx822 5 years ago
Thanks a lot for the video. I have a question. For modern data warehouse solutions, have any other data models emerged apart from star schema and snowflake?
@dragonfly4484 5 years ago
data vault and incremental approach are kind of emerging as norms in some quarters
@sspaeti 6 years ago
Thanks for sharing your great knowledge with us again. One question about snapshotting, let's say daily: what if multiple changes happen in dim_supplier within a day? You wouldn't catch those. Would that just be a small tradeoff you would accept, or would you have something else in mind to track that?
@mistercrunch 6 years ago
If keeping track of intra-day changes is important I'd recommend denormalizing the dimension attribute into the fact table. Conceptually if some dimensional attributes are "flickering", meaning going back and forth and changing fairly often, it's pointing towards that attribute being logically attached to the fact and not to the dimension.
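A minimal sketch of what denormalizing a "flickering" attribute into the fact table could look like: the attribute's value is copied onto each fact row at the moment the row is written, so intra-day changes are preserved even though the dimension is only snapshotted daily. The field names (`supplier_key`, `status`, `supplier_status`) and the in-memory dicts are assumptions made for illustration.

```python
def enrich_fact(event, current_dim):
    """Copy the fast-changing attribute onto the fact row at write time."""
    supplier = current_dim[event["supplier_key"]]
    return {**event, "supplier_status": supplier["status"]}


# Hypothetical live state of the supplier dimension during one day.
current_dim = {10: {"status": "active"}}

facts = [enrich_fact({"order_id": 1, "supplier_key": 10}, current_dim)]

# The attribute changes within the same day, between two events.
current_dim[10]["status"] = "suspended"

facts.append(enrich_fact({"order_id": 2, "supplier_key": 10}, current_dim))

statuses = [f["supplier_status"] for f in facts]
```

Each fact row keeps the value that was in effect when it was recorded, so a daily snapshot of the dimension no longer has to capture every intermediate state.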
@kyosungchoo1436 5 years ago
I think you would use a change history to catch these changes; then, as Maxime Beauchemin says, that change history should be in the fact table.
@allthatyouare 10 months ago
🤯
@marigoldx22 6 years ago
Maha Guru!
@borat_trades2860 4 years ago
honestly did not get much from this talk - pretty technical
@santanubaishya8357 4 years ago
Great talk!