Data Productivity at Scale
32:54
2 months ago
#Sherules with Machine Learning
57:36
Netflix Research: Analytics
2:23
6 years ago
Netflix Research: Machine Learning
3:01
Netflix Research: Recommendations
2:33
Comments
@rembautimes8808 9 hours ago
Quite interesting that a survey was performed so that the points were grounded by data
@harshasaitammineni8150 29 days ago
How can we connect with Betty Li? Any LinkedIn please?
2 months ago
Can you enable captioning of the videos?
@reachkarthikt 2 months ago
What's the point of posting this without proper capture?
@TheImmaculate84 2 months ago
This is really cool. Great high level summary of what there is to know about modelling and AI. Thanks Jide
@Babulal32218 2 months ago
Thanks for sharing the internals at such a detailed level. Quick question: in the initial design it was mentioned that, for quick responses, requests go to a key-value data store, but towards the end it was mentioned that those requests are served by Cassandra. Also, in the actual design the request never goes to the impression table as shown in the initial design.
@AlohaTimes 2 months ago
A buffet of choices is also confusing and fattening. There is beauty in a few lean options. Who likes a complex menu of this or that?
@AntonBryzgalov 2 months ago
Too few details on how the model was actually trained. One of the slides says OpenAI -> Weaviate. How does that actually happen? Weaviate is just a database, after all: how are the queries towards it built? A blog post with details would be highly appreciated. The idea is great, but some additional technical details need to be disclosed.
@jonassteinberg3779 2 months ago
Exceedingly high level
@djangoworldwide7925 2 months ago
I'd expect Netflix's channel to upload better resolution and size of the slides. :/
@jonassteinberg3779 2 months ago
"my ducatti is current broken again" lollll
@VikrantVerma22 2 months ago
Can you share the paper/blog link please? thx.
@ak8376 2 months ago
Very detailed talk, extremely informative.. Great work Tulika!
@josenavio6445 2 months ago
amazing
@cstephens16 2 months ago
Awesome presentation and perfect timing for me. I have to give a presentation in a few days explaining all this new composability in the data/database world and why developers should care about it.
@NeeruGautam-fp2ej 2 months ago
Good job keep it up 👍
@madhuriroy4005 2 months ago
Very nice 👍
@theukulelegod 2 months ago
Ahhh I wish we had the slides in this one 😢
@iaroslavzeigerman9876 2 months ago
The slides kick in around the 8th minute. So the viewers miss out on some memes, but the core parts of the talk are still there 😂
@bcroy8924 2 months ago
Very nice presentation. Keep it up.
@ishakaushal1390 2 months ago
Very informative, keep up 👍
@sangeetaprasad1879 2 months ago
Great
@suhaniahuja7631 2 months ago
Great ! ❤
@musicalPartner 2 months ago
Great! Informative 👍👍
@labsanta 2 months ago
The Struggle of Enterprise Data Modeling: A Data Architect's Journey
[03:27](kzbin.info/www/bejne/eqXdenyMf9ZrraM) Transitioning from data architect to generative AI expert
- Discusses the journey as a data professional over 16 years, focusing on data modeling and architecture roles at various companies.
- Details the challenges faced as a data architect in managing data schemas and infrastructure, and collaborating with developers on data placement.
[06:54](kzbin.info/www/bejne/eqXdenyMf9ZrraM) Challenges with disparate data and maintaining consistency in large organizations
- Data was duplicated and scattered across different teams, leading to difficulties in answering questions.
- Complex processes of pulling and joining data from disparate systems and writing code for data consistency and unification.
[10:21](kzbin.info/www/bejne/eqXdenyMf9ZrraM) Automating data discovery, mapping, and integrations for a unified and accessible data view
- The AI agent automates data mapping and integrations across multiple organizations, and discovers data and relationships.
- It also interprets metadata, infers data types and constraints, builds an ontological model, and continuously updates the model.
[13:48](kzbin.info/www/bejne/eqXdenyMf9ZrraM) Automating data architecture through generative AI
- Data modeling involves logical and physical perspectives, including entity attributes, relationships, inventory, structure definition, and data population.
- Data collection sources range from Postgres, S3, and the data lake to operational systems like Salesforce and Zendesk, involving querying, schema inference, and reverse engineering SQL code.
[17:15](kzbin.info/www/bejne/eqXdenyMf9ZrraM) Using generative AI to enhance data modeling and querying
- The process involved building a data ontology and pushing it into a vector database, specifically Weaviate, to enable querying and building multiple levels of relationships.
- The aim was to provide a user-friendly experience by enabling free-text search without the need to build a separate model.
[20:42](kzbin.info/www/bejne/eqXdenyMf9ZrraM) Generative AI interprets queries for quick data access
- Generative AI interprets user queries accurately.
- A feedback loop ensures data accuracy and user satisfaction.
[24:09](kzbin.info/www/bejne/eqXdenyMf9ZrraM) Automated model updates and data tracking for improved decision-making
- Ensures agents can learn and adapt by monitoring the ontology and updating models with new data sources.
- Removes the need for explicit database specifications, enabling intuitive free-text search for better decision-making.
[27:35](kzbin.info/www/bejne/eqXdenyMf9ZrraM) Automating the Data Architect with Generative AI
- Implemented a Snowflake data warehouse for executives to improve data queries and comparisons.
- Considering enhancing the system with a knowledge graph and OpenAI integration for better results.
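The 17:15 segment above describes pushing the inferred ontology into Weaviate and answering free-text questions against it. Below is a minimal sketch of that pattern, assuming a local Weaviate instance with a text2vec vectorizer enabled and the v3 Python client; the class, property, and example values are hypothetical, and this illustrates the general approach rather than the speaker's actual implementation.

```python
# Sketch only: store ontology entries in Weaviate and answer a free-text
# question with a near-text search. Requires a running Weaviate instance with
# a text2vec module enabled and the weaviate-client (v3) Python package.
# The class, property, and example values are hypothetical.
import weaviate

client = weaviate.Client("http://localhost:8080")

# One object per discovered table/entity, carrying the inferred metadata.
client.schema.create_class({
    "class": "OntologyEntity",
    "properties": [
        {"name": "entityName", "dataType": ["text"]},
        {"name": "description", "dataType": ["text"]},
        {"name": "sourceSystem", "dataType": ["text"]},       # e.g. Postgres, Salesforce
        {"name": "relatedEntities", "dataType": ["text[]"]},  # links in the ontology
    ],
})

client.data_object.create(
    data_object={
        "entityName": "customer_orders",
        "description": "Orders placed by customers, joined to billing accounts",
        "sourceSystem": "postgres",
        "relatedEntities": ["customer", "billing_account"],
    },
    class_name="OntologyEntity",
)

# Free-text search: the vector index returns the entities most relevant to
# the user's question; those would then be passed to the LLM as context.
result = (
    client.query.get("OntologyEntity", ["entityName", "description", "sourceSystem"])
    .with_near_text({"concepts": ["which tables describe customer orders?"]})
    .with_limit(3)
    .do()
)
print(result["data"]["Get"]["OntologyEntity"])
```

The retrieved entities would then feed the query-interpretation and feedback loop described in the 20:42 segment.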
@autkarsh8830 2 months ago
Quite elaborate insights into impressions 🎉
@tanushreebhatt6779 2 months ago
Informative, good job!
@agammishra9674 4 months ago
Great content, learnt a lot. I wanted to know (any viewer can answer as well if they know): how did they ensure that their SQS queue has no duplicates? Also, if batches are 10 minutes apart, can't we use a high-watermark (HWM) table in the OLTP systems to ensure we stay ACID compliant?
@Dom-zy1qy 4 months ago
Hello Netflix, I would like to be hired by you guys. I am a slightly below-average software engineer. Maybe you guys could let me sweep the floors or something? I can be a FAANG janitor! I would just ask for maybe $11 an hour, plus a salad from the cafeteria. I look forward to hearing back from you guys.
@grawss 4 months ago
Wtf is this guy wearing?
@jonassteinberg3779 2 months ago
doesn't really matter tbh
@mahesh26sai 4 months ago
Thanks for sharing this with the public!
@joswinpinto360 5 months ago
Gonna be in the team soon!!
@aditya_pawar 6 months ago
Wow, a must-see video for every reliable data engineer!
@aditya_pawar 6 months ago
What are "high play starts" in the example for context-specific audits at 11:30?
@svdfxd 6 months ago
How I wish
@ed7470 7 months ago
Intro dope afff
@iirdna 7 months ago
How do you avoid too high a tide of changes? Meaning: does any late-arriving data trigger Psyberg, even just a few thousand rows? Or do you accumulate changes at some sort of gate and process them only once enough late data has accumulated to justify downstream reprocessing?
@elricofr 7 months ago
Thanks for sharing. The comparison between the extractor pattern and the DRY principle stands, but the driver is not exactly the same: the DRY principle in programming is about not replicating the same logic, as code, multiple times (to avoid repetition and incoherence), yet that logic can still be executed multiple times during a run. Here, the goal is to avoid repeated work within the run itself.
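To make the distinction concrete, here is a small illustrative sketch (not from the talk): with DRY alone, the shared logic lives in one function but every consumer still executes it; with the extractor pattern, the shared step runs once per run and downstream consumers reuse its output. The `load_changed_rows` step is made up for illustration.

```python
# Illustrative sketch: DRY centralizes logic but each caller still runs it;
# the extractor pattern runs the shared step once per run and reuses the output.
# `load_changed_rows` is a made-up stand-in for an expensive extraction step.
from functools import lru_cache

def load_changed_rows(table: str, since: str) -> list[dict]:
    print(f"querying {table} for rows changed since {since}")  # pretend this is expensive
    return [{"table": table, "since": since}]

# DRY alone: one definition of the logic, but every report re-executes it.
def report_a() -> int:
    return len(load_changed_rows("playback_events", "2024-01-01"))

def report_b() -> int:
    return len(load_changed_rows("playback_events", "2024-01-01"))

# Extractor pattern: the shared extraction runs once; downstream consumers reuse it.
@lru_cache(maxsize=None)
def extract_changed_rows(table: str, since: str) -> tuple:
    return tuple(load_changed_rows(table, since))

def report_a_v2() -> int:
    return len(extract_changed_rows("playback_events", "2024-01-01"))

def report_b_v2() -> int:
    return len(extract_changed_rows("playback_events", "2024-01-01"))  # cached: no second query

if __name__ == "__main__":
    report_a(); report_b()        # the "querying ..." line prints twice
    report_a_v2(); report_b_v2()  # the "querying ..." line prints once
```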
@mayjoec 7 months ago
Is there any way you can enable the transcript on the YouTube video?
@user-ko2qt4cf1y 7 months ago
Will they open-source Maestro, like Airbnb did with Airflow?
@vipinahuja2996 7 months ago
Very smart, using new acronyms for old audit tables.
@TamilSelvanSS 7 months ago
Straight up talk 👏
@Mario-yd3ht 8 months ago
Where can I download this slide?
@homemaide 8 months ago
Great job Pallavi and Lee! Just had a couple of questions:
1. Iceberg and Spark: did you have any challenges running these, e.g. the Spark shell not working after adding support for Iceberg, dependency issues with AWS and Iceberg, etc.?
2. Why use SQS instead of Kafka?
3. How do you handle interpreting sessions / sessionization in real time?
@homemaide 8 months ago
Great presentation! Thank you for sharing this!
4:56 - why use Iceberg instead of Delta Lake or Hudi?
8:26 - how do data engineers verify data quality? Isn't that the business office's or data science team's responsibility?
10:08 - DE isn't always told when the data source context changes or is updated. 100% true.
12:33 - sometimes = always, in my experience. :)
13:42 - why use Python? Perhaps due to the schedules? Wouldn't Scala be faster?
17:55 - Janitor sounds like an incredibly helpful tool!
@ritaokonkwo 8 months ago
So insightful. Thanks!
@user-rl7fc2ty4k 8 months ago
4:15 what's "go table standards"?
@smitaphadnis9821 8 months ago
Very useful.
@rajvellaturi 9 months ago
This is such an exciting talk. I faced this problem in my experience working as a DE. Identifying and reprocessing those late-arriving records is resource-intensive and time-consuming for sure. Thanks to Iceberg for making it easy and possible to put together a solution with the help of metadata.
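As a rough illustration of the metadata-driven approach this comment refers to, here is a minimal sketch that uses Iceberg's incremental-read options in Spark to find which partitions were touched by data landing after the last processed snapshot. The catalog, table, column, and snapshot values are hypothetical, and this is not Netflix's actual Psyberg implementation.

```python
# Sketch only: find partitions affected by late-arriving data using Iceberg
# snapshot metadata, then reprocess just those partitions downstream.
# Assumes a Spark session with an Iceberg catalog named "demo" configured;
# table, column, and snapshot-id values below are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("late-data-sketch").getOrCreate()

# Snapshot id recorded by the previous pipeline run (e.g. in a state table).
last_processed_snapshot_id = 123456789

# Incrementally read only the rows appended after that snapshot
# (Iceberg's incremental read covers append snapshots).
late_rows = (
    spark.read.format("iceberg")
    .option("start-snapshot-id", str(last_processed_snapshot_id))
    .load("demo.db.playback_events")
)

# Work out which date partitions the late rows fall into ...
affected_partitions = [
    row["event_date"]
    for row in late_rows.select("event_date").distinct().collect()
]

# ... and reprocess only those partitions downstream.
for event_date in affected_partitions:
    print(f"reprocessing partition event_date={event_date}")
```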
@charlesjoseph8717 9 months ago
Very useful information, thanks for sharing. I am a data engineer and found this series very useful. When you talk about the processed partition, how is the partition selected: through a CTE or a subquery?
@mrstudent1957 9 months ago
Someday.... I'll work for Netflix 😊