The Hidden Life of Embeddings: Linus Lee

7,530 views

AI Engineer

We love text embeddings as a critical pillar of LLM applications, but there's so much to text embeddings beyond their value in vector search. This talk will be a grand tour through a series of experimental projects from my last two years of research for visualizing, manipulating, and interpreting embeddings. We'll start with the basics (t-SNE, UMAP, and PCA), talk about how language models can be used to manipulate and interpret embeddings, and end by using a new tool I've built that lets us directly observe which features popular embedding models like to encode into their embeddings.
Recorded live in San Francisco at the AI Engineer Summit 2023. See the full schedule of talks at ai.engineer/su... & join us at the AI Engineer World's Fair in 2024! Get your tickets today at ai.engineer/wo...
About Linus Lee
Linus is a Research Engineer at Notion prototyping new software interfaces for augmenting our collaborative work and creativity with AI. He has spent the last few years experimenting with AI-augmented tools for thinking, like a canvas for exploring the latent space of neural networks and writing tools where ideas connect themselves. Before Notion, Linus spent a year as an independent researcher, during which he was Betaworks's first Researcher in Residence.

Comments: 10
@gopikrishna8063 10 months ago
just wow...fascinating demo.👍
@endlessvoid7952 10 months ago
Wow, that was fascinating! Awesome demo 👏
@kevon217 10 months ago
Awesome demo and great work/tools.
@swyxTV 10 months ago
someone said “this was the first talk where there were audible gasps in the audience”. amazing demos… and all on the side too! all the mentions of Latent Space are music to my ears!
@Star-rd9eg 6 months ago
did he say when he would release code?
@johntanchongmin 10 months ago
Any code available? Great talk!
@twoplustwo5 10 months ago
🎯 Key Takeaways for quick navigation:
00:01 🎵 Introduction and Background
- Introduction to the speaker, Linus Lee, who works on AI at Notion.
- Brief overview of his past work with language models and embedding models.
- Mention of Notion AI's progress and features since its launch in November 2022.
01:30 🧠 Discussing Latent Spaces
- Explanation of the concept of latent spaces in AI models.
- Comparison of controlling language models to steering a car from the back seat.
- Discussion on the potential of gaining more control by looking inside the model.
03:09 📊 Understanding Embeddings
- Explanation of how embeddings represent the most salient features of a text or image.
- Discussion on the potential of disentangling meaningful attributes from embeddings.
- Suggestion of building more expressive interfaces by intervening inside the model.
05:20 🛠️ Manipulating Embeddings
- Demonstration of how to manipulate embeddings to generate different versions of a text.
- Explanation of how to project texts into meaningful directions in the embedding space.
- Discussion on the potential of mixing embeddings to generate new texts.
09:43 🔄 Adapting Embedding Models
- Explanation of how to adapt an embedding model to read out text from other embedding spaces.
- Demonstration of recovering text details from OpenAI's embedding space.
- Discussion on the potential of manipulating image embeddings.
13:58 📚 Models Used and Research
- Overview of the custom text model used in the demonstrations.
- Mention of the recent research in the field of latent spaces.
- Announcement of the release of the models used in the demonstrations on Hugging Face.
17:01 🎓 Conclusion and Takeaways
- Emphasis on the importance of making complex models tangible and interactive.
- Discussion on the potential of generative models as a laboratory for knowledge.
- Encouragement to build more human interfaces to knowledge.
Made with HARPA AI
@Star-rd9eg 7 months ago
What program is he running?
@DanDascalescu-dandv 4 months ago
Probably a custom-built UI (maybe with Streamlit) for his Contra model.
@alexchiang2617 10 months ago
tokens everything, then Fourier transform everywhere?