Evaluating LLM-based Applications

18,359 views

Databricks

9 months ago

Evaluating LLM-based applications can feel like more of an art than a science. In this workshop, we'll give a hands-on introduction to evaluating language models. You'll come away with knowledge and tools you can use to evaluate your own applications, and answers to questions like:
- Where do I get evaluation data from, anyway?
- Is it possible to evaluate generative models in an automated way?
- What metrics can I use?
- What's the role of human evaluation?
Talk by: Josh Tobin
Here’s more to explore:
LLM Compact Guide: dbricks.co/43WuQyb
Big Book of MLOps: dbricks.co/3r0Pqiz
Connect with us:
Website: databricks.com
Twitter: / databricks
LinkedIn: / databricks
Instagram: / databricksinc
Facebook: / databricksinc

Comments: 9
@AnandShah-ds 6 months ago
Evaluations aside, I really enjoyed the presentation. I was hooked. Great story-telling skills Josh. Thanks for sharing your experience. We count on volunteers like you to spread knowledge.
@ndamulelosbg8887 2 months ago
This is excellent coverage of the challenging task of LLM evaluation
@ndamulelosbg8887 2 months ago
"Your opinion on LLMs does not matter" - I found this to be a great quote
@vaishnavipatil3319 9 months ago
Thank you for clearing up these concepts. Would like to see more videos from you on evaluation frameworks and methods.
@asfandiyar5829 8 months ago
Just what I was after. Thanks
@manishsharma2211 8 months ago
Good work
@SpartanPanda 7 months ago
Great storyline
@bharath_v 5 months ago
Good One!
@threevia.travel 3 months ago
Very generic, expected something more tangible! Sounds like common sense, which might or might not work