Evaluating LLM-Based Apps: New Product Release | Deepchecks LLM Validation

  618 views

LLMOps Space

8 months ago

In this session, Shir Chorev, CTO at Deepchecks, and Yaron, VP Product at Deepchecks, discussed LLM hallucinations, evaluation methodologies, and golden sets, and gave a live demonstration of the new Deepchecks LLM evaluation module.
Deepchecks LLM Evaluation: deepchecks.com/solutions/llm-...
Topics that were covered:
✅ Hallucinations: Cases where the model generates outputs that aren’t grounded in the context given to the LLM. We discussed this well-known problem as well as a robust approach to solving it.
✅ Evaluation Methodologies: We explored various methodologies for evaluating LLMs, including both automated and manual techniques. We also talked about how to structure the golden set used for benchmarking the LLM’s performance, and why it’s important.
✅ Deepchecks LLM Evaluation: A live demonstration of Deepchecks’ new LLM evaluation module, the main highlight of this session.
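To make the first two topics concrete, here is a minimal Python sketch of running an app over a small golden set and flagging answers that look ungrounded in their context. It is not the Deepchecks API or the method shown in the session: the token-overlap score is a deliberately crude stand-in for a real groundedness check, and the names `grounding_score`, `evaluate_golden_set`, the sample fields, and the `0.6` threshold are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounding_score(answer, context):
    """Fraction of the answer's distinct tokens that also occur in the context.

    A crude proxy for groundedness (an assumption of this sketch):
    1.0 means every answer token appears in the context, 0.0 means none do.
    """
    answer_tokens = tokens(answer)
    if not answer_tokens:
        return 1.0  # an empty answer makes no ungrounded claims
    return len(answer_tokens & tokens(context)) / len(answer_tokens)


def evaluate_golden_set(golden_set, generate, threshold=0.6):
    """Run `generate` on every golden-set sample and flag low-scoring outputs.

    `golden_set` is a list of dicts with hypothetical keys "input" and
    "context"; `generate` is any callable (input, context) -> answer string.
    """
    results = []
    for sample in golden_set:
        output = generate(sample["input"], sample["context"])
        score = grounding_score(output, sample["context"])
        results.append({
            "input": sample["input"],
            "output": output,
            "score": score,
            "flagged": score < threshold,  # candidate hallucination
        })
    return results


if __name__ == "__main__":
    golden_set = [
        {"input": "What is the capital of France?",
         "context": "The capital of France is Paris."},
        {"input": "Who founded the company?",
         "context": "The company was founded in 2019."},
    ]

    # Stand-in for a real LLM call: canned answers keyed by the question.
    canned = {
        "What is the capital of France?": "Paris is the capital.",
        "Who founded the company?": "Alice Smith started it alone.",
    }

    for row in evaluate_golden_set(golden_set, lambda q, c: canned[q]):
        print(row["flagged"], round(row["score"], 2), row["output"])
```

In practice the overlap score would be replaced by a stronger automated check or by manual review, but the loop over a fixed golden set stays the same, which is what makes results comparable between versions of the app.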
About LLMOps Space:
LLMOps.Space is a global community for LLM practitioners. 💡📚
The community focuses on content, discussions, and events around topics related to deploying LLMs into production. 🚀
Join the Discord: llmops.space/discord
