Mitigating LLM Hallucinations with a Metrics-First Evaluation Framework


23,676 views

DeepLearningAI


A day ago

Comments: 21

@ajeethkumar6296 5 months ago

Thanks for the clear-cut explanation

@HonestGraduate a year ago

Thank you for the presentation and demo!

@MMSS-e9o a year ago

The real contribution seems to be the prompt they used to generate the CoT and the metric value... Could you share the code used for the metric and the prompt for ChatGPT?

@purvislewies3118 a year ago

Blessed love...givethanks...Cape Town

@KokkeOP a year ago

The paper and the Slides are both in the description, guys. :) read.

@MMSS-e9o a year ago

Nice talk! Could you please share the notebook?

@senderlapin a year ago

I am from Russia. Thank you for the webinar.

@JuliusOpusprofundum a year ago

F u. I AM FROM UKRAINE.

@danteblink a year ago

Do you think human intervention in the evaluation process is going to last? It seems it's a process that LLMs could handle by themselves in the near future.

@JuliusOpusprofundum a year ago

❤

@zaursamedov8906 a year ago

Guys would u be able to drop the notebook please?

@hcrespo3 a year ago

I'm also interested, thanks

@komalmistry7284 a year ago

Could someone share the link to the paper that was mentioned here? "ChainPoll", I believe.

@Deeplearningai a year ago

It is in the video description!

@davidvilla2402 a year ago

I don't know how, but I searched the n word and it came up



