CSC401 2511 W24 L8 Large Language Models (LLMs) 26 Feb 2024

Raeid

This lecture covers LLM alignment techniques such as instruction fine-tuning (InstructGPT, ChatGPT) using Reinforcement Learning from Human Feedback (RLHF).
