CSCI 5722: Computer Vision | Spring 25 | Special Lecture - DeepSeek

9,917 views

AI by Hand

1 day ago

Comments: 21
@kaidongliang3483 8 hours ago
Excellent!
@hasainahmed1883 8 days ago
I don't know how this hour passed in the reel era, and now I'm looking for more great content like this.
@Coconut-Crusted-French-Toast 5 days ago
Thank you for the detailed walkthrough. Very helpful.
@GGWPTrader 8 days ago
Mad respect, sir. I'm going to have fun and learn so much from your channel.
@windmaple 8 days ago
Thank you for making this video available!
@Artem-c1p9q 3 days ago
Excellent work! Love it! ❤❤❤
@ai-by-hand 2 days ago
Thank you! 😊
@maycodes 2 days ago
Thank you professor ❤❤❤
@dongwoo113 2 days ago
Thank you!
@camille-t7z 7 days ago
Amazing content, thanks a lot!!!
@mueezurrehman8572 8 days ago
Really informative and detailed. Thanks.
@ihcnehc 7 days ago
Thank YOU!
@pastrop2003 7 days ago
Very well done, thank you!
@mohsinkhalid2375 8 days ago
Professor, what about the fine-tuning part? How is RL utilized to fine-tune the model?
@anilshinde8025 7 days ago
Great lecture, Professor. I would like to know the role of Group Relative Policy Optimization in DeepSeek.
@airesearch2024 9 days ago
Could you create a playlist for this course so we can keep track of it?
@fintech1378 7 days ago
How do I order the book?
@deeal 5 days ago
Why do we do this dimension reduction, from 5 => 4?
@zeeshanashraf4502 4 days ago
@deeal TL;DR: reducing dimensions speeds up training and inference, and it also shrinks the model. Here, 5 is the model size / hidden dimension (H.D), 4 is the per-head dimension (P.H.D), and 6 is the context length (C.L). If the weight matrix did not reduce the dimensionality, its size would be (H.D x H.D) instead of (H.D x P.H.D), so more parameters would have to be learned. Likewise, the Q, K, V matrices would each have size (C.L x H.D), which is larger than (C.L x P.H.D). Multiplying such larger matrices is expensive (naive matrix multiplication is roughly O(n^3)), so the smaller per-head matrices are used.
@deeal 4 days ago
Thanks for the explanation, much appreciated. Interesting; so why not start with that input size to begin with? I understand the input is the embedding, right? I am missing something 😅
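The dimension bookkeeping discussed in the thread above can be sketched in a few lines of NumPy. This is a minimal single-head illustration, assuming the toy sizes from the comment (hidden dim 5, per-head dim 4, context length 6) and random weights; it is not the actual implementation from the lecture.

```python
import numpy as np

# Toy sizes from the thread: hidden dim 5, per-head dim 4, context length 6.
H_D, P_H_D, C_L = 5, 4, 6

rng = np.random.default_rng(0)
X = rng.normal(size=(C_L, H_D))          # token embeddings: (6, 5)

# Projection weights map hidden dim -> per-head dim: (5, 4) each,
# i.e. 20 learned parameters instead of the 25 of a (5, 5) matrix.
W_q = rng.normal(size=(H_D, P_H_D))
W_k = rng.normal(size=(H_D, P_H_D))
W_v = rng.normal(size=(H_D, P_H_D))

Q, K, V = X @ W_q, X @ W_k, X @ W_v      # each (6, 4) instead of (6, 5)

# Scaled dot-product attention on the reduced per-head dimension.
scores = Q @ K.T / np.sqrt(P_H_D)        # (6, 6) attention scores
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
out = weights @ V                        # (6, 4) per-head output

print(Q.shape, out.shape)                # (6, 4) (6, 4)
```

In a full multi-head layer, several such heads run in parallel and their (C.L x P.H.D) outputs are concatenated back up to (C.L x H.D), which is why each head can afford a smaller dimension.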