LLMs | Reasoning in LLMs  | Lec 19
1:13:45
Comments
@aiwithhi 3 hours ago
At 46:50, that is not a raw bigram count table at the bottom; rather, those are counts adjusted to resemble smoothed probabilities. Raw bigram counts won't come down due to smoothing: a count for "I I" cannot be 3.8, but a probability-adjusted count can be a float.
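For reference, fractional adjusted counts like 3.8 arise naturally under Good–Turing-style discounting; a hedged numeric illustration (the figures below are hypothetical, not read off the slide):

$$c^* = (c+1)\,\frac{N_{c+1}}{N_c}, \qquad \text{e.g. } c = 4,\; N_5 = 380,\; N_4 = 500 \;\Rightarrow\; c^* = 5 \cdot \tfrac{380}{500} = 3.8$$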
@muralidhar40 1 day ago
DONALD TRUMP'S DEATH is correct
@hermanhjaramillo 3 days ago
Thank you for your wonderful lectures. It is not clear to me what the definition of the proximity function phi(x, y) is.
@anmoljain1131 4 days ago
GOD, all my doubts are cleared. I struggled for months, and here it comes to an end. Thanks a lot.
@noadsensehere9195 4 days ago
Amazing! All the deep learning, graph, fine-tuning, alignment, transformer architecture, etc. material is clubbed together here. Love it!
@hermanhjaramillo 6 days ago
At 59:38, why do you have only one positive sample in the log-likelihood and k negative samples? I thought there were the same number of positive and negative samples, "k".
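For context, assuming the lecture follows the standard word2vec skip-gram negative-sampling setup, each observed (positive) pair is contrasted with k sampled negatives, so the per-pair objective is

$$\log \sigma(v_c^\top v_w) + \sum_{i=1}^{k} \log \sigma(-v_{n_i}^\top v_w), \qquad n_i \sim P_n(w)$$

Every positive pair in the corpus contributes its own such term; the k negatives only approximate the expensive softmax normalization, so positives and negatives need not be matched one-to-one.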
@mauliktailor7753 7 days ago
This is an awesome course; it started as a YouTube suggestion and now I'm binge-watching the whole series. Thanks to all the instructors for all the hard work.
@hermanhjaramillo 7 days ago
You show the equation C* = (c+1) N_{c+1}/N_c. Where did this formula come from? What is the explanation for it? Is there any mathematical proof, or at least any intuition, for it?
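For reference, that is the Good–Turing adjusted count (Good, 1953), a standard smoothing estimator rather than something specific to this lecture:

$$c^* = (c+1)\,\frac{N_{c+1}}{N_c},$$

where N_c is the number of n-gram types observed exactly c times. The intuition: summed over all count-c types, the adjusted mass N_c c^* = (c+1) N_{c+1} equals the raw mass of the count-(c+1) types, so each frequency class is re-estimated from the one above it; in particular, unseen events (c = 0) inherit a total probability mass of N_1/N.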
@hermanhjaramillo 7 days ago
Excellent courses! It would be best if you considered repeating the questions asked; they are inaudible when coming from the students.
@chetansonigara 8 days ago
Is there any difference between this course and the large language model course previously completed on this channel?
@lablcs2 7 days ago
Different instructors and a slightly modified course structure; see details here: onlinecourses.nptel.ac.in/noc25_cs45/preview
@monk-thecoder5388 8 days ago
Are these the same videos as the LLM playlist on this channel?
@lablcs2 7 days ago
No; you can check the course details here: onlinecourses.nptel.ac.in/noc25_cs45/preview
@BhuvanWebOsmo 8 days ago
Thanks for bringing this course!!
@rounaksaha2494 8 days ago
That's the course I need; I'm also following the LLM lecture playlist.
@harishravi9936 10 days ago
@30:42, the definition of monotonicity seems to be incorrect.
@Justuy 12 days ago
Really enjoyed this lecture. I had an idea of perplexity before but never got its intuition this clearly.
@Rahul-by2nn 14 days ago
That's great ❤️
@amritkumar-ge4tn 14 days ago
As usual, awesome content and t-shirt! Thanks, Yatin :D
@harshjha6774 14 days ago
Amazing
@amritkumar-ge4tn 16 days ago
Love Yatin's t-shirts :D
@ShivangiTomar-p7j 16 days ago
VERY VERY GOOD. THANKS!!!
@digambarkilledar003 29 days ago
Thank you so much for this wonderful series!!
@sanskarkhandelwal 1 month ago
Sir, can you list some good NLP courses in the description or pinned comments?
@lablcs2 1 month ago
I request you to visit our website's materials section: lcs2.in/llm2401. Thank you!
@sanskarkhandelwal 1 month ago
@lablcs2 ok sir 👍
@IamPotato_007 1 month ago
I'm so thankful for these lectures 🙏
@tsgaming1173 1 month ago
👍🏻
@sanskarkhandelwal 1 month ago
Nice lecture, sir. There is a lack of such quality content on LLMs ❤
@ashwinkumar5223 1 month ago
Thank you, Sir
@MohdAthar-t4p 1 month ago
Great! There should be a few lectures that include practical implementations.
@none-hr6zh 1 month ago
In PaLM, how does the model know when to use the Python interpreter for calculation and when to use only the LLM?
@Lancer_Soldier8349 1 month ago
Very helpful lecture
@GauravJain-zo8gt 1 month ago
Thank you, sir, for making this course public
@amritkumar-ge4tn 1 month ago
Beautiful explanation of the concept of perplexity. Thank you so much, prof, for enlightening me.
@WithRakshith 1 month ago
Sir, coming from a tier-3 college, this series is a real gem for a student like me. I am still in my 5th semester of B.Tech in CSE, but this PG-level course is so well taught. I really liked the playlist; hats off for your efforts. I beg you to continue with more concepts and not stop.
@lablcs2 1 month ago
I'm glad you are finding the course beneficial. We will continue to explore new concepts.
@MariaM-pu4fx 1 month ago
Is there an error on this slide? 17:19 • Bigram: P(begun | has) - "begun" after "has". • Trigram: P(begun | season has) - "begun" after "season has".
@lablcs2 1 month ago
No, this seems correct. Could you please explain where the confusion is?
@jbm5195 1 hour ago
@lablcs2 The slide in the video seems wrong.
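For reference, the maximum-likelihood estimates behind the two items being debated, stated from the standard n-gram definitions rather than the slide itself:

$$P(\text{begun} \mid \text{has}) = \frac{C(\text{has begun})}{C(\text{has})}, \qquad P(\text{begun} \mid \text{season has}) = \frac{C(\text{season has begun})}{C(\text{season has})}$$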
@funshorts5545 2 months ago
I have never commented on any video; this is my first comment. I am currently working as an AI engineer and thought of refreshing some basics, so I started watching Retrieval-based Language Models I (lecture 16) and felt the complete playlist was worth watching. Whenever I watched recent technology content on YouTube from Stanford, I used to wonder when our IITs would teach such up-to-date, industry-ready technology right in college. Now I can see that happening, and I heartily thank you for making it public and accessible on YouTube. India needs more such courses on the latest trends and technologies. I have never done this before, but I bow to your team of teachers 🙏. I beg you to continue such work, and I hope other branches follow your path of teaching industry-required skills and topics to UG/PG students. (By the way, I am a mechanical engineer enjoying working in the AI/ML domain; that's why this last point 😉)
@gopalrkate 1 month ago
I am also on the same path
@chetansonigara 1 month ago
As a data scientist, I previously had the same approach: read books and research papers, and watch Stanford lectures.
@utkarshtripathi9118 2 months ago
Please continue making these types of videos.
@utkarshtripathi9118 2 months ago
Very awesome, excellent lecture, sir
@dineshpandey2122 2 months ago
Thank you, Sir, for sharing these videos. Really helpful.
@RADHESHYAM-mv5vd 2 months ago
What are the prerequisites to study this course?
@lablcs2 2 months ago
Watch at kzbin.info/www/bejne/sH7RZGqPrbV3bKs.
@RADHESHYAM-mv5vd 2 months ago
Thank you 👍🏻
@kiit8337 2 months ago
Professor, what are the prerequisites to follow the whole playlist?
@sanskarkhandelwal 2 months ago
Such amazing-quality content on the latest technologies. That's why they are IIT profs.
@Santosh-t9d6b 2 months ago
How does the router pick an expert in case all experts have the same score?
@lablcs2 2 months ago
It is highly improbable due to the random initialization of weights. Should such a scenario occur, the selection process would typically rely on the implementation specifics; most commonly, PyTorch's `topk` function returns the first occurrence.
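To make the reply concrete, a minimal routing sketch (hypothetical dimensions, not the lecture's code) showing top-k expert selection and the tie case:

```python
# Minimal top-k router sketch for a mixture-of-experts layer.
import torch

torch.manual_seed(0)
num_tokens, d_model, num_experts, k = 4, 8, 4, 2

x = torch.randn(num_tokens, d_model)         # token representations
w_gate = torch.randn(d_model, num_experts)   # router weights

scores = x @ w_gate                           # (num_tokens, num_experts)
topk_scores, topk_idx = torch.topk(scores, k, dim=-1)
gates = torch.softmax(topk_scores, dim=-1)    # mixture weights over the chosen experts
print(topk_idx, gates)

# With exactly tied scores, topk typically returns the earliest indices
# (implementation-dependent, as the reply notes):
tied = torch.zeros(1, num_experts)
print(torch.topk(tied, k, dim=-1).indices)    # e.g. tensor([[0, 1]])
```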
@Santosh-t9d6b 2 months ago
In a regular transformer, each token has a position-wise FFN. In the presentation, every token sees the same set of FFNs. Why is this? Shouldn't it be FFN1_0, FFN2_0, ... for the 1st token and FFN1_n, FFN2_n, ... for the nth token?
@lablcs2 2 months ago
Each token is associated with a unique positional encoding, rather than a distinct feedforward neural network (FFN). Within a layer, there is a single FFN that processes all tokens simultaneously. This design is crucial for handling sequences of variable lengths. If separate FFNs were used for each token, it would raise practical concerns about the number of FFNs required to accommodate different sequence lengths.
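A minimal sketch of the shared, position-wise FFN the reply describes (hypothetical dimensions):

```python
# One FFN, applied independently at every position: nn.Linear acts on the
# last dimension, so all positions share the same weights and the sequence
# length can vary freely.
import torch
import torch.nn as nn

d_model, d_ff = 16, 64
ffn = nn.Sequential(
    nn.Linear(d_model, d_ff),
    nn.ReLU(),
    nn.Linear(d_ff, d_model),
)

x = torch.randn(2, 10, d_model)                 # (batch, seq_len, d_model)
print(ffn(x).shape)                             # torch.Size([2, 10, 16])
print(ffn(torch.randn(2, 37, d_model)).shape)   # same parameters, any length
```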
@Santosh-t9d6b 2 months ago
Is flash attention of any use in the case of edge devices?
@Santosh-t9d6b 2 months ago
How is the window size determined?
@lablcs2 1 month ago
It is a design choice (i.e., a hyperparameter).
@Santosh-t9d6b 2 months ago
Please elaborate on the autoregression part, specifically how the input tokens are fed to the decoder. Say the decoder takes 512 input tokens: at first only the <SOS> token is sent, but what about the rest of the 511 tokens?
@lablcs2 1 month ago
During training, the whole input sentence, i.e., all input tokens, is fed together as part of a batch. As it is an autoregressive model, we use masked self-attention so that the transformer can't attend to future tokens while handling the current token. During inference, the tokens are fed one by one as they are generated successively.
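A minimal sketch of the causal masking the reply describes (single head, no projections; dimensions are hypothetical):

```python
# Masked (causal) self-attention: position i gets zero weight on positions > i.
import torch

seq_len, d = 5, 8
q = k = v = torch.randn(1, seq_len, d)

scores = q @ k.transpose(-2, -1) / d**0.5            # (1, seq_len, seq_len)
mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(mask, float("-inf"))     # hide the future

attn = torch.softmax(scores, dim=-1)                 # rows sum to 1 over the past
out = attn @ v
print(attn[0])                                       # upper triangle is all zeros
```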
@mohdathar5591 2 months ago
Great
@chiragjain6201 2 months ago
How can I get access to the PPT?
@lablcs2 2 months ago
Visit lcs2.in/llm2401
@chiragjain6201 2 months ago
How can I download the slides for this video?
@lablcs2 2 months ago
Visit lcs2.in/llm2401.
@muhammadmuneebtariq861 2 months ago
Sir, the same issue still persists in Google Translate; I checked "He is a doctor" and "She is a doctor".
@walkWithDinkar 2 months ago
❤❤❤