Yes, here is the link to the paper: arxiv.org/pdf/2303.09014
@cliffordino · 1 month ago
Nicely done and very helpful! Thank you!! FYI, the stress is on the first syllable of "INference", not the second ("inFERence").
@yanaitalk · 1 month ago
Copy that! Thank you😊
@shanesimon2495 · 1 month ago
Hi Yan, this was just so helpful. Which community is interacting in the video? Do you have a Discord community?
@yanaitalk · 1 month ago
It is from the Houston Machine Learning meetup: www.meetup.com/houston-machine-learning/.
@johndong4754 · 2 months ago
I've been learning about LLMs over the past few months, but I haven't gone into too much depth. Your videos seem very detailed and technical. Which one(s) would you recommend starting with?
@yanaitalk · 2 months ago
There are excellent courses from DeepLearning.ai on Coursera. To go even deeper, I recommend reading the technical papers directly, which gives you a deeper understanding.
@HeywardLiu · 1 month ago
1. Roofline model
2. Transformer architecture > bottleneck of attention > FlashAttention
3. LLM inference can be divided into a prefill stage (compute-bound) and a decode stage (memory-bound)
4. LLM serving: PagedAttention, RadixAttention
If you want to optimize inference performance, this review paper is awesome: "LLM Inference Unveiled: Survey and Roofline Model Insights".
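The prefill/decode split in point 3 follows from the roofline model in point 1: a stage is compute-bound only if its arithmetic intensity (FLOPs per byte of memory traffic) exceeds the hardware's ops-to-bytes ridge point. A minimal sketch of that check, with assumed, roughly A100-class hardware constants and illustrative layer sizes:

```python
# Roofline back-of-the-envelope: is a matmul compute- or memory-bound?
# Hardware constants are assumptions (roughly A100-class), not measurements.
PEAK_FLOPS = 312e12           # peak FP16 throughput, FLOP/s (assumed)
PEAK_BW = 2.0e12              # peak HBM bandwidth, bytes/s (assumed)
RIDGE = PEAK_FLOPS / PEAK_BW  # ops:byte ridge point of the hardware

def matmul_intensity(m, k, n, bytes_per_elem=2):
    """Arithmetic intensity (FLOP/byte) of an (m x k) @ (k x n) matmul."""
    flops = 2 * m * k * n
    traffic = bytes_per_elem * (m * k + k * n + m * n)
    return flops / traffic

# Prefill: the whole prompt at once -> large m, high intensity.
# Decode: one token at a time -> m = 1, effectively a GEMV.
for name, m in [("prefill", 2048), ("decode", 1)]:
    ai = matmul_intensity(m, 4096, 4096)
    bound = "compute-bound" if ai > RIDGE else "memory-bound"
    print(f"{name}: {ai:.1f} FLOP/byte vs ridge {RIDGE:.0f} -> {bound}")
```

With these numbers prefill lands around 1000 FLOP/byte (compute-bound) while decode lands near 1 FLOP/byte (memory-bound), which is why KV caching, batching, and serving techniques like paged attention target the decode stage.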
@saurabhmodi5128 · 2 months ago
Great insights on ad-click prediction. Please make more videos on winning solutions for ranking, semantic analysis, etc. PS: Please continue this series; there is very little content on YouTube about this.
@amitsingha1637 · 3 months ago
Thanks
@yavagedex · 4 months ago
nice !
@ashwinkumar5223 · 5 months ago
Superb.. Thanks
@ElmiraTalebi-bl9gv · 7 months ago
Thanks for creating this video; it has a great explanation. Can we have access to a PDF of the slides?
@Closer_36 · 8 months ago
I came here from the "Stanford CS224W: ML with Graphs" series, looking for a gentler (or lower-level, easier for me to understand) explanation. It is very helpful for me. Thanks.
@natancruz2499 · 8 months ago
Hi @YanAITalk, this video is great; you were very good at explaining T5. I do have a question: do you think this is a good model for building a formality style transfer model?
@yanaitalk · 8 months ago
I think so, if you want to fine-tune a smaller model to put into production.
@natancruz2499 · 8 months ago
@@yanaitalk Great, thanks for the video and the response!
@divyarawatt · 8 months ago
Great work Yan !!!!!!!!!
@hardiksharma4219 · 8 months ago
Great explanation
@shantanubapat6937 · 8 months ago
This video should have a million views and likes. Amazing talk. Thank you!
@yanaitalk · 9 months ago
The content presented has also been posted in the blog: medium.com/@YanAIx/demystify-gpt-4-58b332a4c731
@junaidbutt3000 · 11 months ago
Really awesome presentation, very clear. And yes, it seems like the success of this approach lies in formatting the datasets in an appropriate way. I know a potential future session just on PPO was mentioned, so I'm looking forward to that. Is there a way to join the reading group or follow it? The meetup link doesn't seem to work.
@yanaitalk · 11 months ago
Glad it was helpful! You can join our meetups here: www.meetup.com/houston-machine-learning/
@MR_GREEN1337 · 1 year ago
Are graph NNs widely used?
@yanaitalk · 11 months ago
Yes, especially for large-scale knowledge graphs and biomedical networks.
@pythonsalone6294 · 1 year ago
Thank you so much. This relieves my stress.
@yanaitalk · 1 year ago
Blog of the presentation: medium.com/@YanAIx/introduction-to-vision-transformer-vit-c5ac6fe81991
@yanaitalk · 1 year ago
Step by step introduction to Transformers: medium.com/@YanAIx/step-by-step-into-transformer-79531eb2bb84
@andylee8283 · 1 year ago
informative <3
@samriddhlakhmani284 · 1 year ago
What is edge_index?
@PARTHBRAHMBHATT-np4pr · 9 months ago
It's the graph connectivity in COO (coordinate) format: a [2, num_edges] tensor listing source and target node indices, i.e., a sparse encoding of the adjacency matrix.
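A minimal sketch of that convention (assuming, as the question suggests, the PyTorch Geometric library):

```python
# edge_index holds the same information as an adjacency matrix, but as a
# sparse [2, num_edges] list of (source, target) index pairs (COO format).
import torch
from torch_geometric.data import Data

# 3-node graph with undirected edges 0-1 and 1-2, stored as both directions.
edge_index = torch.tensor([[0, 1, 1, 2],   # source nodes
                           [1, 0, 2, 1]],  # target nodes
                          dtype=torch.long)
x = torch.tensor([[-1.0], [0.0], [1.0]])   # one scalar feature per node
data = Data(x=x, edge_index=edge_index)
print(data)  # Data(x=[3, 1], edge_index=[2, 4])
```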
@somritasarkar6608 · 1 year ago
Very well explained. Thank you.
@AdityaAgarwal-v3b · 1 year ago
great video
@pra8495 · 1 year ago
Why are Q, K, V 8 by 8 and not 8 by 6?
@beautyisinmind2163 · 1 year ago
Can we use EMB for a multiclass classification problem and extract the explanation?
@teetanrobotics5363 · 1 year ago
Could you please upload the remaining videos and organize the playlist?
@therohanjaiswal · 1 year ago
After a lot of searching, I finally found this video, and I must say it is the best explanation of TextCNN I have found on YouTube. Thanks a lot for this.
@pravinpoudel1307 · 2 years ago
12:44 I think for the first graph the feature vector should be [6, 2, 1, 2, 1, 0, 0, 2, 1, 0], which differs in the last three places from the one shown in the video.
@rvdjc · 2 years ago
Hi, thanks for a wonderful session. Would you be able to upload the word2vec session as well? Thank you!
@binjone4993 · 2 years ago
Yan, thanks for the info. Where can I send you a question I need help with on PageRank centrality? Email or Discord?
@fengwang9752 · 2 years ago
A lot of background noise...
@yanaitalk · 2 years ago
Thank you for the comment. But I don't hear any background noise when I play it.
@fengwang9752 · 2 years ago
@@yanaitalk At 7:19 someone was knocking on the door/wall, and you chuckled. It is apparent and distracting when listening with earphones.
@yanaitalk · 2 years ago
@@fengwang9752 I see. Sorry about that. Let me see if I can filter it out.
@DungPham-ai · 2 years ago
Amazing, thanks so much.
@aicoding2010 · 2 years ago
Thank you for this tutorial. I saw that at 48:33 you defined a context_vector, combined it with the decoder input, and finally fed it into the GRU. But you didn't combine the context_vector with the hidden state from the decoder when predicting each word. I mean that we need to do output += context_vector before the Dense layer.
@yanaitalk · 2 years ago
Yes, this is correct. Thanks for pointing it out!
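For reference, a minimal sketch of the fix being discussed, written in the style of the standard TensorFlow attention-decoder recipe rather than the talk's exact code (layer sizes are illustrative; concatenation is shown, though the commenter's output += context_vector also works when the shapes match):

```python
import tensorflow as tf

vocab_size, emb_dim, units = 8000, 256, 512  # illustrative sizes

embedding = tf.keras.layers.Embedding(vocab_size, emb_dim)
gru = tf.keras.layers.GRU(units, return_sequences=True, return_state=True)
fc = tf.keras.layers.Dense(vocab_size)

def decode_step(token, hidden, context_vector):
    # token: (batch, 1) ids; context_vector: (batch, units) from attention
    x = embedding(token)                                   # (batch, 1, emb_dim)
    x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)
    output, state = gru(x, initial_state=hidden)           # (batch, 1, units)
    output = tf.reshape(output, (-1, output.shape[2]))     # (batch, units)
    # The fix: reuse the context vector at prediction time, not just as input.
    output = tf.concat([output, context_vector], axis=-1)  # (batch, 2 * units)
    return fc(output), state                               # logits, new hidden
```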
@lamis_18 · 2 years ago
nice explanation thx
@krishanuborah5552 · 2 years ago
Do you have the source code?
@yanaitalk · 2 years ago
I don't have the source code. The winners may publish their code in their repos.
@moustiqu3 · 2 years ago
@@yanaitalk And where are the repositories?
@teetanrobotics5363 · 3 years ago
the lecture for word2vec is missing
@trancosger · 3 years ago
Where can we download the presentation files and the source code of the solutions? Thank you.
@yanaitalk · 3 years ago
You can find the slides here: www.slideshare.net/xuyangela/kaggle-winning-solutions-retail-sales-forecasting
@mariaanastasya663 · 3 years ago
I want to ask about attention: why can information in a seq2seq LSTM be lost? I just want to know the reason and the technical mechanism by which the information gets lost.
@nishantpall1747 · 3 years ago
In vanilla RNNs the information sort of dilutes as you progress through the sequence, because of vanishing gradients. LSTMs are better equipped to handle this issue, but for plain seq2seq models, since you're producing a context vector of fixed size, it's common for information to be lost when the sequence length is really long and the encoder's context vector stays the same size; that's where attention helps us. I hope this clears it up. (See the sketch after this thread.)
@mariaanastasya2619 · 3 years ago
@@nishantpall1747 yes, thank you so much.
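A minimal sketch of the mechanism described above, written as Bahdanau-style additive attention (illustrative code, not from the talk): instead of relying on one fixed-size summary, the decoder recomputes a context vector over all encoder states at every step.

```python
import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    """Additive attention: score each encoder state against the decoder state."""
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)  # projects encoder states
        self.W2 = tf.keras.layers.Dense(units)  # projects decoder state
        self.V = tf.keras.layers.Dense(1)       # reduces to one score per step

    def call(self, query, values):
        # query: (batch, units) decoder state; values: (batch, T, units)
        query = tf.expand_dims(query, 1)  # (batch, 1, units), broadcasts over T
        scores = self.V(tf.nn.tanh(self.W1(values) + self.W2(query)))  # (batch, T, 1)
        weights = tf.nn.softmax(scores, axis=1)            # attention over time
        context = tf.reduce_sum(weights * values, axis=1)  # (batch, units)
        return context, weights
```

Because the context is rebuilt each step from every encoder state, nothing has to survive a single fixed-size bottleneck across a long sequence.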
@pratikanand2200 · 3 years ago
Since pretrained word vectors like GloVe already incorporate many dimensions of a word's meaning, why do we need to fine-tune further?
@yanaitalk · 3 years ago
The meanings of words can differ based on context, so we may fine-tune to adjust their meanings for the task.
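A minimal sketch of that idea (illustrative PyTorch; the GloVe matrix below is a random stand-in for vectors you would actually load from a GloVe file):

```python
import numpy as np
import torch
import torch.nn as nn

vocab_size, emb_dim = 20000, 100  # illustrative sizes
# Stand-in for real pretrained vectors, e.g. parsed from glove.6B.100d.txt.
glove_matrix = np.random.rand(vocab_size, emb_dim).astype("float32")

embedding = nn.Embedding.from_pretrained(
    torch.from_numpy(glove_matrix),
    freeze=False,  # freeze=True keeps GloVe fixed; False lets training adjust it
)
# With freeze=False, gradients update the embedding rows seen in each batch,
# nudging the general-purpose GloVe meanings toward the task's usage of words.
```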
@liqin3892 · 3 years ago
Thank you, your lecture helps me a lot.
@1UniverseGames · 3 years ago
Can you share the code for study?
@muhammaderfan2847 · 4 years ago
Ma'am, can you please deliver a lecture on "Get To The Point: Summarization with Pointer-Generator Networks" (arxiv.org/pdf/1704.04368.pdf), and on how to implement this model for the CNN/Daily Mail news dataset?
@muhammaderfan2847 · 4 years ago
nice work ma'am :) very helpful :)
@muhammaderfan2847 · 4 years ago
Can you please deliver a lecture about abstractive text summarization using a seq2seq attention model?
@muhammaderfan2847 · 4 years ago
Ma'am, you really did such nice work, and you delivered this lecture very well.
@muhammaderfan2847 · 4 years ago
Really nice work. Can you please share your GitHub code link?
@@yanaitalk Thanks, ma'am. Do you have an email, please? I am actually doing research work on abstractive text summarization and I need your help.
@ahmer9800 · 4 years ago
Will you post more presentations going forward? This is immensely helpful for people who cannot make the meetings :)
@yanaitalk · 4 years ago
Yes, we shall post more presentations. Stay tuned :)
@yanaitalk · 6 years ago
A Data-Driven Question Generation Model for Educational Content, by Jack Wang
Deep Learning Approach in Characterizing Salt Body on Seismic Images, by Zhenzhen Zhong: kzbin.info/www/bejne/bpfQlKFtqtGfbq8