Yes, here is the link to the paper: arxiv.org/pdf/2303.09014
@cliffordino · 1 month ago
Nicely done and very helpful! Thank you!! FYI, the stress is on the first syllable of "INference", not the second ("inFERence").
@yanaitalk · 1 month ago
Copy that! Thank you😊
@shanesimon2495 · 1 month ago
Hi Yan, this was just so helpful. Which community is interacting in the video? Do you have a Discord community?
@yanaitalk · 1 month ago
It is from the Houston Machine Learning meetup: www.meetup.com/houston-machine-learning/.
@johndong4754 · 2 months ago
I've been learning about LLMs over the past few months, but I haven't gone into too much depth. Your videos seem very detailed and technical. Which one(s) would you recommend starting with?
@yanaitalk · 2 months ago
There are excellent courses from DeepLearning.ai on Coursera. To go even deeper, I recommend reading the technical papers directly, which gives you a deeper understanding.
@HeywardLiu · 1 month ago
1. Roofline model
2. Transformer architecture > bottleneck of attention > FlashAttention
3. LLM inference can be divided into a prefill stage (compute-bound) and a decode stage (memory-bound)
4. LLM serving: PagedAttention, RadixAttention
If you want to optimize inference performance, this review paper is awesome: "LLM Inference Unveiled: Survey and Roofline Model Insights".
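The prefill/decode split in point 3 follows from the roofline model in point 1: a stage is compute-bound only if its arithmetic intensity (FLOPs per byte of memory traffic) exceeds the hardware's ops-to-bytes ridge point. A minimal sketch of that check, with assumed, roughly A100-class hardware constants and illustrative layer sizes:

```python
# Roofline back-of-the-envelope: is a matmul compute- or memory-bound?
# Hardware constants are assumptions (roughly A100-class), not measurements.
PEAK_FLOPS = 312e12           # peak FP16 throughput, FLOP/s (assumed)
PEAK_BW = 2.0e12              # peak HBM bandwidth, bytes/s (assumed)
RIDGE = PEAK_FLOPS / PEAK_BW  # ops:byte ridge point of the hardware

def matmul_intensity(m, k, n, bytes_per_elem=2):
    """Arithmetic intensity (FLOP/byte) of an (m x k) @ (k x n) matmul."""
    flops = 2 * m * k * n
    traffic = bytes_per_elem * (m * k + k * n + m * n)
    return flops / traffic

# Prefill: the whole prompt at once -> large m, high intensity.
# Decode: one token at a time -> m = 1, effectively a GEMV.
for name, m in [("prefill", 2048), ("decode", 1)]:
    ai = matmul_intensity(m, 4096, 4096)
    bound = "compute-bound" if ai > RIDGE else "memory-bound"
    print(f"{name}: {ai:.1f} FLOP/byte vs ridge {RIDGE:.0f} -> {bound}")
```

With these numbers prefill lands around 1000 FLOP/byte (compute-bound) while decode lands near 1 FLOP/byte (memory-bound), which is why KV caching, batching, and serving techniques like paged attention target the decode stage.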
@saurabhmodi5128 · 2 months ago
Great insights on ad-click prediction. Please make more videos on winning solutions for ranking, semantic analysis, etc. PS: Please continue this series; there is very little content on YouTube about this.
@amitsingha1637 · 3 months ago
Thanks
@yavagedex · 4 months ago
nice !
@ashwinkumar5223 · 5 months ago
Superb.. Thanks
@ElmiraTalebi-bl9gv · 7 months ago
Thanks for creating this video; it has a great explanation. Can we have access to a PDF of the slides?
@Closer_36 · 8 months ago
I came here from the "Stanford CS224W: ML with Graphs" series, looking for a gentler (or lower-level, easier for me to understand) explanation. It is very helpful for me. Thanks.
@natancruz2499 · 8 months ago
Hi @YanAITalk, this video is great; you were very good at explaining T5. I do have a question: do you think this is a good model for building a formality style transfer model?
@yanaitalk · 8 months ago
I think so, if you want to fine-tune a smaller model to put into production.
@natancruz2499 · 8 months ago
@@yanaitalk Great, thanks for the video and the response!
@divyarawatt · 8 months ago
Great work Yan !!!!!!!!!
@hardiksharma4219 · 8 months ago
Great explanation
@shantanubapat6937 · 8 months ago
This video should have a million views and likes. Amazing talk. Thank you!
@yanaitalk · 9 months ago
The content presented has also been posted in the blog: medium.com/@YanAIx/demystify-gpt-4-58b332a4c731
@junaidbutt3000 · 11 months ago
Really awesome presentation, very clear. And yes, it seems like the success of this approach lies in formatting the datasets in an appropriate way. I know a potential future session just on PPO was mentioned, so I'm looking forward to that. Is there a way to join the reading group or follow it? The meetup link doesn't seem to work.
@yanaitalk · 11 months ago
Glad it was helpful! You can join our meetups here: www.meetup.com/houston-machine-learning/
@MR_GREEN1337 · 1 year ago
Are graph NNs widely used?
@yanaitalk · 11 months ago
Yes, especially for large-scale knowledge graphs and biomedical networks.
@pythonsalone6294 · 1 year ago
Thank you so much. This relieves my stress.
@yanaitalk · 1 year ago
Blog of the presentation: medium.com/@YanAIx/introduction-to-vision-transformer-vit-c5ac6fe81991
@yanaitalk · 1 year ago
Step by step introduction to Transformers: medium.com/@YanAIx/step-by-step-into-transformer-79531eb2bb84
@andylee8283 · 1 year ago
informative <3
@samriddhlakhmani284 · 1 year ago
What is edge_index?
@PARTHBRAHMBHATT-np4pr · 9 months ago
It's the graph connectivity in COO (coordinate) format: a [2, num_edges] tensor listing source and target node indices, i.e., a sparse encoding of the adjacency matrix.
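A minimal sketch of that convention (assuming, as the question suggests, the PyTorch Geometric library):

```python
# edge_index holds the same information as an adjacency matrix, but as a
# sparse [2, num_edges] list of (source, target) index pairs (COO format).
import torch
from torch_geometric.data import Data

# 3-node graph with undirected edges 0-1 and 1-2, stored as both directions.
edge_index = torch.tensor([[0, 1, 1, 2],   # source nodes
                           [1, 0, 2, 1]],  # target nodes
                          dtype=torch.long)
x = torch.tensor([[-1.0], [0.0], [1.0]])   # one scalar feature per node
data = Data(x=x, edge_index=edge_index)
print(data)  # Data(x=[3, 1], edge_index=[2, 4])
```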
@somritasarkar6608 · 1 year ago
Very well explained. Thank you.
@AdityaAgarwal-v3b · 1 year ago
great video
@pra8495 · 1 year ago
Why are Q, K, V 8 by 8 and not 8 by 6?
@beautyisinmind2163 · 1 year ago
Can we use EMB for a multiclass classification problem and extract the explanation?
@teetanrobotics5363 · 1 year ago
Could you please upload the remaining videos and organize the playlist?
@therohanjaiswal · 1 year ago
After a lot of searching, I finally found this video, and I must say it is the best explanation of TextCNN I have found on YouTube. Thanks a lot for this.
@pravinpoudel1307 · 2 years ago
12:44 I think for the first graph the feature vector should be [6, 2, 1, 2, 1, 0, 0, 2, 1, 0], which differs in the last three places from the one shown in the video.
@rvdjc · 2 years ago
Hi, thanks for a wonderful session. Would you be able to upload the word2vec session as well? Thank you!
@binjone4993 · 2 years ago
Yan, thanks for the info. Where can I send you a question I need help with on PageRank centrality? Email or Discord?
@fengwang9752 · 2 years ago
A lot of background noise...
@yanaitalk · 2 years ago
Thank you for the comment. But I don't hear any background noise when I play it.
@fengwang9752 · 2 years ago
@@yanaitalk At 7:19 someone was knocking on the door/wall, and you chuckled. It is apparent and distracting when listening with earphones.
@yanaitalk · 2 years ago
@@fengwang9752 I see. Sorry about that. Let me see if I can filter it out.
@DungPham-ai · 2 years ago
Amazing, thanks so much.
@aicoding2010 · 2 years ago
Thank you for this tutorial. I saw that at 48:33 you defined a context_vector, combined it with the decoder input, and finally fed it into the GRU. But you didn't combine the context_vector with the hidden state from the decoder when predicting each word. I mean that we need to do output += context_vector before the Dense layer.
@yanaitalk · 2 years ago
Yes, this is correct. Thanks for pointing it out!
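For reference, a minimal sketch of the fix being discussed, written in the style of the standard TensorFlow attention-decoder recipe rather than the talk's exact code (layer sizes are illustrative; concatenation is shown, though the commenter's output += context_vector also works when the shapes match):

```python
import tensorflow as tf

vocab_size, emb_dim, units = 8000, 256, 512  # illustrative sizes

embedding = tf.keras.layers.Embedding(vocab_size, emb_dim)
gru = tf.keras.layers.GRU(units, return_sequences=True, return_state=True)
fc = tf.keras.layers.Dense(vocab_size)

def decode_step(token, hidden, context_vector):
    # token: (batch, 1) ids; context_vector: (batch, units) from attention
    x = embedding(token)                                   # (batch, 1, emb_dim)
    x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)
    output, state = gru(x, initial_state=hidden)           # (batch, 1, units)
    output = tf.reshape(output, (-1, output.shape[2]))     # (batch, units)
    # The fix: reuse the context vector at prediction time, not just as input.
    output = tf.concat([output, context_vector], axis=-1)  # (batch, 2 * units)
    return fc(output), state                               # logits, new hidden
```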
@lamis_18 · 2 years ago
nice explanation thx
@krishanuborah5552 · 2 years ago
Do you have the source code?
@yanaitalk · 2 years ago
I don't have the source code. The winners may publish their code in their repos.
@moustiqu3 · 2 years ago
@@yanaitalk And where are the repositories?
@teetanrobotics5363 · 3 years ago
the lecture for word2vec is missing
@trancosger · 3 years ago
Where can we download the presentation files and the source code of the solutions? Thank you.
@yanaitalk · 3 years ago
You can find the slides here: www.slideshare.net/xuyangela/kaggle-winning-solutions-retail-sales-forecasting
@mariaanastasya663 · 3 years ago
I want to ask about attention: why can information in a seq2seq LSTM be lost? I just want to know the reason and the technical mechanism by which the information gets lost.
@nishantpall1747 · 3 years ago
In vanilla RNNs the information sort of dilutes as you progress through the sequence, because of vanishing gradients. LSTMs are better equipped to handle this issue, but for plain seq2seq models, since you're producing a context vector of fixed size, it's common for information to be lost when the sequence length is really long and the encoder's context vector stays the same size; that's where attention helps us. I hope this clears it up. (See the sketch after this thread.)
@mariaanastasya2619 · 3 years ago
@@nishantpall1747 yes, thank you so much.
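A minimal sketch of the mechanism described above, written as Bahdanau-style additive attention (illustrative code, not from the talk): instead of relying on one fixed-size summary, the decoder recomputes a context vector over all encoder states at every step.

```python
import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    """Additive attention: score each encoder state against the decoder state."""
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)  # projects encoder states
        self.W2 = tf.keras.layers.Dense(units)  # projects decoder state
        self.V = tf.keras.layers.Dense(1)       # reduces to one score per step

    def call(self, query, values):
        # query: (batch, units) decoder state; values: (batch, T, units)
        query = tf.expand_dims(query, 1)  # (batch, 1, units), broadcasts over T
        scores = self.V(tf.nn.tanh(self.W1(values) + self.W2(query)))  # (batch, T, 1)
        weights = tf.nn.softmax(scores, axis=1)            # attention over time
        context = tf.reduce_sum(weights * values, axis=1)  # (batch, units)
        return context, weights
```

Because the context is rebuilt each step from every encoder state, nothing has to survive a single fixed-size bottleneck across a long sequence.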
@pratikanand2200 · 3 years ago
Since pretrained word vectors like GloVe already incorporate many dimensions of a word's meaning, why do we need to fine-tune further?
@yanaitalk · 3 years ago
The meanings of words can differ based on context, so we may fine-tune to adjust their meanings for the task.
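A minimal sketch of that idea (illustrative PyTorch; the GloVe matrix below is a random stand-in for vectors you would actually load from a GloVe file):

```python
import numpy as np
import torch
import torch.nn as nn

vocab_size, emb_dim = 20000, 100  # illustrative sizes
# Stand-in for real pretrained vectors, e.g. parsed from glove.6B.100d.txt.
glove_matrix = np.random.rand(vocab_size, emb_dim).astype("float32")

embedding = nn.Embedding.from_pretrained(
    torch.from_numpy(glove_matrix),
    freeze=False,  # freeze=True keeps GloVe fixed; False lets training adjust it
)
# With freeze=False, gradients update the embedding rows seen in each batch,
# nudging the general-purpose GloVe meanings toward the task's usage of words.
```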
@liqin3892 · 3 years ago
Thank you, your lecture helps me a lot.
@1UniverseGames · 3 years ago
Can you share the code for study?
@muhammaderfan2847 · 4 years ago
Ma'am, can you please deliver a lecture on "Get To The Point: Summarization with Pointer-Generator Networks" (arxiv.org/pdf/1704.04368.pdf), and on how to implement this model for the CNN/Daily Mail news dataset?
@muhammaderfan2847 · 4 years ago
nice work ma'am :) very helpful :)
@muhammaderfan2847 · 4 years ago
Can you please deliver a lecture about abstractive text summarization using a seq2seq attention model?
@muhammaderfan2847 · 4 years ago
Ma'am, you really did such nice work, and you delivered this lecture very well.
@muhammaderfan2847 · 4 years ago
Really nice work. Can you please share your GitHub code link?
@@yanaitalk Thanks, ma'am. Do you have an email, please? I am actually doing research work on abstractive text summarization and I need your help.
@ahmer9800 · 4 years ago
Will you post more presentations going forward? This is immensely helpful for people who cannot make the meetings :)
@yanaitalk · 4 years ago
Yes, we shall post more presentations. Stay tuned :)
@yanaitalk · 6 years ago
A Data-Driven Question Generation Model for Educational Content, by Jack Wang
Deep Learning Approach in Characterizing Salt Body on Seismic Images, by Zhenzhen Zhong: kzbin.info/www/bejne/bpfQlKFtqtGfbq8