I really like the way you explain the paper. A lot of the concepts I was confused about have been touched on, but I wish the block parts had been explained in more detail, like why the modules are used the way they are in those blocks. Anyway, thank you so much for the video, +1 subscriber, and I hope to see more from you in the future.
@AIBites • 3 months ago
Sure. So, are you more interested in papers and theory, or would you like more hands-on content on LLMs, RAG, etc.? Just trying to understand the audience better. :)
@ariisaac5111 • 2 months ago
@AIBites I'm more interested in the research papers and theories, and any insightful implications you can contribute along the way. What you did here is a nice baseline. Thanks!
@thesimplicitylifestyle • 6 months ago
It's so much fun looking under the hood. Thanks for explaining it so well! 😎🤖
@AIBites • 3 months ago
my pleasure :)
@yuanyuan4985 • 5 months ago
Thank you so much for providing this video!!!!!
@AIBites • 3 months ago
my pleasure Yuan! 🙂
@newbie8051 • 3 months ago
Well, the graphs at 2:18 are incorrect; sigmoid and tanh have different ranges, so the output shown with tanh should have range -1 to 1.
@AIBites • 3 months ago
That's a great spot. A copy-pasting oversight, I guess 🙂 I'll pay more attention while making the videos on attention. Thank you 😀
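[Editor's note: for reference, a minimal numeric check of the ranges discussed above, assuming the standard LSTM formulation, where the input/forget/output gates use sigmoid and the cell candidate and output squashing use tanh.]

```python
# Quick sanity check of the two activation ranges, assuming the
# standard LSTM formulation (not tied to the video's exact slides).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-10, 10, 5)
print("sigmoid:", sigmoid(x))   # squashes into (0, 1): input/forget/output gates
print("tanh:   ", np.tanh(x))   # squashes into (-1, 1): cell candidate, and h_t = o_t * tanh(c_t)
```

Even at extreme inputs like ±10, sigmoid stays strictly inside (0, 1) while tanh saturates toward -1 and 1, which is why plots of the two should never share the same vertical range.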
@newbie8051 • 6 months ago
I could only grasp the sLSTM on the first read. So the exponential activation pushes everything up, and we use the log to bring every activation back into a smaller range? Damn, pretty interesting.
@AIBites • 3 months ago
Thank you. Yes, whenever I don't understand equations, I plug in numbers that push the values to the extremes. That way, it paints a clearer picture! :)
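[Editor's note: a rough sketch of the log-domain stabilization being discussed, following one reading of the sLSTM equations in the xLSTM paper; the variable names (i_pre, f_pre, m_prev) are illustrative, not the paper's notation.]

```python
# Exponential gates exp(i_pre), exp(f_pre) can overflow for large
# pre-activations, so a running max is tracked in log space and
# subtracted before exponentiating, keeping everything finite.
import numpy as np

def stabilized_gates(i_pre, f_pre, m_prev):
    m = max(f_pre + m_prev, i_pre)        # stabilizer state (log domain)
    i_gate = np.exp(i_pre - m)            # stabilized input gate, always <= 1
    f_gate = np.exp(f_pre + m_prev - m)   # stabilized forget gate, always <= 1
    return i_gate, f_gate, m

# Plugging in extreme numbers, as suggested in the reply above:
print(np.exp(1000.0))                       # naive exp overflows to inf
print(stabilized_gates(1000.0, 500.0, 0.0)) # stabilized values stay finite, in (0, 1]
```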