Continuing on with Large Language Models! In this one I walk you through the codebase behind the OPT-175B model from Meta AI. Feedback is welcome; it'll help shape this channel.
@vishalsingh-yj8bk 2 years ago
Hi Aleksa, I just had one doubt: OPT's architecture is the same as GPT-2's, right?
@dimitrismit6714 2 years ago
Very good video, goes in a lot of depth. Thanks for the explanation!
@TheAIEpiphany 2 years ago
Thanks Dimitris!
@dimitrismit6714 2 years ago
@@TheAIEpiphany Although I have to note that the format I liked best for your videos is when you first explain it theoretically and then show the implementation in code, like in the video about VQ-VAEs. I think that way it ties together very well.
@TheAIEpiphany 2 years ago
@@dimitrismit6714 Thanks for that!
@j.hanleysmith8333 2 years ago
Next level!
@hooshisar2435 2 years ago
Awesome, thank you for this! Just out of curiosity, would it be easier for Windows users to use the Windows Subsystem for Linux (WSL2) for repos meant to be run on Linux?
@TheAIEpiphany 2 years ago
Thanks! Yes, in some cases, but I think there are still rough edges, and I felt more comfortable going this route this time. Having a dedicated Linux machine for ML is the best option. I'm currently working on assembling my own deep learning rig.
@JupiterNj 2 years ago
Great video, thanks!
@TheAIEpiphany 2 years ago
🙏
@oc1655 A year ago
Hey Aleksa, thank you for these. I love your patience and your step-by-step approach to the analysis. At 1:05:29, why is the code multiplying the loss by four? For gradient accumulation, I know we divide by the number of steps, but I don't understand the reason for multiplying by the number of (nodes? devices?).
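A toy sketch of one plausible answer (this is not the actual OPT/metaseq code; `accum_steps`, `num_devices`, and all the numbers are made up for illustration): dividing the loss by the accumulation steps makes the accumulated gradient equal the full-batch average, while pre-multiplying by the device count can compensate for a framework that later averages (rather than sums) gradients across devices during the all-reduce.

```python
# Illustrative sketch of loss scaling in gradient accumulation and data
# parallelism. Names and numbers are assumptions, not from the OPT codebase.

def grad(loss_scale, per_example_grads):
    """Gradient of (loss_scale * mean loss) over one micro-batch."""
    return loss_scale * sum(per_example_grads) / len(per_example_grads)

accum_steps = 2   # micro-batches accumulated before an optimizer step
num_devices = 4   # data-parallel workers (the "4" the commenter saw)

micro_batches = [[1.0, 3.0], [5.0, 7.0]]  # toy per-example gradients

# Standard accumulation: divide each micro-batch loss by accum_steps so the
# accumulated gradient equals the mean gradient over the full batch.
accumulated = sum(grad(1.0 / accum_steps, mb) for mb in micro_batches)
full_batch = sum(sum(mb) for mb in micro_batches) / 4  # mean over 4 examples
assert accumulated == full_batch

# If the framework later AVERAGES gradients across num_devices (all-reduce
# with mean), pre-multiplying the loss by num_devices turns that average
# back into an effective sum -- one plausible reason for a "loss * 4".
local = grad(num_devices, micro_batches[0])
averaged_across_devices = local / num_devices   # what a mean all-reduce does
assert averaged_across_devices == grad(1.0, micro_batches[0])
```

Whether this is the exact reason in the video depends on how that codebase's all-reduce and loss normalization interact, so treat it only as a sketch of the kind of bookkeeping involved.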
@lhomme_flaneur 2 years ago
Hey man, I love your videos; just keep it lit. Could you also make a tutorial on how to build a really basic ML framework from scratch? That would be awesome. Everybody uses libs and APIs, but it'd be cool to have a simple package, maybe just for linear regression.