Please note: with the automatic dubbing from YouTube/Google you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.
@mrpocock (a day ago)
Byte-level LLMs are obviously the way forward for that first round of training where you're predicting 1..n tokens given the prefix, particularly for multi-language models. Tokenization is clearly a hack, like in the dark ages of image neural networks, where we would hand-craft feature detection kernels.
@ProgrammingWIthRiley (18 hours ago)
Brother, you are amazing. Thank you for doing this.
@wwkk4964 (a day ago)
Thank you so much for covering this paper! I had been thinking about this specific implementation for a year, and I believe it's a significant step towards a truly general learning architecture that minimizes hand-crafted human priors.
@davidwynter6856 (a day ago)
Can you clarify whether pre-training will have to use the BLT embeddings? I.e., unless models pre-trained with BLT start appearing on Hugging Face or elsewhere, we mere mortals will not be able to take advantage of this new method?
@King_Deundel (a day ago)
BLT seems the way to go in an ideal world, but there are definitely problems with it. I think tokenizers have accomplished tremendous work, and we got to this state thanks to growing vocab sizes and better tokenization mechanisms, but from this point we may have the technology and resources to try BLT on a model (I still don't think it would work that much better).
@themax2go (a day ago)
I'm having a plant-based BLT right now.
@TalsBadKidney (a day ago)
very very cool
@JeomonGeorge (a day ago)
Does the small transformer use BPE? And in H(x_i), is it computing the cross-entropy? 26:13
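[For reference, my reading of the BLT paper rather than anything stated in this comment: the small transformer is itself byte-level, so no BPE is involved; it predicts a distribution over the next byte, and H(x_i) is the Shannon entropy of that distribution, which the patcher uses to decide where patch boundaries go. A minimal sketch of that per-byte entropy, in standard notation (the video/paper may write it slightly differently):

    % Entropy of the small model's next-byte distribution p_e,
    % taken over the byte vocabulary V (256 values).
    H(x_i) = -\sum_{v \in \mathcal{V}} p_e(x_i = v \mid x_{<i}) \, \log p_e(x_i = v \mid x_{<i})

Cross-entropy against the true next byte is what the small model is trained to minimize; at patching time only the entropy H(x_i) is needed, with a new patch starting when it exceeds a threshold.]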