I am a coauthor of the PaLM paper. Thanks for choosing to cover it!
@AICoffeeBreak · 2 years ago
Thanks for the visit! And congrats on the cool work. 👏 I'm eager to see what you have lined up next.
@michaelfischer841 · 2 years ago
Thank you for your brilliant work!
@michaelfischer841 · 2 years ago
When you are training these things, are you also using the contents of university-level reference materials in PDF format, which can be converted using pdftotext on the command line?
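For concreteness, a minimal sketch of that kind of conversion, assuming poppler's pdftotext binary is installed and using a made-up papers/ directory as the input path:

```python
# Minimal sketch: batch-convert PDFs to plain text with the pdftotext CLI.
# The papers/ directory is a hypothetical example path.
import subprocess
from pathlib import Path

for pdf in Path("papers").glob("*.pdf"):
    txt = pdf.with_suffix(".txt")
    # pdftotext <input.pdf> <output.txt> writes the extracted plain text.
    subprocess.run(["pdftotext", str(pdf), str(txt)], check=True)
    print(f"converted {pdf.name} -> {txt.name}")
```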
@sabofx · 2 years ago
@Maarten Bosma: I've viewed several videos on PaLM like this one and one by Dr Alan D Thompson. Is there any way that I could have a conversation/chat with the PaLM AI? I would love to test its reasoning capabilities myself. Is there somewhere where I can sign up for access? Looking forward to your reply! Cheers, Joost.
@anthonyrepetto3474 · 2 years ago
In regard to PaLM developing certain capabilities only once it reaches a threshold: we now know that even random graphs, of sufficient size and connectivity, undergo a 'phase change' into states of higher order, as explained in Quanta's recent article, "Elegant Six-Page Proof Reveals the Emergence of Random Structure". So, even though the model is not an innovation, it does provide a potential insight: making models bigger can cross *thresholds* into sudden new abilities!
@fedelopez77 · 1 year ago
"Few-shot learning, as we see it from GPT-3 onwards, is just glorified pattern completion" --> Standing ovation, just awesome
@Mutual_Information · 2 years ago
I'm glad you chose PaLM. It felt like DALLE was sucking up all the attention when PaLM was doing some seriously impressive things we haven't seen before. Very nice video. As always :)
@iliemihai949 · 2 years ago
Very cool, Letitia, one of the best channels to follow when it comes to NLP. In the coming months we will release a GPT2-780M model for Romanian, trained on 40 GB of text.
@AICoffeeBreak · 2 years ago
Wow, I can't wait to see it. 👀
@HoriaNeagu · 2 years ago
Hi! Did this project ever materialize?
@hannesstark5024 · 2 years ago
Fantastic! Thank you for this summary, which saves me from having to read slightly boring papers :7
@JuliusSmith · 2 years ago
I have to watch all your videos now! Your style is perfect for me - thanks for making them!
@AICoffeeBreak · 2 years ago
Glad you found us! 😁
@JuliusSmith · 2 years ago
Maybe "few shot orientation" would be a better term
@AICoffeeBreak · 2 years ago
🤣
@michaelfischer841 · 2 years ago
Your commentary and insight are top notch.
@JM-ln2zm · 2 years ago
Great video, Letitia! I have a question. PaLM was trained on 6,100 TPUs; let's say you created a language translator using PaLM. In order for me to use this newly created language translator, do I still need access to the 6,100 TPUs, or can it be run on fewer TPUs once the model has been trained?
@AICoffeeBreak · 2 years ago
Hi, thanks for the question. Maybe someone knows this more thoroughly than I do, but no: the parallelization across more than 6k TPUs is there to speed up training and to hold the gradients and optimizer state. For inference, you do not need the gradients; you only need to load the parameters. Because of the enormous parameter count, inference surely still needs more than one TPU, since the weights alone take a lot of memory. If you are happy to wait a bit (I do not know how long "a bit" is for such an enormous model), you could even run inference on a CPU with enough RAM. 😅
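To put rough numbers on that, a back-of-the-envelope sketch (the byte counts per parameter are illustrative assumptions, not PaLM's actual training or serving configuration):

```python
# Back-of-the-envelope memory estimate for a 540B-parameter model.
num_params = 540e9                           # ~540 billion parameters
weights_gb = num_params * 2 / 1e9            # 16-bit weights, inference only
# Training additionally keeps gradients and (Adam-style) optimizer moments:
training_extra_gb = num_params * (2 + 2 * 4) / 1e9  # 16-bit grads + two fp32 moments

print(f"inference, weights only: ~{weights_gb:,.0f} GB")
print(f"training extras (grads + optimizer state): ~{training_extra_gb:,.0f} GB")
```

Even the weights-only figure is on the order of a terabyte, which is why inference still has to be sharded across several accelerators (or spill into CPU RAM).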
@mrshankj5101 · 2 years ago
I don't think AI language models are boring! PaLM and GPT-3 are awesome!
@DerPylz · 2 years ago
I wonder what the output would be without the "few shots", i.e. without giving the two examples of correctly solved tasks before the prompt. Do you think there would be no answer at all, or just a very bad one?
@odysseashlap · 2 years ago
There would be an irrelevant answer
@scottpulver · 2 years ago
Irrelevant followed by 1 perfect
@Abdulazizab2 · 2 years ago
Check out the GPT-3 paper, "Language Models are Few-Shot Learners": they evaluate the 'few-shot' setting and also 'zero-shot', where you don't provide any solved examples for a given task. For some tasks zero-shot does well, and for other tasks the model needs to be driven by at least one example, i.e. 'one-shot'.
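For illustration, this is roughly what the three settings look like as raw prompts (the translation task loosely follows the example in the GPT-3 paper; the strings are just for demonstration):

```python
# Sketch of zero-, one-, and few-shot prompts for the same task.
task = "Translate English to French: cheese =>"

examples = [
    "Translate English to French: sea otter => loutre de mer",
    "Translate English to French: peppermint => menthe poivrée",
]

zero_shot = task                               # no solved examples, task only
one_shot = examples[0] + "\n" + task           # one solved example
few_shot = "\n".join(examples) + "\n" + task   # several solved examples

print(few_shot)
```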
@bazejmarciniak5682 · 1 year ago
Your channel is a gem! Thanks for your great work!
@Skinishh · 2 years ago
Thank you for the great video, as always! I wonder why these large LMs are all decoder-only like GPT and not encoder-decoder like T5? 🤔
@Skinishh · 2 years ago
Answering my own question: these kinds of models are mainly aimed at open-ended text generation, not at fine-tuning tasks or mask completion like T5. Therefore, only a decoder is needed for text generation.
@Ma2rten · 2 years ago
@@Skinishh Google Research has also done work on large encoder-decoder models, most recently ST-MoE-32B. Decoder-only models tend to work best for open-ended text generation and few-shot prompting; encoder-decoder models for classification and closed-ended text generation (e.g. machine translation).
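A minimal sketch of the two interfaces, using small public Hugging Face checkpoints (gpt2, t5-small) purely as stand-ins, since PaLM itself is not publicly available:

```python
# Decoder-only vs. encoder-decoder, illustrated with tiny public checkpoints.
from transformers import (AutoModelForCausalLM, AutoModelForSeq2SeqLM,
                          AutoTokenizer)

# Decoder-only (GPT-style): prompt and continuation share one token stream.
dec_tok = AutoTokenizer.from_pretrained("gpt2")
dec = AutoModelForCausalLM.from_pretrained("gpt2")
ids = dec_tok("The PaLM paper is about", return_tensors="pt").input_ids
print(dec_tok.decode(dec.generate(ids, max_new_tokens=10)[0]))

# Encoder-decoder (T5-style): the input is encoded once, the decoder emits the output.
ed_tok = AutoTokenizer.from_pretrained("t5-small")
ed = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
ids = ed_tok("translate English to German: The house is small.",
             return_tensors="pt").input_ids
print(ed_tok.decode(ed.generate(ids, max_new_tokens=20)[0], skip_special_tokens=True))
```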
@wilfredomartel7781 · 1 year ago
How can I test it?
@AICoffeeBreak · 1 year ago
Sadly, PaLM was not made available to the public. We can read about it in the paper.
@federicolusiani7753 · 2 years ago
Thank you so much for these videos!! The quality of the explanations and insights you provide is unmatched.
@AICoffeeBreak · 2 years ago
Thanks, so nice of you! :)
@tildarusso · 2 years ago
Nice wrap-up. As you said, it is XXXL-sized but nothing new; boring as usual, imho. Thank you for saving a lot of people the 87-page read!
@jeanpicard1844 · 2 years ago
I'm confused as to what you mean about toxicity: why is it toxic, or how is it being toxic? Is there an example of something you can point me to? Maybe I'm just missing the meaning of a term as it is used in the AI/language space.
@AICoffeeBreak · 2 years ago
Maybe you can read more about it here. 🔗 GPT-3 examples of toxic behaviour: venturebeat.com/2022/01/27/openai-rolls-out-new-text-generating-models-that-it-claims-are-less-toxic/
@micknamens8659 · 2 years ago
"toxic" means it's an unwanted fact - i.e. denied & forbidden by cancel culture
@chriswondyrland73 · 2 years ago
... Is she real, or an AI, or just imitating an AI?!
@Micetticat · 2 years ago
"Boring models" with new exciting hardware tricks!