HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning (w/ Author)

17,529 views

Yannic Kilcher

A day ago

Comments: 40
@YannicKilcher 2 years ago
OUTLINE:
0:00 - Intro & Overview
3:05 - Weight-generation vs Fine-tuning for few-shot learning
10:10 - HyperTransformer model architecture overview
22:30 - Why the self-attention mechanism is useful here
34:45 - Start of Interview
39:45 - Can neural networks even produce weights of other networks?
47:00 - How complex does the computational graph get?
49:45 - Why are transformers particularly good here?
58:30 - What can the attention maps tell us about the algorithm?
1:07:00 - How could we produce larger weights?
1:09:30 - Diving into experimental results
1:14:30 - What questions remain open?
Paper: arxiv.org/abs/2201.04182
ERRATA: I introduce Max Vladymyrov as Mark Vladymyrov
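To make the core idea from the outline concrete, here is a minimal, hypothetical PyTorch sketch of weight generation: a transformer reads support-set embeddings plus learned "weight token" placeholders and emits the filters of a small conv layer, which is then applied functionally to query images. This is not the authors' code; label embeddings and most details are omitted, and every name and size below is invented.

```python
# Hypothetical sketch of transformer-based weight generation (NOT the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

d_model, n_weight_tokens = 64, 8                      # 8 output channels for the generated conv
enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
embed = nn.Conv2d(3, d_model, kernel_size=32)         # crude embedder: 32x32 image -> 1x1 feature
weight_tokens = nn.Parameter(torch.randn(n_weight_tokens, d_model))
to_filter = nn.Linear(d_model, 3 * 3 * 3)             # each token becomes one 3x3 RGB conv filter

support = torch.randn(5, 3, 32, 32)                   # 5 labeled support images
support_emb = embed(support).flatten(1)               # (5, d_model)
tokens = torch.cat([support_emb, weight_tokens]).unsqueeze(0)  # (1, 13, d_model)
out = encoder(tokens)[0, -n_weight_tokens:]           # read the weight tokens back out
conv_w = to_filter(out).view(n_weight_tokens, 3, 3, 3)

query = torch.randn(2, 3, 32, 32)
features = F.conv2d(query, conv_w, padding=1)         # apply the GENERATED conv: (2, 8, 32, 32)
```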
@chochona019 2 years ago
The long introduction was great; it's good to be able to understand, with drawings, what is actually happening.
@Zed_Oud 2 years ago
Describing it as a buffet is exactly right for this amount of content. It makes it great for everyone: those looking for a summary, those wanting an in-depth dive, and those looking to implement/adapt it for themselves.
@NavinF 2 years ago
Love the longer first half that's more like your earlier work. IMO the interview should be a short Q&A that lets the authors respond to the parts you were unsure about or criticized. I much prefer when the paper review is more in-depth (ideally even longer than in this video).
@mahdipourmirzaei1048 2 years ago
I am a big fan of your long-introduction version. In my opinion, the way you illustrate your thoughts is far more insightful than at least half of the videos that include the authors. For many papers, the authors can serve as supplementary information for the main concepts.
@YvesQuemener 2 years ago
Since feedback was called for: just wanted to say that I mostly watch the paper explanations. I like the way you explain; that's really good to have.
@enniograsso2968 2 years ago
Hi Yannic, I've been following your channel since the very beginning and I've always enjoyed your style. Since you're asking for comments on this new format of interviewing papers' authors, I'd like to share my two cents. I much preferred your former style of unbiased reviews on your own, which were really professional and right to the technical points. These interviews, on the other hand, are more "deferential" and less impartial. I found your previous style much more insightful and useful. Thank you anyway for your work; your channel is my preferred one for keeping up to date on the subject. I'm a senior MLE at a big telco company in Italy. Thanks!
@sammay1540 2 years ago
I gotta be honest, your explanations are the best for me, because you're very good at explaining things, whereas the researchers are a little more specialized in research. I do like that you interview them, though. I'd always ask a question like "how did you come up with this idea?" or "what was the inspiration for this idea?" Love your content! Keep experimenting.
@Yenrabbit 2 years ago
Long intro was great - we get your explanation and then the interview is a bonus!
@DamianReloaded 2 years ago
2:00 Why not both? If you're into recycling content, we could have 3 videos: the paper review, the interview with the authors, and then the paper review interleaved with comments from the authors. Everyone is happy and you get more content for the same price (minus editing, though if you script the interleaved video before the interview you already know where the commentary will be). EDIT: Oh, this video is kinda like this already.
@qwerty123443wifi 2 years ago
Really appreciate the time you take to make videos like this!
@hamandchees3 2 years ago
I love in-depth conversations that aren't afraid to be technical.
@boffo25 2 years ago
Jesus Christ. What an incredible result!
@UberSpaceCow 2 years ago
Damn I'm quick ;) Thanks for the content homie
@quickdudley 2 years ago
Regarding the comment at 8:34: in one of my projects I'm using a neural network for a regression-type problem, and I found I got much smoother interpolation by switching most of the hidden layers to use asinh as the activation function. I have no idea how general that is, or whether smoothness is even a desirable feature when you're trying to output weights for another neural network.
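A minimal sketch of what that could look like, assuming a simple PyTorch regression MLP (hypothetical setup, not from the paper or video): asinh(x) = log(x + sqrt(x^2 + 1)) is smooth, odd, and unbounded, growing only logarithmically for large |x|, which can avoid the piecewise-linear kinks of ReLU.

```python
# Hypothetical regression MLP using asinh as the hidden activation.
import torch
import torch.nn as nn

class AsinhMLP(nn.Module):
    def __init__(self, d_in=1, d_hidden=64, d_out=1):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_hidden)
        self.fc3 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        x = torch.asinh(self.fc1(x))   # smooth activation instead of ReLU
        x = torch.asinh(self.fc2(x))
        return self.fc3(x)

model = AsinhMLP()
y = model(torch.linspace(-3, 3, 100).unsqueeze(1))  # (100, 1) smooth outputs
```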
@Guytron95 2 years ago
A livestream interview with a chat Q&A from the viewers at the end (the last 15 minutes or so) would be great. Nick Zentner has been doing long-form geology interviews for the last couple of months, and it has been superlative for discovering new questions and ideas.
@norm7090 2 years ago
Long explanation with interview, please.
@theodorosgalanos9663 2 years ago
Is it possible to try this approach but generate MLP models? I'm wondering whether a hypernetwork for NeRF models is possible.
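A minimal, hypothetical sketch of that direction (not from the paper; all names and sizes are invented): a small hypernetwork maps a conditioning vector z (e.g. a per-scene code) to the flat parameter vector of a target two-layer MLP, which is then applied functionally, NeRF-style, to query coordinates.

```python
# Hypothetical hypernetwork that emits the weights of a small MLP.
import torch
import torch.nn as nn
import torch.nn.functional as F

d_z, d_in, d_hid, d_out = 16, 3, 32, 4     # e.g. xyz -> RGBA, NeRF-style
n_params = d_in * d_hid + d_hid + d_hid * d_out + d_out
hyper = nn.Sequential(nn.Linear(d_z, 128), nn.ReLU(), nn.Linear(128, n_params))

def target_mlp(x, flat):
    """Run the generated two-layer MLP on x using the flat weight vector."""
    i = 0
    w1 = flat[i:i + d_in * d_hid].view(d_hid, d_in);   i += d_in * d_hid
    b1 = flat[i:i + d_hid];                            i += d_hid
    w2 = flat[i:i + d_hid * d_out].view(d_out, d_hid); i += d_hid * d_out
    b2 = flat[i:i + d_out]
    return F.linear(F.relu(F.linear(x, w1, b1)), w2, b2)

z = torch.randn(d_z)                        # conditioning vector (e.g. scene embedding)
points = torch.randn(100, d_in)             # query coordinates
rgba = target_mlp(points, hyper(z))         # (100, 4)
```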
@BboyDschafar 2 years ago
Great Paper, Great Interview.
@KnowNothingJohnSnow 2 years ago
Are there any recommended video talks about semi-supervised learning research? Because I only know about teacher models and semi-GANs... Thanks!
@Supreme_Lobster 2 years ago
Question: how "hyper"/meta can you get with a setup like this before the resulting performance gets worse or stops improving?
@АлексейТучак-м4ч 2 years ago
If we want to input a real number x into a NN, it is a lot better to represent it as a vector of sines sin(N_i * x) with various N_i (random Fourier features, positional encodings, etc.). Maybe if we want a NN to output a number precisely, we could make it output a vector of sines and then guesstimate what number is encoded in that vector? Or output it as a weighted sum of the entries of a vector (like the coarse and fine tuning knobs on old devices, but with a lot more knobs), with weights from a geometric progression, e.g. x = 1000 * Σ_i X_i * (0.8)^i.
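Both ideas are easy to sketch in torch (hypothetical code; the scale of 1000 and ratio of 0.8 are just the numbers from the comment, and the frequency choice is one option among many):

```python
# (1) Encode a scalar as sin/cos features; (2) decode a scalar from a
# vector via a geometric-progression weighted sum: x = 1000 * sum_i X_i * 0.8**i.
import torch

def fourier_encode(x, freqs):
    """x: (batch,) scalars -> (batch, 2*len(freqs)) sin/cos features."""
    angles = x.unsqueeze(1) * freqs.unsqueeze(0)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=1)

freqs = 2.0 ** torch.arange(8)               # fixed frequencies; random ones also work
feats = fourier_encode(torch.tensor([0.3, 1.7]), freqs)   # (2, 16)

def geometric_readout(X, scale=1000.0, r=0.8):
    """Coarse-to-fine decoding: early entries are coarse knobs, later ones fine."""
    weights = r ** torch.arange(X.shape[-1], dtype=X.dtype)
    return scale * (X * weights).sum(dim=-1)

X = torch.rand(4, 10)                        # e.g. bounded outputs of a network
x_decoded = geometric_readout(X)             # (4,)
```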
@oluwayomiolugbuyi6670 2 years ago
I love both formats, yours more, but it's lovely to have both sides.
@patrickl5290 2 years ago
At this rate, we'll see the hyper-HyperTransformer in another 4 years.
@XOPOIIIO 2 years ago
You said his name only approximately correctly, so it turned out to be unintentionally insulting, lol.
@chinmayakaundanya3151 2 years ago
Long videos please.
@bojan368 2 years ago
It may be that self-attention is slightly conscious.
@laurenpinschannels 2 years ago
Hey, no jokes here. The attention might get self-conscious.
@petevenuti7355 2 years ago
I find the terminology overly misleading at this level. Someday, though, it will be used as evidence against us.
@mr.heuhoi1446 2 years ago
I really like the format, but I feel that the length of the videos is a bit intimidating, at least for me. I understand that it is hard to condense such an in-depth scientific discussion, but I think keeping videos under an hour would make them more attractive to a lot of people.
@kimchi_taco 2 years ago
➕ Long intro
@vzxvzvcxasd7109 2 years ago
Maybe it would be more useful if the interviewees got to watch your explanation before the interview. Then they'd know what you've covered, or where they think you've given an incorrect impression.
@starkest 2 years ago
love it
@norik1616 2 years ago
The SOTA ML buffet.
@ssssssstssssssss 2 years ago
I prefer two videos: one with the interview and one with the explanation... But I also feel you are less critical when you do the interview. I think it might be good for you to criticize first, and then the author can get a chance to rebut the criticism.
@CreativeBuilds 2 years ago
Hey Yannic, I'd much rather have two videos: one of you formally taking your time to go over and explain the paper, and another that is the interview with the author (if you feel the paper is good enough to warrant it). Honestly, I usually just watch your interpretation to get up to speed on what the papers are about, but I tend not to want to listen to the conversations with the authors, just because that "flow of information" feels different to my brain and isn't what I want when watching these videos. I do like having the option, though, which is why I feel two videos are better. Then you can cross-link between the videos to drive YouTube engagement even more.
@bjornlindqvist8305 2 years ago
Transformer is spelled with a capital T.
@ThinkTank255 2 years ago
It's a bad name for this model, because it subsumes a potentially very important concept. It should be renamed.