I have got to say I am deeply grateful for how you express jargon in other words. I realize scientists have to use formally recognized vocabulary in their papers, but I do believe this hinders the communication of science -- at least for beginners like myself.
@immortaldiscoveries3038 (4 years ago)
ya cus i have no frikin clue what he said in the video, and I'm the top AGI researcher on Earth as it stands, so...
@WannabePianistSurya (4 years ago)
Inspired by your videos, I started a research-paper study group, presenting papers in a similar way to your videos. Thank you for putting out these kinds of videos for us plebs who don't have access to high-quality education (I'm from India). Your work is invaluable.
@DeepGamingAI (4 years ago)
Any vacancies in your study group? 👀 (Would love to have people to discuss these topics with since learning in isolation isn't proving to be very productive.)
@arjunashok4956 (4 years ago)
@@DeepGamingAI +1. Same question. I am an undergraduate and I have worked on a few research projects.
@manojb8876 (4 years ago)
Same here! Please let us know
@rahuldeora1120 (4 years ago)
@@manojb8876 Yes let us know
@WannabePianistSurya (4 years ago)
Hello, I am glad to see so many people are interested. Please message me on LinkedIn, name: "Surya Kant Sahu" (not sure if I can post a link here). I'll send you a link to the group.
@PortfolioCasWognum (4 years ago)
Hi Yannic, thank you for your consistent, high quality videos! I would be very interested in hearing your take on "Making sense of sensory input" by Richard Evans et al. (DeepMind). I have been lucky enough to attend a presentation by Richard Evans at my university on the paper, but have more recently, especially after your video on Chollet's paper, repeatedly found myself thinking that it might be (the start of) a big missing piece in current AI research, yet it doesn't seem to get any attention.
@andresfernandoaranda5498 (4 years ago)
Your contributions are valuable, thanks for doing these vids))
@creatiffshik (4 years ago)
They should probably just go ahead and allow triple connections? Two papers down the line... Great breakdown!
@abhirajkanse6418 (4 years ago)
Bruh, the exact same idea (attention between different-sized features) came to my mind while you were explaining the paper. Sounds quite interesting; I won't be surprised to see a paper on it soon.
@chris--tech (4 years ago)
The first thing I do every morning is check out your videos to see which new papers have recently been proposed in deep learning. By the way, I'm a student from Beijing; it's helpful to know the newest progress, and thanks for sharing.
@slackstation (4 years ago)
As I was watching, I had the intuition: what if we could vary the blocks used for learning on each image? Then maybe we could learn, from the tags on those images, what the best route of learning blocks is for that tag or combination of tags. We could learn that scenes with many things or overlapping things tend to do well with this route, outdoor scenes do better with that route, etc. Then I saw your proposal at minute 28. It feels good for an intuition to be in line with someone of your expertise. As always, great work. This gave me a good insight into the reasoning and architecture of ResNet models.
@YannicKilcher (4 years ago)
I like your idea ;)
@florianhonicke5448 (4 years ago)
Nice idea to compute the perfect network using attention. We should try that out
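For anyone who wants to play with the routing-by-attention idea discussed in the video and in these comments, here is a minimal sketch in PyTorch: each block attends over earlier feature maps (resampled to its own resolution) and consumes their attention-weighted sum. All module names, channel counts, and shapes are illustrative, not taken from the paper.

```python
# Minimal sketch of attention-based routing between feature maps of different resolutions.
# Assumes all candidate maps share the same channel count; shapes are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionRouter(nn.Module):
    """Mixes a list of candidate feature maps into one input for the next block."""

    def __init__(self, channels: int):
        super().__init__()
        # One scalar "relevance" score per candidate, computed from its pooled features.
        self.score = nn.Linear(channels, 1)

    def forward(self, candidates, target_size):
        resampled, scores = [], []
        for feat in candidates:
            # Bring every candidate to the target spatial resolution.
            feat = F.interpolate(feat, size=target_size, mode="bilinear", align_corners=False)
            resampled.append(feat)
            # Global-average-pool, then score how relevant this candidate is.
            scores.append(self.score(feat.mean(dim=(2, 3))))
        weights = torch.softmax(torch.cat(scores, dim=1), dim=1)  # (B, num_candidates)
        stacked = torch.stack(resampled, dim=1)                   # (B, num_candidates, C, H, W)
        # Attention-weighted sum over the candidates.
        return (weights[:, :, None, None, None] * stacked).sum(dim=1)


# Toy usage: three earlier feature maps at different resolutions, routed into a 16x16 block input.
router = AttentionRouter(channels=64)
feats = [torch.randn(2, 64, s, s) for s in (64, 32, 16)]
block_input = router(feats, target_size=(16, 16))
print(block_input.shape)  # torch.Size([2, 64, 16, 16])
```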
@Tehom1 (4 years ago)
The paper mentions a reward for the super-NN that controls the topology of the NN, implying that the super-NN is always optimizing for best final performance. I have to wonder, though, whether it might be better for the super-NN to spend most of its time active-learning to predict final performance from the initial configuration, and only optimize once that is learned well. But I like your design better.
@swanbosc5371 (4 years ago)
A year ago I read the OctConv paper and thought of an architecture a little like what you are proposing here. However, I didn't use an attention layer to route information: after each block, half of the features would be kept and the other half would be up- or down-sampled to be stacked with other feature maps. I started to investigate adding SE blocks to take care of the "dynamic" routing of information. Never finished testing this, though.
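A rough sketch of the scheme described in the comment above, assuming PyTorch: keep half the channels at the current resolution, downsample the other half to stack with a lower-resolution map, and let an SE (squeeze-and-excitation) block reweight the result. All names, sizes, and the pooling choice are illustrative guesses, not the commenter's actual code.

```python
# Sketch: split channels, resample half, stack with another scale, reweight with SE.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # squeeze: global context per channel
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                 # excitation: per-channel weights in (0, 1)
        )

    def forward(self, x):
        return x * self.gate(x)


def split_and_merge(high_res, low_res, se: SEBlock):
    """Keep half of `high_res`, downsample the other half, stack it with `low_res`, reweight with SE."""
    kept, to_move = high_res.chunk(2, dim=1)          # split channels in half
    moved = F.avg_pool2d(to_move, kernel_size=2)      # downsample to the lower resolution
    merged = torch.cat([low_res, moved], dim=1)       # stack with the other feature map
    return kept, se(merged)                           # SE handles the "dynamic" channel routing


high = torch.randn(2, 64, 32, 32)
low = torch.randn(2, 96, 16, 16)
se = SEBlock(channels=96 + 32)
kept, merged = split_and_merge(high, low, se)
print(kept.shape, merged.shape)  # torch.Size([2, 32, 32, 32]) torch.Size([2, 128, 16, 16])
```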
@tylertheeverlasting (4 years ago)
The double input to layers is technically not a double input, because there's almost always some small number of layers in between. The double skip output is a new thing though (that I know of)... Typically the double outputs go to the FPN/U-Net etc. block, but the backbone usually has a single skip output.
@herp_derpingson (4 years ago)
15:21 Reminds me of the paper "SqueezeNet"; it used to be quite popular back in 2017.
16:50 There must be some bandwidth limit for gradient information in a float32. We can't just reduce the dimensions of a matrix to 1x1 and expect it to have the same performance. I wonder what happens if we use float64 for these bottlenecks.
20:19 Doesn't our brain look like this too? In our brain the neurons are pretty much fully connected locally. It would be interesting if someone made a network with skip connections going from all layers to all layers, even backwards. Ok, maybe not backwards.
28:15 Very similar idea, now also add skip connections.
Also, although the network has the same number of parameters, that does not mean it uses the same number of floating-point operations. I think it should take significantly more due to the large number of upsample and downsample operations.
@YannicKilcher (4 years ago)
Yeah, I thought of SqueezeNet too, but I guess that was also hand-designed, so not as fancy-cool :) I think they explicitly measure FLOPs and show that theirs consumes less, but I agree: compared to something like a VGG, the TPU processing this jumbled mess must be constantly tripping up :D
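The "skip connections going from all layers to all layers" idea in the comment above is essentially a DenseNet-style block, where every layer sees the concatenation of all earlier outputs. A minimal PyTorch sketch (forward direction only; layer sizes are arbitrary illustrations):

```python
# DenseNet-style block: every layer receives the concatenation of all previous outputs.
import torch
import torch.nn as nn


class DenselyConnectedBlock(nn.Module):
    def __init__(self, in_channels: int, growth: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            # Each layer takes everything produced so far as input.
            self.layers.append(nn.Sequential(
                nn.Conv2d(channels, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            ))
            channels += growth

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # skip connections from all earlier layers
            features.append(out)
        return torch.cat(features, dim=1)


block = DenselyConnectedBlock(in_channels=16, growth=8, num_layers=4)
y = block(torch.randn(1, 16, 32, 32))
print(y.shape)  # torch.Size([1, 48, 32, 32])
```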
@samkumar2377 (3 years ago)
Really, this is very helpful. I'd like to see it combined with capsule networks.
@adamantidus (4 years ago)
Thank you, Yannic, for the great job you are doing! I am not a big fan of this kind of research. Of course, if you explore enough, you will eventually come up with a weird architecture that happens to work better. This, however, comes at the cost of sacrificing the intuition behind the model and our (already limited) understanding of the whole thing. I think Yannic made a good point when he said that the idea of making the spine model deeper by simply concatenating blocks goes in the opposite direction of the core idea in the paper. I also agree with Yannic that it is quite likely that the performance boost comes from the fact that the connections have been doubled, and that this should have been explored further. Finally, Yannic's idea of implementing a sort of dynamic routing via attention is very interesting; it should even be computationally cheaper than using RL to explore architectures. The whole paper is interesting, though. Thanks again for reviewing it!!
@YannicKilcher (4 years ago)
thanks for the feedback :)
@oneman7094 (4 years ago)
It would be interesting to see how the found architecture differs from one image dataset to another. It seems to me that the found architecture could not be worse than ResNet-50 (the search could just return that), and that this is just hyperparameter overfitting.
@deterministicalgorithmslab1744 (4 years ago)
I think the ablation used a ResNet with 2 connections because each layer of ResNet already has 2 connections: one residual and one transformational. There are no residual connections in SpineNet.
@dshlai (4 years ago)
I wonder how this new backbone compares to the backbones used in more modern detection and segmentation networks (SENet or CSP).
@samanthaqiu3416 (4 years ago)
7:10 Those skip connections seem to make the bottleneck ENTIRELY POINTLESS, since the goal was to force it to learn high-level global features. Why is my conclusion wrong? Or do you agree that skip connections defeat the purpose of the bottleneck?
@YannicKilcher (4 years ago)
You might be right, but on the other hand, skip connections have no or few learnable parameters, so any computation still has to be done by the bottleneck
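A small PyTorch illustration of the point in the reply above: in a bottleneck residual block the identity skip adds no learnable parameters, so all of the learning capacity sits in the 1x1 -> 3x3 -> 1x1 path. Channel sizes follow the usual ResNet convention but are otherwise arbitrary.

```python
# Bottleneck residual block: the skip is a plain addition and holds no parameters.
import torch
import torch.nn as nn


class BottleneckBlock(nn.Module):
    def __init__(self, channels: int, squeeze: int):
        super().__init__()
        self.bottleneck = nn.Sequential(
            nn.Conv2d(channels, squeeze, kernel_size=1),            # reduce channels
            nn.ReLU(inplace=True),
            nn.Conv2d(squeeze, squeeze, kernel_size=3, padding=1),  # spatial mixing at low width
            nn.ReLU(inplace=True),
            nn.Conv2d(squeeze, channels, kernel_size=1),            # expand back
        )

    def forward(self, x):
        return torch.relu(self.bottleneck(x) + x)  # identity skip: no learnable parameters


block = BottleneckBlock(channels=256, squeeze=64)
params = sum(p.numel() for p in block.parameters())
print(params)  # every one of these parameters sits in the bottleneck path
```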
@UsmanAhmed-sq9bl (4 years ago)
Awesome. Great video. Keep going. 🎊👍
@noninvasive_rectal_probe8990 (4 years ago)
Waiting for ultimate YannicNet-69 with routing by attention🤗
@Ting3624 (4 years ago)
the juice of this video : 27:00
@siyn007 (4 years ago)
You said it in the last video. It seems like this field is starting to become more and more difficult to navigate with just a laptop. I wonder what the next field will be where you only need a laptop... RL at the moment?
@YannicKilcher (4 years ago)
idk, let's just all become cattle ranchers :)
@siyn007 (4 years ago)
@@YannicKilcher XD
@alexanderchebykin6448 (4 years ago)
The idea you propose looks like this paper (openreview.net/pdf?id=BkXmYfbAZ), except with weird block sizes.
@manojb8876 (4 years ago)
Link doesn't work
@alexanderchebykin6448 (4 years ago)
@@manojb8876 thanks; it didn't work because the bracket became part of the link; seems to work now
@Ronschk (4 years ago)
I don't think 1x1 convolutions were introduced in the ResNet paper (as you say around 15:12), but in "Network in Network" (arxiv.org/pdf/1312.4400.pdf)? Not that it matters that much :P
@YannicKilcher (4 years ago)
True, thanks :)
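A quick numerical check of the "Network in Network" view of 1x1 convolutions: a 1x1 conv is the same linear map applied independently at every spatial position. Sketch in PyTorch, sizes arbitrary:

```python
# Show that a 1x1 convolution equals a per-pixel linear layer.
import torch
import torch.nn as nn

conv = nn.Conv2d(8, 4, kernel_size=1)
linear = nn.Linear(8, 4)
# Copy the 1x1 conv's weights into the linear layer so they compute the same function.
linear.weight.data = conv.weight.data.view(4, 8)
linear.bias.data = conv.bias.data

x = torch.randn(2, 8, 5, 5)
out_conv = conv(x)                                            # (2, 4, 5, 5)
out_lin = linear(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)   # apply per pixel, then restore layout
print(torch.allclose(out_conv, out_lin, atol=1e-6))  # True
```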
@joddden (3 years ago)
Why did you write the "c" in "cat" last?
@speed100mph (4 years ago)
Isn't your idea very similar to the Inception network?
@aishwaryabalwani7545 (4 years ago)
Thought the same too - except that attention allows the network to be a little more "dynamic" than InceptionNet...
@G12GilbertProduction (4 years ago)
But meta-supervised networks architecture in 153×26 for the block segments it's outfront for other 153 block company analysed by hyperstructure resource decrease in not themselves source data, but covered outerspace of RNN.
@pablovela2053 (4 years ago)
I'd love for you to go through the HRNet paper (arxiv.org/abs/1908.07919), as it's exploring a similar concept.
@samjoel4152 (4 years ago)
Yannic's idea is awesome... but we'd need heavy computational resources to do this, I guess 😅
@444haluk (4 years ago)
This paper is pseudo meta-learning: "Try every possible combination, but eliminate some combinations via RL; voilà, the reward is ready."
@ahmadchamseddine6891 (4 years ago)
Terrorists can use it for their own purposes as much as any military/intelligence agency!! 35:28
@sheggle (4 years ago)
Is your proposal not just Google's 600B parameter language model?
@YannicKilcher (4 years ago)
mine's only 599
@marat61 (4 years ago)
It is a very Chinese article.
@StanislavSchmidt1 (4 years ago)
Hi Yannic, thanks for the effort you're putting into your videos. A couple of comments:
1. I find the fact that you giggle quite a lot throughout your videos a bit distracting. But maybe it's just me.
2. I see you say things like "to up the number of features", but I think saying "to increase the number of features" would sound better. Maybe a native speaker could give their opinion here.
@Coolguydudeness1234 (4 years ago)
I don't mind the giggles, personally.
@rbain16 (4 years ago)
I don't mind them either. The criticisms seem rather trivial, no offense.