ConvNeXt: A ConvNet for the 2020s | Paper Explained

  Views: 16,910

Aleksa Gordić - The AI Epiphany

Days ago

❤️ Become The AI Epiphany Patreon ❤️
/ theaiepiphany
👨‍👩‍👧‍👦 Join our Discord community 👨‍👩‍👧‍👦
/ discord
In this video I cover the recently published "A ConvNet for the 2020s" paper. The authors show that ConvNets are still in the game! By adding new design ideas and training procedures, they outperform vision transformers even in big-data regimes, without any attention layers.
The convolutional prior continues to stand the test of time in the field of computer vision.
Note: I also partially cover the Swin transformer paper in case you missed out on that one. :)
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
✅ Paper: arxiv.org/abs/2201.03545
✅ GitHub: github.com/facebookresearch/C...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
⌚️ Timetable:
00:00 Intro - convergence of transformers and CNNs
05:05 Main diagram explained
07:40 Main diagram corrections
10:10 Swin transformer recap
20:20 Modernizing ResNets
24:10 Diving deeper: stage ratio
27:20 Diving deeper: misc (inverted bottleneck, depthwise conv...)
34:45 Results (classification, object detection, segmentation)
37:35 RIP DanNet
38:40 Summary and outro
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
💰 BECOME A PATREON OF THE AI EPIPHANY ❤️
If these videos, GitHub projects, and blogs help you,
consider helping me out by supporting me on Patreon!
The AI Epiphany - / theaiepiphany
One-time donation - www.paypal.com/paypalme/theai...
Huge thank you to these AI Epiphany patreons:
Eli Mahler
Petar Veličković
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
💼 LinkedIn - / aleksagordic
🐦 Twitter - / gordic_aleksa
👨‍👩‍👧‍👦 Discord - / discord
📺 YouTube - / theaiepiphany
📚 Medium - / gordicaleksa
💻 GitHub - github.com/gordicaleksa
📢 AI Newsletter - aiepiphany.substack.com/
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#convnext #visiontransformers #computervision

Comments: 39
@chankhavu 2 years ago
I like how in your videos, you not only explain the details within the paper but also the more "meta" stuff that is harder for people to grasp without reading through a lot of papers. Reading and understanding one paper is easy. Developing an intuitive understanding of a whole research subfield and its general directions is the hard part.
@TheAIEpiphany 2 years ago
Thanks! Yes this one was rich in contextual information: DanNet, diagram correction from Twitter, and Swin transformer mainly I guess? Well, it's oftentimes hard to understand a specific paper without having all the necessary context - and it takes time to accumulate it.
@TheAIEpiphany 2 years ago
We need to start working on reasoning - perception is converging, we're out of ideas lol. Bad jokes aside - at this point, it seems that CNN priors are quite adequate (in the case of natural images) - a hybrid approach (initial stages CNN-like and later stages transformer-like) seems to be the way to go, but the game is still on.
@yevhendiachenko3703 2 years ago
Thank you! The video is excellent. I like that you mix code + paper in the explanation and that you provide context and highlight the most essential parts.
@TheAIEpiphany 2 years ago
Thank you!
@lalitmrinki 1 year ago
Thank you for such an in-depth explanation. Your plan of explaining the history and convergence and then going through the paper and code is a great way for learners to understand the concepts deeply. It's very important to select the important portions from the paper for further exposition and to leave out unnecessary boilerplate stuff. I liked that you didn't say "go and read the paper yourself"!
@gauravlochab9614 2 years ago
Thanks for the amazing explanation. Yes, mixing up the code and paper boosts the implementation speed manyfold. I love your work, you are awesome!
@TheAIEpiphany 2 years ago
Thank you man
@adityakane5669 2 years ago
Excited for this one!
@armingh9283 8 months ago
Thank you. Very informative
@PritishMishra 2 years ago
Many thanks for the awesome content!
@MrSebastian12358 2 years ago
Thanks a lot for your amazing effort.
@sushantgautam773 2 years ago
Nice explanation. By the way, could I know which software you are using to show multiple things in one?
@manub.n6223 2 years ago
Thank you so much for the brilliant explanation
@TheAIEpiphany 2 years ago
Thanks! 🚀
@a23956491z 2 years ago
Very nice content! I didn't even notice they use the old ResNet top-1 acc instead of Wightman's. And that makes this model less comparable to the SOTAs.
@user-co6pu8zv3v 2 years ago
Thank you!
@ansariyusuf4774 2 years ago
mix of paper and code is great!
@francomarchesoni9004 2 years ago
agreed
@marvlousdasta2566 2 years ago
Great video as always. What software are you using to present and annotate the paper?
@TheAIEpiphany 2 years ago
Thanks! OneNote.
@luna0609 2 years ago
This was a great video. The best I've seen about explaining a research paper. 👏
@TheAIEpiphany 2 years ago
Hah I don't know about that but thanks! 😂
@Kenny4PresidentFTW 2 years ago
this channel's videos are amazing
@oscarllerena2980 6 months ago
@ 2:32, @TheAIEpiphany explains "it pretrified an image" ... can anyone please explain what "to petrify" means in this context? Does "petrification" involve encoding the visual information in a proper format for transformers?
@mritunjaymusale 2 years ago
Was the pre-training they did on ImageNet-22k supervised, or unsupervised the way transformer papers do it?
@TheAIEpiphany 2 years ago
Supervised - same as ImageNet 1k. :)
@eng_ajy5091 2 years ago
Hi, first of all, I would like to thank you for your excellent and wonderful videos on artificial intelligence. I am a PhD student working on fast video captioning, and I hope to reach real-time captioning. But I am confused by the many articles, techniques, and algorithms in this field. I need your help in choosing the right path among the existing methods (traditional CNN, Transformer, YOLO, self-attention only, a combination, or others) while maintaining a trade-off between speed and accuracy.
@JapiSandhu 2 years ago
Can this be used for video classification?
@oncedidactic 2 years ago
always semirants!
@TheAIEpiphany 2 years ago
My made up word just got its 1st validation - it's an official word from now on!
@oncedidactic 2 years ago
ayyyyyyyyyyyyyyyyyyyy :D
@mahmoodkashmiri 2 years ago
What tool do you use to read research papers on Ubuntu? Thank you!
@TheAIEpiphany 2 years ago
I use OneNote on Windows!
@mahmoodkashmiri 2 years ago
Thank You 😊
@momcilomrkaic2214 2 years ago
Boss
@TheAIEpiphany 2 years ago
Hugo
@jonathansum9084 2 years ago
3, 3, 9, s3. What does the s3 mean?
@kirillmenke9573 2 years ago
It means the authors made a typo
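
For context on the stage-ratio question above, here is a minimal Python sketch of what those numbers describe: the count of blocks in each of the network's four stages. The per-stage depths are the commonly cited ones for ResNet-50, Swin-T, and ConvNeXt-T. The helper name `stage_ratio` is hypothetical and only serves this illustration; it is not taken from the paper's code.

```python
from functools import reduce
from math import gcd


def stage_ratio(depths):
    """Reduce per-stage block counts to their smallest integer ratio.

    Hypothetical helper for illustration only.
    """
    divisor = reduce(gcd, depths)
    return tuple(d // divisor for d in depths)


# Blocks per stage (four stages each).
RESNET50_DEPTHS = (3, 4, 6, 3)
SWIN_T_DEPTHS = (2, 2, 6, 2)
CONVNEXT_T_DEPTHS = (3, 3, 9, 3)

for name, depths in [
    ("ResNet-50", RESNET50_DEPTHS),
    ("Swin-T", SWIN_T_DEPTHS),
    ("ConvNeXt-T", CONVNEXT_T_DEPTHS),
]:
    print(f"{name}: depths={depths}, ratio={stage_ratio(depths)}")
```

Read this way, the change from (3, 4, 6, 3) to (3, 3, 9, 3) moves the ResNet-style layout to the same 1:1:3:1 stage ratio that Swin-T uses.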