How DINO learns to see the world - Paper Explained

  6,052 views

Boris Meinardus

Comments
@akshaymundra1052 7 months ago
Loved your series on self-supervised learning. Are you also planning to cover DINOv2? I am particularly curious about the emergent property of the model -- how it is able to regress semantically consistent features for different parts of objects (and not a simple FG-BG separation as in DINOv1)!
@江楓漁火-e5u 4 months ago
Hi, I'm a bit confused about the centering method you described in this video (3:25). In your video, you're adding the center to the online network's output, which is different from what I've seen in other implementations of DINO (kzbin.info/www/bejne/nmTMm2Z8aMiDf80si=BUj7iQMXKaEs0Nr1&t=1296). Most implementations subtract the center from the output. Could you please clarify whether there's an error in the video or whether this is a different approach to centering?
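For reference on the centering question above, here is a minimal NumPy sketch of the subtract-the-center variant that the comment says most implementations use. It follows the pseudocode in the DINO paper, where the center is subtracted from the teacher output before a low-temperature softmax, and the center itself is an exponential moving average of the batch-mean teacher output. The function names, the NumPy port, and the default values (teacher temperature 0.04, center momentum 0.9) are illustrative assumptions, not the code from the video or the official repository.

```python
import numpy as np


def softmax(logits, temperature):
    """Row-wise softmax of logits / temperature, computed stably."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


def teacher_probs(teacher_out, center, temperature=0.04):
    """Center (subtract the running mean) and sharpen (low temperature).

    Assumption: centering is applied to the teacher output, as in the
    paper's pseudocode; `temperature` is an illustrative default.
    """
    return softmax(teacher_out - center, temperature)


def update_center(center, teacher_out, momentum=0.9):
    """Exponential moving average of the batch-mean teacher output."""
    batch_mean = teacher_out.mean(axis=0)
    return momentum * center + (1.0 - momentum) * batch_mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher_out = rng.normal(size=(8, 16))  # (batch, output dims), toy data
    center = np.zeros(16)
    probs = teacher_probs(teacher_out, center)
    center = update_center(center, teacher_out)
    print(probs.shape, center.shape)
```

Subtracting the center lowers the logits of dimensions that are large on average across the batch, which works against a single dimension dominating every output. Adding the same vector would instead reinforce those dimensions, so the two variants match only if the sign of the center itself is flipped.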
@nasosgerontopoulos5267 1 year ago
Very good content. Congrats 👍. Reading papers can be tough for many people, and videos like this make it a lot easier to keep up with state-of-the-art advancements. As a fellow researcher, do you think investing time in self-supervised learning research is worth it right now? Considering that my team and I do not have access to the kind of computational power that Meta and Google have, I am not sure we can keep up.
@borismeinardus 1 year ago
Hey, thanks! 😊 I think it is worth it! SSL is a broad field, and SSL in the context of multi-modal learning is very relevant. Yes, you will likely not be able to build the largest foundation models and go for scale, but you can definitely work on more nuanced research. E.g., ImageBind is a great example of a simple idea that does not require all the data and compute in the world. Btw, I also have a video on that paper :) kzbin.info/www/bejne/h4KtZHyIZcabg80si=VYxxIQPiyAXnlsw9
@benmainbird 1 year ago
Great video! Keep it up👍
@borismeinardus 1 year ago
Genuinely happy to hear you liked it, thanks! ☺️
@menkiguo7805 6 months ago
It does have the projection head, though.
@yossefdiab7452 9 months ago
Great explanation
@borismeinardus 9 months ago
thank you ☺️
@carsongutierrez7072 1 year ago
Transformers~ ML bro~
@borismeinardus 1 year ago
👾